andscape

joined 2 years ago
[–] andscape@feddit.it 1 points 2 years ago (1 children)

Social media and torrents are pretty damn different. There's a reason no federated platform has implemented automatic discovery, even ones with many more resources than Lemmy, like Mastodon.

I don't know why you folks keep pointing at missing features and saying "Lemmy doesn't have this pretty advanced network feature, so it's not really decentralized", or "it cannot organize", or "it's useless"... It's basically two people's passion project that only blew up in the past month because reddit fucked up. You're not paying for it, are you? So I really don't see how this attitude is warranted.

[–] andscape@feddit.it 1 points 2 years ago

Yeah, what's being described there is basically a P2P model. I still think it wouldn't make a huge difference in the chattiness of the protocol. At best it would redistribute the load for outgoing federation messages, but not for incoming ones. An instance still has to receive each message individually, regardless of where it comes from.

[–] andscape@feddit.it 1 points 2 years ago (2 children)

I understand the logic, and you're right to think about how to improve Lemmy's scalability. But I'm not sure if this is the way to go.

If you build a dedicated federation proxy for an instance, you've really just slightly moved the problem. The federation proxy is going to have the same scalability issues, and if anything the total load goes up.

If you build multi-instance hubs, you suddenly introduce a lot of new issues.

  • Security: I think Lemmy checks the source of an update to verify that it comes from the legitimate host. You would have to introduce some kind of signatures to verify that the activity originated from the legitimate host.
  • Privacy: now your users have to trust the hub owners with their data, not just the instance.
  • Motive: who would be running the hubs, and why? They would have to be even bigger than the instances, and there would be much less incentive to do it.

[–] andscape@feddit.it 2 points 2 years ago* (last edited 2 years ago) (4 children)

Other people in the thread have already made this point: even with a full mesh network, the number of remote calls made for a single activity is equal to the number of instances subscribing to that activity (plus one if the activity originates from an instance that's not the host of the activity).

A hub/spoke model doesn't change this, it just moves the load from the host instance to the hub. The number of connections is still the same: if N instances need to receive the activity, N calls will have to be made. If anything this adds 1 more call from the host instance to the hub.

Even peer-to-peer distribution of activities, mentioned by @hazelnoot@beehaw.org, wouldn't actually change the number of calls being made. You still have N servers that have to receive the activity, so you need at least N calls overall. What this would do is redistribute the load better over instances, so the host doesn't have to make all N calls. It would definitely be an improvement, but it would not be easy to implement successfully, and it would almost surely break ActivityPub compatibility.
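To put rough numbers on the comparison above, here's a minimal sketch that counts calls for one activity under each model. This is not Lemmy code: `delivery_calls`, the topology names, and the `fanout` parameter are all made up for illustration, and the P2P case assumes an idealized relay tree where every subscriber receives the activity exactly once.

```python
def delivery_calls(n_subscribers: int, topology: str, fanout: int = 3) -> dict[str, int]:
    """Count calls needed to deliver ONE activity to n_subscribers instances.

    Returns the total number of calls in the network and the largest number
    of outgoing calls any single server has to make. Hypothetical model only.
    """
    n = n_subscribers
    if topology == "mesh":
        # The host instance calls every subscriber directly.
        return {"total": n, "max_per_server": n}
    if topology == "hub":
        # The host makes 1 call to the hub, then the hub makes the N calls.
        return {"total": n + 1, "max_per_server": max(n, 1)}
    if topology == "p2p":
        # Idealized relay tree: each server forwards to at most `fanout` peers,
        # and every subscriber is still called exactly once.
        return {"total": n, "max_per_server": min(fanout, n)}
    raise ValueError(f"unknown topology: {topology}")


if __name__ == "__main__":
    for topology in ("mesh", "hub", "p2p"):
        print(topology, delivery_calls(100, topology))
```

With 100 subscribers the total stays at 100 calls (101 with a hub) in every case. The only thing that changes is how many of those calls a single server has to make.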

The only thing I can think of that would actually reduce the overall network load is batching: sending multiple activities/updates together in a single message. AFAIK this is not supported by ActivityPub, so implementing it would mean breaking compatibility and effectively implementing a new version of the protocol (which is a massive undertaking).
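A back-of-the-envelope sketch of why batching helps, assuming a hypothetical protocol extension where up to `batch_size` activities fit in one message. Neither the function nor the parameter corresponds to anything in ActivityPub or Lemmy.

```python
import math


def total_messages(n_activities: int, n_subscribers: int, batch_size: int = 1) -> int:
    """Messages needed to deliver n_activities to every subscriber.

    batch_size=1 models one message per activity per subscriber;
    larger values model the hypothetical batching extension.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    # Each subscriber still needs every activity, but activities share messages.
    return n_subscribers * math.ceil(n_activities / batch_size)


if __name__ == "__main__":
    print(total_messages(1000, 50))                 # no batching
    print(total_messages(1000, 50, batch_size=20))  # 20 activities per message
```

The number of messages drops by roughly a factor of `batch_size`, though the amount of data transferred stays about the same, and activities would be delayed while a batch fills up.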

[–] andscape@feddit.it 1 points 2 years ago (3 children)

Sure, but now this system has a dependency on the "centralized" lemmyverse.net service. Also, your instance now has to receive and store a copy of almost the entire network's content. Lots of instances are already struggling to sustain the load, and this would make the problem even worse.

If a single instance decides that it can sustain the increased load and doesn't mind depending on lemmyverse.net, sure, nothing's stopping them. But it shouldn't be the default behavior for all instances.

[–] andscape@feddit.it 0 points 2 years ago (10 children)

In order to avoid this restriction you would need a global instance discovery mechanism, which is extremely hard to implement without a central server that keeps a list of all instances in the network. And if you do implement instance discovery through a central server you really are losing the whole point of decentralization.

Additionally, it's good that each instance does not federate with everyone else by default. If it did, it would have to process all activity and keep a local copy of all the content in the entire network. This would be insanely inefficient, and make it prohibitively expensive to run even a tiny instance with 1 user and no communities.

Decentralization isn't useless if you can't immediately see everything in the network, come on... We're just spoiled by centralized services.
