RoundSparrow

joined 2 years ago
[–] RoundSparrow@lemmy.ml 5 points 2 years ago (3 children)

If everyone was spread out onto different instances

Each instance has an owner/operator making rules... and the average social media user walks in, orders a drink, and starts smoking without any concern that either one might not be allowed. People can be loyal to their media outlets even when it is beyond obvious those outlets are bad - people raised on storybooks that endorse bad behaviors and values, on HDTV networks, and on social media too. An audience that wants to "react comment" to images without actually reading what others have commented - or learning about the venue operators and their reasons for rules - is pretty much the baseline experience in 2023.

[–] RoundSparrow@lemmy.ml 11 points 2 years ago (1 children)

When it comes to media attraction, what they call themselves (labels) doesn't really matter that much. It's the praise of strongmen and authority that crosses all mythological media systems - be it bowing down to a burning bush story, Fox News, or the Kremlin.

[–] RoundSparrow@lemmy.ml 6 points 2 years ago

Keep in mind that you’re going to be retrieving and storing a huge amount of data running these scripts

And you are adding to the overload of lemmy.world, beehaw, lemmy.ml, etc., which host all the popular content communities. Federation has a lot of overhead, as does having to distribute a community's activity one vote at a time to 500 subscribed servers.

[–] RoundSparrow@lemmy.ml 1 points 2 years ago

I spend my time on Lemmy scrolling “All”, which I think is a pretty common thing.

There was a lot of advice handed out back in June that the answer to scaling Lemmy was to go create instances. The reason it works is that "All" is empty on a virgin system ;) With no data in the database, the logic Lemmy hands to PostgreSQL runs real fast ;)

[–] RoundSparrow@lemmy.ml 1 points 2 years ago (1 children)

I found the total table update wasn't performing as badly as I thought - it was the API gateway that was timing out. I'm still generating larger amounts of test data to see how it performs in worst-case edge situations.

[–] RoundSparrow@lemmy.ml 2 points 2 years ago (1 children)

@Prefix@lemm.ee - maybe pin this for a while?

[–] RoundSparrow@lemmy.ml 1 points 2 years ago (1 children)

otherwise that would be an easy exploit, just creating a practically infinitely long block list to crash the instance.

Perhaps you are unaware of just how unstable Lemmy has been since late May.

[–] RoundSparrow@lemmy.ml 1 points 2 years ago (4 children)

I’m asking because I used to run into the maximum number of blocks on my Reddit client

I think they used the multi-reddit list as the back-end for the subreddit block list. I too ran into the limits.

I don't think Lemmy has any limit, but performance will likely degrade, and it is entirely possible that your personal block list adds to the overload of servers that have a lot of data in them.

[–] RoundSparrow@lemmy.ml 5 points 2 years ago* (last edited 2 years ago) (1 children)

I just set up

just... it does not backfill previous content and votes. What time window are you talking about here?

The whole design is not what I would describe as robust or chock full of features. A lot of people have pushed federation as the key to scalability (now with over 1500 Lemmy instances online) when it is one of the least-mature parts of the code and carries a lot of overhead. Having 500 servers all subscribing to the same communities on lemmy.world for a dozen readers is causing problems with outbound distribution.

[–] RoundSparrow@lemmy.ml 1 points 2 years ago (4 children)

I agree there is potential to reuse the child_count from child/grandchild rows. But there has to be some sense to the order they are updated in, so that the deepest child gets its count updated first?
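Something like processing the tree deepest-first by ltree depth could give that ordering. Just a sketch using the ltree nlevel() function, with '0.1' as a placeholder root path:

    -- sketch: walk the affected tree deepest-first, so each parent's child_count
    -- could be built from its already-updated children (direct children + their counts)
    select c.id, c.path, nlevel(c.path) as depth
    from comment c
    where c.path <@ '0.1'          -- placeholder: root path of the tree being updated
    order by nlevel(c.path) desc;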

[–] RoundSparrow@lemmy.ml 1 points 2 years ago

So it turns out the query was finishing within minutes and it was the API gateway that was timing out. Too many Lemmy SQL statements in my head. On a test system, a second run of the update over all comments took just under 17 seconds for 313,617 rows with some decent reply depth, so it isn't as bad as I thought.

[–] RoundSparrow@lemmy.ml 1 points 2 years ago* (last edited 2 years ago) (6 children)

given it traverses all the comment X comment space every time a comment is added.

The second query I shared is only referenced for maintenance rebuild. The routine update of count does target only the tree that the reply is to:

    select c.id, c.path, count(c2.id) as child_count
    from comment c
    join comment c2 on c2.path <@ c.path and c2.path != c.path  -- c2 = any descendant of c
    where c.path <@ '0.1'    -- restrict to the tree the reply landed in ('0.1' is an example root)
    group by c.id

I found a particularly complex tree with 300 comments. In a production database (with generated test data added for this particular comment tree), it takes 0.371 seconds every time a new comment is added; here is the result of the SELECT pulled out without the UPDATE:

Obviously, with the UPDATE included it will take longer than 0.371 seconds to execute.
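For completeness, a rough sketch of how that SELECT could feed the UPDATE, assuming the count lives in comment_aggregates.child_count and with '0.1' still standing in for the root path of the affected tree:

    -- sketch: apply the per-comment counts to comment_aggregates (assumed column child_count)
    update comment_aggregates ca
    set child_count = counts.child_count
    from (
        select c.id, count(c2.id) as child_count
        from comment c
        join comment c2 on c2.path <@ c.path and c2.path != c.path
        where c.path <@ '0.1'
        group by c.id
    ) as counts
    where ca.comment_id = counts.id;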

 

If you visit a popular community like /c/memes@lemmy.ml with your web browser, the images shown are hotlinked from the Lemmy instance that the person posting the image used. This means that your browser makes an HTTPS request to that remote server, not to your local instance, giving that server your IP address and web browser version string.

Assume that it is not difficult for someone to compile this data, build a profile of your browsing habits and image-fetching patterns, and identify with high probability which comments and which user account are being used on the remote instance (based on timestamp comparison).

For example, if you are a user on lemmy.ml browsing the local community memes, you see postings like these first two I see right now:

You can see that the 2nd one has an origin of pawb.social - and that thumbnail was loaded from a server on that remote site:

https://pawb.social/pictrs/image/fc4389aa-bd4f-4406-bfd6-d97d41a3324e.webp?format=webp&thumbnail=256

Just by browsing a list of memes, you are giving out your IP address and browser string to dozens of Lemmy servers hosted by anonymous owner/operators.
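If you run an instance and want a feel for the scale of this, here is a rough sketch of a query (assuming the post table's thumbnail_url column, as in recent Lemmy schemas) that counts how many distinct hosts are serving thumbnails:

    -- sketch: how many different hosts serve post thumbnails on this instance
    -- (split_part grabs the host out of https://host/pictrs/...)
    select split_part(thumbnail_url, '/', 3) as image_host, count(*) as posts
    from post
    where thumbnail_url is not null
    group by image_host
    order by posts desc;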

 

I think it would be ideal if we discussed strategies for finding these kinds of database failures in the logs, or even shared tips on better logging design so that these failures surface to instance operators/admins.
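As a starting point, a sketch of PostgreSQL settings (values are just illustrative) that at least make slow statements and lock waits show up in the server log:

    -- sketch: surface slow statements and lock contention in the PostgreSQL log
    alter system set log_min_duration_statement = '1s';  -- log any statement slower than 1 second
    alter system set log_lock_waits = on;                -- log statements stuck waiting on locks
    select pg_reload_conf();                             -- apply without a restart

Failing statements already get their SQL logged at ERROR level by default (log_min_error_statement), so a lot of this is really about making sure operators are actually looking at the PostgreSQL log and not just the lemmy_server output.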

 

This closed issue has some insight into performance issues with 100+ peering partners in federation.

 

I suggest that Lemmy's incoming federation inserts of votes, comments, and possibly postings be queued, so that concurrent INSERT operations into these very large database tables are kept linear and local-instance interactive web and API (app) users are given performance priority.

This could also be a way to keep server operating costs more predictable with regard to using cloud-services for PostgreSQL.

There are several approaches that could be taken: Message Queue systems, queue to disk files, queue to an empty PostgreSQL table, queue to another database system such as SQLite, etc.
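For the "queue to an empty PostgreSQL table" variant, a minimal sketch (table and column names here are hypothetical, nothing from the actual Lemmy schema):

    -- sketch: raw incoming activities land here first; a single worker drains them in order
    create table if not exists incoming_federation_queue (
        id          bigserial primary key,
        received_at timestamptz not null default now(),
        activity    jsonb not null          -- the raw ActivityPub payload
    );

    -- worker: pull the oldest batch without blocking concurrent inserts
    select id, activity
    from incoming_federation_queue
    order by id
    limit 100
    for update skip locked;

A single drain worker is what keeps the INSERTs into the big comment/post/vote tables serialized, which is the whole point.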

This would also lay the groundwork for accepting incoming federation data while PostgreSQL is down / the website is offline for upgrades or whatever.

I would also suggest that the code for incoming federation data be moved to a different service and not run in-process with lemmy_server. This would be a step towards allowing replication integrity checks, backfill operations, firewall rules, CDN bypassing, etc.

EDIT: And really, much of this applies to outgoing federation too, but that has gotten more attention in the 0.17.4 time period - ultimately I was speculating that the incoming backend transactions are a big part of why outbound queues are bunching up so much.

 

Is every page hit on the posting listings page dynamically rendering these with a database query?

 

Using a desktop web browser...

It doesn't seem like a timeout; it always responds quickly with the error, and successful page loads are also fast. It seems to me there is some kind of resource/parameter starvation in nginx or in the nginx bridge to the NodeJS app...

I'm also seeing problems with static content loading: images and CSS files not loading, mangled half-generated pages, or missing upvote/downvote icons that force me to refresh the browser. This happens at least 1 in 8 page refreshes.

 

Diving in - I haven't worked with PostgreSQL for 15 years, but I'm sharing random notes and observations.

 

Typically you get the page headers and footers, but a 404 in the middle.

I can't save the posting page you are reading in a way that comes up intact.
