RoundSparrow

joined 2 years ago
MODERATOR OF
[–] RoundSparrow@lemmy.ml 1 points 2 years ago (5 children)

It makes sense, but there are indexes.

As the subject of the post says... it is JOIN behavior that's the problem. The queries work perfectly fine when you ask for posts without doing JOIN to a bunch of empty tables.

[–] RoundSparrow@lemmy.ml 3 points 2 years ago* (last edited 2 years ago)

Do you have any opinions on how they should work?

given lemmy's heavy focus on purging a user's content on account delete, I think comment links should include the username.

lemmy.world/the_username@home_instance.tld/comment/xxxx

This also gives self-promotion to the person, a sense of identity on their content.

[–] RoundSparrow@lemmy.ml 3 points 2 years ago (2 children)

A couple weeks ago there was a pull request to make it default, https://github.com/LemmyNet/lemmy/pull/3743

But i don't think it's in now.

[–] RoundSparrow@lemmy.ml 3 points 2 years ago (1 children)

cool. I'm in the camp that the feature causes more problems than is useful. Entirely hiding posts of another language would not be my choice, it should at least say "there are 5 comments you do not see because of your language preferences".

[–] RoundSparrow@lemmy.ml 2 points 2 years ago (3 children)

it does show the comments on incognito

That means it is tied to your login, something is set in your preferences so that you aren't seeing content.

Picking "undetermined" isn't the same as having never touched it. There is a steady stream of people who accidentally touch the language settings and say they don't see anything.

People have called out how confusing the whole thing is: https://lemmy.world/post/523012

Sorry I can't be specific on how to click and what works, I just know you aren't the first to get their account where it stops showing routine content.

[–] RoundSparrow@lemmy.ml 3 points 2 years ago (7 children)

Without more details, it's hard to know what you are describing. Do you mean comments from other people on posts you make? or your own comments?

I assume you are talking about the main webapp, lemmy-ui, and not a smartphone app or other front-end? The first thing to try is anonymous reading of the same post - are the comments there for incognito mode?

I'ts not uncommon for people on Lemmy to set their languages to something odd and and Lemmy will hide a lot of content.

[–] RoundSparrow@lemmy.ml 1 points 2 years ago (7 children)

Again though, that sounds like something that can be indexed.

I don't get what you are suggesting. There are INDEX.

The problem being addressed is that there is no WHERE clause that actually limits the posts.

JOIN is done on a table without first eliminating rows... and that worked OK when there was only 50,000 posts in the database, but now that it is over 1 million rows - it is causing major performance problems.

[–] RoundSparrow@lemmy.ml 3 points 2 years ago

Better file a bug about that

Bug has been open for weeks. Doesn't seem to be a priority to undo the added HTML sanitation problems.

The code I pasted here on Lemmy is also on GitHub comment without being munged: https://github.com/LemmyNet/lemmy/pull/3865#issuecomment-1683324467

It's still mostly like the original function it is copied from, just some of the logic has been successfully stripped out. But it's probably worth looking at the bigger picture of just how much of the Rust conditional logic goes into building this SQL statement.

[–] RoundSparrow@lemmy.ml 1 points 2 years ago (9 children)

You might have to denormalize the comment count (put it in a column in the threads table

When browsing a community, the concern is 'post', not 'comment' or threads. So this doesn't really come into play. If anything, Reddit/Lemmy style system is focused on the latest vote count, not the number of comments....

[–] RoundSparrow@lemmy.ml 1 points 2 years ago* (last edited 2 years ago)

A core design issue of either approach is that server operators can modify the building of this data without needing to modify or restart the lemmy_server Rust code.

Using a smallint also gives some flexibility (or a new field if going with the id min max approach).... if page greater than 10 for a particular sort, go to include > 1 and fall into tiers.

[–] RoundSparrow@lemmy.ml 1 points 2 years ago* (last edited 2 years ago) (1 children)

An even less-intrusive approach is to not add any new field to existing tables. Establish a reference table say called include_range. There is already an ENUM value for each sort type, so include_range table with these columns: sort_type ENUM, lowest_id BigInt, highest_id BigInt

Run a variation of this to populate that table:

FROM
  (
     SELECT id, community_id, published,
        rank() OVER (
           PARTITION BY community_id
           ORDER BY published DESC, id DESC
           )
     FROM post_aggregates) ranked_recency
WHERE rank <= 1000

Against every sort order, including OLD. Capture only two BigInt results: the MIN(id) and the MAX(id) - that will give a range over the whole table. Then every SELECT on post_aggregates / post table includes a WHERE id >= lowest_id AND id <= highest_id

That would put in a basic sanity check that ages-out content, and it would be right against the primary key!

 

My concern is that Lemmy is not scaling and was not tested with enough postings in the database. These "nice to have" slick UI features might have worked when the quantity of postings was much smaller, but it puts a heavy real-time load on the database to search postings that keep growing in table size every day.

I also suggest that this kind of feature be discussed with smartphone app and laternate webapp creators - as it can really busy up a server dong text pattern matches on all prior posting content.

 

Often the error messages are not of the full-page "Error" format ,but toast messages on the bottom left corner of the webpage saying that JSON can not parse "Time..." responses from the server.

 

In June, the word form the project has generally been "go create empty new instances to solve scaling", and now there are a lot of instances that may be giving up or shutting down in the coming months.

A procedure and code for moving a community to a new home is a wish.

 

At minimum, the Instance installer guides and documentation needs to reflect that your own server doing real-time searches should be a conscious choice.

Lemmy.ml is erroring out already on searches, GitHub issue: https://github.com/LemmyNet/lemmy/issues/3296

And if I'm seeing things right, 0.18 changed the loging from 0.17 where searching for a community https://lemmy.ml/communities now does a full-text posting/comment search? It gives a ton of output I don't want, and it's hitting the database even harder. Instead of just searching a single database table, now it is going into comments and postings.

Reddit was legendary for not having their own in-build comment and posting search engine up to date - and also used PostgreSQL (at least they did when still open source). It is a huge amount of data and I/O processing.

 

Can we get some Rust guys to chime in on the SQL wrapping that Rust is doing?

Can I butcher the Lemmy code to get the Rust code to measure the time it is taking for each SQL activity and return it in the API calls?

Also the outbound and inbound federation connections to peer servers. Can we hack in some logging on how long each of these is taking, outside the RUST_LOG stuff going on?

 

The first on the list is indeed "active", but the next 3 are all pretty stale.

view more: ‹ prev next ›