overview for RoundSparrow

Will federated material be updated, after a reinstall? in c/lemmy_support@lemmy.ml

[–] RoundSparrow@lemmy.ml 1 points 2 years ago (6 children)

GitHub issue about this topic: https://github.com/LemmyNet/lemmy/issues/3782

Some of it might be avoided by tweaking the PostgreSQL database with higher key values after the new install, but the whole situation isn't really planned for or recognized in the code.

Is it possible to partially disable federation to not pollute the server and keep focus? in c/lemmy_support@lemmy.ml

[–] RoundSparrow@lemmy.ml 1 points 2 years ago (8 children)

You really have blind faith that federation even works, while I've been validating data and pointing out that delivery is not reliable and that it carries so much overhead it crashes servers.

*Permanently Deleted* in c/selfhosted@lemmy.world

[–] RoundSparrow@lemmy.ml 14 points 2 years ago* (last edited 2 years ago) (3 children)

This basically shuts my idea down

It's not very difficult to modify the code for something like this... and closing off registration won't let anyone else log in and create new content from your instance.

Personally, the load that one more instance subscribing to everything puts on the major servers is why I think people should back off from creating more than the 1500 instances the Lemmy network already has. Every single vote, comment, and post is delivered 24 hours a day just so one person can read content for an hour or two a day.

That makes sense for email systems where all that content doesn't have to be sent, but for Lemmy it's a huge amount of overhead.

*Permanently Deleted* in c/lemmy@lemmy.ml

[–] RoundSparrow@lemmy.ml 2 points 2 years ago

ok, I'm going to delete this post. People actually aren't discussing privacy and are just debating if they think Lemmy needs Multi-Reddit. And I just want to get the code finished. I am probably moving ahead on the code with ZERO sharing of any existing data.

*Permanently Deleted* in c/lemmy@lemmy.ml

[–] RoundSparrow@lemmy.ml 0 points 2 years ago

dismissing client-side techniques as nonsense without seeming to understand why they were being discussed in the first place.

I'm the one who started a post about a server-side solution that is based entirely on Reddit's code for a server-side solution. YOU are the one coming along with this wild idea that a server change isn't needed at all. Yet you have not demonstrated this wild claim you made!

I’m not interested in any multireddit feature that reduces sub privacy. I’d consider it a net loss for lemmy.

It does NOT require it. I will repeat it: IT IS NOT REQUIRED! It is a sub-feature that facilitates better openness, and I am suggesting it be added as part of the core feature I'm developing.

On Reddit, multi-reddits are personal in nature.

Ten years ago Reddit announced the feature as not being personal at all! Sharing them was the whole point. I again question if you even understand what multi-reddit is!

*Permanently Deleted* in c/lemmy@lemmy.ml

[–] RoundSparrow@lemmy.ml 1 points 2 years ago

shouldn’t require relaxing privacy constraints in any case.

It isn't at all essential to the feature.

I have already coded it so that it does NOT require sharing of anyone's data, at all. No way shape or form. I'm proposing it as a discussion topic because it's easy to implement and goes along with the whole spirit of a public forum where people share their public stuff. That people might actually want an easy way to help others out...

But it's easier for me to just avoid any privacy topic entirely and not allow sharing of anything. Just build the whole design as opt-in only, starting from empty lists.

*Permanently Deleted* in c/lemmy@lemmy.ml

[–] RoundSparrow@lemmy.ml 2 points 2 years ago (2 children)

I’m suggesting that multireddits are a “local” function. They are so local that they’re possible without server-side support at all,

Again, how? If I want a blend of 50 different communities, how can Reddit or Lemmy do that without 50 API calls if you do not add server-side MultiReddit code?

50 API calls is the overhead and nonsense that is being avoided here....

*Permanently Deleted* in c/lemmy@lemmy.ml

[–] RoundSparrow@lemmy.ml 1 points 2 years ago

It could also be a filtered view based on the subscribed/all feed which provides a single API call that can return material from multiple communities.

"that can return material from multiple communities" - that's exactly how Reddit does multi-reddit, what feature do you think multi-reddit is?

*Permanently Deleted* in c/lemmy@lemmy.ml

[–] RoundSparrow@lemmy.ml 2 points 2 years ago

But it should definitely be off by default and have a clear warning when you try to enable it.

I was afraid people would say that. The easier way is to just not touch it at all, as adding new code to opt in/opt out means more Rust programming, and Rust developers are in short supply.

The easiest solution is to avoid it and not introduce sharing of personal communities at all. Which was what I was afraid this discussion would yield. So we start fresh with empty MultiPass lists and build them up from scratch.

*Permanently Deleted* in c/lemmy@lemmy.ml

[–] RoundSparrow@lemmy.ml 2 points 2 years ago

the amount of low-effort drive-by comments and off-topic posts communities get just because they are similarly named is bad enough as it is.

which is why I actually want it.

I think a well-cultivated list of quality communities that people share is a means to escape the heavy amount of noise that grew out of the explosion in the number of low-effort barely-any-moderation instances.

Another way to look at this feature is really simple: multiple subscribe lists, the ability to organize what you subscribe to into your cultivated groups. I don't see why anyone thinks a limitation of having only one community list per login is beneficial in organizing the duplicate choices all over the place.

*Permanently Deleted* in c/lemmy@lemmy.ml

[–] RoundSparrow@lemmy.ml 2 points 2 years ago* (last edited 2 years ago) (1 children)

why does a multi-reddit need multiple instances to collaborate to create the feed?

by "create the feed", I assume you mean "provide posts" when API call post/list is called?

content is replicated in all federated instances. You only need to use the local copy and merge all the communities of the multi-reddit.

Yes, that is what MultiPass would do, query the local PostgreSQL database. Right now Lemmy only allows this for a single Subscribe/Follow list per user... you have to create 3 different logins if you want 3 different lists of communities. For example, a "games" list, "music" list, "news" list.... Plus, the current design does not accommodate logged-out users, they have no way to list multiple communities (other than "All", local or merged remote+local).
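To make the "several named lists per login" idea concrete, here is a minimal in-memory sketch. It is not Lemmy code: `MultiLists`, `Post`, and the field names are made up for illustration, and the slice of posts stands in for what would be a single query against the local PostgreSQL database.

```rust
use std::collections::{HashMap, HashSet};

/// A post as stored locally; fields are illustrative, not Lemmy's schema.
#[derive(Debug, Clone, PartialEq)]
struct Post {
    id: u32,
    community: String,
    published: u64, // seconds since epoch
}

/// Hypothetical per-user store holding several named community lists.
struct MultiLists {
    lists: HashMap<String, HashSet<String>>,
}

impl MultiLists {
    fn new() -> Self {
        MultiLists { lists: HashMap::new() }
    }

    /// Add a community to a named list, creating the list if needed.
    fn add(&mut self, list: &str, community: &str) {
        self.lists
            .entry(list.to_string())
            .or_default()
            .insert(community.to_string());
    }

    /// One pass over the local posts, newest first, restricted to one list.
    fn feed(&self, list: &str, local_posts: &[Post]) -> Vec<Post> {
        let Some(wanted) = self.lists.get(list) else {
            return Vec::new(); // unknown list: empty feed
        };
        let mut out: Vec<Post> = local_posts
            .iter()
            .filter(|p| wanted.contains(&p.community))
            .cloned()
            .collect();
        out.sort_by(|a, b| b.published.cmp(&a.published));
        out
    }
}
```

With a "games" list and a "music" list on the same login, `feed("games", &posts)` returns only the posts from the communities in that list, without needing a second account.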

*Permanently Deleted* in c/lemmy@lemmy.ml

[–] RoundSparrow@lemmy.ml 2 points 2 years ago* (last edited 2 years ago) (6 children)

Multi-reddits as they exist on Reddit itself could be implemented entirely client-side, the server side stuff just syncs the behavior of multiple client apps.

Can you explain how? The only way I can see this working is if you made 50 different API requests, one for each of the 50 subreddits, merged the results, and then sorted them again into the desired order.
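For reference, the client-side approach described above would look roughly like this. It is only a sketch: `Post` and `merge_feeds` are invented names, and each inner `Vec` stands in for the response of one API request to one community.

```rust
/// Minimal post record; field names are illustrative.
#[derive(Debug, Clone, PartialEq)]
struct Post {
    id: u32,
    published: u64, // seconds since epoch
}

/// Merge one response per community into a single newest-first page.
/// Returns the page and the number of requests it cost to build it.
fn merge_feeds(responses: Vec<Vec<Post>>, limit: usize) -> (Vec<Post>, usize) {
    let requests = responses.len();
    let mut all: Vec<Post> = responses.into_iter().flatten().collect();
    // Newest first; ties broken by id so the order is deterministic.
    all.sort_by(|a, b| b.published.cmp(&a.published).then(a.id.cmp(&b.id)));
    all.truncate(limit);
    (all, requests)
}
```

The request count returned alongside the page is the point: a blend of 50 communities costs 50 requests per page, and the client still has to re-sort everything it received.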

53

Lemmy Rust code needs help, Diesel ORM quick change, adding a select field to SiteAggregates::read (lemmy.ml)

submitted 2 years ago* (last edited 2 years ago) by RoundSparrow@lemmy.ml to c/rust@lemmy.ml

8 comments fedilink

We have an urgent performance problem to get finished. The SQL changes are fine, but it seems the Lemmy test code in Rust is defective. This test is failing after we fixed a faulty stored procedure function in PostgreSQL: https://github.com/LemmyNet/lemmy/blob/13a866aeb0c24f20ed18ab40c0ea5616ef910676/crates/db_schema/src/aggregates/site_aggregates.rs#L157

The underlying Rust code needs to be enhanced to query the SQL table with SELECT * FROM site_aggregates WHERE site_id = 1. A hard-coded 1 is fine, since that is always the local site in Lemmy.

Can you please detail all the code changes so that the read method takes an integer parameter for site_id field?

https://github.com/LemmyNet/lemmy/blob/13a866aeb0c24f20ed18ab40c0ea5616ef910676/crates/db_schema/src/aggregates/site_aggregates.rs#L10C7-L10C7

Right now the query has no WHERE clause, pulling the first row it gets. Thank you.
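To illustrate the difference being asked for, here is a plain-Rust sketch with no database involved. The struct fields and both function names are placeholders, not the real `SiteAggregates` definition. In Diesel the change would presumably be a `.filter(...)` on the `site_id` column before the `.first(...)` call, but that is untested here.

```rust
/// Placeholder row type; the real struct has more fields.
#[derive(Debug, Clone, PartialEq)]
struct SiteAggregates {
    site_id: i32,
    users: i64,
    posts: i64,
}

/// Behaviour as described: no WHERE clause, so whichever row comes back first.
fn read_first(rows: &[SiteAggregates]) -> Option<&SiteAggregates> {
    rows.first()
}

/// Requested behaviour: the row for one specific site.
fn read_for_site(rows: &[SiteAggregates], site_id: i32) -> Option<&SiteAggregates> {
    rows.iter().find(|r| r.site_id == site_id)
}
```

If the table happens to return a remote site's row first, `read_first` hands back the wrong counts, while `read_for_site(rows, 1)` does not depend on row order.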

7

Lemmy Server and language choices on every individual comment, many rows in the database per-community, per-site, etc. Overhead of a comment INSERT SQL (lemmy.ml)

submitted 2 years ago* (last edited 2 years ago) by RoundSparrow@lemmy.ml to c/lemmyperformance@lemmy.ml

8 comments fedilink

Over a short period of time, this is my incoming federation activity for new comments. pg_stat_statements output is being shown. It is interesting to note that these two INSERT statements on comments differ only in the DEFAULT value of the language column. Also note that the average execution time is far higher (4.3 vs. 1.28) when the language value is set, I assume due to INDEX updates on the column? Or possibly a TRIGGER?

About half of the comments coming in from other servers have the default value.

WRITES are heavy, even if it is an INDEX that has to be revised. So INSERT and UPDATE statements are important to scrutinize.

7

REQUEST community review of Lemmy Server Performance on Post Votes, Comment Votes - the most frequent database writes. Optimize? (lemmy.ml)

submitted 2 years ago* (last edited 2 years ago) by RoundSparrow@lemmy.ml to c/lemmyperformance@lemmy.ml

2 comments fedilink

Given how frequently these records are created, one for every vote by a user, I think it is important to study and review how it works.

The current design of lemmy_server 0.18.3 is to issue a SQL DELETE before (almost?) every INSERT of a new vote. The INSERT already has an UPDATE clause on it.

This is one of the few places in Lemmy that a SQL DELETE statement actually takes place. We have to be careful triggers are not firing multiple times, such as decreasing the vote to then immediately have it increase with the INSERT statement that comes later.

For the insert of a comment, Lemmy doesn't seem to routinely run a DELETE before the INSERT. So why was this design chosen for votes? The likely reason is that a user can "undo" a vote and have the record of them ever voting removed from the database. Is that the actual behavior in testing?

pg_stat_statements from an instance doing almost entirely incoming federation activity of post/comments from other instances:

DELETE FROM "comment_like" WHERE (("comment_like"."comment_id" = $1) AND ("comment_like"."person_id" = $2)) executed 14736 times, with 607 matching records.
INSERT INTO "comment_like" ("person_id", "comment_id", "post_id", "score") VALUES ($1, $2, $3, $4) ON CONFLICT ("comment_id", "person_id") DO UPDATE SET "person_id" = $5, "comment_id" = $6, "post_id" = $7, "score" = $8 RETURNING "comment_like"."id", "comment_like"."person_id", "comment_like"."comment_id", "comment_like"."post_id", "comment_like"."score", "comment_like"."published" executed 15883 times - each time transacting.
update comment_aggregates ca set score = score + NEW.score, upvotes = case when NEW.score = 1 then upvotes + 1 else upvotes end, downvotes = case when NEW.score = -1 then downvotes + 1 else downvotes end where ca.comment_id = NEW.comment_id TRIGGER FUNCTION update executing 15692 times.
update person_aggregates ua set comment_score = comment_score + NEW.score from comment c where ua.person_id = c.creator_id and c.id = NEW.comment_id TRIGGER FUNCTION update, same executions as previous.

The execution counts are not equal, and there is some understanding to gain from that difference.
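One way to reason about those counts is a toy in-memory model. This is not Lemmy's code and does not touch PostgreSQL. It assumes the aggregate trigger only runs for rows that are really inserted or deleted, and not for rows that take the ON CONFLICT DO UPDATE path. `VoteModel` and its counters are invented for illustration.

```rust
use std::collections::HashMap;

/// Toy stand-in for the comment_like table plus one aggregate score.
#[derive(Default)]
struct VoteModel {
    likes: HashMap<(u32, u32), i32>, // (comment_id, person_id) -> score
    aggregate_score: i32,
    insert_trigger_runs: u32,
    deletes_issued: u32,
    deletes_matched: u32,
}

impl VoteModel {
    /// Models: DELETE ... WHERE comment_id = $1 AND person_id = $2
    fn delete(&mut self, comment_id: u32, person_id: u32) {
        self.deletes_issued += 1;
        if let Some(old) = self.likes.remove(&(comment_id, person_id)) {
            self.deletes_matched += 1;
            self.aggregate_score -= old; // delete-side trigger
        }
    }

    /// Models: INSERT ... ON CONFLICT DO UPDATE.
    /// Assumption: the aggregate trigger runs only on a real insert.
    fn upsert(&mut self, comment_id: u32, person_id: u32, score: i32) {
        let key = (comment_id, person_id);
        if self.likes.insert(key, score).is_none() {
            self.insert_trigger_runs += 1;
            self.aggregate_score += score; // insert-side trigger
        }
        // On conflict the row is overwritten and the aggregate is untouched.
    }
}
```

Under that assumption, most deletes match nothing (the first vote on a comment has no row to remove), a delete followed by an upsert keeps the aggregate correct when a vote is flipped, and an upsert without the preceding delete updates the row without running the insert-side trigger. That would be one possible reason for inserts outnumbering trigger executions, but it needs checking against the real trigger definitions.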

5

Lemmy convention for linking individual posts and comments should include the username of the post/comment creator, giving people credit in the URL itself (lemmy.ml)

submitted 2 years ago by RoundSparrow@lemmy.ml to c/lemmywishlist@lemmy.ml

1 comments fedilink

e.g. lemmyserver .social / LemmyFanatic / post / xxxxx

Same with comments

219

GREAT NEWS about Lemmy Server Performance, another major SQL mistake has been discovered today: every single comment & post create (INSERT) is updating ~1700 rows in the site_aggregates table (lemmy.ml)

submitted 2 years ago* (last edited 2 years ago) by RoundSparrow@lemmy.ml to c/lemmyperformance@lemmy.ml

31 comments fedilink

Details here: https://github.com/LemmyNet/lemmy/issues/3165

Fixing this will VASTLY decrease the server load of I/O for PostgreSQL, as this mistaken code is doing writes of ~1700 rows (one for each known Lemmy instance in the database) on every single comment & post creation. Because these are writes, they create record-locking issues, which are harsh on the system. Once this is fixed, some site operators will be able to downgrade their hardware! ;)
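The write amplification can be shown with a small stand-alone sketch. Nothing here is the real trigger: `bump_all` and `bump_site` are made-up names, and the vector stands in for the site_aggregates table with one row per known instance.

```rust
/// Placeholder for one site_aggregates row.
#[derive(Debug, Clone)]
struct SiteAggregates {
    site_id: i32,
    comments: i64,
}

/// Mistaken form: no row filter, so every row is written.
/// Returns the number of rows written.
fn bump_all(rows: &mut [SiteAggregates]) -> usize {
    for r in rows.iter_mut() {
        r.comments += 1;
    }
    rows.len()
}

/// Scoped form: only the row for the given site is written.
fn bump_site(rows: &mut [SiteAggregates], site_id: i32) -> usize {
    let mut written = 0;
    for r in rows.iter_mut().filter(|r| r.site_id == site_id) {
        r.comments += 1;
        written += 1;
    }
    written
}
```

With 1700 rows in the table, `bump_all` reports 1700 writes for a single new comment and `bump_site(rows, 1)` reports 1.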

13

FYI: For weeks, Lemmy deleting of comments by the user who created the comment has not been working properly. When users were on just a couple of servers it worked, but now with over 1,000 servers... (lemmy.ml)

submitted 2 years ago* (last edited 2 years ago) by RoundSparrow@lemmy.ml to c/mods@lemmy.world

2 comments fedilink

Cite: 1,3xx Lemmy servers federating https://join-lemmy.org/instances

The code in Lemmy does not seem to federate to all the subscribed servers when an end-user on a remote server deletes their own comment. So far, I can't determine if it has always been this way and nobody noticed, or if something broke at some point in Lemmy code changes. Has it been going on for weeks, or did it always fail?

Please be kind to users, they may have intended to delete their own comment and Lemmy isn't properly removing it on over 1000 servers and only removes it on 2 servers.

This happens when a community is homed on a server different from the server the end-user is commenting (and deleting that comment) on.

Growing pains.

GitHub issue: https://github.com/LemmyNet/lemmy/issues/3625

6

Trying to add a second logging target for Lemmy 0.18.2 Rust code (lemmy.ml)

submitted 2 years ago* (last edited 2 years ago) by RoundSparrow@lemmy.ml to c/rustlang@lemmyrs.org

1 comments fedilink

I need some help here trying to add a second logging subscriber for a specific target to the Lemmy server Rust code.

Here is the default logging in the app: https://github.com/LemmyNet/lemmy/blob/main/src/lib.rs

Step 1 ==============
I know I have to add another library to log to a file.
cargo add tracing-appender

Step 2 ===============
I know I have to specify how I want the files to work, I found this pile of code:

  let env_filter = EnvFilter::try_from_default_env().unwrap_or_else(|_| EnvFilter::new("info"));
  let formatting_layer = fmt::layer().pretty().with_writer(std::io::stderr);
  let log_file_name = Local::now().format("%Y-%m-%d").to_string() + "-apub.log";
  let file_appender = rolling::daily("/home/lemmy/logs", log_file_name);
  let (non_blocking_appender, _guard) = non_blocking(file_appender);
  let file_layer = fmt::layer()
      .with_ansi(false)
      .with_writer(non_blocking_appender);

  Registry::default()
      .with(env_filter)
      .with(ErrorLayer::default())
      .with(formatting_layer)
      .with(file_layer)
      .init();

Now this isn't right, because it registers itself as "default", and I want it to be a Target - and I still want the normal Lemmy logging behavior to exist.

I want the macros to work like:

warn!("this is how normal Lemmy server log entries are created in the current code");  
warn!(target: "apubfile", "this logging entry only goes to the apub file logging using tracing-appender");

Can someone work this out? How to have two subscribers, not just the single default, and how to specify the target: string on the subscriber?

Thank you.

EDIT: ok, I found an example of how to have two logs at the same time, one to file and one to console: https://stackoverflow.com/questions/76042603/how-to-unify-the-time-in-the-console-and-the-file-when-using-tracing-appender -- I still need to figure out how to get this into Lemmy's structure and attach to a "target".
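To pin down the behavior being asked for, here is a std-only model of routing by target. It does not use tracing at all: `Record` and `Router` are invented for illustration, and the two vectors stand in for the console and the apub log file. In tracing-subscriber, the equivalent would likely be a per-layer filter on each `fmt::layer()` (for example with `filter::Targets`), which I have not wired into Lemmy here.

```rust
/// One log record; `target` mirrors the `target:` argument of the macros.
struct Record<'a> {
    target: &'a str,
    message: &'a str,
}

/// Two in-memory sinks standing in for the console and the apub file.
#[derive(Default)]
struct Router {
    console: Vec<String>,
    apub_file: Vec<String>,
}

impl Router {
    /// Send "apubfile" records to the file sink only; everything else
    /// goes to the console sink only.
    fn dispatch(&mut self, record: Record<'_>) {
        let line = format!("[{}] {}", record.target, record.message);
        if record.target == "apubfile" {
            self.apub_file.push(line);
        } else {
            self.console.push(line);
        }
    }
}
```

The design point is that each sink gets its own filter, instead of one global filter in front of both, so normal Lemmy log entries keep their current behavior.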

8

Trying to get a better error message out of Lemmy 0.18.1 unmatched enum (lemmy.ml)

submitted 2 years ago* (last edited 2 years ago) by RoundSparrow@lemmy.ml to c/rustlang@lemmyrs.org

6 comments fedilink

I don't know Rust, but I am trying to hack on Lemmy 0.18.1 enough to get a better error message out.

error: data did not match any variant of untagged enum AnnouncableActivities

where: crates/apub/src/activities/community/announce.rs, line: 46

https://github.com/LemmyNet/lemmy/blob/0c82f4e66065b5772fede010a879d327135dbb1e/crates/apub/src/activities/community/announce.rs#L46

That seems to be the function parameters themselves?

Is the error caused by RawAnnouncableActivities not matching the enum AnnouncableActivities and the try_into?

  warn!("zebratrace receive {:?}", self);

Works for adding logging, but I'd like the code to log self only when the enum does not match (errors). Thank you.
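The general shape of "log only on failure" can be sketched with stand-in types. `RawActivity`, `KnownActivity`, and `receive` below are hypothetical and much simpler than the real `RawAnnouncableActivities` and `AnnouncableActivities`, and `eprintln!` is used in place of `warn!` so the sketch needs no crates.

```rust
use std::convert::TryFrom;

/// Hypothetical stand-in for the raw incoming activity.
#[derive(Debug, Clone)]
struct RawActivity {
    kind: String,
    body: String,
}

/// Hypothetical stand-in for the enum of accepted activities.
#[derive(Debug, PartialEq)]
enum KnownActivity {
    Create(String),
    Delete(String),
}

impl TryFrom<RawActivity> for KnownActivity {
    type Error = String;

    fn try_from(raw: RawActivity) -> Result<Self, Self::Error> {
        match raw.kind.as_str() {
            "Create" => Ok(KnownActivity::Create(raw.body)),
            "Delete" => Ok(KnownActivity::Delete(raw.body)),
            other => Err(format!("no variant matches kind {other:?}")),
        }
    }
}

/// Convert, and dump the raw value only when the conversion fails.
fn receive(raw: RawActivity) -> Result<KnownActivity, String> {
    // Keep a copy because try_from consumes the value.
    let copy = raw.clone();
    KnownActivity::try_from(raw).map_err(|e| {
        eprintln!("zebratrace receive failed: {e}; raw = {copy:?}");
        e
    })
}
```

The logging lives inside `map_err`, so successful conversions produce no output and the error is still passed back to the caller unchanged.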

6

Moderator removal of off-topic posts may not be working on all instances subscribed to a community - anyone confirm? (lemmy.ml)

submitted 2 years ago by RoundSparrow@lemmy.ml to c/mods@lemmy.world

11 comments fedilink

I've been spot-checking consistency of replication between servers by looking at the same community on two Lemmy instances, sorting posts by new.

!asklemmy@lemmy.ml is one of the most active communities and people frequently use it incorrectly, asking technical questions about Lemmy instead of "asking the Lemmy community" for their feedback on a topic.

In spot-checking Lemmy servers, it seems that the moderator removal of the posting is only working on the "home" server, Lemmy.ml - and other servers still show the posting in the New sort list (and I assume other sort choices? unverified).

I opened a GitHub issue about the problem: https://github.com/LemmyNet/lemmy/issues/3535

I also thought this would be a good place to raise the topic. Does anyone recall ever testing and spot-checking that moderator removal of a posting worked on multiple servers before?

Thank you.

8

Lemmy scaling/performance: Move expensive PostgreSQL triggers to scheduled jobs. · GitHub Issue #3528 · LemmyNet/lemmy (github.com)

submitted 2 years ago by RoundSparrow@lemmy.ml to c/lemmyperformance@lemmy.ml

2 comments fedilink

7

Information Overload - Beehaw style - Beehaw (beehaw.org)

submitted 2 years ago by RoundSparrow@lemmy.ml to c/lemmyperformance@lemmy.ml

0 comments fedilink

8

lemmy_server Rust code now exposes internal metrics via Prometheus endpoint (github.com)

submitted 2 years ago by RoundSparrow@lemmy.ml to c/lemmyperformance@lemmy.ml

1 comments fedilink