Good results with this approach. I hadn't considered the RANK OVER PARTITION BY criteria_a values and it works like a champ. It moves the ORDER BY into the realm of focus (criteria_a) and performance seems decent enough... and it isn't difficult to read the short statement.
SELECT COUNT(ranked_recency.*) AS post_row_count
FROM
(
SELECT id, post_id, community_id, published,
rank() OVER (
PARTITION BY community_id
ORDER BY published DESC, id DESC
)
FROM post_aggregates) ranked_recency
WHERE rank <= 1000
;
Gives me the expected results over the 5+ million test rows I ran it against.
If you could elaborate on your idea of TOP, please do. I'm hoping there might be a way to wall the LIMIT 1000 into the inner query and not have the outer query need to WHERE filter rank on so many results?
3 hours later... I put it into code and am experimenting with it. Some proof of concept results: https://github.com/LemmyNet/lemmy/files/12373819/auto_explain_list_post_community_0_18_4_dullbananas_with_inclusion_run0a.txt