It's a big problem with Reddit-like forums, where the comments are highlighted and presented much more densely than the article itself.
Older forums avoided this (unintentionally) by not sorting replies at all and by keeping the density of replies low.
I think there are graph/tree-based solutions for RAG, which ideally should have the books at the leaf nodes and their overlapping summaries as the parent nodes.
Kinda like a BST for the books you RAG from: you descend through the summaries to reach the right book.
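Here's a rough sketch of what I mean, not any particular library's implementation. It assumes plain word overlap as a stand-in for real embedding similarity, and `Node`, `score`, `retrieve`, and the toy `library` are all names I made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    text: str  # book text at a leaf, summary text at a parent
    children: list["Node"] = field(default_factory=list)


def score(query: str, text: str) -> int:
    """Toy relevance: number of distinct query words that appear in the text.

    Assumption: a real system would compare embeddings instead.
    """
    return len(set(query.lower().split()) & set(text.lower().split()))


def retrieve(root: Node, query: str) -> Node:
    """Walk down from the root, following the best-matching child each step."""
    node = root
    while node.children:
        node = max(node.children, key=lambda child: score(query, child.text))
    return node  # a leaf, i.e. a book


# Hypothetical toy library: two summary nodes, each covering two "books".
library = Node("all books", [
    Node("cooking recipes baking bread pasta", [
        Node("how to bake sourdough bread at home"),
        Node("fresh pasta and sauces from scratch"),
    ]),
    Node("programming computers python rust code", [
        Node("learning python for data analysis"),
        Node("systems programming in rust"),
    ]),
])

print(retrieve(library, "bake bread").text)
```

This only follows one branch per level, so a bad summary sends you to the wrong book. With overlapping summaries you'd probably want to keep the top few children at each level instead of just the best one.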