“The dataset is too large to be of any realistic use as part of any effort to crack a given hash (it's simply too much low-quality data to successfully use in attacks), and the value of the data is negligible compared to good prepared wordlists and rulesets in the hands of a capable actor,” says Darren James, a senior product manager at Specops Software.
At 10 billion lines, you might have better luck just brute-forcing the old-fashioned way...
Even if the writer signed off on this, this should be illegal.
Actually, make every AI-generated poop list its sources. Corps wanted strong copyright laws; here they are.