this post was submitted on 03 Nov 2025

TechTakes

Big brain tech dude got yet another clueless take over at HackerNews etc? Here's the place to vent. Orange site, VC foolishness, all welcome.

This is not debate club. Unless it’s amusing debate.

For actually-good tech, you want our NotAwfulTech community

founded 2 years ago

Want to wade into the sandy surf of the abyss? Have a sneer percolating in your system but not enough time/energy to make a whole post about it? Go forth and be mid: Welcome to the Stubsack, your first port of call for learning fresh Awful you’ll near-instantly regret.

Any awful.systems sub may be subsneered in this subthread, techtakes or no.

If your sneer seems higher quality than you thought, feel free to cut’n’paste it into its own post — there’s no quota for posting and the bar really isn’t that high.

The post-Xitter web has spawned so many “esoteric” right-wing freaks, but there’s no appropriate sneer-space for them. I’m talking redscare-ish, reality-challenged “culture critics” who write about everything but understand nothing. I’m talking about reply-guys who make the same 6 tweets about the same 3 subjects. They’re inescapable at this point, yet I don’t see them mocked (as much as they should be).

Like, there was one dude a while back who insisted that women couldn’t be surgeons because they didn’t believe in the moon or in stars? I think each and every one of these guys is uniquely fucked up and if I can’t escape them, I would love to sneer at them.

(Credit and/or blame to David Gerard for starting this.)

[–] sc_griffith@awful.systems 6 points 6 hours ago (3 children)

wild article about content scraping nonprofit common crawl

https://www.theatlantic.com/technology/2025/11/common-crawl-ai-training-data/684567/?gift=iWa_iB9lkw4UuiWbIbrWGQv84IP0_-K67yuVC013Fx4

tl;dr they've been faking deleting data upon request (in ways that I find very funny) and their head is noxious even for a tech bro

also is it just me or does SV have a particular gift for perverting the nonprofit concept

[–] froztbyte@awful.systems 2 points 2 hours ago

wasn't common crawl the one that pulled a similar trick to goog's "if you label a thing as $x we won't include you"[0]? I could swear I heard their name in association with some derpshit intake management stuff above and beyond the typical fundamental "free/open scraper set" problems

[0] - a tactic google first pulled with Streetview cars pulling in a pile of wifi beacons and tying them to location - "if you don't want it just rename your AP to '{prefix} - {apname}'". a reply that was just dumb and aggravating, but it also fucking sucks that basically no standards have taken this problem to heart in the ~15y since

[–] fullsquare@awful.systems 4 points 4 hours ago

He said that Common Crawl is “making an earnest effort” to remove content but that the file format in which Common Crawl stores its archives is meant “to be immutable. You can’t delete anything from it.”

makes me wonder if it's some crypto hangover

In 2023, he sent a letter urging the U.S. Copyright Office not “to hinder the development of intelligent machines” and included two illustrations of robots reading books.

cheerleaders for the creepiest weirdos in sv try to deflect criticism by becoming impossible to parody

[–] fullsquare@awful.systems 4 points 5 hours ago

sv has for some time had a peculiar understanding of this and also some other terms, like "consent", "ownership", "privacy", "safety",