FaceDeer

joined 2 years ago
[–] FaceDeer@fedia.io 59 points 7 months ago (2 children)
  • Dementia-addled brain latches onto a stupid obsession for random reasons and everyone ends up trying to figure out what 4-D chess move is really behind it.

It's still a problem the world needs to deal with, but I don't think it's as deep as everyone thinks.

[–] FaceDeer@fedia.io 2 points 7 months ago

Who's paying for the electricity that it is using, then?

[–] FaceDeer@fedia.io 20 points 7 months ago (1 children)

I was reading the other day about advances in zinc-ion batteries as a possible replacement for lithium-ion batteries in applications like this. They're heavier than lithium-ion, which is just fine for energy storage facilities like this, but they retain their capacity through a lot more charge/discharge cycles. The article I was reading said they drop to 80% capacity after 100,000 cycles; at one cycle a day that's about 274 years. Most importantly for this specific situation, they're not flammable.
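
A quick back-of-the-envelope check of that cycle-life figure, as a minimal sketch. The 100,000-cycle number is the one quoted from the article; the function name and the 365.25-day year are assumptions made for illustration.

```python
# Rough service-life estimate from a quoted cycle count.
CYCLES_TO_80_PERCENT = 100_000  # figure quoted from the article
DAYS_PER_YEAR = 365.25          # assumption: average year length


def years_of_service(total_cycles: int, cycles_per_day: float = 1.0) -> float:
    """Years until total_cycles are used up at the given cycling rate."""
    return total_cycles / cycles_per_day / DAYS_PER_YEAR


years = years_of_service(CYCLES_TO_80_PERCENT)
print(f"about {years:.0f} years at one cycle a day")
```

Cycling twice a day would halve the figure, which is still well over a century.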

[–] FaceDeer@fedia.io 6 points 7 months ago (1 children)

No, at best the genocide switched out of "fast mode" and back into "slow mode" again.

[–] FaceDeer@fedia.io 1 points 7 months ago

The site producing the nonsense has to produce lots of it any time a bot comes along, while the trainers only have to filter it once. As others have pointed out, it's likely easy for an automated filter to spot. I don't see it as being a clear win.

[–] FaceDeer@fedia.io 3 points 7 months ago

It's a blow to the big closed-source AI companies, sure, but hardly a knockout one. If a small company can use a million dollars to produce a neat model, perhaps a big company can use those same techniques and a billion dollars to produce a really neat model. Or at least build a lot more of the infrastructure that goes around those models and makes use of them. GitHub Copilot isn't just selling a raw LLM API; it's selling integration into the Microsoft coding ecosystem. The big companies may have wasted some money on their current-generation AIs, but that's just sunk cost. They've got more money to spend on future AIs.

The main problem will be if Western AI companies are prevented from adopting the techniques being used by these Chinese AI companies. If, for example, there are lots of onerous regulations restricting what training data can be used or requiring extreme "safety guardrails." The United States seems likely to be getting rid of a lot of those sorts of obstructions over the next few years, though, so I wouldn't count the West out yet.

[–] FaceDeer@fedia.io 9 points 7 months ago

> I think it was the 1B model

Well there you go, you took a jet ski and then complained that it was having difficulty climbing steep inclines in mountains.

Small models like that are not going to "know" much. Their purpose is generally to process whatever information you give them. For example, you could use one to quickly and cheaply categorize documents based on their contents, or as a natural-language interface for executing commands on other tools.

[–] FaceDeer@fedia.io 2 points 7 months ago

I'd be happy with it. It means that the universe is ours for the taking and the future will belong to our descendants.

If there are already intelligent aliens "out there" then they've got millions or billions of years' head start on us and we'll never catch up. We'd be completely at their mercy.

[–] FaceDeer@fedia.io 2 points 7 months ago (2 children)

No, a few million hits from bots is routine for anything that's facing the public at all. Others have posted in this thread (or in others like it; this article's been making the rounds a lot in the past few days) that even the most basic of sites can get that sort of bot traffic, and that a simple recursion depth limit in the crawler's settings is enough to avoid the "infinite maze" aspect.
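
To illustrate the depth-limit point, here is a minimal sketch of a crawler that stops following links past a fixed number of hops. Everything in it is hypothetical: `crawl`, `maze_links`, and the `maze.example` URL are made up for the example, and the "maze" is simulated in memory rather than fetched over the network.

```python
from collections import deque
from typing import Callable, Iterable


def crawl(start: str,
          get_links: Callable[[str], Iterable[str]],
          max_depth: int) -> list[str]:
    """Breadth-first crawl that never expands pages max_depth hops from start."""
    seen = {start}
    queue = deque([(start, 0)])
    visited = []
    while queue:
        url, depth = queue.popleft()
        visited.append(url)
        if depth >= max_depth:
            continue  # depth limit reached: do not follow this page's links
        for link in get_links(url):
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return visited


def maze_links(url: str) -> list[str]:
    """Hypothetical 'infinite maze': every page links to two brand-new pages."""
    return [f"{url}/a", f"{url}/b"]


pages = crawl("https://maze.example", maze_links, max_depth=3)
print(len(pages))  # 1 + 2 + 4 + 8 = 15 pages, then the crawl stops
```

Without the `max_depth` check the loop would never terminate on a site like this; with it, the cost to the crawler is bounded no matter how many pages the site is willing to generate.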

As for AI training, the access log says nothing about that. As I said, AI training sets are not made by just dumping giant piles of randomly scraped text on AIs any more. If a trainer scraped one of those "infinite maze" sites the quality of the resulting data would be checked, and if it was generated by anything remotely economical for the site to be running it'd almost certainly be discarded as junk.
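
As a toy illustration of what an automated quality check might look like, here is a sketch of one very crude heuristic: the share of common function words in a text. This is an assumption for illustration only, not a description of any actual training pipeline; the word list, threshold, and function names are all made up, and a heuristic this simple would only catch word-salad style junk.

```python
# Small hand-picked list of common English function words (an assumption).
STOP_WORDS = frozenset(
    "the a an of to in and is it that for on with as was be are this".split()
)


def stop_word_ratio(text: str) -> float:
    """Fraction of words in text that are common function words."""
    words = [w.strip(".,;:!?\"'()").lower() for w in text.split()]
    words = [w for w in words if w]
    if not words:
        return 0.0
    return sum(w in STOP_WORDS for w in words) / len(words)


def looks_like_junk(text: str, min_ratio: float = 0.2) -> bool:
    """Flag text whose function-word share falls below an arbitrary threshold."""
    return stop_word_ratio(text) < min_ratio


prose = "The quality of the data is checked before it is used for training."
salad = "purple engine cloud seventeen gravel lantern orbit mustard"
print(looks_like_junk(prose), looks_like_junk(salad))
```

Text produced by a better generator would pass a check like this, so a real filter would presumably need to combine several signals.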

[–] FaceDeer@fedia.io 1 points 7 months ago

An even easier way to hide stuff is to not put it online in the first place.
