brucethemoose

joined 1 year ago
[–] brucethemoose@lemmy.world 1 points 7 hours ago* (last edited 7 hours ago)

16GB

Is it 3000 series or newer?

If so, with exllamav3, you can squeeze 32Bs in that 16GB card with relatively little loss. For instance: https://huggingface.co/turboderp/EXAONE-4.0-32B-exl3/tree/3.0bpw

The 3bpw weights are 13 GB, say another 1.5GB for some q5_q4 context, and you are looking at 14.5GB-15GB or so. It will be tight, but it will be leagues smarter than 14Bs.
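A rough way to sanity-check that budget, as a sketch only: the helper names below are made up for illustration, and the raw estimate ignores layers stored at higher precision and any runtime overhead, which is presumably why the real 3bpw files land closer to 13 GB than 12.

```python
def raw_weights_gb(params_billions: float, bits_per_weight: float) -> float:
    """Back-of-envelope size of quantized weights in GB (1 GB = 1e9 bytes)."""
    return params_billions * bits_per_weight / 8


def fits_in_vram(weights_gb: float, cache_gb: float, vram_gb: float) -> bool:
    """True if weights plus context cache fit within the card's VRAM."""
    return weights_gb + cache_gb <= vram_gb


# Raw estimate for a 32B model at 3.0 bits per weight.
estimate = raw_weights_gb(32, 3.0)

# Numbers quoted in the comment: 13 GB of weights, ~1.5 GB of quantized context.
total = 13.0 + 1.5
print(f"raw estimate: {estimate:.1f} GB, quoted total: {total:.1f} GB")
print("fits on a 16GB card:", fits_in_vram(13.0, 1.5, 16.0))
```

The margin is only about 1.5 GB, so anything else using VRAM (a desktop session, a browser) can push it over.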

24B Mistral models will fit much more easily. No need to CPU offload those on a 16GB card, you just need to be careful with your settings.

[–] brucethemoose@lemmy.world 3 points 9 hours ago* (last edited 9 hours ago)

The link is in the webpage, but it might be bugged? It's not visible in vanilla Chrome or Firefox for me (or maybe just not visible on Linux).

https://webinstallers.gog-statics.com/download/GOG_Galaxy_2.0.exe

That's from the page's source.

[–] brucethemoose@lemmy.world 2 points 1 day ago

It’s a GitHub link, what did you expect?

[–] brucethemoose@lemmy.world 1 points 1 day ago* (last edited 1 day ago)

It should work in any generic CUDA container, but yeah it’s more of a hobbyist engine. Honestly I just run it raw since it’s dependency-free, except for system CUDA.

vLLM absolutely cannot CPU offload AFAIK, but small models will fit in your VRAM with room to spare.

[–] brucethemoose@lemmy.world 3 points 1 day ago* (last edited 1 day ago)

I have not had to really mess with CachyOS for over a year, while “stable” distros were a nightmare for me.

…Yeah it just depends what you’re trying to get your system to do. Arch can range from incredibly hazardous to “it just works” depending on the person and thing, and so can Mint. I think most distros should be viewed that way.

[–] brucethemoose@lemmy.world 3 points 2 days ago

I have trouble looking at my life when things are bad, TBH, especially dull/isolated kind of bad. It is not pleasant to write.

Writing some fiction for the sake of writing (and weaving personal issues in) has actually led to some insights, though. And maybe some feeling/reality processing going forward. And happiness.

And when things get better, I want to journal (which I have failed to make time for in the past).

[–] brucethemoose@lemmy.world 2 points 2 days ago* (last edited 2 days ago)

We will eventually have to change our economic system and adapt one with a much much lower consumption rate, figure ways to limit our population growth, or more than likely both.

Of course. 100% agree with this, even if better technology helps. It will have to be pretty soon.

But in the very short term? This is going to be a disaster, and the human population is shooting itself in the foot by not accepting immigration from extreme birth rate countries (where overpopulation is indeed an issue).

[–] brucethemoose@lemmy.world 11 points 2 days ago

I get the joke, but how does OnlyFans slide by?

https://www.reuters.com/world/us/us-whistleblower-says-mastercard-visa-failed-stop-payments-child-sex-abuse-2025-01-24/

WTF? Do they have dirt on finance execs or something?

…Actually that would make a lot of sense…

[–] brucethemoose@lemmy.world 4 points 2 days ago

Thanks, I nearly choked on my drink imagining that…

[–] brucethemoose@lemmy.world 5 points 2 days ago

Thanks for the TED talk (really)

[–] brucethemoose@lemmy.world 2 points 2 days ago* (last edited 2 days ago)

You have to “unlock” them with a lot of tweaks. And to be clear, I’m just saying they’re better than Windows. Ugh, trying to compile anything on Windows…

Hardware wise, they’re far better for local code assistants, too, with the exception of a few exotic AMD laptops just now coming out.

[–] brucethemoose@lemmy.world 1 points 2 days ago* (last edited 2 days ago) (2 children)

I cannot concentrate with music playing.

Literally anything else? Like a truck crashing through the window or someone literally trying to engage me? Zero lost focus, to the extent I was initially diagnosed with a hearing disorder.

Not that that helps get stuff done…

 

"We're seeing a unifying moment. The band is back together," MAGA podcaster Jack Posobiec told Axios.

"He gets attacked just relentlessly by the Wall Street Journal in such an uncalled for way, and we have his back 100% against this smearing and this slandering," Charlie Kirk added on his show.

 

Similar to: https://lemmy.world/post/32961209

But I find the extra quotes interesting:

Two sources told Axios the plan would include long-range missiles that could strike deep inside Russia.

Trump said Monday that whenever he speaks to Putin, "I always hang up and say, 'Well, that was a nice phone call.' And then missiles are launched into Kyiv or some other city. And after that happens three or four times, you say, 'Talk doesn't mean anything.'"

A bill circulating in the Senate would impose 500% tariffs on countries that buy Russian oil, but Trump suggested that number was too high and that he could impose 100% tariffs without Senate approval.

 

As to why it (IMO) qualifies:

"My children are 22, 25, and 27. I will literally fight ANYONE for their future," Greene wrote. "And their future and their entire generation's future MUST be free of America LAST foreign wars that provoke terrorists attacks on our homeland, military drafts, and NUCLEAR WAR."

Hence, she feels her support is threatening her kids.

"MTG getting her face eaten" was not on my 2025 bingo card, though she is in the early stage of face eating.

 

"It's not politically correct to use the term, 'Regime Change' but if the current Iranian Regime is unable to MAKE IRAN GREAT AGAIN, why wouldn't there be a Regime change??? MIGA!!"

 

Video is linked. SFW, but keep your volume down.

 
  • The IDF is planning to displace close to 2 million Palestinians to the Rafah area, where compounds for the delivery of humanitarian aid are being built.
  • The compounds are to be managed by a new international foundation and private U.S. companies, though it's unclear how the plan will function after the UN and all aid organizations announced they won't take part.
 

Qwen3 was apparently posted early, then quickly pulled from HuggingFace and Modelscope. The large ones are MoEs, per screenshots from Reddit:

screenshots

Including a 235B/22B active and a 30B/3B active.

Context appears to 'only' be 32K unfortunately: https://huggingface.co/qingy2024/Qwen3-0.6B/blob/main/config_4b.json

But it's possible they're still training them to 256K:

from reddit

Take it all with a grain of salt, configs could change with the official release, but it appears it is happening today.
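If you want to check the context length yourself once a config is up, here is a minimal sketch. It assumes the config uses the usual Hugging Face `max_position_embeddings` key; the sample JSON and the `context_length` helper are stand-ins for illustration, not the real file.

```python
import json
from typing import Optional

# Illustrative stand-in for a downloaded config.json; values are assumptions.
sample_config = '{"max_position_embeddings": 32768, "num_hidden_layers": 36}'


def context_length(config_text: str) -> Optional[int]:
    """Return the configured context length in tokens, or None if absent."""
    config = json.loads(config_text)
    return config.get("max_position_embeddings")


ctx = context_length(sample_config)
if ctx is not None:
    print(f"{ctx} tokens (~{ctx // 1024}K)")
```

Keep in mind this only shows what the config declares, which could change with the official release.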

 

This is one of the "smartest" models you can fit on a 24GB GPU now, with no offloading and very little quantization loss. It feels big and insightful, like a better (albeit dry) Llama 3.3 70B with thinking, and with more STEM world knowledge than QwQ 32B, but comfortably fits thanks to the new exl3 quantization!

Quantization Loss

You need to use a backend that supports exl3, like (at the moment) text-gen-web-ui or (soon) TabbyAPI.

 

"It makes me think that maybe he [Putin] doesn't want to stop the war, he's just tapping me along, and has to be dealt with differently, through 'Banking' or 'Secondary Sanctions?' Too many people are dying!!!", Trump wrote.

 

The U.S. expects Ukraine's response Wednesday to a peace framework that includes U.S. recognition of Crimea as part of Russia and unofficial recognition of Russian control of nearly all areas occupied since the 2022 invasion, sources with direct knowledge of the proposal tell Axios.

What Russia gets under Trump's proposal:

  • "De jure" U.S. recognition of Russian control in Crimea.
  • "De-facto recognition" of Russia's occupation of nearly all of Luhansk oblast and the occupied portions of Donetsk, Kherson and Zaporizhzhia.
  • A promise that Ukraine will not become a member of NATO. The text notes that Ukraine could become part of the European Union.
  • The lifting of sanctions imposed since 2014.
  • Enhanced economic cooperation with the U.S., particularly in the energy and industrial sectors.

What Ukraine gets under Trump's proposal:

  • "A robust security guarantee" involving an ad hoc group of European countries and potentially also like-minded non-European countries. The document is vague in terms of how this peacekeeping operation would function and does not mention any U.S. participation.
  • The return of the small part of Kharkiv oblast Russia has occupied.
  • Unimpeded passage of the Dnieper River, which runs along the front line in parts of southern Ukraine.
  • Compensation and assistance for rebuilding, though the document does not say where the funding will come from.

Whole article is worth a read, as it’s quite short/dense as Axios usually is. For those outside the US, this is an outlet that’s been well sourced in Washington for years.

 

Seems there's not a lot of talk about relatively unknown finetunes these days, so I'll start posting more!

Openbuddy's been on my radar, but this one is very interesting: QwQ 32B, post-trained on openbuddy's dataset, apparently with QAT applied (though it's kinda unclear) and context-extended. Observations:

  • Quantized with exllamav2, it seems to show lower distortion levels than normal QwQ. It works conspicuously well at 4.0bpw and 3.5bpw.

  • Seems good at long context. Have not tested 200K, but it's quite excellent in the 64K range.

  • Works fine in English.

  • The chat template is funky. It seems to mix up the <think> and <|think|> tags in particular (why don't they just use ChatML?), and needs some wrangling with your own template.

  • Seems smart, can't say if it's better or worse than QwQ yet, other than it doesn't seem to "suffer" below 3.75bpw like QwQ does.

Also, I reposted this from /r/locallama, as I feel the community generally should going forward. With its spirit, it seems like we should be on Lemmy instead?

 

So I had a clip I wanted to upload to a lemmy comment:

  • Tried it as an (avc) mp4... Failed.
  • OK, too big? I shrink it to 2MB, then 1MB. Failed.
  • VP9 Webm maybe? 2MB, 1MB, failed. AV1? Failed.
  • OK, fine, no video. Let's try an animated AVIF. Failed. It seems lemmy doesn't even take static AVIF images.
  • WebP animation then... Failed. Animated PNG, failed.

End result, I have to burden the server with a massive, crappy-looking GIF after trying a dozen formats. With all due respect, this is worse than some aging service like Reddit that doesn't support new media formats.

For reference, I'm using the web interface. Is this just a format restriction of lemmy.world, or an underlying software support issue?
