this post was submitted on 27 Jan 2025
[–] BB84@mander.xyz 3 points 6 months ago (11 children)

@jerryh100@lemmy.world Wrong community for this kind of post.

@BaroqueInMind@lemmy.one Can you share more details on installing it? Are you using SGLang or vLLM or something else? What kind of hardware do you have that can fit the 600B model? What is your inference tok/s?

[–] BaroqueInMind@lemmy.one 3 points 6 months ago* (last edited 6 months ago) (3 children)

I'm using Ollama on a single GPU with 10 GB of VRAM.

[–] BB84@mander.xyz 1 points 6 months ago (1 children)

You're probably running one of the distillations then, not the full thing?
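
A rough back-of-the-envelope check of why 10 GB of VRAM points to a distillation. This is only a sketch: it counts weight storage alone (no KV cache or runtime overhead), takes the "600B" figure from this thread as a round number, and the bit widths are illustrative assumptions, not the settings anyone here is actually using. The function name `weight_memory_gb` is made up for this example.

```python
def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Approximate memory for the weights alone, in GB (10**9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


FULL_MODEL_PARAMS = 600e9  # "the 600B model" mentioned above (rounded)
VRAM_GB = 10               # the single GPU mentioned above

# Assumed precisions for illustration: 16-bit, 8-bit, and 4-bit weights.
for bits in (16, 8, 4):
    need = weight_memory_gb(FULL_MODEL_PARAMS, bits)
    fits = need <= VRAM_GB
    print(f"{bits}-bit weights: ~{need:.0f} GB needed, fits in {VRAM_GB} GB: {fits}")
```

Even at 4 bits per weight the full model would need roughly 300 GB for weights alone, so a 10 GB card can only hold a much smaller model entirely in VRAM.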

[–] BaroqueInMind@lemmy.one 1 points 6 months ago (1 children)

What's the difference? Does the full thing not have censorship?

[–] BB84@mander.xyz 1 points 6 months ago

That's why I wanted to confirm what you are using lol. Some people on Reddit were claiming that the full model, when run locally, has very little censorship. It sounds somewhat plausible, since the web version only censors content after it's generated.
