this post was submitted on 28 Aug 2025
527 points (99.8% liked)

Technology

74585 readers
3951 users here now

This is a most excellent place for technology news and articles.


Our Rules


  1. Follow the lemmy.world rules.
  2. Only tech related news or articles.
  3. Be excellent to each other!
  4. Mod approved content bots can post up to 10 articles per day.
  5. Threads asking for personal tech support may be deleted.
  6. Politics threads may be removed.
  7. No memes allowed as posts, OK to post as comments.
  8. Only approved bots from the list below, this includes using AI responses and summaries. To ask if your bot can be added please contact a mod.
  9. Check for duplicates before posting, duplicates may be removed
  10. Accounts 7 days and younger will have their posts automatically removed.

Approved Bots


founded 2 years ago
MODERATORS
you are viewing a single comment's thread
view the rest of the comments
[–] brucethemoose@lemmy.world 2 points 2 hours ago* (last edited 2 hours ago) (1 children)

I am on mobile and can be more detailed later, if you want but the jist is to sign up (with a payment method) to some API service. There are many. Some neat ones include:

  • Openrouter (a gateway to many, many models from many providers, I’d recommend this first)
  • Cerebras API (which is faster than anything and has a generous free tier)
  • Google Gemini, which is free to just try this out on with no credit card.

Some great models to look out for, that you may not know of:

  • GLM 4.5 (my all-around favorite)

  • Deepseek (and its uncensored finetunes)

  • Kimi

  • Jamba Large

  • Minimax

  • InternLM for image input

  • Qwen Coder for coding

  • Hermes 405B (which is particularly ‘uncensored’)

  • Gemini Pro/Flash, which is less private but free to try.

Most (in exchanges for charging pennies for each request) do not log your prompts. If you are really, really concerned, you can even rent your own GPU instance on demand.

Anyway, they will give you a key, which is basically a password.

Paste that key into the LLM frontend of your choice, like Open Web UI, LM Studio, or even web apps like:

Or even the Openrouter web interface.

[–] ArmchairAce1944@discuss.online 2 points 55 minutes ago (1 children)

Damn! That is good shit! I'll definitely look into it.

[–] brucethemoose@lemmy.world 1 points 46 minutes ago* (last edited 41 minutes ago)

Yep!

Also, I'm going to plug the AI Horde, which is basically the Fediverse for AI self hosting: https://aihorde.net/

It's awesome! Though a bit sparsely populated, like Lemmy, heh.

Ping me, and I can host a medium-sized model to try for a few hours (via those linked web UIs), if you want. The options are limitless, from something STEM-focused like Nemotron 49B, to a long context model like Bytedance's new 36B, to, dungeonmaster finetunes, to horny as heck roleplaying models, lol. But they should be significantly better than whatever 8B ollama downloads by default.