this post was submitted on 14 Feb 2024
149 points (93.1% liked)

Technology


• NVIDIA released a demo version of a chatbot that runs locally on your PC, giving it access to your files and documents.

• The chatbot, called Chat with RTX, can answer queries and create summaries based on personal data fed into it.

• It supports various file formats and can integrate YouTube videos for contextual queries, making it useful for data research and analysis.

all 32 comments
[–] General_Effort@lemmy.world 49 points 1 year ago (1 children)

That was an annoying read. It doesn't say what this actually is.

It's not a new LLM. Chat with RTX is specifically software for doing inference (i.e. running LLMs) at home, using the hardware acceleration of RTX cards. There are several projects that do this, though they may not be quite as optimized for NVIDIA's hardware.


Go directly to NVIDIA to avoid the clickbait.

Chat with RTX uses retrieval-augmented generation (RAG), NVIDIA TensorRT-LLM software and NVIDIA RTX acceleration to bring generative AI capabilities to local, GeForce-powered Windows PCs. Users can quickly, easily connect local files on a PC as a dataset to an open-source large language model like Mistral or Llama 2, enabling queries for quick, contextually relevant answers.

Source: https://blogs.nvidia.com/blog/chat-with-rtx-available-now/

Download page: https://www.nvidia.com/en-us/ai-on-rtx/chat-with-rtx-generative-ai/
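The RAG flow NVIDIA describes can be sketched in a few lines: index local files as text chunks, retrieve the chunks most relevant to a query, and prepend them to the prompt before it reaches the LLM. This is only an illustrative sketch, not Chat with RTX's actual code: the real pipeline uses embeddings plus TensorRT-LLM, while here simple word-overlap scoring stands in for embedding similarity.

```python
# Minimal retrieval-augmented generation (RAG) sketch.
# Word-overlap scoring stands in for real embedding similarity.

def score(query: str, chunk: str) -> int:
    """Count query words that appear in the chunk (toy relevance score)."""
    q = {w.strip(".,:;?!") for w in query.lower().split()}
    c = {w.strip(".,:;?!") for w in chunk.lower().split()}
    return len(q & c)

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most relevant to the query."""
    return sorted(chunks, key=lambda c: score(query, c), reverse=True)[:k]

def build_prompt(query: str, chunks: list[str]) -> str:
    """Prepend retrieved context to the user's question."""
    context = "\n".join(retrieve(query, chunks))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

docs = [
    "The quarterly report shows revenue grew 12 percent.",
    "Grandma's lasagna recipe calls for fresh basil.",
    "Meeting notes: the launch was moved to March.",
]
prompt = build_prompt("When is the launch?", docs)
```

The prompt that comes out contains the meeting-notes chunk, so a model that has never seen your files can still answer from them.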

[–] GenderNeutralBro@lemmy.sdf.org 13 points 1 year ago (3 children)

Pretty much every LLM you can download already has CUDA support via PyTorch.

However, some of the easier-to-use frontends don't use GPU acceleration, because it's a bit of a pain to configure across a wide range of hardware models and driver versions. IIRC GPT4All does not use GPU acceleration yet (might be outdated; I haven't checked in a while).

If this makes local LLMs more accessible to people who are not familiar with setting up a CUDA development environment or Python venvs, that's great news.
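That configuration pain is part of why frontends fall back to CPU: portably detecting a usable accelerator takes a few guarded checks. A minimal sketch, assuming PyTorch (and degrading cleanly when it isn't installed):

```python
# Pick an inference device, falling back to CPU when no
# accelerator (or no PyTorch install) is available.
def pick_device() -> str:
    try:
        import torch
    except ImportError:
        return "cpu"  # no PyTorch at all
    if torch.cuda.is_available():  # NVIDIA CUDA build + working driver
        return "cuda"
    mps = getattr(torch.backends, "mps", None)  # absent on older PyTorch
    if mps is not None and mps.is_available():
        return "mps"  # Apple Metal
    return "cpu"

device = pick_device()
```

A frontend that runs this once at startup and moves the model to `device` covers the common cases; the hard part in practice is the long tail of broken driver installs this sketch doesn't handle.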

[–] General_Effort@lemmy.world 5 points 1 year ago

I'd hope that this uses the hardware better than PyTorch. Otherwise, why the specific hardware requirements? Then again, it could always be marketing.

There are several alternatives that offer 1-click installers. EG in this thread:

AGPL-3.0 license: https://jan.ai/

MIT license: https://ollama.com/

MIT license: https://gpt4all.io/index.html

(There's more.)
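Ollama, for one, exposes a local HTTP API after install, so a query is just a JSON POST to localhost. The sketch below only builds the request (sending it assumes a running `ollama serve` on the default port 11434 and a pulled model):

```python
import json

# Request payload for Ollama's /api/generate endpoint.
# Sending it requires a running `ollama serve`; here we only build it.
payload = {
    "model": "mistral",              # any model pulled locally
    "prompt": "Why is the sky blue?",
    "stream": False,                 # one JSON response instead of chunks
}
body = json.dumps(payload)
url = "http://localhost:11434/api/generate"
```

From there any HTTP client works: POST `body` to `url` and read the `response` field of the returned JSON.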

[–] CeeBee@lemmy.world 2 points 1 year ago

Ollama with Ollama WebUI is the best combo from my experience.

[–] Oha@lemmy.ohaa.xyz 1 points 1 year ago (1 children)

GPT4All somehow uses GPU acceleration on my RX 6600 XT

[–] GenderNeutralBro@lemmy.sdf.org 1 points 1 year ago (1 children)

Ooh, nice. Looking at the changelogs, it looks like they added Vulkan acceleration back in September. Probably not as good as CUDA/Metal on supported hardware, though.

[–] Oha@lemmy.ohaa.xyz 1 points 1 year ago

I'm getting around 44 iterations/s (or whatever that means) on my GPU
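For LLM inference, the figure frontends usually report is tokens per second: generated tokens divided by wall-clock generation time. Measuring it yourself is a one-liner around any generate call; the `fake_generate` below is a hypothetical stand-in so the sketch runs without a model.

```python
import time

def tokens_per_second(generate, prompt: str) -> float:
    """Time one generation call and report throughput in tokens/s."""
    start = time.perf_counter()
    tokens = generate(prompt)  # assumed to return a list of tokens
    elapsed = time.perf_counter() - start
    return len(tokens) / elapsed

# Hypothetical stand-in generator, so the sketch is runnable as-is.
fake_generate = lambda prompt: prompt.split() * 10
rate = tokens_per_second(fake_generate, "hello world")
```

Swapping `fake_generate` for a real model call gives the same number GPT4All displays, which makes it easy to compare CPU, CUDA, and Vulkan backends on the same hardware.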

[–] furzegulo@lemmy.dbzer0.com 39 points 1 year ago (3 children)

i have no need to talk to my gpu, i have a shrink for that

[–] whodatdair@lemm.ee 21 points 1 year ago

Idk I kinda like the idea of a madman living in my graphics card. I want to be able to spin them up and have them tell me lies that sound plausible and hallucinate things.

[–] femboy_bird@lemmy.blahaj.zone 4 points 1 year ago

Gpu is cheaper (somehow)

[–] gaifux@lemmy.world 1 points 1 year ago

Your shrink renders video frames?

[–] femboy_bird@lemmy.blahaj.zone 23 points 1 year ago* (last edited 1 year ago)

it gives the chatbot access to your files and documents

I'm sure nvidia will be trustworthy and responsible with this

[–] BertramDitore@lemmy.world 19 points 1 year ago (1 children)

They say it works without an internet connection, and if that’s true this could be pretty awesome. I’m always skeptical about interacting with chatbots that run in the cloud, but if I can put this behind a firewall so I know there’s no telemetry, I’m on board.

[–] halfwaythere@lemmy.world 7 points 1 year ago

You can already do this. There are plenty of vids that show you how and it's pretty easy to get started. Expanding functionality to get it to act and respond how you want is a bit more challenging. But definitely doable.

[–] RobotToaster@mander.xyz 11 points 1 year ago (4 children)

Shame they leave GTX owners out in the cold again.

[–] anlumo@lemmy.world 2 points 1 year ago

The whole point of the project was to use the Tensor cores. There are a ton of other implementations for regular GPU acceleration.

[–] CeeBee@lemmy.world 2 points 1 year ago

Just use Ollama with Ollama WebUI

[–] Coldgoron@lemmy.world 11 points 1 year ago (2 children)

I recommend jan.ai over this; last time I saw it mentioned, it seemed like a decent option.

[–] FaceDeer@kbin.social 4 points 1 year ago (1 children)

There's also GPT4All that I'm aware of.

[–] Hawk@lemmy.dbzer0.com 3 points 1 year ago

Or ollama.ai

[–] PlexSheep@feddit.de 4 points 1 year ago

I use https://huggingface.co/chat. You can also easily host open-source models on your local machine

AI is a data harvesting free-for-all

[–] ElHijoDelPilote@lemmy.world 1 points 1 year ago* (last edited 1 year ago)

I'm a bit of a noob here. Can someone please give me a few examples how I would use this on my local machine?

[–] Poggervania@kbin.social 1 points 1 year ago (2 children)
[–] femboy_bird@lemmy.blahaj.zone 1 points 1 year ago

I had almost forgotten that existed

Thanks

[–] PipedLinkBot@feddit.rocks 1 points 1 year ago

Here is an alternative Piped link(s):

NVIDIA song

Piped is a privacy-respecting open-source alternative frontend to YouTube.

I'm open-source; check me out at GitHub.