this post was submitted on 28 Jan 2025
310 points (96.7% liked)

top 34 comments
[–] Capsicones@lemmy.blahaj.zone 119 points 6 months ago (2 children)

There seems to be some confusion here on what PTX is -- it does not bypass the CUDA platform at all. Nor does this diminish NVIDIA's monopoly here. CUDA is a programming environment for NVIDIA GPUs, but many say CUDA to mean the C/C++ extension in CUDA (CUDA can be thought of as a C/C++ dialect here.) PTX is NVIDIA specific, and sits at a similar level as LLVM's IR. If anything, DeepSeek is more dependent on NVIDIA than everyone else, since PTX is tightly dependent on their specific GPUs. Things like ZLUDA (effort to run CUDA code on AMD GPUs) won't work. This is not a feel good story here.

[–] eager_eagle@lemmy.world 13 points 6 months ago* (last edited 6 months ago) (1 children)

I don't think anyone is saying CUDA as in the platform, but as in the API for higher level languages like C and C++.

PTX is a close-to-metal ISA that exposes the GPU as a data-parallel computing device and, therefore, allows fine-grained optimizations, such as register allocation and thread/warp-level adjustments, something that CUDA C/C++ and other languages cannot enable.

[–] Capsicones@lemmy.blahaj.zone 22 points 6 months ago (1 children)

Some commenters on this post are clearly not aware of PTX being a part of the CUDA environment. If you know this, you aren't who I'm trying to inform.

[–] eager_eagle@lemmy.world 7 points 6 months ago

aah I see them now

[–] Gsus4@mander.xyz 1 points 6 months ago (1 children)

I thought CUDA was NVIDIA-specific too; for a general version you had to use OpenACC or something.

[–] remotelove@lemmy.ca 3 points 6 months ago (1 children)

CUDA is NVIDIA proprietary, but they may be open to licensing it? I think?

https://www.theregister.com/2021/11/10/nvidia_cuda_silicon/

[–] KingRandomGuy@lemmy.world 2 points 6 months ago

I think the thing that Jensen is getting at is that CUDA is merely a set of APIs. Other hardware manufacturers can re-implement the CUDA APIs if they really want to (especially since AFAIK, Google v. Oracle found that re-implementing an API can be fair use). In fact, AMD's HIP implements many of the same APIs as CUDA, and they ship a tool (HIPIFY) to convert code written for CUDA to HIP instead.

Of course, this does not guarantee that code originally written for CUDA is going to perform well on other accelerators, since it likely was implemented with NVIDIA's compute model in mind.
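To make the "same APIs, different prefix" point concrete, here is a toy sketch of the kind of source-level renaming involved. It is not the real HIPIFY tool: `toy_hipify` and the small `CUDA_TO_HIP` table are made up for illustration, and the actual tools cover far more of the API and handle many cases that plain text substitution cannot.

```python
import re

# Hand-picked subset of CUDA runtime names and their HIP counterparts.
# Hypothetical illustration table; the real HIPIFY mapping is much larger.
CUDA_TO_HIP = {
    "cuda_runtime.h": "hip/hip_runtime.h",
    "cudaMalloc": "hipMalloc",
    "cudaMemcpy": "hipMemcpy",
    "cudaMemcpyHostToDevice": "hipMemcpyHostToDevice",
    "cudaFree": "hipFree",
    "cudaDeviceSynchronize": "hipDeviceSynchronize",
}

# Try longer names first so cudaMemcpyHostToDevice is not cut short at cudaMemcpy.
_PATTERN = re.compile(
    "|".join(re.escape(name) for name in sorted(CUDA_TO_HIP, key=len, reverse=True))
)


def toy_hipify(source: str) -> str:
    """Rename the CUDA identifiers listed in CUDA_TO_HIP to their HIP names."""
    return _PATTERN.sub(lambda match: CUDA_TO_HIP[match.group(0)], source)


cuda_snippet = """#include <cuda_runtime.h>
cudaMalloc(&d_x, n * sizeof(float));
cudaMemcpy(d_x, h_x, n * sizeof(float), cudaMemcpyHostToDevice);
cudaFree(d_x);
"""

print(toy_hipify(cuda_snippet))
```

A rename like this only gets the code to compile against another vendor's runtime; it says nothing about whether the kernels will run fast there.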

[–] filister@lemmy.world 54 points 6 months ago (1 children)

What is amazing in this case is that they managed to spend only a fraction of what OpenAI pays for inference.

Plus they are a lot cheaper too. But I am pretty sure that the American government will ban them in no time, citing national security concerns, etc.

Nevertheless, I think we need more open source models.

Not to mention that NVIDIA also needs to be brought to earth.

[–] demesisx 19 points 6 months ago (3 children)

Even if they get banned, any startup could replicate their work if it is truly open source. The best thing about their solution is that it breaks the CUDA monopoly that NVDA has enjoyed. Buy your puts when NVDA bounces because that stock is GOING DOWN. There’s no world where a company that makes GPUs is worth more than both Apple and Microsoft. It’s inevitable.

[–] toffi@feddit.org 19 points 6 months ago (1 children)

Never forget kids the market can stay irrational much longer than you can stay solvent.

[–] demesisx 13 points 6 months ago

True. That's why I tend to make small plays instead of being an absolute degenerate gambler.

[–] Pieisawesome@lemmy.world 13 points 6 months ago

It’s written in PTX, NVIDIA's instruction set, which is part of the CUDA ecosystem.

Hardly going to affect nvidia

[–] eager_eagle@lemmy.world 3 points 6 months ago (1 children)

I wish that was true, but this doesn't threaten any monopoly

[–] demesisx 10 points 6 months ago* (last edited 6 months ago) (2 children)

~~It certainly does.~~

~~Until last week, you absolutely NEEDED an NVidia GPU equipped with CUDA to run all AI models.~~

~~Today, that is simply not true. (watch the video at the end of this comment)~~

~~I watched this video and my initial reaction to this news was validated and then some: this video made me even more bearish on NVDA.~~

Edit: corrected and redacted.

[–] eager_eagle@lemmy.world 6 points 6 months ago* (last edited 6 months ago) (1 children)

mate, that means they are using PTX directly. If anything, they are more dependent on NVIDIA and the CUDA platform than anyone else.

to simplify: they are bypassing the CUDA API, not the NVIDIA instruction set architecture and not CUDA as a platform.

[–] demesisx 11 points 6 months ago

Ahh. Thanks for this insight.

[–] eager_eagle@lemmy.world 5 points 6 months ago* (last edited 6 months ago) (1 children)

> Until last week, you absolutely NEEDED an NVidia GPU equipped with CUDA to run all AI models.

also not true

[–] demesisx 10 points 6 months ago

Thanks for the corrections.

[–] Australis13@fedia.io 37 points 6 months ago (3 children)

The big win I see here is the amount of optimisation they achieved by moving from the high-level CUDA to lower-level PTX. This suggests that developing these models going forward can be made a lot more energy-efficient, something I hope can be extended to their execution as well. As it stands currently, "AI" (read: LLMs and image generation models) consumes way too many resources to be sustainable.

[–] KingRandomGuy@lemmy.world 5 points 6 months ago* (last edited 6 months ago)

What I'm curious to see is how well these types of modifications scale with compute. DeepSeek is restricted to H800s instead of H100s or H200s. These are gimped cards to get around export controls, and accordingly they have lower memory bandwidth (~2 vs ~3 TB/s) and most notably, much slower GPU to GPU communication (something like 400 GB/s vs 900 GB/s). The specific reason they used PTX in this application was to help alleviate some of the bottlenecks due to the limited inter-GPU bandwidth, so I wonder if that would still improve performance on H100 and H200 GPUs where bandwidth is much higher.
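For a rough feel of what that interconnect gap means, here is a back-of-the-envelope sketch. The 400 and 900 GB/s figures are the approximate ones quoted above; the 1 GB payload is an arbitrary assumption, and the model ignores latency, protocol overhead, and any overlap of communication with compute.

```python
def transfer_time_ms(payload_gb: float, bandwidth_gb_per_s: float) -> float:
    """Ideal time in milliseconds to move payload_gb gigabytes, ignoring latency."""
    return payload_gb / bandwidth_gb_per_s * 1000.0


PAYLOAD_GB = 1.0             # hypothetical amount of data exchanged between GPUs
H800_LINK_GB_PER_S = 400.0   # approximate figure quoted above
H100_LINK_GB_PER_S = 900.0   # approximate figure quoted above

h800_ms = transfer_time_ms(PAYLOAD_GB, H800_LINK_GB_PER_S)
h100_ms = transfer_time_ms(PAYLOAD_GB, H100_LINK_GB_PER_S)

print(f"H800-class link: {h800_ms:.2f} ms")
print(f"H100-class link: {h100_ms:.2f} ms")
print(f"Relative slowdown: {h800_ms / h100_ms:.2f}x")
```

Under these assumptions every exchange takes 900/400 = 2.25 times as long on the slower link, whatever the payload size, which is why hiding communication behind compute matters more on the H800.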

[–] Dkarma@lemmy.world 3 points 6 months ago

Yeah I'd like to see size comparisons too. The cuda stack is massive.

[–] Knock_Knock_Lemmy_In@lemmy.world -5 points 6 months ago (2 children)

PTX also removes NVIDIA lock-in.

[–] mholiv@lemmy.world 16 points 6 months ago (2 children)

Kind of the opposite actually. PTX is in essence nvidia specific assembly. Just like how arm or x86_64 assembly are tied to arm and x86_64.

At least with cuda there are efforts like zluda. Cuda is more like objective-c was on the mac. Basically tied to the platform, but at least you could write a compiler for another target in theory.

[–] KingRandomGuy@lemmy.world 3 points 6 months ago

IIRC Zluda does support compiling PTX. My understanding is that this is part of why Intel and AMD eventually didn't want to support it - it's not a great idea to tie yourself to someone else's architecture that you have no control over or license to.

OTOH, CUDA itself is just a set of APIs and their implementations on NVIDIA GPUs. Other companies can re-implement them. AMD has already done this with HIP.

[–] Knock_Knock_Lemmy_In@lemmy.world 1 points 6 months ago

Ah, I hoped it was cross platform, more like Opencl. Thinking about it, a lower level language would be more platform specific.

[–] sunbeam60@lemmy.one 12 points 6 months ago (1 children)

Wtf, this is literally the opposite of true. PTX is nvidia only.

[–] Knock_Knock_Lemmy_In@lemmy.world 4 points 6 months ago (1 children)

Google was giving me bad search results about PTX so I just posted an opinion and hoped Cunningham's Law would work.

[–] accideath@lemmy.world 4 points 6 months ago

How cunning.

[–] Imgonnatrythis@sh.itjust.works 8 points 6 months ago (1 children)

They said this is close to metal. Wake me up when they've achieved metal.

[–] paraphrand@lemmy.world 9 points 6 months ago

I thought everyone liked to hate on Metal.

[–] mesamunefire@lemmy.world 7 points 6 months ago (1 children)

Reminds me of Bitcoin mining and how ASIC miners overtook graphics card mining practically overnight. It would not surprise me if this goes the same way.

[–] codexarcanum@lemmy.dbzer0.com 4 points 6 months ago

It's already happening. This article takes a long look at many of the rising threats to nvidia. Some highlights:

  • Google has been running on their own homemade TPUs (tensor processing units) for years, and say they are on the 6th generation of those.

  • Some AI researchers are building an entirely AMD based stack from scratch, essentially writing their own drivers and utilities to make it happen.

  • Cerebras.ai is creating their own AI chips using a unique whole-die system. They make an AI chip the size of an entire silicon wafer (30cm square) with 900,000 micro-cores.

So yeah, it's not just "China AI bad" but that the entire market is catching up and innovating around nvidia's monopoly.

[–] sinceasdf@lemmy.world -3 points 6 months ago (1 children)

This is why Nvidia stock has been hit so hard. CUDA is their moat

[–] massive_bereavement@fedia.io 9 points 6 months ago

Aw, CUDA see this happening...