Casual hobbyist, not an expert here.

It WAS working... About eight months ago, I trained a bunch of embeddings and hypernetworks and it all worked great.

Cut to the present: I want to do some more training. I've updated Automatic1111 several times, but nothing else about my setup has changed. However, whenever I try to train anything (embeddings, hypernetworks, or LoRAs), loss is NaN for 4 out of 5 steps right from the get-go. As training progresses, loss becomes NaN for 9 out of 10 steps, then 19 out of 20 steps by around step 3,000, which is as far as I've gotten. Hypernetworks just don't work at that point, and embeddings produce garbage.
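
To illustrate what I mean by "loss is NaN", here's a minimal sketch of a generic PyTorch training step with a finite-loss guard. The function and argument names are mine, not A1111's actual internals:

```python
import torch

# Minimal sketch of a generic training step with a finite-loss guard.
# Function and argument names are illustrative, not A1111's internals.
def training_step(model, batch, optimizer, loss_fn):
    optimizer.zero_grad()
    loss = loss_fn(model(batch["x"]), batch["y"])
    if not torch.isfinite(loss):
        # This is what keeps firing for me: 4 out of 5 steps early on,
        # 19 out of 20 by step ~3,000.
        raise RuntimeError(f"non-finite loss: {loss.item()}")
    loss.backward()
    optimizer.step()
    return loss.item()
```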

I have googled like crazy, and found:

A few threads where the best hint is that (at least 8-9 months ago) xformers broke training. Well, I've messed around with xformers, uninstalled and reinstalled xformers, eaten xformers for breakfast. Behavior is the same.
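
For what it's worth, here's the sanity check I run after each reinstall, just to confirm which builds are actually active (assuming xformers imports at all):

```python
# Confirm which torch / CUDA / xformers builds are actually active;
# a mismatched torch/xformers pair is a commonly reported NaN culprit.
import torch
import xformers

print("torch:", torch.__version__, "CUDA:", torch.version.cuda)
print("xformers:", xformers.__version__)
print("GPU:", torch.cuda.get_device_name(0))
```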

Suggestions to lower the learning rate. I have set my learning rate to 0.0000000000000005. Behavior is identical.
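
That makes me suspect the NaNs come from the forward pass rather than the step size. My (possibly wrong) understanding is that once something overflows in fp16, no learning rate can save it; a toy example of what I mean:

```python
import torch

# Toy illustration (an assumption on my part, not a diagnosis of A1111):
# once a value overflows fp16, the loss is NaN before the optimizer ever
# sees it, so lowering the learning rate can't help.
x = torch.tensor([70000.0]).to(torch.float16)  # exceeds fp16 max (~65504)
print(x)      # tensor([inf], dtype=torch.float16)
print(x - x)  # tensor([nan], dtype=torch.float16) -- inf minus inf
```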

My system is on the low end for VRAM (8 GB). I have TWO 8 GB cards, so I wish I could train on both like I can for Llama. But I also think that's not the problem, because my OLD embeddings and hypernetworks came out great and still work.
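
For reference, both cards do show up (a trivial check, nothing A1111-specific):

```python
import torch

# List the GPUs PyTorch can see and their VRAM, to confirm both 8 GB
# cards are visible even though A1111 training only uses one of them.
for i in range(torch.cuda.device_count()):
    props = torch.cuda.get_device_properties(i)
    print(i, props.name, f"{props.total_memory / 2**30:.1f} GB")
```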

Any thoughts here?
