Even to the extent that they are "prompting it wrong" it's still on the AI companies for calling this shit "AI". LLMs fundamentally do not even attempt to do cognitive work (the way a chess engine does by iterating over possible moves).
Also, LLM tools do not exist. All you can get is a sales demo for the company stock (the actual product being sold), built to impress investors with how close to AGI the company supposedly is. You have to creatively misuse these things to get any value out of them.
The closest they get to tools is "AI coding", but even then, these things plagiarize code you don't even want plagiarized (because it's MIT-licensed and you'd rather keep up with upstream fixes).
Tbh whenever I try to read anything on decision theory (even written by people other than rationalists), I end up wondering how they think a redundant autopilot (with majority voting) would ever work. In an airplane, that is.
Considering just the physical consequences of a decision doesn't work (unless there's a fault, a single unit's dissenting output never makes it past the voting electronics, so in the no-fault case the alternative decisions all have exactly the same physical consequences: none).
Each one simulating the two or more other autopilots is scifi-brained idiocy. Requiring that autopilots are exact copies is stupid too (what if we had two different teams write different implementations? I think Airbus actually sort of did that).
Nothing is going to be simulating anything, and to make matters even worse for philosophers, amateur and academic alike, the whole reason for redundancy is that sometimes a glitch makes the units not compute the same values, so any attempt to be clever with "ha, we just treat the copies as one thing" doesn't cut it either.
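To be concrete about what the voter actually does: here's a toy sketch of 2-out-of-3 voting (not real avionics; the tolerance band, the median choice, and the `vote` function itself are my own illustrative assumptions). Note that no channel models any other channel, and agreement is defined by a tolerance band rather than bit equality, precisely because dissimilar implementations won't produce identical values.

```python
def vote(outputs, tol=0.5):
    """2-out-of-3 majority vote over three channel outputs (say, a
    commanded control-surface angle). Channels may run dissimilar
    software, so "agreement" means within a tolerance band, not bit
    equality. Returns the median (which masks a single outlier), or
    None if no two channels agree, i.e. the voter declares a fault."""
    a, b, c = outputs
    pairs = [(a, b), (a, c), (b, c)]
    if not any(abs(x - y) <= tol for x, y in pairs):
        return None  # no majority: nothing sane to pass to the actuators
    return sorted(outputs)[1]  # median masks one faulty channel

# healthy case: all three within tolerance, median wins
print(vote([10.0, 10.2, 10.1]))  # -> 10.1
# one faulty channel: its value is outvoted and has no physical effect
print(vote([10.0, 10.1, 42.0]))  # -> 10.1
# total disagreement: voter flags a fault instead of picking a value
print(vote([1.0, 50.0, 100.0]))  # -> None
```

The point of the sketch: in the no-fault case a single channel's dissent literally cannot reach the physical world, and in the fault case the channels by definition did not compute the same thing, which is what breaks both the "consequences only" and the "treat copies as one agent" moves.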