this post was submitted on 18 Jan 2026
3 points (100.0% liked)

blueteamsec

627 readers
37 users here now

For [Blue|Purple] Teams in Cyber Defence - covering discovery, detection, response, threat intelligence, malware, offensive tradecraft and tooling, deception, reverse engineering etc.

founded 2 years ago
MODERATORS
you are viewing a single comment's thread
view the rest of the comments
[–] halfdane@piefed.social 2 points 4 weeks ago* (last edited 4 weeks ago)

That's a pattern I see everywhere LLMs are being used: they spread.

  • Scanning the input of the LLM for suspicious stuff? Use another LLM
  • scanning the output of the LLM for compliance or nsfl content? Use another LLM
  • if you've multiple specialized LLMs and need to decide which one to use? Another LLM makes the decision
  • make sure tge activities of your agent aren't malicious? Guess what: it's another LLM

People who are much deeper into this tell me that the LLM checking for prompt injections isn't itself vulnerable to prompt injections, but I remain unconvinced.

It's ~~turtles~~ LLMs all the way down