blueteamsec

674 readers

38 users here now

For [Blue|Purple] Teams in Cyber Defence - covering discovery, detection, response, threat intelligence, malware, offensive tradecraft and tooling, deception, reverse engineering etc.

founded 2 years ago

MODERATORS

digicat

nova-claude-code-protector: NOVA - Claude Code Protection System against prompt injection attacks (github.com)

submitted 2 months ago by digicat to c/blueteamsec

1 comments fedilink hide all child comments

you are viewing a single comment's thread
view the rest of the comments

[–] halfdane@piefed.social 2 points 2 months ago* (last edited 2 months ago)

That's a pattern I see everywhere LLMs are being used: they spread.

Scanning the input of the LLM for suspicious stuff? Use another LLM
scanning the output of the LLM for compliance or nsfl content? Use another LLM
if you've multiple specialized LLMs and need to decide which one to use? Another LLM makes the decision
make sure tge activities of your agent aren't malicious? Guess what: it's another LLM

People who are much deeper into this tell me that the LLM checking for prompt injections isn't itself vulnerable to prompt injections, but I remain unconvinced.

It's ~~turtles~~ LLMs all the way down