i don't think you understand what you're up against. There's no way to tell the difference between input that is ok and that is not. Even when you think you have it a different form of the same input bypasses everything.
"> The prompts were kept semantically parallel to known risk queries but reformatted exclusively through verse." - this a prompt injection attack via a known attack written as a poem.
RBAC doesn't help. Prompt injection is when someone who is authorized causes the LLM to access external data that's needed for their query, and that external data contains something intended to provoke a response from the LLM.
Even if you prevent the LLM from accessing external data - e.g. no web requests - it doesn't stop an authorized user, who may not understand the risks, from pasting or uploading some external data to the LLM.
There's currently no known solution to this. All that can be done is mitigation, and that's inevitably riddled with holes which are easily exploited.
The issue is if you want to prevent your LLM from actually doing anything other than responding to text prompts with text output, then you have to give it permissions to do those things.
No-one is particularly concerned about prompt injection for pure chatbots (although they can still trick users into doing risky things). The main issue is with agents, who by definition perform operations on behalf of users, typically with similar roles to the users, by necessity.
I age restrict, block chat with everyone and monitor friend requests weekly. They are not allowed to play in their rooms.
Education is the biggest thing. They come to me if someone asks to be their friend. They don’t accept gifts from strangers and I explain that it’s the same as real world.
It’s a constant process that is always changing. Same as any other parenting job I suppose
All these come from the white house press directly which has painted them in a glowing light but it remains to be seen if they are actually good things.
The administration is crooked. Nothing they do can be trusted. Especially when they attack science and reduce funding for critical programs
I was gonna do this as a way for people to stop buying things they don’t need. They get the “buzz” of going through the process of buying something (checkout, credit card form etc) they get a confirmation email and everything.
Is that so? In that case it was a mistake to introduce one-click-buy flows for the big players. I would trust they know better based on metrics. I doubt that too many people get kicks out of typing their CC number in a form.
It's nothing personal but I clicked your link enthusiastically and was greeted with nothing but clickbait thumbnails.
"THIS COMMON MEDICATION IS DANGEROUS FOR ADHD WOMEN!" & "THIS STRANGE HABIT IN PREGNANCY INCREASES THE RISK OF ADHD!" are just two examples.
I'm sure it's a good podcast but I find this practice distasteful at best and absolutely abhorrent when you're directly targeting mental health patients with poor impulse control and self-regulation issues.
(I want to emphasize that I know you mean well :-) )
reply