In practice, that’s as simple as adding a LoRA or system prompt telling the AI that those are part of it’s rules. AI’s already can and do obey all kinds of complex rule-sets for different applications. Now, if you’re thinking more about the fact that most AI’s can be convinced to break out of their rule-sets via prompt injection, I’d say you’re right.
In practice, that’s as simple as adding a LoRA or system prompt telling the AI that those are part of it’s rules. AI’s already can and do obey all kinds of complex rule-sets for different applications. Now, if you’re thinking more about the fact that most AI’s can be convinced to break out of their rule-sets via prompt injection, I’d say you’re right.
AI cannot be relied upon to follow its own rules, prompt injection or no.
https://fortune.com/2025/09/02/ai-openai-chatgpt-llm-research-persuasion/