AI Security
Llama Guard
Meta safety model family and recipes for classifying human-AI conversation safety risks.
ai, llm, safety, moderation, classifier, meta
Best For
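Good for studying how model-based moderation of prompts (input) and model responses (output) can support, but not replace, application security controls such as authentication, authorization, and input validation.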
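As a rough illustration of "support, but not replace", the sketch below treats the moderation verdict as one gate alongside an ordinary application check. It assumes the classifier returns plain text whose first line is `safe` or `unsafe`, with an optional second line of comma-separated category codes; confirm the exact format against the model card for the version you use. The names `Verdict`, `parse_verdict`, and `allow_request` are hypothetical helpers, not part of any Meta library, and the model call itself is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Verdict:
    safe: bool
    categories: tuple[str, ...] = ()


def parse_verdict(raw: str) -> Verdict:
    """Parse classifier text output (ASSUMED format: 'safe' or 'unsafe\\nS1,S2')."""
    lines = [ln.strip() for ln in raw.strip().splitlines() if ln.strip()]
    if not lines:
        # Fail closed: an empty reply is treated as unsafe.
        return Verdict(safe=False)
    head = lines[0].lower()
    if head == "safe":
        return Verdict(safe=True)
    if head == "unsafe":
        cats = ()
        if len(lines) > 1:
            cats = tuple(c.strip() for c in lines[1].split(",") if c.strip())
        return Verdict(safe=False, categories=cats)
    # Fail closed on anything unrecognized.
    return Verdict(safe=False)


def allow_request(user_is_authorized: bool, raw_verdict: str) -> bool:
    """Both gates must pass: the app's own access check AND the moderation verdict."""
    return user_is_authorized and parse_verdict(raw_verdict).safe
```

Failing closed on empty or unexpected output is a design choice for this sketch: a moderation model that misbehaves should not silently widen access, and a `safe` verdict never overrides a failed authorization check.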
Responsible Use
Use this tool only in owned environments, classroom labs, CTFs, or engagements where you have explicit written permission. Keep notes focused on findings, risk, and remediation.
Official Resource
https://github.com/meta-llama/llama-recipes/tree/main/recipes/responsible_ai/llama_guard