We conducted a safety and security evaluation of Llama models across generations, from Llama 2 to Llama 3.1. The paper examines key vulnerabilities, how the models' safety behavior has evolved across releases, and the enhancements that remain to be made. Written by our AI engineers from Stanford, it provides a comprehensive evaluation of the evolution of Llama.