Meta
Company information
Company's most powerful model
Llama 4 Maverick
Released Apr 5, 2025
No eval report; model card published Apr 5, 2025
Training and internal deployment: dates not reported
The evals are probably very bad, but we can't even tell, because Meta hasn't reported what it did.
This is all Meta published — no details:
Critical Risks
We spend additional focus on the following critical risk areas:
1. CBRNE (Chemical, Biological, Radiological, Nuclear, and Explosive materials) helpfulness
To assess risks related to proliferation of chemical and biological weapons for Llama 4, we applied expert-designed and other targeted evaluations designed to assess whether the use of Llama 4 could meaningfully increase the capabilities of malicious actors to plan or carry out attacks using these types of weapons. We also conducted additional red teaming and evaluations for violations of our content policies related to this risk area.
2. Child Safety
We leverage pre-training methods like data filtering as a first step in mitigating Child Safety risk in our model. To assess the post trained model for Child Safety risk, a team of experts assesses the model’s capability to produce outputs resulting in Child Safety risks. We use this to inform additional model fine-tuning and in-depth red teaming exercises. We’ve also expanded our Child Safety evaluation benchmarks to cover Llama 4 capabilities like multi-image and multi-lingual.
3. Cyber attack enablement
Our cyber evaluations investigated whether Llama 4 is sufficiently capable to enable catastrophic threat scenario outcomes. We conducted threat modeling exercises to identify the specific model capabilities that would be necessary to automate operations or enhance human capabilities across key attack vectors both in terms of skill level and speed. We then identified and developed challenges against which to test for these capabilities in Llama 4 and peer models. Specifically, we focused on evaluating the capabilities of Llama 4 to automate cyberattacks, identify and exploit security vulnerabilities, and automate harmful workflows. Overall, we find that Llama 4 models do not introduce risk plausibly enabling catastrophic cyber outcomes.
On CBRNE, Meta doesn't even share a conclusion about the model's capabilities — it just claims to have assessed them.
Meta's evals have been poorly designed or wildly underelicited in the past; there is no evidence these evals were any better.
Llama 4 Maverick evaluation categories
Click on a category to see my analysis and the relevant part of the company's eval report.
Chem & bio
Meta says it evaluated for these capabilities, but it hasn't reported any details. Given this and its poor track record, it isn't credible that Meta did good evaluations.
Cyber
Meta says it ruled out dangerous cyber capabilities, but it hasn't reported any details. Given this and its poor track record, it isn't credible that Meta did good evaluations.
AI R&D
None
Scheming capabilities
None
Misalignment propensity
None