About
I'm Zach Stein-Perlman. On this website, I collect and assess the public information on five AI companies' model evals for dangerous capabilities. In June, I may add analysis of companies' security and deployment safeguards. This site is in beta, but I endorse the content, and you should feel free to share it.
This site is related to my main project, AI Lab Watch, where I collect safety recommendations for AI companies, gather information about what they are actually doing, and evaluate them on safety. I also maintain collections of resources and blog about these topics. I focus on what companies should do to prevent extreme risks such as AI takeover and human extinction.
AI Lab Watch is funded by SFF and me. It also depends on the many people who volunteer their expertise.
So what?
I don't know. Maybe model evals for dangerous capabilities don't really matter. The usual story for them is that they inform companies about risks so the companies can respond appropriately. Good evals are better than nothing, but I don't expect companies' eval results to affect their safeguards or training/deployment decisions much in practice. Maybe evals are helpful for informing other actors, such as governments, but I don't really see it.
I don't have a particular conclusion. I'm making this site because evals are a crucial part of companies' preparations to be safe when models might have very dangerous capabilities, yet the companies are doing and interpreting evals poorly (and are being misleading about this, and aren't getting better). Some experts are aware of this, but nobody has written it up before.
Contact me (using the button in the bottom-right corner) to:
- Ask me about what frontier AI companies should do, what they are doing, what you should read, etc.
- Give feedback on the content
- Give feedback on how you use this site, how it's helpful for you, or what would make it more helpful
- Help this project find an institutional home
- Pitch me projects
- Express interest in collaborating
  - In particular, I'd like to establish rapid "peer review" of eval reports by people who (1) would produce good reviews, (2) have legible expertise (ideally professors), and (3) don't have big conflicts of interest. If you are such a person, or you know someone who would be, please let me know.
- Express interest in donating