AI innovation, especially for today's popular AI systems such as ChatGPT, has been heavily influenced by open research, from the foundational work on the transformer architecture to open releases of some of the most popular language models. Making AI more open and accessible, including not only machine learning models but also the datasets used for training and the research breakthroughs, cultivates safe innovation. Broadening access to artifacts such as models and training datasets allows researchers and users to better understand systems, conduct audits, mitigate risks, and find high-value applications.
The question of whether to fully open or fully close an AI system involves risks at either end. Fully closed systems are often inaccessible to researchers, auditors, and democratic institutions and can therefore obscure necessary information or illegal and harmful data. A fully open system with broader access can attract malicious actors. All systems, regardless of access, can be misused and require risk mitigation measures. Our approach to ethical openness acknowledges these tensions and combines institutional policies, such as documentation; technical safeguards, such as gating access to artifacts; and community safeguards, such as community moderation. We hold ourselves accountable for prioritizing and documenting our ethical work throughout all stages of AI research and development.
Open systems foster democratic governance, and increased access, especially for researchers, can help solve critical security concerns by enabling and empowering safety research. For example, the popular research on watermarking large language models by University of Maryland researchers was conducted using OPT, an open-source language model developed and released by Meta. Watermarking is an increasingly popular safeguard for AI detection, and openness enables this kind of safety research by providing access. Open research helps us understand how robust these techniques are, and accessible tooling, which we worked on with the University of Maryland researchers, can encourage other researchers to test and improve safety techniques. Open systems can also be more compliant with AI regulation than their closed counterparts: a recent Stanford University study assessed foundation model compliance with the EU AI Act and found that while many model providers, such as AI21 Labs, Aleph Alpha, and Anthropic, scored less than 25%, Hugging Face’s BigScience was the only model provider to score above 75%. Another organization centered on openness, EleutherAI, scored highest on disclosure requirements. Openness bolsters transparency and enables external scrutiny.
The AI field is currently dominated by a few high-resource organizations that give limited or no open access to novel AI systems, including those based on open research. To encourage competition and increase economic opportunity in AI, we should enable many people to contribute to broadening AI progress across useful applications, rather than only allowing a select few organizations to improve the depth of more capable models.
Full testimony: https://twitter.com/ClementDelangue/status/1673349227445878788