Considering how powerful AI systems are and the role they increasingly play in making important decisions about our lives, homes and societies, they receive surprisingly little formal oversight.
That’s starting to change, thanks to the thriving field of AI audits. When they work well, these audits allow us to reliably check how well a system is working and to figure out how to mitigate any potential bias or harm.
Famously, a 2018 audit of commercial facial recognition systems by AI researchers Joy Buolamwini and Timnit Gebru found that the systems did not recognize darker-skinned people as well as white people. For dark-skinned women, the error rate was up to 34%. As AI researcher Abeba Birhane points out in a new essay in Nature, the audit has “launched a series of critical work that has exposed the bias, discrimination and oppressive nature of facial analysis algorithms.” The hope is that by carrying out these kinds of audits on different AI systems, we will be better able to root out problems and have a broader conversation about how AI systems affect our lives.
Regulators are catching up, and that is partly driving the demand for audits. A new law in New York City will require all AI-powered recruiting tools to be audited for bias from January 2024. In the European Union, major tech companies will be required to conduct annual audits of their AI systems from 2024, and the upcoming AI Act will require audits of “risky” AI systems.
It’s a big ambition, but there are some huge obstacles. There is no common understanding of what an AI audit should look like, and there aren’t enough people with the right skills to conduct them. The few audits that take place today are mostly ad hoc and vary widely in quality, Alex Engler, who studies AI governance at the Brookings Institution, told me. One example he gave is from HireVue, an AI hiring company, which suggested in a press release that a third-party audit had found its algorithms to be free of bias. It turned out to be nonsense: The audit hadn’t really examined the company’s models and was subject to a nondisclosure agreement, meaning there was no way to verify what was found. It was essentially nothing more than a PR stunt.
One way the AI community is trying to address the lack of auditors is through bias bounty competitions, which work in a similar way to cybersecurity bug bounties: they call on people to use tools to identify and reduce algorithmic biases in AI models. One such contest was launched last week, organized by a group of volunteers, including Rumman Chowdhury, Twitter’s ethical AI lead. The team behind it hopes it will be the first of many.
It’s a good idea to create incentives for people to learn the skills needed to conduct audits, and to start building standards for what audits should look like by showing which methods work best.
The growth of these audits suggests that one day we could see cigarette-pack-style warnings that AI systems could be harmful to your health and safety. Other industries, such as chemicals and food, have regular audits to ensure products are safe to use. Could something like this become the norm in AI?