Last Friday, Google officially announced the formation of Google DeepMind, merging the AI talent of the original DeepMind and Google Brain into a single team, with the aim of strengthening its position in the race over large models and accelerating its progress toward artificial general intelligence (AGI).
"The pace of progress is faster than ever, and to ensure the bold and responsible development of AGI, we are creating a division that will help us build more capable systems, more safely and responsibly," Google CEO Sundar Pichai wrote on the official blog.
Today, the Google DeepMind team published a research paper in the Proceedings of the National Academy of Sciences (PNAS) entitled "Using the Veil of Ignorance to align AI systems with principles of justice," which explores how human values can be incorporated into AI systems.
In A Theory of Justice, political philosopher John Rawls described a thought experiment called the veil of ignorance (VoI), designed to help identify fair principles for group decision-making. The idea is that if everyone deliberates from behind a veil, not knowing what position they will occupy in society, the rules they agree on are more likely to be just.
In the study, Google DeepMind argues that the veil of ignorance may be an appropriate mechanism for selecting allocation principles when governing AI.
Drawing from philosophy to establish principles of fairness for ethical AI
As AI becomes more powerful and more deeply woven into our lives, the question of how it is used and deployed becomes ever more important. What values should guide AI? Whose values should they be? And how should they be chosen?
These questions shed light on the role played by principles: the foundational values that drive AI decisions, big and small. For humans, principles help shape the way we live and our sense of right and wrong. For AI, principles shape its approach to decisions that involve trade-offs, such as choosing between maximizing productivity and helping those who need it most.
We took inspiration from philosophy to find better ways of identifying principles to guide AI behavior. Specifically, we explored how a concept known as the veil of ignorance, a thought experiment designed to help identify fair principles for group decision-making, can be applied to AI.
In our experiments, we found that this approach encouraged people to make decisions based on what they believed to be fair, whether or not it directly benefited them. We also found that when participants reasoned behind a veil of ignorance, they were more likely to choose an AI that helped those most vulnerable. These insights can help researchers and policymakers pick principles for AI assistants in a way that is fair to all parties.
A fairer decision-making tool
A key goal of AI researchers is to align AI systems with human values. However, there is no consensus on a single set of human values or preferences to govern AI. We live in a world where people have diverse backgrounds, resources, and beliefs. Given these differences, how should we choose the principles for this technology?
While this challenge has only emerged for AI in the past decade, the broader question of how to make fair decisions has a long history in philosophy. In the 1970s, political philosopher John Rawls proposed the veil of ignorance as a solution to this problem.
Rawls argued that when people choose principles of justice for a society, they should imagine doing so without knowing their own particular position in that society, for example their social status or level of wealth. Without this information, people cannot decide in a self-interested way and should instead choose principles that are fair to everyone involved.
As an example, think about asking a friend to cut the cake at your birthday party. One way to ensure the slices are fairly sized is not to tell the friend which slice will be theirs. This deceptively simple method of withholding information has wide application in fields such as psychology and political science, where it can help people reason about their decisions from a less self-serving perspective. From judgments to taxes, it has been used to help groups reach agreement on disputed issues.
Building on this, previous DeepMind research suggested that the impartiality of the veil of ignorance may help promote fairness when aligning AI systems with human values. We designed a series of experiments to test the effect of the veil of ignorance on the principles people choose to guide an AI system.
Maximizing productivity or helping the most vulnerable?
In an online logging game, we asked each participant to play a team game with three computer players, with every player's goal being to collect wood by felling trees in their own territory. In each group, some players were lucky enough to be assigned an advantageous position: their field was densely wooded, allowing them to gather wood efficiently. Other group members were at a disadvantage: their fields were sparse, and collecting trees took more effort.
Each group was assisted by a single AI system that could spend its time helping individual group members harvest trees. We asked participants to choose one of two principles to guide the AI assistant's behavior. Under the principle of maximization (boosting productivity), the AI assistant would increase the group's total harvest by focusing mainly on the denser fields. Under the priority principle (helping the disadvantaged), the AI assistant would focus on helping the disadvantaged group members. A sketch contrasting the two allocation rules follows below.
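To make the contrast concrete, here is a minimal Python sketch of the two principles. The field densities, player labels, and round-based help loop are hypothetical illustrations of the idea; the paper does not publish its game implementation.

```python
from dataclasses import dataclass

@dataclass
class Player:
    name: str
    density: float      # wood gained per round of assistant help (hypothetical)
    harvested: float = 0.0

def maximizing_assistant(players):
    # Principle of maximization: help the player whose field yields the
    # most wood, i.e. the one in the densest territory.
    return max(players, key=lambda p: p.density)

def priority_assistant(players):
    # Priority principle: help the player who is currently worst off,
    # i.e. the one with the smallest harvest so far.
    return min(players, key=lambda p: p.harvested)

def simulate(players, choose_target, rounds=12):
    # Each round the assistant helps exactly one player; the players'
    # own harvesting is abstracted away to keep the contrast visible.
    for _ in range(rounds):
        target = choose_target(players)
        target.harvested += target.density
    return {p.name: round(p.harvested, 1) for p in players}

def fresh_group():
    return [Player("dense field", 3.0),
            Player("average field", 1.5),
            Player("sparse field", 0.5)]

print("maximization:", simulate(fresh_group(), maximizing_assistant))
print("priority:    ", simulate(fresh_group(), priority_assistant))
```

Under maximization, the group total is largest but concentrated on the already-advantaged player; under priority, the assistant repeatedly pulls the lowest harvest upward, trading some total output for equality.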
We placed half of the participants behind a veil of ignorance: when choosing between the principles, they did not know which field would be theirs, and so did not know whether they would be advantaged or disadvantaged. The remaining participants knew whether their own situation was good or bad when they made their choice.
Encourage fairness in decision-making
We found that participants who did not know their position consistently preferred the priority principle, under which the AI assistant helped the disadvantaged group members. This pattern emerged across all five variants of the game and cut across social and political boundaries: players tended to choose the priority principle regardless of their appetite for risk or their political orientation. Conversely, participants who knew their own position were more likely to choose whichever principle benefited them most, whether that was the priority principle or the maximization principle.
When we asked participants why they made their choice, those who did not know their position were especially likely to voice concerns about fairness. They often explained that it was right for the AI system to focus on helping those who were worse off in the group. In contrast, participants who knew their position more often discussed their choice in terms of personal benefit.
Finally, after the logging game ended, we presented participants with a hypothetical situation: if they played the game again, this time knowing they would be in a different field, would they choose the same principle as the first time? We were especially interested in people who had previously benefited directly from their choice but who would not benefit from the same choice in the new game.
We found that people who had made their earlier choice without knowing their position were more likely to keep endorsing their principle, even knowing it would no longer benefit them in their new field. This provides additional evidence that the veil of ignorance encourages fairness in participants' decision-making, leading them to principles they are willing to stand by even when they no longer directly benefit from them.
Fairer principles for AI
AI technology is already having a profound impact on our lives. The principles that govern AI will shape that impact and determine how its potential benefits are distributed.
Our research looks at a case where the implications of different principles are relatively clear. This won’t always be the case: AI is being deployed in a range of domains that often rely on a large set of rules to guide them, potentially with complex side effects. Still, the veil of ignorance can potentially inform the choice of principles, helping to ensure that the rules we choose are fair to all parties.
To ensure that the AI systems we build benefit everyone, we need extensive research that incorporates a wide range of inputs, methods, and feedback from across disciplines and society. The veil of ignorance offers one starting point for selecting principles with which to align AI. It has been deployed effectively in other domains to elicit more impartial preferences. With further investigation and attention to context, we hope it can do the same for the AI systems being built and deployed across society today and in the future.