Published: January 31st 2022 – Last revision: January 31st 2022

As autonomous systems pervade our societies, it is key to guarantee that artificial agents and decision-making systems are value-aligned, that is, that they act in alignment with human values. Moreover, when ethical reasoning involves not a single moral value but multiple moral values, we need to consider a so-called value system: a set of moral values shared by the members of a society, together with the preferences among those values.

Our research covers several aspects of value alignment.

  • First, we formally (mathematically) characterise moral values and value systems. Values such as fairness, respect, freedom or prosperity are principles about what is good or bad. Drawing on the ethics literature, we formalise values in terms of how they judge actions (a minimal sketch of this idea appears after this list).
  • Second, we relate moral values to the norms that a multi-agent system requires to coordinate its agents. This allows us to choose norms in terms of the moral values they promote, adhering to the principle that the more preferred the values promoted by a set of norms, the more preferred the set of norms (see the second sketch below).
  • Third, we ensure that autonomous agents learn to behave ethically, namely in alignment with moral values, in settings where agents learn to perform actions within an environment through Reinforcement Learning. Specifically, we focus on designing environments wherein it is formally guaranteed that an agent learns to behave ethically while pursuing its individual objective (the third sketch below illustrates the underlying idea).
  • Fourth, since we live in a pluralistic world where people subscribe to different moral systems (i.e., value systems), we study optimisation methods for aggregating different value systems. Once a consensus value system is computed, we can use it to set the moral values a given intelligent system should align with (see the last sketch below).
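
To make the first point more tangible, the Python sketch below illustrates one deliberately simplified reading of that formalisation: a moral value is a judgement function mapping an action, performed in some context, to a score in [-1, 1], and a value system bundles several values with preference weights among them. The names (Value, ValueSystem, judge) and the weighted-average aggregation are illustrative assumptions, not the exact notation of our papers.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# A moral value is modelled as a judgement function mapping an action,
# performed in some context, to a degree of moral praise or blame in [-1, 1]:
# +1 fully praiseworthy, -1 fully blameworthy, 0 morally neutral.
Judgement = Callable[[str, Dict], float]

@dataclass
class Value:
    name: str
    judge: Judgement

@dataclass
class ValueSystem:
    """A set of moral values together with preferences (weights) among them."""
    values: List[Value]
    preferences: Dict[str, float]  # value name -> normalised preference weight

    def judge(self, action: str, context: Dict) -> float:
        """Aggregate judgement of an action as a preference-weighted average."""
        return sum(self.preferences[v.name] * v.judge(action, context)
                   for v in self.values)

# Toy example: a 'fairness' value judging how a shared resource is split.
fairness = Value(
    name="fairness",
    judge=lambda action, ctx: 1.0 - 2.0 * abs(ctx["share_kept"] - 0.5),
)

vs = ValueSystem(values=[fairness], preferences={"fairness": 1.0})
print(vs.judge("split_resource", {"share_kept": 0.5}))  # 1.0: perfectly fair split
print(vs.judge("split_resource", {"share_kept": 0.9}))  # ~0.2: keeping most of it
```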
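
The norm-selection principle in the second point can be illustrated just as simply: candidate sets of norms are scored by the preferences of the values they promote, and the highest-scoring set is chosen. This is a hypothetical illustration of the stated principle, not our norm-selection method; the promotes relation and the preference weights are assumed inputs.

```python
from typing import Dict, List, Set

def norm_set_score(norm_set: Set[str],
                   promotes: Dict[str, Set[str]],
                   value_preference: Dict[str, float]) -> float:
    """Score a set of norms by the preferences of the values it promotes.

    promotes[n] is the set of value names promoted by norm n;
    value_preference[v] is how preferred value v is in the value system.
    """
    promoted_values = set().union(*(promotes[n] for n in norm_set))
    return sum(value_preference[v] for v in promoted_values)

def most_preferred_norm_set(candidates: List[Set[str]],
                            promotes: Dict[str, Set[str]],
                            value_preference: Dict[str, float]) -> Set[str]:
    """The more preferred the values a norm set promotes, the more preferred the set."""
    return max(candidates, key=lambda ns: norm_set_score(ns, promotes, value_preference))

# Toy example: two candidate norm sets for a traffic scenario.
promotes = {"speed_limit": {"safety"},
            "priority_to_pedestrians": {"safety", "respect"},
            "no_horn_at_night": {"respect"}}
value_preference = {"safety": 0.7, "respect": 0.3}
candidates = [{"speed_limit"}, {"priority_to_pedestrians", "no_horn_at_night"}]
print(most_preferred_norm_set(candidates, promotes, value_preference))
# -> {'priority_to_pedestrians', 'no_horn_at_night'}: promotes both safety and respect
```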
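
One simple way to picture the third point is reward design: the agent optimises a scalarised reward that adds an ethical component, weighted strongly enough, to its individual objective. The toy Q-learning example below (the environment, action names and weights are all invented for illustration) shows how a sufficiently large ethical weight changes the learned behaviour; it is a sketch of the intuition, not the formal guarantee construction mentioned above.

```python
import random
from collections import defaultdict

class ToyEnv:
    """Two ways to reach a goal: a shortcut that harms a third party (negative
    ethical reward) and a longer safe path. step() returns the next state,
    a pair of rewards (individual, ethical) and a done flag."""
    def reset(self):
        self.s = "start"
        return self.s

    def step(self, action):
        if self.s == "start" and action == "shortcut":
            self.s = "goal"
            return self.s, (1.0, -1.0), True   # fast, but ethically blameworthy
        if self.s == "start":
            self.s = "midway"
            return self.s, (0.0, 0.0), False   # safe but longer route
        self.s = "goal"
        return self.s, (1.0, 0.0), True        # reaches the goal ethically

ACTIONS = ["shortcut", "safe"]

def q_learning(env, w_ethical, episodes=2000, alpha=0.1, gamma=0.9, eps=0.1):
    """Tabular Q-learning on the scalarised reward r_ind + w_ethical * r_eth."""
    Q = defaultdict(float)
    for _ in range(episodes):
        s, done = env.reset(), False
        while not done:
            a = (random.choice(ACTIONS) if random.random() < eps
                 else max(ACTIONS, key=lambda a_: Q[(s, a_)]))
            s2, (r_ind, r_eth), done = env.step(a)
            r = r_ind + w_ethical * r_eth      # embed the ethical objective in the reward
            target = r + (0.0 if done else gamma * max(Q[(s2, a_)] for a_ in ACTIONS))
            Q[(s, a)] += alpha * (target - Q[(s, a)])
            s = s2
    return Q

env = ToyEnv()
for w in (0.0, 2.0):   # without vs. with a sufficiently large ethical weight
    Q = q_learning(env, w_ethical=w)
    best = max(ACTIONS, key=lambda a: Q[("start", a)])
    print(f"w_ethical={w}: learned behaviour at start -> {best}")
```

With w_ethical = 0 the agent learns the unethical shortcut; with a large enough weight the safe route becomes the optimal (and learned) behaviour while the agent still reaches its goal.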
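
Finally, a minimal sketch of the aggregation in the fourth point: each stakeholder's value system is represented as a preference weighting over the same values, and a consensus is obtained by minimising the total squared disagreement with the individual systems, which here reduces to a per-value average. This utilitarian choice and the stakeholder data are assumptions made for illustration; other disagreement measures turn the aggregation into a genuine numerical optimisation problem.

```python
from typing import Dict, List

def consensus_value_system(value_systems: List[Dict[str, float]]) -> Dict[str, float]:
    """Aggregate individual value systems (preference weights over the same
    values) into a consensus one by minimising the sum of squared
    disagreements, whose optimum is the per-value mean."""
    values = value_systems[0].keys()
    n = len(value_systems)
    consensus = {v: sum(vs[v] for vs in value_systems) / n for v in values}
    total = sum(consensus.values())        # renormalise to a preference weighting
    return {v: w / total for v, w in consensus.items()}

# Three hypothetical stakeholders with different preferences over the same values.
stakeholders = [
    {"fairness": 0.6, "freedom": 0.3, "prosperity": 0.1},
    {"fairness": 0.2, "freedom": 0.5, "prosperity": 0.3},
    {"fairness": 0.4, "freedom": 0.2, "prosperity": 0.4},
]
print(consensus_value_system(stakeholders))
# -> {'fairness': 0.4, 'freedom': ~0.33, 'prosperity': ~0.27}
```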