“Value alignment”, roughly the problem of ensuring that the outputs of Artificial Intelligence and other machine systems align with human values, has become an urgent problem as computer technologies begin to encroach on central domains of human decision-making. Existing strategies for alignment make no allowance for the possibility of ‘hard choices’ – distinct from cases of uncertainty, incompleteness, and indeterminacy – but assume that in a choice between A and B, machine outputs must fall into one of three categories: choose A, choose B, or arbitrarily select between them. But human life is not so neat. If we are to achieve value alignment, we need a different approach to AI design that makes room for the existence of hard choices. In this talk, I present an alternative framework for AI design that allows machines to recognize hard choices and puts humans ‘in the loop’ in a novel way. Progress in building such systems is underway.
Event in English.
Panel discussion with Ruth Chang.