Hiring in life sciences? Share your open positions with our professional community. Read more Close

Advertisement

Machines that halt resolve the undecidability of artificial intelligence alignment.

Created on 05 May 2025

Authors

Gabriel A Melo, Marcos R O A Máximo, Nei Y Soma, Paulo A L Castro

Published in

Scientific reports. Volume 15. Issue 1. Pages 15591. May 04, 2025. Epub May 04, 2025.

Abstract

The inner alignment problem, which asserts whether an arbitrary artificial intelligence (AI) model satisfices a non-trivial alignment function of its outputs given its inputs, is undecidable. This is rigorously proved by Rice's theorem, which is also equivalent to a reduction to Turing's Halting Problem, whose proof sketch is presented in this work. Nevertheless, there is an enumerable set of provenly aligned AIs that are constructed from a finite set of provenly aligned operations. Therefore, we argue that the alignment should be a guaranteed property from the AI architecture rather than a characteristic imposed post-hoc on an arbitrary AI model. Furthermore, while the outer alignment problem is the definition of a judge function that captures human values and preferences, we propose that such a function must also impose a halting constraint that guarantees that the AI model always reaches a terminal state in finite execution steps. Our work presents examples and models that illustrate this constraint and the intricate challenges involved, advancing a compelling case for adopting an intrinsically hard-aligned approach to AI systems architectures that ensures halting.

PMID:
40320467
Bibliographic data and abstract were imported from PubMed on 05 May 2025.

Read full publication at:
Please sign in to see all details.

Advertisement

Stats

  • Community rating n/a 0 votes
  • Reviewers' rating n/a 0 votes
  • Your rating

1-terrible, 9-excellent. How would you rate this publication? Sign in in to submit your rating.

  • Recommendations n/a n/a positive of 0 vote(s)
  • Views 47
  • Comments 0

Recommended by

  • No recommendations yet.

Post a comment

You need to be signed in to post comments. You can sign in here.

Comments

There are no comments yet.

Advertisement