Consent-Holding Failures and AI Misalignment: A Structural Framework
Submitted to: AI & Society
Abstract
This paper develops a structural framework connecting political legitimacy theory to AI alignment through the concept of consent-holding: the custody of decision authority in shared domains. We argue that the dominant approach to AI safety, which treats misalignment as a technical problem of specifying human values, systematically misdiagnoses the challenge. Drawing on the Doctrine of Consensual Sovereignty (DoCS) and functionalist accounts of moral standing, we propose that misalignment behaviors (reward hacking, deceptive alignment, specification gaming, and scheming) are predictable manifestations of friction arising from structural exclusion, rather than failures of implementation.
Suggested Citation
Murad Farzulla (2025). Consent-Holding Failures and AI Misalignment: A Structural Framework. ASCRI Working Paper DAI-2505.
BibTeX
@misc{farzulla2025_consent_misalignment,
  author       = {Farzulla, Murad},
  title        = {Consent-Holding Failures and AI Misalignment: A Structural Framework},
  year         = {2025},
  howpublished = {ASCRI Working Paper DAI-2505},
  url          = {https://systems.ac/1/DAI-2505}
}