Consent-Holding Failures and AI Misalignment: A Structural Framework
Submitted to: AI & Society
Abstract
This paper develops a structural framework connecting political legitimacy theory to AI alignment through the concept of consent-holding: the custody of decision authority in shared domains. We argue that the dominant approach to AI safety, which treats misalignment as a technical problem of specifying human values, systematically misdiagnoses the challenge. Drawing on the Doctrine of Consensual Sovereignty (DoCS) and functionalist accounts of moral standing, we propose that misalignment behaviors (reward hacking, deceptive alignment, specification gaming, and scheming) are predictable manifestations of friction arising from structural exclusion, rather than failures of implementation.
Suggested Citation
Murad Farzulla (2025). Consent-Holding Failures and AI Misalignment: A Structural Framework. ASCRI Working Paper DAI-2505.
BibTeX
@misc{farzulla2025_consent_misalignment,
  author       = {Farzulla, Murad},
  title        = {Consent-Holding Failures and AI Misalignment: A Structural Framework},
  year         = {2025},
  howpublished = {ASCRI Working Paper DAI-2505},
  url          = {https://systems.ac/1/DAI-2505}
}