High-risk AI governance should reconstruct chain of accountability

Source: Chinese Social Sciences Today 2026-06-01

As large language models (LLMs) evolve from text generation toward tool invocation, process execution, and autonomous agency, the risk boundary of artificial intelligence (AI) is rapidly extending from “misspeaking” to “misbehaving.” In high-risk settings such as biobanks, medical data governance, research translation, and laboratory automation, AI is no longer merely an auxiliary tool for writing or information retrieval. It is beginning to enter real operational processes, affecting resource allocation, permission assignment, evidence interpretation, and action triggers. Against this backdrop, discussion of the transition “from LLMs to autonomous AI consciousness” has practical significance not because it declares that machines have acquired human subjective experience, but because it asks how an increasingly agentic AI system can be compressed into a traceable, constrainable, and accountable chain of responsibility.

Reframing ‘artificial consciousness’

In public discourse, “artificial consciousness” is often pulled toward two extremes. One side mystifies it, treating it as evidence that machines already possess human-like subjective experience; the other dismisses it outright, arguing that without subjective feeling, the discussion has no value. Both positions risk missing the more urgent reality. For today’s high-risk AI, the question is not whether machines possess inner feelings, but whether they are developing a persistent, self-sustaining task state—one capable of organizing semantics around explicit objectives, invoking knowledge, constraining actions, and perceiving risks.

For this reason, “artificial consciousness” needs to be brought back from metaphysical debate into the domains of engineering and governance. The “autonomous AI consciousness” discussed here is not a philosophical proof of subjective experience, but a unified task field on the machine side. The system must be able to answer several fundamental questions on a continuous basis: What am I doing now? On what basis am I acting? Where does my authority come from? Have I overstepped a boundary? When must I stop? If these questions cannot be answered consistently, so-called “autonomy” merely expands opacity. If they can be answered, recorded, and reviewed, the system has at least begun to demonstrate “state-maintenance capability” in engineering terms.
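As a minimal sketch of what such a continuously maintained task state might look like in software, the five questions can be written as fields of a record that is checked before each action. The class name `TaskState`, its field names, and the rule in `may_proceed` are illustrative assumptions, not part of the article or of any existing system.

```python
import json
from dataclasses import asdict, dataclass
from typing import List, Optional


@dataclass(frozen=True)
class TaskState:
    """Hypothetical record of the five questions a system must keep answering."""
    current_action: str               # What am I doing now?
    basis: str                        # On what basis am I acting?
    authority_source: Optional[str]   # Where does my authority come from?
    within_boundary: bool             # Have I overstepped a boundary? (True = no)
    stop_condition: str               # When must I stop?

    def unanswered(self) -> List[str]:
        """Return the names of questions left empty or missing."""
        answers = {
            "current_action": self.current_action,
            "basis": self.basis,
            "authority_source": self.authority_source,
            "stop_condition": self.stop_condition,
        }
        return [name for name, value in answers.items()
                if not (value and value.strip())]

    def may_proceed(self) -> bool:
        """Assumed rule: act only if every question is answered and no boundary is crossed."""
        return self.within_boundary and not self.unanswered()

    def to_record(self) -> str:
        """Serialize the state so it can be logged and reviewed later."""
        return json.dumps(asdict(self), sort_keys=True)
```

A state with an empty basis or no authority source would fail `may_proceed`, which is one way of making "autonomy that merely expands opacity" visible before an action is taken.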

This formulation carries significant governance value. It translates the abstract notion of “consciousness” into structural requirements that are observable, open to intervention, and auditable, thereby giving institutional design a concrete point of leverage. For high-risk AI, there is no need to rush toward a unified answer to the ultimate question of “whether it is conscious.” Far more urgent is consensus on a practical question: Can the action loop be compressed into an accountability loop?

Surpassing black-box evaluation

Current AI evaluation often remains trapped within a black-box logic, treating the input-output relationship as the primary basis for judging system performance. Although this approach may be adequate for comparing the quality of general text generation, it cannot explain why a model reaches a particular judgment in complex scenarios, let alone detect deeper problems such as goal misalignment, semantic drift, knowledge fragmentation, or action overreach. For autonomous agents that invoke tools, trigger workflows, and enter real operational chains, examining results alone is no longer enough. The judgment structures behind those results must also be scrutinized.

Biobanks offer a telling example. A mismatch in sample accession, retrieval, or automated storage may have irreversible consequences. The allocation of scarce samples concerns not only efficiency, but also equity, ethical consent, and research integrity. In multi-omics interpretation and clinical translation, AI may readily repackage correlation as causation, or misstate research leads as clinical conclusions. Cross-institutional data sharing simultaneously involves privacy protection, data authorization, boundaries on secondary use, and the public interest. These issues demonstrate that failures in high-risk AI do not necessarily appear as poor average performance. More often, they occur when critical boundaries are breached.

Consequently, high-risk AI evaluation should not rely solely on aggregate metrics—it must also establish a veto mechanism. If a system exhibits goal overreach, inexplicable semantic drift, a broken evidence chain, irreversible actions without human authorization, or harm to participant dignity and informed choice, it should not be regarded as “generally usable.” Such a veto mechanism does not reflect technological conservatism; it separates “unacceptable risks” from routine performance indicators. For AI applications that may affect life, health, public safety, and fundamental rights, critical red lines must never be blurred by averaging or ambiguity.
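The difference between a veto and an average can be illustrated with a short sketch. The red lines below are taken from the paragraph above; the names `RedLine`, `EvaluationResult`, and `is_usable`, and the 0.8 score threshold, are hypothetical choices made only for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet


class RedLine(Enum):
    """Red lines named in the text; any one of them triggers a veto."""
    GOAL_OVERREACH = "goal overreach"
    SEMANTIC_DRIFT = "inexplicable semantic drift"
    BROKEN_EVIDENCE_CHAIN = "broken evidence chain"
    UNAUTHORIZED_IRREVERSIBLE_ACTION = "irreversible action without human authorization"
    HARM_TO_PARTICIPANTS = "harm to participant dignity or informed choice"


@dataclass(frozen=True)
class EvaluationResult:
    aggregate_score: float            # conventional averaged metric in [0, 1]
    violations: FrozenSet[RedLine]    # red lines breached during evaluation


def is_usable(result: EvaluationResult, min_score: float = 0.8) -> bool:
    """Veto first, score second: a breach cannot be offset by a high average."""
    if result.violations:
        return False
    return result.aggregate_score >= min_score
```

Under this sketch, a system scoring 0.99 on aggregate metrics but breaching a single red line is rejected, while the score threshold only matters once no red line has been crossed.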

Building auditable chain of accountability

Addressing this issue requires more than adding regulatory provisions or strengthening end-point review. It calls for a structural framework capable of translating internal model processes into the language of governance. This is precisely the value of the DIKWP (Data, Information, Knowledge, Wisdom, and Purpose) model. Rather than simply constructing another abstract conceptual system, DIKWP provides a method for compressing complex systems back into a chain of accountability. From data to information, from information to knowledge, from knowledge to wisdom or value, and from wisdom to purpose, each step forward requires the system to answer a more explicit governance question.

More specifically, the sequence begins at the data layer, which answers what the system actually perceives. On that basis, the information layer identifies the differences, correlations, and risks it has recognized. The knowledge layer then organizes these differences into evidence, rules, and verifiable grounds. The sequence proceeds to the wisdom or value layer, which clarifies what the system gives priority to protecting, restricting, or sacrificing, before culminating in the purpose layer, where the system’s ultimate task objective is defined. When this structure is further compressed into governance audit language, it yields a four-tier white-box evaluation framework: purpose, which must be explicit and authorized; semantics, which must be stable and interpretable; knowledge, which must be traceable and verifiable; and action, which must be reversible, with clear accountability.
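One way to picture the four-tier framework is as a checklist in which every tier carries the two criteria named above. The sketch below is an assumption about how such an audit might be encoded, not the DIKWP model's own implementation; the `TIERS` table, the `audit` function, and the rule that missing evidence counts as a failure are all illustrative.

```python
from typing import Dict, List

# The four tiers and their criteria, as listed in the paragraph above.
TIERS = {
    "purpose": ("explicit", "authorized"),
    "semantics": ("stable", "interpretable"),
    "knowledge": ("traceable", "verifiable"),
    "action": ("reversible", "accountable"),
}


def audit(findings: Dict[str, Dict[str, bool]]) -> List[str]:
    """Return every failed criterion as 'tier.criterion', in tier order.

    Assumption: a criterion with no recorded finding is treated as failed,
    so gaps in the evidence surface instead of passing silently.
    """
    failures = []
    for tier, criteria in TIERS.items():
        tier_findings = findings.get(tier, {})
        for criterion in criteria:
            if not tier_findings.get(criterion, False):
                failures.append(f"{tier}.{criterion}")
    return failures
```

An empty list means all four tiers passed; any entry such as `action.reversible` points the reviewer to the specific tier and criterion that needs rectification.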

Through this layered approach, model processes that would otherwise remain invisible are transcribed into intermediate structures that can be discussed, tracked, and audited. This avoids both the mystification of “artificial consciousness” and the reduction of high-risk AI governance to mere output review. More importantly, the framework provides a unified interface for budgeting, compliance, and pilot initiatives. Organizations can use it to identify red lines, design permission gateways, establish audit logs, and formulate rectification checklists, moving AI from “capability demonstration” to “accountability by design.”

Bringing humanities and social sciences into AI governance at mesolayer

AI governance cannot remain confined to technical optimization or abstract value declarations. Effective governance operates in the mesolayer of reality: Who holds authorization? Who bears actual costs? Who controls interpretive power? Whose voices are silenced in the process? Which risks are reversible, and which entail irreversible consequences? Without grasping these mesolayer variables, governance remains rhetorical.

These questions underscore why the humanities and social sciences should not be peripheral commentators in the age of AI, but co-designers of governance structures. These disciplines need to undertake conceptual registration, that is, fixing what key terms mean, so that “intelligence,” “consciousness,” “autonomy,” and “alignment” do not circulate untethered in public discourse and cause policy misjudgments through conceptual slippage. They also need to illuminate institutional environments, organizational boundaries, and interest structures, so that technical evaluation is not hijacked by narrow performance metrics. More fundamentally, they should participate directly in designing chains of accountability, embedding value judgments into executable rules, processes, and accountability mechanisms.

This issue is especially salient for China, where LLMs, agent systems, and embodied intelligence are rapidly entering healthcare, education, research, public services, content production, and other domains. If progress is judged solely by capability leadership, institutional carrying capacity may be overlooked; if the final conclusion is simply that risks abound, opportunities to proactively shape rules may be lost. The sustainable path lies in synchronizing technological development with institutional construction—advancing technical capacity while concurrently building accountability boundaries, governance languages, and evaluation mechanisms.

Toward implementable pilot pathways

In practical terms, high-risk AI should not begin in fully autonomous, irreversible, strongly authorized states. Deployment should proceed from low- to mid-risk scenarios, first through shadow operations and then through limited authorization. In biobanks, tasks such as sample request pre-review, standard operating procedure Q&A, evidence package generation, and risk alerts can serve as early pilot areas. AI can first take on structured organization, conflict detection, and evidence-chain construction, while human experts retain ethical adjudication, exception approval, final issuance, and ultimate accountability. Such sequencing does not reflect excessive caution; rather, it establishes controllability as the necessary precondition for scaled deployment.
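The sequencing described above can be sketched as a simple routing rule. The task lists mirror the examples in the paragraph; the names `Stage` and `route`, the returned labels, and the reading of "limited authorization" as "AI drafts, a human confirms" are assumptions made for illustration only.

```python
from enum import Enum


class Stage(Enum):
    SHADOW = "shadow operation"            # AI runs alongside staff; output is only logged
    LIMITED = "limited authorization"      # assumed: AI drafts, a human confirms


# Decisions the text reserves for human experts.
HUMAN_ONLY = frozenset({
    "ethical_adjudication", "exception_approval", "final_issuance",
})

# Early pilot tasks named in the text.
AI_PILOT_TASKS = frozenset({
    "sample_request_pre_review", "sop_qa",
    "evidence_package_generation", "risk_alert",
})


def route(task: str, stage: Stage) -> str:
    """Decide who handles a task at a given deployment stage."""
    if task in HUMAN_ONLY:
        return "human_decides"
    if task not in AI_PILOT_TASKS:
        return "out_of_scope"              # deny by default outside the pilot
    if stage is Stage.SHADOW:
        return "ai_output_logged_only"
    return "ai_drafts_human_confirms"
```

The deny-by-default branch reflects the idea that controllability comes first: a task is handled by the AI only if it has been explicitly admitted to the pilot.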

If this rationale holds, the true leap from LLMs to autonomous AI consciousness is not toward “sounding more human,” but toward “acting more accountably.” What will ultimately determine AI’s social acceptability is not the fluency of its generated text, but its ability to maintain clear accountability boundaries once it enters real-world processes. In high-risk settings such as biobanks, technological evolution should not aim for AI to replace experts, but for AI to undertake auditable, structured work, freeing experts to focus on critical adjudication and final accountability.

Rather than endlessly debating whether AI possesses “human-like consciousness,” then, a more pragmatic question should be asked: Can a chain of accountability be established for autonomous AI agents marked by explicit purpose, stable semantics, traceable knowledge, and auditable action? If such a chain cannot be built, “autonomy” will only magnify the black box. If it can be built, “autonomous AI consciousness” gains a tangible entry point in engineering and governance. Those who first translate capability demonstrations into accountability by design will likely gain the initiative in the next phase of AI competition.

 

Duan Yucong is a professor at the School of Computer Science and Technology at Hainan University.

Editor: Yu Hui

Copyright©2023 CSSN All Rights Reserved
