
Data-centric AI governance for responsible organizational value: evidence from a European public administration

AI & Society Journal · by Martín, Carlos A. · March 19, 2026

This paper explores how data-centric artificial intelligence governance frameworks enable responsible organizational value creation within complex institutional environments. Using an empirical case from a European public administration, it examines the implementation of an automated legislative monitoring system designed to detect, classify, and summarize regulatory information. The study highlights the shift from model-centric experimentation to a mature data governance and Machine Learning Operations (MLOps) framework, integrating continuous human oversight and ethical accountability. A qualitative case study, DGOBCAN-AI, was employed, combining technical documentation, process observation, and organizational evaluation. The system evolved from a basic extract–transform–load (ETL) script into a comprehensive MLOps ecosystem with continuous human-in-the-loop validation.

1 Introduction

The adoption of artificial intelligence (AI) represents a strategic choice by public organizations to navigate the complexities of the digital era. Rather than an autonomous force, its integration reflects a deliberate institutional effort to enhance efficiency while addressing the evolving requirements of public trust and fundamental rights. Within European public organizations, its deployment presents a dual challenge: leveraging its potential to enhance institutional efficiency and responsiveness, while ensuring the protection of fundamental rights and public trust. The core question is no longer whether AI should be used, but how to govern it responsibly (Ricciardi Celsi and Zomaya 2025).

In the European context, public sector digital transformation operates under increasingly stringent ethical and legal frameworks. The recent European Artificial Intelligence Act (European Union 2024) introduces a risk-based classification that mandates institutions to adopt mechanisms of human oversight, traceability, and accountability. Consequently, AI adoption entails a redefinition of public value, conceived not only as technical efficiency, but as legitimacy, transparency, and organizational sustainability (Brown et al. 2021).

This study is motivated by a concrete organizational challenge observed within a European public administration and examined through the DGOBCAN-AI case study. At the Canary Islands Government Delegation in Madrid, a legal analyst manually reviews the Spanish Official State Gazette (BOE) every day to identify legislative developments potentially affecting the Canary Islands. Although only a very small fraction of daily announcements is relevant, identifying these cases requires extensive contextual knowledge and sustained human attention. The task illustrates a broader dilemma faced by public administrations: while artificial intelligence promises efficiency gains, translating normative principles of responsible AI into operational practice remains highly uncertain in real administrative environments.

Initial attempts to automate this process through a simple rule-based and model-centric approach produced unsatisfactory results, largely due to extreme data scarcity and ambiguity in defining legislative relevance. Rather than resolving the problem, early automation efforts revealed that governance arrangements, data practices, and human expertise were more decisive than algorithmic sophistication alone. Within this context, the DGOBCAN-AI system emerged as an empirical response to this administrative challenge and progressively evolved into a broader governance puzzle: how can a public organization operationalize responsible AI in practice when technical performance depends fundamentally on data governance, institutional learning, and continuous human oversight?

This article examines how responsible AI governance is enacted in practice through the empirical analysis of the DGOBCAN-AI case, a European public administration initiative that developed an automated legislative monitoring system based on natural language processing (NLP) and machine learning operations (MLOps) architecture (Kreuzberger et al. 2023). By tracing the organizational, technical, and governance dynamics that shaped the system’s implementation, the study explores how efficiency gains, innovation processes, and ethical responsibility become intertwined within everyday administrative practice. Building on these empirical findings, the article subsequently develops the concept of Responsible Public Value as an analytical interpretation emerging from the case study, analyzing the interaction between technological infrastructure, organizational processes, and ethical responsibility.

The DGOBCAN-AI case illustrates how a data-centric approach to AI governance can transform administrative processes while reinforcing ethical standards and organizational learning. It provides empirical evidence that responsible AI is both feasible and desirable when supported by robust governance mechanisms, human oversight, and sustainable infrastructure.

Building on public value theory (Moore 1995; Janssen et al. 2020) and the literature on responsible and algorithmic AI governance (Floridi and Cowls 2021; Danaher et al. 2021), this study pursues two interrelated research objectives.

First, it aims to contribute to the debate on responsible AI by moving beyond predominantly normative frameworks and examining how principles such as transparency, accountability, and human oversight are operationalized through concrete organizational and technical arrangements. In particular, the study investigates how data-centric AI practices and MLOps infrastructures translate ethical requirements into day-to-day governance routines within a public administration.

Second, the paper seeks to reinterpret the concept of public value in the context of AI-enabled public sector transformation by explicitly linking value creation to underlying technical infrastructures. Through an in-depth empirical case study, it conceptualizes Responsible Public Value as an outcome of the interaction between operational efficiency, institutional innovation, and ethical responsibility, demonstrating how public value is not only defined at the policy level but enacted through data governance, reproducible pipelines, and continuous human oversight.

Despite the growing body of literature on responsible and trustworthy AI in the public sector, existing research remains predominantly normative, focusing on ethical principles, regulatory frameworks, and high-level governance guidelines rather than on their concrete operationalization within organizational settings. In parallel, public value scholarship has extensively conceptualized the societal outcomes of digital transformation, yet it has rarely examined the underlying technical infrastructures through which such value is produced and sustained. As a result, empirical evidence linking data-centric AI practices, MLOps infrastructures, and the creation of Responsible Public Value remains scarce. This study addresses this gap by providing an in-depth empirical analysis of how a data-centric AI governance approach, implemented through a mature MLOps architecture and continuous human oversight, enables the operationalization of Responsible Public Value within a European public administration.

To examine how these governance dynamics materialize in practice, the article adopts an empirically grounded case study approach centered on the DGOBCAN-AI system. The analysis begins with the organizational and methodological context of the case, allowing the empirical dynamics of AI implementation, governance, and human oversight to unfold before theoretical implications are fully articulated. In this way, the concept of Responsible Public Value is developed progressively from the empirical evidence rather than introduced as a priori theoretical propositions.

The remainder of the paper proceeds as follows. Section 2 presents the research design and empirical setting of the DGOBCAN-AI case. Section 3 introduces the analytical background that guides interpretation of the findings. Section 4 reports the empirical results of the case study. Section 5 develops the theoretical implications emerging from the analysis, conceptualizing Responsible Public Value. The final section discusses the broader implications, limitations, and avenues for future research.

2 Research design and empirical setting

2.1 Case study-based methodology: the DGOBCAN-AI case study

The study employs a qualitative and descriptive approach, using the case study method widely applied in public administration and information systems research (Yin 2018). This approach is particularly suitable for exploring complex phenomena in their real-world context, especially when the boundaries between technology, organizational processes, and social outcomes are blurred.

The analyzed case, DGOBCAN-AI, is an example of an AI-based legislative information processing system that has been fully designed and deployed within a regional administration. The problem addressed is as follows: at the Canary Islands Government Delegation in Madrid, a civil servant reads the BOE every day and checks which articles may be of interest to the Canary Islands. The BOE is the Official State Gazette of Spain, the official daily newspaper of the State and the medium for the publication of laws, regulations, provisions, administrative acts, and other mandatory announcements.

The civil servant (legal analyst) identifies topics of interest to the Canary Islands because she is familiar with Canarian politics, economics, and society. Topics include cut flowers, fishing, migration, tourism, sports, and so on. Of the roughly 30 daily announcements in the BOE, an average of about 3 are of interest to the Canary Islands. The phrase "Canary Islands" need not appear in the announcements; in fact, announcements issued from the Canary Islands are not of interest, since the purpose of this monitoring is to inform the Canary Islands about what is happening in the rest of Spain in relation to the islands. This daily search consumes a substantial amount of the analyst's time. The objective of the DGOBCAN-AI project was therefore to find an AI-based solution that could perform this search efficiently and reliably. The implementation process involved constant interaction with legal analysts at the Delegation, whose tacit knowledge of Canarian regulatory needs was essential to define the ‘relevance’ criteria that the model would later attempt to replicate.

This research combines three main sources of information:

  • Documentary review of policies and regulatory frameworks on AI in the public sector (OECD 2019; UNESCO 2021; European Union 2024; United Nations 2024).

  • Technical analysis of the AI-based legislative monitoring system developed for the DGOBCAN-AI case.

  • Institutional and analytical observation focused on the project’s evolution, its governance structure, and alignment with principles of responsible AI.

2.2 Institutional context

The Canary Islands Government Delegation in Madrid has implemented an advanced strategy for digitalization and AI ethics, aligned with the European Data Strategy and the AI Act (European Union 2024). Within this framework, this public administration developed an automated legislative monitoring system designed to enhance institutional capacity to identify, analyze, and communicate legislative provisions with regional impact.

The system has two main objectives:

  • Reduce administrative workload by automating the analysis of legal texts, traditionally a manual and time-consuming task.

  • Strengthen institutional intelligence, enabling proactive legislative monitoring and more agile responses to regulatory changes.

In line with European AI regulation, the system is designed as a limited-risk tool, intended to support human decision-making rather than replace expert judgment. Human oversight is maintained throughout the process via validation, review, and institutional supervision, ensuring transparency, traceability, and ethical compliance.

2.3 Technical design of the DGOBCAN-AI system

The implementation of the MLOps framework, specifically through MLflow for experiment tracking and Airflow for Directed Acyclic Graph (DAG) orchestration, provided a systematic record of model iterations. This setup allowed the technical team to reconstruct the lineage of any specific prediction, thereby operationalizing traceability and version control through documented logs rather than manual oversight.
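To make this lineage record concrete, the sketch below shows how a single training iteration could be logged with MLflow. The tracking URI, experiment name, hyperparameters, and dataset-hash convention are illustrative assumptions rather than the project's actual configuration; only the tool (MLflow) and the F1 range reported later in Sect. 2.4 come from the case description.

```python
# Illustrative sketch of experiment tracking with MLflow.
# Tracking URI, experiment name, parameters, and file paths are assumptions.
import hashlib
import mlflow

mlflow.set_tracking_uri("http://mlflow.internal:5000")   # hypothetical tracking server
mlflow.set_experiment("boe-relevance-classifier")        # hypothetical experiment name

def dataset_fingerprint(path: str) -> str:
    """Hash the training corpus so every run records exactly which data it saw."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

with mlflow.start_run(run_name="relevance-classifier-iteration"):
    # Example parameters and metrics; real values would come from the pipeline.
    mlflow.log_param("base_model", "roberta-base-bne")
    mlflow.log_param("max_seq_length", 512)
    mlflow.log_param("dataset_sha256", dataset_fingerprint("data/boe_corpus.jsonl"))
    mlflow.log_metric("f1_positive_class", 0.30)          # within the F1 range reported in Sect. 2.4
```

Because every run carries the hash of the corpus it was trained on, any later prediction can be traced back to a specific model iteration and dataset state, which is the sense in which traceability is operationalized through logs rather than manual oversight.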

The functional workflow comprises four main stages:

Data ingestion and preprocessing: the system automatically accesses the BOE API each morning, extracts XML documents, and transforms them into normalized JSON structures, applying linguistic preprocessing with spaCy (spaCy 2025).

Natural language processing (NLP): the system performs three key tasks: (a) legislative relevance classification; (b) automatic text summarization; and (c) semantic embedding generation for similarity-based searches (stages 1 and 2 are illustrated in the code sketch after this list).

Storage and notification: results are stored in PostgreSQL databases (PostgreSQL 2025), and automatic PDF reports are generated and distributed through the Microsoft Graph API (Microsoft Graph API 2025).

Orchestration and control: the system employs Airflow (Apache Software Foundation 2024) for task scheduling, MLflow (Databricks 2024) for model tracking, MinIO (2025) for artifact storage, and Label Studio (Heartex 2024) for human review.
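A minimal sketch of the first two stages (ingestion and NLP) is shown below. The BOE summary endpoint, XML element names, and model identifiers are plausible assumptions for illustration only; in particular, the base roberta-base-bne checkpoint stands in for the project's fine-tuned relevance classifier, and the summarization, storage, and notification stages are omitted.

```python
# Illustrative sketch of stages 1–2: BOE ingestion, spaCy preprocessing, classification, embeddings.
# Endpoint URL, XML field names, and model identifiers are assumptions, not the project's code.
import datetime
import xml.etree.ElementTree as ET

import requests
import spacy
from sentence_transformers import SentenceTransformer
from transformers import pipeline

BOE_SUMMARY_URL = "https://www.boe.es/datosabiertos/api/boe/sumario/{date}"  # assumed endpoint

nlp = spacy.load("es_core_news_sm")   # Spanish spaCy model (must be downloaded beforehand)
# The base checkpoint is a placeholder; production would load the fine-tuned relevance classifier.
classifier = pipeline("text-classification", model="PlanTL-GOB-ES/roberta-base-bne")
embedder = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

def fetch_daily_items(day: datetime.date) -> list[dict]:
    """Download the daily BOE summary (XML) and normalize each entry to a dict."""
    resp = requests.get(BOE_SUMMARY_URL.format(date=day.strftime("%Y%m%d")), timeout=30)
    resp.raise_for_status()
    root = ET.fromstring(resp.content)
    return [
        {"title": item.findtext("titulo", default=""), "url": item.findtext("urlPdf", default="")}
        for item in root.iter("item")   # element and field names assumed
    ]

def preprocess(text: str) -> str:
    """Light linguistic normalization with spaCy: keep lemmas of content words."""
    doc = nlp(text)
    return " ".join(tok.lemma_.lower() for tok in doc if tok.is_alpha and not tok.is_stop)

def analyze(item: dict) -> dict:
    clean = preprocess(item["title"])
    relevance = classifier(clean)[0]                 # {'label': ..., 'score': ...}
    vector = embedder.encode(clean).tolist()         # embedding for similarity-based search
    return {**item, "relevance": relevance, "embedding": vector}

if __name__ == "__main__":
    for result in map(analyze, fetch_daily_items(datetime.date.today())):
        print(result["title"], result["relevance"])
```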

The technical workflow is managed by technical coordinators who oversee the MLOps pipeline, while the legal analysts (the administrative civil servants) interact directly with Label Studio. Their role is to provide the ‘ground truth’ by validating whether the model’s alerts truly represent a legislative priority for the Canary Islands.

The architecture ensures full traceability and reproducibility throughout the legislative document processing lifecycle, as well as human oversight. This covers everything from ingestion and NLP classification to summarization, reporting, and institutional validation, as shown in Fig. 1.

Fig. 1

Technical design and data flow of the DGOBCAN-AI system
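The orchestration layer shown in Fig. 1 can be sketched as a single Airflow DAG. The schedule, task names, and stage boundaries below are illustrative assumptions (Airflow 2.x TaskFlow API), and the real stage logic is elided.

```python
# Illustrative sketch of the daily orchestration as an Airflow DAG (TaskFlow API, Airflow 2.x).
# Schedule, task names, and stage boundaries are assumptions; stage bodies are elided.
from datetime import datetime
from airflow.decorators import dag, task

@dag(schedule="0 7 * * *", start_date=datetime(2025, 1, 1), catchup=False, tags=["dgobcan-ai"])
def boe_monitoring():

    @task
    def ingest() -> list[dict]:
        """Stage 1: fetch the BOE summary, parse XML, normalize to JSON structures."""
        ...

    @task
    def classify_and_summarize(items: list[dict]) -> list[dict]:
        """Stage 2: relevance classification, summarization, embedding generation."""
        ...

    @task
    def store_and_notify(scored: list[dict]) -> None:
        """Stage 3: persist to PostgreSQL, generate the PDF report, send via Microsoft Graph."""
        ...

    store_and_notify(classify_and_summarize(ingest()))

boe_monitoring()
```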

2.4 Identified challenges and methodological evolution

The initial development of the legislative classifier, based on a linear extract–transform–load (ETL) design, demonstrated suboptimal performance (F1 ≈ 0.27–0.30), primarily due to the scarcity and severe imbalance of the dataset: only 0.27% of documents were pertinent to the regional administration. Specifically, the empirical corpus analysed comprised approximately 196,700 legislative documents retrieved from the Spanish Official State Gazette (BOE). Following data cleaning and validation, a final dataset of 63,445 items was compiled, of which just 170 were deemed relevant to the regional administration (approximately 0.27%). This extreme imbalance (fewer than one positive instance per 370 documents) posed a substantial methodological challenge.
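To put this imbalance into perspective, the short calculation below reproduces the reported figures (63,445 items, 170 positives) and derives the kind of balanced class weights commonly used to counteract such skew; the weighting scheme is an illustrative mitigation, not a documented design choice of the project.

```python
# Reproducing the reported imbalance and deriving illustrative balanced class weights.
total, positives = 63_445, 170
negatives = total - positives

print(f"positive share: {positives / total:.4%}")           # ≈ 0.27%
print(f"documents per positive: {total / positives:.0f}")   # ≈ 373

# Common mitigation (illustrative): weight each class inversely to its frequency,
# so the training loss pays equal attention to relevant and non-relevant documents.
class_weights = {0: total / (2 * negatives), 1: total / (2 * positives)}
print(class_weights)   # {0: ≈0.50, 1: ≈186.6}
```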

To address this structural limitation, the ETL design was replaced by a comprehensive MLOps ecosystem. This shift marked a critical methodological transition toward a data-centric strategy emphasizing data quality and institutional governance. The data-centric strategy adopted comprised three key actions:

  • Human-in-the-loop feedback cycle, using Label Studio for manual annotation and validation of new samples (a minimal sketch of this feedback cycle follows the list).

  • Formal adoption of the MLOps framework, ensuring experiment traceability and control.

  • Prioritization of data quality over model experimentation, postponing hyperparameter tuning until a more representative corpus is built.
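The sketch below outlines the human-in-the-loop feedback cycle using Label Studio's HTTP export endpoint. The instance URL, project identifier, token, and the annotation field layout all depend on the labeling configuration and are illustrative assumptions here.

```python
# Illustrative sketch of the Label Studio feedback cycle: export analyst-validated
# annotations as new training examples. URL, project id, token, and field names are assumptions.
import json
import requests

LS_URL = "http://labelstudio.internal:8080"    # hypothetical instance
PROJECT_ID = 3                                  # hypothetical project
HEADERS = {"Authorization": "Token <api-token>"}

def pull_validated_labels() -> list[dict]:
    """Fetch completed annotations so they can be merged into the training corpus."""
    resp = requests.get(
        f"{LS_URL}/api/projects/{PROJECT_ID}/export",
        params={"exportType": "JSON"},
        headers=HEADERS,
        timeout=60,
    )
    resp.raise_for_status()
    examples = []
    for item in resp.json():
        for ann in item.get("annotations", []):
            # The result layout depends on the labeling config; a single-choice label is assumed.
            label = ann["result"][0]["value"]["choices"][0]
            examples.append({"text": item["data"]["text"], "label": label})
    return examples

if __name__ == "__main__":
    with open("data/validated_batch.jsonl", "w", encoding="utf-8") as f:
        for ex in pull_validated_labels():
            f.write(json.dumps(ex, ensure_ascii=False) + "\n")
    # The exported file would then be merged into the corpus and a new tracked training run launched.
```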

This methodological transformation from a model-centric to a data-centric approach (Hamid 2022) significantly improved technical reliability, enhanced transparency, and fostered a culture of accountability across the model lifecycle, consistent with European principles of trustworthy AI.

2.5 Computational infrastructure and operational sustainability

DGOBCAN-AI system deployment relies on a hybrid local–cloud infrastructure, designed to balance institutional control, efficiency, and cost. Four alternatives were assessed:

  • Local workstation with RTX 4070 Ti GPU (16 GB).

  • Professional rack server with A4000 GPU (16 GB).

  • Cloud GPU execution (RunPod) on a pay-per-use basis (~€7/month).

  • Institutional cloud environment with secure, internal resources.

The adopted model combines local data management with on-demand GPU inference, ensuring technical and financial sustainability while maintaining public data sovereignty.

2.6 Governance practices observed in the DGOBCAN-AI case

The DGOBCAN-AI system illustrates a set of governance practices through which artificial intelligence was progressively integrated into organizational routines. Rather than representing a predefined model of responsible AI, the case reveals how ethical oversight, technical reproducibility, and data governance emerged as interconnected dimensions during implementation. Table 1 summarizes the main empirical dimensions identified during the project, together with their operational manifestations and associated challenges. These empirically observed dimensions provide the analytical basis from which the concept of Responsible Public Value will later be developed.

Table 1 Governance dimensions empirically observed in the DGOBCAN-AI system


The project’s evolution shows that AI in public administration should not be viewed merely as a technological innovation, but as a process of organizational learning centered on data quality, transparency, and public accountability.

2.7 The socio-technical assemblage: roles and discretion

The DGOBCAN-AI system is operated by three primary groups of actors. First, Legal Analysts at the Delegation serve as the ultimate “human-in-the-loop”. They do not merely receive notifications; they exercise discretion by validating the 0.27% of positive cases filtered by the RoBERTa model. Second, Technical Coordinators manage the MLOps pipeline, ensuring that the feedback from legal analysts is reintegrated into the system via Label Studio to refine future iterations. Finally, Public Managers use the resulting summaries to inform strategic decisions. This hierarchy ensures that the margins of discretion are preserved: the AI reduces the noise of thousands of daily regulatory documents, but the political and legal interpretation of relevance remains a human expertise, preventing algorithmic determinism.

3 Analytical background

The purpose of this section is not to establish a fully specified theoretical framework, but to introduce the analytical perspectives that inform the interpretation of the DGOBCAN-AI case. Rather than preceding the empirical analysis with exhaustive theorization, the following discussion outlines key strands of literature that help contextualize how artificial intelligence is reshaping public administration, how value creation is understood in the public sector, and why responsible governance has become a central concern in AI deployment.

3.1 Artificial intelligence and public sector transformation

Public administrations across Europe are increasingly adopting artificial intelligence as part of broader digital transformation strategies aimed at improving organizational capacity, responsiveness, and data-driven decision-making. AI systems are used in diverse domains, including resource planning, citizen interaction, policy analysis, and regulatory monitoring (Zuiderwijk et al. 2021). Beyond technical automation, these initiatives often entail deeper organizational change, requiring administrations to learn from data and integrate computational tools into established institutional routines (Mihai et al. 2023; Aboud Barsekh-Onji et al. 2025).

However, AI adoption in the public sector introduces tensions that are less pronounced in private organizations. Public institutions operate under strong expectations of transparency, accountability, legality, and protection of fundamental rights (Mergel et al. 2019). As a result, AI implementation cannot be evaluated solely in terms of efficiency gains; it must also preserve institutional legitimacy and democratic oversight (Hillo et al. 2025). These characteristics make public administration a particularly relevant setting for examining how governance practices evolve alongside technological innovation.

3.2 Public value in the digital era

The concept of public value (Moore 1995) provides an important lens for understanding technological transformation in government. Traditionally, public value emphasizes societal welfare, equity, and institutional legitimacy rather than economic performance alone. In the context of digital transformation, this perspective has expanded to consider how information technologies contribute to administrative effectiveness, service quality, and citizen trust (Westerman et al. 2023; Brynjolfsson and McAfee 2024; Twizeyimana and Andersson 2019; Janssen et al. 2020).

Importantly, public value is not generated automatically by technology adoption. Digital tools can enhance organizational capability, but their outcomes depend on how they are embedded within institutional structures, professional practices, and governance arrangements. Recent scholarship therefore highlights that value creation in digitally enabled public organizations emerges through the interaction between technological infrastructures and organizational processes rather than through technological innovation alone.

3.3 Principles of responsible AI governance

Alongside digital transformation, a growing body of international policy and academic work has emphasized the need for responsible and trustworthy AI. Frameworks developed by organizations such as the OECD (2019, updated 2024), UNESCO (2021), and the European Union (2024) converge around several core principles: human oversight, transparency, accountability, robustness, and respect for fundamental rights.

While these frameworks provide normative guidance, they offer limited insight into how responsibility is operationalized within everyday organizational practice. Public administrations must translate abstract ethical principles into concrete routines involving data management, model supervision, documentation, and institutional accountability. The empirical analysis that follows examines how such governance practices take shape in practice and how responsibility becomes enacted through socio-technical arrangements rather than through formal principles alone.

3.4 Algorithmic governance as an analytical lens

The growing integration of artificial intelligence into public administration has stimulated research on algorithmic governance, understood broadly as the set of organizational, technical, and institutional arrangements through which automated systems are designed, supervised, and made accountable within public decision-making processes (Danaher et al. 2021). Rather than focusing exclusively on algorithmic performance, this perspective emphasizes how responsibility is distributed across data practices, technical infrastructures, professional roles, and organizational procedures (Floridi and Cowls 2021).

In practice, algorithmic governance involves multiple interconnected layers, including data management, model monitoring, documentation, and human oversight. These elements do not operate as abstract principles but as operational routines that shape how automated systems interact with institutional norms and administrative discretion. In this study, algorithmic governance is therefore treated as an interpretive lens that helps examine how responsibility is enacted through everyday practices observed in the DGOBCAN-AI case.

4 Empirical findings

This analysis examines how artificial intelligence governance unfolded in practice during the development and deployment of the DGOBCAN-AI system. Rather than testing a predefined theoretical model, the findings are derived from patterns observed throughout the implementation process, including organizational adaptation, data management practices, and mechanisms of human oversight. The empirical material revealed a set of interrelated governance dynamics through which efficiency improvements, technical control, and ethical responsibility became progressively integrated into everyday administrative routines.

For analytical clarity, the findings are organized around three empirically identified dimensions that emerged from the case: (1) operational efficiency and institutional innovation, (2) technical and data governance practices, and (3) ethical and organizational responsibility. These dimensions do not represent prior theoretical categories, but serve as analytical syntheses of the empirical observations presented below.

4.1 Operational efficiency and institutional innovation

The transition from manual monitoring to the DGOBCAN-AI automated pipeline reduced the time required for initial document screening. As recorded in the project’s operational phase, this allowed the delegation’s staff to focus on high-level analysis of the 0.27% of relevant cases identified, as opposed to the exhaustive manual review of the entire regulatory flow.

By automating the end-to-end workflow from document retrieval and preprocessing to relevance classification, summarization, and report dissemination, the system enabled the daily processing of approximately 250–300 legislative documents. This automation reduced the time and cognitive burden associated with routine screening tasks, allowing human experts to concentrate on higher-value activities such as validation, interpretation, and strategic assessment. The reconfiguration of the workflow allows public managers and legal analysts to exercise a higher degree of professional discretion. Instead of searching for documents, they now focus on the strategic assessment of the 0.27% of relevant alerts, deciding which legislative changes require immediate political or administrative escalation. Importantly, efficiency gains were not achieved through the replacement of expert judgment, but through its reconfiguration within a hybrid human–AI workflow. This hybrid human–AI model is based on the idea that AI can enhance public talent (Khan 2024), complementing rather than replacing human capability.

Beyond improvements in speed and workload reduction, the system contributed to the development of new institutional capabilities. Automatic summarization and relevance alerts enhanced organizational awareness of regulatory developments, supporting more timely and informed responses. At the same time, the continuous flow of annotated data enabled iterative model refinement, reinforcing a learning-oriented approach to administrative work. In this sense, efficiency was not merely technical but cognitive and organizational, strengthening the administration’s capacity to process complex information at scale.

From an innovation perspective, the modular and scalable architecture of the system enabled sustainable performance without requiring extensive computational resources. The hybrid local–cloud deployment model ensured operational flexibility and scalability while maintaining data sovereignty and cost control. This design demonstrates that advanced AI-enabled efficiency can be achieved within the constraints typical of public sector organizations, challenging assumptions that institutional innovation necessarily entails high financial or infrastructural investment.

Taken together, these findings indicate that when efficiency gains derived from AI are managed strategically, they provide the necessary space for human experts to enhance institutional intelligence and responsiveness. Such gains constitute a necessary but not sufficient condition for Responsible Public Value creation.

4.2 Technical and data governance through MLOps

The DGOBCAN-AI case illustrates how technical and data governance constitute a central pillar of responsible AI deployment in public administration. Rather than treating governance as an external oversight mechanism, the system embeds governance directly into its technical infrastructure through a mature MLOps and data-centric AI framework. This approach enables continuous control over data quality, model behavior, and decision traceability throughout the system’s life cycle.

A key governance challenge identified during the project was the extreme scarcity and imbalance of relevant training data. Initial model performance was constrained not by algorithmic limitations but by the lack of representative positive cases. In response, the project shifted from a model-centric development strategy toward a data-centric approach, prioritizing dataset curation, documentation, and expert validation over further model experimentation. This methodological transition marked a fundamental change in how reliability and accountability were conceptualized within the organization.

The adoption of MLOps tools such as Airflow, MLflow, and Label Studio institutionalized reproducibility and traceability as routine practices. Workflow orchestration ensured consistent execution of data pipelines, while experiment tracking and model versioning enabled systematic comparison, auditing, and rollback when necessary. Human-in-the-loop annotation processes transformed civil servants into active participants in data governance, reinforcing institutional control over how relevance judgments were learned and reproduced by the system.
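As one way such versioning and rollback could look in practice, the sketch below uses the MLflow Model Registry with the stage-based workflow of pre-2.9 releases; the model name, version numbers, and metric key are illustrative assumptions.

```python
# Illustrative sketch of model versioning, comparison, and rollback with the MLflow Model Registry.
# Model name, version numbers, and the metric key are assumptions (pre-2.9 stage-based workflow).
from mlflow.tracking import MlflowClient

client = MlflowClient(tracking_uri="http://mlflow.internal:5000")   # hypothetical server
MODEL = "boe-relevance-classifier"

# Systematic comparison: list registered versions with the metric logged for each training run.
for version in client.search_model_versions(f"name='{MODEL}'"):
    run = client.get_run(version.run_id)
    print(version.version, version.current_stage, run.data.metrics.get("f1_positive_class"))

# Rollback when necessary: archive the current production version and re-promote the previous one.
client.transition_model_version_stage(MODEL, version="5", stage="Archived")
client.transition_model_version_stage(MODEL, version="4", stage="Production")
```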

Importantly, this governance architecture compensated for the inherent opacity of transformer-based language models (roberta-base-bne 2019; all-MiniLM-L6-v2 2024). While the internal decision logic of such models remains difficult to interpret, comprehensive logging of inputs, outputs, and evaluation metrics enabled ex post accountability and procedural transparency. This form of infrastructural transparency aligns with European regulatory requirements (European Union 2024) and shifts the focus of explainability from individual predictions to the integrity of the overall decision-making process.
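A minimal sketch of what such an ex post audit record could look like is given below; the field names, identifiers, and JSON-lines storage format are illustrative assumptions rather than the system's actual schema.

```python
# Illustrative sketch of a per-prediction audit record supporting ex post accountability.
# Field names, identifiers, and the JSON-lines storage format are assumptions.
import hashlib
import json
from datetime import datetime, timezone

def audit_record(doc_id: str, text: str, label: str, score: float,
                 model_version: str, reviewer_decision: str | None = None) -> dict:
    """One traceable row per prediction: what went in, what came out, which model, who reviewed it."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "doc_id": doc_id,
        "input_sha256": hashlib.sha256(text.encode("utf-8")).hexdigest(),
        "model_version": model_version,
        "predicted_label": label,
        "score": round(score, 4),
        "reviewer_decision": reviewer_decision,   # filled in later by the legal analyst
    }

# Hypothetical document identifier and score, appended to a JSON-lines audit log.
with open("logs/predictions.jsonl", "a", encoding="utf-8") as log:
    record = audit_record("BOE-A-2026-00001", "texto del anuncio…", "relevant", 0.83, "registry:v5")
    log.write(json.dumps(record, ensure_ascii=False) + "\n")
```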

Through this integration of data stewardship, reproducible pipelines, and continuous human validation, the DGOBCAN-AI system demonstrates how Responsible Public Value is enacted through technical and data governance. Rather than constraining efficiency, these governance mechanisms enable sustained performance, organizational learning, and institutional trust by ensuring that AI behavior remains observable, auditable, and aligned with public sector values.

4.3 Ethical and organizational responsibility: human oversight in practice

Ethical responsibility in the DGOBCAN-AI system is enacted through organizational practices rather than abstract normative commitments. In line with European regulatory principles, the system was explicitly designed as a decision-support tool, ensuring that automated outputs inform but do not replace human judgment. Human oversight is maintained throughout the entire workflow, from data annotation and validation to the final assessment of legislative relevance.

Human-in-the-loop mechanisms play a central role in preserving institutional accountability (Gómez-Carmona et al. 2024). Civil servants regularly review system outputs, correct misclassifications, and contribute expert knowledge to the continuous refinement of the training dataset. This iterative process clarifies the margins of discretion mentioned by the actors involved: the legal analysts (the administrative civil servants) do not just label data; they exercise professional judgment to resolve linguistic ambiguities that the AI might miss, such as distinguishing between a general national law and one with a specific, albeit indirect, impact on the regional territory. This process not only improves model performance over time but also ensures that relevance criteria remain aligned with evolving institutional priorities and contextual understanding. Ethical responsibility thus emerges as an ongoing practice embedded in daily administrative routines, rather than as a one-time compliance exercise.

From an organizational perspective, the introduction of AI reshaped professional roles and responsibilities. Rather than displacing expertise, automation redistributed cognitive effort, enabling staff to focus on interpretation, judgment, and strategic analysis. This reconfiguration reinforced professional autonomy and reduced the risk of overreliance on automated recommendations, a concern frequently associated with algorithmic decision-support systems.

The system also revealed key ethical risks inherent in public sector AI deployment, including data bias arising from extreme class imbalance, model opacity associated with transformer-based architectures, and partial dependence on external cloud providers for computational resources. These risks were not eliminated but actively managed through governance mechanisms such as continuous validation, expert oversight, and infrastructure choices that preserved data sovereignty. Ethical responsibility, in this sense, is understood as risk management through institutional capacity rather than the pursuit of error-free automation.

Overall, the DGOBCAN-AI case demonstrates that ethical responsibility does not operate as a constraint on automation, but as an enabling condition that legitimizes efficiency gains and sustains institutional trust. By embedding human oversight into technical workflows and organizational practices, the system aligns AI-enabled innovation with democratic accountability, thereby fulfilling a core requirement of Responsible Public Value creation.

4.4 Empirical boundary conditions of the DGOBCAN-AI system

While the DGOBCAN-AI project demonstrates the practical feasibility of integrating artificial intelligence into public administrative workflows, the implementation process also revealed a set of boundary conditions that shaped the system’s sustainability, performance, and organizational impact. These conditions emerged directly from operational experience and highlight the constraints under which AI-enabled innovation unfolds in real institutional environments.

From an operational perspective, the system proved economically sustainable. The hybrid local–cloud infrastructure enabled scalable processing capabilities while maintaining extremely low operational costs (approximately €6.6/month). The modular architecture allowed the administration to automate legislative monitoring without requiring extensive computational investment or large technical teams. This finding challenges common assumptions that advanced AI adoption necessarily depends on substantial financial resources or large-scale infrastructure.

At the same time, the case revealed structural limitations associated with data availability and organizational capacity. The extreme class imbalance of the dataset—where only 0.27% of legislative documents were relevant—remained a persistent constraint on model performance. Despite improvements achieved through iterative annotation and data-centric development, reliable operation continued to depend on sustained human validation. Legal analysts played a central role in correcting classifications, refining relevance criteria, and maintaining institutional alignment between automated outputs and administrative priorities. Consequently, automation reduced workload but did not eliminate the need for expert oversight.

Technical characteristics of the system also introduced important constraints. The use of transformer-based language models improved classification and summarization capabilities but preserved a degree of model opacity. Although procedural traceability was ensured through MLOps tools such as MLflow and Airflow, understanding the internal reasoning behind individual predictions remained challenging. Organizational trust therefore relied less on model interpretability and more on documentation, monitoring routines, and continuous supervision practices.

Infrastructure choices further shaped the system’s operational boundaries. The hybrid deployment model balanced data sovereignty with computational flexibility, allowing sensitive data to remain under institutional control while relying on external cloud providers for GPU-intensive tasks. While this configuration enabled cost efficiency and scalability, it introduced a degree of technological dependency that required ongoing institutional management.

Taken together, these observations indicate that AI implementation in public administration is conditioned by a combination of organizational expertise, data characteristics, infrastructural decisions, and governance capacity. Rather than representing obstacles external to technological innovation, these boundary conditions formed an integral part of how the DGOBCAN-AI system stabilized and operated over time. The empirical analysis therefore suggests that sustainable AI deployment emerges not solely from algorithmic performance, but from the continuous alignment between technological capabilities and institutional practices.

5 Theoretical development: Responsible Public Value

5.1 From empirical governance practices to conceptual interpretation

The empirical findings presented in Sect. 4 show that the implementation of artificial intelligence in the DGOBCAN-AI project cannot be explained solely in terms of technological innovation or administrative efficiency. Instead, the case revealed a recurring governance pattern in which operational performance, data management practices, and ethical oversight evolved simultaneously throughout the implementation process. AI adoption emerged as a continuous sociotechnical adjustment rather than a discrete technological intervention.

Across the observed practices, value creation depended not only on automation outcomes but on the institutional capacity to organize responsibility around the system. Human supervision, procedural documentation, iterative model refinement, and organizational coordination collectively enabled the system to function as a legitimate administrative tool. These empirical dynamics suggest that public value in AI-driven administrations is generated through governance arrangements that integrate efficiency, accountability, and institutional trust.

Building on these observations, this section develops the concept of Responsible Public Value as a theoretical interpretation grounded in the case study. The concept does not precede the analysis but emerges from the empirical configuration through which the DGOBCAN-AI system stabilized and operated over time.

5.2 Conceptualizing Responsible Public Value

The governance dimensions identified empirically (Table 1) allow the conceptualization of Responsible Public Value as an analytical interpretation emerging from the DGOBCAN-AI case. Rather than representing predefined theoretical categories, these dimensions synthesize the governance dynamics observed throughout the implementation process and provide the basis for understanding how public value is produced through responsible AI governance practices.

The DGOBCAN-AI experience indicates that public value arises when three interdependent governance dimensions become aligned:

  • Operational efficiency, reflected in the automation of legislative monitoring and the reduction of administrative workloads.

  • Technical and data governance, expressed through structured data pipelines, model monitoring, and institutional control over system performance.

  • Organizational and ethical responsibility, materialized through human oversight, procedural transparency, and continuous validation practices.

Taken together, the empirical dimensions identified in the case suggest that public value creation in AI-enabled administrations depends on the interaction between performance-oriented innovation and sustained responsibility mechanisms. For analytical clarity, this relationship can be expressed heuristically as:

Responsible Public Value = (efficiency + innovation) × ethical responsibility

This formulation does not represent a predictive model but a conceptual heuristic derived from the empirical analysis. It highlights that efficiency gains alone do not generate public value unless multiplied by institutionalized forms of ethical responsibility capable of sustaining legitimacy, accountability, and organizational trust over time.
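For readers who prefer compact notation, the heuristic can be restated as below; the bounded range on the responsibility term is an interpretive assumption added here only to make the multiplicative logic explicit, not part of the original formulation.

```latex
% Compact restatement of the heuristic; the [0,1] bound on R is an added assumption.
\[
  \mathrm{RPV} = (E + I) \times R, \qquad R \in [0,1],
\]
% so that RPV collapses to zero whenever ethical responsibility is absent (R = 0),
% however large the efficiency (E) and innovation (I) terms may be.
```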

5.3 Responsible Public Value and existing debates

The notion of public value has long been central to public management theory, emphasizing legitimacy, social welfare, and democratic accountability over purely economic performance (Moore 1995). In the context of digital transformation, this concept has been extended to digital public value, highlighting how information technologies can enhance administrative capacity, service quality, and citizen trust (Janssen et al. 2020). However, existing approaches rarely explicate how such value is concretely produced through the technical and organizational infrastructures underpinning contemporary AI systems.

In parallel, the literature on responsible and trustworthy AI has developed a rich set of ethical principles and governance requirements, such as transparency, accountability, fairness, and human oversight, yet it remains largely normative in nature (Floridi and Cowls 2021; Danaher et al. 2021). While these frameworks define what responsible AI should achieve, they provide limited guidance on how such principles are operationalized within real organizational contexts.

To bridge this gap, this study introduces the concept of Responsible Public Value, defined as the value generated when AI-enabled efficiency and innovation are institutionally embedded within robust ethical, technical, and data governance arrangements. Responsible Public Value emerges not from automation alone, but from the interaction between operational performance, organizational learning, and mechanisms of accountability that preserve human control over automated processes.

5.4 Implications for algorithmic governance

Interpreting the DGOBCAN-AI case through the lens of Responsible Public Value offers several implications for ongoing debates on algorithmic governance. Rather than treating governance primarily as a regulatory or compliance-oriented activity, the empirical findings suggest that responsibility becomes operational through everyday organizational practices that stabilize AI systems over time.

The case indicates that effective algorithmic governance emerges from the institutionalization of supervision routines, data stewardship practices, and iterative learning processes embedded within administrative workflows. In this sense, governance is not external to technological deployment but constitutes a continuous sociotechnical process through which public organizations maintain control, legitimacy, and accountability in AI-enabled decision environments.

5.5 Theoretical implications and boundary conditions

The theoretical contribution emerging from the case lies in repositioning public value creation within AI-enabled administrations as a governance achievement rather than a technological outcome. The findings indicate that successful AI adoption depends on organizational capacities that extend beyond model performance, including data stewardship, institutional coordination, and sustained human expertise.

At the same time, the empirical boundary conditions identified in Sect. 4.4 highlight important limits to generalization. Responsible Public Value does not imply universal scalability or full automation. Instead, the concept assumes continued human involvement, stable institutional commitment, and organizational investment in governance infrastructures. Contextual factors such as data availability, administrative expertise, and infrastructural dependency shape how the concept may materialize in different public sector settings.

Consequently, Responsible Public Value should be understood as an analytically grounded conceptual lens for interpreting AI governance rather than a prescriptive model applicable independently of institutional context.

5.6 Contribution to research on AI in public administration

By deriving a conceptual framework from an empirically grounded case, this study contributes to research on digital transformation and artificial intelligence in public administration in three ways. First, it demonstrates how AI systems generate public value through governance practices rather than through technological capability alone. Second, it connects discussions of responsible AI with public value theory by showing how ethical responsibility becomes embedded in administrative operations. Third, it advances a practice-oriented perspective on AI governance, emphasizing organizational learning and institutional adaptation as central mechanisms of sustainable innovation.

Through the notion of Responsible Public Value, the article proposes a theoretically informed interpretation of how public administrations can integrate artificial intelligence while preserving legitimacy, accountability, and long-term institutional sustainability.

6 Discussion and conclusions

The DGOBCAN-AI case illustrates the technical and organizational requirements for integrating AI in a public setting. The evidence suggests that the system’s utility was contingent not on the algorithm itself, but on the creation of a ‘human-in-the-loop’ infrastructure that validated the outputs against institutional needs. Rather than replacing expert judgment, AI reconfigured administrative work by reallocating human effort from repetitive screening tasks toward higher-value activities such as validation, interpretation, and decision-making.

6.1 Theoretical contribution: Responsible Public Value

This study extends public value theory and the literature on responsible AI by introducing and operationalizing the concept of Responsible Public Value. We conceptualize it as the outcome of the interaction between three interdependent dimensions:

  • Operational efficiency and innovation: achieved through the automation of complex workflows like legislative monitoring.

  • Technical and data governance: materialized through MLOps tools (Airflow, MLflow) that ensure reproducibility and traceability.

  • Ethical and organizational responsibility: enacted through continuous human oversight and professional discretion.

Unlike normative frameworks that treat ethics as an external constraint, this study demonstrates that ethical responsibility is a constitutive condition for value creation. As expressed in the heuristic formula: Responsible Public Value = (efficiency + innovation) × ethical responsibility. This implies that efficiency gains only translate into sustainable public value when multiplied by institutionalized accountability and transparency.

6.2 Practical implications and limitations

From a managerial perspective, the case proves that responsible AI is feasible in resource-constrained environments. The hybrid on-premises/cloud architecture demonstrated cost efficiency (≈ €6.6/month) while maintaining data sovereignty.

Beyond the general implications, the DGOBCAN-AI case reveals specific structural boundaries. As summarized in Table 1, each governance dimension (ethical, technical, and data-centric) faces particular challenges that must be addressed to sustain the creation of Responsible Public Value. For instance, while the architecture provides procedural traceability, it still faces the ‘black box’ challenge of model interpretability, and the severe data scarcity requires a human-in-the-loop intensity that may limit scalability in other contexts. These limitations are detailed further below:

  • Methodological generalizability: As a single-case analysis, the findings are analytically but not statistically generalizable. The observed outcomes are deeply shaped by the specific Spanish regulatory environment, the Canary Islands’ organizational culture, and the specific technical capabilities of the administration.

  • Data scarcity and human dependency: The extreme class imbalance (0.27% relevant documents) remains a structural constraint. While the MLOps framework facilitates continuous improvement, the system’s reliability is still heavily dependent on resource-intensive expert validation. This dependency implies that the model’s performance is a contingent achievement of human-in-the-loop oversight rather than an autonomous capability.

  • Technical and sovereignty risks: The hybrid infrastructure, while cost-effective (€6.6/month), introduces a partial reliance on external cloud providers for GPU inference. This creates a technological dependency that requires active management to preserve long-term data sovereignty and institutional autonomy.

  • Interpretability vs. traceability: While the MLOps architecture ensures full procedural traceability (auditable logs via MLflow/Airflow), the internal decision logic of the transformer-based models remains complex to interpret. Furthermore, standardized metrics for assessing broader societal impacts, such as fairness perceptions or long-term trust, remain underdeveloped at this stage of the research.

6.3 Concluding remarks and future research

This study set out to examine how public value can be cocreated through the responsible governance of artificial intelligence, rather than viewing the technology as a stand-alone technical solution. AI does not create public value by itself; value emerges when AI is deliberately and transparently governed within institutions. This research provides a rare empirical link between MLOps infrastructure and the enactment of ethical principles in daily administrative routines.

Future research should therefore pursue three complementary directions. First, comparative studies across administrations and sectors could test the transferability of Responsible Public Value under different institutional conditions. Second, further conceptual work is needed to operationalize Responsible Public Value through measurable indicators that integrate efficiency, governance quality, and ethical impact. Third, advances in automated audit and explainability tools could strengthen the continuous assessment of responsibility within AI-enabled public services.
