Black Box paradox: can it be unlocked?
by Miklai Kamilla Rózsa
Reading time: 6 minutes
AI technologies now shape battlefield operations through algorithmic surveillance and autonomous targeting systems, often operating under limited human oversight. The decision-making processes of deep learning-based AI systems remain largely opaque to developers and users alike. This black box paradox raises transparency concerns, undermining trust and accountability and posing serious challenges to compliance with legal requirements in military operations, especially when systems whose internal logic cannot be fully understood are involved in lethal decision-making.
A black box AI system may disclose its input and output data while keeping its internal operations inaccessible to human understanding. By relying on massive datasets to learn predictive patterns, such systems generate decisions that are often not explainable and do not fully align with human reasoning. The combination of this inherent opacity with strategic secrecy and large-scale automation makes military applications especially difficult to scrutinize. As Scott Sullivan highlights, the most advanced AI systems employed by armed forces – such as Israel’s Habsora system – can increase targeting capacity exponentially while providing little insight into how targets are selected. Such a system achieves operational efficiency through automated abundance, but at the expense of reduced traceability and weakened ethical oversight.
These dynamics directly affect compliance with the core principles of International Humanitarian Law, namely distinction, proportionality, and precaution.
These principles require military commanders to distinguish combatants from civilians, to weigh expected military advantage against incidental civilian harm, and to take all feasible precautions to minimize harm to civilians. As Sullivan demonstrates, the effective application of these principles relies heavily on human judgement, especially in the formulation and justification of targeting decisions. When a target is selected by an autonomous system whose rationale cannot be reconstructed, commanders are unable to exercise the “reasonable judgment” required under international humanitarian law, nor can responsibility be meaningfully assessed ex post, thereby generating a profound accountability gap.
The black box paradox, however, is not only a legal dilemma; it also represents a risk on the battlefield. AI systems often display predictable behavior during simulated operations, yet produce unexpected results when deployed in real-world environments, where unanticipated system inputs, deceptive tactics, and changing environmental conditions are present. As Arthur Holland Michel notes, such unpredictability stems from three main sources: deficiencies in model training, the use of unverified data, and the dynamic nature of operational environments, such as urban combat zones. Even when systems function as designed, the problem of human-AI decision logic mismatch persists, preventing operators from fully understanding, trusting, or acting upon AI-generated outputs. Reflecting this concern, AI strategy documents issued by the Defense Advanced Research Projects Agency (DARPA) and NATO emphasize that human operators struggle to interpret and trust AI-driven decisions, due to a persistent cognitive gap between human and AI decision-making processes. In such conditions, operators face two major risks: over-reliance on the system or, conversely, disengagement from it altogether. Both responses significantly increase the likelihood of catastrophic errors during military decision-making and operational execution.
Against this background, the concept of Explainable AI (XAI) has emerged as a response to the operational, legal, and ethical challenges posed by black box AI systems. XAI seeks to develop AI models capable of generating understandable explanations for their decisions, enabling commanders, analysts and auditors to better understand why a system produced a specific recommendation or action.
Its primary objective is to bridge the growing gap between increasingly precise yet opaque models and the human actors who are expected to supervise and justify their decisions. In military settings – and ultimately on the battlefield – such transparency must be absolute, as unexplained decisions risk leading to unlawful and unintended violence, death and destruction.
The most effective XAI approaches rely on visualization and attribution tools, such as Grad-CAM, which highlights the image features that influence a classification decision, and LIME, which generates interpretable approximations of complex model behavior. These techniques allow human operators to maintain oversight and review system behavior both during and after operations. As shown by Toma’s application of Grad-CAM to autonomous weapon systems, even partial visual explanations help military personnel assess target legitimacy and identify hidden biases or errors that would otherwise remain undetected. The operational benefits of explainable AI include enhanced decision-making in dynamic situations, reduced chances of accidental actions, and improved stability in human-machine collaboration. Implementing XAI in military AI systems requires a transformation of the entire AI development lifecycle, from data preparation and model design to deployment methods and evaluation tools. Hence, militaries need to transform their organizational culture, moving away from performance-driven optimization toward a more reflective approach that emphasizes transparency, safety, and human accountability.
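To make the Grad-CAM idea concrete, the following is a minimal sketch of the technique in PyTorch applied to a generic image classifier. The model, layer choice, and dummy input are placeholder assumptions for illustration; this is not the pipeline described in Toma’s work or any fielded system.

```python
# Minimal Grad-CAM sketch: highlights which image regions most influenced
# a convolutional classifier's prediction. Model and input are placeholders.
import torch
import torch.nn.functional as F
from torchvision.models import resnet18

model = resnet18(weights=None).eval()      # stand-in for any convolutional classifier
activations, gradients = {}, {}

def save_activation(module, inputs, output):
    activations["value"] = output.detach()

def save_gradient(module, grad_input, grad_output):
    gradients["value"] = grad_output[0].detach()

# Hook the last convolutional block; the right layer depends on the architecture.
target_layer = model.layer4[-1]
target_layer.register_forward_hook(save_activation)
target_layer.register_full_backward_hook(save_gradient)

def grad_cam(image_batch, class_idx=None):
    """Return a heatmap over the input showing class-discriminative regions."""
    logits = model(image_batch)
    if class_idx is None:
        class_idx = int(logits.argmax(dim=1))
    model.zero_grad()
    logits[0, class_idx].backward()

    acts = activations["value"][0]           # (C, h, w) feature maps
    grads = gradients["value"][0]            # (C, h, w) their gradients
    weights = grads.mean(dim=(1, 2))         # channel importance via global average pooling
    cam = F.relu((weights[:, None, None] * acts).sum(dim=0))
    cam = cam / (cam.max() + 1e-8)           # normalize to [0, 1]
    return F.interpolate(cam[None, None], size=image_batch.shape[-2:],
                         mode="bilinear", align_corners=False)[0, 0]

# Usage with a dummy input; in practice this would be a preprocessed sensor image.
heatmap = grad_cam(torch.randn(1, 3, 224, 224))
```

LIME takes a different route, perturbing the input and fitting a simple local surrogate model around the prediction, but it serves the same supervisory purpose: giving the human reviewer a tractable account of what drove the output.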
In this way, XAI’s purpose is not to slow technological innovation, but to ensure that it remains aligned with the rule of law, democratic values and, most importantly, the moral weight of decisions made on the battlefield. Despite the normative and operational promise of Explainable Artificial Intelligence, significant obstacles hinder its implementation in military contexts.
Advanced AI models using deep learning architectures achieve better performance, yet their internal operations remain difficult to interpret, forcing military developers and commanders to confront a persistent trade-off between performance and explainability.
In high-risk military situations – where every millisecond counts and human lives are at stake – speed and accuracy often override the demand for transparency. Yet the immediate benefits of using these systems produce long-term strategic and ethical vulnerabilities. Disclosure of internal mechanisms and model rationales could reveal strategic information to adversaries and create security risks through exploitable weaknesses. Indeed, full transparency remains undesirable even when it becomes technologically feasible. Reflecting this tension, recent strategic documents show a policy shift away from a strong commitment to explainability: while earlier NATO and DARPA efforts highlighted transparency and explainability as key to trustworthy AI, more recent approaches prioritize performance, data integrity, and process auditing instead. This drift reveals a basic contradiction: as AI becomes more critical in decision-making, tolerance for uncertainty decreases, while the systems themselves become less accessible to scrutiny.
Addressing these contradictions requires a transition toward human-centered autonomy, grounded in the principle that explainability and controllability must be embedded into military AI systems from the outset rather than retrofitted onto opaque models. Combat AI systems should be designed to provide abstract, probability-based explanations that allow human operators to understand the logic underlying system recommendations without compromising operational security. Such a framework needs institutional oversight, including independent review boards composed of legal experts, ethicists, military personnel, and AI specialists responsible for audits, stress tests, and certification procedures. Testing environments should simulate both optimal and unpredictably adversarial situations to ensure AI models generalize properly without producing unanticipated, dangerous responses. Furthermore, the training of human operators should extend beyond basic system activation to include critical evaluation skills for judging trustworthiness, identifying intervention points, and reporting irregularities. The human-machine decision-making unit can then operate through mutual understanding, and military institutions can regain control over algorithmically driven operations through such multi-layered control systems.
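As a purely hypothetical illustration of what an abstract, probability-based explanation might look like at the operator console, the sketch below defines a small recommendation record with a calibrated confidence, high-level evidence factors, and a review threshold. The class, field names, and threshold value are assumptions for this sketch, not a fielded or standardized interface.

```python
# Hypothetical operator-facing recommendation: exposes confidence and abstract
# evidence factors rather than raw model internals. Illustrative only.
from dataclasses import dataclass, field

@dataclass
class Recommendation:
    label: str                       # proposed classification, e.g. "military vehicle"
    confidence: float                # calibrated probability in [0, 1]
    evidence: dict = field(default_factory=dict)   # abstract factors and their weights
    review_threshold: float = 0.85   # below this, the system defers to the human

    def requires_human_review(self) -> bool:
        """Flag low-confidence outputs for mandatory operator judgment."""
        return self.confidence < self.review_threshold

    def summary(self) -> str:
        factors = ", ".join(f"{name} ({weight:.0%})" for name, weight in
                            sorted(self.evidence.items(), key=lambda kv: -kv[1]))
        status = "REVIEW REQUIRED" if self.requires_human_review() else "within threshold"
        return f"{self.label}: p={self.confidence:.2f}; main factors: {factors}; {status}"

# Example console output for a low-confidence recommendation
rec = Recommendation(label="military vehicle", confidence=0.72,
                     evidence={"thermal signature": 0.5, "shape match": 0.3,
                               "movement pattern": 0.2})
print(rec.summary())
```

The point of such a structure is institutional rather than algorithmic: it forces every recommendation to carry an uncertainty estimate and an intervention trigger that a human can act on.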
Ultimately, the era of algorithmic warfare requires moving beyond the illusion of machine perfection, by adopting a human-oriented AI understanding. AI systems lack the ability to perceive reality, as they process incomplete data signals through imperfect systems, producing outputs that reflect only simplified maps of the world rather than the territory itself. Yet, these outputs produce direct consequences for human lives, military operations, and fundamental values. Explainable AI represents both a technical solution and an ethical and philosophical approach to AI systems. While machines may analyze data faster than humans, interpretive authority remains uniquely human. The refusal to make military AI explainable therefore represents a silent abdication of accountability and trust. Future defense systems require nations to move beyond algorithm optimization, and instead confront a deeper question: not only what machines should be capable of achieving, but what standards of judgment, responsibility, and oversight humans are willing to uphold. The protection of security against technological subversion depends on human oversight, interpretability, and judgment to ensure technology functions as intended.
Image: generated with OpenAI