AI Explainability: A Primer

Anastasia Siapka
Jan 11, 2024
Updated: Dec 9, 2024

Is AI Really Facilitating Understanding?

Graphic entitled "Explainability" with the "ai" emphasized — Graphic from Canva, altered by Todd Mei

2023 was the year of Artificial Intelligence (AI): from generative AI to the EU’s AI Act, you would have been hard-pressed to miss the avalanche of AI-related news over the past months. Apart from its beneficial applications, AI has dominated media, academic, and policy discussions because of the challenges that its development and deployment pose. Key among these is the challenge of ‘explainability’: although AI increasingly supports or replaces human decision-making, its outputs are hard to explain.

Introduction to AI

Although frequently discussed, the concept of ‘AI’ remains elusive, so an overview of its main features is in order.

First, AI systems consist of algorithms (meaning a series of defined steps to produce a particular output) commonly implemented in computer programs or information systems to analyse data and statistical relations.
Second, AI systems undertake goal-oriented tasks that are associated with human intelligence, including ‘reasoning, the gathering of information, planning, learning, communicating, manipulating, detecting’.

AI systems could thus be defined as sets of algorithms performing goal-oriented tasks that would otherwise require human intelligence.

Systems with these features are embedded in software, in which case they operate in the virtual sphere (e.g., as virtual assistants, search engines, or chatbots) or hardware, in which case they are physically embodied (e.g., as robots, self-driving cars, or drones).

Explanations for Ordinary Decisions

To illustrate why explainability matters for AI-enabled decisions, we might consider why it matters for human decisions in the first place. I suggest that (good) explanations fulfil three essential functions.

1) Interpretability

Explanations provide their recipients with information answering an open-ended question (e.g., why, how, what, or where). Such information might revise the recipient’s existing beliefs (e.g., by revealing unknown relations between them) or supplement them with new ones.

The information provided ultimately expands the recipient’s understanding and interpretation of the situation at hand.

Side view of woman in glasses looking into middle distance with AI graphics overlayed — Image from Canva

2) Action guidance

This newly generated understanding also applies to other, comparable situations.

Recipients may use explanations as the basis of future actions in order to reach desired outcomes. For instance, explaining that a loan application was rejected due to action x informs applicants that repeating x will likely lead to the rejection of future applications. Hence, to obtain a different outcome, they would do better to try a different action y.
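The loan example can be made concrete with a small sketch. Everything in it is an assumption made for illustration: the rule (approve only if debt is at most 40% of income), the field names, and the functions `decide` and `explain` are hypothetical and do not describe any real lender's criteria. The point is only that an explanation naming the decisive factor also tells the applicant what to change.

```python
# Hypothetical decision rule, invented purely for illustration.
MAX_DEBT_PERCENT = 40  # assumed threshold: debt may be at most 40% of income


def decide(applicant):
    """Approve when debt does not exceed MAX_DEBT_PERCENT of income."""
    within_limit = applicant["debt"] * 100 <= applicant["income"] * MAX_DEBT_PERCENT
    return "approved" if within_limit else "rejected"


def explain(applicant):
    """Return the decision, the reason for it, and the action that would change it."""
    decision = decide(applicant)
    # Largest debt that the rule above would still approve for this income.
    max_debt = applicant["income"] * MAX_DEBT_PERCENT // 100
    if decision == "rejected":
        reason = f"debt of {applicant['debt']} exceeds the limit of {max_debt}"
        advice = f"reduce debt to {max_debt} or less before reapplying"
    else:
        reason = f"debt of {applicant['debt']} is within the limit of {max_debt}"
        advice = "no change needed"
    return {"decision": decision, "reason": reason,
            "advice": advice, "max_debt": max_debt}


applicant = {"income": 50000, "debt": 30000}
result = explain(applicant)
print(result["decision"], "-", result["reason"])
print("Action guidance:", result["advice"])

# Following the advice (action y instead of action x) changes the outcome.
revised = {"income": 50000, "debt": result["max_debt"]}
print("After following the advice:", decide(revised))
```

Without the `reason` and `advice` fields, the applicant would learn only that the application failed, not which action to avoid repeating.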

3) Accountability

Explanations enable recipients to assess or challenge the grounds for a decision affecting them (e.g., if they are arbitrary, false, or illegal) and seek remedy or redress. As such, explanations help identify errors and biases in decision-making, allocate responsibility, and eventually foster accountability.

Accountability can be understood as ‘a relationship between an actor and a forum, in which the actor has an obligation to explain and to justify his or her conduct, the forum can pose questions and pass judgment, and the actor may face consequences’.

Explanations for AI-Enabled Decisions

In light of these functions, explanations are not to be disregarded when transitioning from human decisions to AI-enabled ones. In the popular imagination, though, AI systems are perceived to be objective, neutral tools. This perception makes it hard to detect their malfunctions, question their outputs, and deviate from them.

Nonetheless, AI systems are profoundly influenced by the technologies, platforms, devices, data ontologies, and methodological choices involved in their development. Far from impartial mathematical constructs, they are considerably value-laden. Explanations are, therefore, necessary to interrogate the systems’ objectivity and discern the infiltration of biases or inaccuracies in the decision-making process.

Yet, what would an explanation look like in the case of AI decisions?

Wachter et al. suggest two types.

The first type includes explanations of system functionality, which describe the logic, significance, anticipated consequences, and general functionality of the system. These can be given before (ex ante) or after (ex post) the decision-making process.
The second type includes explanations of specific decisions. These help recipients construe the rationale, reasons, and individual circumstances of the decision and are feasible only ex post.
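The difference between the two types can be sketched with a deliberately transparent scoring system. This is a minimal illustration, not a method proposed by Wachter et al.: the weights, threshold, feature names, and the functions `explain_system` and `explain_decision` are all hypothetical. The first function needs no applicant at all, so it could be published ex ante; the second can only run once a particular applicant's data exist, which is why it is ex post.

```python
# Hypothetical linear scoring system; weights and threshold are made up.
WEIGHTS = {"income_k": 2, "years_employed": 5, "missed_payments": -20}
THRESHOLD = 100  # assumed minimum score for approval


def explain_system():
    """Explanation of system functionality: the general logic, for everyone."""
    return {"weights": dict(WEIGHTS), "threshold": THRESHOLD}


def explain_decision(applicant):
    """Explanation of a specific decision: what each feature contributed here."""
    contributions = {name: WEIGHTS[name] * applicant[name] for name in WEIGHTS}
    total = sum(contributions.values())
    return {"contributions": contributions,
            "score": total,
            "approved": total >= THRESHOLD}


# Available before any decision is taken (ex ante).
print("System:", explain_system())

# Available only once an individual case has been decided (ex post).
applicant = {"income_k": 40, "years_employed": 3, "missed_payments": 1}
local = explain_decision(applicant)
print("Decision:", local)
```

A system this simple can supply both kinds of explanation directly from its own definition, which is not the case for the systems discussed next.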

While such explanations are desirable, the state of the art in AI complicates their feasibility. Early AI systems followed simple decision-making processes, such as linear models and decision trees, which were relatively easy to understand. Current systems predominantly take the form of Machine Learning (ML) or, more specifically, Deep Learning (DL), whose inner workings are harder to articulate in human terms.

This difficulty is apparent at both lay and expert levels. Those lacking specialist knowledge in computer programming cannot make sense of the rationale and methods underlying ML and DL. Hence, organisations and Big Tech companies, armed with infrastructure, expertise, and resources, develop AI systems that affect individuals, without the latter having comparable means to comprehend and assess these systems, thus aggravating epistemic inequalities and power asymmetries.

Surprisingly, even domain experts do not attain straightforward knowledge of the workings of such tools. ML and DL systems interact with their environment dynamically: they adjust their inner structure based on the feedback received (e.g., in the form of data input by users) and subsequently adopt new, unanticipated behaviours.

Robot with headphones on a laptop — Image from Canva

Owing to these self-learning abilities of ML and DL systems, it is difficult and at times impossible to ascertain which features of the input data were considered relevant to the system outputs or how much weight was given to each feature. Reliant on Big Data, such systems generate outputs based on correlations among high-dimensional data points. Correlation is taken to be sufficient in itself, dispensing with the need to establish causal relations. This often means that ML and DL experts lack foresight into the system’s behaviour and can only retroactively attempt to decipher it.
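One common way of deciphering a system retroactively is to probe it from the outside: alter one input feature at a time and count how often the output changes. The sketch below is a simplified, deterministic variant of such permutation-style probing and rests on stated assumptions. The function `opaque_model` is a hypothetical stand-in for a learned model (a real one would not be a readable formula), and the data and feature names are invented.

```python
# Hypothetical stand-in for an opaque model; treat its body as unreadable.
def opaque_model(row):
    income, age, shoe_size = row  # shoe_size is accepted but never used
    bonus = 100 if age > 30 else 0
    return 1 if income * 7 + bonus > 400 else 0


FEATURES = ["income", "age", "shoe_size"]
ROWS = [
    (60, 25, 42), (30, 45, 38), (50, 35, 44),
    (20, 22, 40), (45, 50, 41), (35, 28, 39),
]


def sensitivity(model, rows, column):
    """Share of outputs that change when one column is cyclically shifted."""
    baseline = [model(row) for row in rows]
    values = [row[column] for row in rows]
    shifted = values[1:] + values[:1]  # each row takes the next row's value
    changed = 0
    for row, new_value, old_output in zip(rows, shifted, baseline):
        probe = list(row)
        probe[column] = new_value
        if model(tuple(probe)) != old_output:
            changed += 1
    return changed / len(rows)


scores = {name: sensitivity(opaque_model, ROWS, i)
          for i, name in enumerate(FEATURES)}
for name, value in scores.items():
    print(f"{name}: {value:.2f}")
```

Probing of this kind indicates which inputs the outputs are sensitive to on this particular sample, but not why, and it reveals correlations rather than causes. That limitation is consistent with the point above: the behaviour is reconstructed after the fact rather than read off the system's design.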

Conclusion

AI systems have proven useful across multiple application domains, and so their complete avoidance would deprive society of beneficial innovation. They are, however, optimised for efficiency rather than human intelligibility, making it difficult to explain their function and especially their resulting decisions. AI thereby resembles a ‘black box’, denoting ‘a system whose workings are mysterious; we can observe its inputs and outputs, but we cannot tell how one becomes the other.’

This lack of explainability cautions against replacing human decision-making with AI in critical domains. Experimenting with decisions that determine access to employment, credit, health, and other important goods imposes considerable risks on individuals, risks that are not always justified by strict necessity or a lack of suitable alternatives. Therefore, in high-stakes decisions, AI should be relegated to an advisory role, supporting yet not substituting for human judgment. Even this role, though, should be approached critically.

Human decision-makers should scrutinise the outputs of these admittedly imperfect systems and interpret them as just one among many components of their overall judgment.

AI explainability is not, then, a merely theoretical challenge but one of wider import. Although neither human nor AI-enabled decisions are infallible, explanations matter for both, as they elevate recipients from uncritical, passive ‘objects’ of decisions to agents capable of adjusting their course of action and holding decision-makers accountable.

Interested in learning more?

This blogpost is based on my research into AI bias, available here.

About the Author

Anastasia Siapka is an FWO PhD Fellow (grant no. 1151621N/1151623N) at KU Leuven’s Centre for IT & IP Law (CiTiP). Her doctoral research evaluates the AI-driven automation of work from a neo-Aristotelian perspective. She was previously a research associate at CiTiP, carrying out ethical and legal research on emerging technologies, as well as a visiting researcher at King’s College London and the University of Edinburgh.

This blog and its content are protected under the Creative Commons license and may be used, adapted, or copied without permission of its creator so long as appropriate credit to the creator is given and an indication of any changes made is stated. The blog and its content cannot be used for commercial purposes.