Six kinds of explanation for AI (one is useless)

There are at least three broad categories of useful forms of explanation for AI:
  1. Explaining human actions that led to the system being released and sold as a product and/or operated as a service. Since humans are the only beings our law can hold accountable, this is a very useful form of explanation for establishing why things happened in terms of liabilities (legal or tax), praise, and a general emotionally-gratifying "how could this happen?" kind of answer. It can also tell you how you could be better using a system, or what to request in terms of changes to a system if you own or license it. All commercial AI should have this. So should the rest of commercial software. See further my blog on governance of AI or several of my 2019 AI ethics publications.
  2. Explaining what inputs resulted in what outputs. Actually, this describes two kinds of explanation:
    1. Even if a system is entirely the opaque kind of "black box", you can still try putting a bunch of related inputs in and finding out what makes the output change. This allows you to check, for example, whether you would have gotten a loan if you had been a little taller or younger, and so forth. This is sometimes called "digital forensics." Probably all AI that interacts with people who didn't build it should have this.
    2. Or for robots like driverless cars, you can keep the airplane kind of "black box" that records logs of inputs and decisions taken by the system for later debugging. For robots at least, because such a black box is likely to hold incidental personal data, it should overwrite its old data regularly. For other kinds of systems, e.g. tax or loan services, it might be worth keeping such logs around for some years, though obviously cybersecured. Probably all AI that affects decisions concerning people who didn't build it should have this.
  3. Seeing exactly how the system works.  This is what people often mean by "explanation", but it's not necessarily more useful than–or even as useful as–the previous two broad categories, depending on your use case.  But if you as a developer can actually understand all the components of your working system, then not only can you explain such details to (at least incredibly well-educated) users, you may also be better able to understand, maintain, and debug your own system. This broad category of explanation also has a couple of sorts:
    1. Encoding the AI in the kinds of representations that humans can read: for example, production rules, logic, decision trees, etc. Note that sometimes such systems will be so large and complex that they may not actually be very understandable anyway. Very few people can look at an orchestra score and know exactly how it sounds; maybe no one can really experience the qualia of polyphonic music that way. Gazillions of lines of computer code aren't exactly like an orchestra score, but hopefully you see my point.
    2. Fitting more transparent models to less transparent ones, where the AI is generated by models learned by a less transparent system, e.g. a deep neural network (DNN) learning system. Actually, this is how human intelligence works. We often don't really know why we do something, and we certainly don't and couldn't consciously determine every gesture, thought, or word choice. But we learn to guess why we are doing things based on models we acquire partly from our own experience, but quite a lot from being taught. So why you think you do what you do will depend a lot on what you've been told about yourself and about other human beings. If you're lucky (and you care enough), you'll keep learning more about how your own mind works for your entire life. Anyway, back to machine learning: work on this can be found going back decades, including recent work by Zoubin Ghahramani and Murray Shanahan.
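The input-probing idea in type 2.1 can be sketched with a toy example. Everything here is hypothetical: `loan_model` stands in for whatever opaque system is being audited, and `probe_feature` is just one simple way to re-run it with varied inputs and see what flips the decision.

```python
def loan_model(age, income, height_cm):
    """Stand-in black box: approves if income clears an age-dependent bar."""
    return income > 20000 + 500 * max(0, 40 - age)

def probe_feature(model, base, feature, values):
    """Re-run the model varying one feature; report where the output changes."""
    baseline = model(**base)
    flips = []
    for v in values:
        inputs = dict(base, **{feature: v})
        if model(**inputs) != baseline:
            flips.append(v)
    return baseline, flips

base = {"age": 30, "income": 24000, "height_cm": 170}
decision, flips = probe_feature(loan_model, base, "age", range(18, 70))
# Height should never matter; age might. Either way, the probe tells you,
# without any access to the model's internals.
```

The point is that this works on any model you can call, opaque or not: you learn which inputs the decision is sensitive to, which is often what an affected person actually wants to know.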
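The overwrite-as-you-go "black box" in type 2.2 can likewise be sketched. The `DecisionLog` class and the driverless-car inputs below are illustrative assumptions, not any real robot's logging API; the one real mechanism is a fixed-capacity buffer that silently drops its oldest records, so incidental personal data ages out.

```python
from collections import deque
from datetime import datetime, timezone

class DecisionLog:
    """Fixed-capacity log; the oldest records are overwritten automatically."""
    def __init__(self, capacity=1000):
        self.records = deque(maxlen=capacity)

    def record(self, inputs, decision):
        self.records.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "inputs": inputs,
            "decision": decision,
        })

log = DecisionLog(capacity=3)
for speed in (10, 20, 30, 40):
    log.record({"speed": speed}, decision="brake" if speed > 25 else "cruise")
# Only the three most recent records survive; the first was overwritten.
```

For a tax or loan service you would set the capacity (or a retention period) to years rather than moments, and secure the store accordingly.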
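And type 3.2 can be illustrated by fitting a deliberately simple surrogate to an opaque model's decisions. Real work of this kind fits decision trees or local linear models to the black box's outputs; the one-feature threshold rule here is a minimal stand-in, and `opaque_model` is invented for the example.

```python
import random

def opaque_model(x):
    """Stand-in for a learned black box (e.g. a DNN's decision function)."""
    return (0.3 * x[0] + 0.1 * x[1] ** 2) > 2.0

def fit_threshold_surrogate(model, samples, feature):
    """Pick the threshold on one feature that best mimics the model."""
    labels = [model(s) for s in samples]
    best_t, best_acc = None, -1.0
    for s in samples:
        t = s[feature]
        acc = sum((x[feature] > t) == y for x, y in zip(samples, labels)) / len(samples)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t, best_acc

random.seed(0)
samples = [(random.uniform(0, 10), random.uniform(0, 5)) for _ in range(500)]
t, acc = fit_threshold_surrogate(opaque_model, samples, 0)
# The surrogate rule "positive if feature 0 > t" reproduces most of the
# model's behaviour without inspecting a single internal weight.
```

The surrogate is approximate by construction; that's the tradeoff. It gives you a story humans can follow, plus a measured fidelity score telling you how far to trust the story.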
Some people think all AI should have explanation type 3.1, or that the GDPR requires this. But in my opinion, we should leave it to the vendors of AI systems to decide how much of either of the type-3 explanations they provide with a system. As long as we hold software vendors accountable for the actions of their systems, they can make their own informed tradeoffs about transparency versus performance. Not that it's clear there really is any such tradeoff. (Hat tip: Dylan Evans)

Tallinn, Estonia – gratuitous picture for the blog index :-)
So that's five sorts of transparency. This week at the Tallinn Digital Summit I unfortunately witnessed the naming of a sixth form, which seems to be catching on. It started with what looked to me very much like an attempt at the type of disruptive disinformation exercise documented by Naomi Oreskes and Erik Conway in Merchants of Doubt. Physicist Steve Hsu asserted that explanation was impossible for AI, but in the face of apparently unexpected (by him) expert resistance from Nanjira Sambuli and Ben Cerveny he immediately backed off and agreed with them, but then wheedled in the idea that deep explanation, at least, is impossible. His claim: you don't know what every weight in a deep neural network does, do you? Then you can't get anything analogous to 3.1 above from a DNN.

The panel that drove the tweet that drove this blogpost.
from left to right: Ben Cerveny, Steve Hsu, Nanjira Sambuli
So what? Deep explanation would be useless. As I've been saying for years, when you audit a bank, you don't ask for a mapping of the synapses of the bank's employees, you ask for their accounts. That's the equivalent of explanation 1 above, which as I've said, every commercial software company should be able to produce. Some people say "but you could put the accountant on the stand", but at best then you would get version 3.2 above – if the employee's memory is great, their original understanding was correct, and they aren't lying.  As linked earlier, you can also already get version 3.2 with DNNs.

But maybe because Steve had been attempting to establish AI authority by wielding his physics degrees, my new response in a tweet was that worrying about "deep explanation" was like worrying about whether the molecules of a table were going to hold together or let a plate fall through them. We don't worry about molecules when we set a table, and we don't need to worry about exact DNN weights for regulating or even programming AI.  It's just not the right level of abstraction.

Microsoft (or at least its engineers; I don't know if it was coordinated) used to try to run this kind of "deep explanation" interference (though they didn't call it that) against the regulation of AI. They stopped two years ago, because they realised that ethics is the flip side of liability. Given the success of accountability and transparency in soft law like the OECD/G20 Principles of AI, I'd say we are getting this kind of message across pretty well. But unfortunately there are still a bunch of actors who rail against authority and order, even when it is a big part of what protects them and their precious power and money. They come up with narratives about how damaging the EU or other governments are. Sure, everything can and must constantly be improved, but some disruptions are actually a very bad idea. For example, most war.

Anyway, to recap:
  1. Everyone who writes software, with or without AI in it, using or not using AI techniques (including machine learning) to write the software or run the final system, should keep decent records of what they've done. This is good practice for their own ability to maintain the code and run their organisation, and it's essential practice for demonstrating due diligence.
  2. Most AI systems that are commercially released should have processes in place so that you can do forensics to check how and why they work, whether by feeding in large ranges of parameters to check when a decision would change, or logging inputs and outputs of the system, or both.
  3. If vendors are held accountable for the outcomes of using their software / intelligent system, they may choose also to get extra transparency by using more readily-comprehensible representations of their AI, but they may choose not to. They may also choose to build other, simpler models of how their software runs to provide more forms of explanation, for themselves or for others. I think it's OK to leave this up to them, and their lawyers.
None of these strategies should be seen as excluding the others, but at a minimum I'd recommend the first one should be mandatory in commercial AI. Generally speaking, the more sources of knowledge you have, the more likely you are to be able to understand something. But too much information is useless. All models are wrong, but some are way more useful than reality itself, at least for human reasoning.


Unknown said…
I used to think that those who operated AI systems should be obliged to explain on demand the reasoning behind their decisions. But humans are often unaware of the true reasons for their own decisions, so it hardly seems obvious to apply such a requirement to machines.

Perhaps a better requirement would be that operators of AI systems should on demand provide justifications for their decisions. That is something that humans can be expected to do also; and it provides a basis for challenges, empirical or logical.
Joanna Bryson said…
You may also want to read The Artificial Intelligence of the Ethics of Artificial Intelligence: An Introductory Overview for Law and Regulation (pdf), Joanna J. Bryson, solicited and reviewed for M. Dubber, F. Pasquale, & S. Das (Eds.), The Oxford Handbook of Ethics of Artificial Intelligence, Oxford University Press; the pdf link is a green open-access version.