Transparency is not binary

For some months now (starting on 13 April 2021) I've been giving a talk on the limits of transparency, inspired by both my experiences with and observations of the Global Partnership for AI (I'm one of Germany's original 9 nominees) and Google (both the recent Stochastic Parrots / Timnit Gebru / M Mitchell problem, and of course ATEAC).

What I've said is that there are at least three limits to the extent of transparency:

  1. Combinatorics. For some reason, this week Rod Brooks decided to call this "information physics", but computer science is also a legitimate science, and normally in computer science we call it combinatorics. In the last decade, I've opened a lot of AI policy talks by saying that computation is a physical process taking time, space, and energy, and therefore we can never know everything, with or without AI. (See for example minute 17 of my 13 April talk; I've been giving that particular slide for years.) There's no question that this is a limit on transparency. It's a scientific and mathematical fact: we can't know everything, so what we do understand is always an approximation of reality.
  2. Political polarisation. I've had a published scientific paper about why and when political polarisation covaries with inequality since last December, and again I've been processing these ideas here for years, e.g. in my 2016 post on truth in the information age. Basically, when we are more polarised we are more concerned with signalling our identity, probably because we are feeling threatened and know we can't make it on our own. This may make it psychologically harder for us to actually think and see truth. To be honest, this reason is pretty speculative.
  3. Mutually exclusive goals. This comes back to why leading AI and communication companies employing leading minds still can't seem to effectively reach agreements or communicate decisions. Given that all of us are dealing with abstracted versions of reality, what is the basis of that abstraction? Basically, we will compress information around the goals that we hold. If two people are hired (or otherwise deployed) with opposing goals, they may find each other incomprehensible, however smart they are. The best way to resolve the problem of multiple, conflicting goals in AI was actually one of the core deliverables of my PhD.
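The combinatorial limit in the first item can be made concrete with a little arithmetic. The sketch below is only an illustration: the checking rate of a billion states per second and the component counts are assumptions chosen for the example, not figures from the talk.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def years_to_enumerate(n_binary_components: int,
                       checks_per_second: float = 1e9) -> float:
    """Years needed to visit every joint state of n two-state components.

    checks_per_second is an assumed, illustrative rate.
    """
    n_states = 2 ** n_binary_components  # the state space doubles per component
    return n_states / checks_per_second / SECONDS_PER_YEAR


for n in (30, 60, 100):
    print(f"{n} components: {years_to_enumerate(n):.3g} years")
```

Under these assumptions, 30 on/off components can be checked exhaustively in about a second, 60 take decades, and 100 take far longer than the age of the universe. Any understanding of a system that size has to be an approximation.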
One consequence of transparency having limits like these is that transparency is not necessarily binary. Obviously some things can be completely untransparent (opaque), for example if you burn or delete records of how something was built. But generally transparency relates not only to an artefact, but to who is trying to comprehend it.

How much time and effort you can put into understanding a system, or communicating a system, will vary based on what your goals and resources are. We've long known that there will need to be different kinds of transparency for different audiences. But in computer science, we've known for even longer that every time you replicate information, you set yourself up for future failure, because someone will update the thing that matters at that moment (e.g. the software code) and forget to update the other copies (e.g. the documentation). This is why for a long time people talked about "self-documenting code". On the one hand, having only one version was the only way to be sure all versions agreed, but on the other hand, code is never really entirely self-documenting; that is, how and why it was written will never be transparent to everyone. Nevertheless, one way we can facilitate transparency is to ensure that the documents that support it are grounded in the actual system they document, for example by linking directly to the source software code they are abstracting over. This can work even if not everyone is allowed to see the code or data, so long as there are trusted individuals such as government regulators to check that the links really do work.
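One way to picture documentation that is grounded in the system it describes is to have each document record a fingerprint of the source it abstracts over, so that a checker can flag documents whose source has since changed. What follows is a minimal sketch of that idea, not a description of any existing tool; `DocEntry`, `fingerprint`, `stale_entries` and the `ranker.py` example are all hypothetical names made up for illustration.

```python
import hashlib
from dataclasses import dataclass


def fingerprint(source_text: str) -> str:
    """Hash the source text so a document can pin the exact version it describes."""
    return hashlib.sha256(source_text.encode("utf-8")).hexdigest()


@dataclass
class DocEntry:
    summary: str             # the human-readable abstraction
    source_path: str         # which piece of the system it abstracts over
    source_fingerprint: str  # hash of that source when the summary was written


def stale_entries(docs: list[DocEntry], sources: dict[str, str]) -> list[DocEntry]:
    """Return entries whose linked source is missing or has changed."""
    return [
        entry for entry in docs
        if entry.source_path not in sources
        or fingerprint(sources[entry.source_path]) != entry.source_fingerprint
    ]


# Hypothetical system: one source file and one document describing it.
sources = {"ranker.py": "def rank(items):\n    return sorted(items)\n"}
docs = [DocEntry("Ranks items in ascending order.", "ranker.py",
                 fingerprint(sources["ranker.py"]))]

in_sync = stale_entries(docs, sources)  # nothing has changed yet

# Someone updates the code but forgets the documentation.
sources["ranker.py"] = "def rank(items):\n    return sorted(items, reverse=True)\n"
out_of_sync = stale_entries(docs, sources)

print(len(in_sync), len(out_of_sync))
```

A check like this only needs to be run by someone with access to the source, such as a regulator or auditor. Everyone else would see the documents plus that person's confirmation that the links still hold.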

More recently, I've been thinking about how companies like Facebook or Huawei can provide transparency, when (to a first approximation) no one would believe them even if they did. Well, almost no one. It turns out Facebook has solved this problem in some circles just by hiring or funding outstanding and outspoken researchers who show all their work and data, and basically the external academic communities of researchers who know these Facebook-funded scientists have a reasonable amount of faith in what they are doing. Part of the reason this works, though, is that the researchers still in academia are now even more vigilant in reviewing the Facebook researchers' results, and presumably the Facebook-associated researchers are now even more careful to follow correct processes and fully document their work.

My conclusion from all this pondering is that transparency requires two things:
  1. Veridical information presented comprehensibly. 
  2. A population able to recognise and at least partially comprehend such presentation.
These two things are interdependent. For example, good public education can make the first problem easier. Or a system may be transparent to experts but not to lay people.

Some governments and other agencies who are not truly interested in transparency, but rather in a less fettered exercise of power, are trying to direct focus only onto the second, population aspect. They come and ask people like me how to make people trust AI.

No one should trust AI. Similarly, the task of producing transparency is not (only) to ensure that systems appear transparent, but to ensure that they actually are. We need to work hard to make sure that systems actually are transparent, which means that accountability can be maintained across those systems' use. We'll know we've achieved that if, when things go wrong, we can tell who is at fault:
  • whether the damage was intentional and if so on whose part (the funder, the developing organisation, one individual developer, a hacker, an operator, the operating organisation?), or
  • whether it was negligent and if so at what stage (development or operation?)
Once we've done all this, it's a lot more likely that people will trust the system, because so many people would be involved in the process of building and regulating that system. This is essential: trust arises between people, from their experience of each other. This goes back to that second constraint on transparency, polarisation. If people know someone who knows someone who works in software, AI, regulation, or whatever process is involved, they can call up and ask someone they know whether they should believe a story they see in a paper or on social media. That is much more likely to happen when we have higher social mobility, which correlates with lower economic inequality and lower political polarisation.

So transparency is not a binary. It's an extent – it's the extent to which a population can understand a system. And that extent has both social and technical components.

[Image: transparency and documentation on a repair process in Berlin]