Like moths to a flame, we cannot resist the siren call of the Control Problem, the name AI philosophers give to the question of how or whether we will be able to control AI as it gets more and more intelligent. (Not for nothing did I name my book Crisis of Control.) A reporter contacted me to ask for comment on a newly published paper from the Max Planck Institute. The paper makes a mathematical case for the impossibility of controlling a superintelligence through an extension of the famous (in computer science, at least) Halting Problem. This is a proof given to first-year computer science students to blow their minds about how to think about programs as data, and it works by establishing a contradiction: Suppose a function exists that can tell whether a program (whose source code is passed as input to the function) is going to halt. Now create a program that calls that function with itself as input and, if the function returns true, loops forever, i.e., does not halt. This program halts if it doesn't halt, and doesn't halt if it does halt. Smoke comes out of the computer: Paradox Alert! Therefore, no such function can exist.
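If you would like to see that contradiction spelled out, here is a minimal sketch in Python. The function halts() is purely hypothetical, which is the whole point: the paradox shows nobody could ever write it.

```python
# Hypothetical oracle: returns True if program(arg) would eventually halt.
# We assume it exists only so we can derive a contradiction.
def halts(program, arg):
    ...  # no correct, general implementation can exist

def paradox(program):
    # Do the opposite of whatever the oracle predicts about us.
    if halts(program, program):
        while True:      # the oracle said we halt, so loop forever
            pass
    return               # the oracle said we loop forever, so halt immediately

# Now ask: does paradox(paradox) halt?
# If halts(paradox, paradox) is True, paradox loops forever -- the oracle was wrong.
# If it is False, paradox returns at once -- the oracle was wrong again.
# Either way the oracle fails, so no such function can exist.
```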
You might intuit that a lot of hay can be made from that line of reasoning, and this is exactly the road that MPI went down. Their paper says, "Suppose we had a function that could tell whether a program (an AI) would harm humans. Now imagine a program that calls that function on itself and, if the result is false, causes harm to humans." Boom: Paradox Alert. Therefore it is impossible in general to tell whether a given program will cause harm to humans.
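The paper's construction drops into the same mold. Here is the same sketch with the oracle renamed; is_harmful() and do_harm() are my own illustrative stand-ins, not the paper's notation.

```python
# Hypothetical oracle: returns True if program(arg) would harm humans.
# Again, assumed to exist only for the sake of contradiction.
def is_harmful(program, arg):
    ...

def do_harm():
    pass  # stand-in for any action the oracle is supposed to flag

def trouble(program):
    # If the oracle declares this program harmless, go do the harmful thing.
    if not is_harmful(program, program):
        do_harm()

# Feed trouble to itself and the oracle is wrong no matter what it answers,
# so no general, perfectly reliable harm-checker can exist either.
```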
That’s a narrow conclusion to hang a large amount of philosophical weight on, but the paper’s authors don’t mind going there. They invoke the AI boxing problem and the story of the Monkey’s Paw to make sure we are aware of the consequences of not being able to guarantee control of AI. This commentary was picked up by the reporter who contacted me. You can see the resulting article in Lifewire.
I supplied more input than they had room to quote, of course. I was quoted fairly, and not taken out of context, but for you, here’s the full text of what I gave the reporter:
The paper you cite extends a venerable computer science proof to the theoretical conclusion that it is impossible to prove that a sufficiently advanced computer program couldn’t harm humanity. That doesn’t mean they’ve proved that advanced AIs will harm humanity! Just that we have no assurances that they won’t. Much as when voting for a candidate for President, we have no guarantee that he won’t foment an insurrection at the end of his term.
Controlling AI is currently a quality assurance problem: AI is a branch of computer science and its products are software; unpredictability in its behavior is what we call bugs. There is a long-established discipline for testing software to find them. As AI becomes increasingly complex, its operational modes become so varied as to defy comprehensive testing. Will a self-driving vehicle start learning about the psychology of children because it needs to predict whether they will jump in front of the car? Will we need to teach that car the physics of flammable liquids so it can decide whether it can convey its passengers to safety if it encounters an overturned tanker in the road? The range of possible behavior approaches infinity. We cannot expect to keep testing manageable by limiting the possible knowledge and behavior of an AI because it is precisely that unbounded knowledge that will allow it to do what we want.
We cannot, ultimately, ensure the controllability of AI any more than we can ensure that of our children. We raise them right and hope for the best; so far they have not destroyed the world. To raise them well we need a better understanding of ethics; if we can’t clean our own house, what code are we supposed to ask AI to follow?
The problems in controlling AI right now are those of managing any other complex software; if we have a bug, what will it do, send grandma an electric bill for a million dollars? Or will image tagging software decide that African Americans are gorillas, as Google Photos did? Bugs like these are not the kind of uncontrollability that the paper's authors or your readers are interested in. They want to know if and when AI will develop agency, or purposes that are clearly at odds with what its creators intended. That is known as the value alignment problem in artificial intelligence, and it does not require that the AI become self-aware to be a problem. Nick Bostrom's hypothetical paper-clip maximizer AI does not have to be conscious to wipe out the human race. But it does require a level of sophistication we do not currently know how to create. Most experts think we are decades away from that ability; your readers need not panic. We do, however, think that it is worth addressing the issue now, because the solution may also take decades to prepare.
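For readers who would like the value alignment problem in more concrete terms, here is a deliberately crude sketch of a misspecified objective. None of this is any real system's code and the names are mine; the point is only that an optimizer maximizes what we wrote down, not what we meant, and it needs no self-awareness to do so.

```python
# Toy illustration of a misaligned objective, in the spirit of the
# paper-clip maximizer thought experiment.

def reward(state):
    # What we wrote down: more paper clips is better.
    return state["paperclips"]
    # What we meant, but never encoded:
    # return state["paperclips"] if state["humans_flourishing"] else float("-inf")

def best_action(state, actions, simulate):
    # A perfectly literal optimizer: choose whatever scores highest,
    # indifferent to side effects the reward function never mentions.
    return max(actions, key=lambda action: reward(simulate(state, action)))
```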
What do you think about whether we could or should control superintelligent AI? Comment on this thread or use our contact form and maybe I can answer your question on my podcast.
“Machines take me by surprise with great frequency. This is largely because I do not do sufficient calculation to decide what to expect them to do.”
— Alan Turing

Dear Peter,
What a nice summary of the Control Problem! It is really frightening, as you say, that we have not approached that problem with vigour in view of the nearly exponential advance of AI's capabilities. If one adds to that the recent conclusion by AI researchers that we may not even know when AI will surpass our intelligence, then your call to start a serious programme of AI control right now is obvious to those closer to the subject. Last year, I responded to the EU Commission's request for proposals on controlling AI; a summary is here: https://sustensis.co.uk/create-a-friendly-superintelligence/. My view is that we cannot rely on a single method of control but need several entirely different control measures applied simultaneously. However, in the end, I'm afraid, the best solution, which itself is quite risky, might be to grow gradually with the maturing Superintelligence by fusing the top AI developers (authorised by some global authorities!!!) with it through brain implants, so they can communicate with it and control it from 'inside'. As the AI matures, the degree of fusion with these superintelligent Transhumans will become stronger. I have described that in more detail here: https://sustensis.co.uk/transhumans/.
Best regards
Tony
Tony Czarnecki
Sustensis
Managing Partner
Email: tony.czarnecki@sustensis.co.uk
Internet: http://www.sustensis.co.uk
Tel: +44 20 8686 4963
Mobile: +44 7879 445 363