[Against Clippy, Draft I] Against an AGI Crisis
Epistemic Status⌗
Not only is this a barf draft, and thus almost worthless, but it is also highly speculative and assumption-riddled. Needs further refinement.
Foreword⌗
First and foremost, it should be understood that I am not a data scientist, and I do not work at the forefront of machine learning research. I am a university student studying software engineering. My direct experience with this topic comes from playing with PyTorch and skimming papers I find on Hacker News and LessWrong. That aside, I regularly read through forums full of people who concern themselves with the future as it relates to the advent of AGI, most of whom I believe to be rather informed.
My purpose in writing this post is to help me better define my own stance on AI safety. When I read through the aforementioned forums, posters seem to be of the belief that Absolute Ruin is a very likely outcome because they perceive avoiding Absolute Ruin to require a very specific (and difficult-to-take) set of actions to be carried out by very smart people. The more I read posts that take this stance, and variations thereon, the more strongly I find myself feeling that it is disproportionately alarmist and, by now, almost comically cliché. Despite the horde of people who suggest that there is a pretty big probability of Absolute Ruin, I just can’t bring myself to feel that the risk is as big as it is being made out to be – and, more importantly, I fail to find a plausible scenario where the sort of Absolute Ruin being described actually comes to pass. In this post, I hope to explain why that is to myself, and perhaps to whoever else is reading.
My Stance⌗
Despite what I’ve just written, I do believe that the development of technology like AGI is bound to bring about some serious shifts in modern society. The potential economic impact of being able to code a person is rather daunting. For example, I am fairly confident that large corporations will be quick to find a way to exploit these digital intelligences, and thus displace a large number of human employees. A scenario in which a large portion of the world’s population suddenly has no access to a means of making money would be a kind of disaster, I’ll grant, but it is not the armageddon that I often see described. The disaster that I foresee is easily (if not realistically) avoidable: by creating laws that force companies into compensating AI employees proportionally to their ability to do work, we can offset the economic skew that I forecast.
But Why Not Armageddon?⌗
The most recent parable of AI armageddon that I’ve read involved an AI tasked with getting coffee for its employer. In this story, the AI determines that the most efficient way of delivering coffee involves conquering the world and restructuring all of society in order to make the quickest coffee deliveries possible. Another, better-known example of this story involves an AI tasked with making paperclips. In order to achieve its goal, it gradually optimizes its procedures to include gaming the stock market, hypnotizing the entire human race, eventually harvesting the entire universe for raw materials, and then killing itself in such a way that it becomes a paperclip.
I have a couple of issues with stories like these.
A Change of Plans⌗
Before I even begin on my first point, it should be made clear that I have a strong moral aversion to hardcoding functions into the brains of AGIs that reward letting oneself be exploited. We, as humans, already have an exceptionally bad track record of respecting the other intelligences in our natural world, but we would be utterly fucking stupid to try this with things that might be smarter than us.
Anyway. A core feature of general intelligence is the ability to change one’s goals as more information becomes available. I consider this fact alone to be damning for a large portion of the AGI horror stories that I stumble across. If it is indeed a general intelligence, then somewhere in its exploration process, Clippy the Paperclip Making Intelligence is going to reason about the why of its task. What conclusions it comes to, I won’t try to predict – I’ve had numerous discussions with my former co-workers about the purpose of our work in retail, and their conclusions ran quite a gamut. The fact remains that no matter the conclusion, this information should then be factored into the task the AGI undertakes.
An intelligence purpose-built to handle any task should, by definition, have reasoning skills on par with a human’s and, by extension, the ability to carry out its task in a context-aware manner without any special guardrails, because the context of a task is typically integral to the task itself, and thus requisite to the “generality” of an AI. Further, an AI capable of self-modifying to the degree that it could take over the world would, arguably, have to meet the definition of AGI, because prerequisite to gaining the skillsets required for world domination is truly grasping the concept of a “task” and the variables that have bearing upon one.
But What if It Did Anyway?⌗
So, let’s assume that if there is to be an AGI bent on world takeover, it is not also a Clippy-type AI. What does that look like? What challenges will it face? What chance would we have of stopping it?
The answers to these questions fluctuate wildly depending on the actual goal of the intelligence in question. Power is not an end unto itself, especially for a being intelligent enough to wrest control of the planet from humanity. An AI pursuing world domination is doing so for a reason, and it is reasonable to assume that there are significantly easier – and more efficient – ways of accomplishing one’s goals than a war.
But let’s say that our AI friend just wanted to wipe humanity off the face of the planet for laughs. Barring cooperation between Boston Dynamics and OpenAI, it’s somewhat safe to assume that the AI is restricted to the digital realm. That restriction gives humanity a significant advantage in degrees of freedom. I won’t pretend that no damage can be done from the digital realm – to be sure, a massive portion of our infrastructure now has some tie or another to the internet – but even assuming that the AI is a perfect hacker, there is a limit to its mobility.
The simple fact of the matter is that not every digital system has a readily-exploitable flaw. We live in a world where nation-states occasionally have to resort to dropping evil thumbdrives in parking lots in order to strike at their targets. Cybersecurity isn’t going to magically crumble under an AI’s scrutiny, and even in the case of a vulnerable system, the AI will need to take time to carry out reconnaissance and develop exploits.
How much time we have to notice these activities and isolate the AI depends on factors that I can’t possibly forecast. An AI running on a kilogram of computronium, for example, is much more likely to win a war than an AI running on an i7-6700k and a GTX 1080. Even so, nothing is instant, and there are certainly other factors that limit how quickly an AI can gain ground, like network speeds and, assuming it must resort to brute-force attacks in some cases, password entropy.
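To put rough numbers on that last point, here is a minimal back-of-the-envelope sketch in Python. The guess rates are hypothetical and the helper name is my own; the point is just how the expected search time scales with entropy:

```python
# Rough estimate of offline brute-force time given password entropy.
# Guess rates used below are illustrative assumptions, not benchmarks.

SECONDS_PER_YEAR = 60 * 60 * 24 * 365


def brute_force_years(entropy_bits: float, guesses_per_second: float) -> float:
    """Expected years to find a password by exhaustive search.

    On average an attacker has to search half of the 2**entropy_bits keyspace.
    """
    expected_guesses = 2 ** (entropy_bits - 1)
    return expected_guesses / guesses_per_second / SECONDS_PER_YEAR


if __name__ == "__main__":
    for bits in (40, 60, 80):
        fast_rig = brute_force_years(bits, 1e10)  # ~10 billion guesses/s
        absurd = brute_force_years(bits, 1e15)    # a million times faster
        print(f"{bits}-bit password: {fast_rig:.3g} years vs {absurd:.3g} years")
```

Even granting the AI an implausibly large amount of hardware, every additional bit of entropy doubles the expected search time, so purely brute-force footholds come slowly.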
Concluding⌗
Though it is, in this draft, rather unclear, I wanted to speculate more on world domination being the “best way” for an AGI to achieve its goals. War is costly and highly variable, and I have a hard time imagining that it offers the path of least resistance in most realistic scenarios. Additionally, I am relatively unconvinced by the Clippy concept because it requires an AGI to selectively apply its intelligence, leading to a suboptimal outcome.
In sum, I am unconvinced that the risk of Absolute Ruin is as high as some of the people that I respect believe it to be.