What Is Roko’s Basilisk?
You probably saw the title of this post and thought you were going to get to read about a cool rage dragon snake monster. Unfortunately we’re not going to talk about a legendary beast (though we would like to someday, seems fun), we’re going to talk about yet another existential crisis. This time it’s more of a logic puzzle regarding robots though, so what is Roko’s Basilisk? Did Roko ride it into battle? (No)
Roko’s Basilisk leans heavily on the idea of Newcomb’s Paradox. It’s a whole can of worms, but it’s roughly a version of the prisoner’s dilemma in which one of the participants is infallible. (The prisoner’s dilemma is itself often considered a Newcomb-like problem anyway.)
You’ve got two boxes: one that you know contains $1,000, and a mystery box whose contents you are not privy to. You can choose to take just the mystery box, or both boxes. There also exists an infallible predictor of your choice. If the predictor says you will take both boxes, the mystery box is empty. If it predicts you will take only the mystery box, the mystery box contains $1,000,000.
Now if we make a table of all the possible outcomes, an obviously correct choice appears. Except two people might disagree on what the obvious choice is.
| Prediction | Your Choice | Your Payout |
|---|---|---|
| Mystery box | Both boxes | $1,001,000 |
| Mystery box | Mystery box | $1,000,000 |
| Both boxes | Both boxes | $1,000 |
| Both boxes | Mystery box | $0 |
Anyway, the principle of expected utility dictates that you should always take just the mystery box, since against an infallible predictor that choice has the best expected payout. Though game theory operates on multiple principles, and utility is only one of them. There also exists the dominance principle: pick the choice that does better no matter what the other party did. In that case, you should always take both boxes, because whatever the prediction was, grabbing the extra box makes you $1,000 richer, and you’re guaranteed at least $1,000.
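The two lines of reasoning can be sketched in a few lines of Python. This is purely illustrative; the payoff values come from the table above, and the function names are mine:

```python
# Payoff table from the post. "one_box" means taking only the mystery
# box; "two_box" means taking both.
PAYOFFS = {
    # (prediction, your_choice): payout in dollars
    ("one_box", "two_box"): 1_001_000,
    ("one_box", "one_box"): 1_000_000,
    ("two_box", "two_box"): 1_000,
    ("two_box", "one_box"): 0,
}

def expected_utility(choice):
    # The predictor is infallible, so the only outcomes you can actually
    # reach are the ones where the prediction matches your choice.
    return PAYOFFS[(choice, choice)]

def dominates(choice_a, choice_b):
    # Dominance reasoning ignores the predictor's accuracy: hold each
    # prediction fixed and ask which choice pays more in that column.
    return all(
        PAYOFFS[(pred, choice_a)] > PAYOFFS[(pred, choice_b)]
        for pred in ("one_box", "two_box")
    )

print(expected_utility("one_box"))      # 1000000
print(expected_utility("two_box"))      # 1000
print(dominates("two_box", "one_box"))  # True
```

Both answers fall straight out of the same table: expected utility says one-box, while two-boxing strictly dominates. That’s the whole disagreement in six dictionary entries.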
Because of the nature of the “infallible predictor,” you may have interpreted the scenario differently. Either you thought taking the mystery box was the best idea, or you thought taking both was the best.
Anyway, point is, people can’t agree on decision-making models, and the models themselves are often mutually exclusive or lead to wildly different outcomes.
So here’s where things get cool and interesting. Or disturbing. Take your pick.
Anyway, when you apply decision-making to a super powerful artificial intelligence, you get what Eliezer Yudkowsky calls the AI control problem. The problem essentially asks: “if AI one day gets smarter than us in every way, how do we make sure its goals and its reasoning/decision-making are safe?”
Well… The answer is probably “we can’t.” We don’t even agree on our own reasoning skills when presented with some boxes and money.
So enters the AI in Newcomb’s Paradox, since we’ve already introduced the idea of the infallible, godlike predictor. There exists a theory of decision-making (Yudkowsky’s Timeless Decision Theory) that holds you should take only the mystery box even when that looks like a sure loss. Basically, the godlike predictor could be simulating you/your consciousness to make its prediction, so reasoning like a one-boxer is exactly what earns you a cool $1,000,000.
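The “the predictor might be simulating you” idea can be sketched as a toy Python snippet. This is entirely my own illustration, not anything from the decision-theory literature: the predictor simply runs your decision procedure before filling the boxes.

```python
# Toy model of an infallible, simulating predictor. A "policy" is just a
# function returning "one_box" or "two_box".

def predictor(policy):
    # The predictor simulates your decision procedure to see what
    # you'll do. That simulation IS its prediction.
    return policy()

def play(policy):
    prediction = predictor(policy)  # infallible: it literally runs you
    mystery = 1_000_000 if prediction == "one_box" else 0
    choice = policy()               # your actual choice, same procedure
    return mystery if choice == "one_box" else mystery + 1_000

one_boxer = lambda: "one_box"
two_boxer = lambda: "two_box"

print(play(one_boxer))  # 1000000
print(play(two_boxer))  # 1000
```

Because your real choice and the simulated one come from the same procedure, there’s no way to be a two-boxer “only when it counts.” The agent whose procedure one-boxes simply ends up richer.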
Anyway, Roko’s Basilisk is that godlike AI. Kind of. It even presents you with the two boxes! Except inside one of them is eternal torment.
Let’s assume Roko’s Basilisk is meant to optimize and create the “perfect state” for humanity. Whatever the heck a perfect society is doesn’t actually matter. The idea is that it’s technically impossible to be perfect, as you can always optimize just a little further.
Eventually, Roko’s Basilisk (we know it’s a computer, but imagine if it was a cool, mythical monster) will reach a point of optimization where speeding up its own creation in the past is the only recourse. Thus, those who knew about it but did not help create it would need to be punished, pressuring everyone in the past toward becoming someone who would eventually create the Basilisk.
You know, maybe it is better to just assume this thing is just some mythical being that exists outside time and space instead of a really smart iPhone.
Anyway, the idea of Roko’s Basilisk is that it metaphorically wants you to take just the mystery box, which here means devoting yourself to its creation; like Newcomb’s predictor, it rewards the one-boxer. Choosing not to help create it is grabbing both boxes, and that ends poorly for you.
Also, here’s an existential crisis: knowledge of Roko’s Basilisk means you will either go help make it after reading this post, or you’ll go and take a nap. Either way, you’ve now made your choice of boxes, and the Basilisk has already predicted it.
What if You Don’t Subscribe to Roko’s Basilisk?
This is a very fair question. Mostly because this is a thought experiment you really shouldn’t feel compelled to take very seriously. After all, the idea that some godlike superbeing would want to meddle in our lives on such an individual level is quite ridiculous. It would probably have better things to do than retroactively punishing the people who didn’t aid its creation. Yeah, sounds a lot nuttier when you say it out loud.
Plus, all the decision-making machinery that goes into the concept of Roko’s Basilisk is really complicated and largely has to be glossed over when laypeople like us discuss it.
But here’s something to think about.
There are people who are very invested in the ideas behind Roko’s Basilisk, like Eliezer Yudkowsky. He doesn’t much like the Basilisk itself, but he does have a large stake in AI research through the Machine Intelligence Research Institute. Make no mistake, there is a lot of money being poured into this sector, one in which Yudkowsky thinks he can make a super (friendly) AI.
He is also pretty dogmatic in his belief in utilitarianism. Basically, the needs of the many outweigh the needs of the few, to the point where we should be comfortable with torturing people to serve the greater good. Nobody point out that Yudkowsky is worth a pretty penny while people suffer from poverty.
Anyway, dogmatic beliefs and lots of money typically don’t end well and that’s the thing to think about.
Here’s some fictional robot trivia to get the cogs in the old brain turning.