Rock, Paper, Scissors in Strategy Games
A simple rock, paper, scissors (RPS) system of direct counters is a perfectly solid and legitimate basis for a strategy game provided that the rock, paper, and scissors offer unequal risk/rewards. Better still is if those rewards are unclear, meaning that players cannot easily determine the exact values of the rewards. The following video is not an example of that, but it's pretty exciting looking.
An Example of $10/$3/$1
Consider a strictly equal game of RPS with clear payoffs. We'll play 10 rounds of the game, with a $1 bet on each round. Which move should you choose? It makes absolutely no difference whether you choose rock, paper, or scissors. You'll be playing a pure guess. Since your move will be a pure guess, I can't incorporate your expected move into my strategy, partly because I have no basis to expect you to play one move or another, and partly because I really can't have any strategy to begin with.
Now consider the same game of RPS with unequal (but clearly defined) payoffs. If you win with rock, you win $10. If you win with scissors, you win $3. If you win with paper, you win $1. Which move do you play? You clearly want to play rock, since it has the highest payoff. I know you want to play rock. You know I know you know, and so on. Playing rock is such an obvious thing to do, you must realize I'll counter it every time. But I can't counter it (with paper) EVERY time, since then you could play scissors at will for a free $3. In fact, playing scissors is pretty darn sneaky. It counters paper--the weakest move. Why would you expect me to do the weakest move? Are you expecting me to play paper just to counter your powerful rock? Why wouldn't I just play rock myself and risk the tie? You're expecting me to be sneaky by playing paper, and you're being doubly sneaky by countering with scissors. What you don't realize is that I was triply sneaky and I played the original obvious move of rock to beat you.
That may have all sounded like double-talk, but it's Yomi Layer 3 in action. And it had quite a curious property: playing rock was both the naive, obvious choice AND the triply sneaky choice.
Math Says There is a Solution
You might say that even with unequal payoffs, there's still an optimal way to play. The optimal solution is called a mixed strategy, meaning that it involves randomly choosing your moves, but obeying certain percentages.
You should play rock 10/14ths of the time, scissors 3/14ths, and paper 1/14th. If you play against another player who is playing suboptimally (for example, he plays paper 100% of the time), you can change your strategy to exploit him (by playing scissors 100%). But the optimal mixed strategy above means that no one can exploit you to do better.
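If you want to check numbers like these yourself, here is a minimal sketch in Python (assuming the zero-sum reading where the loser pays the winner the stated amount; the mix and function names are just illustrative):

# Payoff to the first player, assuming the loser pays the winner.
PAYOFF = {
    ('rock', 'rock'): 0,       ('rock', 'paper'): -1,    ('rock', 'scissors'): 10,
    ('paper', 'rock'): 1,      ('paper', 'paper'): 0,    ('paper', 'scissors'): -3,
    ('scissors', 'rock'): -10, ('scissors', 'paper'): 3, ('scissors', 'scissors'): 0,
}

def expected_payoff(my_mix, their_mix):
    # Average dollars per round for my_mix playing against their_mix.
    return sum(PAYOFF[(mine, theirs)] * p_mine * p_theirs
               for mine, p_mine in my_mix.items()
               for theirs, p_theirs in their_mix.items())

article_mix  = {'rock': 10/14, 'scissors': 3/14, 'paper': 1/14}
all_paper    = {'rock': 0, 'scissors': 0, 'paper': 1}
all_scissors = {'rock': 0, 'scissors': 1, 'paper': 0}
print(expected_payoff(all_scissors, all_paper))   # 3.0: the scissors-vs-all-paper exploit described above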
While that is the math answer, three related factors creep into the real-world application of that strategy:
1) People are very bad at actually playing randomly, especially at specific percentages such as 3/14ths.
2) When people fail to play randomly, they are probably falling back on tendencies they do not know they have, but that you can detect and exploit.
3) People cannot help but let their personalities spill over into decisions about how conservative (playing paper) or risky (playing rock) they are.
Fighting Games
Fighting games rely heavily on RPS. They have both overall games of RPS going on as well as many rapid fire situations of RPS. Virtua Fighter games can even have 5 sets of RPS take place in a period of 2 seconds! Really!
Virtua Fighter's overall system of RPS is as follows: attacking beats throwing, throwing beats blocking or reversing, and blocking or reversing beats attacking. To be clear, let's define terms.
An attack is a move that deals damage. An attack has an initial startup phase where it can't yet do damage (a punch extending), a short phase where it actually can do damage (the sweet spot of the punch), and a recovery phase (the arm retracts). If the defender is blocking correctly, an attack will not damage him, but he can be thrown.
A throw is a special type of move that instantly grabs an opponent whether he's blocking or not and does damage. The catch is, a throw will not grab an opponent who is attacking (specifically, a throw will fail if the opponent's move is in its startup or hitting phase).
A reversal is a special type of move that grabs an incoming attack. Reversals usually look like throws, but they work at the exact opposite times. A reversal only works when the opponent's move is in startup or hitting phase, which are, incidentally, the only times a throw would fail.
Even these explanations are simplified, but the RPS system is basically there. Attack the opponent. If they try to throw you, you'll hit them. If they block or reverse your attack, they nullify it. If you expect them to block, you can throw. If they expect you to throw, they can attack.
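As a rough sketch of that triangle (heavily simplified; the move names and the resolve function here are only illustrative, not the actual engine's logic), it can be written as a small resolution table in Python:

# Simplified Virtua Fighter-style triangle: attack beats throw,
# throw beats block/reversal, block/reversal beats (nullifies or grabs) attack.
BEATS = {
    'attack':   {'throw'},               # the attack interrupts the throw attempt
    'throw':    {'block', 'reversal'},   # throws grab defenders who aren't attacking
    'block':    {'attack'},              # a correct block nullifies the attack
    'reversal': {'attack'},              # a reversal grabs the incoming attack
}

def resolve(move_a, move_b):
    # Which side comes out ahead in this exchange, or 'neutral' if neither counters the other.
    if move_b in BEATS[move_a]:
        return 'A wins'
    if move_a in BEATS[move_b]:
        return 'B wins'
    return 'neutral'

print(resolve('throw', 'block'))      # A wins
print(resolve('attack', 'reversal'))  # B wins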
The fighting game Dead or Alive basically uses this same system, except that the risk/reward for doing a reversal is much different. Reversals are difficult and relatively rare in Virtua Fighter, but they're incredibly easy and do a ridiculous amount of damage in DOA. Reversals are so effective, in fact, that they can paralyze the enemy into not attacking for fear of being reversed. Of course, that's when you throw them....
Unclear Payoffs
While psychology makes it difficult for people to deal with unequal payoffs, it can be even more difficult to deal with unclear payoffs. Imagine that you are making an RPS decision in the fighting game above, and you must consider "how bad" it would be if you guessed wrong and got hit by an attack (as opposed to guessing wrong and getting thrown or reversed). How much damage will you take?
It depends on which character you are fighting, and which character you are. It depends on the distance between you and the opponent, and on the timings involved: maybe it's likely he'll do a launcher into a juggle combo, or maybe that's not reasonable but you fear he might do a stagger into a ground combo. Is your character's back toward the wall, meaning that your opponent could get extra damage from a wall combo? Is your back near the edge of the ring, meaning you might lose the entire round to a ring out? How good is your opponent at doing combos? Is he likely to really maximize his damage, or just get one hit?
It's extremely unclear what the payoffs of this situation are. A half-second later when you are in another guessing game, the payoffs will be different (maybe the distance between you changed) and it will still be very unclear. Cutting through all that and making a reasonable guess requires knowledge of the game, of the opponent, and the presence of mind to put it all together. It is a real skill (I call it valuation) and a very valid skill to test. If you can make payoffs unequal AND unclear, then you've already gone a long way toward making a good strategy game.
RPS in RTS
Real-time strategy games such as StarCraft also use the RPS system. Like fighting games, there's the concept of RPS on both a large scale and a small scale. On the small scale, particular units are designed to counter each other in an RPS way. A marine dies to a guardian. A guardian dies to a corsair. A corsair dies to a marine. Abstractly, there are 6 categories of unit. Ground units can either attack 1) other ground units, 2) air units, or 3) both. Air units can attack 4) other air units, 5) ground units, or 6) both. Pure ground-to-ground units usually beat both other types of ground units, yet lose to both types of air units that can attack ground. Similarly, pure air-to-air units usually beat both other types of air units, but lose to both types of ground units that can attack air.
RPS is not limited purely to units countering each other, though. Real-time strategy games also have the concept of trading off powerful units now against a strong economy now, which leads to even more powerful units later. So on one extreme, a Zerg player in StarCraft might sacrifice his entire economy to get a quick attack force ("6 pool" is the term). This will likely beat a player who chose the other extreme of playing for pure economy and no immediate attack force (by building double or even triple hatcheries). A moderate build (pool on 9th peon, one sunken colony) will likely defend against the early attacker's rush, though. Surviving the rush, the moderate build will have a much superior economy and win in the end. However, this moderate build will produce an inferior economy to the player who built 2 or 3 hatcheries and went for pure economy.
In Starcraft, the early rush is a very, very risky strategy. It's all or nothing. You'll either win right away off it, or your rush will fail and you'll almost surely lose. Because of this, the early rush isn't all that common (depending on the map), but the very threat that the opponent might play the early rush is enough to stop you from playing for pure economy every time.
Finally, notice how hard it is to determine the actual payoffs in StarCraft. If your correct guess results in a battle between a few enemy Zealots and several of your Marines, what is the payoff? How many Marines will you lose? It depends on the micromanagement skill of both players, the terrain, and whether each player even focuses on the battle at all (maybe there's a more pressing battle somewhere else on the map). A lot of the goodness of StarCraft's design is that it's full of RPS with unequal and unclear payoffs.
Back to Basics
I'll leave you with this glimpse into the crazy world of people who don't seem to care about unequal or unclear payoffs.
Reader Comments (20)
Your optimal strategy for the 10-3-1 version of Rock-Scissors-Paper is wrong. I would respond to your "10/14 rock, 3/14 scissors, 1/14 paper" strategy by playing 100% rock: I would win $10 3/14 of the time and lose $1 1/14 of the time. The correct strategy to avoid being exploited is "3/14 rock, 1/14 scissors, 10/14 paper"; no matter what strategy I pick against it, I will lose as much on average as I win.
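A quick way to check that claim, assuming the same zero-sum reading (the loser pays the winner), is to compute what each pure response earns against the corrected mix; a short Python sketch:

# Expected winnings of each pure move against (rock 3/14, paper 10/14, scissors 1/14),
# assuming the loser pays the winner the stated amount.
mix = {'rock': 3/14, 'paper': 10/14, 'scissors': 1/14}
payoff = {  # payoff to 'mine' (rows) against the opponent's move (columns)
    'rock':     {'rock': 0,   'paper': -1, 'scissors': 10},
    'paper':    {'rock': 1,   'paper': 0,  'scissors': -3},
    'scissors': {'rock': -10, 'paper': 3,  'scissors': 0},
}
for mine in payoff:
    ev = sum(payoff[mine][theirs] * p for theirs, p in mix.items())
    print(mine, round(ev, 6))   # all three come out to 0.0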
You're missing the point of the article; that's where the human factor comes into play. He plays his strategy, sees that you are playing rock 100%, and then just counters with paper every time.
Math cannot factor in the human element and the mind game aspect, which is what adds such depth to such a simple, formulaic way of balancing.
"The correct strategy to avoid being exploited is "3/14 rock, 1/14 scissors, 10/14 paper"
A paper/scissors hybrid would stand a pretty good chance of beating that.
Snap up at least a couple wins against paper with the scissors, then tank the rest with just paper. The rocks would have to come out eventually, and the single scissors loss wouldn't be enough against the few scissors wins of your own.
But... who's gonna just sit there and let me get away with my then painfully obvious tactic? If he started snipping up my papers, I'd be in real trouble!
Something to consider is that in any given mixup, every pick is "due" out at some point, if you're going to follow any specific ratio. The more moves that you take, the fewer possibilities there are. For example, if you're down to 3 moves, and you're going to try a spread of "Rock, Paper, Scissors", I'd do the "Avalanche" to try and snipe for that $10 on the scissors. The paper loss doesn't concern me too much; that's still $9.
"A paper/scissors hybrid would stand a pretty good chance of beating that."
Your expected winnings with any paper/scissors hybrid against ThirdParty's strategy is $0.
If your estimation is correct -- that your strategy does indeed stand a much better than 50% chance of winning -- that means your strategy tends to get small rewards for winning but large penalties for losing.
If someone attempts to follow an "optimal pattern" though, I can keep a record of what plays they've already made and work out the chance that he'll play each different move on a play by play basis. That's the problem of trying to adhere to a certain playing rule as opposed to just giving weightings - giving weightings will result in roughly the same plays of each move over the long run, but in the short run your moves are less predictable (since each time you make one move, the chances of making a different move on your next turn do not alter).
Kolenzo, you don't understand independence.
Knowing what he played in the past does absolutely nothing to help you in the future, because of independence.
It doesn't matter what I played last move, it doesn't change the probability of me playing any given move on this one.
You seem to think the optimal strategy is to play those that many times out of 14 exactly. That's not how it works at all. You simply randomly do it every time.
If I were to determine my play by rolling a 14-sided die, I'm just as likely to get any result every time I do it. And that's precisely what you're doing with a mixed strategy. That's why it is impossible to beat; it is statistically the best average. It doesn't mean it cannot ever lose, but rather that on average it is the best strategy possible.
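A minimal sketch of that weighted-die idea in Python (using the article's 10/3/1 weights purely for illustration): every draw comes from the same fixed distribution, so nothing in the history changes the next throw.

import random

# Each call is an independent draw from the same fixed weights;
# previous results have no effect on the next one.
def weighted_throw():
    return random.choices(['rock', 'scissors', 'paper'], weights=[10, 3, 1])[0]

history = [weighted_throw() for _ in range(14)]
print(history)  # 14 throws will NOT come out exactly 10/3/1; only the long-run rates match the weights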
"Your expected winnings with any paper/scissors hybrid against ThirdParty's strategy is $0."
Zero dollars, huh?
When using that spread, 10/14 odds says he'll play paper, right? If that spread is indeed an optimal strategy, and playing paper roughly ten out of fourteen times is part of that strategy, playing scissors at the beginning of the match will most likely net me $3. It doesn't matter if I'm playing against a die or a human, those are really good odds. If I can score one more win with scissors, which is still very likely, I'll have $6.
From that point forward, I can just play paper with relative safety. According to the doctrine, roughly 10/14ths will be paper, roughly 3/14ths will be rock, and roughly 1/14th will be scissors. If he plays paper, which is still a pretty likely outcome, nothing happens. If rock comes out, I get a buck. If his very, very rare scissors comes out, my opponent only gets $3, and I'll probably have a lead of $3 or more. I'll probably win the match by that much.
To be fair, $9 is a bit more of a safe margin against a rogue two-scissors play during the match, but the principle still stands.
If my opponent adheres to some "god strategy", I can always move to counter it. Just because it has the highest win/loss ratio against every possible spread doesn't make it invincible. After all, it isn't fighting against "every possibility" simultaneously, it's fighting against a mind that can recognize what can beat it! If everyone is doing the same thing, because it is the "impossible to beat" strategy, it makes sense that it can be predicted for that very reason.
Choose an even spread of 5/5/4, and you will lose to the relatively safe and very powerful strategy of heavy rock. For a set of 14 rounds, you only need two rock wins for a guaranteed match victory - you only need to spam rock after that.
Choose an "optimal" spread, and you'll end up with a whole lot of the same thing, which can be countered with a lot of whatever beats that thing. You might not even have to beat it more than a few times before switching to a safer strategy.
But wait! That's not even my point!
In reality, my opponent has a brain, and will not just sit there and let me get away with paper for the rest of the match. Hence, Yomi comes into play. My opponent can counter with Scissors, I can counter his counter with Rock, and on and on and on.
The thing with this setup is that there is always psychology behind every decision. Because all of the decisions are based on human thought, they can be predicted and countered. Yes, even when using dice to decide. A human must decide what values will go on the sides of the die, even if the die itself is autonomous of psychology. I'd argue that this is even worse than just deciding the moves yourself, as a die cannot adapt to a counter strategy.
Once you have to consider switching your strategy to adapt to your opponent, you've already stepped into the realm of Yomi.
"Knowing what he played in the past does absolutely nothing to help you in the future..."
The moves are not independent of each other; they will give you or your opponent different amounts of money depending on what is used. As the game goes on, different moves will have different risks and rewards associated with them. The odds of a given move helping you to win change vastly over time.
If you are only a dollar away from winning on the last move, a paper win is just as valuable as anything else; that's not normally the case.
And as I said, you only need two rock wins to guarantee a victory. Even just one rock win is very nasty. Don't tell me you wouldn't be at least a little scared to play scissors after a single rock loss, especially with such a safe move looming over you.
If you score a scissors win, paper becomes a bit safer inasmuch as it will only tie things up if you lose to enemy scissors.
The list goes on. I hope this shows that past moves do have an impact on future moves.
....
Oh, and yes, I've realized the stupidity of my "the moves in a spread are due at some point" remark. Yes, it does make more sense that a spread would just be a percent chance of doing a certain move, not a static flow-chart of what move comes next.
I'm staggered that so many people fail to recognize the brilliance and clarity of this article. People just love to talk. As far as comments go, all I can say is, "there is nothing valuable to add to Sirlin's article." -CC
---------------------------
When using that spread, 10/14 odds says he'll play paper, right? If that spread is indeed an optimal strategy, and playing paper roughly ten out of fourteen times is part of that strategy, playing scissors at the beginning of the match will most likely net me $3.
---------------------------
Sure, you have a 71% chance of starting off with a $3 lead.
Of course, you also have a 21% chance of starting off $10 in the hole.
(and a 7% chance of starting off even)
What is your expected standings after the first throw? Go ahead, compute it....
Surprise! On average, you break even, exactly what the math predicted. Funny how that works....
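Spelled out, assuming the loser pays the winner:

# Opening scissors against a (rock 3/14, paper 10/14, scissors 1/14) opponent:
ev = (10/14) * 3 + (3/14) * (-10) + (1/14) * 0
print(ev)  # 0.0, the $3 wins and the $10 losses cancel exactly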
I see the problem.
You are thinking of a game with actual money that is physically paid out and kept.
I am thinking of a game where the "money" is basically just points, and is reset after a set of 14 rounds. The winner of a set would be determined by who would have the most money at the end of each one, and the winner of the match would be determined by who won the most sets (best of however many).
I do see that if I was just trying to win money over time by playing the game, the amount I'd win on average is equal to the amount I'd lose on average.
But, in the game that I was thinking of, it is only important to have more money than the opponent by the end of the set. I assumed this was the game everyone was talking about, but I just realized it doesn't say that anywhere in the article :(
I guess the missing ingredient to this is entropy.
The thing is, even if we were talking about a game where the goal was to be ahead after 14 games -- the math has still made a true statement: your average score against ThirdParty's strategy is $0, no matter what you do. In such a game, you would be right to argue that expected score does not translate into odds of victory... but you would be wrong to argue that the expected score at the end is anything but $0.
I think this is a general problem with analysis -- a problem of perception. Analysis often leads to very specific conclusions; if someone is thinking about a question, but the analysis answers a slightly different question, there is a tendency not to notice the difference. e.g. here, when the math proves the expected winnings against ThirdParty's strategy is $0, but Questor was reading it as if it was claiming something about an optimal strategy for a game where the goal is to be ahead after 14 turns.
This, of course, is also often done by armchair analysts, which in turn breeds a misconception that the conclusions of a mathematical analysis are unreliable. e.g. how 'theorycraft' became an insult in the Starcraft domain
Oh well. /rant
P.S. I suspect that the best 'unexploitable' strategy for the "be ahead after 14 turns" version of RPS has less than 50% odds of victory. That should please Sirlin, because it means that if you want to have a serious chance to win, you have to join the mind game.
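One way to test that suspicion is a quick Monte Carlo sketch in Python (assumptions: ThirdParty's mix for the "unexploitable" player, an all-rock opponent purely for illustration, the loser pays the winner, and whoever is ahead in money after 14 throws takes the set):

import random

MOVES = ['rock', 'paper', 'scissors']
PAYOFF = {  # payoff to player 1; the loser pays the winner
    ('rock', 'scissors'): 10, ('scissors', 'rock'): -10,
    ('scissors', 'paper'): 3, ('paper', 'scissors'): -3,
    ('paper', 'rock'): 1,     ('rock', 'paper'): -1,
}

def play_set(mix1, mix2, rounds=14):
    # Return player 1's money lead after one set.
    total = 0
    for _ in range(rounds):
        m1 = random.choices(MOVES, weights=[mix1[m] for m in MOVES])[0]
        m2 = random.choices(MOVES, weights=[mix2[m] for m in MOVES])[0]
        total += PAYOFF.get((m1, m2), 0)
    return total

third_party = {'rock': 3, 'paper': 10, 'scissors': 1}   # the unexploitable mix
all_rock    = {'rock': 1, 'paper': 0, 'scissors': 0}    # one illustrative opponent

results = [play_set(third_party, all_rock) for _ in range(100000)]
win_rate = sum(r > 0 for r in results) / len(results)
print(win_rate)  # the expected lead is $0, but the share of sets won need not be 50%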
I guess a good strategy to actually win would be the following:
- First, apply ThirdParty's low-risk-algorithm while trying to analyse your opponent's behaviour.
- If you think you found a pattern*, try to exploit it at times.
- If your opponent gains the lead, try harder and increase risk (especially if time runs out).
- If you gain the lead, apply ThirdParty's algorithm again until you win, or until your opponent catches up again.
Of course your opponent could annoy you by applying ThirdParty's algorithm all the time, but this would just decrease your chance of winning to 50%.
A common misconception in RPS is that playing randomly is unbeatable. Playing truly randomly will leave you in the middle of the pack in any tournament, effectively drawing every game. For the weighted version of RPS, playing weighted randomly gives expected winnings of $0. To truly win at RPS, weighted or not, you have to outsmart your opponent. Check out how Iocane Powder, the winning bot, won the RPS programming contest, for AI Yomi at work. Iocane Powder Explained
But if I play randomly you so will not beat me.
Random's 'weakness' is that it cannot take advantage of weaker, predictable opponents. Say you have a tournament where 80% of opponents are weaker than you and 20% are stronger. By using random you end up in the middle; if you use a nonrandom strategy that beats the 8 weak but loses to the 2 strong, you get ahead.
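As a toy illustration of that point (this has nothing to do with Iocane Powder's actual algorithm; it's just a hypothetical frequency-counting bot), a predictor can exploit a heavily biased opponent, while a purely random player would only break even against the same opponent:

import random
from collections import Counter

COUNTER = {'rock': 'paper', 'paper': 'scissors', 'scissors': 'rock'}

def frequency_bot(opponent_history):
    # Throw whatever beats the opponent's most common move so far.
    if not opponent_history:
        return random.choice(list(COUNTER))
    most_common = Counter(opponent_history).most_common(1)[0][0]
    return COUNTER[most_common]

def biased_opponent():
    # A "weaker opponent" who throws rock far too often.
    return random.choices(['rock', 'paper', 'scissors'], weights=[6, 1, 1])[0]

history, score = [], 0
for _ in range(1000):
    them = biased_opponent()
    mine = frequency_bot(history)
    history.append(them)
    if COUNTER[them] == mine:
        score += 1          # we threw the counter to their move
    elif COUNTER[mine] == them:
        score -= 1          # they threw the counter to ours
print(score)  # strongly positive; a purely random player would average 0 here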
I think most fail to see that this is a game of common misconceptions. For example, say you're playing with a friend. You play rock on the first move and they play scissors, since scissors seem sharp and dangerous and most people think exactly the same way in RPS. They lose, so they figure you feel smart and will play rock again because you think they're playing scissors again, and they decide to play paper to cover your expected rock. Although, if you know they think this way, you'll obviously pick scissors. Or say you start off with paper. They choose rock and lose, so they get a little frustrated (and don't say you wouldn't; everyone would, even you). So they think you'll choose paper again because it just won, and they choose scissors. Same strategy.
Now then, say you lose on the first turn; it's the other way around. Most have a preference for choosing the same move at least 3 times in a row after winning with it the first time, or due to personality. Since most people out there are so-called "laid-back" types who use a lot of slang, so to speak, and act uneducated, they'll get overconfident. You'll be able to see through this and choose their weak spot move. On the other hand, if they know their stuff and play strategies, play the stupid strategy. They think you think you'll change your move on the next turn due to the previous turn, or the present one. So, you play stupid, while you know what they think you'll use. Say you choose scissors and they choose rock. They think that, due to losing, you'll change strategies and play rock, so they'll play paper. If you know this, you'll pick scissors again. They'll get frustrated and catch on, and also play the stupid strategy.
Once you notice this, play the normal strategy, where you'd choose the other move's advantage.
Now then, if there's 3 people or more playing, that's a whole other story...
Hmm actually quite interesting to calculate...
Under the assumption that the opponent uses a randomly chosen distribution for his moveset, chosen fairly out of all possible distributions, the optimal choice is to always throw rock, with an expected payoff of 3/2 each turn.
Which beats Sirlin's suggestion, which on average only gives 67/84 per turn (though that one IS less easy to counter).
Under the assumption that he chooses the counterpicks with the same probability as you throw the moves (so if you use rock 9 out of 10 times, he uses paper 9 out of 10 times), the optimal choice is to throw rock 4/7 of the time and paper 3/7 of the time.
(Note that no matter what, the opponent can never win with rock against this strategy.)
This gives an expected pay-off of 11/7.
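A quick numerical check of that result, under the same assumptions (the opponent counters each of your moves with exactly the probability you play that move, and the loser pays the winner); the function name is just illustrative:

# Expected payoff when the opponent "mirrors" you with counters:
# if I throw rock with probability r, they throw paper with probability r, and so on.
def mirror_counter_ev(r, p, s):
    ev  = r * (-1 * r + 10 * p + 0 * s)   # my rock vs their (paper r, scissors p, rock s)
    ev += p * (0 * r - 3 * p + 1 * s)     # my paper vs their (paper r, scissors p, rock s)
    ev += s * (3 * r + 0 * p - 10 * s)    # my scissors vs their (paper r, scissors p, rock s)
    return ev

print(mirror_counter_ev(4/7, 3/7, 0))   # 1.571..., i.e. 11/7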
Follow-up: I'm pretty sure that the Nash Equilibrium for the 10/3/1 game is to play Rock 30/43, Paper 10/43, and Scissors 3/43 of the time. Expected payoff is then 0.698 if both players use that strategy. Changing from that strategy can drive down the opponent's winnings, but cannot improve your own.
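A quick check, assuming the non-zero-sum reading where only the winner gets paid and the loser loses nothing: against that mix, every pure response earns the same 0.698, which is what makes it an equilibrium.

# Winner is paid, loser loses nothing (the non-zero-sum reading).
mix       = {'rock': 30/43, 'paper': 10/43, 'scissors': 3/43}
win_value = {'rock': 10, 'paper': 1, 'scissors': 3}     # what each move pays when it wins
beats     = {'rock': 'scissors', 'paper': 'rock', 'scissors': 'paper'}

for mine, value in win_value.items():
    ev = value * mix[beats[mine]]   # you only score when the opponent throws the move you beat
    print(mine, round(ev, 3))       # each prints 0.698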
@DeltaDelta
I am pretty sure that in any case where both players use the same strategy then the expected payoff is zero.
Since neither player has any form of advantage over the other in such a situation.
Rody: Not generally speaking, no. Imagine two perfectly equivalent basketball teams playing the exact same strategy against each other; we would still expect each team to be scoring some positive number of points against the other.
Now alternatively, if you assume that points to one count as a loss to the other, or that you only keep track of how far ahead one is from the other, then that's the special type called a "zero sum game". But none of that was specified in the description above.
http://en.wikipedia.org/wiki/Zero_sum_game