Monday, June 13, 2011

Yomi Online Rankings

We're trying out making online Yomi matches best of 3 instead of best of 1 in the free online version at www.fantasystrike.com. You can leave feedback about that here.

In this system, think of taking the opponent's life down to 0 as winning a "round" in a fighting game. So when you press Quickmatch and find an opponent, you play best of 3 "rounds." You can't switch your character between rounds. After someone wins a "game" (meaning they won best of 3 rounds), you can get a rematch. If you rematch, the loser can switch his character, but the winner cannot, as is customary in fighting games. Rematching can be good for your own personal improvement, as it lets you get deeper into the nuances of playing against a particular opponent. If they are exploiting some bad habit of yours, you might never really figure that out if you don't play a longer set. Up to you though!

The main reason our system was 1 round before instead of 2-out-of-3 was just time. If it takes 15 minutes to play Yomi, you'd have to set aside 45 minutes to play a best of 3, which is unreasonably long for the default way to play the game. Looking at the data, an average "round" lasts between 8 and 9 minutes though. This is shorter than we expected. If you figure that half the time a best of 3 will go to 3 rounds and half the time it will end in two rounds, the average for a best of 3 set is 22.5 minutes or less, which is acceptable for us to try. The benefit of more rounds is that you have more time to get a read on the opponent and exploit their weaknesses, so it's a better test of skill.
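The arithmetic behind that estimate can be sketched in a few lines. This assumes, as above, that a best of 3 is equally likely to end in two rounds or in three, and it uses 8 and 9 minutes as the two ends of the observed round length. The function and parameter names are just for illustration.

```python
def expected_set_minutes(round_minutes, p_three_rounds=0.5):
    """Expected length of a best-of-3 set, in minutes.

    p_three_rounds is the assumed chance that the set goes to a third round.
    """
    expected_rounds = 2 * (1 - p_three_rounds) + 3 * p_three_rounds
    return expected_rounds * round_minutes


print(expected_set_minutes(9))  # upper end of the observed round length
print(expected_set_minutes(8))  # lower end
```

With 9-minute rounds this gives 22.5 minutes, and with 8-minute rounds it gives 20, which is where the "22.5 minutes or less" figure comes from.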

Leaderboards

We're resetting the leaderboards now, because of this change. From now on, leaderboards will show people's performance in best-of-3 situations, rather than best-of-1. Congratulations to garcia1000 for taking the top spot on the first season's leaderboard. Garcia's hobby is making statements that sound plausible, but that are actually false. Also he plays Lum and is a poker master in real life. (Those two things are plausible and actually true.)

Stats from Season 1

Here are the stats for ranked games (with mirror matches excluded) by character:

Character   Wins / Games   Win rate
Midori      2238 / 4238    52.81%
Geiger      2358 / 4492    52.49%
DeGrey      2731 / 5257    51.95%
Lum         2202 / 4310    51.09%
Rook        2367 / 4689    50.48%
Grave       2516 / 4987    50.45%
Argagarg    2592 / 5172    50.12%
Jaina       2503 / 5107    49.01%
Valerie     2486 / 5231    47.52%
Setsuki     2924 / 6436    45.43%

Yomi's balance has held up remarkably well, so we're very happy about it. It remains more closely balanced than any known fighting game. What's interesting about these stats is that Midori is often ranked last in tier lists while Grave is often ranked first. Grave doesn't even crack the top half of the chart here though, and Midori is #1. That said, all these percentages are very close anyway.

Now let's look at those stats ranked by popularity instead of by win-rate:

Character   Wins / Games   Win rate
Setsuki     2924 / 6436    45.43%
DeGrey      2731 / 5257    51.95%
Valerie     2486 / 5231    47.52%
Argagarg    2592 / 5172    50.12%
Jaina       2503 / 5107    49.01%
Grave       2516 / 4987    50.45%
Rook        2367 / 4689    50.48%
Geiger      2358 / 4492    52.49%
Lum         2202 / 4310    51.09%
Midori      2238 / 4238    52.81%
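Both tables are the same data sorted two ways. If you want to play with the numbers yourself, here is a small sketch that rebuilds both orderings from the win and game counts listed above (the variable and function names are just for illustration):

```python
# (wins, games) per character, ranked games with mirror matches excluded.
stats = {
    "Midori": (2238, 4238),
    "Geiger": (2358, 4492),
    "DeGrey": (2731, 5257),
    "Lum": (2202, 4310),
    "Rook": (2367, 4689),
    "Grave": (2516, 4987),
    "Argagarg": (2592, 5172),
    "Jaina": (2503, 5107),
    "Valerie": (2486, 5231),
    "Setsuki": (2924, 6436),
}


def win_rate(character):
    wins, games = stats[character]
    return wins / games


# Highest win rate first.
by_win_rate = sorted(stats, key=win_rate, reverse=True)
# Most games played first.
by_popularity = sorted(stats, key=lambda c: stats[c][1], reverse=True)

for name in by_popularity:
    wins, games = stats[name]
    print(f"{name:<10}{wins} / {games}  {win_rate(name):.2%}")
```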

Midori is dead last in popularity even though he was first in win-rate. Lum is second to last here, and he is complicated to play well. Setsuki is considered easy to play and is the most popular character. This is ironic because Setsuki's original purpose was to appeal to Magic: the Gathering players, who could understand her weirdness more readily. Because she can draw a lot of cards if her hand is empty, doing strange things just to throw your cards away is actually good, even if it's counter-intuitive. Setsuki turns out to be a favorite amongst new players though, even ones who aren't from MTG. Apparently people just really love drawing a lot of cards.

Another note here is that players who are not really familiar with asymmetric games such as fighting games often say "but if one character is the best, why doesn't everyone play only that character??" Of course this never happens, as evidenced by: all fighting games, ever. There are a lot of reasons to prefer one character over another. Maybe you think someone is just cool. Maybe you enjoy the game mechanics of that character more. Often what happens is that your own personal skills allow you to be better at "mechanic A" that some character can do than "mechanic B" that the supposed best character does. Your personal win-rate can be higher by choosing the character you're more suited to play, rather than what some tier list claims. And again, these percentages show that ALL characters are capable of winning just fine, so you can play whoever you want. The spread is surprisingly even across all characters.

Matchmaking

When you click the Quickmatch button, you're matched with someone as close in skill to you as possible. The matchmaking system uses an Elo-like system (as is common in chess). Note that it does NOT use the feature of Microsoft's TrueSkill that makes a win later count for much less than a win now. That choice stems from a fundamental disagreement I have with that system. If you win 10 games in a row a year from now against very strong opponents, you deserve to move in the rankings by the exact same amount as if you won 10 games against strong opponents today. You don't have to worry about the system "judging" that it thinks you're terrible and handicapping your gains to conform to its beliefs. Winning is winning and you deserve your points.
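For readers unfamiliar with Elo, here is a minimal sketch of the standard chess-style update. The actual constants and formula used at fantasystrike.com are not given in this post, so the K-factor of 32 and the 400-point scale below are assumptions borrowed from common chess practice. The point it illustrates is that the size of the update depends only on the two ratings and the result, not on how many games you have already played.

```python
def expected_score(rating, opponent_rating):
    """Expected score (between 0 and 1) under the standard Elo model."""
    return 1 / (1 + 10 ** ((opponent_rating - rating) / 400))


def update(rating, opponent_rating, won, k=32):
    """Return the new rating after one game. k is an assumed constant."""
    score = 1.0 if won else 0.0
    return rating + k * (score - expected_score(rating, opponent_rating))


# Beating an equally rated opponent gains k / 2 points,
# whether it is your 1st game or your 1000th.
print(update(1500, 1500, won=True))
```

Because `k` never shrinks with the number of games played, ten wins against strong opponents move you the same distance next year as they would today.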

It's entirely possible that you are pretty good with one character but terrible with another character. Our matchmaking takes that into account. You choose your character before the matchmaking step, so if you are used to rocking everyone with DeGrey, you'll be ranked pretty high with him and face good opponents. If you suck at Geiger though, you can pick Geiger, click Quickmatch, and face other sucky people like you. Another way of putting this is that you can lose all day with Geiger without even affecting your DeGrey ranking.
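One way to picture this is a rating table keyed by player and character together, rather than by player alone. The sketch below only illustrates that idea and is not the site's actual data model: the starting value of 1500, the flat 16-point adjustment (standing in for the real Elo-like update), and all the names are hypothetical.

```python
from collections import defaultdict

START = 1500  # hypothetical starting rating

# One rating per (player, character) pair instead of one per player.
ratings = defaultdict(lambda: START)


def record_result(player, character, won, points=16):
    """Adjust only the rating of the character that was actually played."""
    ratings[(player, character)] += points if won else -points


# Lose five games in a row with Geiger.
for _ in range(5):
    record_result("you", "Geiger", won=False)

print(ratings[("you", "Geiger")])  # dropped
print(ratings[("you", "DeGrey")])  # untouched
```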

You can also still play "custom games" that are outside of the matchmaking system. Here, you can play anyone you want, with settings of your choice, including a "best of 1" game. You can practice here without worrying about ranking (not that you should worry anyway!)

Ranking Up

When you first start playing Yomi at fantasystrike.com, you'll have a rank of "Student 1." Here's a list of the ranks you'll go through:

Student 1 (Flower)
Student 2 (Grasshopper)
Student 3 (Mouse)
Student 4 (Dove) 
Student 5 (Bunny)
Student 6 (Ferret)
Student 7 (Penguin)
Student 8 (Gazelle) 
Student 9 (Dolphin)
Student 10 (Monkey)  

As you win games, you rank up through the student ranks. Losing does NOT subtract anything. This makes the newbie experience a bit more friendly in that you can only get positive feedback, really. During this student phase, you can play any character you want (or multiple characters) and it all contributes to your account's overall Yomi student ranking. The matchmaking system is still in effect here, so if you win a lot you'll be playing better people and if you lose a lot you'll be playing worse people.

After you pass Student 10 (Monkey), you graduate to the Master Ranks. Here they are:

Master 1 (Disciple)
Master 2 (Journeyman)
Master 3 (Adept)
Master 4 (Veteran) 
Master 5 (Expert)
Master 6 (Virtuoso)
Master 7 (Hero)
Master 8 (Champion) 
Master 9 (Paragon)
Master 10 (Legend)
Grandmaster

These ranks work differently than the student ranks. First, they are separated by character. For example, you could play Rook and Valerie during the student phase, and you'd reach Master 1 with ALL your characters. If you continued to play only Rook and Valerie, your Rook and Valerie will reach higher Master levels but your other characters will remain at Master 1 level.

Next, these ranks are not as forgiving as the student ranks. Losing really does make you lose points here. That said, you will still probably progress at least a bit through these ranks, even if you're a bad player. Imagine that in reality, you're in the 40th percentile of players, meaning that you're only better than 40% of the people, and most players are better than you. The ranking system is tuned so that your Master Level will eventually settle at about Master Level 4, after you play enough games. You'll still have the experience of progressing, but the ladder is a skill-based measure in the end, and it converges to your Elo score. So just to be clear, that means the top players are there because of their Elo ranking (their ability to win) and not simply because they played more.

We will probably reset the ladder each season, and keep records of winners of previous seasons. We will also probably hide names in the leaderboard that are inactive for too long. If you're top, you should actually play instead of retiring on a high number.

Tournaments

Player-run tournaments are highly encouraged, and you can get involved with them on my forums or on boardgamegeek.com. Hopefully you can also start a local tournament scene in your area. Feel free to practice online to learn the top strats first, then crush your friends locally.

Enjoy!

Reader Comments (21)

I wonder if stats for ranked games by character are the same for masters games?
P.S. Would love to see the same statistics for Puzzle Strike!

June 14, 2011 | Unregistered Commenter st8ic

I've never seen a fighting game whose win rate is virtually the inverse of its popularity list before. However you slice this data, the game is pretty balanced, although it could mean that some characters are way more appealing to new players, or that practicing against a certain character is a big component of success. The one exception is DeGrey, who should apparently be featured on your box art because he is your Ryu.

June 14, 2011 | Unregistered Commenter Robyrt

A tier list is not the same as a list of win rates.

June 14, 2011 | Unregistered Commenter James Crom

That's true, but I'm not sure how that's relevant to anything. When I referred to tier lists, I was referring to tier lists (not shown here). When I referred to win rates, I was referring to win rates.

June 14, 2011 | Registered Commenter Sirlin

I wonder if part of the reason for Setsuki being the worst-ranked character in win% is because she's the most popular character (and thus the most likely to be played by low-skilled newbies) and not because she's inherently lower tier. Does the data hold true if you only look at matches between Master-ranked players, for example? Conversely, might characters like Geiger be disproportionately unpopular among non-Master players?

June 14, 2011 | Unregistered Commenter Jen Wong

Alternate theory on why Setsuki is played a lot: perhaps people like aggressive characters in Yomi.

Setsuki is by far the most aggressive of the base set. The second most popular character is the character that gains the next-least from blocking. The third most popular character is a character that looks aggressive at first (biggest combo potential in the game), AND she's a hot chick (which is sure to increase her popularity).

Meanwhile, at the bottom of the popularity list there's Midori (defensive character if ever there was one), Lum (who can be played aggressively I guess, but garciaLum tends to involve blocking for infinity until you win with 4 Poker Flourishes in one turn or something equally absurd), and Geiger (which is kinda surprising, since every newbie I've taught the game to has taken to him quickly, and he's fun to play, and more intuitive than, say, Rook or Grave).

June 14, 2011 | Unregistered Commenter Obscura

http://dl.dropbox.com/u/2187900/YomiLinearRegression.jpg

Maybe ten data points is not enough data points

June 15, 2011 | Unregistered Commenter garcia1000

If all the characters are very well to perfectly balanced (as in, all statistically indistinguishable from having a perfect 50% match win expectancy against the field), it would make sense that more popular characters have a lower average win percentage. You would expect niche characters to be played more by people who master them, and popular characters to have a disproportionate number of newbies playing them.

I need to dust off some of my statistics manuals tonight to make sure, but I expect if you do an analysis on those numbers you'll see very few of them that reject the null hypothesis (that the character has no advantage, i.e. is 50% against the field) with any sort of statistical significance, and you'll only see significance on a difference of means test between the top and bottom of the range, if any.

June 15, 2011 | Unregistered Commenter DarkAngel

I think it will be very interesting to see how the cursed vs non cursed matchups come out...

June 15, 2011 | Unregistered Commenter Joe W

Joe W... what? The cursed vs. non-cursed matchups will all be 10-0 in favor of non-cursed.

That's kind of the point of the cursed cards...

June 16, 2011 | Unregistered Commenter Obscura

I just wanted to say that I read the matchmaking and ranking sections with HUGE admiration Sirlin, as I've not seen you talk much about these things before. But you've hit upon almost EVERY key feature that's required to make it work well, and that virtually no games out there manage to do properly. Fantastic! Special kudos for pointing out the flaw in trueskill as well, which I've rarely seen anyone else discuss or even be aware of, but it's something I've gone on about in the past too.

June 17, 2011 | Unregistered Commenter Rik Newman (Remy77077)

Thanks Rik, glad you approve. ;)

June 17, 2011 | Registered Commenter Sirlin

@ Obscura ,

I'm not quite so sure that it will be - Certainly the cursed cards give a disadvantage - but I do not think they will be as detrimental as 10-0. I think the results of cursed vs non cursed match-ups over time might say something about the balance of the mechanics. Certainly the card abilities are strong - but I've had several wins that came down more to just somehow beating the opponents with pure rock paper scissors play.

I also think that perhaps some of the cursed cards may have a more difficult time than others - I think it will be interesting to see if that holds true. I expect that there should be more of a variance between the cursed cards themselves - and if there isn't or if the cursed vs regular match-ups are more balanced than might appear - what does that say about the game mechanics / design?

June 19, 2011 | Unregistered Commenter Joe W

Joe W:
That's a very interesting observation. Taken to the extreme, you are kind of arguing that if the cursed cards can play 5-5 vs non-cursed, then the critics are right: the game is nothing but Rock Paper Scissors.

On the other hand, the more lopsided cursed vs non-cursed play is, the more impressive the balance of the game with non-cursed characters only.

I should point out that I too expect cursed characters to lose by quite a bit.

June 20, 2011 | Unregistered Commenter Falke

I find that in my group of players Setsuki is very popular because of her card draw mechanic. This is true for the veteran player and the beginner alike. Playing a lot of card games, especially competitive L5R, shows you that card draw just wins you games. Combine that with Bag of Tricks and you've got yourself some really powerful repetitive combos and options.

Also being a hot ninja girl never hurt. Go inverted beaver kick!

-Robert

June 21, 2011 | Unregistered Commenter Robert Fulkerson

Speaking as a casual Yomi player on fantasystrike, I absolutely play Setsuki because of my MTG background. I love drawing cards, and I love how Setsuki makes that possible.

I probably will never invest enough time into Yomi to be good, though, so I apologize for dragging down ninja chick's win rate.

June 21, 2011 | Unregistered Commenter two_eyes

Really good matchmaking and ranking system. There's so many games that have horrible systems for those things and really need to copy this.

June 29, 2011 | Unregistered Commenter riotofWords

Thanks riotofWords!

June 29, 2011 | Registered Commenter Sirlin

Just a thought on this (I've not seen anything more up to date that explains Yomi's matchmaking or ranking!) - Can people view/request their 'true ELO score' at any point? Obviously people at the top of the chart are effectively seeing it, but, for example, two Master Level 4's could have quite different ELO I could imagine if they've played very different numbers of games. I could be wrong, just my interpretation of what's written here. Of course I'm not prescribing what the 'best' solution here should be. I think it could be completely valid to always hide this score anyway.

Also, will I ever want to move past Student 9 (Dolphin) ? ;-) (Dolphin fan here).

You'll reach Master 1 because you can't even go down in the student ranks. So everyone gets to the master ranks eventually, at which point you'll then end up in your true elo position. For example, if you are the worst player in the world, you'll be at Master 1 forever (or until you actually improve, that is). Two master 4s could have different elo scores if one had leveled out there while the other was still climbing and had not yet reached his true place at Master 7 or whatever. You can kind of tell when you reach your actual elo by how many rank points you win / lose for each match, though I don't have the details handy right now.

May 15, 2012 | Registered Commenter Sirlin
Comment in the forums
You can post about this article at www.fantasystrike.com.