By: wspaniel, William Spaniel
Dec 09 2009 4:02pm


Months ago and on a different website, I said that Magic Online gave us the ability to statistically track the best cards and strategies for the formats available there. This article marks the first rigorous attempt to statistically model a limited format. I tracked thousands of cards played in Zendikar limited and recorded whether each one contributed to a win or a loss. Over the next five days, I will be releasing the results for all the commons. Unlike traditional analysis, the data will not be a matter of opinion and hearsay; they will be of pure observation. I will not claim cards are good because I said so; they will be good because they have won actual games on Magic Online. (However, I will include a brief explanation for why every card appears where it does, based on my in-game observations.) The results will surprise you and may make you reconsider everything from pick orders to some of the very basic conventional wisdom of limited formats.

Before I begin with the results, I must detail the methodology of my experiment. (If this does not interest you, feel free to skip to the list of cards.) To begin, I assume that skill, luck, and the quality of a player's deck determine who wins any particular confrontation. While undoubtedly skill matters, this study is focused on the luck and card quality factors. Players actually have a great deal of control over both of these, as a poorly-constructed deck will win less often than a well-constructed one. From this, we can conclude that some cards contribute to wins more frequently than others. If an average card ever reached play in a game, we would expect its controller to have only won that game around 50% of the time. But if a truly exceptional card reached play, we would expect its controller to have won upwards of 70% of the time.

Watching replays of Pro Tour San Diego qualifiers on Magic Online (and carefully avoiding the qualifier in which the system malfunctioned and everyone played a 140 card deck—yes, this actually happened), I recorded the results of more than a thousand players. Every time a card hit play, I would record it as either a win or a loss, depending on what ultimately happened in that game. If the card reached play multiple times (perhaps because of a Grim Discovery), it only counted once. But if a player cast multiples of a single card, I counted that card multiple times.
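The counting rule above can be sketched in a few lines of Python. This is only an illustration, not the tool used for the study: the `tally` function, the `copy_id` field, and the sample log are all hypothetical names and data invented for the example.

```python
from collections import defaultdict

def tally(games):
    """Count wins and losses per card name.

    `games` is an iterable of (won, permanents) pairs, one per player per game.
    `permanents` lists (card_name, copy_id) for every time a card hit play;
    the same copy_id appearing twice means one physical card was replayed.
    """
    record = defaultdict(lambda: [0, 0])  # card name -> [wins, losses]
    for won, permanents in games:
        # A set drops repeat plays of one physical copy,
        # but keeps distinct copies of the same card as separate entries.
        for card_name, _copy_id in set(permanents):
            record[card_name][0 if won else 1] += 1
    return {name: tuple(wl) for name, wl in record.items()}

# Hypothetical log, not data from the study.
games = [
    (True,  [("Steppe Lynx", 1), ("Steppe Lynx", 2), ("Kor Skyfisher", 1)]),
    (False, [("Nimbus Wings", 1), ("Kor Skyfisher", 1), ("Kor Skyfisher", 1)]),
]
print(tally(games))
```

In this made-up log, the two copies of Steppe Lynx each earn a win, while the Kor Skyfisher that was replayed in the second game is charged with a single loss.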

Such a large number of observations was necessary to reduce the play skill bias that would have shown up in a smaller-n study. It also shrinks the margins of error, allowing for better hypothesis testing, which I ran at 90%, 95%, and 99% confidence.
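To see how the margin of error shrinks as observations pile up, here is a rough sketch. It assumes the usual normal approximation for a proportion, which the article does not say it used; `margin_of_error` and its parameters are names made up for the example, and `z=1.96` corresponds to 95% confidence.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Half-width of a normal-approximation confidence interval for a win rate."""
    return z * math.sqrt(p * (1 - p) / n)

# Margin around a 50% win rate for a few sample sizes.
for n in (20, 100, 500):
    print(f"n={n:4d}: 50% +/- {100 * margin_of_error(n):.1f} points")
```

Under this approximation, a card seen 20 times carries a margin of roughly 22 points either way, while one seen 500 times is down to about 4.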

For those of you unfamiliar with hypothesis testing, here is a brief explanation for what each of those means:

90% Confidence: When I say we can be 90% confident that a card positively contributes to victories, it means that if the card had no impact, there would be only a 10% chance of the data coming back this skewed through pure luck. While the odds of being wrong here are only 1/10, we should be very skeptical of these results as statisticians. Generally, it is only a good idea to accept these results if we have a good theory behind them. For example, I would accept the result for Burst Lightning, a quality removal spell, as being true, but I would cast doubt on whether Blood Seeker was actually affecting things.

95% Confidence: This is the gold standard of statistics. When a card meets 95% confidence, the likelihood of getting data this extreme from a merely average card is only 1/20. At this point, it is a good idea to start thinking of theories to justify the results if you do not have one already.

99% Confidence: While rare (there were only four in this study), a 99% confidence virtually guarantees that a given card has an impact on the match: a card with no impact would produce a result this extreme only 1 time in 100. You should pay careful attention to these.
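For readers who want to run this kind of test themselves, here is a minimal sketch. It assumes an exact two-sided binomial test against a 50% win rate, which may not be the exact procedure behind the published numbers; `two_sided_p` and `stars` are hypothetical helper names.

```python
from math import comb

def two_sided_p(wins, losses):
    """Exact two-sided binomial p-value against a 50% win rate."""
    n = wins + losses
    k = max(wins, losses)
    # Probability of a record at least this lopsided in one direction...
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    # ...doubled, because a fair coin is symmetric; capped at 1.
    return min(1.0, 2 * tail)

def stars(p_value):
    """Map a p-value to the asterisk notation used in the card lists."""
    if p_value < 0.01:
        return "***"
    if p_value < 0.05:
        return "**"
    if p_value < 0.10:
        return "*"
    return ""

# Hypothetical 60-40 record, not a card from the study.
p = two_sided_p(60, 40)
print(f"p = {p:.4f} {stars(p)}")
```

The thresholds 0.10, 0.05, and 0.01 are simply one minus the three confidence levels described above.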

Just because a card does not show up as significant does not mean you should not care about the results. But you should not treat them as gospel, either. The best analogy I can draw is to that of a baseball team. It is possible that your star hitter goes through a minor slump at the beginning of the season and an average player goes on a torrid streak at the same time. That does not mean the average player is better than the star; it just means he was better during that period of observation. So don’t be surprised if the study ranks an average card higher than one you perceive as a top pick. My card-by-card commentary will help qualitatively decide whether this was just statistical coincidence or if it might be part of a larger trend.

Additionally, just because a card is sub-50% does not mean you should automatically stop playing it in all of your decks. Going back to the baseball analogy, a team cannot field nine players with batting averages all over .300. But it can maximize its performance by putting its best players in the lineup. So if you need to play Vampire’s Bite to have enough black cards to justify running Sorin Markov, go right ahead; but if Vampire Lacerator is floating around in your sideboard, the data indicate you should swap out the Vampire’s Bite.

On to the cards. Today is white. Let’s begin with a list of the cards ordered by their win percentages:

Steppe Lynx: 68%**
Cliff Threader: 60%
Kor Skyfisher: 60%
Journey to Nowhere: 56%
Kor Sanctifiers: 55%
Ondu Cleric: 55%
Makindi Shieldmate: 50%
Kor Hookmaster: 48%
Kor Cartographer: 45%
Kor Outfitter: 41%
Nimbus Wings: 30%**
Bold Defense: Ineligible
Caravan Hurda: Ineligible
Narrow Escape: Ineligible
Noble Vestige: Ineligible
Pillarfield Ox: Ineligible
Shieldmate’s Blessing: Ineligible
Sunspring Expedition: Ineligible

*Statistic is significant at 90%.
**Statistic is significant at 95%.
***Statistic is significant at 99%.
A card is ineligible when there are fewer than 20 observations.
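The eligibility rule is easy to express in code. The sketch below is an illustration only: `format_row` and `MIN_OBSERVATIONS` are hypothetical names, and the significance marker is passed in by hand rather than computed.

```python
MIN_OBSERVATIONS = 20  # threshold stated in the legend above

def format_row(card, wins, losses, marker=""):
    """Render one line of the ranking list from a win-loss record."""
    games = wins + losses
    if games < MIN_OBSERVATIONS:
        # Too few observations: show the raw record instead of a percentage.
        return f"{card}: Ineligible ({wins}-{losses})"
    return f"{card}: {round(100 * wins / games)}%{marker}"

print(format_row("Pillarfield Ox", 11, 5))   # record reported later in the article
print(format_row("Example Card", 30, 20))    # hypothetical record
```

With only 16 observations, Pillarfield Ox falls under the cutoff and is listed by its record rather than by a win percentage.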


Now let’s run down the cards:

Steppe Lynx: 68%**
Before reading this article, I bet you believed that Journey to Nowhere is the best white common available. That may be the case (we’ll see why Journey to Nowhere is a full 12 percentage points lower in a moment), but Steppe Lynx is here because it really, really wins games. No one-drop in this format applies greater pressure than the Lynx, and tricks with fetchlands and Harrow do some ridiculous things. Because there aren’t many creatures in the format that can block it early, your opponent ends up having to spend a removal spell like Burst Lightning or Disfigure early on, which prevents them from using those spells on bigger beaters later on. Add in the fact that Steppe Lynx remains relevant in the later game, and you have a great card. Its one flaw, that it is useless on defense, hardly seems to matter.

Later in the week, we will see that this is not a coincidence: the entire cycle of common landfall creatures has favorable ratings, three of which are statistically significant.

Cliff Threader: 60%
Blocking does not work very well in this format. We’ve known that since the prerelease. But it might come as a surprise that a 2/1 creature with mountainwalk does so well for itself. Cliff Threader does well for a few reasons. First, and most obvious, is that anyone sideboarding this in will have a 2/1 unblockable creature whenever they cast it in games two and three. Two untouchable damage per turn starting on turn two puts your opponent on an extremely short clock and requires a removal spell right away, just like Steppe Lynx. However, you should probably be maindecking Cliff Threader, as red was the most popular color in the games I tracked, and was frequently featured as at least a splash color for access to Burst Lightning and Magma Rift. And, in the worst case scenario, the Threader is a 2/1 for two mana. That’s okay in Zendikar, especially since you don’t have to worry about Plated Geopede blocking it.

Kor Skyfisher: 60%
Yes, the top three cards for white are all creatures costing two mana or less. There’s always been some question as to whether the risk of lost tempo keeps Kor Skyfisher from being a great creature. These data should help answer that question. Playing Skyfisher on turn two on the draw with no permanent to pick up is a horrible idea. But outside of that, you can’t go wrong. Any time you can retrigger an important 187 ability, you are very likely to win the game. Meanwhile, a turn one Adventuring Gear followed by Kor Skyfisher isn’t a horrible play either and puts your opponent on a limited clock.

On a side note, as we will see tomorrow, Kor Skyfisher has one not-so-obvious consequence: it makes Paralyzing Grasp awful.

Journey to Nowhere: 56%
If I sat down at a draft today and opened Journey to Nowhere along with Steppe Lynx, Cliff Threader, or Kor Skyfisher, I would not hesitate to take Journey first. The reason it ranks so low here, though, is a little humorous: people are terrible at playing it. In the 37 losses I observed Journey to Nowhere being a part of, an all-too-common theme was something like this:

Turn 1: Plains, Go.
Turn 1: Swamp, Vampire Lacerator, Go.
Turn 2: Plains, Journey to Nowhere, Go.

Turn 4: Swamp, Crypt Ripper, Attack, Go.

Turn 7: &@*# and your Crypt Ripper. You didn’t deserve to win that game. If I had just drawn one of my six removal spells, I would have won for sure. Lucksack.

Hey, Timothy Scrubsville, you did draw a removal spell. It was Journey to Nowhere! Your answer was in your opening hand—you were foolish enough to cast it on an average creature. No one should be surprised when they make such a boneheaded maneuver and lose, but it happens far too frequently. Played optimally, Journey to Nowhere is still white’s obvious MVP. Don’t let bad players change your mind about this.

On the bright side, however, I will be releasing a complete list of cards on Friday. This should assist players in making judgments whether a creature is worthy of a removal spell or not. Hint: Crypt Ripper is, Vampire Lacerator isn’t.

Kor Sanctifiers: 55%
This may seem like fairly obvious analysis, but I will say it anyway: when Kor Sanctifiers is kicked, its controller usually wins. When it is not, its controller usually loses. The good news is that there are a couple of great cards in this format that Kor Sanctifiers can hit: Journey to Nowhere (on Crypt Ripper, not Vampire Lacerator) and Soul Stair Expedition (take my word for it until we get to black). In addition, it also nails a bunch of average to suboptimal cards (Paralyzing Grasp, Adventuring Gear, Explorer’s Scope, Stonework Puma, and all of the Expeditions). Part of the trick is knowing when to hold Kor Sanctifiers and when to play it just for the body. In my estimation, people generally cast it too hastily. Nail down the right mix and Kor Sanctifiers will do better than 55% for you.

Ondu Cleric: 55%
Ondu Cleric gets buoyed by a strange fact: it is virtually exclusively played in heavy ally decks. Decks lucky enough to receive a bunch of Umara Raptors, Nimana Sell-Swords, and Tuktuk Grunts tend to do pretty well on their own. Thus, the 55% statistic is a little misleading, as the other cards from this group are doing the bulk of the work, and Ondu Cleric is just riding their coattails. In comparison, whenever I saw Ondu Cleric played on a battlefield devoid of allies, its controller promptly lost most of the time. Consequently, I do not recommend you run this guy unless he fits your deck. Compare to Makindi Shieldmate:

Makindi Shieldmate: 50%
Like Ondu Cleric, Makindi Shieldmate does exceedingly well in ally decks and very poorly outside of them. However, because the Shieldmate appears to do more than the Cleric, it gets played more frequently outside of its natural home. Players who attempt this are routinely punished for it. A wall of its size isn’t horrible for this format—it’s just not that great either. As a result, Makindi Shieldmate appears to do worse statistically than Ondu Cleric even though it is probably better in the aggregate. Don’t be caught in this trap.

Kor Hookmaster: 48%
Kor Hookmaster is about as average a card as you can get. Its body is average and not compelling; its ability is average and not compelling; and consequently its win percentage is average and not compelling. Some of the time you will lock down a key attacker and buy yourself some extra time. Other times you will clear out a blocker for a couple turns, giving you the ability to swing for the win. The rest of the time you will just get a 2/2 for three mana, and frequently you won’t even be able to tap anything with it. Thus, Kor Hookmaster earns an unimpressive 48%.

Kor Cartographer: 45%
This might come as a surprise given Zendikar’s theme and how good the common cycle of landfall creatures is statistically. However, in practice, Kor Cartographer does not contribute much. Don’t get me wrong: it is nice to swing for an extra couple of points of damage when combined with Steppe Lynx and its ilk. But most of the time you get a 2/2 with an extra Plains. Unfortunately, mana acceleration doesn’t come at a premium when it puts your fifth land into play. Hence Kor Cartographer ranks lower than you might have thought.

Kor Outfitter: 41%
Out of all white’s cards that qualified for this ranking system, Kor Outfitter does the second-worst, and it’s lucky it didn’t fall to an even lower percentage. In a good number of the Outfitter’s victories, its controller would have been equally well off with a 2/2 with no abilities or perhaps even better with a vanilla 2/2 for 1W. Yes, you can save a mana here and there by immediately equipping your Adventuring Gear or Explorer’s Scope, but that does not compensate you for an awkwardly priced 2/2.

Nimbus Wings: 30%**
Despite conventional wisdom, not all creature enchantments fared poorly in this study. Nimbus Wings wasn’t one of those. You can attribute the 30% to the natural risks that creature enchantments leave you vulnerable to. Other creature enchantments get away with it because they afford some sort of protection to the creature they enchant. Here, Nimbus Wings barely provides a boost to toughness, and somewhat humorously leaves you open to Oran-Rief Recluse. (Yes, I witnessed this happen.) Thus, it gives me great pleasure to say Nimbus Wings is bad to a statistically significant extent.

Bold Defense: Ineligible (8-1**)
While only appearing in nine games (a sign that people don’t want to put the card in their decks), Bold Defense won in eight games and lost only one, which actually makes it statistically significant to 95% confidence. On one hand, it acts as a bit of an Overrun in the late game, so it should not be surprising that Bold Defense has a good win percentage. On the other, I wonder how many times it just sat in its controller’s hand while they got pummeled by opposing creatures without the mana to ever cast the thing. As such, I want more data on Bold Defense before I pass final judgment on it.
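As a worked check on the 8-1 record, here is the arithmetic under one assumption: that the test is an exact two-sided binomial test against a 50% win rate. The article does not say which test was used, so treat this as a sketch rather than a reconstruction of the original calculation.

```latex
P(X \ge 8 \mid n = 9,\ p = 0.5)
  = \frac{\binom{9}{8} + \binom{9}{9}}{2^{9}}
  = \frac{9 + 1}{512}
  \approx 0.0195,
\qquad
p_{\text{two-sided}} = 2 \times \frac{10}{512} \approx 0.039 < 0.05
```

Under that assumption the p-value falls below 0.05 but not below 0.01, which is consistent with the two asterisks next to the record.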

Caravan Hurda: Ineligible (1-4)
Step one: spend five mana. Step two: put a creature that is functionally a wall into play. Step three: gain one life whenever it actually engages in combat. I don’t think that Caravan Hurda’s 20% mark in limited action is a coincidence, and I always wondered why its controller ever decided to put it in his or her deck in the first place.

Narrow Escape: Ineligible (5-6)
Three of these five wins came from playing Journey to Nowhere, stacking the ability, casting Narrow Escape, removing an opposing creature from the game permanently, and then recasting Journey to Nowhere. Another picked up a creature with a Paralyzing Grasp on it (hint: Paralyzing Grasp is terrible). The other saved a creature from dying in combat. Most of the six losses saw a player desperately clinging on for another turn by grabbing an irrelevant permanent just for the extra life. Thus, Narrow Escape is either a little awesome or very bad. I’d like to see some more data on it.

Noble Vestige: Ineligible (0-2)
Noble Vestige was played twice, both to losing efforts. Three mana for one damage in the air a turn or a limited healer effect doesn’t go far in any format. I don’t want to see any more data for this one, nor do I think I will be able to find people silly enough to run it in their decks to give me that data.

Pillarfield Ox: Ineligible (11-5)
Somehow, Pillarfield Ox managed to rake in a 69% clip in limited action. I do not have an explanation for why that is the case other than random chance, as it did not do much in any of the games I watched. Treat this one as coincidence until I pull in more data.

Shieldmate’s Blessing: Ineligible (3-3)
Was Healing Salve too good? (Keep in mind it came from a cycle that included Ancestral Recall, Dark Ritual, and Lightning Bolt.) While Shieldmate’s Blessing came out at an even 3-3 over six games, I’d still prefer a slightly below average creature like Kor Hookmaster or Kor Cartographer over a poor combat trick like this one.

Sunspring Expedition: Ineligible (1-4)
How much life do you need to gain from a single card to make it worthwhile in limited? Eight is not enough, apparently. Even when Sunspring Expedition went off, it barely affected the game. And the times when it didn’t were absolutely miserable for its controller. Try applying pressure with Steppe Lynx instead.

That wraps up all of white’s commons. Although there are only five cards above 50% that meet the requirements to be eligible for the study, four of them are quite good, and consequently white is underappreciated in this format.

Next time, we will run down blue and answer some pressing questions such as:

Is Reckless Scholar any good in a format that requires playing lands, not discarding them?
Do 0/4s for one mana do anything?
Was someone silly enough to play Spell Pierce?
And finally, just how bad is Paralyzing Grasp, an alleged “removal” spell?

See you then.

William Spaniel

williamspaniel@gmail.com 

34 Comments

Pretty cool by Anonymous (not verified) at Wed, 12/09/2009 - 16:29

There is one bit of a flaw in that you didn't mention what the format is. I'm guessing the PTQ was sealed deck? It does skew the results, because a red splash is MUCH more likely in sealed than it is in draft; I've rarely seen anyone actually splash a third color in ZZZ, so mountains shouldn't be as prevalent in draft as in sealed, making the cliff thread slightly worse.

Also, it's hearsay, not heresy. Probably just a typo that the inline corrector didn't spot because heresy is also a word, but meanings are worlds apart.

Besides that, I'd like to commend you for your dedication. I'm curious to see the rest of the data, even though I think the information might not be all that useful. It's plenty interesting, at least, and the analysis is pretty good.

You're right in that this by wspaniel at Wed, 12/09/2009 - 16:55

You're right in that this data is geared towards sealed and not draft. While I do think it has a lot of external validity towards that format as well, you should be looking at these results as from sealed deck tournaments--because that is where they are from.

Bah that 'hearsay' is heresy by Paul Leicht at Wed, 12/09/2009 - 19:16

Bah that 'hearsay' is heresy :p

Glad to see you writing by psymunn (not verified) at Wed, 12/09/2009 - 17:05

Glad to see you writing again!!! Interesting read, and looking forward to the rest. Can we anticipate more articles than simply these 5?

Depends on the feedback. by wspaniel at Wed, 12/09/2009 - 19:31

Depends on the feedback. Compiling the data is extremely costly, and I don't have much to write about outside of that.

OH SNAP!! William Spaniel!!! by ShardFenix at Wed, 12/09/2009 - 17:40

OH SNAP!! William Spaniel!!! I remember I used to read you all the time back on "that other site". Glad to see you writing again. Great article btw. Though I was expecting more of the curmudgeon-ness.

Sorry to disappoint. by wspaniel at Wed, 12/09/2009 - 19:31

Sorry to disappoint.

I am really not a huge fan of by Paul Leicht at Wed, 12/09/2009 - 19:19

I am really not a huge fan of statistical analysis particularly in small numbers (but really not at all) but this was an interesting read. You had some typos but over all it was educational. I haven't played ANY Zendikar limited and I don't play standard or block so my understanding of the value of some of the cards you covered is different than what you show here. I feel like I learned something. Nice article and 5x fireball at you.

Small-n studies are bad, by wspaniel at Wed, 12/09/2009 - 19:33

Small-n studies are bad, which is why I indicated statistically significant results where applicable. You can trust those (with differing degrees of confidence).

Really enjoyed reading this, by Joyd (not verified) at Wed, 12/09/2009 - 19:32

Really enjoyed reading this, and definitely looking forward to the rest of the series. (My guess as to why there's a correlation between Blood Seeker hitting play and winning is that if you're playing Blood Seeker, you're probably pretty heavy into black, because nobody splashes for Blood Seeker. If you're pretty heavy into black, you're in a good spot because black is leaps and bounds better than other colors in Zen limited. (Or is it?))

Not to spoil anything, by wspaniel at Wed, 12/09/2009 - 19:36

Not to spoil anything, but...

Plains: 52%
Island: 48%
Swamp: 51%
Mountain: 51%
Forest: 46%

(None is statistically significant.)

So, no, Blood Seeker (53%) isn't doing well just because it is a black card. It's not statistically significant either, so I just chalk it up to randomness.

Ah, I had misinterpreted the by Joyd (not verified) at Thu, 12/10/2009 - 15:32

Ah, I had misinterpreted the article as saying that you -had- seen statistically significant results on Blood Seeker, since you mentioned it in the 90% confidence section. Never mind me.

It's too bad you can't see by ghweiss at Wed, 12/09/2009 - 20:08

It's too bad you can't see the players' hands. Ideally we'd want to know the correlation between DRAWING a particular card and winning the game, not just casting it. Case in point: Bold Defense basically says "win the game" when you pay 7 mana, so it's generally worth holding until that point. This skews the data though, because you can't tell how many games were lost with Bold Defense stuck in hand.

Thanks for doing this, though. Coupled with the infamous "8-4 Top 10 List" from 2 weeks ago, I'm very curious as to how an ideal ranking system could be established.

Ideally I could see every by wspaniel at Wed, 12/09/2009 - 20:24

Ideally I could see every card in their deck and do an extremely large-n. But MTGO won't let me. Unless Wizards wants to hire a statistician for the summer...

Could you provide a link to the "8-4 Top 10 List" article?

The list was sent out with by spg at Thu, 12/10/2009 - 15:55

The list was sent out with the monthly player rewards newsletter:

Magic Online Factoid
Top 10 first picks of 8-4 ZEN draft winners:

1. Hideous End
2. Burst Lightning
3. Vampire Nighthawk
4. Journey to Nowhere
5. Marsh Casualties
6. Disfigure
7. Plated Geopede
8. Trusty Machete
9. Malakir Bloodwitch
10. Kor Skyfisher

hmm, so don't care about by Anonymous (not verified) at Wed, 12/09/2009 - 20:49

hmm, so don't care about sealed, totally different format. In any case alot of what's being tracked here is what GOOD players play vs what BAD players play, so for example all players know to play journey in any deck whenever they can support the one white mana, alot of BAD players don't understand how good and solid steppe lynx is, so he winds up in a winner's deck more often than journey is able to create an upset of a bad player over a better player. The data isn't useless of course its just alot more complicated than simply being able to say this card is better than that, ill still take journey over lynx in any draft, if for no other reason steppe is more likely to wheel, also journey is just flat out better.

Also if you're including post board games, where cliff threader wasnt in the main youre going to get skewed results, since his text will read "this creature is unblockable" for his 2 mana cost, if you only include game 1's or games where he was in the maindeck you'd get alot different results, and again different conclusions for draft.

This analysis also ignores by Anonymous (not verified) at Wed, 12/09/2009 - 20:53

This analysis also ignores any biasing done during deck-building. For example, some allies might show good impact on win rates because they're ONLY played in ally decks, but are terrible cards otherwise.

my comment 2 comments ago may by Anonymous (not verified) at Wed, 12/09/2009 - 21:29

my comment 2 comments ago may seem really negative, but I'm actually thrilled to see real statistics applied to magic limited, just wish it were more targeted to draft.

Awesome! by Felorin at Thu, 12/10/2009 - 01:47

I love this article! I like analyzing things mathematically myself, but rarely have the time to do so in depth on anything. I really enjoy reading results like this. It would be great to see some numbers on drafts too if that's ever possible - it would give some interesting insights into what types of cards get better & get worse between sealed and draft. Also it'd give more insights into which cards to pick in drafts.

But if you can only just finish giving us all this information on sealed, please do. I'll look forward to reading the rest of it!

I would like to be able to do by wspaniel at Thu, 12/10/2009 - 06:12

I would like to be able to do it with drafts, but MTGO doesn't give me access to enough replays. I can only lift four games off of each premiere event. (I can't track every single game of the top eight, which would be at least 14 games, because my sample would be biased.) So for me to do a full series on drafts, I would need to track hundreds of events. That would take an awfully long time.

Possible solutions:

1) Email Wizards and tell them that you want the draft replays to be open to the public.

2) Email Wizards and tell them you want them to commission a study at Pro Tour San Diego. I live in the greater San Diego area, and it would be pretty easy to pull something like this off. But I would need Wizards to be on board.

Yeah, the introductory bit was slightly misleading... by Jenesis (not verified) at Thu, 12/10/2009 - 02:58

"If I sat down at a draft today and opened Journey to Nowhere..." but the statistics are based on Sealed. A little saddened by the low score on Hookmaster, seeing as it's being considered even in Standard. An interesting read, nevertheless, although you can't really judge cards in a vacuum. Eagerly awaiting the other four (five?) parts in the series.

wait, so steppe lynx is good by Anonymous (not verified) at Thu, 12/10/2009 - 03:06

wait, so steppe lynx is good in sealed? who knew!

This was an amazing Article! by Luke (not verified) at Thu, 12/10/2009 - 09:39

This was an amazing Article! I really appreciate you taking the time to do this and cannot wait to read the rest!!! Thank you, Luke

i liked this article, i mean by Anonymous (not verified) at Thu, 12/10/2009 - 10:35

i liked this article, i mean if you read the other articles about zen commons, you already deducted the majority of this content, but its always nice to see a new evaluation put forth.

keep it up

I appreciate the effort that by srg (not verified) at Thu, 12/10/2009 - 12:21

I appreciate the effort that went into gathering this data, and it was a good read even if, as you point out, a lot of the results can be chalked up to randomness.

As a math major, it's good to see someone writing that actually understands statistics beyond dividing things by the sample size. People love to throw around percentages in this game.

C/C problems? by Anonymous (not verified) at Thu, 12/10/2009 - 16:19

This is very interesting, and on the whole, great work. Some things for the less statistically-minded to keep in mind as reading re: biasing factors:

-This is really a correlation study, not a causation study - there may be cards that are "generally" found next to each other that are driven up/down due to factors like synergy, and the existence of a card doesn't guarantee its impact on winning the game (directly, at least). That doesn't mean the card didn't help, either - just that it's not guaranteed in each specific case.

-Skill level still exists, and it's hard to say how large an impact that has, even on sealed.

-Separately, but related, there may be a "perception gap" whereby better players select a certain subset of cards at a higher rate because of factors other than the card's actual (inherent) quality or ability to affect the game . . . some examples may include the general distaste for green among top players or a preference for black, or a general thought that "aggro is best" so the strategy is "forced" in more pools by better players. This would amplify skill level factors for a subset of cards (although the correlation would be unchanged, the causation tilts toward play skill).

With that said, I'm kind of surprised at the low level of significance given the high n - and I can't imagine how much time this took. Thanks for tracking this down - it's always interesting to see how the data interacts with conventional wisdom. Great work.

Yup, really miss the by Anonymous (not verified) at Thu, 12/10/2009 - 23:43

Yup, really miss the curmudgeonly, cynical William Spaniel. Good article though.

I like the new Spaniel by MT206 (not verified) at Fri, 12/11/2009 - 05:09

I think I have to disagree with Anon above and say I like the new Spaniel. I think this kind of analysis is interesting, although I think that outside of release weekends, for most people draft data is more relevant than sealed. Regardless, I have to commend you for the amount of effort that had to go into compiling this much data.

This is a really great by Anonymous (not verified) at Fri, 12/11/2009 - 11:35

This is a really great article with very useful information. I also appreciate the fresh approach.

It'd be great if you could get some kind of screen scraper going so you didn't have to gather data by hand. It'd be enough to scrape just the text/comments portion, which would give you every card in the match. It's probably worth your time to hunt around for a coder to do this for you. It's technically a violation of the client usage agreement tho.

I think the "Bold Defense Problem" is a big flaw that you should be emphasizing as you go forward. Lots of cards would have a lower percentage if you considered that they were in hands at end of game. Some (Cancel comes to mind) might have higher. For instance Iona probably will show like a 90% win, but that doesn't mean it should be in your deck.

If someone can point me in by wspaniel at Fri, 12/11/2009 - 15:21

If someone can point me in the direction of a person who can program such a screen scraper, I would consider using it.

What i'm wondering, is what by Kevin Grove (not verified) at Fri, 12/11/2009 - 14:15

What I'm wondering is how much it matters that cheaper cards reach the battlefield more often than more expensive cards, especially given that games are lost to mana flood and mana screw.

I have wondered this, too. It by wspaniel at Fri, 12/11/2009 - 15:23

I have wondered this, too. It seems like more expensive cards should win games more frequently than cheaper cards--not because they are necessarily better to have in your deck, but because the very fact you played them means that a lot of things had already worked out well for you earlier in the game. In the future, if there is such a bias, I plan on handicapping more expensive cards accordingly. However, I need to capture more data from more formats before I can do that.

In other news, very, very few games I watched were won or lost on account of mana screw. I was surprised to see that.

awesome! by solebush at Fri, 12/11/2009 - 16:01

Really cool idea and I really appreciate the work that you put into doing this. While it's easy to debate the actual meaning of the data you collected, it's great just to have access to it.
I have to agree with a previous poster in that this type of data is probably more useful in identifying what the more skilled players are playing, rather than what cards specifically are leading to victories. There pretty much aren't any cards (maybe nighthawk?) whose presence alone can be expected to alter the outcome of a game. As a result, I would expect most cards that the vast majority of players will always play if they are on color, to have a rating that is not significantly different from 50%. Just as you said yourself, Journey should win more, if people actually played it right. I think the same may go for hookmaster here, although maybe it isn't as exciting in sealed as it is in draft. Along the same lines, I think the reason nimbus wings scored so low is that it is exactly the type of card that beginners value quite highly but more experienced players tend to prefer avoiding. It is very hard to agree that the act itself of playing such a card is what led to a loss. Rather, the act of playing the card indicates that it is more likely that a weaker player is piloting this deck.
