By: wspaniel, William Spaniel
Feb 16 2010 2:18am


Zendikar’s White Commons
Zendikar’s Blue Commons
Zendikar’s Black Commons
Zendikar’s Red Commons

It occurred to me over the weekend that I still haven’t released the green/artifact/land results to my statistical study on Zendikar sealed. I apologize for that. So I’m bringing them to you today. Before you cry foul, I think they are still relevant. Undoubtedly green got the shaft in Zendikar; but Worldwake may shift the balance of power, and it will be important to know exactly which green cards you should be on the lookout for out of your Zendikar packs.

On that note, here is the rundown:

Vines of Vastwood: 64%**
Grazing Gladehart: 59%**
Vastwood Gorger: 59%
Timbermaw Larva: 56%
Territorial Baloth: 52%
Mold Shambler: 52%
Harrow: 51%
Oran-Rief Survivalist: 47%
Khalni Heart Expedition: 47%
Nissa’s Chosen: 46%
Oran-Rief Recluse: 43%
Beast Hunt: Ineligible
Joraga Bard: Ineligible
Relic Crush: Ineligible
Savage Silhouette: Ineligible
Scythe Tiger: Ineligible
Tanglesap: Ineligible
Vastwood Gorger: Ineligible
Zendikar Farguide: Ineligible

*Statistic is significant at 90%.
**Statistic is significant at 95%.
***Statistic is significant at 99%.
A card is ineligible when there are fewer than 20 observations.
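
The arithmetic behind each table entry can be sketched in a few lines of Python. This is a minimal illustration, not the study's actual tooling: the function name `summarize` and the rounding to the nearest whole percent are assumptions, while the 20-observation cutoff and the sample records come from the article.

```python
MIN_OBSERVATIONS = 20  # eligibility cutoff stated in the article


def summarize(wins, losses):
    """Return a table entry for a record: a rounded win percentage or 'Ineligible'."""
    games = wins + losses
    if games < MIN_OBSERVATIONS:
        return "Ineligible"
    # Rounding to a whole percent is an assumption about how the table was built.
    return f"{round(100 * wins / games)}%"


print(summarize(7, 11))   # Joraga Bard's 7-11 record: 18 observations
print(summarize(20, 18))  # Goblin Warpaint's 20-18 record: 38 observations
```

A record only turns into a percentage once it clears the cutoff, which is why a card with a winning record can still be listed as ineligible.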

A large percentage of green cards ended up as ineligible, and there is a compounding effect going on here. First, they are almost inarguably sub-optimal cards to begin with, which means we won’t be seeing them hit play very often. Second, together they make green a bad color, which means few people are going to be running Forests. With few people running Forests, and few of those willing to play Beast Hunt, Joraga Bard, Relic Crush, Savage Silhouette, Scythe Tiger, Tanglesap, Vastwood Gorger, and Zendikar Farguide, the sample size is going to be too small to test any hypotheses.

That being said, two cards came out positively. Let’s find out why:

Vines of Vastwood: 64%**

Half counterspell, half Giant Growth, Vines of Vastwood is one of the best common combat tricks we have seen in a while. It’s also one of the only compelling reasons to run green, even if it means going heavily into the color to get access to GG. (Coincidentally, paying that GG will often result in a GG of a different type.)

Grazing Gladehart: 59%**

Life gain is only bad when it doesn’t come with something else to go with it. As it turns out, a 2/2 for three mana is perfectly acceptable (and good!), especially when it can possibly be worth ten or more life over the course of a game. The community sniffed this one out spot-on.

Timbermaw Larva: 56%

Give Timbermaw Larva some credit: even though it did not come out as statistically significant, playing the card requires a lot of Forests, which in turn means running a lot of green cards in your deck. Green, generally speaking, is weak. Yet, despite that, Timbermaw Larva propels its caster to a 56% clip. That’s something to consider as Worldwake reduces the number of Zendikar boosters in sealed pools.

Territorial Baloth: 52%

Dear Kermit,

Are you from Zendikar?

Much love,
William Spaniel

It’s not easy being green. Territorial Baloth’s performance is disappointing coming in the wake of Steppe Lynx’s 68%**, Windrider Eel’s 57%, Hagra Crocodile’s 61%*, and Plated Geopede’s 60%**. On the bright side, I think the Baloth makes a delightful splash card.

Mold Shambler: 52%

Add it to the list: Mold Shambler might be higher if it weren’t weighed down by being in green. I always seem to find a good target for its ability during the late game, and I have no qualms about dropping it on the fourth turn to fit my curve. (Indeed, I believe that far too many people hold back Mold Shambler for its kicker effect when it should just be cast for four mana.)

Harrow: 51%

Back when Invasion was around, Harrow was considered one of the top commons in limited, and first picking it in draft wasn’t unusual. This data calls Harrow’s usefulness into question. I understand that this instant only puts lands into play, but you would figure that would be quite good in a format with landfall. Well, evidently not. While I would not have been surprised to see Harrow finish in the high 60%s, it’s a mere 51%.

Oran-Rief Survivalist: 47%

Note that, despite the prevailing wisdom, Grizzly Bears don’t actually get it done in limited (they are marginal at best), and Oran-Rief Survivalist is a Grizzly Bear when unaccompanied by other allies. Obviously he’s going to be better in an ally-heavy deck, but those kinds of things are meant for draft decks, not sealed pools.

Khalni Heart Expedition: 47%

If Harrow was only a 51%, you couldn’t expect a delayed Harrow—much less one that may never trigger if you don’t play enough lands—to do much better. Note that this Expedition is further hurt by the fact that it wants to find splash lands, yet green ironically tends to be the splash.

Nissa’s Chosen: 46%

Nissa’s Chosen is a horrible card, and it would be labeled as “bad” if it weren’t for a few games where Nissa Revane picked up the slack. You don’t want to be heavy green in this format. Yet Nissa’s Chosen forces you to play Forest, Forest at the beginning of the game for it to be any good. As such, unless you have the planeswalker, you shouldn’t have Nissa’s Chosen in your deck.

Oran-Rief Recluse: 43%

You may value removal highly, but six mana to only kill a flyer isn’t burning up the record books.

Beast Hunt: Ineligible (0-2)

Even if you do play a deck with eighteen creatures, you are going to whiff awfully frequently with Beast Hunt. When you do hit, you will wonder why you spent four mana on the process. Online players are evidently well aware of these obstacles, having chosen to cast Beast Hunt only twice in the games observed.

Joraga Bard: Ineligible (7-11)

Vigilance just isn’t worth it. The sad thing is that I’m sure most of the eighteen players who cast Joraga Bard played it because they had some crazy fantasy about wanting to make their other allies better.

Relic Crush: Ineligible (1-0)

Versatile? Yes. Too expensive to be played? You better believe it.

Savage Silhouette: Ineligible (8-5)

First Goblin Warpaint went 20-18. Now Savage Silhouette went 8-5. While neither of these cards is a bomb by any means, they certainly weren’t stinking up anyone’s deck either. It might be time to start reconsidering whether creature enchantments truly are the bane of a limited player’s existence.

Scythe Tiger: Ineligible (1-8)

If you were like me, you fell for the trap that is Scythe Tiger, but you have learned your lesson since then. While three power creatures tend to be usable, the truth is that Scythe Tiger’s actual mana cost is about five or six—you won’t want to cast him until then. Even at that point, you’d be better off with a bunch of other spells.

Tanglesap: Ineligible (0-1)

At the prerelease, I got someone to cough up a win by bluffing a Tanglesap. Yes, this actually happened. No, it will never happen again. Heck, I doubt more than 50% of Magic Online players would be able to tell you the full rules text of Tanglesap. It is that irrelevant.

Vastwood Gorger: Ineligible (10-7)

As with Territorial Baloth, I find Vastwood Gorger to be splashable in a deck looking for more creatures. But that won’t get it much play, as evidenced by only seventeen appearances across more than 1,000 total players.

Zendikar Farguide: Ineligible (3-5)

A forestwalker would be a lot more powerful if people were playing Forests…which people are not doing.

Now let’s move on to the artifacts…

Expedition Map: 61%
Adventuring Gear: 56%
Explorer’s Scope: 51%
Stonework Puma: 44%
Hedron Scrabbler: Ineligible
Spidersilk Net: Ineligible

And the analysis…

Expedition Map: 61%

Expedition Map’s high numbers I believe are half statistical anomaly, half let’s-go-get-Valakut-and-go-crazy. I’m still willing to cut it from my deck, particularly if I’m only running two colors.

Adventuring Gear: 56%

Other than Territorial Baloth (a green card), landfall appears to be most powerful when it comes on a creature. That being said, Adventuring Gear can add quite a bit of damage to your attacks over the course of the game, which lands it here.

Explorer’s Scope: 51%

This one came as a bit of a surprise for me considering Explorer’s Scope (1) triggers landfalls, (2) accelerates mana, and (3) reduces the number of irrelevant draws you have in the late game. At one mana, you would think those three abilities would add up to something better than a 51%...

Stonework Puma: 44%

If Oran-Rief Survivalist didn’t fare well, you shouldn’t be surprised to see Stonework Puma rank so low. It’s especially painful that Stonework Puma costs one mana more and can only contribute to other allies’ abilities.

Hedron Scrabbler: Ineligible (6-7)

I have a hard time believing Hedron Scrabbler could be good in any deck—I don’t care how aggressive you think you are—when Grizzly Bears barely cuts it.

Spidersilk Net: Ineligible (5-13)

Honestly, not much excites me more than my opponent playing Spidersilk Net before playing their first land of the game.

And finally let’s get to the lands:

Piranha Marsh: 73%***
Kabira Crossroads: 60%**
Soaring Seacliff: 57%
Teetering Peaks: 53%
Turntimber Grove: Ineligible

Plains: 52%
Island: 48%
Swamp: 51%
Mountain: 51%
Forest: 46%

Piranha Marsh: 73%***

You can imagine how amazed I was to learn that black mana is good, especially when it deals damage to an opponent. Crafty players hold these in their hands during the late game to give their opponents the false sense of security that it is okay to fall to 1 life. Guess what? It’s not.

Kabira Crossroads: 60%**

While Piranha Marsh might not have been a surprise, Kabira Crossroads might be. Life gain on a land works, and works well.

Soaring Seacliff: 57%

I heard someone claiming that they run Soaring Seacliffs off color just for the jump effect. At 57% from blue decks exclusively (I didn’t see anyone running them off color), that seems kind of silly.

Teetering Peaks: 53%

I’m still playing Teetering Peaks over a Mountain, but it’s not actually guaranteed to affect combat or life totals like Piranha Marsh and Kabira Crossroads can.

Turntimber Grove: Ineligible (6-8)

Much the same story as Teetering Peaks, except Turntimber Grove isn’t good when your opponent doesn’t have anything to block with, and you can only play it when you are running green spells. (No off-color attempts, please.)

Plains/Island/Swamp/Mountain/Forest

No shock that Forest came back the worst. It’s not statistically significant because people are only willing to play it in sealed when it offers something better than their next best option, which can really only happen when those players pull a lot of Vines of Vastwood and Grazing Gladehart.

That wraps up this series. Good luck at the prereleases this weekend. I might just be tracking the impact of Worldwake then…

William Spaniel
williamspaniel@gmail.com

Appendix: Methodology

I assume that skill, luck, and the quality of a player's deck determine who wins any particular confrontation. While undoubtedly skill matters, this study is focused on the luck and card quality factors. Players actually have a great deal of control over both of these, as a poorly-constructed deck will win less often than a well-constructed one. From this, we can conclude that some cards contribute to wins more frequently than others. If an average card ever reached play in a game, we would expect its controller to have only won that game around 50% of the time. But if a truly exceptional card reached play, we would expect its controller to have won upwards of 70% of the time.

Watching replays of Pro Tour San Diego qualifiers on Magic Online (and carefully avoiding the qualifier in which the system malfunctioned and everyone played a 140-card deck; yes, this actually happened), I recorded the results of more than a thousand players. Every time a card hit play, I would record it as either a win or a loss, depending on what ultimately happened in that game. If the card reached play multiple times (perhaps because of a Grim Discovery), it only counted once. But if a player cast multiples of a single card, I counted that card multiple times.
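
The counting rule can be sketched as follows. This is only an illustration under stated assumptions: the replays were recorded by hand, so the `tally` function, the `(card_name, copy_id)` log format, and the sample games are all hypothetical stand-ins for that bookkeeping.

```python
from collections import defaultdict


def tally(games):
    """Count wins and losses per card.

    games: iterable of (casts, won) pairs for one player in one game.
    casts lists a (card_name, copy_id) tuple for every time a card reached play;
    copy_id is a hypothetical label that tells physical copies apart.
    """
    record = defaultdict(lambda: [0, 0])  # card name -> [wins, losses]
    for casts, won in games:
        # The set collapses repeat appearances of the same physical copy,
        # while distinct copies of the same card remain separate entries.
        for card_name, _copy_id in set(casts):
            record[card_name][0 if won else 1] += 1
    return dict(record)


games = [
    # One copy that reached play twice (say, after being returned): counts once.
    ([("Mold Shambler", 1), ("Mold Shambler", 1)], True),
    # Two different copies in the same game: counts twice.
    ([("Harrow", 1), ("Harrow", 2)], False),
]
print(tally(games))
```

Here the Mold Shambler game adds a single win, while the two Harrows add two losses.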

Such a large number of observations was necessary to remove the play skill bias that would have shown up in a smaller-n study. The large sample also shrinks the margins of error, allowing for better hypothesis testing, which I ran at 90%, 95%, and 99% confidence.

For those of you unfamiliar with hypothesis testing, here is a brief explanation for what each of those means:

90% Confidence: When I say we can be 90% confident that a card positively contributes to victories, it means that if the card truly had no impact, there would be only a 10% chance of the data coming back this skewed based on pure luck. While the odds of being wrong here are only 1/10, we should be very skeptical of these results as statisticians. Generally, it is only a good idea to accept these results if we have a good theory behind them. For example, I would accept the result for Burst Lightning, a quality removal spell, as being true, but I would cast doubt on whether Blood Seeker was actually affecting things.

95% Confidence: This is the gold standard of statistics. When a card meets 95% confidence, the likelihood that a merely average card would give us data this extreme is only 1/20. At this point, it is a good idea to start thinking of theories to justify the results if you do not have one already.

99% Confidence: While rare (there were only four in this study), a 99% confidence virtually guarantees that a given card has an impact on the match. There is only a 1/100 chance that a merely average card would produce a result this extreme. You should pay careful attention to these.
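
The three confidence levels can be sketched as a test against a 50% win rate. The article does not say which test was run, so this sketch assumes an exact two-sided binomial test; the function names `p_value` and `stars` and the 64-36 record are hypothetical, while the 8-5 record is Savage Silhouette's from the rundown above.

```python
from math import comb


def p_value(wins, losses):
    """Exact two-sided binomial p-value against a 50% win rate (assumed test)."""
    n = wins + losses
    observed = comb(n, wins)
    # Under a 50% null every outcome k has probability comb(n, k) / 2**n,
    # so add up the outcomes that are no more likely than the observed one.
    extreme = sum(c for c in (comb(n, k) for k in range(n + 1)) if c <= observed)
    return extreme / 2 ** n


def stars(p):
    """Map a p-value to the asterisk notation used in the tables."""
    if p < 0.01:
        return "***"  # significant at 99%
    if p < 0.05:
        return "**"   # significant at 95%
    if p < 0.10:
        return "*"    # significant at 90%
    return ""


for wins, losses in [(64, 36), (8, 5)]:  # hypothetical record, then Savage Silhouette
    p = p_value(wins, losses)
    print(f"{wins}-{losses}: p = {p:.4f} {stars(p)}")
```

The same win percentage earns more asterisks as the number of observations grows, which is why the study needed so many games.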

Just because a card does not show up as significant does not mean you should not care about the results. But you should not treat them as gospel, either. The best analogy I can draw is to that of a baseball team. It is possible that your star hitter goes through a minor slump at the beginning of the season and an average player goes on a torrid streak at the same time. That does not mean the average player is better than the star; it just means he was better during that period of observation. So don’t be surprised if the study ranks an average card lower than one you perceive as a top-pick. My card-by-card commentary will help qualitatively decide whether this was just statistical coincidence or if it might be part of a larger trend.

Additionally, just because a card is sub-50% does not mean you should automatically stop playing it in all of your decks. Going back to the baseball analogy, a team cannot field nine players with batting averages all over .300. But it can maximize its performance by putting its best players in the lineup. So if you need to play Vampire’s Bite to have enough black cards to justify running Sorin Markov, go right ahead; but if Vampire Lacerator is floating around in your sideboard, the data indicate you should swap out the Vampire’s Bite.

13 Comments

by Steve (not verified) at Tue, 02/16/2010 - 08:50

I agree with Explorer's Scope, IMO it's quite underrated. At any point of any game, assuming you have the bodies that will attack, filtering lands is good. I especially like it in U/W.

Have to disagree with Nissa's Chosen, though. In this format, 3 toughness on turn 2 is nothing to laugh at, even if it means a little extra green.

by wspaniel at Tue, 02/16/2010 - 17:08

An article I wrote with a very quiet comments section. Something has to be wrong with the world.

by Anonymous (not verified) at Wed, 02/17/2010 - 00:18

I think anyone whose background is in statistics has already chimed in.

At this point, there's nothing left to say about a political "scientist" throwing out intentionally misleading statements about their statistics. Your model fails on the most basic levels.

"I assume that skill, luck, and the quality of a player's deck determine who wins any particular confrontation. While undoubtedly skill matters, this study is focused on the luck and card quality factors. Players actually have a great deal of control over both of these, as a poorly-constructed deck will win less often than a well-constructed one. From this, we can conclude that some cards contribute to wins more frequently than others. If an average card ever reached play in a game, we would expect its controller to have only won that game around 50% of the time. But if a truly exceptional card reached play, we would expect its controller to have won upwards of 70% of the time."

Just because you try to look like a jedi in your picture it doesn't mean that your hand waving actually works.

by wspaniel at Wed, 02/17/2010 - 05:11

While I'm sure this poster's comments were intended to make me feel sad, his remarks are silly. Political science is virtually all math at the graduate level. So while it is possible this poster's background is in math, while he was spending all of his time learning pure theory, I've actually been working on applications.

My models are not the end-all of strategy. There are issues. (From what I recall, Godot's posts were the most eloquent and constructive of the criticisms.) But it beats the hell out of a lot of the alternatives to these methods. My model does not fail at the most basic of levels, and I do not need any magical hand waving to get people to read this stuff.

by ShardFenix at Wed, 02/17/2010 - 10:42

luckily Jedi are fictional i dont think the world could handle darth spaniel...and yes you would be a sith dont lie...

by ShardFenix at Tue, 02/16/2010 - 18:16

lol well actually the cards up there seem to be in a pretty reasonable order. There are sore thumb stickouts like a 3/3 for 5 beating scalable removal. This basically came out like the green common pick order.

by wspaniel at Tue, 02/16/2010 - 18:22

I suppose that's true, except for maybe Vastwood Gorger.

by ShardFenix at Tue, 02/16/2010 - 18:26

well coming back as ineligible is unfortunate though ive been able to pick him up in many online drafts as late as pick 8 or 9 sometimes. Though most times unless i'm heavy allies, he would go before the survivalist and probably the expedition as well

by Godot at Tue, 02/16/2010 - 19:56

I think it's quiet because the peanut gallery has already said what there is to be said about the various merits and shortcomings of this approach. If this were the *first* entry in the series...

I'll say this, no other color embodies the "this is sealed not draft" aspect of this series of articles. Survivalist when you can draft an ally deck and Nissa's Chosen when you can be near-monogreen are a completely different story.

I'd say the Vastwood Gorger, even if not at significant numbers, also underscores both the fact that sealed is slower and that this analysis doesn't take into account the impact of casting cost on the results. Disfigure may be the only relevant spell the person played on their mull to 5, stuck on 2 lands game. Every Vastwood Gorger represents someone who made it to six mana.

Thanks for the series, though, food for thought no matter how you slice it.

by Anonymous (not verified) at Wed, 02/17/2010 - 02:19

I missed the other articles, but isn't your method fatally flawed? Using your approach, wouldn't Coalition Victory be viewed as the greatest card ever because when it's resolved you almost always win? How do you take into account the matches where a card sits in your hand unplayable and thus contributes to the loss? You did consider this case, right?

by wspaniel at Wed, 02/17/2010 - 05:15

Some cards might work this way, but there aren't going to be many of them. (As it turns out, not even Overrun worked this way in practice.) I think that most of these would turn out as ineligible anyway, so I wouldn't even be considering them.

by Godot at Wed, 02/17/2010 - 02:52

Maybe this was covered in previous comments, but you say "every time a card hit play" you counted it. Really what you mean is, "every time a spell is cast," because you counted each spell as it hit the stack, not as it resolved, right?

Just making sure.

by wspaniel at Wed, 02/17/2010 - 05:12

Stack is correct. I was just looking for other ways to say similar things.