Blizzard is back with another Developer Insights post, this time walking us through how the team uses data to balance Arena.
Hi! I’m Tian, a Senior Data Scientist with the Hearthstone team, and today we’re talking about the math behind Arena balance.
Arena matches are played all the time, and they constantly generate data—A LOT of data—that we can use to help make sure that Arena is more balanced. If you had to place me in Boom Labs, I’d probably be in the Mathematical Science department!
The Balancing Game
Arena balance is done in two stages. First, we determine which bucket(s) each card goes into (a bucket is a subset of cards with similar performance quality). A card generally falls into two buckets, and we divide Legendary and non-Legendary cards into two different bucket systems. We decide which bucket each card falls into by its win-rate and pick-rate during games. That means the three cards you see during a pick are all on a similar power tier.
We then balance the win-rate across the nine classes. Ideally, it’s as close to 50% as possible. We achieve this balance by tuning the weights associated with each card. Weight is a number that represents the relative likelihood that a card appears in a draft. The more weight a card has, the higher the chance that you’ll see it during a draft. If the weight of a card is changed, that also changes the chances that the bucket it’s in will appear.
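As a rough illustration of how weights translate into draft odds, here is a minimal Python sketch. The card names, bucket names, and weights are hypothetical, and the sketch assumes that a card's chance of appearing is simply its weight divided by the total weight, and that a bucket's chance is the sum of its cards' chances. The real system may combine these differently.

```python
# Hypothetical buckets and weights; none of these values come from the article.
bucket_weights = {
    "bucket_a": {"card_1": 1.0, "card_2": 2.0},
    "bucket_b": {"card_3": 1.0},
}


def card_probabilities(buckets):
    """Chance of each card appearing, assumed proportional to its weight."""
    total = sum(w for cards in buckets.values() for w in cards.values())
    return {
        name: w / total
        for cards in buckets.values()
        for name, w in cards.items()
    }


def bucket_probabilities(buckets):
    """Chance of each bucket appearing, assumed to be the sum over its cards."""
    total = sum(w for cards in buckets.values() for w in cards.values())
    return {name: sum(cards.values()) / total for name, cards in buckets.items()}


card_probs = card_probabilities(bucket_weights)
bucket_probs = bucket_probabilities(bucket_weights)
```

Under these assumptions, raising the weight of `card_2` raises both its own chance and the chance of `bucket_a`, which matches the behaviour described above.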
A lot of data is needed to make the system work, but due to the huge number of Arena games played daily, we have plenty of data to utilize.
Utilizing that data to affect game balance requires three steps.
- Build a model
- Solve constrained optimization problems
- Calculate the weights
After all that’s done, we need to schedule hotfixes to apply the changes.
Build a Model
If you’re a frequent Arena player, you might be familiar with calculating win probability. Some cards skew probability more heavily than others. For example, drawing The Lich King in a game would affect your win probability a lot more than drawing Snowflipper Penguin.
Let’s assume you draw The Lich King during a game. You may start thinking: “What is my win probability now that I’ve drawn The Lich King? Is it 60%? 50%? How do I evaluate this quantitatively?” Further assume that you draw Ice Barrier on your next turn—now you’ll want to re-evaluate your win probability once more.
We built a machine learning model to answer those questions. The computer is fed tons of data; using details from every Arena game played, it learns how to predict the win probability based on all the information it has access to. In slightly more formal terms, we “train” the model that we built. Thus, it’s able to provide the answer on win probability given X cards drawn every time we ask.
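The article does not say what kind of model is used. Purely as a sketch, the snippet below assumes a logistic-regression-style model in which every drawn card adds a learned number to a score, and the score is squashed into a probability. The coefficients are made up for illustration and are not real Arena statistics.

```python
import math

# Hypothetical per-card contributions to the log-odds of winning.
# In a real model these would be learned from game data, not typed in.
card_effects = {
    "The Lich King": 0.40,
    "Ice Barrier": 0.10,
    "Snowflipper Penguin": -0.05,
}


def win_probability(cards_drawn, bias=0.0):
    """Predicted win probability given the cards drawn so far."""
    score = bias + sum(card_effects.get(card, 0.0) for card in cards_drawn)
    return 1.0 / (1.0 + math.exp(-score))  # logistic function


before = win_probability([])
after_lich_king = win_probability(["The Lich King"])
after_both = win_probability(["The Lich King", "Ice Barrier"])
```

With nothing drawn and a bias of zero the sketch returns 50%, and each extra card moves the estimate up or down, which is the re-evaluation described in the Lich King example.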
Solving Constrained Optimization Problems
Let’s take a step back and imagine that the model is a box with lots of knobs that you can tune. Each knob is associated with a specific card. When you tune a knob, you are in fact tuning numbers associated with that card.
Let’s say that before you tune a knob, the box tells you that the current win probability is 40%. After you turn the knob, the predicted win probability changes to 46%. This poses a very interesting question: if you tune a bunch of knobs, will you be able to move the win probability to the value you want?
This question leads to the idea that we need to construct an optimization problem. In mathematical terms, we want to find the best solution from all feasible solutions. We want to bring a target quantity as close as possible to a desired value by “tuning a bunch of knobs” at the same time. In formal terms, we minimize some objective function over a high-dimensional vector.
In Arena balance, we want the predicted win-rate to be as close as possible to 50% regardless of class, and we change numbers associated with each card to achieve that.
However, the knobs can’t be tuned arbitrarily—some constraints apply. Here’s a list of some constraints we have programmed into our “box”.
- The new number should be within +/-30% of some fixed value. Drastic changes can potentially harm the gameplay experience.
- If we want to reduce the power of a class in Arena, its strongest cards will need to appear less often than its less powerful cards. The reverse applies if we want a class to get stronger.
- There are some hard constraints required to keep the problem valid. For example, the total gains in the appearance probability need to be the same as the total losses (zero-sum, in mathematical terms).
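The knob-tuning idea and two of the constraints above (the +/-30% range and the zero-sum rule) can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the linear stand-in for the trained model, the four cards and their numbers, the reading of "fixed value" as each card's current appearance probability, and the choice of projected gradient descent as the solver. The ordering constraint on strong versus weak cards is left out to keep the sketch short.

```python
# Hypothetical inputs: none of these numbers come from the article.
baseline = [0.25, 0.25, 0.25, 0.25]  # current appearance probability per card
effect = [0.8, 0.2, -0.2, -0.8]      # assumed win-rate change per unit of probability
base_win_rate = 0.46                 # assumed predicted class win rate before tuning
target = 0.50

lo = [-0.30 * p for p in baseline]   # a change may not go below -30% of baseline
hi = [0.30 * p for p in baseline]    # ... or above +30% of baseline


def predicted_win_rate(delta):
    """Linear stand-in for the trained model (an assumption)."""
    return base_win_rate + sum(e * d for e, d in zip(effect, delta))


def project(v, iters=60):
    """Project v onto {d : sum(d) == 0 and lo[i] <= d[i] <= hi[i]} by bisection."""
    def clipped(t):
        return [min(max(x - t, l), h) for x, l, h in zip(v, lo, hi)]

    t_low = min(x - h for x, h in zip(v, hi))   # every entry clips to hi, sum >= 0
    t_high = max(x - l for x, l in zip(v, lo))  # every entry clips to lo, sum <= 0
    for _ in range(iters):
        mid = (t_low + t_high) / 2
        if sum(clipped(mid)) > 0:
            t_low = mid
        else:
            t_high = mid
    return clipped((t_low + t_high) / 2)


def solve(steps=500, lr=0.1):
    """Projected gradient descent on (predicted_win_rate - target) ** 2."""
    delta = [0.0] * len(baseline)
    for _ in range(steps):
        residual = predicted_win_rate(delta) - target
        stepped = [d - lr * 2 * residual * e for d, e in zip(delta, effect)]
        delta = project(stepped)
    return delta


delta = solve()  # change in appearance probability for each card
```

In this toy setup the class starts below 50%, so the solver shifts appearance probability toward the cards with a positive assumed effect and away from the others, while the gains and losses cancel out.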
Calculate the Weights
The final step in using Arena data to affect balance is to adjust the weights assigned to each card based on what we get from the first two steps. In general, a card with a weight of 2.0 shows up twice as often as a card with a weight of 1.0. The constrained optimization tells us which “knobs” to tune and by how much. We then link each knob to the probability of each card showing up in a draft. Now we know how much we need to change the weight of each card, not counting other modifiers derived from the card’s traits (e.g., whether it’s a spell or weapon, which expansion it’s from, etc.).
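One way to picture this last step is the sketch below, which turns a set of zero-sum probability changes into new weights. It assumes that appearance probability is weight divided by total weight and ignores the trait-based modifiers mentioned above. The card names, weights, and changes are hypothetical.

```python
def updated_weights(weights, prob_changes):
    """Return new weights that realise the requested probability changes.

    Assumes probability = weight / total weight, and that the changes
    sum to zero so the total weight stays the same.
    """
    total = sum(weights.values())
    new_weights = {}
    for card, weight in weights.items():
        old_prob = weight / total
        new_prob = old_prob + prob_changes.get(card, 0.0)
        new_weights[card] = total * new_prob
    return new_weights


# Hypothetical example: card_c has weight 2.0, so it starts out appearing
# twice as often as card_a or card_b.
weights = {"card_a": 1.0, "card_b": 1.0, "card_c": 2.0}
changes = {"card_a": 0.05, "card_c": -0.05}
new_weights = updated_weights(weights, changes)
```

Because the changes cancel out, the total weight is unchanged and each new weight divided by that total gives exactly the tuned probability.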
Leveling the Playing Field
After this stage of the balance is done, the overall win-rate across all nine classes should be very close to 50%. However, there have been rare situations where the win-rate after balancing is still not ideal. That can happen if a certain class’ win rate is too far away from 50% before we do the weight adjustments. In those cases we might not hit our ideal numbers, but they’ll still be in better shape than they were before.
Thanks to this system being able to utilize Arena data in advanced computational mathematics and machine learning, we’re able to determine whether a class needs to be strengthened or weakened, and then choose the optimal weight for each card for each class.
I hope this insight into our micro-adjustment system for Arena was interesting! We want to know what you think, so please let us know if you have any questions in the comments.
In some games I get more legendaries than in others.
Why is that? I feel like in some Arena runs I get 2 legendaries, and in those runs I win a lot more than in runs where I do not get any.
If I pick worse cards, are you saying I have a higher chance to get a legendary on future picks? Just want to clarify if that is correct.
I’m not sure where you get this idea from. The above article doesn’t say anything like that as far as I can see.
It’s just luck, I have had 4 legendaries in an arena draft and zero in many others. Just because the probability of something occurring is low does not mean that it won’t happen.
Based on what I can see, I don’t think that there is any reason to think that picking ‘worse’ cards would give you a higher chance of getting a legendary.
In any event, the above states that all of the cards are placed into buckets and each pick is from a certain bucket. This means that they consider all of the options to be close to equal in terms of win rate so you have no control over whether or not you get a good or a bad pick in the eyes of Blizzard.
On the contrary. I believe the OP is correct. Picking crappy cards lowers your projected win rate to far below 50%. The algorithms recalculate what your next card pool’s weight would be based on how close you are to that 50%. To get it as close as possible you start getting better cards. It’s still random so legends aren’t guaranteed either way. But I would say statistically your odds do increase, provided the legendaries are weighted at a more desirable number than non legends.
I didn’t have the same understanding as what you just outlined.
I thought that what happens is they give a weighting to each card’s appearance, but this is unaffected by your previous picks. You could consistently get good cards or bad cards (within each bucket), but on average most people will land somewhere in the middle.
e.g. I have had drafts where I have had up to 5 of the same card which is quite a high winrate card according to the data… it simply kept appearing so I chose it every time.
It would be good if someone could clarify because I am unsure how it works after your comments.
They’re trying to have every CLASS have an overall win rate of 50%, not each individual deck. Picking crappy cards early on will not increase your likelihood of getting better cards later in the draft.
It’s like a casino. All of it is random and the house controls the win rates when anomalies happen. If you get legendary picks, you hit the jackpot and are more likely to win. If you don’t, it doesn’t mean you are more likely to get them in the future. That is the gambler’s fallacy: believing a certain outcome means the opposite will happen (positive or negative) in the future.
The only time that cognitive bias is legit in this game is when opening packs because it is literally made with a pity timer.
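For anyone who wants to check the independence argument, here is a small Python simulation. It assumes every pick offers a legendary independently with the same chance, and the 5% figure and 30-pick draft are made-up numbers for illustration, not Blizzard's actual rates. It compares how often a legendary shows up in the second half of a draft depending on whether one showed up in the first half.

```python
import random


def later_legendary_rates(p=0.05, picks=30, drafts=20000, seed=0):
    """Chance of a legendary in the second half of a draft, split by
    whether the first half already contained one (independent picks)."""
    rng = random.Random(seed)
    half = picks // 2
    # early legendary seen? -> [number of drafts, drafts with a late legendary]
    counts = {True: [0, 0], False: [0, 0]}
    for _ in range(drafts):
        draws = [rng.random() < p for _ in range(picks)]
        early = any(draws[:half])
        late = any(draws[half:])
        counts[early][0] += 1
        counts[early][1] += late
    return {key: hits / n for key, (n, hits) in counts.items()}


rates = later_legendary_rates()
expected = 1 - (1 - 0.05) ** 15  # exact value when picks are independent
```

If the picks really are independent, both entries of `rates` should land close to `expected`, meaning a dry first half does not make a legendary any more likely later on.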