# Thread: Need help reverse engineering a formula

1. ## Need help reverse engineering a formula

I'm working on an MMO emulator program. Have any of you played something like WoW or another MMO? Well, in the one I'm working on, it costs gold when training your spells. If your character is level 20, the spells will cost a certain amount per training point, and a different amount if you're level 40 etc. I'm trying to figure out how exactly the costs are determined according to the formula that was used when the game was still running based on the data points that I have. Any help on this would be immensely nice.

I've learned some things about the formula. First off, the level of the spell is irrelevant to the cost formula beyond determining how many trains are needed before the cost starts spiking above the base cost. What I mean is the amount a level 1 skill spikes at it's first spiking train is the same as a level 15 skill spikes at it's first spike train. Here is the formula to determine when a skill begins spiking for a given level:

spike = trains_in_skill + skill_level - player_level - 2; //min 0

So a level 30 player will not spike a level 15 skill until 18 trains into the skill: 1 = 18(trains) + 15(skill_level) - 30(player_level) - 2. Everything before the 18th train will be the base cost of the skill. The 18th train will be the first spike (1.011 for level 30). The 19th train will be the second spike (1.0087 for level 30). It doesn't matter if a skill is at level 1, level 10 or level 20. The table below will always be correct for the first spike point for that player's level. A level 1 skill (base cost 15) on a level 30 player at the first spike point will be (15 * 1.0011) = 15.0165 gold. A level 10 skills (base cost 150) first spike cost will be (150 * 1.0011) = 150.165 gold.

In the table below columns are the player level (10, 15, 20, 25, 30) and rows are the number of trains into spiking a skill. Then each cell is the multiplier over the base skill cost for training a skill at that particular level. So at player level 15, the 25th spiking train would be the base_cost x 58.0897.

Code:
       Level 10          Level 15          Level 20          Level 25          Level 30
1      1.0049            1.0036            1.0026            1.0018            1.0011
2      1.0392            1.0292            1.0209            1.0140            1.0087
3      1.1321            1.0987            1.0705            1.0474            1.0293
4      1.3133            1.2338            1.1670            1.1124            1.0696
5      1.6120            1.4567            1.3262            1.2196            1.1358
6      2.0575            1.7892            1.5637            1.3795            1.2347
7      2.6792            2.2532            1.8952            1.6026            1.3727
8      3.5067            2.8707            2.3363            1.8995            1.5563
9      4.5691            3.6636            2.9026            2.2808            1.7921
10     5.8957            4.6537            3.6099            2.7569            2.0865
11     7.5163            5.8631            4.4737            3.3384            2.4462
12     9.4599            7.3136            5.5099            4.0359            2.8776
13     11.7560           9.0273            6.7339            4.8599            3.3872
14     14.4340           11.0259           8.1615            5.8209            3.9815
15     17.5232           13.3314           9.8084            6.9296            4.6671
16     21.0531           15.9657           11.6901           8.1963            5.4505
17     25.0529           18.9508           13.8224           9.6316            6.3382
18     29.5521           22.3086           16.2208           11.2462           7.3368
19     34.5801           26.0610           18.9012           13.0506           8.4527
20     40.1661           30.2299           21.8791           15.0552           9.6924
21     46.3397           34.8373           25.1701           17.2707           11.0626
22     53.1301           39.9050           28.7900           19.7075           12.5696
23     60.5668           45.4551           32.7544           22.3762           14.2201
24     68.6791           51.5093           37.0790           25.2874           16.0205
25     77.4964           58.0897           41.7794           28.4516           17.9774
26     87.0480           65.2181           46.8713           31.8793           20.0972
27     97.3633           72.9166           52.3703           35.5811           22.3866
28     108.4719          81.2069           58.2921           39.5675           24.8520
29     120.4029          90.1111           64.6524           43.8491           27.4999
30     133.1857          99.6510           71.4668           48.4363           30.3369
31     146.8499          109.8486          78.7510           53.3398           33.3695
32     161.4245          120.7258          86.5206           58.5701           36.6041
33     176.9392          132.3045          94.7913           64.1377           40.0474
34     193.4232          144.6066          103.5788          70.0532           43.7058
35     210.9060          157.6541          112.8987          76.3271           47.5859
36     229.4169          171.4689          122.7667          82.9700           51.6942
37     248.9853          186.0729          133.1984          89.9923           X
38     269.6405          201.4881          144.2095          97.4047           X
39     291.4120          217.7363          155.8156          105.2176          X
40     X                 X                 168.0325          113.4416          X

I see two non-linear variables to this table. First off the base cost for the spiking at a given spike point decreases at a non-linear rate for each level. Second off the rate of the multiplier increases at a non-linear rate for each train over the initial spike point. I need to take this table and turn it into a formula for spiking skill costs. Either that or we're going to have to record every possible spike point for every level and make a look up table in place of the formula.

2. ## Re: Need help reverse engineering a formula

I used to do things like this for runescape but I have never played warcraft so the idea of spiking is foreign to me. You'd probably get better help if you put your information for us into more mathematical terms.
Let player level= L
skill level= k
number of trains into spiking a skill= N
multiplier= m
Spike= S
base cost= b
trains in skill= t (I'm not sure if this is different to N)

You have told us that
S = t + k - L - 2 (min 0)

You have a table containing data for m for different values of N and L

Am I correct in saying you want to find a formula for m as a function of N and L so that you can find the cost which is equal to b*m

Do you have any idea what kind of formula the game makers would use?
For example m increases with N and decreases with L so the formula could be

$m= c_1 N+ c_2 \frac{1}{L} + c_3$
or
$m= c_1 N+ c_2 L + c_3$

where all the c values are just coefficients which could be positive or negative

Game makers tend to not use polynomials beyond quadratics otherwise small increases can have massive effects. However, some formulae on runescape were in this form

$x= \frac{A+D}{D-2A}$

Where $A= 10+4a$ and $D=5+6d$
a and d were the attack and defence of two players fighting each other.
The game makers actually didn't design the formula for x, they designed the formulae for A and D but the way they used random number generators lead to x being an average of $\frac{A+D}{D-2A}$ so they didn't realise how complicated they had made it.

If your formula was something like this you would probably never find it exactly but with a Taylor series expansion any kin dof formula they use can be approximated by

$m= c_0+ c_1 N+ c_2 L+ c_3 NL + c_4 L^2 +c_5 N^2 + c_6 N^2L+ c_7 NL^2 + c_8 N^3 + c_9 L^3 + c_{10} N^2L^2 + ...$

This formula goes on to infinity but if you take the first few terms you can get a good approximation. Past a certain point the terms become negligible.

To find the coefficients in that formula you can use multiple regression analysis which is quite easy to do with a spreadsheet. Treat $N$ and $N^2$ as two different variables even though they are really just one variable.

I found the spreadsheet on this page to be very useful for multiple variable regression. The link to download the spreadsheet is under "how to do multiple regression", I suggest you read the full article to understand how to determine which variables are important to calculating the multiplier.
Handbook of Biological Statistics: Multiple regression

This spreadsheet allows for 12 variables so you could do a lot of the terms in the series.

Have you come across p values in statistics before? These are important for deciding if one of the terms in the series in unimportant.

3. ## Re: Need help reverse engineering a formula

Thanks for the response!

Do you have any idea what kind of formula the game makers would use?
No, they didn't post it anywhere. What I'm trying to do is find a formula that will mimic what the client program for the game says. The client is the program that's on the person's computer and if you have a skill that's at level 20 and the character is level 34, then the client will display that the skill will cost something like 1200 gold to put a training point into it. We have to have the formula so that we know how much gold we're supposed to subtract from the players inventory. Otherwise, the client will say they subtracted 1200 gold but we think they only subtracted 1000 and then there are syncing issues between their client and the server we're running.

There has to be some kind of polynomial that fits because this doesn't use a random number generator, I'm sure of that. I think it's a polynomial equation involving (character level - level granted + trains) cubed, and some negative coefficient of absolute level.

I took a character and trained another skill on it at various levels and got a list of gold costs that looked like this:
Code:
+----------+--------+---------+---------+---------+---------+
| Rank\Lvl | 2      | 3       | 4       | 5       | 6       |
+----------+--------+---------+---------+---------+---------+
| 1        | 100    | 100     | 100     | 100     | 100     |
+----------+--------+---------+---------+---------+---------+
| 2        | 100    | 100     | 100     | 100     | 100     |
+----------+--------+---------+---------+---------+---------+
| 3        | 100    | 100     | 100     | 100     | 100     |
+----------+--------+---------+---------+---------+---------+
| 4        | 5949   | 100     | 100     | 100     | 100     |
+----------+--------+---------+---------+---------+---------+
| 5        | 46893  | 5684    | 100     | 100     | 100     |
+----------+--------+---------+---------+---------+---------+
| 6        | 158025 | 44771   | 5426    | 100     | 100     |
+----------+--------+---------+---------+---------+---------+
| 7        | 374442 | 150886  | 42705   | 5174    | 100     |
+----------+--------+---------+---------+---------+---------+
| 8        |        | 357471  | 143891  | 40692   | 4929    |
+----------+--------+---------+---------+---------+---------+
| 9        |        | 698091  | 340937  | 137099  | 38734   |
+----------+--------+---------+---------+---------+---------+
| 10       |        | 1206229 | 665798  | 324837  | 130488  |
+----------+--------+---------+---------+---------+---------+
| 11       |        | 1915388 | 1150426 | 634353  | 309169  |
+----------+--------+---------+---------+---------+---------+
| 12       |        | 2859072 | 1826775 | 1096088 | 603750  |
+----------+--------+---------+---------+---------+---------+
| 13       |        | 4070784 | 2726799 | 1740489 | 1043207 |
+----------+--------+---------+---------+---------+---------+
| 14       |        |         | 3882451 | 2597999 | 1656516 |
+----------+--------+---------+---------+---------+---------+
| 15       |        |         |         | 3699061 | 2472651 |
+----------+--------+---------+---------+---------+---------+
| 16       |        |         |         | 5074121 | 3520587 |
+----------+--------+---------+---------+---------+---------+
| 17       |        |         |         | 6753621 | 4829301 |
+----------+--------+---------+---------+---------+---------+
| 18       |        |         |         |         | 6427766 |
+----------+--------+---------+---------+---------+---------+
| 19       |        |         |         |         | 8344959 |
+----------+--------+---------+---------+---------+---------+
Putting columns into a polynomial curve fit program and tinkering around a bit, such that a curve started with the last 100 price before it started to go up, I got the following equation:

y = 3916.62963 x^3 - 1.111111268·10E-1 x^2 + 1.164021641·10E-1 x + 100.3333332

With a very good (root mean error less than 0.01) fit. You will note that last coefficient is the base price, and the first is very close to 40 (maximum trains) times the base price. The middle two coefficients are very small.

So this is quite suggestive on its own, if you now look at diagonal pairs, like Rank 4 at Level 2 and Rank 5 at level 3, you will find the price is similar, but not the same, so there is cleared a modification to the price based purely on level. Further tinkering around is starting to suggest that there is a flat percentage discount for each level of the calculated price, seemingly about (20+level)%, but I am still working on that.

I think the price is based on the third power of the difference between level, and granted level plus spell rank. I think this base price is then reduced by a flat ( level + base amount)%.

Thoughts?

4. ## Re: Need help reverse engineering a formula

What is the maximum error on the fit? There should be a small error due to rounding off numbers. If you can get a more simple model which fits the data then you should go with that, see how good the fit is with a quadratic formula and if the mean error is less than 1 I'd go with that. The cubic will always be a better fit than the quadratic so don't assume that the quadratic is wrong just because the cubic fits slightly better. The equation you have is a bit suspicious because the coefficients are not round numbers, game developers use nice numbers in their code which are either integers or simple fractions, for example the coefficient of x2 is almost -1/9.

If you think the equation is something like
$y= (c_0 +c_1x+c_2x^2 +c_3 x^3)\frac{c_4+\text{level}}{100}$

Then first do a fit while keeping level constant, see if you get similar values for the coefficients when level is held constant at 15 or 20 or 25
If the coefficients are similar then that shows that percentage discount is a function of level only.
Using the values of the coefficients let z= (c_0 +c_1x+c_2x^2 +c_3 x^3) use the values
Now using the values for c0-c3 you calculated treat z as a known and it should be easy to see if level affects the percentage discount as you thought.

5. ## Re: Need help reverse engineering a formula

Originally Posted by Shakarri
What is the maximum error on the fit? There should be a small error due to rounding off numbers.
It just has to match the whole number. The client does use decimals all the way to the 1000th, I've confirmed that, but it unfortunately does not display them which makes things more difficult. The client also doesn't round the numbers at all, so 93.789 would display as 93, and 93.123 would also display as 93. That means if you've got 200 gold in your bank, and the client says you need 200 gold to train the skill, but our server says it costs 200.5, then the server would tell them they don't have enough gold even though they've got the amount that the client says they need.

On the other hand, the scenario isn't all that likely since people usually have a lot more gold than they need and are training many skill points at once. I'd be happy enough if I could just get the whole number to be correct.

6. ## Re: Need help reverse engineering a formula

Ok if the numbers are rounded to 4 decimal places then an error of 0.01 is kind of big but since you are using this formula to estimate numbers larger than 100 it should not matter for when you use it to calculate costs. Did you try fitting a quadratic relationship? The regression spreadsheet is really useful for determining if the extra accuracy from using a cubic relationship is significant.
And when keeping player level constant like I was suggesting in the last post did the fitting software give the same equation?

7. ## Re: Need help reverse engineering a formula

I'm still figuring out the spreadsheet, but I think I tried what you suggested with the curve fitting software.

Originally Posted by Shakarri
And when keeping player level constant like I was suggesting in the last post did the fitting software give the same equation?
Using the first data set I posted in this thread, if you put each column through a curve fitting tool you get the following

Code:
+----+------------------+------------------+------------------+------------------+------------------+
| L  | x3               | x2               | x                | fixed            | error            |
+----+------------------+------------------+------------------+------------------+------------------+
| 10 | 4.895768810·10-3 | 1.235917040·10-7 | 2.570765901·10-6 | 9.999905696·10-1 | 5.764666854·10-8 |
+----+------------------+------------------+------------------+------------------+------------------+
| 15 | 3.653738123·10-3 | 1.009877337·10-7 | 1.199061899·10-7 | 9.999798998·10-1 | 2.657761264·10-8 |
+----+------------------+------------------+------------------+------------------+------------------+
| 20 | 2.609889085·10-3 | 3.926274386·10-7 | 5.792725394·10-6 | 9.999817163·10-1 | 2.503959321·10-8 |
+----+------------------+------------------+------------------+------------------+------------------+
| 25 | 1.756895202·10-3 | 2.493801574·10-7 | 1.986120992·10-6 | 9.999881419·10-1 | 3.201981555·10-8 |
+----+------------------+------------------+------------------+------------------+------------------+
| 30 | 1.086550595·10-3 | 4.567713674·10-8 | 3.012965522·10-7 | 1.000003463      | 2.548545087·10-8 |
+----+------------------+------------------+------------------+------------------+------------------+
So the x3 coefficient basically dwarfs anything else, so lets just consider that for the moment.

If I fit the changing x3 coefficient against character level we get another equation:

y = 3.811971429·10-6 x2 - 3.427850571·10-4 x + 0.007940886

Putting in the different values for the tested levels I get a set of level based coefficients like this:

Code:
+----------+----------+----------+----------+----------+
| 10       | 15       | 20       | 25       | 30       |
+----------+----------+----------+----------+----------+
| 4.89E-03 | 3.66E-03 | 2.61E-03 | 1.75E-03 | 1.09E-03 |
+----------+----------+----------+----------+----------+
Error: 2.382546286e-11
Putting it back together so that I try

y =(level coefficent* ranks ^3)+ 1

I get a table like this:

Code:
+----+----------+-------------+----------+----------+----------+
|    | 10       | 15          | 20       | 25       | 30       |
+----+----------+-------------+----------+----------+----------+
| 1  | 1.004894 | 1.003656804 | 1.00261  | 1.001754 | 1.001088 |
+----+----------+-------------+----------+----------+----------+
| 2  | 1.039154 | 1.02925443  | 1.02088  | 1.01403  | 1.008705 |
+----+----------+-------------+----------+----------+----------+
| 3  | 1.132144 | 1.0987337   | 1.070469 | 1.047351 | 1.029379 |
+----+----------+-------------+----------+----------+----------+
| 4  | 1.313231 | 1.234035438 | 1.167038 | 1.112239 | 1.069639 |
+----+----------+-------------+----------+----------+----------+
| 5  | 1.611779 | 1.457100464 | 1.326247 | 1.219218 | 1.136014 |
+----+----------+-------------+----------+----------+----------+
| 6  | 2.057154 | 1.789869602 | 1.563754 | 1.378808 | 1.235031 |
+----+----------+-------------+----------+----------+----------+
| 7  | 2.678722 | 2.254283674 | 1.895221 | 1.601533 | 1.373221 |
+----+----------+-------------+----------+----------+----------+
| 8  | 3.505847 | 2.872283502 | 2.336306 | 1.897916 | 1.557112 |
+----+----------+-------------+----------+----------+----------+
| 9  | 4.567896 | 3.665809908 | 2.902671 | 2.278478 | 1.793231 |
+----+----------+-------------+----------+----------+----------+
| 10 | 5.894233 | 4.656803715 | 3.609973 | 2.753742 | 2.088109 |
+----+----------+-------------+----------+----------+----------+
Which is pretty close to the first table I tried, so using a two stage polynomial equation it's fairly accurate. But it's still missing something with the spike portion. It gets off as you spike up, and the more it spikes the more it gradually gets farther and farther off (by like 0.2% more each spike). This can lead to a couple hundred extra gold at high spikes.

8. ## Re: Need help reverse engineering a formula

Ok good, so you have determined that it is definitely a function of level. It doesn't really make sense for only the x3 coefficient to be a function of player level but since the x3 term dwarfs the rest it might seem that way. More likely is that the whole polynomial is multiplied by a function of player level as you suggested on the first of January.
I really cannot understand what you are saying about spikes. It seems like an event that occasionally happens which in some way reduces the cost per spell. You will have to explain it with fewer abstract terms from the game like "spike up". If it is a boost that occasionally happens, how often does it happen? how do variables change when it does happen? Explain what happens step by step. It is also a bit confusing that you always write polynomials in y and x and reuse those letters for different variables.

9. ## Re: Need help reverse engineering a formula

Originally Posted by Shakarri
I really cannot understand what you are saying about spikes. It seems like an event that occasionally happens which in some way reduces the cost per spell. You will have to explain it with fewer abstract terms from the game like "spike up". If it is a boost that occasionally happens, how often does it happen? how do variables change when it does happen? Explain what happens step by step.
Each spell has 40 ranks and 40 is considered "grandmaster" or GM. So if you look at the 2nd data set I posted, you'll see that it costs 100 gold (this is the base cost of the spell) when the player is level 2 until he has trained the spell to rank 4. When the spell gets to rank 4, it spikes from 100 gold cost to 5949. If the player was level 6, the cost wouldn't spike from 100 to 4929 until rank 8. The game developers did this to make training spells to high ranks prohibitive when your character is a low level.

Also, some spells aren't granted until your character is level 20 or 40, with max character level being 75. So if the spell is granted at level 20, the cost will spike to prohibitive amounts more quickly if your character is level 20 than if the character is level 75.

But you see the spike is different since it spikes to 5949 at level 1 but 4929 at level 6. The spell ranks after that don't use the same multiplier either.

10. ## Re: Need help reverse engineering a formula

Oh I see what you mean. Well there is a formula for the beginning cost after the first spike.
Cost= 6452-255(level). I'm not sure if that helps you.

It gets off as you spike up, and the more it spikes the more it gradually gets farther and farther off (by like 0.2% more each spike). This can lead to a couple hundred extra gold at high spikes.
If you mean that the equation is off for the values which should be 100 well there is nothing you can do to get a formula which is accurate there because the game devs are using a case type function like this
$cost= \begin{cases}100 & \mbox{if } rank < level+2 \\ f(x) & \mbox{if } rank \geq level+2 \\ \end{cases}$