Ressource Flood and Screw

Storm_Fireblade
10-30-2013, 03:18 PM
Is it just me, or is anyone else experiencing resource flood or screw far more often lately? I've had dozens of games now where I ran into one or the other. Even extreme situations, like playing my control deck with 26 resources and getting 0, mulliganing to 6 and getting 0, mulliganing to 5 and getting 0... happened more than once.

And while I was playing other decks with a lower resource count as well, I still saw a lot of resource flooding. I played against some mill decks today, and 2-3 times I saw something like 5-6 resources among 6 milled cards.

Whatever algorithm is responsible for the shuffling, it might need some additional work :)

hammer
10-30-2013, 03:20 PM
I suffered years of people complaining about the shuffler algorithm in MTGO, please dear hex community don't go there!

Vengus
10-30-2013, 03:26 PM
I normally do not mulligan, it always gives bad results. :p

First hand:
Hmm, 1 resource, 6 non-resources. I'd rather have a few more resources, let's mulligan.

Second hand:
6 resources omg.

3rd hand:
5 non-resources wtf.

4th hand:
3 resources, 1 troop. Oh whatever.. Should have stayed with the 1st hand.

Zarien
10-30-2013, 03:43 PM
Since someone else started this thread, I get to join in without guilt.

I'm not sure what the algorithm is that randomizes the cards. But I see way way more instances of people getting 1 or 0 resource hands (in 24-26 resource decks even) or just being flooded with 6 resources than I do other types of hands. This happens, obviously, but it just seems to be that it happens more often than it doesn't.

Don't even get me started on mulliganing. Almost every time I've mulliganed, I get either 6 non-resource cards or something completely unplayable. It's happened often enough that I typically won't even mulligan now, even if I have a crappy hand.

Hexmage
10-30-2013, 04:15 PM
Ok, what people should do is keep track of how many opening hands have how many resources in them. I'm guessing in the long run they'll see it's statistically average.

EDIT: You can even just start bot games to see your opening hand/ first mulligan to do this

Storm_Fireblade
10-30-2013, 04:33 PM
I suffered years of people complaining about the shuffler algorithm in MTGO, please dear hex community don't go there!

I played MTGO myself and this has nothing to do with complaining actually. The reason why I bring it up is because I've experienced a LOT more extreme flood and screw since the last patch. And before I simply ignore it, I'd rather ask others, if they might have noticed something similar. After all, this is what the Alpha is for.

Zarien
10-30-2013, 06:21 PM
I played MTGO myself and this has nothing to do with complaining actually. The reason why I bring it up is because I've experienced a LOT more extreme flood and screw since the last patch. And before I simply ignore it, I'd rather ask others, if they might have noticed something similar. After all, this is what the Alpha is for.

This. I understand what you're saying, Hexmage, but many of the people in the alpha at this moment (including myself) are experienced TCG players. If there's one thing we're used to, it's resource screw or flooding. I wouldn't respond to this thread if I felt it was a normal amount. I'm responding because it happens consistently in the opening hand and in the mulligans.

UDareUTake
10-30-2013, 06:44 PM
Perhaps I can do everyone a favor, I shall be running 100 games (Mulling 3 times and showing the results on a spreadsheet)

Personally I felt something was weird with the flood/screw as well (not only this but sometimes weirdly enough you get to draw 3 cards in the opening hand quite frequently etc)

UDareUTake
10-30-2013, 10:07 PM
After clicking through 100 games and mulling 3 times each game @.@

24 Resources, 60-card deck

Here are the results

Opening Hand

0 Resource - 3%
1 Resource - 7%
2 Resources - 23%
3 Resources - 35%
4 Resources - 21%
5 Resources - 9%
6 Resources - 2%
7 Resources - 0%

First Mull

0 Resource - 1%
1 Resource - 20%
2 Resources - 25%
3 Resources - 34%
4 Resources - 17%
5 Resources - 2%
6 Resources - 1%
7 Resources - N/A

2nd Mull

0 Resource - 3%
1 Resource - 21%
2 Resources - 34%
3 Resources - 29%
4 Resources - 12%
5 Resources - 1%
6 Resources - N/A
7 Resources - N/A

3rd Mull

0 Resource - 15%
1 Resource - 37%
2 Resources - 33%
3 Resources - 14%
4 Resources - 1%
5 Resources - N/A
6 Resources - N/A
7 Resources - N/A

Gwaer
10-30-2013, 10:22 PM
Those numbers are actually quite far off where they should be, we need a lot more tests to be sure though, check back in after you hit 1,000.

UDareUTake
10-30-2013, 10:44 PM
Perhaps more people can chip in with batches of 50 or 100 games? Using the same guideline, 24 resources in a 60-card deck, then we'll just tally the whole thing together?

Gwaer
10-30-2013, 10:47 PM
I am doing a run myself right now, those numbers were far enough off to be worrying.

Maphalux
10-30-2013, 10:59 PM
I also did 100 but I did not mull three times. I mulled only when necessary such as not having the right color shards for my cards or too few/many shards.

My rules going in were a 0, 1, 6, and 7 resource hand would be auto mulls. 2 and 5s would be mulled if the rest of my cards could not support me long enough to draw into what I needed.

Note that mull to 5 were obviously also mulled to 6 but not included in the mull to 6 bracket so as to not count that game twice.

As you'll see below, only 1 game out of the 100 did I truly get resource screwed and I never flooded out.

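Results:

Initial Resources in hand | Total Games | No Mull Needed | Mull to 6 | Mull to 5 | True Screw after Mull5
0 Resources | 1 | - | 1 | - | -
1 Resource | 11 | - | 10 | - | 1
2 Resources | 28 | 20 | 6 | 2 | -
3 Resources | 31 | 29 | 1 | 1 | -
4 Resources | 23 | 21 | 1 | 1 | -
5 Resources | 6 | 5 | 1 | - | -
6 Resources | 0 | - | - | - | -
7 Resources | 0 | - | - | - | -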

Gwaer
10-30-2013, 11:05 PM
Was that with a 24 resource deck? I can't really include it in this dataset either way since it's incomplete. It'd throw the percentages off, thanks though.

Maphalux
10-30-2013, 11:07 PM
Yes 24 resources. I left out 4 games that were on the next page of my word doc. I updated to the full 100.

Zarien
10-30-2013, 11:27 PM
I'll do this tomorrow as well so we can get some extra data.

Zarien
10-30-2013, 11:32 PM
I also did 100 but I did not mull three times. I mulled only when necessary such as not having the right color shards for my cards or too few/many shards.

It's a good set of data to have, but the parameters aren't really applicable to the other sets. Mainly because your non resource cards are going to create different curves and different "playable" hands than others. I think sticking with the basic "how many resources" data and card "copies" will be better for now. And then once we add it all up we can apply the curves to the data and hone it if we see a pattern.

Gwaer
10-31-2013, 12:07 AM
Opening Hand

0 Resource - 3%
1 Resource - 16%
2 Resources - 21%
3 Resources - 33%
4 Resources - 20%
5 Resources - 7%
6 Resources - 0%
7 Resources - 0%

First Mull

0 Resource - 3%
1 Resource - 15%
2 Resources - 31%
3 Resources - 37%
4 Resources - 10%
5 Resources - 2%
6 Resources - 2%
7 Resources - N/A

2nd Mull

0 Resource - 7%
1 Resource - 25%
2 Resources - 46%
3 Resources - 17%
4 Resources - 5%
5 Resources - 0%
6 Resources - N/A
7 Resources - N/A

3rd Mull

0 Resource - 16%
1 Resource - 40%
2 Resources - 32%
3 Resources - 10%
4 Resources - 2%
5 Resources - N/A
6 Resources - N/A
7 Resources - N/A



Raw data:
7 4345215222354114223313323112343111443232252134313333320533314235333343223330422414122310424433414454
6 3133222323313423443314503232523462222231416333223233433123232321433130332213233321321421122231431202
5 2222132424232122213201221112341310010312212222221211124121221112231212322232301222323320233432032212
4 2010200111123110232143212132211201221202240013212213132201222112111121113211110011101202023210121213
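
For anyone who wants to turn a raw string like the ones above into percentages, here is a minimal Python sketch. The `tally` helper is a made-up name for illustration, not anything from the game or the spreadsheet; the digit string is the 7-card line from the raw data above, one digit per opening hand.

```python
from collections import Counter

# 7-card raw data from the post above (one digit per opening hand).
RAW_7 = (
    "43452152223541142233133231123431114432322521343133"
    "33320533314235333343223330422414122310424433414454"
)

def tally(raw, hand_size):
    """Count how many hands held 0..hand_size resources (hypothetical helper)."""
    counts = Counter(int(ch) for ch in raw if ch.isdigit())
    return [counts.get(k, 0) for k in range(hand_size + 1)]

counts = tally(RAW_7, 7)
total = sum(counts)
for k, c in enumerate(counts):
    print(f"{k} resources: {c} hands ({100 * c / total:.0f}%)")
```

With 100 hands the counts are also the percentages, so they can be compared line by line with the Opening Hand list in the same post.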

DeusPhasmatis
10-31-2013, 12:19 AM
Those numbers are actually quite far off where they should be, we need a lot more tests to be sure though, check back in after you hit 1,000.

They look all right for a sample of that size. For the initial hand (i.e. 7 cards):

Deck size: 60 Resource count: 24 Hand size: 7
Probability of drawing 0 resources in opening hand is 0.021615
Probability of drawing 1 resources in opening hand is 0.121041
Probability of drawing 2 resources in opening hand is 0.269415
Probability of drawing 3 resources in opening hand is 0.308704
Probability of drawing 4 resources in opening hand is 0.196448
Probability of drawing 5 resources in opening hand is 0.069335
Probability of drawing 6 resources in opening hand is 0.012546
Probability of drawing 7 resources in opening hand is 0.000896

For the first mull (i.e. 6 cards):

Deck size: 60 Resource count: 24 Hand size: 6
Probability of drawing 0 resources in opening hand is 0.038906
Probability of drawing 1 resources in opening hand is 0.180725
Probability of drawing 2 resources in opening hand is 0.324741
Probability of drawing 3 resources in opening hand is 0.288659
Probability of drawing 4 resources in opening hand is 0.133717
Probability of drawing 5 resources in opening hand is 0.030564
Probability of drawing 6 resources in opening hand is 0.002688

For the second mull (i.e. 5 cards):

Deck size: 60 Resource count: 24 Hand size: 5
Probability of drawing 0 resources in opening hand is 0.069027
Probability of drawing 1 resources in opening hand is 0.258851
Probability of drawing 2 resources in opening hand is 0.360823
Probability of drawing 3 resources in opening hand is 0.233474
Probability of drawing 4 resources in opening hand is 0.070042
Probability of drawing 5 resources in opening hand is 0.007782

For the third mull (i.e. 4 cards):

Deck size: 60 Resource count: 24 Hand size: 4
Probability of drawing 0 resources in opening hand is 0.120797
Probability of drawing 1 resources in opening hand is 0.351410
Probability of drawing 2 resources in opening hand is 0.356578
Probability of drawing 3 resources in opening hand is 0.149423
Probability of drawing 4 resources in opening hand is 0.021791
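
The figures above follow the hypergeometric distribution (drawing without replacement). A minimal Python sketch that should reproduce them is below; it is not the calculator used for this post, and the function name `hypergeom_pmf` and its defaults are assumptions made for illustration.

```python
from math import comb

def hypergeom_pmf(k, deck=60, resources=24, hand=7):
    """P(exactly k resources in a hand drawn without replacement)."""
    # Outcomes that cannot happen get probability 0.
    if k < 0 or k > hand or k > resources or hand - k > deck - resources:
        return 0.0
    return comb(resources, k) * comb(deck - resources, hand - k) / comb(deck, hand)

for hand in (7, 6, 5, 4):
    print(f"Deck size: 60 Resource count: 24 Hand size: {hand}")
    for k in range(hand + 1):
        print(f"  P({k} resources) = {hypergeom_pmf(k, hand=hand):.6f}")
```

Changing `resources` or `deck` gives the expected distribution for other builds, such as a 22-resource deck.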

Gwaer
10-31-2013, 12:21 AM
I wasn't going to post the actual percentages they should be, just to stop people from fiddling with numbers. =P

And some of them are quite far off. There also seems to be a weird pattern, most of my outliers that should be rare happened in the same set of draws.

DeusPhasmatis
10-31-2013, 12:22 AM
I didn't write that probability calculator so I could ignore it!

Gwaer
10-31-2013, 12:27 AM
Why would you write a hypergeometric calculator? There are plenty available.

DeusPhasmatis
10-31-2013, 12:32 AM
And some of them are quite far off. There also seems to be a weird pattern, most of my outliers that should be rare happened in the same set of draws.

For only 100 samples I am unsurprised at substantial deviations. However, it'd be interesting to see large sequences of draws (just 7 card hands) to see if there are clumping patterns.


Why would you write a hypergeometric calculator? There are plenty available.

Fun.

UDareUTake
10-31-2013, 12:33 AM
I just used the =HYPGEOMDIST formula in Google Spreadsheets to test my decks last time. Anyway, here is the current consolidated data -> https://docs.google.com/spreadsheet/ccc?key=0AgrNS7FanmrudFhmYVhaT0UzUk5rampSMF9VR0diNmc#gid=0

Gwaer
10-31-2013, 12:40 AM
Nice spreadsheet. There's the groundwork laid, everyone hop to! For science!

Storm_Fireblade
10-31-2013, 01:14 AM
And some of them are quite far off. There also seems to be a weird pattern, most of my outliers that should be rare happened in the same set of draws.

That's exactly what I noticed as well. Getting flooded or screwed is part of the game, and personally I'm a fan of resources and quite happy they stuck to a system like that. But getting 0 resources in hand, then 0 at 6 cards, 0 at 5 cards and 0 at 4 cards, or the opposite, seeing your opponent mill 6+ of your cards several times with 95% of them being resources, that's odd. Especially when it happens several times.

If I can, I'll join in on the test later and help with the numbers. It might just be a coincidence, but it sure doesn't feel so.

UDareUTake
10-31-2013, 01:21 AM
I actually recorded everything, so I just added a new sheet to the Google spreadsheet that shows the raw data in sequence.

Maphalux
10-31-2013, 01:48 AM
Ok. Once more unto the breach, my friends. I went back and did another hundred to give useful data. This time with the three mulls. So here it is.

7 Card hand
0 - 2%
1 - 13%
2 - 26%
3 - 35%
4 - 18%
5 - 5%
6 - 0%
7 - 1%

1st Mull
0 - 5%
1 - 23%
2 - 23%
3 - 35%
4 - 13%
5 - 1%
6 - 0%


2nd Mull

0 - 7%
1 - 25%
2 - 41%
3 - 24%
4 - 3%
5 - 0%


3rd Mull
0 - 9%
1 - 35%
2 - 38%
3 - 18%
4 - 0%


Recorded numbers:

7 Card hands: 3 3 3 4 3 3 4 3 3 4 2 3 4 1 3 4 3 3 5 1 3 3 3 3 3 3 3 0 2 3 2 1 3 4 3 5 1 2 2 2 2 2 4
2 7 0 1 1 2 2 4 5 2 2 2 1 2 3 2 4 4 1 1 4 3 2 3 4 1 2 2 1 3 2 5 4 3 3 3 3 3 2 3 2 2 3 4 4 2 4 5 1 3 2
3 1 4 2 4 3

6 card hands: 1 0 3 0 4 3 2 4 1 1 1 1 4 2 3 1 0 1 3 4 1 3 3 2 4 2 2 1 2 0 3 2 3 2 3 2 2 1 2 3 1 3 2
2 3 3 1 3 1 3 1 4 3 4 3 2 3 4 2 1 1 3 1 2 4 1 3 4 3 3 1 3 2 3 3 3 3 3 3 3 3 5 1 1 2 2 3 2 1 4 4 3 2 0
3 2 1 3 4 2

5 card hands: 2 2 2 2 2 3 1 1 3 1 3 2 1 1 2 3 2 2 2 2 0 3 2 1 0 3 3 1 3 3 1 2 2 2 3 2 3 2 3 1 1 2 3
2 3 2 2 1 0 1 3 2 2 2 1 2 1 4 3 2 0 0 0 1 3 1 3 2 2 3 1 1 4 4 2 3 2 0 3 2 2 2 2 2 2 2 1 3 1 1 2 3 1 2
3 1 2 1 1 2

4 card hands: 2 1 2 2 2 3 2 3 2 2 1 3 1 1 3 2 2 1 1 2 3 0 0 1 1 1 1 1 3 1 2 1 3 0 0 1 1 0 2 1 1 2 1
2 2 2 2 2 2 3 3 2 1 2 2 0 1 2 1 2 2 2 3 3 1 1 0 1 1 0 3 2 1 1 2 1 1 1 2 1 2 3 3 1 3 2 0 2 1 2 1 3 3 2
2 2 2 2 3 1

Xtopher
10-31-2013, 03:35 AM
To me it seems fine given the sample sizes. After ten years of debating this on the MTGO forums, though, I'll leave Hex to others to battle through.

Gwaer
10-31-2013, 04:37 AM
There are still some significant oddities in this data set so far. After we get a few more people posting we'll see. I also am working on an easy and uniform way to test the clumping issues that seem to be extremely common. Not just in resources but in all cards. Worst case scenario everything turns out very close and we will be able to point people to these tests in the future.
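
A minimal sketch of what one such clumping test could look like, under stated assumptions: a hand counts as "extreme" if it has at most 1 or at least 5 resources (an arbitrary cutoff), and clumping is measured as the number of back-to-back extreme hands. Reshuffling the recorded sequence many times gives a baseline for how many such pairs pure chance produces. The function names are hypothetical, and the data is the 7-card raw string posted earlier in the thread.

```python
import random

# 7-card raw data posted earlier in the thread, one digit per opening hand.
HANDS = [int(ch) for ch in (
    "43452152223541142233133231123431114432322521343133"
    "33320533314235333343223330422414122310424433414454"
)]

def is_extreme(resources, low=1, high=5):
    """Assumed cutoff: 0-1 resources is screw, 5+ is flood."""
    return resources <= low or resources >= high

def adjacent_extreme_pairs(hands):
    """Number of positions where two consecutive hands are both extreme."""
    flags = [is_extreme(h) for h in hands]
    return sum(1 for a, b in zip(flags, flags[1:]) if a and b)

def clump_p_value(hands, trials=5000, seed=0):
    """Permutation test: how often does a random reordering clump at least as much?"""
    rng = random.Random(seed)
    observed = adjacent_extreme_pairs(hands)
    shuffled = list(hands)
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)
        if adjacent_extreme_pairs(shuffled) >= observed:
            hits += 1
    return observed, hits / trials

observed, p_value = clump_p_value(HANDS)
print(f"back-to-back extreme hands: {observed}, permutation p-value: {p_value:.3f}")
```

A small p-value would suggest the extreme hands sit closer together than a random ordering would put them; a large one means the observed clumping is ordinary. The same idea works for any other definition of "extreme", as long as it is chosen before looking at the data.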

Banquetto
10-31-2013, 04:53 AM
(removed after reading rest of thread)

Niedar
10-31-2013, 04:54 AM
Well I would be more concerned if you weren't seeing clumping, after all we are human and don't work well with randomness. We assign patterns to everything.

Gorgol
10-31-2013, 06:51 AM
I did 200 hands.

Opening Hand
0 Resource: (2) 1%
1 Resource: (28) 14%
2 Resource: (46) 23%
3 Resource: (61) 30.5%
4 Resource: (43) 21.5%
5 Resource: (16) 8%
6 Resource: (3) 1.5%
7 Resource: (1) 0.5%


1st Mulligan
0 Resource: (11) 5.5%
1 Resource: (46) 23%
2 Resource: (68) 34%
3 Resource: (42) 21%
4 Resource: (28) 14%
5 Resource: (5) 2.5%
6 Resource: (0) 0%


2nd Mulligan
0 Resource: (7) 3.5%
1 Resource: (49) 24.5%
2 Resource: (78) 39%
3 Resource: (45) 22.5%
4 Resource: (18) 9%
5 Resource: (3) 1.5%


3rd Mulligan
0 Resource: (28) 14%
1 Resource: (66) 33%
2 Resource: (72) 36%
3 Resource: (26) 13%
4 Resource: (8) 4%

Raw data: https://docs.google.com/spreadsheet/ccc?key=0AjKdPUgKOUAodEhjWG1SZjVCY2E1bjExN0VGV1Bpd1E&usp=sharing

Mahes
10-31-2013, 07:09 AM
Well, if they are using the same random generator as the coin toss......

I remember playing Everquest and almost after every major patch people would swear that Jboots were slower. Not saying that is the case here, but it just made me smile remembering those days.

Zarien
10-31-2013, 09:12 AM
I realized we're all probably doing this test against the AI. At some point we should test whether the algorithm produces different results in PvP matches. I'd be willing to test this with one of you when you get time, to see if our numbers are different from what we're getting against the AI.

beepharoni
10-31-2013, 09:49 AM
I will be sure to do some of this tonight. I feel like it's beneficial for everyone to try and get as much data as we can into this.

Fred
10-31-2013, 09:59 AM
Those numbers are actually quite far off where they should be, we need a lot more tests to be sure though, check back in after you hit 1,000.


Those numbers are actually not that far off from what they should be, and any significant difference can easily be blamed on the small sample size. Here are the same numbers again, now with the theoretical value (rounded to the nearest whole number) based on the hypergeometric distribution. You will notice the real thing is not that far off.



After clicking through 100 games and mulling 3 times each game @.@

24 Resources, 60-card deck

Here are the results

Opening Hand

0 Resource - 3% (theory = 2%)
1 Resource - 7% (theory = 12%)
2 Resources - 23% (theory = 27%)
3 Resources - 35% (theory = 31%)
4 Resources - 21% (theory = 20%)
5 Resources - 9% (theory = 7%)
6 Resources - 2% (theory = 1%)
7 Resources - 0% (theory = 0%)

First Mull

0 Resource - 1% (theory = 4%)
1 Resource - 20% (theory = 18%)
2 Resources - 25% (theory = 32%)
3 Resources - 34% (theory = 29%)
4 Resources - 17% (theory = 13%)
5 Resources - 2% (theory = 3%)
6 Resources - 1% (theory = 0%)
7 Resources - N/A

2nd Mull

0 Resource - 3% (theory = 7%)
1 Resource - 21% (theory = 26%)
2 Resources - 34% (theory = 36%)
3 Resources - 29% (theory = 23%)
4 Resources - 12% (theory = 7%)
5 Resources - 1% (theory = 1%)
6 Resources - N/A
7 Resources - N/A

3rd Mull

0 Resource - 15% (theory = 12%)
1 Resource - 37% (theory = 35%)
2 Resources - 33% (theory = 36%)
3 Resources - 14% (theory = 15%)
4 Resources - 1% (theory = 2%)
5 Resources - N/A
6 Resources - N/A
7 Resources - N/A

The largest gaps are in the opening hand (1 resource: 7% vs. 12%), the 1st mull (2 resources: 25% vs. 32%) and the 2nd mull (3 and 4 resources run 5 to 6 points high), but with only 100 samples per hand size, deviations of that size are unsurprising, and everything else is pretty close to what it should theoretically be.

bluehawk80
10-31-2013, 10:11 AM
I don't think this kind of stuff is worth testing manually by alpha testers. It is definitely worth mentioning.

But a software engineer who works on Hex can easily generate this kind of data with large sample sizes and without much effort.

beepharoni
10-31-2013, 10:15 AM
I would rather they spend their time working out game-breaking bugs and let us test this. As it's alpha, and we are here to test the software. Playing games to "win" is pointless.. This is what we were put here to do!

Zarien
10-31-2013, 10:51 AM
I don't think this kind of stuff is worth testing manually by alpha testers. It is definitely worth mentioning.

But a software engineer who works on Hex can easily generate this kind of data with large sample sizes and without much effort.

It's only a waste if the people doing it think it is. If we can produce some data showing that something is odd enough for CZE to check, then I think our time was well spent.

Gwaer
10-31-2013, 11:11 AM
It can also serve the opposite purpose. If our numbers look okay after a long enough time, then we can point people to that information when they complain. Either way it seems like a worthwhile endeavor. If you don't agree, don't participate, no harm done.

Xtopher
10-31-2013, 11:27 AM
I agree it could be helpful, but I know from experience with MTGO that despite no one ever being able to demonstrate the shuffler failing for one statistically relevant sample (and you only need it to fail on one test to prove it's broken), and despite countless tests completed with thousands of samples (one with 10 million samples run by Elf), there are still people who rely on the anecdotal evidence of their last few matches.

I had fun running tests for a while, though, with MTGO, so if you're enjoying the process, by all means continue. There's even a possibility that something isn't right with the shuffler, so you're performing a service either way.

beepharoni
10-31-2013, 03:54 PM
Well, I wasn't able to play 100 games. I ran out of time before a friend got here, but I was able to get 40 games' worth of data.
Not sure who is keeping track of all of this, but I'll post the raw data here:
7 cards:
2332113334241253324654131233332214142516
6 cards
2242233413323134124233324232243200334223
5 cards
0223223111121342541123010112340221111323
4 cards
3033101220210221101221112022223322012012

Hexmage
10-31-2013, 05:28 PM
Looking at the thread, all the values for the random distribution look fine. It's following the bell shape I was expecting, and most of the strange patterns you see are there because it's truly random. As an example, ask a friend to write down a list of 50 coin flips he made up, then have him do it again by actually flipping a coin; you'll be able to tell the real list from the fake one quite easily.

In short, the system is random. It's just that people notice a group of strange results close to each other more easily, because that rarely happens IRL, where people can't truly shuffle randomly.

Just because you're unlucky and get 20 bad hands in a row, doesn't mean the shuffler isn't random. It's just as likely to have 20 god hands in a row (and nobody would be complaining about that).
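
The coin-flip point can be checked with a short simulation. This is a rough Python sketch with made-up function names: it measures the longest streak of identical results in 50 fair flips, averaged over many trials. People inventing a "random" list tend to avoid long streaks, while real flips produce them routinely.

```python
import random

def longest_run(flips):
    """Length of the longest streak of identical consecutive results."""
    best = current = 1
    for prev, cur in zip(flips, flips[1:]):
        current = current + 1 if cur == prev else 1
        best = max(best, current)
    return best

def average_longest_run(n_flips=50, trials=20000, seed=1):
    """Average longest streak over many simulated sequences of fair flips."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        flips = [rng.random() < 0.5 for _ in range(n_flips)]
        total += longest_run(flips)
    return total / trials

avg = average_longest_run()
print(f"Average longest streak in 50 fair flips: {avg:.2f}")
```

The same reasoning applies to runs of bad hands: streaks that feel suspicious are a normal feature of a properly random sequence.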

Gwaer
10-31-2013, 05:49 PM
It's not about randomness, it's about probability. What is the probability of getting 0 resources in a 7 card hand with 24 resources in a 60 card deck? It's around 2 percent. That's not bad. What are the chances that on your next draw you will get 6 resources? That's about 0.3 percent. What are the chances that on your next draw you will have all 5 resources again? That's about 0.8 percent. So .02, .003 and .008, all in one draw sequence? More than once in less than 1000 draws?

This indicates a strong possibility that there is a problem somewhere. I realize that a lot of people regurgitate nonsense about randomness that they have heard before. That's fine. However, the more people that do this the better, whether to confirm a problem or to allay fears of one. But there are definitely some odd behaviors that seriously shouldn't be happening as often as they are. The probability of them occurring this frequently is minuscule.

I think it probably comes down to the issue that is causing you to regularly get triples of cards (.003 chance). Sample size could very much still be the issue, and I have gotten some crazy anomalous results, that's fine. That's the point of many people doing it and contributing.

issowi
10-31-2013, 05:54 PM
i haven't played any TCG's since like 1999 but now that i'm into Hex I guess i'd have to ask the question why 24 resources? Why not 22 or 23? What is the thought process and/or logic there?

Gwaer
10-31-2013, 05:57 PM
Any number of resources would do. 24 is an average number in many decks. As long as everyone is using the same number you can calculate the expected probability, and then see if it matches.

LLCoolDave
10-31-2013, 06:17 PM
It's not about randomness, it's about probability. What is the probability of getting 0 resources in a 7 card hand with 24 resources in a 60 card deck? It's around 2 percent. That's not bad. What are the chances that on your next draw you will get 6 resources? That's about 0.3 percent. What are the chances that on your next draw you will have all 5 resources again? That's about 0.8 percent. So .02, .003 and .008, all in one draw sequence? More than once in less than 1000 draws?

So what if you got 0 resources into 0 resources? Surely you would consider that to be an anomaly, too. What about 6 into 6? 1 into 1? You are clearly looking for patterns here, then pick situations that are fairly unlikely a priori and point at them afterwards and say "look, something is up here". That is NOT how you do statistical analysis. Deal a random hand of Poker with 6 players and calculate the odds of that particular outcome. Man that is unlikely to happen, but clearly it just HAS happened in our data set of 1 hand, and this rather unlikely outcome is clearly very over represented in that data set! Well, sure, that's a small sample size so let's play out another 499 hands to increase the data we have. Well, as expected, that particular hand didn't happen again, but our data still seems to show it as a 1 in 500 chance, much higher than the calculated a priori probability! Something has to be wrong!

What you are doing right now is collecting data, looking for anomalies, and then using that same data to confirm your suspicion of that anomaly. That's just not helpful at all. What you need to do is formulate a hypothesis (something along the lines of: when dealt a 7 card hand with 1 or fewer resources, the subsequent 6 card hand is likely to have fewer resources than a hypergeometric distribution would suggest), THEN collect data and run a confidence test of your hypothesis on that data set. Only then can you tell if you are just seeing patterns where there are none or if you are actually onto something. Collecting more data right now is completely useless until you have an actual hypothesis to test. Until you formulate one that isn't based on any data provided here, all the additional data provided is not helpful. You're crying wolf in a zoo.

For the initial hypothesis of "Resources are distributed according to a hypergeometric distribution with the given parameters", well, the data we collected so far seems to match that very well. Somebody else can do the proper statistical calculations on it, but looking at the spreadsheet I have no doubt it will pass at a sufficiently high confidence level for me to just ignore this topic in the future. If you want to bring in anecdotal evidence into this, I personally haven't noticed any of those triple card in opening hand issues you seem to have. I'm also pretty sure your .003 number is off as that is about the odds of drawing 3 of a PARTICULAR 4-of in your opening hand. What you need to calculate is the probability of drawing 3 of ANY of your 4-ofs in the deck in your opening hand, which is significantly higher.
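
To put rough numbers on the difference between a particular 4-of and any 4-of, here is a small Python sketch. It assumes a hypothetical deck whose 36 non-resource cards are exactly 9 playsets of 4; that deck list is an illustration, not anyone's actual deck, and the function names are made up.

```python
from math import comb

DECK, HAND, COPIES = 60, 7, 4

def p_specific_triple():
    """P(at least 3 copies of one particular 4-of in a 7-card hand)."""
    ways = sum(comb(COPIES, k) * comb(DECK - COPIES, HAND - k) for k in (3, 4))
    return ways / comb(DECK, HAND)

def p_two_specific_triples():
    """P(two particular 4-ofs both show up with at least 3 copies)."""
    rest = DECK - 2 * COPIES
    ways = 0
    for a in (3, 4):
        for b in (3, 4):
            if a + b <= HAND:
                ways += comb(COPIES, a) * comb(COPIES, b) * comb(rest, HAND - a - b)
    return ways / comb(DECK, HAND)

def p_any_triple(playsets):
    """P(at least one of `playsets` different 4-ofs shows up 3+ times).

    Inclusion-exclusion; three triples cannot fit into 7 cards,
    so the sum stops at pairs.
    """
    return playsets * p_specific_triple() - comb(playsets, 2) * p_two_specific_triples()

print(f"particular 4-of: {p_specific_triple():.4f}")
print(f"any of 9 4-ofs:  {p_any_triple(9):.4f}")
```

Under these assumptions the chance for any playset comes out roughly nine times the chance for one named card, which is the point being made above: seeing some triple in an opening hand is far less rare than seeing a specific one.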

issowi
10-31-2013, 06:24 PM
i get the concept of statistically testing a deck with the same parameters. i am just wondering why 24 was used. is 24 the "standard" resource count for most decks? if so, what's the math that makes 24 special?

Xtopher
10-31-2013, 07:00 PM
24 is 40% of 60. That's a common percentage of resources that people build their deck with. More so in 40 card decks, I think, than in 60 card, but 40% is a reasonable percentage for any deck.

Maphalux
10-31-2013, 07:20 PM
I don't believe there is a problem but I see no harm in collecting the data. If for nothing else, to point to it to show everything is just fine. And if it happens to actually reveal an issue, only good things can come of that too.

issowi
10-31-2013, 07:26 PM
24 is 40% of 60. That's a common percentage of resources that people build their deck with. More so in 40 card decks, I think, than in 60 card, but 40% is a reasonable percentage for any deck.

i guess my question is why is that reasonable? like i said i'm just getting back into the TCG world and personally i feel 40% is very high for resources. but personal feel versus known practices is much different.

i think of it from the other side: 24 resources leaves me 36 cards that i can actually use to make the deck. i like the feel of 22 resources and 38 other cards better. that leaves me room for 9 different cards (4 each) plus 2 spare slots, which i feel (like i said, feel doesn't mean practicality) makes for a better deck.

obviously i am entitled to make a deck as i please etc etc but i am much more interested in learning about why others are choosing 24 vs 22 resources. arguing semantics i know but i've seen that number a few times now and i feel like i'm missing something strategic that i just haven't been made aware of yet.

thanks for the help.

PS not trying to derail the thread i plan on running 100 or so scenarios this weekend to add to the conversation.

Gwaer
10-31-2013, 07:40 PM
The point is that either of those hands are ridiculously small odds, and they are tending to happen in groups. What groups doesn't matter. My hypothesis has been from before this thread existed that the issue is related to something I had been seeing constantly. Groups of the same cards that are statistically improbable happening with great frequency. My hypothesis currently is that whatever causes that problem may be causing this one as well. Which more data may very well disprove. I don't understand your vehemence against people collecting that data.


The number of resources in the deck is irrelevant, as long as everyone is using the same amount. The first person picked 24, so that's what I went with. Generally I run fewer lands than that in my decks too. But I have some with as many as 40 lands. It really depends on the deck.

Xtopher
10-31-2013, 08:04 PM
I don't think he's against collecting data. What he's saying is that any conclusions you make about the data you collect will not be valid because of the methodology. You're tracking 8 different stats simultaneously (0 lands through 7) that are interdependent. Even with a sample size greater than 1000 it's probable that at least one of those stats is going to be off enough that it looks kinda wonky, even if everything is working like it should.

An excellent example: http://xkcd.com/882/
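
The effect can be simulated. The Python sketch below deals hands from a perfectly fair 24-resource, 60-card deck and checks how often at least one of the 8 bins lands more than 2 standard errors from its expected count. The 2-standard-error cutoff, the batch size of 200 hands (kept small for speed) and the function names are all assumptions for illustration.

```python
import random
from math import comb, sqrt

DECK, RESOURCES, HAND = 60, 24, 7

# Expected probability of each bin (0..7 resources) under a fair shuffle.
EXPECTED = [
    comb(RESOURCES, k) * comb(DECK - RESOURCES, HAND - k) / comb(DECK, HAND)
    for k in range(HAND + 1)
]

def any_bin_looks_off(n_hands, rng, z_limit=2.0):
    """Deal n_hands fair hands; True if any bin strays beyond z_limit standard errors."""
    deck = [1] * RESOURCES + [0] * (DECK - RESOURCES)
    counts = [0] * (HAND + 1)
    for _ in range(n_hands):
        counts[sum(rng.sample(deck, HAND))] += 1
    for k, p in enumerate(EXPECTED):
        se = sqrt(n_hands * p * (1 - p))
        if abs(counts[k] - n_hands * p) > z_limit * se:
            return True
    return False

def false_alarm_rate(n_hands=200, experiments=300, seed=2):
    """Fraction of fair experiments in which at least one bin looks wonky."""
    rng = random.Random(seed)
    hits = sum(any_bin_looks_off(n_hands, rng) for _ in range(experiments))
    return hits / experiments

rate = false_alarm_rate()
print(f"Fair shuffler, at least one bin off by 2+ SE: {rate:.0%} of experiments")
```

Any single bin crosses that cutoff only about one time in twenty, but checking eight of them at once makes a "suspicious" result much more common, even though nothing is wrong with the shuffle.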

Ebynfel
10-31-2013, 08:22 PM
i guess my question is why is that reasonable? like i said i'm just getting back into the TCG world and personally i feel 40% is very high for resources. but personal feel versus known practices is much different.

i think of it from the other side: 24 resources leaves me 36 cards that i can actually use to make the deck. i like the feel of 22 resources and 38 other cards better. that leaves me room for 9 different cards (4 each) plus 2 spare slots, which i feel (like i said, feel doesn't mean practicality) makes for a better deck.

obviously i am entitled to make a deck as i please etc etc but i am much more interested in learning about why others are choosing 24 vs 22 resources. arguing semantics i know but i've seen that number a few times now and i feel like i'm missing something strategic that i just haven't been made aware of yet.

thanks for the help.

PS not trying to derail the thread i plan on running 100 or so scenarios this weekend to add to the conversation.

it's mostly about mathematical analysis and what proper ratio of resources would give you the strongest X turns. About having a higher probability of hitting X mana by turn Y. There's been a lot of math done on it, and there are a lot of factors. Big one being mana curve. Higher curve requires more resources. lower curve less. But 24 is a good balancing point that is usually pretty useful.

Basically works out to "I want to play my cards optimally, how many resources do I need" and following statistics formula to reach a conclusion. Also, tons of research, playtesting, and experience went into formulating all of this for M:tG. It can be slightly different in Hex but I would imagine the formulas used for Magic are applicable here as well.
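
As a rough illustration of the "X resources by turn Y" math, here is a Python sketch comparing resource counts. It assumes a 60-card deck and that 10 cards have been seen by turn 4 (a 7-card hand plus three draws); the exact draw rule and the function name `p_at_least` are assumptions, not something taken from Hex's rules.

```python
from math import comb

def p_at_least(k, cards_seen, resources, deck=60):
    """P(at least k resources among the first cards_seen cards of the deck)."""
    total = comb(deck, cards_seen)
    hit = sum(
        comb(resources, i) * comb(deck - resources, cards_seen - i)
        for i in range(k, min(resources, cards_seen) + 1)
    )
    return hit / total

# Assumption: 7-card opening hand plus three draws = 10 cards seen by turn 4.
for resources in (22, 24, 26):
    p = p_at_least(4, cards_seen=10, resources=resources)
    print(f"{resources} resources: P(4+ resources by turn 4) = {p:.3f}")
```

Running it for different values of `k` and `cards_seen` shows how much each extra resource buys for a given curve, which is the trade-off behind choosing 22 versus 24.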

Gwaer
10-31-2013, 08:33 PM
I don't think he's against collecting data. What he's saying is that any conclusions you make about the data you collect will not be valid because of the methodology. You're tracking 8 different stats simultaneously (0 lands through 7) that are interdependent. Even with a sample size greater than 1000 it's probable that at least one of those stats is going to be off enough that it looks kinda wonky, even if everything is working like it should.

An excellent example: http://xkcd.com/882/

We're actually tracking 1 stat. how many lands are in a hand.

The fact that the ridiculous outlying results that we should be getting once every 1000 hands are coming much more frequently and in clumps is a legitimate concern, not imagination.

Werlix
10-31-2013, 08:49 PM
We're actually tracking 1 stat. how many lands are in a hand.

The fact that the ridiculous outlying results that we should be getting once every 1000 hands are coming much more frequently and in clumps is a legitimate concern, not imagination.

https://docs.google.com/spreadsheet/ccc?key=0AgrNS7FanmrudFhmYVhaT0UzUk5rampSMF9VR0diNmc#gid=0

Doesn't this spreadsheet prove that these 'ridiculous outlying results' aren't occurring more often than they should in any statistically significant way?

DeusPhasmatis
10-31-2013, 08:58 PM
What you are doing right now is collecting data, looking for anomalies, and then using that same data to confirm your suspicion of that anomaly. That's just not helpful at all. What you need to do is formulate a hypothesis (Something along the lines of: When dealt a 7 card hand with 1 or less resources, the subsequent 6 card hand is likely to have less resources than a hypergeometric distribution would suggest), THEN collect data and run a confidence test of your hypothesis on that data set. Only then can you tell if you are just seeing patterns where there are none or if you are actually onto something. Collecting more data right now is completely useless until you have an actual hypothesis to test. Until you formulate one that isn't based on any data provided here all the additional data provided is not helpful. You're crying wolf in a zoo.

You don't need to have the hypothesis before the data. As long as you're capable when doing the math, and have sufficient data, you can evaluate the comparative strength of multiple hypotheses at any time.


I don't think he's against collecting data. What he's saying is that any conclusions you make about the data you collect will not be valid because of the methodology. You're tracking 8 different stats simultaneously (0 lands through 7) that are interdependent of each other. Even with a sample size greater than 1000 it's probable that at least one of those stats is going to be off enough that it looks kinda wonky, even if everything is working like it should.

There's only one Random Variable being tracked, and that's the number of resources in the hand. The problem is that you need a lot of samples (something like 10,000) before you can be confident in saying that the divergence is something other than expected variance.
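
For reference, the expected distribution of that one random variable is hypergeometric and easy to compute. A minimal sketch, assuming a 60-card deck with 24 resources and a 7-card hand (the defaults and the function name hand_distribution are just illustrative choices):

```python
from math import comb

def hand_distribution(resources=24, deck_size=60, hand_size=7):
    """Return [P(0 resources), P(1 resource), ...] for a uniformly shuffled deck."""
    total = comb(deck_size, hand_size)
    return [
        comb(resources, k) * comb(deck_size - resources, hand_size - k) / total
        for k in range(hand_size + 1)
    ]

probs = hand_distribution()
for k, p in enumerate(probs):
    print(f"{k} resources: {p:.4%}  (about {p * 1000:.1f} per 1000 hands)")
```

Multiplying each probability by the number of recorded hands gives the counts a spreadsheet of samples should be compared against.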

Gwaer
10-31-2013, 09:00 PM
It hasn't actually been updated yet. I've run 200 more since then. There's actually two different discussions going on, 1 is the primary question being asked, "are the random hands obeying the statistical rules that would be expected" The answer seems to be more or less yes.

However, that does not mean that the other possible issue that I think exists doesn't. "Is there a bug that causes the shuffler to spit out the same cards more often than it should?" More data will help to prove that these have just been weird anomalies, which is completely possible. That's why I'm spending hours doing this stuff.

ZillahEnoch
10-31-2013, 09:17 PM
I'm entirely with you Gwaer.
However, you will have to create a completely different batch of data for testing your second hypothesis (you're not keeping track of the spiting/batching of cards in the current one).

Gwaer
10-31-2013, 09:30 PM
I started keeping track of that with the last 200. I do think the current set will be useful though.

UDareUTake
10-31-2013, 10:05 PM
Just updated the spreadsheet, for those that want to take a look at the sequence of the data, go to the Raw Data spreadsheet

beepharoni
10-31-2013, 10:07 PM
If there is still a need for it, I am not working tomorrow and I can try to add some more data by playing another 50-100 games where I just mull and track the data.

Xtopher
10-31-2013, 11:02 PM
The reason I'm saying you're tracking 8 stats simultaneously is that if you wanted to set up a level of confidence, like for example, 90% certain the shuffler is correct for 0 lands, 90% for 1 land, etc., when you figure the probability it comes out that there's a 57% chance that at least one of those 8 stats will fail the test. With 95% confidence it goes down to 34%, but that's still pretty high. If you go with 99% confidence, that would take a lot of data, but you'd only end up with a false positive 8% of the time.
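
Those percentages follow from treating the 8 checks as independent, which is a simplifying assumption (the counts are actually linked, since they must add up to the number of hands). A quick sketch of the arithmetic, with a made-up function name:

```python
def chance_of_false_alarm(confidence, tests):
    """P(at least one of `tests` independent checks fails by chance alone)."""
    return 1 - confidence ** tests

for confidence in (0.90, 0.95, 0.99):
    rate = chance_of_false_alarm(confidence, 8)
    print(f"{confidence:.0%} confidence, 8 stats: {rate:.0%} chance of a false alarm")
```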

I spent a decade dealing with this issue over on the MTGO forums and I'm trying to anticipate for you the objections the math/stats people are going to have with your methodology. I'm trying to be helpful, but I don't think it's being taken that way. It's cool, though, I'm not that passionate about this issue.

Zarien
10-31-2013, 11:20 PM
The reason I'm saying you're tracking 8 stats simultaneously is that if you wanted to set up a level of confidence, like for example, 90% certain the shuffler is correct for 0 lands, 90% for 1 land, etc., when you figure the probability it comes out that there's a 57% chance that at least one of those 8 stats will fail the test. With 95% confidence it goes down to 34%, but that's still pretty high. If you go with 99% confidence, that would take a lot of data, but you'd only end up with a false positive 8% of the time.

I spent a decade dealing with this issue over on the MTGO forums and I'm trying to anticipate for you the objections the math/stats people are going to have with your methodology. I'm trying to be helpful, but I don't think it's being taken that way. It's cool, though, I'm not that passionate about this issue.

I don't think he was taking your input in a bad way. I think it could definitely have been worded a little differently to help it be received better. But overall I didn't see anything wrong with what you were saying. I think at this early stage, he and everyone else mainly want to stick with their current test data for now, to verify whether there is even something that seems odd. And then if something actually pops up, we could bring attention to it or set up more refined testing parameters to hone our results.

Gwaer
10-31-2013, 11:23 PM
Yea, obviously your hatred for MTGO discussions is coloring your opinion of this discussion, you're missing the point. Also this is a new game, with new code that very easily could have mistakes. Moreover the mulligan mechanic may be different, and part of the problem so tracking a relationship between anomalous opening hands and mulligans makes sense. If you don't like the discussion don't take part, but no harm can come from gathering the data and doing the numbers.

I don't want you to think that this is the end all be all of tests, it's just the first one, it's easy, it's already being done, and I don't see any reason to stop so early. It can theoretically make me lose interest in doing the others.

DeusPhasmatis
10-31-2013, 11:51 PM
The reason I'm saying you're tracking 8 stats simultaneously is that if you wanted to set up a level of confidence, like for example, 90% certain the shuffler is correct for 0 lands, 90% for 1 land, etc., when you figure the probability it comes out that there's a 57% chance that at least one of those 8 stats will fail the test. With 95% confidence it goes down to 34%, but that's still pretty high. If you go with 99% confidence, that would take a lot of data, but you'd only end up with a false positive 8% of the time.

I'm pretty sure there's a way to analyze the spread of outcomes against the expected distribution using variance and standard deviation.

Xtopher
11-01-2013, 12:04 AM
I've consistently said it's good to record data, the more the better. You guys are doing fine and you seem to understand the volume of hands you'll have to record so I'm sure you'll have something useful when you're done.

Ravallian
11-01-2013, 04:40 AM
I think this is a great community effort to compile some meaningful data for verification/vindication, but surely the uber server engineer/admin somewhere has all the raw data, from which CZE could make an even better judgement of the shuffle stats? Having this data released somewhere at intervals would help quell these issues, I believe, as over the life of the game I'm sure I'll see many more threads on this subject :p

LLCoolDave
11-01-2013, 06:37 AM
The point is that either of those hands are ridiculously small odds, and they are tending to happen in groups. What groups doesn't matter. My hypothesis has been from before this thread existed that the issue is related to something I had been seeing constantly. Groups of the same cards that are statistically improbable happening with great frequency.

That is not a hypothesis in a statistical sense. If you change the granularity with which you look at your opening hands, you quickly come to very wrong conclusions. 3 resources 4 spells? Yeah, that's fairly likely, nothing to see here, next hand. 3 of the same resource and 3 gas trolls! My god that's unlikely! There's only 4 gas trolls in my deck, this particular hand has less than 0.3% of a chance of showing up! Except, you know, if you calculate the odds for getting that exact first hand you had you'd find it to be pretty unlikely as well, once you stop intuitively just lumping it as resources and spells. Individually unlikely events occur in groups in this data set ALL the time, because once you look at all your hands with the type of granularity you apply to the ones that stick out to your personal pattern detector, ALL of your hands are fairly unlikely.

What is happening here is that there are certain types of hands that respond strongly to our pattern recognition as a human being, while others don't. The 30ish % probability of getting 3 resources 4 spells can not be compared to the <0.3% probability of getting a triple Gas Troll opening because these probabilities are pulled from different distributions! While on the surface they both seem to answer the question of "How likely is this opening hand?", in reality they use very different definitions of "this hand". This is a very easy mistake to make.

It is very easy to look at a truly random sample and find a pattern in it. It's something humans are very strong at. Unfortunately, we are also very good at being strongly biased by these patterns we find. We are notoriously bad random number generators (run the coin flip experiment yourself, if you are not convinced) and we are also notoriously bad at understanding the sheer scale at which the law of large numbers operates for most probability distributions. (We are also pretty bad at intuitively understanding what it even says in the first place, see all the different variations of the gambler's fallacy.)

What I see you doing here is look at the data, find a couple things that stick out as odd to you, as a human being, and then try to find a justification afterwards for why what seems odd was actually unlikely and use that to prove that what was happening was indeed off from a statistical point of view. What you need to do is give a mathematical definition of "those hands" and "happening in groups", and then we can try to figure out what is really going on. As long as these definitions stay subjective, you're just massively biased and will always find the pattern you are looking for. At this point the probabilities we look at will be small enough that we need a pretty massive sample size to come to any good conclusions. If we just get another 500 hands and still incorporate our initially biased sample that led to our hypothesis, our full sample set will still be strongly biased. That's why I highly advocate collecting unbiased new data after you have set your hypothesis to check in this situation.

Any reasonably sized sample will have outliers and oddities. Like 7 resources into 6 resources or some such. This individual outlier may tingle your pattern detection senses because, taken on its own, it is very unlikely to occur (that's why it is an outlier in the first place). What you usually fail to take into account is all the similarly unlikely events that have failed to occur in the data set. If you combine all the situations that would strike you as being odd, count their occurrences in the data, and compare that to the calculated probability of SOMETHING odd happening, you'll very likely find that, even in the data set of the size we have right now, things behave very much like we'd predict them to.

As long as you don't have a proper, tangible definition of what ODDITY even means when it comes to opening hands, you'll always be able to twist the data to show support for your theory. Although collecting more data is fine, until you step up the methodology of analyzing it afterwards you'll still just be the boy that cried wolf at the zoo.

DeusPhasmatis
11-01-2013, 07:20 AM
Raw data generated fairly isn't biased. If you're correct and there is no bias in the random number generator, then the samples taken are truly random and regardless of any seeming pattern, they do not bias any data they are included in. Even more strongly, discarding said data is actually bad form because it is a non-fair sample selection mechanism. You can't throw data out just because it looks biased.

In fact, even if the random number generator is broken, as long as data has a fair chance of being included (as in, if two hands have an equal probability of being generated, then they should have an equal probability of ending up in the sample, i.e. no biased sampling) then the data is valid.

Another way to put it is: even if the initial sample truly is biased (because of random chance landing on an outlier), under sufficient additional data this bias will trend towards zero. Otherwise known as the Law of Large Numbers (http://en.wikipedia.org/wiki/Law_of_large_numbers). You don't get to discard data just because you don't like the conclusions people are drawing from it.

Xtopher
11-01-2013, 07:47 AM
As long as the analysis is done correctly, there's no problem with generating the data before a solid hypothesis. It's just a matter of making sure your sample size is sufficiently large enough.

The concern is that if you're tracking 8 (in this case, 32, actually) different categories simultaneously, the chances are very high that at least one of them will look out of whack when in fact there's nothing wrong.

Let's say you collect 2000 samples in this shotgun method and you observe that there are many more 3-resource hands after one mulligan than there should be. However, you've got 32 categories. The chances are extremely high (almost 100%) that at least one category is going to be off significantly, even if the shuffler is working fine.
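
To put rough numbers on "almost 100%", and to show one standard way of compensating (a Bonferroni correction, which is a technique brought in here for illustration and not something proposed in the thread), here is a small sketch. It again assumes the 32 categories behave like independent tests, which is only approximately true.

```python
def familywise_error(alpha, tests):
    """P(at least one false alarm) across `tests` independent checks at level alpha."""
    return 1 - (1 - alpha) ** tests

def bonferroni_alpha(alpha, tests):
    """Stricter per-category level that keeps the overall false-alarm rate near alpha."""
    return alpha / tests

for alpha in (0.10, 0.05):
    plain = familywise_error(alpha, 32)
    corrected = familywise_error(bonferroni_alpha(alpha, 32), 32)
    print(f"alpha={alpha}: uncorrected {plain:.1%}, Bonferroni-corrected {corrected:.1%}")
```

The price of the correction is that each individual category needs a much bigger deviation, and so more data, before it counts as suspicious.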

Nothing wrong with collecting the data, you just have to be very careful when you do an analysis of it.

DeusPhasmatis
11-01-2013, 08:14 AM
The concern is that if you're tracking 8 (in this case, 32, actually) different categories simultaneously, the chances are very high that at least one of them will look out of whack when in fact there's nothing wrong.

There is one category (number of resources in a hand). There is a known probability distribution for that Random Variable (http://en.wikipedia.org/wiki/Random_variable) that the samples can be compared to. And the assorted "categories" (actually the different values the random variable can take on) are related phenomena that cannot be separated from each other (no more than the number of heads flipped on a coin can be separated from the number of tails).

Following up on the coin example, if the number of heads is wrong, the number of tails must also be wrong because your total probability must sum to 1 (i.e. if you get 40% heads then you must get 60% tails). Extrapolated to Random Variables with more than 2 outcomes, this means that at least one other outcome must be wrong, and statistically you'd expect a balanced spread of wrongness on all the other outcomes (on average, about half of them would also be wrong).

dwebber88
11-01-2013, 08:24 AM
i watched a lot of streams and the thing i think you need to test for is dual color decks.

from my observations i've seen that actually drawing 1/1 1/2 or 2/2 resources is significantly harder than drawing 2, 3 or 4 resources of the same color (0/2 0/3 or 0/4).

I have started a thread to test this: Here (http://forums.cryptozoic.com/showthread.php?t=29554)

Xtopher
11-01-2013, 08:46 AM
I understand what you're saying Deus. What I'm not seeing, I guess, is how to statistically analyze the data. For example, if it was just a matter of testing how often a player gets 1 resource in his opening hand, that's a fairly easy analysis to determine a confidence level for. I have no idea, though, how to do that for the entire distribution of opening hand resource possibilities. What's the procedure to test whether the shuffler is working right for a confidence level of 95%, for example? My degree is in Mathematics, but I only took one stats class, so this is outside my area of expertise.
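
One common answer to that question is Pearson's chi-squared goodness-of-fit test, which compares the whole observed distribution to the expected one in a single test. The sketch below is a hedged illustration: it assumes a 60-card deck with 24 resources, the observed counts are invented for the example, and the function names are made up. The critical values are the standard upper 5% points of the chi-squared distribution.

```python
from math import comb

# Upper 5% critical values of the chi-squared distribution, keyed by degrees of freedom.
CHI2_CRIT_95 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070, 6: 12.592, 7: 14.067}

def expected_probs(resources=24, deck_size=60, hand_size=7):
    """Hypergeometric probabilities for 0..hand_size resources in the opening hand."""
    total = comb(deck_size, hand_size)
    return [comb(resources, k) * comb(deck_size - resources, hand_size - k) / total
            for k in range(hand_size + 1)]

def chi_square_test(observed, probs, min_expected=5.0):
    """Return (statistic, degrees of freedom, rejected at 95%?).

    Neighbouring bins are pooled until each expects at least `min_expected` hands,
    because the test is unreliable when expected counts are tiny.
    """
    n = sum(observed)
    bins = []  # each entry is [observed count, expected count]
    for obs, p in zip(observed, probs):
        if bins and bins[-1][1] < min_expected:
            bins[-1][0] += obs
            bins[-1][1] += n * p
        else:
            bins.append([obs, n * p])
    if len(bins) > 1 and bins[-1][1] < min_expected:
        last = bins.pop()
        bins[-1][0] += last[0]
        bins[-1][1] += last[1]
    df = len(bins) - 1
    if df < 1:
        raise ValueError("not enough data to form at least two bins")
    stat = sum((o - e) ** 2 / e for o, e in bins)
    return stat, df, stat > CHI2_CRIT_95[df]

# Invented counts of hands with 0..7 resources out of 500 hands, for illustration only.
observed = [12, 58, 130, 160, 97, 35, 6, 2]
stat, df, rejected = chi_square_test(observed, expected_probs())
print(f"chi-squared = {stat:.2f}, df = {df}, reject at 95%: {rejected}")
```

Because this is one test over the whole distribution, it sidesteps the problem of checking each resource count separately.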

Slish
11-01-2013, 09:28 AM
I've read most of the posts in this thread.

As a former poker player (where these kinds of stats are very regular), these kinds of discussions always amuse me :)

It's very simple. You need a VERY large sample size to have some certainty in your test results. I'm talking about 10k hands AT LEAST in this case.

I've had runs in poker lasting MONTHS where I was off statistically, over tens of thousands of hands. This is NORMAL.
It just happens, and only with very large sample sizes do the percentages get close to where they should be.

--

Another remark I would like to make is: "Psychology".
The human brain has the 'amazing' quality and perception to try to see patterns. Especially when they affect you in a bad way.
The normal case: you will notice and remember much more that you've had 10 bad hands in a row.
You will not really remember that you had 40 normal hands in a row, since it's 'normal'. While 40 perfect hands (probability-wise) in a row is in fact very unusual!


--

Also I saw the spreadsheet made with most tests together and the average shown of these tests. The % seems pretty close to the probability already for such a small sample size. Nothing wrong with the RNG it seems :)

FLOWDANGO
11-01-2013, 11:34 AM
I will give my insight into the games I played yesterday. I played several games, maybe around 15 to 20 games.

During that play time I would say that 25% of the time I had to mulligan down to 1 card and request a rematch because of this. I would either have no resources, all resources or 1 resource and all non.

So I felt it was happening more often than it should with a computer generated shuffling system. So that is my 2 cents, I did notice something but who knows what it could be. Perhaps something they can look at.

Gwaer
11-01-2013, 12:08 PM
You guys are hilarious. We're talking about gathering a ton of data to overcome the possibility that the issue is a small sample size. Currently there's a major problem in the numbers. Something that should happen less than 80 times in 100,000 hands has happened twice in 500.

The guy going on about every hand being improbable is just a moron. That's not how it works. Please check yourselves at the thread. This isn't online poker or mtgo, this is a new client. Moreover, random is actually quite difficult for computers to do, and depending on their implementation the bug can exist between the shuffler and the client, if it's just not sampling frequently enough for example. I realize the mantras that you're spurting at me have been very helpful in washing the ignorant masses of their take on seeing examples and calling them trends. At this point however the data is still saying trend. So your concerns are noted. When I come out declaring that something is broken, then take a look at the sample size I used and pore over the data and please do point out any flaws you find. At this point it's just a quirk that could get ironed out with more data.

Kersed
11-01-2013, 12:29 PM
Seems fine to me..

Willd
11-01-2013, 12:38 PM
Claiming that something happening twice in 500 iterations is a "major problem" when the true rate is 80 in 100,000 shows a pretty flawed understanding of statistics. That isn't even within an 80% confidence level that there is a problem. To get to a 95% level, which is the normal level before you would start to think there was potentially a problem you would need to see that same rate over 1500 iterations. Even then that means you would expect to see that ratio once every 20 times you ran 1500 iterations.
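
The 500 and 1500 figures are consistent with a normal-approximation (Wald) confidence interval around the observed rate of 2 in 500. That reading is an assumption, since the post doesn't name a method. The sketch below reproduces that check and, for comparison, computes an exact binomial tail, because the normal approximation is rough with only two events. Neither calculation accounts for the event having been picked out after looking at the data.

```python
from math import comb, sqrt
from statistics import NormalDist

def wald_interval(events, trials, confidence):
    """Normal-approximation confidence interval for an observed rate."""
    rate = events / trials
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * sqrt(rate * (1 - rate) / trials)
    return rate - half_width, rate + half_width

def binomial_tail(events, trials, true_rate):
    """Exact P(at least `events` occurrences in `trials`) at the claimed true rate."""
    below = sum(comb(trials, k) * true_rate ** k * (1 - true_rate) ** (trials - k)
                for k in range(events))
    return 1 - below

TRUE_RATE = 80 / 100_000  # the rate quoted in the thread

for events, trials, confidence in ((2, 500, 0.80), (6, 1500, 0.95)):
    low, high = wald_interval(events, trials, confidence)
    verdict = "inside" if low <= TRUE_RATE <= high else "outside"
    print(f"{events}/{trials} at {confidence:.0%}: interval {low:.5f}..{high:.5f}, "
          f"true rate is {verdict}")

print(f"exact P(2 or more in 500) = {binomial_tail(2, 500, TRUE_RATE):.4f}")
```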

Similar to Slish I come from a poker background so have heard talk about shufflers being non-random far more than I care to remember. Extreme events happen and humans are hard-wired to remember them. Even just thinking about this from a pure programming standpoint it seems like it would be far more difficult to code a shuffler to throw up the anomalies people think they are seeing than it would be to code a truly (pseudo) random shuffler. Unless Hex are doing something fundamentally flawed with their random number generation then it seems much more likely that these anomalies are just that, anomalies.

All that said the act of having the community collecting a large sample and being able to point at it is very useful and so this effort is still definitely worthwhile, even if I think the chance of there being a statistically relevant problem is minimal. My experience with poker tells me that some people will never be convinced but it would definitely help ease some people's minds to have it.

Blargrag
11-01-2013, 01:47 PM
I think the problem here might be that most of us come from a paper TCG background. We are used to weaving land into the deck and therefore never experiencing a truly random distribution. In the digital space flood/screw is going to be more likely than what we are used to. This is perhaps the only weakness I have seen with the digital environment.

Though I have no idea how CZE runs their shuffler, I would love to see one that splits the deck into resources and other cards, randomizes those cards, and then weaves them together. After that, simulate a few 'real world' shuffles and we are good to go. Screw/flood would be mitigated because the draws would not be truly random, but simulate the physical card experience.

Anyway....my 2...

Willd
11-01-2013, 01:53 PM
Fwiw land weaving is cheating in paper TCGs (or at least MTG, I assume others). If you were deemed to deliberately do anything other than completely randomise your deck in any official tournament you would be immediately disqualified.

Damascus
11-01-2013, 02:04 PM
Cool thread. While you guys are researching this I figured I might throw in my 2c with regard to something I may have noticed.

There is no way for me to confirm this without doing a lot of work, but I've made at least a dozen decks and I seem to always get a very non-random looking hand the first time I play a new deck.

For example - I make a blue deck, I join a pvp game. The first hand I draw is 3 flock of seagulls, 2 sabotages, 2 thunderbirds.
I make a mono-green deck; the first draw I get is 3 chlorophyllia, 3 pack raptors, 1 resource.
I make a red deck; the first draw I get is 5 resources, 2 heatwaves. etc.
I seem to get a lot of duplicate card-draws.

I'm totally aware this could easily not be anything meaningful, and I actually have no problem with the card draw in all other situations - I just noticed this specifically when playing most of my decks for the first time. Maybe someone who is inclined to test this could tell me if there is anything to it or if I've just been unlucky in these instances?

Blargrag
11-01-2013, 02:29 PM
Fwiw land weaving is cheating in paper TCGs (or at least MTG, I assume others). If you were deemed to deliberately do anything other than completely randomise your deck in any official tournament you would be immediately disqualified.

Link?

As long as there is sufficient shuffling after the weave there is no problem...

Does this 'stack' the deck?
Yeah a bit, shuffling is not perfect.

Should this be illegal?
It's not in MtG as far as I can tell, and if it makes more games playable I don't see a reason why it should be.

And no, I don't want to have the playability vs. better deck construction argument. I am fine with whatever is implemented.

Willd
11-01-2013, 02:35 PM
http://forums.mtgsalvation.com/showpost.php?p=9601295&postcount=7

That post quotes from the official MTG rules (and is part of a thread where someone has asked if mana weaving is allowed). The most relevant line is "Intentionally stacking a deck with the intent to take advantage of an insufficient shuffle is defined as Cheating — Manipulation of Game Materials."

Mana weaving before you do your shuffle is allowed but it is either
a) A waste of time as it has literally no effect or
b) Cheating
depending on whether or not your subsequent shuffle is thorough enough. A shuffle is only thorough enough if the state the cards were in before the first shuffle has no discernible effect on the state they are in after the shuffle, ie the mana weave can not "stack" the deck in any way.

Blargrag
11-01-2013, 03:03 PM
http://forums.mtgsalvation.com/showpost.php?p=9601295&postcount=7

That post quotes from the official MTG rules (and is part of a thread where someone has asked if mana weaving is allowed). The most relevant line is "Intentionally stacking a deck with the intent to take advantage of an insufficient shuffle is defined as Cheating — Manipulation of Game Materials."

Mana weaving before you do your shuffle is allowed but it is either
a) A waste of time as it has literally no effect or
b) Cheating
depending on whether or not your subsequent shuffle is thorough enough. A shuffle is only thorough enough if the state the cards were in before the first shuffle has no discernible effect on the state they are in after the shuffle, ie the mana weave can not "stack" the deck in any way.

Hrm...you must be a much better shuffler than me. Even after the 'perfect' 7 shuffles to randomize, the cards still seem to be influenced by their starting positions...though this is likely my perception and has little to do with reality. I guess I'll just continue to 'cheat' 'cause none of my friends seem to care.

also this:
http://en.wikipedia.org/wiki/Shuffling
plus one tidbit of it to highlight:
"..seven shuffles of a new deck leaves an 81% probability of winning New Age Solitaire where the probability is 50% with a uniform random deck.[8][3]"

None of us have ever truly experienced randomness, just things that approximate it well enough.

Regardless my point is as follows:
In the digital space we can get closer to truly random than in the physical space. Is there actually a noticeable difference? I think so and that is what freaks people out about the MtGO and the Hex shufflers.

Willd
11-01-2013, 03:17 PM
I think the biggest difference between physical and digital space is often the sample size. You tend to be able to do things more quickly and play more total games with a digital game than a physical game, which means you see the anomalies more often (in total, not proportion) and because we tend to remember the anomalies and not normal occurrences we get it into our heads that it happens more often proportionally. This is obviously a much bigger factor in poker than TCGs but I think it is still a factor.

As for the concept of 7 shuffles being "perfect", that is just the first number at which every ordering of the cards is possible; that doesn't mean each possible ordering has the same likelihood, and as such it is obviously not truly random. You are right that no physical shuffle will ever be truly random but you can go a pretty good way towards it by mixing different types of shuffling (riffle, weave, overhand) and if you do that sufficiently there shouldn't be any noticeable difference. Of course in casual play people do shortcut a lot (mana weaving, pile shuffle), and if they aren't thorough enough subsequently there might be a noticeable difference if that's what you're used to.

Blargrag
11-01-2013, 03:39 PM
You are probably right. My riffle sucks with sleeved cards :-)

My general strategy is draft, build deck leaving ordered piles, shuffle piles individually, weave the lands (just once - never between games), shuffle continuously over a cigarette. Between games I shuffle the cards that see play separately and then shuffle them into my deck.

I don't see this as cheating, but I do have the superstition that the weave helps my curve. Though as you pointed out, my sample size is depressingly low.

Willd -
Have you been watching the streams? It does seem to me that they draw into doubles/triples an awkward amount. I wouldn't be shocked if there is a bug there (or again, just shitty human perception acting up). Also, it looks like the Inspiration Engine RNG doesn't seed until the game state is updated (new phase/new card played) leading to the endless stream of dwarven turbine that we have seen a few times. Perhaps a similar bug is in the card draw script?

Willd
11-01-2013, 03:55 PM
Having done some reading on this (as a programmer and math nerd I find it very interesting) it's actually a lot easier than I anticipated to do a shuffle poorly. Depending on the PRNG they are using it could easily result in biased shuffles. However I'd still be very surprised if it resulted in the specific type of bias people are noticing.

That said, the specific bug people have been seeing with certain escalation cards always ending up at the bottom of the deck after being played could suggest there is something fundamentally flawed about the way they are handling shuffling/drawing of cards.

I'd actually be very interested in knowing if the full order of the deck is stored anywhere or if they just randomise from the cards remaining in the deck for each draw step. This bug implies it's probably the former, which conceptually makes more sense but does make it easier to get badly wrong. It also opens more potential ways to determine what the next draw is going to be - both knowing the PRNG method and seed (there would only be a single seed for the whole shuffle) and snooping the way it's stored allow you to know the coming draws. Randomising from the remaining cards at each draw step means there is a new random number generated each time, making cracking the PRNG more difficult, and there is no data structure to snoop.
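
As an illustration of how easy it is to get a shuffle subtly wrong, here is a sketch comparing the standard Fisher-Yates shuffle with a well-known flawed variant that swaps each position with ANY position. Nothing here is known about how Hex actually shuffles; the function names are made up, and the 3-card deck is just small enough to enumerate every possible sequence of random choices exactly.

```python
import random
from collections import Counter
from itertools import product

def fisher_yates(deck, rng=random):
    """Unbiased in-place shuffle: position i swaps with a random position in i..end."""
    for i in range(len(deck) - 1):
        j = rng.randrange(i, len(deck))
        deck[i], deck[j] = deck[j], deck[i]
    return deck

def count_outcomes(n, choice_ranges):
    """Apply every possible sequence of swap choices and count the resulting orderings."""
    counts = Counter()
    for choices in product(*choice_ranges):
        deck = list(range(n))
        for i, j in enumerate(choices):
            deck[i], deck[j] = deck[j], deck[i]
        counts[tuple(deck)] += 1
    return counts

n = 3
# Flawed variant: every position may swap with any of the n positions (n**n paths).
naive = count_outcomes(n, [range(n)] * n)
# Fisher-Yates: position i may only swap with positions i..n-1 (n! paths).
correct = count_outcomes(n, [range(i, n) for i in range(n)])

print("naive shuffle:", dict(naive))
print("Fisher-Yates: ", dict(correct))
print("random example:", fisher_yates(list(range(10))))
```

The flawed version has 27 equally likely paths spread over 6 orderings, so some orderings must come up more often than others, while Fisher-Yates reaches each ordering exactly once.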

BenRGamer
11-01-2013, 04:12 PM
Ok what people should do is keep track of how many opening hands have how many resources in them. I'm guessing in the long run they'll see its statistically average.

EDIT: You can even just start bot games to see your opening hand/ first mulligan to do this

Er, you know there's an opening hand tester in the Deck Stats area of the deck builder, right?

Storm_Fireblade
11-01-2013, 05:53 PM
Er, you know there's an opening hand tester in the Deck Stats area of the deck builder, right?

Omg, I didn't see that myself. Thanks for the hint :)

Banquetto
11-01-2013, 07:48 PM
Er, you know there's an opening hand tester in the Deck Stats area of the deck builder, right?

Are you 100% confident that the opening hand tester uses the random number generator in exactly the same way as an actual game shuffle? :cool:

Xtopher
11-01-2013, 08:13 PM
Are you 100% confident that the opening hand tester uses the random number generator in exactly the same way as an actual game shuffle? :cool:
lol. Total deja vu.

Or maybe the shuffler only has trouble randomizing for left-handed people.

Gwaer
11-01-2013, 10:13 PM
As I stated before. The bug could be unrelated to the actual shuffler and instead between there and the client in game. I knew about the hand viewer but have not been using it and would prefer if people didn't but that's not something I can control.

Mahes
11-02-2013, 07:53 AM
I am wondering whether, instead of shuffling and then dealing the cards, the dealer just gives a player the next 6 cards, then the next 5 cards, and so on. The whole problem might be as simple as not reshuffling the deck before dealing again when a player takes a mulligan.

rcl
11-02-2013, 11:19 AM
I am wondering whether, instead of shuffling and then dealing the cards, the dealer just gives a player the next 6 cards, then the next 5 cards, and so on. The whole problem might be as simple as not reshuffling the deck before dealing again when a player takes a mulligan.

This thread has basically accomplished nothing except to demonstrate that some folks are probably cheating in IRL TCGs and that tin foil hats are still in style...

...until this very good point. It's important to understand how a mulligan works, and when "reshuffles" happen in general. If I draw a hand with 7 resources and mulligan, and those cards go to the bottom of my deck, that is a huge deal for a variety of reasons.

I doubt that cards are represented in a specific shuffled order at any point on the server. You have a deck of n cards, and drawing a card draws card #[rand() modulo n]. But an official CZE confirmation of that would be much appreciated.

Willd
11-02-2013, 02:07 PM
I am actually pretty confident that they are stored in their shuffled order on the server. For one thing the bug with the escalation card always being at the bottom of the deck doesn't make any sense if the card is chosen at random from the remaining deck for each draw. Secondly there are cases where the player will know the location of some of the cards in the deck, in which case that needs to be stored and it makes sense that the whole deck would be stored to allow that sort of situation.

Mahes
11-02-2013, 04:57 PM
The thing is this:

When I get my first 7 cards and take a mulligan what exactly happens next? Do the cards go to the bottom of the deck and then I am handed the next 6 cards without any shuffling?

Do the 7 cards get added into the deck and I just get the next six cards that would have been next with a potential repeat from the random reinserted card?

Or does the system place the cards back into the deck, then shuffle the entire deck and then hand you 6 cards from the top?

What happens after the game has started does not have much bearing on this portion of the game. This is what I would like to know.

The first part could be tested simply by writing down the 7 cards, taking a mulligan, and then playing through the deck to see whether the last 7 cards are the ones you originally drew. Allow for small alterations if you play cards that go back into the deck.
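
The three possibilities listed above can be written out as a small sketch. None of this is known to be how the server behaves; the function names are made up for the illustration and the deck is just the numbers 0 to 59 so positions are easy to follow.

```python
import random

def mulligan_bottom(deck, hand_size):
    """Model 1: old hand goes to the bottom, next cards are dealt, no shuffle."""
    old, rest = deck[:hand_size], deck[hand_size:]
    new_deck = rest + old
    return new_deck[:hand_size - 1], new_deck

def mulligan_reinsert(deck, hand_size, rng):
    """Model 2: old hand is reinserted at random spots, the rest keeps its order."""
    old, rest = deck[:hand_size], list(deck[hand_size:])
    for card in old:
        rest.insert(rng.randrange(len(rest) + 1), card)
    return rest[:hand_size - 1], rest

def mulligan_reshuffle(deck, hand_size, rng):
    """Model 3: old hand goes back and the entire deck is shuffled again."""
    new_deck = list(deck)
    rng.shuffle(new_deck)
    return new_deck[:hand_size - 1], new_deck

deck = list(range(60))  # card i starts at position i; the first 7 are the opening hand
hand1, deck1 = mulligan_bottom(deck, 7)
hand2, deck2 = mulligan_reinsert(deck, 7, random.Random(1))
hand3, deck3 = mulligan_reshuffle(deck, 7, random.Random(1))
```

Under model 1 the original seven cards end up as the last seven of the deck, which is exactly what the write-it-down test would detect.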

Ditsch
11-02-2013, 08:05 PM
After playing for a few days, I found there are some problems with the randomizer.

The coin flip: around 75%-80% of the time I go first rather than my opponent, so it is very obvious there is a problem.

Card stacking: if you play 4 copies of the same card, you very often get 3 of them in the first 20 cards. That happens way too often.

Escalating cards also seem to show up more often than they should.

Mana flood is a bit rarer than in MTG Online.

I have no data on "color" screw; I have only tried mono decks.

We need a better mulligan rule; going down to 6 cards for a mulligan does not really help that often. At the least, a free mulligan is needed.

Starting hand resources: I play 28 resources, which is close to 50%, and I have way too many starting hands with only 1 land. I lose 5 to 10% of my matches purely to mana screw even with that high a resource count; out of 200 matches I lost 1 or 2 to mana flood.

Alas, I have no statistical data, but I am pretty sure something is wrong/different with the "random"; at the least it is totally different from the MTG Online one. Given the state the randomizer is in and how the mulligan rule works, I would not recommend playing Hex at a competitive PvP level, as it will never become an e-sport because of those problems.

Thank you for generating data sets. When I have time next week I might check them against what I found out/think.

DeusPhasmatis
11-02-2013, 08:56 PM
I understand what you're saying Deus. What I'm not seeing, I guess, is how to statistically analyze the data. For example, if it was just a matter of testing how often a player gets 1 resource in his opening hand, that's a fairly easy analysis to determine a confidence level for. I have no idea, though, how to do that for the entire distribution of opening hand resource possibilities. What's the procedure to test whether the shuffler is working right for a confidence level of 95%, for example? My degree is in Mathematics, but I only took one stats class, so this is outside my area of expertise.

I'm a Computer Science major, so I only have a basic knowledge of statistics, but the Z-test (http://en.wikipedia.org/wiki/Z-test) and the F-test (http://en.wikipedia.org/wiki/F-test_of_equality_of_variances) look like what you want (though they don't appear to have explicit confidences).

Bossett
11-02-2013, 09:15 PM
If you want to get into the math, you just need to model the probabilities (http://hexmetrics.ni.tl/tools/r/d60x24t30f7a0e0#results), which will give you a bell curve (graph % chance by #). Then, you just need data - plot out the resource count from a hundred or so games.

Once you've got both - you check goodness of fit using something like Pearson's chi^2: http://en.wikipedia.org/wiki/Pearson%27s_chi-squared_test#Goodness_of_fit.

(Likely the easiest way is just to draw both the curve and put your data on it and see if it looks superficially the same - actually doing the math is likely limited by how many times you can start a new game, because the mulligan method - is it a shuffle, do they bias those cards toward the middle of the deck, etc. - isn't really known.)
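
As a rough sketch of the goodness-of-fit step described above: Pearson's statistic sums (observed - expected)^2 / expected over all categories. The function name and the four-category counts below are invented for the illustration; for this thread the categories would be hands with 0 to 7 resources.

```python
def pearson_chi_squared(observed, expected_probs):
    """Pearson's statistic: sum of (observed - expected)^2 / expected."""
    total = sum(observed)
    stat = 0.0
    for obs, p in zip(observed, expected_probs):
        expected = total * p          # expected count for this category
        stat += (obs - expected) ** 2 / expected
    return stat

# Made-up example: 100 observations over 4 equally likely categories.
stat = pearson_chi_squared([20, 30, 30, 20], [0.25, 0.25, 0.25, 0.25])
```

The statistic is then compared against a chi-squared distribution with (number of categories - 1) degrees of freedom; a small statistic means the data sits close to the model.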

DeusPhasmatis
11-02-2013, 09:27 PM
If you want to get into the math, you just need to model the probabilities (http://hexmetrics.ni.tl/tools/r/d60x24t30f7a0e0#results), which will give you a bell curve (graph % chance by #). Then, you just need data - plot out the resource count from a hundred or so games.

Once you've got both - you check goodness of fit using something like Pearson's chi^2: http://en.wikipedia.org/wiki/Pearson%27s_chi-squared_test#Goodness_of_fit.

(Likely the easiest way is just to draw both the curve and put your data on it and see if it looks superficially the same - actually doing the math is likely limited by how many times you can start a new game, because the mulligan method - is it a shuffle, do they bias those cards toward the middle of the deck, etc. - isn't really known.)

Thank you.

Xavon
11-03-2013, 12:57 AM
Interesting analysis. Been a long time since I did any statistics, but as a pro-programmer, I can say that pseudo-RNGs are a huge pain in the nikta.

What would everyone here say if the mulligan were changed so that if you have 0, 1, N, or N-1 resources (where N is cards in hand) you are not penalized a card on your mulligan?

DeusPhasmatis
11-03-2013, 01:13 AM
Interesting analysis. Been a long time since I did any statistics, but as a pro-programmer, I can say that pseudo-RNGs are a huge pain in the nikta.

What would everyone here say if the mulligan were changed so that if you have 0, 1, N, or N-1 resources (where N is cards in hand) you are not penalized a card on your mulligan?

Too abusable in deck-building. Though I'm Okay with everyone getting a single free mulligan.

Bossett
11-03-2013, 01:32 AM
The problem with free mulligans or adjusting resource draw probability is that you end up with people building decks that contain fewer resources. This is a problem, because decks that contain fewer resources can be an unfair advantage against people that play a deck with a 'safe' number of resources, and since they rely heavily on the favour of the RNG you'll end up with those decks being either dominant or awful, neither of which is much fun to play against.

There needs to be some penalty to redrawing, even if you only do it once, because the penalty encourages considered deck-building.

Ditsch
11-03-2013, 05:09 AM
The problem is there is no safe number. I am already at 28 resources and I get resource screwed more often than in real-life MTG (where I normally play with 24 resources). There is also no decent way to deck build around the problem unless you are playing a weenie horde deck where you only need 1 or 2 resources to play everything. That does not really encourage game tactics and diversity, and it is a good way to hit a road block and never be considered a worthy e-sports game.

The penalty of 1 card less would only be okay if you could switch out cards for resources. Say I have a hand of 1 resource and 6 other cards: I go down to 6 cards and exchange 2 of my cards for random resources (from my deck), so I have a hand of 3 cards and 3 resources. I am at a disadvantage from having one card less, but the mulligan brought the resources I needed to be able to play a game.

I would say a free mulligan doesn't give decks with few resources an unfair advantage, and that's from my experience of nearly 20 years of playing Magic, where they also used the free mulligan rule.

Gothmor
11-03-2013, 06:55 AM
They could make lots of creatures or spells that also count as resources; if they did that from the beginning it would be no problem.

Maphalux
11-03-2013, 10:17 AM
The problem is there is no safe number. I am already at 28 resources and I get resource screwed more often than in real-life MTG (where I normally play with 24 resources). There is also no decent way to deck build around the problem unless you are playing a weenie horde deck where you only need 1 or 2 resources to play everything. That does not really encourage game tactics and diversity, and it is a good way to hit a road block and never be considered a worthy e-sports game.

The penalty of 1 card less would only be okay if you could switch out cards for resources. Say I have a hand of 1 resource and 6 other cards: I go down to 6 cards and exchange 2 of my cards for random resources (from my deck), so I have a hand of 3 cards and 3 resources. I am at a disadvantage from having one card less, but the mulligan brought the resources I needed to be able to play a game.

I would say a free mulligan doesn't give decks with few resources an unfair advantage, and that's from my experience of nearly 20 years of playing Magic, where they also used the free mulligan rule.

As I showed at the beginning of this thread, and as the percentages of the collected data confirm, there is no excessive screw happening.

In addition, Magic got rid of the free mulligan a long time ago for good reason. It rarely got used anyway since it could only be done with 0 or 7 resources in hand.

This thread is for collecting factual numbers, not arguments for changing the mulligan rules or how the game's resource system operates based on anecdotal evidence. Please avoid bringing that nonsense in here.

Ditsch
11-03-2013, 12:13 PM
@Maphalux I didn't see any factual data here that was big enough to draw a worthwhile conclusion. Please don't bring in your nonsense just because you expect there to be no problem.

Computer random models are always screwed and have patterns which a real random wouldn't have, and it's even possible to see those patterns or abuse them (the only exception is quantum mechanics, which seems to be truly random, but I doubt they used a generator based on that).

So right now I am sure there is a problem; the only question is whether the problem is significant enough to have an impact on the game. To see that, we should collect more data.

GrinningBuddha
11-03-2013, 12:22 PM
This thread is the Hex equivalent of threads that inhabit every online poker forum everywhere. There are people who are 100% convinced that major poker sites rig their deals for more action or don't have a 'random' distribution of hands. What they don't realize is that live poker is every bit as brutal and unfair to players as online poker, you just see it much slower. You're lucky to get 30 hands an hour at a live poker table where at an online table you're getting around 100. Seeing 3x the hands, you're going to see 3x the bad beats which are much easier for a human brain to remember than the hands where 'the hand that should have won did so.'

Combine that with the average player's lack of understanding of odds and probabilities and you have a recipe for "This site is rigged." Now I'm not saying that Hex has their random distributions down pat yet; it certainly seems like that coin flip at the very least is bugged, if not the shuffle mechanism. Just know that this thread and threads like it will never die. They can be countered with hard data over millions of trials that show probabilities to be within acceptable tolerances, but you'll always find someone who feels they're being screwed. Hopefully one of the gearheads at CZE has a simulation designed to test the shuffle mechanism to that end.

Werlix
11-03-2013, 12:51 PM
Yeah I think what we have here is a case of http://en.wikipedia.org/wiki/Confirmation_bias.

Also even if the shuffler isn't 100% random, the shuffler has no idea which cards in your deck are resources and which aren't so there's no reason for them to clump or not clump together.

It also has no concept of which 7 cards are in your opening hand or which 6 are a new mulligan hand so there's no reason for any particular cards (which the shuffler has no idea about) to be in a specific hand (which the shuffler also has no idea about).

Xtopher
11-03-2013, 01:17 PM
If it's going to be checked, this is the time to do it. In theory, if all decks have an identical pre-shuffled state (e.g. all lands clumped, all multiples of a card together), it's possible there could be a detectable bias when the cards are shuffled.

I agree, though, that this will end up being a never-ending issue.

McKizz
11-03-2013, 02:27 PM
Forgive me, as I haven't read through the entire thread. I just wanted to echo that I too have been having randomization issues that occur more frequently than statistics would suggest should happen.

Bossett
11-03-2013, 03:39 PM
Computer random models are always screwed and have patterns which a real random wouldn't have[...]

This isn't really accurate - we're only dealing with the ordering of a small number of elements, and it's relatively easy to generate numbers that are statistically random (check out http://www.random.org/analysis/), and it's relatively easy to tell if they're random enough. You only need to generate about as many numbers as there are cards in your deck, so you don't even really need to worry about having insanely long periods such as in http://en.wikipedia.org/wiki/Mersenne_twister.
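
For illustration, a Fisher-Yates shuffle shows how few random numbers a deck needs: one per position, 59 draws for 60 cards. This is a generic sketch, not what Hex is known to use; `random.SystemRandom` (Python's wrapper around the operating system's entropy source) and the 24/36 deck are assumptions chosen for the example.

```python
import random

def fisher_yates_shuffle(cards, rng):
    """Return a shuffled copy using one bounded random integer per position."""
    deck = list(cards)
    for i in range(len(deck) - 1, 0, -1):
        j = rng.randrange(i + 1)             # uniform over positions 0..i
        deck[i], deck[j] = deck[j], deck[i]  # swap card i with the chosen card
    return deck

rng = random.SystemRandom()                  # OS-provided randomness
deck = ["resource"] * 24 + ["other"] * 36    # hypothetical 60-card deck
shuffled = fisher_yates_shuffle(deck, rng)
```

Because each step picks uniformly among the positions not yet fixed, every ordering of the deck is equally likely as long as the underlying integers are uniform.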


Forgive me, as I haven't read through the entire thread. I just wanted to echo that I too have been having randomization issues that occur more frequently than statistics would suggest should happen.

To make a claim like this, you would have to start from what "statistics would suggest should happen". Grab stats from around 100 games (pick one kind of card; your resources are ideal for this), draw them on a graph next to the expected distribution, and see if your shape is substantively different.

With 60 cards and 24 resources, you would expect to see on a first 7-card draw:

https://dl.dropboxusercontent.com/s/dvno1vw20su69o9/draw-dist.png?dl=1

If you just start noting down the results from your starting hand for all your games, you should be able to tell approximately how 'fair' things are relatively soon.
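
The expected curve for that case can be reproduced with the hypergeometric formula, P(k) = C(24, k) * C(36, 7 - k) / C(60, 7). A minimal sketch, with the function name and defaults chosen here for the example:

```python
from math import comb

def resource_hand_distribution(deck_size=60, resources=24, hand_size=7):
    """Probability of each resource count (0..hand_size) in an opening hand."""
    total = comb(deck_size, hand_size)  # number of possible hands
    others = deck_size - resources
    return [
        comb(resources, k) * comb(others, hand_size - k) / total
        for k in range(hand_size + 1)
    ]

dist = resource_hand_distribution()
for k, p in enumerate(dist):
    print(f"{k} resources: {p:.2%}")
```

The two tails come out at about 2.16% for a hand with no resources and about 0.09% for a hand of seven resources, the same figures quoted further down the thread.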

Werlix
11-03-2013, 04:15 PM
From this spreadsheet: https://docs.google.com/spreadsheet/ccc?key=0AgrNS7FanmrudFhmYVhaT0UzUk5rampSMF9VR0diNmc#gid=0 the results so far look like this (Results from 500 starting hands):

[Attachment 1217: graph comparing the actual results with the ideal distribution]

"Ideal" meaning what we'd expect if it was truly random.

Bossett
11-03-2013, 04:25 PM
Those curves look fairly similar to me - wonder what it'll look like with ~1000 games in it.

Quick update: You shouldn't be averaging the %s for users, you should be adding them together and doing a % out of total hands - Gorgol is skewing your results.

Another edit: With your figures, using only first hand since we don't know exactly how the mulligan redistributes cards:

https://dl.dropboxusercontent.com/s/2jqigxdzz8ga8bn/draw-dist-w-actuals.png?dl=1

For the 300 hands there is data for in your sheet, that's actually not that good a fit, but 300 goes isn't actually all that many.
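
The averaging issue mentioned in the update can be shown with a small sketch. The counts below are invented purely for illustration (three contributors with 100 hands and one with 200); they are not the spreadsheet's figures.

```python
def averaged_share(per_user):
    """Mean of each user's own percentage (every user weighted equally)."""
    shares = [hits / hands for hits, hands in per_user]
    return sum(shares) / len(shares)

def pooled_share(per_user):
    """Total hits divided by total hands (every hand weighted equally)."""
    total_hits = sum(hits for hits, _ in per_user)
    total_hands = sum(hands for _, hands in per_user)
    return total_hits / total_hands

# Hypothetical (one-resource hands, hands recorded) per contributor.
per_user = [(10, 100), (10, 100), (10, 100), (40, 200)]
avg = averaged_share(per_user)    # 12.5%
pooled = pooled_share(per_user)   # 14.0%
```

Averaging percentages counts the 200-hand contributor the same as a 100-hand one, so the two results diverge whenever contributors recorded different numbers of hands. Pooling the raw counts weights every hand equally.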

Ratticus
11-03-2013, 05:32 PM
Opening hand data is fine.

Chi-square goodness of fit with the 300 raw data points
chisq(7) is 5.93, p = .5483
The results are not significantly different from the expected values.
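
For anyone who wants to check the p-value without a statistics package, here is one way to compute the upper tail of the chi-squared distribution from the series for the incomplete gamma function. The function name and the number of series terms are choices made for this sketch.

```python
import math

def chi2_sf(x, df, terms=200):
    """Upper-tail probability P(X >= x) for chi-squared with df degrees of freedom."""
    a, z = df / 2.0, x / 2.0
    if z <= 0:
        return 1.0
    # Series for the regularized lower incomplete gamma function P(a, z):
    # P(a, z) = z^a * e^(-z) / Gamma(a) * sum_n z^n / (a * (a+1) * ... * (a+n))
    term = 1.0 / a
    total = term
    for n in range(1, terms):
        term *= z / (a + n)
        total += term
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return 1.0 - lower

p_value = chi2_sf(5.93, 7)
```

For a statistic of 5.93 with 7 degrees of freedom this evaluates to roughly 0.55, in line with the p reported above. A statistic of about 14.07 would be needed to bring p down to 0.05.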

Ditsch
11-03-2013, 05:48 PM
@Bossett I see it totally differently, and I disagree. Numbers have only 10 different states for each digit and are already not really random when generated with a "formula"; a card deck is even more complex in my eyes.

I can only hope your actual data gets closer to the probability, but the 300 hands seem to indicate that there could be a problem. Still, thank you for your hard efforts, Bossett, and for providing great data.

The only way to end these doubts about the random generator used here is to accept that software randomness is unacceptable. It should never be used in these 3 fields: cryptography, simulations, and gaming. For me Hex is in the gaming field, like poker etc. If you really want to do a good job with your randomness you need a hardware solution. Perfect solutions are quite recent, but they do exist; the question is whether it is possible to integrate them and whether it is needed.

An example of a possible solution: http://www.idquantique.com/component/content/article.html?id=9

I work in the IT field, and I have friends who work in crypto and also program simulations; alas, I know no one who programs online poker clients. But I will ask them what they think about whether software randomness is sufficient for a complex card game.

Werlix
11-03-2013, 05:49 PM
Those curves look fairly similar to me - wonder what it'll look like with ~1000 games in it.

Quick update: You shouldn't be averaging the %s for users, you should be adding them together and doing a % out of total hands - Gorgol is skewing your results.

Another edit: With your figures, using only first hand since we don't know exactly how the mulligan redistributes cards:

https://dl.dropboxusercontent.com/s/2jqigxdzz8ga8bn/draw-dist-w-actuals.png?dl=1

For the 300 hands there is data for in your sheet, that's actually not that good a fit, but 300 goes isn't actually all that many.

That spreadsheet isn't mine and yes I also noticed the error with the averages. In the graph I attached it did take that into account, so I think our graphs are actually the same?

Looks like a pretty good fit to me.

Bossett
11-03-2013, 05:55 PM
Sorry - that should read "that's actually a pretty good fit", I had an error in my calculation I fixed when uploading the graph and didn't edit my comment :)

@Ditsch A deck of cards is only complicated if you want every possible outcome, we don't, we want one outcome. Picking a random integer between 1 and 60 isn't particularly tricky, and I would expect that CZE is likely just using the system's RNG, which probably collects entropy from RTT, clock skew between components, etc. - stuff that is actually affected by cosmic rays.

One more graph, which makes it clear that if anything (and I'm not saying there is anything), the distribution favours the user (slight bias toward 3+):

https://dl.dropboxusercontent.com/s/4lnx9rjns7u4bbb/draw-dist-w-actuals-and-counts.png?dl=1

Ditsch
11-03-2013, 06:12 PM
@Bossett well, we want 7 random cards out of 60, where in our case each card is only a 0 or a 1: resource or no resource. No mulligan, no further card drawing. In such a simple setup the data should be much closer to the probability; if not, we already have a huge problem. To me neither graph looks like a pretty good fit; the 2-land hands especially seem to be totally off.

Well, I hope some dev can show up here and explain which random generator they used. It might be the generator or the implementation that is causing a problem, just as the coin flip is already not working correctly.

Bossett
11-03-2013, 06:42 PM
Ok, spending far too much time on this.

Here is all the data from that sheet, including all the mulligans, drawn out. (This gives us some insight to the mulligan method too - appears to be a reshuffle, not cards to the bottom or the like.) The initial, first mull and second mull look like pretty good fits. The third mulligan looks almost exactly on ideal.

https://dl.dropboxusercontent.com/s/6euo0ktr3wbwhmt/draw-dist-w-mulligans.png?dl=1

(Sheet available at https://skydrive.live.com/redir?page=view&resid=E2489690E02D998A!1007&authkey=!AGgk6TIL3rraW9Y)

Niedar
11-03-2013, 06:42 PM
Stop saying there is a problem unless you can show a statistical test with numbers to support that statement. On the topic of random numbers, if they are running the servers on modern Intel processors they can use RDRAND.

http://software.intel.com/en-us/articles/intel-digital-random-number-generator-drng-software-implementation-guide/

NemesiN
11-03-2013, 10:11 PM
Actually, those numbers are fairly close to what they really should be. I am sure the algorithm is fine. Random is random and if you never got screwed 10 times in a row, that wouldn't be random.

Based on purely calculating probability, these are the expected values compared to the actual values with a 60 card, 24 resource deck.

Opening Hand   Expected   Actual   Difference
0 Resources    3%         3%        0
1 Resource     13%        7%       -.06
2 Resources    26%        23%      -.03
3 Resources    29%        34%      +.05
4 Resources    20%        21%      +.01
5 Resources    8%         9%       +.01
6 Resources    2%         2%        0
7 Resources    0%         0%        0

Given that the numbers show a healthy bell curve and that the margin of error was under .05 with only the 1 resource hand being .06, I would say that the randomizer is doing a fine job. If your sample size were larger, I bet that we would see those numbers iron out and begin reflecting the expected values more closely.

Source: Statistics/Quantitative Business Analysis major

Bossett
11-03-2013, 11:59 PM
In all cases, we're within 5% of expected - I've got a variance table in https://skydrive.live.com/redir?page=view&resid=E2489690E02D998A!1007&authkey=!AGgk6TIL3rraW9Y

Gwaer
11-04-2013, 12:09 AM
If you're so interested why do you insist on trying to disprove such a small data set, and furthermore only include a portion of that already listed. Come back when we have a more meaningful sample size. Or better yet assist in gathering a more meaningful sample size. It's too early to draw conclusions in either direction. But just from the 500 hands we already have one of the stats is over 1000% higher than it should be.

Bossett
11-04-2013, 12:18 AM
If you're so interested why do you insist on trying to disprove such a small data set, and furthermore only include a portion of that already listed. Come back when we have a more meaningful sample size. Or better yet assist in gathering a more meaningful sample size. It's too early to draw conclusions in either direction. But just from the 500 hands we already have one of the stats is over 1000% higher than it should be.

What's 1000% higher? (All my numbers, etc are from the 300 hands listed in the raw data in the Google sheet earlier.)

(The only issue with sample size I can see at the moment is that we don't have enough to have an expected 7 card resource draw - need about 1200 draws so all the expecteds are >= once)

Gwaer
11-04-2013, 12:28 AM
There are 500 hands in that doc: 100 each from udare, me, and maph, and 200 from gorgol. And yes, it's the 7 resource hand that is too high. And yes, we do need to get 700-1000 more hands on the sheet just as a start. I don't really want to submit them myself because I'd rather get a spread from many people, just in case any particular client is bugged in some way.

Bossett
11-04-2013, 12:45 AM
There are 500 hands in that doc: 100 each from udare, me, and maph, and 200 from gorgol. And yes, it's the 7 resource hand that is too high. And yes, we do need to get 700-1000 more hands on the sheet just as a start. I don't really want to submit them myself because I'd rather get a spread from many people, just in case any particular client is bugged in some way.

I've only been using the 300 hands that were in the raw data - need the underlying to make it work (and the math is wrong anyway in that sheet - it averages the %s, and that means Gorgol skews the data).

We have enough in those 300 to say that they behave as expected - they substantively match the expected distribution to a fairly high confidence value.

Gwaer
11-04-2013, 12:54 AM
However we slice it, there's not enough data to draw a conclusion in either direction. Still without plotting any numbers we can see that the 7 resource hand is thousands of percent higher than expected which could entirely be due to a very small sample size. It's still disingenuous to say something that should be happening .0X percent of the time being in a 5% variance is acceptable. Just need more data before anything meaningful can be drawn from the numbers.

Bossett
11-04-2013, 01:11 AM
I think what you mean is there was one 7 resource hand, which we expect to see 0.09% of the time. Instead, we saw it 0.33% of the time.

The sample becomes more significant the fewer cards you're drawing (as your expected field of results narrows), and we can observe that happening as we go through the mulligans - the line fits better as the number of possible outcomes decreases. In fact, if you include the mulligans, we're looking at 1200 instances, and they are all consistent with each other.

I'm fairly convinced by the data, and for it to show a skew, we would need to see the next 300-1200 results be wildly inconsistent with the above.

Gwaer
11-04-2013, 01:31 AM
We've actually seen 2 7/0 in 500 hands. And as stated repeatedly throughout this thread the data that you're trying to disprove on far too little information can potentially be useful in determining any number of potential shuffler problems. So it'd be much more helpful to gather additional data which may support your assertion, than to try to make the same argument. We hit a little under half the minimum required hands to even try to make a guess in the first day. If there's not enough interest to get us there in that time it's probably not worth the effort of making unfounded conclusions in either direction.
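
How surprising two seven-resource hands in 500 would be can be estimated with a binomial tail. The sketch below assumes every recorded hand came from a 60-card deck with 24 resources, which may not hold for all contributors, and the function name is made up for the example.

```python
from math import comb

def prob_at_least(k, n, p):
    """Binomial probability of k or more successes in n independent trials."""
    return 1.0 - sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k))

# Chance that a single 7-card hand is all resources (assumed 24 of 60).
p_seven = comb(24, 7) / comb(60, 7)

chance_two_or_more = prob_at_least(2, 500, p_seven)
hands_per_occurrence = 1 / p_seven  # average number of hands per all-resource hand
```

Under those assumptions the chance of seeing at least two such hands in 500 is roughly 7%, and one such hand is expected about every 1,100 hands. That is unusual but not extraordinary.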

Bossett
11-04-2013, 02:42 AM
We've actually seen 2 7/0 in 500 hands. And as stated repeatedly throughout this thread the data that you're trying to disprove on far too little information can potentially be useful in determining any number of potential shuffler problems. So it'd be much more helpful to gather additional data which may support your assertion, than to try to make the same argument. We hit a little under half the minimum required hands to even try to make a guess in the first day. If there's not enough interest to get us there in that time it's probably not worth the effort of making unfounded conclusions in either direction.

I am a bit confused as to which argument I'm making exactly - I find the data relatively convincing in favour of a random distribution. We've got a p-value of about 0.55, which means the data is statistically convincing in favour of a random distribution (and we likely have a sufficient sample size, though more will round out the extremes). I'm in this thread talking about the quality of the shuffler and how choosing random numbers works, a topic I actually know quite a bit about, and I drew a bunch of graphs to test the idea: and the graphs (and the data and calculation behind) support the claim.

Ditsch
11-04-2013, 06:09 AM
@Nemesin Of course we need a bigger sample size, but I find differences of 5% and 6% between the sample data and what it should be a bit much. I hope a bigger sample size can get that difference down, as anything over 3% is in my view a problem. So why should 5% still be okay? Where does the idea that it is okay come from?

Growin
11-04-2013, 08:25 AM
You should be able to trade cards in and out of the starting hand, just like Hearthstone; it's the only amazing thing in that game.

DeusPhasmatis
11-04-2013, 08:35 AM
@Nemesin Of course we need a bigger sample size, but I find differences of 5% and 6% between the sample data and what it should be a bit much. I hope a bigger sample size can get that difference down, as anything over 3% is in my view a problem. So why should 5% still be okay? Where does the idea that it is okay come from?

It's not a question of what you think is "a bit much". We can statistically quantify the likelihood that two samples/spreads are caused by the same distribution/probability. 5% is Okay because the samples are so small that the expected divergence is high. Ratticus did the math for us:

chisq(7) is 5.93, p = .5483

Which shows there is over 54% probability that the sample is what it should be.

Ditsch
11-04-2013, 08:37 AM
@Growin I should check Hearthstone out. The HEX devs should definitely not copy Magic on the starting hand rules, but maybe Hearthstone instead.

Ditsch
11-04-2013, 10:28 AM
@DeusPhasmatis thank you, but the 54% doesn't do it for me at all, as there is a 46% chance that it is not correct. With more data it should get better and the divergence should go down; if not, there is a problem. But then we reach 1k and people will tell us we need 10k, etc., so I think we will need a fast method to generate that data. Maybe the devs could help us out there.

DeusPhasmatis
11-04-2013, 10:47 AM
@DeusPhasmatis thank you, but the 54% doesn't do it for me at all, as there is a 46% chance that it is not correct. With more data it should get better and the divergence should go down; if not, there is a problem. But then we reach 1k and people will tell us we need 10k, etc., so I think we will need a fast method to generate that data. Maybe the devs could help us out there.

Random is random; a 46% probability that they aren't identical (they are incredibly similar, as can be seen from visual inspection of the graphs people have so kindly made) isn't sufficient to indicate a problem. In order to be persuasive about the outcome of random sampling, you need a high degree of certainty. The academic standard in the social sciences is 95% confidence, but they publish a lot of studies that turn out to be wrong later (1 in 20). Physics has a much tighter standard at 99.9% confidence (1 in 1000 wrong). We're nowhere near either of those.

styk182
11-04-2013, 04:59 PM
Meanwhile, while everyone is making all these claims and conjecture, the engineers at CZE have been using their own methods (which would be much more efficient and accurate than any that we have available to us) to test their randomizer after the OP brought this potential issue to light thereby making this entire thread pretty much moot.

Ditsch
11-04-2013, 05:30 PM
@Styk182 if that's true, then they did what I expected of them: check whether their random generator holds up. Of course they would have better and faster means to test it. But I doubt they will share any insights with us, and if there was a problem they will fix it without telling us.

Bossett
11-04-2013, 05:37 PM
Actually, Ratticus is wrong, the p-value isn't 0.54.

You want to look up the chi-squared distribution in a table like the one in the wiki: http://en.wikipedia.org/wiki/Chi-squared_distribution#Table_of_.CF.872_value_vs_p-value

The p-value for this data is between 0.01 and 0.05. That is, the confidence that we have of the model matching the data is between 95 and 99%. Said another way, the data overwhelmingly supports the conclusion that the shuffler produces a randomly ordered deck when you consider drawing resources in your first hand (including mulligans).

styk182
11-04-2013, 05:52 PM
@Ditsch that's pretty much the point I was trying to make. A lot of people on here have wasted a lot of their own time gathering large amounts of data that CZE was probably able to gather and analyze in a matter of minutes. The thread just seemed to be derailed from the original point of making CZE aware of a possible problem with the randomizer (which could have been a legit issue) to a debate on statistical theory and analysis. I don't see CZE reading anything past the first page of this thread unless for pure entertainment value. I suppose this could be considered trolling on my part as I am very late to this thread and didn't really contribute anything meaningful to it but seriously, this horse was dead 14 pages ago.

Gwaer
11-04-2013, 07:51 PM
Can you post what parameters you used to divine that p-value from this information? I'm keenly interested in where your miscalculations are coming from.



http://i.imgur.com/GRMt4Dt.jpg

The probability of a hand like that is: 0.000014470984848
An opening hand like that should happen less than 1.5 times in 100,000 hands. I've already seen one like it twice in less than 5,000, and I've seen 7 land hands at a much higher rate as well. We need more data so that these anomalies can fall into the background as outliers, and not a trend.

Bossett
11-04-2013, 08:14 PM
All the data so far has concentrated on resource v non-resource; the chance of any particular makeup of cards is too hard to analyse without what is probably more data than CZE has at the moment. It's worth noting that while I'm convinced it's distributing resources correctly, that doesn't exclude any other shuffler bugs (such as a bias toward multiples).

The chance of that hand through the resource lens is 2.16% - so you should see about one every 50 games.
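For anyone who wants to check the 2.16% figure themselves, here is a minimal Python sketch. It assumes a 60-card deck with 24 resources and a 7-card opening hand (the deck used for the numbers elsewhere in the thread); the helper name `hypergeom_pmf` is just an illustrative choice, not anything from the game.

```python
from math import comb

def hypergeom_pmf(k, population, successes, draws):
    """Probability of exactly k 'success' cards in a hand drawn without replacement."""
    return comb(successes, k) * comb(population - successes, draws - k) / comb(population, draws)

# Assumed deck: 60 cards, 24 resources, 7-card opening hand.
resource_dist = [hypergeom_pmf(k, 60, 24, 7) for k in range(8)]
p_no_resources = resource_dist[0]

for k, p in enumerate(resource_dist):
    print(f"{k} resources: {p:.4%}")
```

The first line printed is the zero-resource case, which is the "resource lens" view of the hand pictured above.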

I got the p-value by going through a lookup table (linked above), assuming Ratticus got chi-squared correct (I don't have the tools for this at the moment, and the Excel functions don't work without an 'expected' matrix, which is also a pain to produce in Excel).

DeusPhasmatis
11-04-2013, 08:31 PM
Actually, Ratticus is wrong, the p-value isn't 0.54.

You want to look up the chi-squared distribution in a table like the one in the wiki: http://en.wikipedia.org/wiki/Chi-squared_distribution#Table_of_.CF.872_value_vs_p-value

The p-value for this data is between 0.01 and 0.05. That is, the confidence that we have of the model matching the data is between 95 and 99%. Said another way, the data overwhelmingly supports the conclusion that the shuffler produces a randomly ordered deck when you consider drawing resources in your first hand (including mulligans).


I got the p-value by going through a lookup table (linked above), assuming Ratticus got chi-squared correct (I don't have the tools for this at the moment, and the Excel functions don't work without an 'expected' matrix, which is also a pain to produce in Excel).

Ratticus's p-value is correct. Are you perhaps looking at the first row of the table? The problem we're looking at has 7 degrees of freedom (i.e. you need to know 7 out of the 8 probabilities for the Random Variable to exactly specify the distribution), so we should be using the 7th row. Additionally, the table gives the probability that the sample matches the model (specifically, that the model will produce a sample this different or more from the average), which is our confidence (no need to invert it). A p-value of 0.05 or 0.01 would mean that our model is incorrect.

Bossett
11-04-2013, 08:40 PM
We have 1 degree of freedom (one value which is allowed to vary, the number of resources in the initial hand).

This is confused because we're trying to show the results are consistent with random chance, not that they are not - which is usually what you're after when doing this sort of analysis - so I think we're working off different playsheets.

A p-value under 0.05 rejects the null-hypothesis. The null-hypothesis here is that there is no relationship between the distribution of resources seen and what you would expect from random chance.

Gwaer
11-04-2013, 08:46 PM
So you're saying that the resources are somehow shuffled differently than the non-resources in the shuffler? If there is a tendency for card grouping, that should affect all cards equally, including resources, which means that the data would if extended bias towards 6-7 land hands more than it should, which seems to be playing out, if we had more data, rather than arguing over the pittance of data that we do currently have.

Also, the other guy is right, the degree of freedom does not indicate that a single variable is changing but rather the final state of the problem, which is either 0,1,2,3,4,5,6,7 lands. 8 possible outcomes, -1 = 7

Your incorrect p-value is actually the problem; the fact that you believe you can get 95% confidence out of 300 test hands is ridiculous. Absurdly ridiculous.

Werlix
11-04-2013, 08:54 PM
...which means that the data would if extended bias towards 6-7 land hands more than it should, which seems to be playing out...

What does this mean?


if we had more data, rather than arguing over the pittance of data that we do currently have

Discussions about the p-value are relevant regardless of sample size as this is the value that tells us whether we do or don't have enough data to make a conclusion. Simply saying "we don't have enough data" isn't a scientific statement.

Bossett
11-04-2013, 08:58 PM
Degrees of Freedom refer to the number of variables that are free to vary in the final outcome, not the number of values you're likely to see in the final outcome. Feel free to look that up (http://en.wikipedia.org/wiki/Degrees_of_freedom), but if we're going to argue about terms this basic, we're not going to get anywhere.

I'm not saying anything about how the shuffler works, except that, with a high degree of certainty, it produced the resource outcomes we expect from random chance. That's all.

I'm saying that the data we do have matches very well with what we expect, and that it would take significantly more data that is totally out of tune with what we have (100 more results that are all 0 or 7 would do it) to be convincing otherwise. (In a nutshell, this is what the p-value tells us, independent of sample size.)

mudkip
11-04-2013, 09:00 PM
The probability of a hand like that is; .000014470984848

Is that right? It seems a little low.

Gwaer
11-04-2013, 09:05 PM
No, you're right. That's true. The problem is entirely from the fact that you're calculating things in terms of large percentage points. We're dealing with probabilities in some of these hands in the .0X percent range that are way off, thousands of percent off. I have no idea how to prove to you that the current amount of data is insufficient, because when dealing with that small of a difference it's obvious that we don't have enough data; we need thousands of hands. If you know how a chi-squared table works, how can you not understand that the finer-grained the information you need, the more data points you need to have? It's literally incomprehensible to me.

Gwaer
11-04-2013, 09:06 PM
Is that right? It seems a little low.
That's a very simple number to get yourself, just calculate the odds of getting 3 of 4 cards in a 60 card deck in 7 cards, then multiply that result with itself for it happening twice.
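As a quick check of where that figure comes from, here is a small Python sketch of the calculation as described: the hypergeometric chance of exactly 3 copies of one specific 4-of in a 7-card hand from 60 cards, multiplied by itself. The variable names are made up for illustration. Note that squaring treats the two triples as independent events; whether that is the right model for a single hand is something the thread goes on to argue about.

```python
from math import comb

# Chance of exactly 3 copies of one specific 4-of in a 7-card hand from 60 cards.
p_three_of_four = comb(4, 3) * comb(56, 4) / comb(60, 7)

# The figure quoted in the thread: the single-set probability multiplied by itself.
p_squared = p_three_of_four ** 2

print(f"single set of 3/4: {p_three_of_four:.7f}")
print(f"multiplied by itself: {p_squared:.15f}")
```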

mudkip
11-04-2013, 09:15 PM
That's a very simple number to get yourself, just calculate the odds of getting 3 of 4 cards in a 60 card deck in 7 cards, then multiply that result with itself for it happening twice.

Ahh, I thought you were saying it was that %age for no lands :)

What's the composition of your deck? What you're saying is that's the chance of getting 2* 3/4 in the opening hand - but how many 4's do you have?

I also feel like your example is looking for patterns a bit much. It's like the next hand your cards come out in alphabetical order and you call shenanigans.

Bossett
11-04-2013, 09:18 PM
In a deck with 60 cards and 24 resources, the chance of drawing from the 36 non-resources is 0.0216.

Assuming they're all groups of 4, you draw one card (any card), then from the remaining 35, you have a 3/35 chance of drawing its twin.

We have: 0.0216 * 3/35 * 2/34

From the remaining 33 cards, draw any card, repeat:

0.0216 * 3/35 * 2/34 * 3/32 * 2/31

This gives:

6.58715099 * 10^-7

Which makes Gwaer's number look a bit high, tbh.

The important thing though is that *any* specific hand is highly unlikely, if you got that twice in a row, there may be something in it.

Edit: I am assuming those are draw order, if it's different, it's a little different, since you're drawing from a pool of 6 possible cards to make that result.

Gwaer
11-04-2013, 09:19 PM
The rest of the composition of the deck doesn't matter for that particular hypergeometric calculation. It's a 60 card deck with 22 resources, and no more than 4 of each of those cards in it. If they only have 3 of those cards the probability of that hand jumps up quite a bit. Also, that's a statistical fact, not a pattern that I'm looking for: a hand containing 2 sets of 3 of 4 in a 7 card opening hand should happen less than 1.5 times in 100,000 hands even if the entire 60 card deck was made up of different sets of 4 cards.

Once again, Bossett, I do not understand how you get the math but not its application. It's not any specific hand; ANY hand that contains 2 sets of 3/4 cards in a 60 card deck is at that probability. It doesn't matter which cards, as long as they are from the same group of 4. Getting that exact hand again is a much much much larger improbability.

DeusPhasmatis
11-04-2013, 09:29 PM
We have 1 degree of freedom (one value which is allowed to vary, the number of resources in the initial hand).

This is confused because we're trying to show the results are consistent with random chance, not that they are not - which is usually what you're after when doing this sort of analysis - so I think we're working off different playsheets.

A p-value under 0.05 rejects the null-hypothesis. The null-hypothesis here is that there is no relationship between the distribution of resources seen and what you would expect from random chance.

We are comparing 8 values (the 8 probabilities in our Random Variable) with 1 constraint (that the probabilities sum to 1), which gives us 7 degrees of freedom (we would need to know at least 7 values of our Random Variable to specify it exactly). We are comparing the distribution of probabilities because this is a goodness of fit test for two distributions.

Our null hypothesis is that the sample is a result of a hyper-geometric distribution.

Straight out of the Wikipedia article about Chi-square distributions (http://en.wikipedia.org/wiki/Chi-squared_distribution#Table_of_.CF.872_value_vs_p-value):

The p-value is the probability of observing a test statistic at least as extreme in a chi-squared distribution.

Which means we reject the null hypothesis if we have very low confidence.

Looking at the example of Pearson's Goodness of fit (http://en.wikipedia.org/wiki/Pearson%27s_chi-squared_test#Goodness_of_fit), that is how they do it. They have a null hypothesis of even distribution, their random variable has 2 outcomes and 1 constraint, thus 1 degree of freedom, and they get a p-value of 0.23 which is not sufficiently unlikely to reject the null hypothesis.
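To make the goodness-of-fit procedure concrete, here is a rough Python sketch of the test being discussed: 8 outcomes (0 to 7 resources), hypergeometric expected counts, and 7 degrees of freedom. Everything in it is an assumption for illustration: the 24-resource, 60-card deck, the 300-hand sample size, and especially the "observed" counts, which are simulated from a fair shuffle rather than taken from the spreadsheet in this thread. The 14.07 cutoff is the 7-degrees-of-freedom, p = 0.05 entry from a standard chi-squared table.

```python
import random
from math import comb

DECK_SIZE, RESOURCES, HAND = 60, 24, 7   # assumed deck, not the thread's data
N_HANDS = 300                            # illustrative sample size
CRITICAL_DF7_P05 = 14.07                 # chi-squared table: 7 degrees of freedom, p = 0.05

# Expected probabilities of 0..7 resources under the hypergeometric null hypothesis.
expected_probs = [
    comb(RESOURCES, k) * comb(DECK_SIZE - RESOURCES, HAND - k) / comb(DECK_SIZE, HAND)
    for k in range(HAND + 1)
]

# Simulated stand-in for collected data: deal hands from a fairly shuffled deck.
rng = random.Random(2013)
deck = [1] * RESOURCES + [0] * (DECK_SIZE - RESOURCES)
observed = [0] * (HAND + 1)
for _ in range(N_HANDS):
    observed[sum(rng.sample(deck, HAND))] += 1

# Pearson statistic: sum of (observed - expected)^2 / expected over the 8 outcomes.
chi_squared = sum(
    (obs - N_HANDS * p) ** 2 / (N_HANDS * p)
    for obs, p in zip(observed, expected_probs)
)
degrees_of_freedom = len(observed) - 1   # 8 outcomes, 1 constraint (probabilities sum to 1)

print("observed counts:", observed)
print("chi-squared:", round(chi_squared, 2), "with", degrees_of_freedom, "degrees of freedom")
if chi_squared > CRITICAL_DF7_P05:
    print("statistic exceeds the 5% cutoff: reject the null hypothesis")
else:
    print("statistic is below the 5% cutoff: no evidence against the null hypothesis")
```

One caveat with this sketch: at 300 hands the expected counts for 6 and 7 resources fall below the usual rule-of-thumb minimum of 5, so a careful analysis would probably pool those cells, which also lowers the degrees of freedom.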

Werlix
11-04-2013, 09:32 PM
Gwaer the problem with doing things the way you're doing them is that you can draw patterns any way you'd like after the fact and say that they're statistically unlikely. Eg, "I drew a hand with cards in ascending resource costs!", "I drew a hand with 4 sets of 2!", "I drew a hand with 2 sets of 3!", "I drew a hand with all costs > 5!", "I drew a hand with 0 resources!", "I drew a hand with 7 resources!"...

Any specific hand is incredibly unlikely, we can all look at one of these hands and draw some kind of arbitrary pattern with it.

If you think a problem exists, come up with a hypothesis then run tests on that hypothesis and analyse the data. For example, while running the tests on # resources you could add an extra column to your data and mark it with a 1 when you get 2 sets of 3 in your 7-card hand. Then we can analyse that one too :)

Gwaer
11-04-2013, 09:37 PM
I have had a hypothesis this entire time, I posted it in this very thread a long time ago. But collecting data and drawing conclusions upon it is not incorrect either. Probability is probability, either something is highly improbable and happening more than it should, or it isn't. It is currently happening more than it should. I am however not trying to draw conclusions. I'm trying to gather data, that's all I want is more people to submit data. It's other people that are drawing conclusions that everything is fine.

Niedar
11-04-2013, 10:16 PM
The problem is you are identifying a pattern, something the human mind is built for and then calculating the chance of that specific pattern happening. It is low and so you claim there is a problem. The problem with this is while the likelihood of that specific pattern is low there are many different possible patterns all with low probabilities but the probability that any pattern at all will appear is high.
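A toy calculation can show how this plays out. The Python sketch below uses purely hypothetical numbers (each "pattern" is a 1-in-100,000 event, the patterns are treated as independent, and 5,000 hands are played); none of these come from real game data, and the function name is made up for illustration.

```python
def chance_of_at_least_one(p_single, n_patterns, n_hands):
    """P(at least one of n_patterns rare events shows up in n_hands), assuming independence."""
    return 1 - (1 - p_single) ** (n_patterns * n_hands)

# Hypothetical numbers for illustration only: each pattern is a 1-in-100,000 event.
for n_patterns in (1, 10, 100, 1000):
    p = chance_of_at_least_one(1e-5, n_patterns, 5000)
    print(f"{n_patterns:>4} possible patterns: {p:.3f}")
```

With one pre-chosen pattern the chance of seeing it is small; once enough different patterns would each have caught your eye, seeing at least one of them becomes close to certain.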

Gwaer
11-04-2013, 10:21 PM
I actually have claimed repeatedly that the problems can easily fall back in line with enough data.

But you're wrong. Mathematically, plain and simply wrong. It's not a pattern. It's a highly improbable thing happening more than it should be with the tiny amount of data we have, that is not a construct of my mind. It's a fact.

Niedar
11-04-2013, 10:22 PM
Guess you still don't get it then. No one claimed it wasn't a fact that it happened and that the probability of that specific event is low, just that it doesn't mean anything.

Niedar
11-04-2013, 10:23 PM
Edit: double post

Gwaer
11-04-2013, 10:27 PM
It does mean something, since that's just one example of it happening. I've drawn a hand just like that twice, and I haven't played even one 20th of 100,000 hands. That is statistically significant. As I said, more data can bring that number more in line and make it insignificant; for what we have now, it's a flag. I suspect that the two 7 resource hands that we drew in 500 hands in that spreadsheet are a symptom of the same problem, and that number will continue to be thousands of percent higher than it should be, which can also prove to be false with more data.

Xtopher
11-05-2013, 12:29 AM
Gwaer the problem with doing things the way you're doing them is that you can draw patterns any way you'd like after the fact and say that they're statistically unlikely. Eg, "I drew a hand with cards in ascending resource costs!", "I drew a hand with 4 sets of 2!", "I drew a hand with 2 sets of 3!", "I drew a hand with all costs > 5!", "I drew a hand with 0 resources!", "I drew a hand with 7 resources!"...

Any specific hand is incredibly unlikely, we can all look at one of these hands and draw some kind of arbitrary pattern with it.

If you think a problem exists, come up with a hypothesis then run tests on that hypothesis and analyse the data. For example, while running the tests on # resources you could add an extra column to your data and mark it with a 1 when you get 2 sets of 3 in your 7-card hand. Then we can analyse that one too :)

In agreement with this. It's like the b-day test. If you have 30 people in a room and say I predict that two of you were born on Aug. 4th, the odds are very low that you'll be right. OTOH, if you say I predict that two of you will have the same birthday, the odds are greater than 50% that this will be so. This is analogous to what's happening here (except here, not even a prediction of two duplicate oddball hands was made first making this event even more meaningless) -- there are thousands, tens of thousands, millions, billions of "odd" hands and sequences of concurrent or every other or every third, etc. hand that could have come up twice in the first 500 samples. The odds are much higher of this occurring than if you had chosen the pattern in question before you saw it come up twice (or hadn't chosen any pattern and just noticed the match, after the fact).
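The two birthday predictions can be put side by side with a short Python sketch. It assumes 365 equally likely birthdays and ignores leap years; the function names are illustrative only.

```python
def shared_birthday_probability(people, days=365):
    """Chance that at least two of `people` share a birthday (uniform birthdays, no leap years)."""
    p_all_distinct = 1.0
    for i in range(people):
        p_all_distinct *= (days - i) / days
    return 1.0 - p_all_distinct

def both_on_given_day_probability(people, days=365):
    """Chance that at least two of `people` were born on one day named in advance."""
    p_none = ((days - 1) / days) ** people
    p_exactly_one = people * (1 / days) * ((days - 1) / days) ** (people - 1)
    return 1.0 - p_none - p_exactly_one

p_any_match = shared_birthday_probability(30)
p_named_day = both_on_given_day_probability(30)
print(f"any shared birthday among 30: {p_any_match:.3f}")
print(f"two born on a day named in advance: {p_named_day:.5f}")
```

The gap between the two results is the gap between "some match somewhere" and "the specific match I named beforehand".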

Gwaer
11-05-2013, 01:30 AM
...The odds are the odds. When it's noticed or how it's noticed is irrelevant. The fact of the matter is, in random chance a hand that has 7 resources has a set amount of times it should show up in a large enough set of data. A hand that contains 3/4x2 has a set amount of times it should show up in a large enough set of data, the even more rare hand which I have not seen 3/4-4/4 should come up a set amount of times as well.

If that hand had only come up once in the games I was playing, it could very well be just the 1 game in 100,000 that it is destined to happen in. The fact that it has come up twice in such a short period of time is a potential indication that there is a problem. More data will help find out if it is a problem or a statistical fluke.

Xtopher
11-05-2013, 02:23 AM
Well, you've calculated the probability wrong, I guess. The correct probability of an "odd" hand occurring twice in the first 500 tests, in this case, would be the sum of ALL probabilities of duplicate hands that would catch your eye as being "odd".

In someone's list of data earlier, I spotted my phone number from 10 years ago. What are the odds?? Maybe everyone should start tracking for their own past phone numbers in the strings of data. If it happens again, we'll definitely be on to something.

By all means, though, collect data. It doesn't hurt.

Gwaer
11-05-2013, 02:41 AM
What I find odd isn't a factor. The only factor is the probability of the hands, and how frequently they're coming up. I could calculate the odds of any of those things happening if it would make you feel better, but that's not what we're talking about. We're not getting esoteric with costs of cards or anything. It's very simple. What are the odds of duplicate cards coming up? They've been listed repeatedly. The one I posted most recently as an example just happens to have very low probability and fit as an example of what I based my interest in this thread on, and it just happens to be happening more frequently than its probability would dictate.

The odds are the odds though, it's a pretty easy to determine number. Just find any hypergeometric calculator, and put in the figures. 60 card deck, 4 cards, 3 of which showing up in 7 cards. That's the odds of getting any three cards in a 4 card set in your opening hand. Then multiply that number by itself to determine the probability of getting the hand pictured.

What it spits out will likely be in notation of e-X. Just move the decimal place to the left X times. Then move the decimal place to the right 2 times to get the percentage. that's how many times in 100 something should happen, move the decimal to the right one for odds in 1000, again for odds in 10,000 Etc. It's just math, it's not something that I am fabricating.

Niedar
11-05-2013, 02:49 AM
It's math that doesn't mean there is a problem, as has been explained many times.

Gwaer
11-05-2013, 02:57 AM
It does indicate a trend towards a problem with the shuffler however, which is why I am trying to get people to submit more data. I have not once drawn a conclusion from anything that we have gathered. There is a trend for doubles and triples to appear more often than they should. Which I think probably includes resources as well. Does that even out overall so that the doubles and triples from non-resources make the resource distribution curve look accurate? It may. But the fact that we're currently trending towards 400 7 card draws in 10,000 rather than 9~ is statistically relevant. Since the sample size is so small however, there's no way to tell if that is a trend or a fluke, which is why I say it's too early to draw conclusions on the data we already have.

LLCoolDave
11-05-2013, 03:06 AM
What I find odd isn't a factor.

Yes, yes it is, because it leads you to calculating the wrong probabilities and then confirming your suspicion. The event that you have noticed to happen really isn't "I've drawn 3 of the same card in my opening hand twice in the past 200 games" or whatever the actual numbers are but rather "Some odd sequence of hands has occurred in the past 200 games that strikes me as wrong". You didn't predict the specific scenario of drawing 3 of the same card as being something that happens more frequently than it should and then observe that that was the case. Instead, you retroactively look back at the last 200 hands and search for things that seem strange to you. It doesn't really matter what the specific event is, you'll settle for anything to make your case. What you completely fail to acknowledge is the tons and tons of odd and individually unlikely things that would just as much have made you take a second look but failed to show up. If you sum up over all the possible sequences that would raise your suspicion, you'll find that the number of them that actually happened in the data set is a lot closer to what it should be than you currently claim.

You're looking for a pattern and find a pattern, but that's not at all surprising. Real random data is a lot less homogeneous than people expect it to be. I'd wager that you'd find something to worry about in any 500 hand sample I provide to you. In fact, I think it is much more likely that a 500 hand sample set that strikes you as perfectly ok and valid is fabricated by hand than by the proper probability distribution.

Furthermore, your calculation for drawing 3 of the same card in your opening hand is still very wrong. What you try to do is calculate the odds of, say, drawing 3 heat waves in your opening hand. That is not at all the event you noticed though. What you want to calculate is the odds of drawing ANY 3 identical non-resource cards in your opening hand. This is much much more likely to happen than just 9 times the odds of any specific card showing up as triplicates (assuming you run 9 playsets and 24 resources).

Xtopher
11-05-2013, 03:10 AM
Alright, I understand where you're coming from Gwaer. I'm actually quite pleased this thread hasn't been bombarded with random people posting random, vague, anecdotal events. I guess when I see you've called attention to a potential problem, I worry that it's going to have a cascading effect and all the tin foil hat people will jump out of the virtual woodwork. That's not happening so I guess I'm worrying at the methodology needlessly.

Gwaer
11-05-2013, 03:26 AM
Sigh. Post 53 in this thread I mention that my theory about the problem has been relating to grouping. This is something I have been keeping quite close track of for a while, yes, I noticed a pattern of getting groups of the same card more than I thought it should be happening. So I've collected quite a lot of hands trying to see if that feeling is skewed by emotion. As is pointed out in this thread an uncountable number of times. I do not know for a fact that anything is wrong. But the probability that seems skewed now is the same one that seemed skewed then, so if you'd like to explain to me how the math is wrong, that I'm mistaken on how rare any given event should be feel free to do so. The idea that it's in my head is ludicrous though. Since I'm as far as I can tell the only person in the thread that is open to the numbers changing my mind.


Also, as pointed out before, those are the odds for drawing any three non-resource cards that exist in a group of 4 in a 60 card deck. If you feel that hypergeometric math does not cover this, please feel free to explain which formulas you feel fit better and why.

mudkip
11-05-2013, 06:58 AM
In a deck with 60 cards and 24 resources...
Nicely Math'd

I'm starting to hope it comes out that Gwaer is right and cards are stuck together (the dealer needs to put down the wings!). We would owe him so many beers.


I noticed a pattern of getting groups of the same card more than I thought it should be happening.
Gwaer, what is the composition of a hand that wouldn't confirm your theory? What I'm trying to get at is e.g. the chance of seeing a hand with no groups is pretty slim as well.

Willd
11-05-2013, 07:06 AM
It does indicate a trend towards a problem with the shuffler however, which is why I am trying to get people to submit more data. I have not once drawn a conclusion from anything that we have gathered. There is a trend for doubles and triples to appear more often than they should. Which I think probably includes resources as well. Does that even out overall so that the doubles and triples from non-resources make the resource distribution curve look accurate? It may. But the fact that we're currently trending towards 400 7 card draws in 10,000 rather than 9~ is statistically relevant. Since the sample size is so small however, there's no way to tell if that is a trend or a fluke, which is why I say it's too early to draw conclusions on the data we already have.

The bolded phrase ("statistically relevant") has a very specific meaning, and these results absolutely do not show anything statistically relevant. In fact your very next statement directly contradicts it by saying that we don't have the sample size to say whether it's a fluke, which means exactly that the results so far aren't statistically relevant.

Your calculation as to the probability of the hand you got happening is also off by more than an order of magnitude. You were on the right lines but your calculation was 3 of each of these two specific cards. You need to multiply it by the number of possible pairs of sets of 4 cards you could get, which in this case is 9C2 (assuming 36 non-land cards all in 4-ofs), i.e. 36, making the actual probability 0.00052. Also that includes hands where you get 4 of one and 3 of another; if you want the probability of exactly 3/3/1 then you need to multiply it again by 52 (number of cards that aren't one of the two sets of 3).

Ratticus
11-05-2013, 07:44 AM
The df are correct. I used STATA with the csgof plugin but you can check it on any stats program you like.

Odds of your starting hand having multiple copies of a card, assuming your deck is 24 resources and 36 other cards, with those consisting of 9 cards with 4 copies each (call them orc, dwarf, elf, human etc.):
2+ of a non-resource 44.44%
3+ of a non-resource 3.43%

Having multiple copies of a non-resource card in your starting hand is not particularly rare. The reason is that while having 3+orcs is only .39% (hypergeometric distribution) having 3+orcs or 3+humans is .77% and having 3+ orcs or 3+ dwarfs or 3+ elfs or 3+humans etc becomes reasonably likely.

As a side note, a common mistake when calculating this is to use .39%*.39%. This is the chance of BOTH events occurring (getting 3+ dwarfs AND 3+ orcs) when what we want is the chance of EITHER event occurring (getting 3+ dwarfs OR 3+ orcs). To calculate the chance of either event occurring, first calculate the chance of one event not occurring (100%-.39%, so 99.61%); the chance of neither event occurring is then 99.61%*99.61%=99.23%. You get back to the chance of either event occurring by taking 100% minus the chance of neither occurring: 100%-99.23%, or .77%.
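Here is a small Python sketch of that "either/or" calculation, assuming the same deck (24 resources plus 9 non-resource cards at 4 copies each, 7-card hand). The complement method treats the cards as independent, which is only approximately true when they all come out of the same 7 cards, so the sketch also counts hands directly by inclusion-exclusion for comparison. All names are illustrative.

```python
from math import comb

TOTAL = comb(60, 7)   # all 7-card hands from a 60-card deck
PLAYSETS = 9          # assumed: 9 non-resource cards, 4 copies each, plus 24 resources

# P(3 or more copies of one particular card), e.g. "3+ orcs".
p_one_card = (comb(4, 3) * comb(56, 4) + comb(4, 4) * comb(56, 3)) / TOTAL

# "Either/or" via the complement, treating the cards as independent (the method described above).
p_either_of_two = 1 - (1 - p_one_card) ** 2
p_any_approx = 1 - (1 - p_one_card) ** PLAYSETS

# Direct count by inclusion-exclusion: a 7-card hand can hold at most two sets of 3+.
hands_two_triples = comb(4, 3) ** 2 * comb(52, 1) + 2 * comb(4, 4) * comb(4, 3)
p_two_specific = hands_two_triples / TOTAL
p_any_exact = PLAYSETS * p_one_card - comb(PLAYSETS, 2) * p_two_specific

print(f"3+ of one specific card:      {p_one_card:.4%}")
print(f"3+ orcs or 3+ humans:         {p_either_of_two:.4%}")
print(f"3+ of any card (independent): {p_any_approx:.4%}")
print(f"3+ of any card (counted):     {p_any_exact:.4%}")
```

The two "any card" results differ only in the second decimal place of the percentage, so the independence shortcut is a reasonable approximation here.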

If you specifically want me to test this and get me the raw data it's trivial for me to test. Literally copy and paste the data in write 1 line of code and I'm done. All of the data would need to come from decks with the same proportions of cards. So 24 land 9x4 is fine or 24 land 5x4, 8x2 is fine but mixing and matching would be an issue.

Xtopher
11-05-2013, 07:45 AM
Yeah, that is an issue I've been having. When people (not just Gwaer) use terms like "statistically relevant" I'm not sure if they mean it from a layman's perspective or how a mathematician would use the word. A lot of statistical jargon has a very specific meaning in mathematics and a looser meaning within the general population.

Niedar
11-05-2013, 08:21 AM
Actually multiplying the two probabilities gives you the chance of drawing three orcs in one hand and then in the next hand drawing three orcs again. To determine the probability of getting 3 orcs and 3 dwarves in the same hand you need to use a multivariate hypergeometric distribution which ends up actually being more likely than drawing 3 orcs twice in a row.

Edit: In fact I knew there had to be articles written about this for mtg, and I found a very nice explanation just now after searching.

http://www.gatheringmagic.com/chrismascioli-100512-of-math-and-magic-part-1-the-hypergeometric-distribution/

zaril
11-05-2013, 09:23 AM
I'm not going to jump in on the statistics, but seeing as the thread started with someone saying early "I clicked through 100 games" I just wanted to say, there's a draw hand function in the game through the deck manager. The dropdown where you load/save has deck stats, where you can draw your hand. Anyway, intriguing discussion, I'll shush now. :)

Willd
11-05-2013, 09:59 AM
I did some of the calculations that gwaer was using myself and realised that what I said in my previous post wasn't accurate, I think because trying to use a normal hypergeometric calculation to determine multivariate hypergeometric results isn't a good idea.

Anyway, I got that the actual probability of a 3/3/1 hand containing no resources to be 0.0000418, so about 3x higher than gwaer's original number, not as far out as I claimed previously. If you allow the singleton card to be a resource I think the probability goes up to 0.000216 but I'm not completely sure about that calculation.
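For reference, a direct multivariate hypergeometric count of the 3/3/1 hand can be sketched in a few lines of Python. It assumes 24 resources plus 9 playsets of 4, a 7-card hand, and "3/3/1" meaning exactly three copies of each of two cards plus one card from outside those two playsets; the names are illustrative.

```python
from math import comb

TOTAL = comb(60, 7)
PLAYSETS, COPIES, RESOURCES = 9, 4, 24   # assumed deck: 9 x 4 non-resources + 24 resources

pairs_of_playsets = comb(PLAYSETS, 2)            # which two cards form the triples
ways_two_triples = comb(COPIES, 3) ** 2          # which 3 copies of each
other_non_resources = (PLAYSETS - 2) * COPIES    # 28 non-resource cards left for the singleton

# Exactly 3 + 3 + 1 with no resources in the hand.
p_331_no_resources = pairs_of_playsets * ways_two_triples * other_non_resources / TOTAL

# Same shape, but the singleton may also be one of the 24 resources.
p_331_any_singleton = (
    pairs_of_playsets * ways_two_triples * (other_non_resources + RESOURCES) / TOTAL
)

print(f"3/3/1, no resources:  {p_331_no_resources:.7f}")
print(f"3/3/1, any singleton: {p_331_any_singleton:.7f}")
```

Under these assumptions the first figure reproduces the 0.0000418 above. The second comes out nearer 0.000078 than 0.000216 with this particular way of counting, so that one is best treated as unsettled.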

Ebynfel
11-05-2013, 10:01 AM
I'm not going to jump in on the statistics, but seeing as the thread started with someone saying early "I clicked through 100 games" I just wanted to say, there's a draw hand function in the game through the deck manager. The dropdown where you load/save has deck stats, where you can draw your hand. Anyway, intriguing discussion, I'll shush now. :)

They're using the in-game draw as that is the only way to be 100% certain that the mechanics work exactly the same as their control for the testing. If even a misplaced punctuation mark or extra letter differs in the check hand tool, it would flaw the entire dataset.

Gwaer
11-05-2013, 12:18 PM
Once again, when I say that is statistically relevant, I mean it is a flag that more data may make irrelevant. But it's unexpected from what we have.

Also, my odds are the odds of any 3 cards of any 4 card group into a hand at the same time. Not specific cards, it doesn't matter how many different 4 card groups are present in the pool. Those are the odds of it happening in a 60 card deck period.

The odds of seeing any hand that only contains 1 or 2 of any particular card is a good question; I'll work it out when I'm not on my iPad.

Werlix
11-05-2013, 01:05 PM
Ok Gwaer so from looking back your hypothesis is that cards "clump" together more often than they should. Therefore instead of calculating the exact probability of specific "clumped" hands (eg 2 sets of 3 cards) you should be testing against drawing anything that could be defined as "clumped".

You either define what a "clumped" hand is then test that, or make a deck with 15 x 4 cards then draw 7 card hands and note how many 2 ofs, 3 ofs and 4 ofs you draw.

To me this seems like the only way to test your hypothesis. Resource vs non-resource obviously doesn't matter so I think just doing the 15 x 4 cards option would be the easiest to test.
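To have something to compare such a test against, here is a rough Python sketch of the baseline a fair shuffle would give for the 15 x 4 deck: how often the largest group of identical cards in a 7-card hand is 1, 2, 3 or 4. The no-duplicates case is computed exactly; the rest is a simulation with an arbitrary seed and trial count, so those fractions are approximate. This is only a simulated baseline, not data from the game.

```python
import random
from collections import Counter
from math import comb

CARDS, COPIES, HAND = 15, 4, 7   # the 15 x 4 deck suggested above
TRIALS = 20000                   # arbitrary simulation size

# Exact chance that all 7 cards are different: pick 7 of 15 names, then 1 of 4 copies each.
p_no_duplicates = comb(CARDS, HAND) * COPIES ** HAND / comb(CARDS * COPIES, HAND)

rng = random.Random(7)
deck = [name for name in range(CARDS) for _ in range(COPIES)]
largest_group = Counter()
for _ in range(TRIALS):
    hand = rng.sample(deck, HAND)
    largest_group[max(Counter(hand).values())] += 1

# Fraction of simulated hands whose biggest set of identical cards has each size.
baseline = {size: largest_group[size] / TRIALS for size in range(1, COPIES + 1)}

print(f"exact P(no duplicates) = {p_no_duplicates:.4f}")
for size, fraction in baseline.items():
    print(f"largest group of {size}: {fraction:.4f}")
```

Recorded in-game hands could then be tallied the same way and compared with these fractions using the same goodness-of-fit approach as for resources.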

Gwaer
11-05-2013, 01:23 PM
If enough people contribute to this very easy question that this thread is about, it could provide very high confidence that my theory is wrong and it has just been random chance, rather than collecting much more annoying data. And it would answer the first question at the same time.

McKizz
11-05-2013, 11:37 PM
[Attachment 1232]
Regarding clumping, I cannot begin to tell you how many of these I have drawn back-to-back-to-back-to-back. Nearly every game I've drawn 3+ in a row, even with them consisting of <10% of my deck. Not that I mind, seeing as I get 4 2/2's for 4 mana and a 5th card drawn. My opponents don't take too kindly to it though.

Anecdotal, yes. But still factual.

Ditsch
11-06-2013, 06:38 AM
@McKizz Yeah, I have also already had 2 games (out of 4 in total) with people playing that card, and they got it way too often. I had the same thing with the pack raptors. But yeah, it's quite difficult to have data on that one. :( I am pretty sure there is a clumping problem, but to focus on that we might need another thread and also data on it.

rcl
11-06-2013, 08:39 AM
This is like when I would go to bingo with my grandmother and the old ladies with lucky trinkets laid out in front of them would get mad and yell "shake 'em up!!" Or "New caller!!"

Please don't be "that person" ...

Tinuvas
11-06-2013, 09:03 AM
If enough people contribute to this very easy question that this thread is about, it could provide a very high confidence that my theory is wrong and it has just been random chance, rather than collecting much more annoying data...

And I think this is why this thread still has life. One person would like to see more work done by others (for a legitimate reason as far as it goes, but still, work done by others) to satisfy him about something that most everyone else is currently satisfied with. This results in most others trying to convince him that it's all good already because they are currently satisfied with the result as is, even though he isn't, for reasons that are fully valid for him. I don't have an issue with any of this, just pointing it out.

This all is complicated by the moving target of the conversation...scope creep as it were. Is there a problem with the shuffler in handing out resources? It takes x amount of analysis to determine (with relative certainty) whether this is so. Throw in 'clumping' of any sort? New problem, start over, more work for everyone. Clumping for particular cards (Ancestral Spectre)? New problem, start over, more work for everyone. Mulligans just being 6 card redraws with the previous hand hitting the bottom? Escalation card reshuffles? <insert any observed odd randomization here>? New problem! start over! more work for everyone!

I'm not saying the shuffler IS flawed or is NOT flawed. I think what I'm saying is that WE are flawed. Gwaer, I wish you the best in getting enough data to satisfy your concerns. As I am satisfied with the randomizer ATM, I will not be spending my resources (read: time) to assist in this project. I think many others feel the same based on how many further data sets have appeared in the last couple of days. Perhaps that is a bad attitude, or maybe just a lazy one. *shrug*

TL;DR Whether or not the shuffler is correctly randomizing is less important than the fact that most people (based on what I am reading in this thread) perceive that it is, based on current data. That perception will significantly slow down the collection of more data on a volunteer basis.

Hatts
11-06-2013, 09:06 AM
This is like when I would go to bingo with my grandmother and the old ladies with lucky trinkets laid out in front of them would get mad and yell "shake 'em up!!" Or "New caller!!"

Please don't be "that person" ...

Don't listen to him! Buy my anti-clumping hex trinkets, starting at just $19.99. It works best with the anti resource flood and anti resource screw trinkets.

Get the full set of trinkets for just $49.99!

Buyers will get 50% off if they've posted 'statistical proof' of their bad luck in this thread. Order now, supplies are limited!

rcl
11-06-2013, 09:56 AM
Don't listen to him! Buy my anti-clumping hex trinkets, starting at just $19.99. It works best with the anti resource flood and anti resource screw trinkets.

Get the full set of trinkets for just $49.99!

Buyers will get 50% off if they've posted 'statistical proof' of their bad luck in this thread. Order now, supplies are limited!

No no no, DO NOT buy those trinkets, or else you will experience too much anti-clumping (scattering). Common beginner mistake: you get 1/4 of the cards you are trying to pull, BUT then the rest get glued to the bottom of your deck.

Gwaer
11-06-2013, 12:15 PM
You're right, Tinuvas, I am trying to get more people to assist. I've done quite a lot of data gathering, but to get a good baseline I need at least 5 instances of every possible outcome to happen; that should be around 5,000 hands. Though at our current rate it might only be 1000 more. I'll do it for myself eventually just by keeping track as I play games. Though I'm not interested enough to grind out hands constantly until it's done anymore; this thread was about a topic that was a passing interest but also interesting as a social experiment. It's also not pressingly important at this juncture. Does it really matter if hands are random in alpha? The shuffler allows card testing to happen; it's not broken to the point it is unusable. This is more of a polish question and there's a long time to go before release.

Werlix
11-06-2013, 01:35 PM
So all these concerns basically boil down to one question and one question alone - is the randomiser used in Hex "random enough"? If it is truly random then there will be no problems with over clumping, under clumping, too many resources, not enough resources. I figure if we can prove it's random then there's nothing else to prove.

Therefore I've been conducting a simple experiment where I put 15 x 4 cards in my deck then draw hand after hand in the "test draw" part of the UI. I know you'll say it doesn't count cause it's not in the real game but I think it's still a valid way to test the randomiser because there's no way they would program two separate randomisers to use.

So in my tests I just test one thing: are the first two cards in my hand the same card? (I also mix it up and sometimes look at the last two instead, just in case) This is a simple way to test one variable over and over and get some reliable results on how random the randomiser is.

Given true randomness the probability of the first two cards being the same should be 3/59, am I right? This is 0.0508474576271186.
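A quick way to sanity-check the 3/59 figure, assuming the shuffle is uniform: once the first card is fixed, 3 matching copies remain among the other 59 cards. The sketch below (plain Python, with made-up function names and an arbitrary seed) computes that fraction and compares it with a simulated shuffle.

```python
import random
from fractions import Fraction

def exact_probability(copies=4, deck_size=60):
    """After the first card, copies-1 matching cards remain among deck_size-1."""
    return Fraction(copies - 1, deck_size - 1)

def simulated_probability(trials=50000, seed=7):
    """Shuffle a 15 x 4 deck repeatedly and count matching top-two cards."""
    rng = random.Random(seed)
    deck = [card for card in range(15) for _ in range(4)]
    hits = 0
    for _ in range(trials):
        rng.shuffle(deck)
        hits += deck[0] == deck[1]
    return hits / trials

if __name__ == "__main__":
    print("exact:    ", exact_probability(), "=", float(exact_probability()))
    print("simulated:", simulated_probability())
```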

Currently my results are:

Hands drawn......: 520
Duplicates.......: 27

Expected result..: 0.0508474576271186
Actual result....: 0.0519230769230769

Can someone calculate the p-value for that? I'm not sure how that works :)

DeusPhasmatis
11-06-2013, 02:13 PM
So all these concerns basically boil down to one question and one question alone - is the randomiser used in Hex "random enough"? If it is truly random then there will be no problems with over clumping, under clumping, too many resources, not enough resources. I figure if we can prove it's random then there's nothing else to prove.

Therefore I've been conducting a simple experiment where I put 15 x 4 cards in my deck then draw hand after hand in the "test draw" part of the UI. I know you'll say it doesn't count cause it's not in the real game but I think it's still a valid way to test the randomiser because there's no way they would program two separate randomisers to use.

So in my tests I just test one thing: are the first two cards in my hand the same card? (I also mix it up and sometimes look at the last two instead, just in case) This is a simple way to test one variable over and over and get some reliable results on how random the randomiser is.

Given true randomness the probability of the first two cards being the same should be 3/59, am I right? This is 0.0508474576271186.

Currently my results are:

Hands drawn......: 520
Duplicates.......: 27

Expected result..: 0.0508474576271186
Actual result....: 0.0519230769230769

Can someone calculate the p-value for that? I'm not sure how that works :)

Using Python (with numpy and scipy), I get ~91% p-value for Pearson's chi-square goodness of fit test (assuming your calculation for the expected result is correct).
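The exact script isn't shown, so the following is only a sketch of one way to get that number, using the standard library instead of numpy/scipy. It treats each hand as one of two outcomes (duplicate or no duplicate) and runs Pearson's chi-square test with one degree of freedom; the function name is made up for illustration.

```python
from math import erfc, sqrt

def chi_square_two_outcomes(successes, trials, p_expected):
    """Pearson chi-square goodness of fit for a yes/no outcome (1 degree of freedom)."""
    observed = [successes, trials - successes]
    expected = [trials * p_expected, trials * (1 - p_expected)]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # With one degree of freedom the chi-square survival function
    # reduces to erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(statistic / 2))
    return statistic, p_value

# Werlix's numbers: 27 duplicates in 520 hands, expected rate 3/59.
statistic, p_value = chi_square_two_outcomes(27, 520, 3 / 59)
print(f"chi-square = {statistic:.4f}, p-value = {p_value:.3f}")
```

A p-value this close to 1 just means the observed count is about as near to the expected count as it could be; it says nothing about kinds of non-randomness that this particular test doesn't look at.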

Showsni
11-06-2013, 09:39 PM
So, I carried out a simple null hypothesis test for myself.

A 60 card deck consisting of 30 wild shards and 30 ruby shards was created. Opening hands of 7 cards were drawn in the Deck Stats trial hand mode, and the number of ruby shards in each hand counted.

H0 (Null Hypothesis): In a seven card starting hand, I expect to draw 3.5 ruby shards, on average.

H1: I draw more or less than that.

Raw data! I drew 250 starting hands, recording the number of ruby shards. Here are my results. I also ran into an interesting and repeatable bug: http://forums.cryptozoic.com/showthread.php?t=29757

22244344535243414344435241543655664344362443742446 343375324456354344452304314553
46332453745252044543523533453354434423442434122463 045133535343134435324243433555
43555334445553343344531333456454532250232544444354 333423334335544554524634243655
65353454214

Or,

0: 4 times
1: 8 times
2: 27 times
3: 67 times
4: 79 times
5: 50 times
6: 12 times
7: 3 times

Which gives me an average hand of 3.688 ruby and 3.312 wild. The chance of getting that is 30.636779783%, which is well above the 10% significance level.

So, by that test I'd keep the null hypothesis, and say there's nothing necessarily wrong with the shuffler. Other than, you know, causing the game to crash when you run it 250 times in a row.
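For reference, here is a sketch of how the tally above compares with what a uniformly random shuffle predicts. It uses the hypergeometric distribution for a 7-card hand from a 30/30 deck. The z-score on the mean is just one possible check, assuming independent hands; the post doesn't say which test produced the 30.6% figure, so this is not necessarily the same calculation.

```python
from math import comb, sqrt

# Tally of ruby shards per hand, copied from the post.
OBSERVED = {0: 4, 1: 8, 2: 27, 3: 67, 4: 79, 5: 50, 6: 12, 7: 3}
DECK, RUBY, HAND = 60, 30, 7

def hypergeom_pmf(k):
    """Chance of exactly k ruby shards in a 7-card hand from a 30/30 deck."""
    return comb(RUBY, k) * comb(DECK - RUBY, HAND - k) / comb(DECK, HAND)

hands = sum(OBSERVED.values())
expected = {k: hands * hypergeom_pmf(k) for k in OBSERVED}
sample_mean = sum(k * n for k, n in OBSERVED.items()) / hands

# Variance of a single hand under the hypergeometric model, then the
# z-score of the observed mean against the expected mean of 3.5.
variance = HAND * (RUBY / DECK) * (1 - RUBY / DECK) * (DECK - HAND) / (DECK - 1)
z_score = (sample_mean - HAND * RUBY / DECK) / sqrt(variance / hands)

for k in OBSERVED:
    print(f"{k} ruby: observed {OBSERVED[k]:3d}, expected {expected[k]:6.2f}")
print(f"sample mean = {sample_mean:.3f}, z = {z_score:.2f}")
```

The expected counts for 0 and 7 ruby are both well below 5, so any chi-square comparison on the full table would need those tail categories merged or treated with caution.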

Xtopher
11-07-2013, 12:12 AM
Crazy, tho, that you drew 7 resources 3 times in only 250 hands.
(just sayin'... don't judge me)

Ditsch
11-07-2013, 02:56 AM
I am more shocked about getting 4 times no ruby in 250 draws with 50% of the cards being ruby shards. That seems to match how I perceive it with my own deck, quite often getting 0 shards (well, I only run 28 shards). The average seems to be okay, I never had a problem there, but 0 and 7 shard hands are showing up a bit too much.

@Werlix thank you for the data. Alas, 2 card clumping was not a problem that I did see; it was 3 out of 4 cards, getting 2 of them in the starting hand and drawing the 3rd one in the next 3 draws too (so 3 of 4 in 10 cards).

Niedar
11-07-2013, 07:29 AM
A 3 card clump would have an effect on his test.

Werlix
11-07-2013, 01:05 PM
@Werlix thank you for the data. Alas, 2 card clumping was not a problem that I did see; it was 3 out of 4 cards, getting 2 of them in the starting hand and drawing the 3rd one in the next 3 draws too (so 3 of 4 in 10 cards).

This is how the game works: deck is shuffled randomly, cards are drawn from the top one at a time.

If we can prove that the shuffler works randomly then we can prove there is no problem with any kind of anomalies (eg 3-card "clumps").

Are you claiming 3-card clumps turn up too often? If so you're also claiming that the randomiser is not random. There's no way the randomiser can be truly random AND somehow give you more 3-card clumps than it should.

Xtopher
11-07-2013, 01:41 PM
Crazy, tho, that you drew 7 resources 3 times in only 250 hands.
(just sayin'... don't judge me)
Ah, lol, nm. The deck was all resources. I was thinking it was 24 like the others have been.

Showsni
11-08-2013, 05:19 AM
Crazy, tho, that you drew 7 resources 3 times in only 250 hands.
(just sayin'... don't judge me)


I am more shocked about getting 4 times no ruby in 250 draws with 50% of the cards being ruby shards.

The chance of getting all ruby and the chance of getting no ruby are the same - 0.00527126753. With 250 draws, I'd expect to hit that chance about 1.31781688428 times; so hitting it three times (for all ruby) and 4 times (for no ruby) really isn't that far off.

(Actual chance of getting exactly 3 all ruby hands in my setup is 0.102149589018982 (10%) and 4 no ruby is 0.0334259670935629 (3%), so within the realms of possibility.)

If my initial hypothesis had been "The shuffler is random, so drawing a no ruby hand should come up every 189.7 hands or so"
Alt hyp: "It comes up more often than that"

then I suppose that those 4 no ruby hands would be significant at the 5% level... So maybe that is something to test. In other words, the experiment says that hands do average out to what you'd expect, but maybe the overall distribution is a bit off, with more outliers than you'd expect.
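The figures in this post can be reproduced with a short sketch, assuming each of the 250 hands is an independent draw from a freshly shuffled 30/30 deck, so that the count of no-ruby hands is binomial. The helper names are made up for illustration.

```python
from math import comb

# A no-ruby hand means all 7 cards come from the 30 non-ruby shards.
P_NO_RUBY = comb(30, 7) / comb(60, 7)
HANDS = 250

def binom_pmf(k, n=HANDS, p=P_NO_RUBY):
    """Chance of exactly k such hands out of n."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binom_tail(k, n=HANDS, p=P_NO_RUBY):
    """Chance of k or more such hands out of n."""
    return 1 - sum(binom_pmf(i, n, p) for i in range(k))

print(f"P(single hand has no ruby) = {P_NO_RUBY:.6f}")
print(f"expected such hands in {HANDS} = {HANDS * P_NO_RUBY:.3f}")
print(f"P(exactly 3) = {binom_pmf(3):.4f}")
print(f"P(exactly 4) = {binom_pmf(4):.4f}")
print(f"P(4 or more) = {binom_tail(4):.4f}")
```

The one-sided tail P(4 or more) is the quantity that matters for the "comes up more often than that" alternative, and it lands a little under 5%. Bear in mind that this hypothesis was picked after looking at the data, which weakens the result.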

Ditsch
11-09-2013, 01:30 AM
@Showsni true, all the small data sets seem to point to the average being okay but the outliers showing up a bit too often. The outliers are a bit of a problem, as those are the ones that force you to mulligan, and so you are at a disadvantage more often than you should be.

@Werlix of course it can; it can be random and still have recurring sequences.

UDareUTake
11-09-2013, 06:35 AM
The math has gone way beyond my understanding =/

MuffLord4
11-09-2013, 05:21 PM
I feel like half of my games are decided by resource screw; either the card pool isn't big enough to give us good games yet, or the algorithm is effed up like hell.

I feel like cards are stuck together sometimes, having very often the same cards in my hand up to 3 times. You can try that out when using inspiration engine, you'll often get the same card multiple times, it just doesn't feel random.

My opponent just created 8 turbines in a row while I was sitting on 3 red mana for ~15 turns. While having 16 purple, I had basically 3 pairs of the same cluster of cards while not having seen some of my cards at ALL unless I drew until infinity.


Edit: now he's just getting monoliths.

Niedar
11-09-2013, 07:04 PM
Inspiration Engine is a known broken card. It determines what card it is going to create at the beginning of the turn and keeps on generating that card throughout it.

Showsni
11-10-2013, 01:22 AM
I think the next thing to do would be to test the complete randomisation - set up a deck of 60 cards numbered 1 to 60 (i.e. all different), then shuffle and draw the whole deck many times, keeping note of the order the cards come in, and look for patterns. Unfortunately the scroll bar in the Deck Stats window is broken (it won't scroll) so you'd have to get into an actual game to test this, making it take far longer than it should.

Werlix
11-10-2013, 01:56 PM
@Showsni true, all the small data sets seem to point to the average being okay but the outliers showing up a bit too often...
@Werlix of course it can; it can be random and still have recurring sequences.

What?

So you're saying the deck shuffler can be truly random AND at the same time produce "too often the outliers"?

I'm particularly interested in your use of the words "too often" - implying that there's something wrong with the deck shuffler despite its randomness.

Werlix
11-10-2013, 01:58 PM
I think the next thing to do would be to test the complete randomisation - set up a deck of 60 cards numbered 1 to 60 (i.e. all different), then shuffle and draw the whole deck many times, keeping note of the order the cards come in, and look for patterns. Unfortunately the scroll bar in the Deck Stats window is broken (it won't scroll) so you'd have to get into an actual game to test this, making it take far longer than it should.

We don't really need to go to these lengths. See my earlier post where I tested the occurrence of the first two cards in the deck being the same card in a deck of 4 x 15 cards. If the randomiser is truly random then we know how often to expect this result.

VoidInsanity
11-11-2013, 03:23 AM
I too am starting to experience this. Every game now, if I have a card in my hand and there are multiples of it in my deck, I will draw the other copies of it within the next few turns. I have ended up with a hand of entirely the same cards (4 of one, 3 of the other) multiple times now.

Xtopher
11-11-2013, 09:41 AM
Sigh.

Werlix
11-11-2013, 12:48 PM
Sigh.

Werlix
11-11-2013, 12:57 PM
I kind of feel defeated at this point. When we use research, math and probability to come to conclusions then people post things like "I'm starting to experience this too"...

VoidInsanity, please read the results of my last experiment then answer these two questions:

1. Is the deck randomiser truly random?
2. Are there more "clumps" of cards in the deck than there should be?

Cheers.

zaril
11-11-2013, 02:47 PM
They're using the in-game draw, as that is the only way to be 100% certain that the mechanics work exactly the same as their control for the testing. If even a misplaced punctuation mark or extra letter differs from the check hand tool, it would flaw the entire dataset.

True, even though they could just as well stealth-patch the randomizer mid-test, depending on where it draws its numbers from. :) But your point is definitely valid. I just figured, since I'm a programmer, that at least I would use the same functionality to draw cards; you can't have different randomizers when drawing in different places. But yeah, it's just guesswork. :)

Gwaer
11-11-2013, 03:35 PM
Also, if there's a bug in the shuffler it won't necessarily come out in the first two cards of a hand. Why didn't you just do the entire hand, Werlix? Those aren't results; those are as much of an unfounded hypothesis as what I put forth. I've honestly lost interest in this thread. I admit I did a bit of a social experiment here: I put obviously flawed math in this thread to back up my real hypothesis, as a test to see if anyone would call me on it and, in doing so, show that they were able to actually have a meaningful discussion about this. Basically no one did. So there's very little point in continuing to argue with people that don't understand what they're talking about. Long story short: anyone who doesn't work for CZE and says they know that everything is fine is lying or stupid, and anyone who doesn't work for CZE and says they know there is a problem is lying or stupid. There's just no way to know with the information we have available, and I'm not willing to devote time to gathering that information when, as I said a bit ago, it's alpha and the cards we are getting dealt are passable in this situation. It's not a high priority to fix even if there is anything wrong, so CZE certainly shouldn't be devoting manpower to it now.

I may pick this up after release, there's talk of a feature that lets you log games, and or share those games so it will be easier to get an unbiased sample of a lot of hands then.

Werlix
11-11-2013, 04:00 PM
Also, if there's a bug in the shuffler...

All the shuffler does is randomise the deck. To test randomness it doesn't matter where we look inside the deck. The top two cards of the deck should be in random order just as the rest should be. Pretty simple. Any other problems with the data or was that it?

Gwaer
11-11-2013, 04:12 PM
The bug could actually be anywhere between randomizing the cards to choosing the cards. That stack of images you see to the right doesn't have to be an already ordered deck. Unless you have some inside knowledge into how the entire process is programmed to work, your method leaves a lot of ground available. Moreover, if the problem does lie in the shuffler it could present itself in any number of ways that your "test" would miss.

If for example it is being fed random numbers from an external source, and not getting them fast enough during high load, it may take the 9th card 5 or 6 times in a row for example. If all the cards are stacked together in groups, those 5 or 6 times could be 4 of the same card and then 2 of the next set of cards, that bug would only be presenting itself when whatever is providing the RNG is under heavy load, which seems more likely to happen at some point during the deck than early on, or if it's an intermittent problem could be happening at any point in the deck other than the two cards you sampled in each hand. A 7 card sample would provide a much higher rate of hitting it, since you're tripling the available information you're looking at in any given deal.

You don't have the information required to make the declarations you're making. Honestly, no one not at CZE does, because no one has done the required amount of information gathering to make any real determinations.

noragar
11-11-2013, 04:42 PM
With effects like peek, which place certain cards in certain positions in the deck, Shrine of Prosperity, which affects the card beneath it, a couple of cards that place cards on the top or the bottom of the deck, and who knows what strange effects that may come along in the future, I would be shocked if they programmed it any way other than having it place each card in the deck in a random order whenever a shuffle is called for and store that order until the next shuffle.
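To make that concrete, here is a minimal sketch of a "shuffle once, store the order" deck. It is purely hypothetical: the class and method names are invented, and nothing is known here about how Hex actually stores or shuffles a deck.

```python
import random

class Deck:
    """Ordered deck: index 0 is the top card."""

    def __init__(self, cards, rng=None):
        self.cards = list(cards)
        self.rng = rng or random.Random()

    def shuffle(self):
        # Fisher-Yates: every ordering is equally likely if rng is uniform.
        for i in range(len(self.cards) - 1, 0, -1):
            j = self.rng.randint(0, i)
            self.cards[i], self.cards[j] = self.cards[j], self.cards[i]

    def draw(self):
        """Remove and return the top card."""
        return self.cards.pop(0)

    def peek(self, count=1):
        """Look at the top cards without removing them."""
        return self.cards[:count]

    def put_on_bottom(self, card):
        self.cards.append(card)

deck = Deck(range(60), random.Random(3))
deck.shuffle()
print("top three cards:", deck.peek(3))
```

With a stored order like this, effects that look at or move specific positions only need list operations, and the randomness lives entirely in `shuffle`.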

Gwaer
11-11-2013, 04:50 PM
Also, notice. None of those cards are in the game yet. We have no idea what sorts of temporary implementations are in effect right now. That's why this conversation might as well be tabled until release, or at the very least after some polishing has been done in beta and all the cards are in the game.

Vorpal
11-11-2013, 05:36 PM
I know you'll say it doesn't count cause it's not in the real game but I think it's still a valid way to test the randomiser because there's no way they would program two separate randomisers to use

I'm not sure this is correct.

Yes, they wouldn't build a separate 'randomiser' but computers don't really make random numbers anyway. They key off of seed values.

The seed values used in the two instances may well be completely different. There could easily, depending on how they are choosing the seed value, be a bug in one situation but not the other.

Unless we know how they are generating their seed values, we can't make this assumption.
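As an illustration of why the seed matters, the sketch below uses Python's own generator (not whatever Hex uses, which is unknown) to show that two shuffles started from the same seed come out in the same order. The seed values are arbitrary stand-ins for something like a clock reading.

```python
import random

def shuffled_deck(seed, size=60):
    """Return a deck order produced by a generator seeded with `seed`."""
    rng = random.Random(seed)
    deck = list(range(size))
    rng.shuffle(deck)
    return deck

# Hypothetical failure mode: two shuffles that end up with the same seed
# (for instance the same clock reading) give the same order.
same_a = shuffled_deck(seed=1384212960123)
same_b = shuffled_deck(seed=1384212960123)
other = shuffled_deck(seed=1384212960124)

print("same seed, same order:     ", same_a == same_b)
print("different seed, same order:", same_a == other)
```

Whether anything like this can happen in the game depends entirely on how and when the server seeds its generator, which nobody in this thread knows.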

Vorpal
11-11-2013, 05:41 PM
I'd also like to suggest a possible solution, if indeed there is a problem, that is heavily employed in boardgames that use a deck of cards.

Let's say you have 4 random events in a deck of 60 cards, and you want them to occur roughly evenly, but not have players aware of when they will happen.

You split your deck into 4 piles of 14, place one random event in each pile to make piles of 15, shuffle each of those piles individually, then stack them one on top of the other to make the deck of 60. While there is still a great deal of randomness, the 'random events' are spread out much more evenly than they would be if you just shuffled them in with every other card.

HEX could do the same thing with resources. 20 resources in your 60 card deck? Make 20 decks of 3, each of which contains a resource, then stack them all up to form the draw deck.
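The pile scheme can be sketched like this. It is an illustration of the board-game technique only, with invented names, and not a proposal for how Hex's code is or should be structured.

```python
import random

def stratified_deck(specials, fillers, rng=None):
    """Spread `specials` out: one per pile, each pile shuffled, piles stacked."""
    rng = rng or random.Random()
    if len(fillers) % len(specials) != 0:
        raise ValueError("fillers must divide evenly among the piles")
    fillers = list(fillers)
    rng.shuffle(fillers)  # randomise which fillers land in which pile
    per_pile = len(fillers) // len(specials)
    deck = []
    for i, special in enumerate(specials):
        pile = fillers[i * per_pile:(i + 1) * per_pile] + [special]
        rng.shuffle(pile)  # the special's position inside its pile is random
        deck.extend(pile)  # stack the piles on top of each other
    return deck

# 20 resources in a 60-card deck -> 20 piles of 3, one resource in each.
resources = [f"R{i}" for i in range(20)]
spells = [f"S{i}" for i in range(40)]
deck = stratified_deck(resources, spells, random.Random(5))
print(deck[:9])
```

With piles of 3, a player can never see more than 4 non-resources in a row, so this is deliberately not a uniformly random shuffle; it trades randomness for evenness.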

Vorpal
11-11-2013, 05:49 PM
All the shuffler does is randomise the deck. To test randomness it doesn't matter where we look inside the deck. The top two cards of the deck should be in random order just as the rest should be. Pretty simple. Any other problems with the data or was that it?

I think this too is an assumption we can't necessarily make.

Why would you assume the mechanic for drawing your opening hand is the same as that for drawing subsequent cards? You might think it makes sense to do it this way, but I could very easily see it being implemented a different way programmatically, because of the mulligan.

Presumably HEX knows the location of each card in your deck - that is, it doesn't have an unordered pool of cards and just hand you a random one when you ask for it. This is because quite a few cards refer to modifying the cards in the deck that are above or below them, and you can interact with the top card on your deck in some ways.

Do they build the deck before or after you get your opening hand? If you mulligan, do they rebuild the deck before or after your new opening hand? How are the seed values chosen in this case?

Werlix
11-11-2013, 06:02 PM
The bug could actually be anywhere between randomizing the cards to choosing the cards. That stack of images you see to the right doesn't have to be an already ordered deck. Unless you have some inside knowledge into how the entire process is programmed to work, your method leaves a lot of ground available. Moreover, if the problem does lie in the shuffler it could present itself in any number of ways that your "test" would miss.

If for example it is being fed random numbers from an external source, and not getting them fast enough during high load, it may take the 9th card 5 or 6 times in a row for example. If all the cards are stacked together in groups, those 5 or 6 times could be 4 of the same card and then 2 of the next set of cards, that bug would only be presenting itself when whatever is providing the RNG is under heavy load, which seems more likely to happen at some point during the deck than early on, or if it's an intermittent problem could be happening at any point in the deck other than the two cards you sampled in each hand. A 7 card sample would provide a much higher rate of hitting it, since you're tripling the available information you're looking at in any given deal.

You don't have the information required to make the declarations you're making. Honestly, no one not at CZE does, because no one has done the required amount of information gathering to make any real determinations.

Even if cards were randomly drawn from the stacked deck instead of randomised once at the start (this would be horribly inefficient, illogical and also pretty much impossible given cards like the ones noragar mentioned), there's no reason why the same random number would be used for drawing consecutive cards because of server load - this doesn't make sense from a programming point of view.

I can appreciate that we don't know exactly how everything works but I think making assumptions like "the deck is randomised then cards are drawn from the top" is pretty ok.

I'd like to see any evidence that anything is "wrong" with the deck shuffler. So far two sets of data collection have pointed (with a small amount of uncertainty) towards the deck shuffler being fine. Those people who genuinely think there is a problem should make a hypothesis, conduct some tests and post their results. At the very least state a hypothesis rather than just saying "something feels wrong".

Xtopher
11-11-2013, 06:25 PM
Well, Gwaer, it was very obvious from the beginning that you didn't fully understand stats. I didn't call you on it because you seemed to mostly understand probability and I was being polite. It's interesting to find out you were faking it.

VoidInsanity
11-11-2013, 06:50 PM
I kind of feel defeated at this point. When we use research, math and probability to come to conclusions then people post things like "I'm starting to experience this too"...

VoidInsanity, please read the results of my last experiment then answer these two questions:

1. Is the deck randomiser truly random?
2. Are there more "clumps" of cards in the deck than there should be?

Cheers.

1. = Definitively no. I have made decks around exploiting this fact. 50 mana decks with 10 cards should not give me reliable draw. I am noticing patterns with the draw based around the ratio of cards in a deck, it is definitively not random.
2. = Yes, without question.

Werlix
11-11-2013, 07:15 PM
1. = Definitively no. I have made decks around exploiting this fact. 50 mana decks with 10 cards should not give me reliable draw. I am noticing patterns with the draw based around the ratio of cards in a deck, it is definitively not random.
2. = Yes, without question.

Oh great we have some claims :) Interested in your use of the word "definitively" however. I'd like to see your data that backs this up.

So did you see my experiment where I made a deck of 4 x 15 cards then made test draws to see if the first two cards were the same card? I drew 520 hands and experienced the 2-card "clump" 27 times. The expected probability of this happening is 0.0508474576271186 and my observed rate was 0.0519230769230769 - in other words the observed rate of clumping was not significantly different than the expected rate from true randomness.

Also did you see the earlier results of test draws based on number of resources drawn in opening hands and post-mulligan hands? Basically the results showed that the numbers also didn't significantly differ from what was expected.

Xtopher
11-11-2013, 07:54 PM
I wonder if it's possible for the shuffler to be random for some people and not for others, based on their system specs or the method by which the random seed is generated.

From my experience playing, I can't say anything definitive, but as far as I can tell the shuffler is working fine for me. But then another person declares that absolutely it's broken. If it's a case of the shuffler working incorrectly for some (and probably a minority, from what I'm seeing) but fine for the majority, this is going to be an impossible problem to track down and correct.

Niedar
11-11-2013, 08:17 PM
If you think the shuffler is client side then there are bigger problems than it not being random.

Xenavire
11-11-2013, 08:19 PM
I have no idea how to feel about this. I had 2-3 no resource hands today alone (24 resources), but I have also had several optimal hands. I have yet to see any resource only hands, which is good. I doubt I have even gotten close to 200 games yet, but I am well over 50. (And I mean completed games, I am sure I have seen a good deal more hands overall. Thanks priority bug, glad I found the end key workaround.)

I am really interested in this though. I tend to see a lot more hands with less resources than more. And I have yet to make a deck under 24 resources. I think the most resources I have seen so far is 5 at a time.

Xtopher
11-11-2013, 08:32 PM
If you think the shuffler is client side then there are bigger problems than it not being random.
I haven't indicated what I think beyond saying I haven't noticed a problem with the shuffler.

I'm considering the proposition that some are adamantly saying it's broken while others are saying it seems fine, without assuming the broken camp is a bunch of nutters.

Xtopher
11-11-2013, 08:35 PM
I have no idea how to feel about this. I had 2-3 no resource hands today alone (24 resources), but I have also had several optimal hands. I have yet to see any resource only hands, which is good. I doubt I have even gotten close to 200 games yet, but I am well over 50. (And I mean completed games, I am sure I have seen a good deal more hands overall. Thanks priority bug, glad I found the end key workaround.)

I am really interested in this though. I tend to see a lot more hands with less resources than more. And I have yet to make a deck under 24 resources. I think the most resources I have seen so far is 5 at a time.
I can say with certainty that I've seen no zero resource hands and am fairly certain I've seen no one resource hands. Also, the most resources I've had is 5, never 6 or 7.

It sounds like the number of games I've played is in the ballpark with your total games played.

VoidInsanity
11-11-2013, 08:40 PM
Oh great we have some claims :) Interested in your use of the word "definitively" however. I'd like to see your data that backs this up.
Create a deck with 59 mana and one unit and generate hands. The odds of you getting that unit on the first draw are 60-1 (1.666666666%).

I just did a quick test, here are the results. A 1 means I got the card, a zero means only mana. This is using a deck of Green/Purple 50% split.

01100000011000000011000000000000000101001000000000 000000010000111

13/65 which is 20%. Also I got two lots of draws where I got it twice in a row (0.027777777% of happening) and at the very end one of 3 in a row (0.000462962% chance of happening). Also, if you notice, the times I did manage to draw that lone card are very close together in the test, forming "clumps".

Now the same test again, but with two of that card in the deck. Once again, 0 for no, 1 for 1 of the card drawn and 2 means two.

00010001001000010001000010000000011000000000000010 1010001000010000100001

At that point the alpha crashed, but I did not manage to draw two of the same card in the opening hand. I ended up with a 20.5% draw rate on getting that single card in my opening hand.

Now when I use 4 of that card in a deck, I get this.

02010001111100010311100110100010000001210011210000 0101100000100212011002

Notice once again how clumped the results are. I am too tired to do the maths myself, but I am sure someone can take the above and work out what the probability of it happening should be and what the probability I ended up experiencing was. Either way, without even doing such maths it looks extremely unbalanced to me.

noragar
11-11-2013, 11:45 PM
You have 7 chances to draw the one card, so the odds are 7/60 = 11.66667%, not 1.666667%. The rest of your calculations are similarly off.

So your results were higher than expected in the first trial (20% vs 12%), lower than expected (20% vs. 22%) in the second trial, and higher than expected (45% vs. 40%) in the third trial. With the small sample sizes, though, none of the differences are statistically significant (they don't show any evidence that the underlying distribution the samples were drawn from is different than expected.)
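The expected rates quoted here (about 12%, 22% and 40%) follow from the chance of seeing at least one copy in a 7-card opening hand from a 60-card deck. A small sketch, with an illustrative function name:

```python
from math import comb

def p_at_least_one(copies, deck_size=60, hand_size=7):
    """Chance that a hand contains at least one of `copies` identical cards."""
    # One minus the chance that all cards in the hand come from the other cards.
    return 1 - comb(deck_size - copies, hand_size) / comb(deck_size, hand_size)

for copies in (1, 2, 4):
    print(f"{copies} copies: {p_at_least_one(copies):.4f}")
```

For a single copy this reduces to 7/60, which is where the 11.67% comes from.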

VoidInsanity
11-12-2013, 02:49 AM
I did say I was too tired for maths for a reason. However from experience the system is iffy. Always draw clumps of cards, certain cards never seen and everyone I have played against has agreed with this. Single coloured decks are far more reliable for this reason because the system is clearly not random.

Edit - As an example - I just drew a hand of 1 mana and 3 lots of a pair of cards for the 3rd time this night.

SealFate
11-12-2013, 03:45 AM
From my observations (and I have not bothered keeping a record) when referring to cards that I have 4 or more copies in a deck, the following happen:
A Card in hand increases chance of drawing the same card.
A Pair in my starting hand and I will draw a 3rd copy by turn 6 without fail (OK, maybe 9/10).
A Three of a kind in starting hand (has rarely happened outside of shards) and I will have the 4th by turn 5.
A starting hand of 4 shards and I will draw another 3 shards over the next 5 draws (in a deck containing 24 shards)

Niedar
11-12-2013, 06:48 AM
Just so you know, the chance of getting a card you run only one copy of in your opening hand is 11.6%, not 1.6%.

Edit: Oh looks like this went to next page, and has already been said.

Vorpal
11-12-2013, 08:13 AM
there's no reason why the same random number would be used for drawing consecutive cards because of server load - this doesn't make sense from a programming point of view.

Are you a programmer? It makes perfect sense.

If there is heavy server load or lag and your seed values get stuck, that's exactly what would happen. Given that one of the most popular ways of picking a seed value is the server time in milliseconds, you can see exactly how this type of problem might arise.

Right now our cards routinely disappear into thin air with heavy server load. Claiming there's no way heavy server load could cause any other ill effects just seems bizarre to me - and making far too many unfounded assumptions.

Werlix
11-12-2013, 01:00 PM
Are you a programmer? It makes perfect sense.

If there is heavy server load or lag and your seed values get stuck, that's exactly what would happen. Given that one of the most popular ways of picking a seed value is the server time in milliseconds, you can see exactly how this type of problem might arise.

Right now our cards routinely disappear into thin air with heavy server load. Claiming there's no way heavy server load could cause any other ill effects just seems bizarre to me - and making far too many unfounded assumptions.

Yes, I am a programmer, and no, it still doesn't make sense to me that server load would produce the same random number hundreds of milliseconds apart. Yes, it could affect the random number generation, but it wouldn't cause it to produce the same number. Also, if it did this in any significant way it would have affected previous experiments, but as has already been shown, all experiments point towards correct randomness.

And as for your last paragraph I never claimed "there's no way heavy server load could cause any other ill effects". Please refrain from straw-manning :)

Werlix
11-12-2013, 01:03 PM
From my observations (and I have not bothered keeping a record)

Thanks for your observations... They sound like nicely formed hypotheses ready to test :)

I'd suggest taking one of them, conducting an experiment, and then posting your results. We have yet to see an experiment that suggests the randomiser is not truly random :)

I'm in the middle of an experiment which should address at least two of them so that should be interesting, will let you know how I get on...

Werlix
11-12-2013, 01:16 PM
You have 7 chances to draw the one card, so the odds are 7/60 = 11.66667%, not 1.666667%. The rest of your calculations are similarly off.

So your results were higher than expected in the first trial (20% vs 12%), lower than expected (20% vs. 22%) in the second trial, and higher than expected (45% vs. 40%) in the third trial. With the small sample sizes, though, none of the differences are statistically significant (they don't show any evidence that the underlying distribution the samples were drawn from is different than expected.)


I did say I was too tired for maths for a reason. However from experience the system is iffy. Always draw clumps of cards, certain cards never seen and everyone I have played against has agreed with this. Single coloured decks are far more reliable for this reason because the system is clearly not random.

Edit - As an example - I just drew a hand of 1 mana and 3 lots of a pair of cards for the 3rd time this night.

Don't give up yet, VoidInsanity! You need to keep drawing more of those hands from your previous experiment. If there's a problem as you say, the observed results should start to trend away from the expected results.

Remember, don't try to find patterns after they happen (like your 'as an example'). If you think there's a problem, define it (e.g. drawing 3x 2-pairs in opening hand), then test it :)
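A rough sketch of what "define it, then test it" can look like in practice: fix the event and its expected probability before drawing any hands, record how often it happens, and then ask how likely a count at least that large would be under a fair shuffler. The function name and the numbers in the usage lines are purely illustrative and are not data from this thread.

```python
from math import comb


def binomial_upper_tail(hits: int, trials: int, p_event: float) -> float:
    """Probability of seeing `hits` or more events in `trials` independent
    trials when each trial has probability `p_event` of the event."""
    return sum(
        comb(trials, k) * p_event ** k * (1 - p_event) ** (trials - k)
        for k in range(hits, trials + 1)
    )


if __name__ == "__main__":
    # Illustrative numbers only: 30 hits in 200 recorded hands for an
    # event expected 7/60 of the time.
    print(binomial_upper_tail(hits=30, trials=200, p_event=7 / 60))
```

A small value suggests the observed count would be surprising under a fair shuffle, but only if the event was chosen before the hands were drawn.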

Vorpal
11-12-2013, 01:53 PM
Yes, I am a programmer, and no, it still doesn't make sense to me that server load would produce the same random number hundreds of milliseconds apart. Yes, it could affect the random number generation, but it wouldn't cause it to produce the same number.

Suppose they make a call to a serverside function to return the seed. Suppose there is heavy server lag and two different calls get processed at the same time or an incorrect time or otherwise return the same time based seed. Then you could easily wind up producing the same random number.
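The scenario described here can be sketched in a few lines. This is purely hypothetical: nothing in the thread confirms that the game reseeds per draw or uses a time-based seed, and `draw_index` and the timestamp are invented for illustration.

```python
import random


def draw_index(seed_ms: int, deck_size: int = 60) -> int:
    """Hypothetical per-draw reseeding: build a fresh generator from a
    millisecond timestamp and pick one card position with it."""
    rng = random.Random(seed_ms)
    return rng.randrange(deck_size)


if __name__ == "__main__":
    stuck_timestamp = 1384279980000  # made-up value standing in for a stalled clock
    first = draw_index(stuck_timestamp)
    second = draw_index(stuck_timestamp)
    # Same seed in, same "random" position out.
    print(first, second, first == second)
```

A design that seeds the generator once and keeps drawing from it does not have this failure mode, which is why it matters whether the seed is fetched per draw or per shuffle.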

joseph5185
11-12-2013, 02:27 PM
I'm impressed (to say the least) that this thread is now somehow 24 pages long over resources.

I have played several games and I have not experienced any issues with randomness.

I think people simply don't like it and feel they need the edge when drawing their opening hand and try to come up with these mathematical formulas, theories, algorithms, etc..

Have a 60 card deck, give yourself 25 resources (I don't personally use that many) and I think you'll find your odds of pulling multiple resources in your opening hand are very favorable.

I'm not going to go into the exact numbers percentage wise, but I think you get the idea..
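For reference, the exact numbers are easy to generate. The sketch below assumes 25 resources in a 60-card deck, a 7-card opening hand, and a uniformly random shuffle; the function name is made up for illustration.

```python
from math import comb


def p_resources(k: int, resources: int = 25, deck: int = 60, hand: int = 7) -> float:
    """Chance that exactly `k` of the `hand` opening cards are resources."""
    return (
        comb(resources, k)
        * comb(deck - resources, hand - k)
        / comb(deck, hand)
    )


if __name__ == "__main__":
    for k in range(8):
        print(f"{k} resources: {100 * p_resources(k):.1f}%")
    multiple = sum(p_resources(k) for k in range(2, 8))
    print(f"2 or more resources: {100 * multiple:.1f}%")
```

Changing the `resources` argument shows how the spread shifts for decks running 24 or 26 instead.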

Werlix
11-12-2013, 03:02 PM
Suppose they make a call to a serverside function to return the seed. Suppose there is heavy server lag and two different calls get processed at the same time or an incorrect time or otherwise return the same time based seed. Then you could easily wind up producing the same random number.

Ok sure so if the seed is the time and somehow two function calls requested hundreds of milliseconds apart run at identical milliseconds for some reason - this could happen.

It doesn't really matter anyway as the deck is only randomised once rather than on each draw. And even if for some reason it did work this way, if it skewed things enough we would be picking this up in our experiments - which we aren't.

Werlix
11-12-2013, 03:04 PM
I'm impressed (to say the least) that this thread is now somehow 24 pages long over resources.

I have played several games and I have not experienced any issues with randomness.

I think people simply don't like it and feel they need the edge when drawing their opening hand and try to come up with these mathematical formulas, theories, algorithms, etc..

Have a 60 card deck, give yourself 25 resources (I don't personally use that many) and I think you'll find your odds of pulling multiple resources in your opening hand are very favorable.

I'm not going to go into the exact numbers percentage wise, but I think you get the idea..

Yep we've done this exact experiment (except with 24 resources). You can find the results with graphs earlier on in the thread. The results found were very close to the expected results.

mudkip
11-12-2013, 03:27 PM
I'm impressed (to say the least) that this thread is now somehow 24 pages long over resources.

Really? This is one of those subjects like religion and politics. And like those subjects, most "evidence" is anecdotal.

joseph5185
11-12-2013, 03:58 PM
Really? This is one of those subjects like religion and politics. And like those subjects, most "evidence" is anecdotal.

I couldn't agree more.

Werlix
11-12-2013, 05:21 PM
Really? This is one of those subjects like religion and politics. And like those subjects, most "evidence" is anecdotal.

It is interesting to see that all anecdotal evidence supports the theory that the deck randomiser is not random yet all researched evidence supports the opposite...

Xenavire
11-12-2013, 05:32 PM
I think there might be something to the whole 'serverload' idea. Had a couple more no resource hands (24 resource decks) today, and had a lot of other issues besides. Yet on days with little to no major bugs, I have had nearly flawless numbers of resources.

Perhaps the randomiser is getting scrambled by server load.

Vorpal
11-12-2013, 06:07 PM
Ok sure so if the seed is the time and somehow two function calls requested hundreds of milliseconds apart run at identical milliseconds for some reason - this could happen.

It doesn't really matter anyway as the deck is only randomised once rather than on each draw. And even if for some reason it did work this way, if it skewed things enough we would be picking this up in our experiments - which we aren't.

Do we know for sure the deck is randomized only once?

Do they create the whole deck and then draw the top 7 cards?

Or do they just hand you a random 7 cards from the 60 available, and then only once you've decided not to mulligan, decide to construct and randomize the draw deck?

If you don't like your initial hand and mulligan, do they draw the top 6 cards of your draw deck and give them to you? And then do they reshuffle the deck or not reshuffle? Or do they reshuffle everything, then give you the top 6 cards from the new deck?

There's a lot we don't know about how their process works.
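To make the two possibilities concrete, here is a sketch of both dealing procedures. Neither is known to be what the game actually does; the function names and the deck contents are invented for illustration.

```python
import random


def deal_by_shuffle(deck, rng, hand_size=7):
    """Shuffle the whole deck once, then take the top cards as the hand."""
    shuffled = list(deck)
    rng.shuffle(shuffled)
    return shuffled[:hand_size], shuffled[hand_size:]


def deal_by_sampling(deck, rng, hand_size=7):
    """Pick random positions for the hand first, then shuffle what is left
    to form the draw deck."""
    picked = set(rng.sample(range(len(deck)), hand_size))
    hand = [card for i, card in enumerate(deck) if i in picked]
    library = [card for i, card in enumerate(deck) if i not in picked]
    rng.shuffle(library)
    return hand, library


if __name__ == "__main__":
    deck = ["resource"] * 24 + ["troop"] * 36  # placeholder decklist
    rng = random.Random()  # seeded once, reused for every operation
    print(deal_by_shuffle(deck, rng)[0])
    print(deal_by_sampling(deck, rng)[0])
```

With a sound random number generator, both procedures give every 7-card hand the same chance, so the difference only matters if the generator itself misbehaves.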

noragar
11-12-2013, 06:19 PM
I think there might be something to the whole 'serverload' idea. Had a couple more no resource hands (24 resource decks) today, and had a lot of other issues besides. Yet on days with little to no major bugs, I have had nearly flawless numbers of resources.

Perhaps the randomiser is getting scrambled by server load.

On your hands with no resources, did you have clumps of non-resource cards that would have been next to each other in your decklist? If not, what would cause the server load to randomize 60% of your deck, but not 100% of it?

Same question for those saying that the cards are "clumped", but not immediately next to each other (i.e. if I have two in my opening hand, then I'm more likely to draw a third in the first few turns, etc.). Server load may theoretically cause the deck not to be randomized, but how would it cause a "weak shuffle" where cards get broken up but not very far from each other? The deck is either randomized or it's not. Unless it's programmed really strangely (like simulating an actual shuffle and server load causing it to crash in the middle of the first riffle), how does it wind up "slightly randomized"?

joseph5185
11-12-2013, 06:30 PM
Do we know for sure the deck is randomized only once?

Do they create the whole deck and then draw the top 7 cards?

Or do they just hand you a random 7 cards from the 60 available, and then only once you've decided not to mulligan, decide to construct and randomize the draw deck?

If you don't like your initial hand and mulligan, do they draw the top 6 cards of your draw deck and give them to you? And then do they reshuffle the deck or not reshuffle? Or do they reshuffle everything, then give you the top 6 cards from the new deck?

There's a lot we don't know about how their process works.

I want to say the entire deck is randomized again when you mulligan. At least it should be.

To back this, I have mulliganed and received the same card again. However, I only had one instance of this card. That would pretty much prove that ALL the cards are randomized again after mulligan.

Niedar
11-12-2013, 06:30 PM
Werlix, you should just give up. Every time you prove someone's feelings wrong with data, they will always have another theory on when or what is not working right. Soon we will be talking about lunar cycles.