PSA: The AH Data is Incomplete [FIXED]



israel.kendall
05-24-2015, 05:30 PM
I often see people in game chat saying that "x" card only sells 3 copies a day, so it's not worth listing or should be priced lower. So I'm making this post to let you all know that the AH data is not showing all the sales of cards. If you think only 2 CMK sold yesterday, it is quite possible that 10x of them sold.

I will present the evidence, and this is not the first time I have tested this. This morning I bought 9x Zodiac Shaman. 8x for 4 plat, and 1x at 7 plat.

http://i.imgur.com/dm6Vdlm.jpg

The CSV data for today is out, and here is a screen from my reader:

http://i.imgur.com/92zmOIs.jpg

As you can see, only 2 Zodiac Shaman are listed as being sold for plat on May 24, when we know for a fact that at least 9 were sold.

This affects volume as well as averages / medians. And it is something I feel everyone should be aware of when they are buying and selling cards.
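
To make the effect on averages and medians concrete, here is a minimal Python sketch using the Zodiac Shaman purchases above (8 at 4 plat, 1 at 7 plat). It assumes, purely as a guess consistent with the 2 sales shown in the CSV, that sales at the same price collapse into a single reported row. The variable names are illustrative only.

```python
from statistics import mean, median

# Actual purchases from the test: 8 copies at 4 plat, 1 copy at 7 plat.
actual_sales = [4] * 8 + [7]

# Assumption: sales at identical prices collapse into one reported row.
reported_sales = sorted(set(actual_sales))

# Compare volume, mean and median for the real and the reported data.
print("actual:  ", len(actual_sales), round(mean(actual_sales), 2), median(actual_sales))
print("reported:", len(reported_sales), mean(reported_sales), median(reported_sales))
```

Under that assumption the reported volume is far too low, and both the average and the median drift toward the rarer, higher price.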

Also, I notice a correlation between the missing sales and buying multiple cards at the same price.

Kami
05-24-2015, 05:35 PM
Only other thing I can think of is that it's a time zone issue.

israel.kendall
05-24-2015, 05:38 PM
Only other thing I can think of is that it's a time zone issue.

I've done the test multiple times. The sales never show up, so it's not a time zone issue. AH data usually comes out 7-8pm my time, and this was done at almost 6am. I would not post this if I were not 100% sure.

KingGabriel
05-24-2015, 05:42 PM
Not the first time it's broke either. >.>

israel.kendall
05-24-2015, 05:46 PM
Not the first time it's broke either. >.>

Really it's been like this the entire time. I just test it from time to time to see if it has been fixed yet.

Tazelbain
05-24-2015, 05:56 PM
Ya there is no volume data. A while ago Chark said he was going to ask DataDragon and get back to us. I bet they don't want to give us that for business reasons.

israel.kendall
05-24-2015, 05:59 PM
Ya there is no volume data. A while ago Chark said he was going to ask DataDragon and get back to us. I bet they don't want to give us that for business reasons.

I'm not sure, but I contacted DataDragon a while back about it. He replied a few days later that it was fixed, but it wasn't. I sent a reply that it was not fixed but I never heard back after that. This was all end of March, early April this year.

Bmon
05-24-2015, 07:01 PM
This post is 100% true. Multiple copies of a card sold for the same price seem to show up in the AH data as one sale. I've tested it myself on 4 bulk sales, and I've noticed it in general on a few cards I've watched but didn't buy myself.

poizonous
05-24-2015, 07:04 PM
In more shocking relevant news, I wanna know how the worst card in the game is continuing to sell on the AH? This card is not an upgrade to any starter deck card

IronPheasant
05-24-2015, 10:08 PM
In more shocking relevant news, I wanna know how the worst card in the game is continuing to sell on the AH? This card is not an upgrade to any starter deck card

Collectors.

Since it costs almost nothing, many think little of throwing down ~500 gold for a playset. Gotta get a playset of all the commons for pauper, gnome wat I'ma saying?

Spending anything above 1.2 plat on them is indeed an affront against humanity though, I agree.

israel.kendall
05-24-2015, 10:18 PM
The low volume of sales is the reason I chose zodiac shaman for the test. Last reported plat sale before this was May 14 I think.

Kroan
05-25-2015, 04:27 AM
I suspected this when I saw the number of Booster trades each day. They are a lot lower than what I imagined (I mean, sometimes I sell those amounts all by myself during a day).

silverlocke
05-25-2015, 04:41 PM
There must be a reason that they're not fixing it, and not addressing it.

RamzaBehoulve
05-25-2015, 05:43 PM
I've been observing that lately as well.

Not that it matters much anyway. The cards showing as selling a lot are indeed the ones selling a lot.

Selanius
05-25-2015, 10:28 PM
There must be a reason that they're not fixing it, and not addressing it.

It's probably because they are busy getting Set 3 and other content out.

Yoss
05-26-2015, 09:59 AM
Please fix data errors. :(

Mokog
05-26-2015, 10:56 AM
Could be due to how the query calls on the database returning a single value for some transaction groups. Queries and large database structures can sometimes have items fall through the cracks. Alt art items are an issue with some of my numbers. The delimiter I used was based on name, currency, date but alt arts will carry the same differentiating values in those fields. So I have to add in an additional field to query the right data.

We could be seeing a more complex version of a similar problem.
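
The alt art collision described above can be sketched in a few lines of Python. The card name, prices, and the `is_alt_art` flag are all hypothetical, and this is not the actual query or schema anyone in this thread uses. It only shows how a key of name, currency, date merges rows that an extra field would keep apart.

```python
from collections import defaultdict

# Hypothetical rows: (name, currency, date, is_alt_art, price)
rows = [
    ("Example Card", "PLATINUM", "2015-05-24", False, 10),
    ("Example Card", "PLATINUM", "2015-05-24", True, 900),
]

def group_prices(rows, key_fields):
    """Group prices by the tuple positions listed in key_fields."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[i] for i in key_fields)
        groups[key].append(row[4])
    return dict(groups)

# Key on name, currency, date only: regular and alt art land in one bucket.
coarse = group_prices(rows, key_fields=(0, 1, 2))
# Add the alt-art flag to the key: the two versions stay separate.
fine = group_prices(rows, key_fields=(0, 1, 2, 3))

print(coarse)
print(fine)
```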

nicosharp
05-26-2015, 11:03 AM
Finally, Zodiac Shamans have a use!

Khazrakh
05-26-2015, 11:06 AM
Finally, Zodiac Shamans have a use!

I happen to have "some" more if you need a broader test group... ;)

Gwaer
05-26-2015, 11:07 AM
It also could be intentional: this gives us a good idea of what things are selling for, but not an exact volume, so that you can't make definitive statements about volume.

Kroan
05-26-2015, 11:10 AM
Queries and large database structures can sometimes have items fall through the cracks.

Only if your SQL is flawed. It's not like a database won't return items because of the size. :P

Mokog
05-26-2015, 11:22 AM
Large was probably the wrong word to use. Complex may be a better fit.

hex_colin
05-26-2015, 11:23 AM
If it were my business, I'd intentionally be giving enough information to be useful to the community, but certainly not every transaction. And, I wouldn't be telling anyone how much of the data they were getting (since you could extrapolate from there).

Tazelbain
05-26-2015, 11:38 AM
Except...

We think that we're posting all of the data. Of course given a few concerns in this thread earlier, I've asked DataDragon to take a look at our SQL. We don't want to obscure AH sales data as I believe availability of this information leads to liquidity in the market (i.e. more people want to pull the trigger on buying/selling stuff when they feel that they are getting fair market pricing info).
So, some clarity would be awesome.

Assassine
05-26-2015, 12:51 PM
If it were my business, I'd intentionally be giving enough information to be useful to the community, but certainly not every transaction. And, I wouldn't be telling anyone how much of the data they were getting (since you could extrapolate from there).

Would you, however, tell the community that that is what's going on? Because I'm pretty sure people wouldn't complain if Hex Ent stated that this is their stance on it.

hex_colin
05-26-2015, 01:02 PM
Would you, however, tell the community that that is what's going on? Because I'm pretty sure people wouldn't complain if Hex Ent stated that this is their stance on it.

Of course. "Here's a representative subsection of the AH data to help you gauge trends, generate approximate pricing information, etc."

A moot point since Chark has publicly stated that his intention is to provide everything (for now). Busted SQL FTL?

DataDragon
05-26-2015, 03:19 PM
Greetings, data lovers.
I have been monitoring the situation, and our intention is to share all available data with you, worry not.

The reason why this has been an ongoing thing is because there are more moving pieces involved than just the report.
The query used is a rather simple query that does not group data together, and the only clause on it is the date and escrow status.
Our report also can not produce any auctions that have not been cleared due to rollbacks, cancels, and reserves not met.
One of the issues that is causing problems is that we are pulling from the active auction table and the archived auction table to ensure we don't miss any auctions that may be in flight, and applying a union to the data that may be de-duping what it thinks is duplicate data.
The report doesn't care about the time of day, only the date, so the union action thinks this is all duplicate data and is throwing it out.
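
A minimal sketch of that behavior, using Python's built-in `sqlite3` module. The table names, columns, and the split of rows between the two tables are assumptions made for illustration, not the real schema or report query. In standard SQL, `UNION` drops duplicate rows from the combined result, while `UNION ALL` keeps every row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: date only, no time of day, no unique id in the report.
conn.executescript("""
    CREATE TABLE active_auctions   (item TEXT, price INTEGER, sale_date TEXT);
    CREATE TABLE archived_auctions (item TEXT, price INTEGER, sale_date TEXT);
""")

# Assumed split: 8 sales at 4 plat already archived, 1 sale at 7 plat still active.
conn.executemany(
    "INSERT INTO archived_auctions VALUES (?, ?, ?)",
    [("Zodiac Shaman", 4, "2015-05-24")] * 8,
)
conn.execute(
    "INSERT INTO active_auctions VALUES (?, ?, ?)",
    ("Zodiac Shaman", 7, "2015-05-24"),
)

query = """
    SELECT item, price, sale_date FROM active_auctions
    {op}
    SELECT item, price, sale_date FROM archived_auctions
"""

# UNION treats identical rows as duplicates and keeps one of each.
deduped = conn.execute(query.format(op="UNION")).fetchall()
# UNION ALL keeps every row from both tables.
complete = conn.execute(query.format(op="UNION ALL")).fetchall()

print(len(deduped), len(complete))
```

With these sample rows the `UNION` version reports one sale per distinct price, while `UNION ALL` reports all nine.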

I will need to manually rerun the reports for each day in the past to ensure all missing data is produced.
I can confirm that after rerunning the report for 2015-05-24, the missing Zodiac sales appear, which seems to confirm the theory.

Your interest in the data is greatly appreciated, as it helps to confirm suspicions I have had but couldn't quite track down until now.
We will be discussing these options internally, and hope to get things resolved to satisfaction.
The data is there, it is just a matter of making sure the processes are working correctly.

Yoss
05-26-2015, 03:21 PM
You sound just like the database guru I work with at my job. :)

Keep up the good work.

EDIT:
Why can't you just pull from archive and let the next day catch what you missed?

DataDragon
05-26-2015, 03:37 PM
The archive is always a day behind, so you would be waiting essentially 2 days to get the completed data.
That is why we pull from both, and only accept settled auctions.
We are trying to get the data to you as soon as is possible.

israel.kendall
05-26-2015, 03:37 PM
Sounds like the right man is on the case!

http://www.healthyobsessions.net/wp-content/uploads/2010/03/data_trim21.png

Yoss
05-26-2015, 04:30 PM
The archive is always a day behind, so you would be waiting essentially 2 days to get the completed data.
That is why we pull from both, and only accept settled auctions.
We are trying to get the data to you as soon as is possible.

I see. Rather than a data archive (which you obviously still need to have, but might not be ideal for this task), have you all considered using a data warehouse solution that's updated more frequently? The client that I built at work connects to a data warehouse that's updated every 20 minutes. Granted, it's not a huge database, so perhaps I'm gaining benefit from the small transaction volume. Still, 48 hours is a rather large jump from 20 minutes.

Anyhow, I'll stop speculating. You know how to do your job better than I do. :)

PureVapes
05-26-2015, 11:50 PM
Looks like it may be fixed.

Kroan
05-27-2015, 01:53 AM
Yeah. The stats now make much more sense. 230 trades in Set 2 is much closer to what I thought should be traded than 19 (see hexprice.com for a better view of the impact).

ossuary
05-27-2015, 04:47 AM
I laugh every time people say Zodiac Shaman is useless. I've sold at least a couple dozen of them on the AH for non-zero amounts of gold, and even a handful for plat. There's always SOMEONE who wants/needs a card and would rather just pay a bit for it than wait for it to show up in an opened pack (or god forbid waste a real draft pick on it!). :)

Jemy000
05-27-2015, 04:54 AM
Greetings, data lovers.
Our report also can not produce any auctions that have not been cleared due to rollbacks, cancels, and reserves not met.

What's this about reserves? AH improvement incoming?

israel.kendall
05-27-2015, 08:41 AM
Looks like it may be fixed.

Yeah the data for the 26th appears to be much more accurate. I'm seeing 20x burn sold rather than 4 or 5, and same thing for other popular commons I've checked. Also over 400k in plat sales on the day.

Tazelbain
05-27-2015, 08:46 AM
Well, this is going to be another shock to the economy. We have had quite a few shocks in the past 3 weeks. It will be interesting to see where things finally settle.

Glad we are going to start to see the whole picture.

israel.kendall
05-27-2015, 09:02 AM
Just to show the magnitude of what these changes will mean, here is a chart (from hexsales.net) assuming the data on the 26th is at least mostly complete.

http://i.imgur.com/Gu2kdSb.jpg

Assassine
05-27-2015, 12:40 PM
Gonna be interesting to see what effect this is going to have on card prices.

darkwonders
05-27-2015, 01:27 PM
It'll probably affect packs more than cards.

Since the data pull was ridding itself of duplicate prices, more pack prices got discarded because people would probably be listing more of them at the same price. I know I would put 5-10 packs up at a time at the same price.

On the card side, people would probably only list 1 or 2 at a time and place it at a price lower than the current lowest. I do that all the time too. So there'd be a bit more unique price sales due to this.

It is nice to finally see the true volume of packs/cards being purchased :)

ossuary
05-27-2015, 01:39 PM
Now if only hexprice would parse some of that to show more of the gold prices (how about plat and gold side by side for each card, eh???).

Tazelbain
05-27-2015, 01:40 PM
But packs affect the gold : plat ratio which has major implications in this gold-hungry environment.

israel.kendall
05-27-2015, 01:43 PM
It'll probably affect packs more than cards.

Since the data pull was ridding itself of duplicate prices, more pack prices got discarded because people would probably be listing more of them at the same price. I know I would put 5-10 packs up at a time at the same price.

On the card side, people would probably only list 1 or 2 at a time and place it at a price lower than the current lowest. I do that all the time too. So there'd be a bit more unique price sales due to this.

It is nice to finally see the true volume of packs/cards being purchased :)

It will be most noticeable with common and uncommon cards that are often listed at the same prices. But it is still important for rares, legendaries, and AA.

israel.kendall
06-17-2015, 06:09 PM
I tested again today with 12x Zodiac Shaman and it seems all 12 showed up. Thanks DataDragon for the quick fix, and gratz to whoever unloaded all these Zodiac Shaman on me.

Jonesy
06-19-2015, 06:05 PM
It's pretty funny reading DataDragon's post, because just today at work I had an issue with UNIONing four tables together: it was forcing a DISTINCT on each table instead of just making sure no duplicates existed across multiple tables. Voila, I changed it to a UNION ALL and the bug was fixed. Weird seeing my work and my play merge.
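
One way to get the behavior described here (no auction counted twice across tables, but genuine repeat sales kept) is to concatenate the tables and de-duplicate on a unique identifier instead of on the whole row. The sketch below is plain Python. The `auction_id` field and the sample rows are hypothetical and do not describe the actual tables behind the AH report.

```python
# Hypothetical rows: (auction_id, item, price, sale_date)
active = [
    (101, "Zodiac Shaman", 4, "2015-05-24"),
    (102, "Zodiac Shaman", 4, "2015-05-24"),
]
archived = [
    (102, "Zodiac Shaman", 4, "2015-05-24"),  # same auction, present in both tables
    (103, "Zodiac Shaman", 4, "2015-05-24"),
]

def merge_by_id(*tables):
    """Concatenate tables (like UNION ALL), keeping the first row per auction_id."""
    seen = {}
    for table in tables:
        for row in table:
            seen.setdefault(row[0], row)
    return list(seen.values())

# De-duplicating on the id keeps three distinct auctions.
merged = merge_by_id(active, archived)
# De-duplicating on the visible columns only (what a plain UNION would do
# if the id were not selected) collapses everything into one row.
whole_row_distinct = {row[1:] for row in active + archived}

print(len(merged), len(whole_row_distinct))
```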