Sunday, November 16, 2014

How It's Made: Price Flagging

I owe my partner in crime, Etienne Erquilenne, a huge debt for adding a much-needed second pair of hands to this whole Prosper project.  His IRL expertise is invaluable, and has freed me up to accelerate the development schedule measurably.  But, as I've expressed on the blog before, I'm very much a fan of open sourcing.  So let's look under the hood.

Volume Flagging

Volume data was straightforward.  Since volumes never go negative, and rarely jump by orders of magnitude, it was pretty easy to wrap the values up into a normal-ish histogram.  Below is Tritanium:
1yr of Tritanium Volumes - The Forge

Not perfectly normal, but close enough that we can use percentiles to check the "sigma levels".  
For those who aren't statistics nerds, we can say things about a value depending on how far it deviates from the mean.  Anything within +/- 1 deviation (sigma) is normal behavior, covering roughly 68% of values.  Values beyond +/- 2 deviations should be rare (about 5%), and values beyond +/- 3 extremely rare (about 0.3%).  The further we move from the mean, the faster the number of values we should expect to see drops off.

Since volumes are largely well behaved, I used this principle of sigma flagging to highlight extreme outliers to report for the Prosper show.  
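
For the curious, here is a minimal Python sketch of that percentile-based sigma check.  It is not the actual Prosper code: the function names and the choice of 1/2/3 sigma levels are assumptions for illustration.  The idea is to look up which percentiles a true normal curve puts at each sigma level, then compare today's volume against those same percentiles of the item's own history.

```python
from statistics import NormalDist


def sigma_percentiles(levels=(1, 2, 3)):
    """Map each sigma level to the (low, high) percentiles of a normal curve."""
    nd = NormalDist()
    return {k: (nd.cdf(-k) * 100, nd.cdf(k) * 100) for k in levels}


def percentile(sorted_vals, pct):
    """Linearly interpolated percentile of an already sorted, non-empty list."""
    pos = (len(sorted_vals) - 1) * pct / 100
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def flag_volume(history, today, levels=(1, 2, 3)):
    """Return the signed sigma level that today's volume exceeds (0 = normal)."""
    vals = sorted(history)
    flag = 0
    # Walk the levels in ascending order so the most extreme match wins.
    for k, (lo_pct, hi_pct) in sorted(sigma_percentiles(levels).items()):
        if today > percentile(vals, hi_pct):
            flag = k
        elif today < percentile(vals, lo_pct):
            flag = -k
    return flag
```

With a year of daily volumes in `history`, `flag_volume(history, today)` would return 0 for an ordinary day and +/-3 for the kind of extreme worth reporting.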

Price Flagging

Price values aren't so well behaved, and using the same approach is not going to flag useful data:
Just looking for straight price outliers isn't useful.  The only things that will flag are long rises/declines which represent the extremes of the last year.  We'd like to use the same extremity methodology for prices, but a different approach would be required.

Deviation From a Trend

The inspiration came from the Bollinger Band chart.  Simply put, it draws a simple-moving-average trend line, then moving-deviation bands on either side of it (red lines on matchstick above).  If we instead characterize each day's distance from the trend, measured in those moving deviations, we can say "this is an extreme deviation" the same way we did for volumes.
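
As a rough sketch of "distance from a trend" (assuming a 20-day window, which is the usual Bollinger default but not necessarily what the tool uses), each day's price can be expressed as a number of rolling standard deviations away from its simple moving average.  The function name is hypothetical.

```python
from statistics import mean, pstdev


def trend_deviation(prices, window=20):
    """For each day with a full window, return (price - SMA) / rolling std.

    The window includes the current day, as on a Bollinger Band chart.
    """
    out = []
    for i in range(window - 1, len(prices)):
        chunk = prices[i - window + 1 : i + 1]
        sma = mean(chunk)
        sd = pstdev(chunk)
        # A perfectly flat window has no deviation to measure against.
        out.append((prices[i] - sma) / sd if sd > 0 else 0.0)
    return out
```

A result near 0 means the price is sitting on its trend line; a result beyond +/- 2 means it has broken outside the usual Bollinger bands.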

This is a far more "normal" plot.  Also, Etienne rolled in a simple-moving-median to compensate for items that might have outrageously bimodal behavior because of a paradigm shift due to a patch.
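
To illustrate why a median helps, here is a small sketch (hypothetical function name, assumed 20-day window) that measures each day's price as a fractional distance from its moving median.  Unlike an average, the median barely moves when a handful of days in the window sit at a wildly different level.

```python
from statistics import median


def median_deviation(prices, window=20):
    """For each day with a full window, return (price - median) / median."""
    out = []
    for i in range(window - 1, len(prices)):
        chunk = prices[i - window + 1 : i + 1]
        med = median(chunk)
        # Prices should be positive; guard anyway to avoid dividing by zero.
        out.append((prices[i] - med) / med if med else 0.0)
    return out
```

For example, a single 1000 ISK print in a window of 100 ISK days would drag a 20-day average up to 145, but the median stays at 100, so the following days still read as "on trend".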

Unfortunately, without some sort of second filter, we're going to flag everything every week, and that's not a useful filter.  So Etienne added a voting scheme and "highest votes" binning technique to properly classify the resulting flags.
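
A minimal sketch of how such a voting scheme could work, under stated assumptions: each day in a lookback window casts one vote for the bin its deviation falls into, and the item is filed under the bin with the most votes.  The bin edges, the `min_votes` cutoff, and the function names below are all hypothetical and are not taken from Etienne's code.

```python
from collections import Counter

# Hypothetical bin edges, in deviations from the trend (upper bound, label).
BINS = [
    (-2.0, "very abnormally low"),
    (-1.0, "abnormally low"),
    (1.0, "normal"),
    (2.0, "abnormally high"),
]
TOP_BIN = "very abnormally high"


def classify(dev):
    """Return the label of the first bin whose upper bound exceeds dev."""
    for upper, label in BINS:
        if dev < upper:
            return label
    return TOP_BIN


def vote(deviations, min_votes=3):
    """Tally one vote per day and return (winning label or None, all votes)."""
    votes = Counter(classify(d) for d in deviations)
    votes.pop("normal", None)  # "normal" days never earn a flag
    if not votes:
        return None, votes
    label, count = votes.most_common(1)[0]
    return (label if count >= min_votes else None), votes
```

Requiring a minimum number of votes is what keeps a single noisy day from flagging an item; only sustained abnormal behavior wins a bin.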



Results So Far

So far, this is my favorite validation that the pull is working as intended:


Here we see a peak last week, and a drastic crash in progress.  Though we would have reviewed the data anyway (fuel is a forced group in the tools), finding it in the expected flagging group is a great sign.

To explain what I see in the first graph: we see a spike in pre-Phoebe stockpiling, then a rapid dump off once the patch hit.  What I also see in the above is a heavy overcorrection in the price, dumping it much lower than really makes sense.  If you were watching this product, this would be a great opportunity to buy hoping for a snap-back.  Especially looking at the bottom RSI chart, closer inspection shows that the product is crossing heavily into "oversold" territory, and is strongly signalling an artificially low price.  Of course, balance these price signals against the volume flags (perhaps slightly anemic) to temper expectations. 
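
For readers who haven't met RSI before, here is a sketch of the standard calculation (Wilder's smoothing over 14 periods, the conventional default; the chart above may use different settings).  By convention, readings below 30 are called "oversold" and readings above 70 "overbought".

```python
def rsi(closes, period=14):
    """Relative Strength Index of the most recent close, on a 0-100 scale."""
    if len(closes) <= period:
        raise ValueError("need more than `period` closing prices")
    changes = [b - a for a, b in zip(closes, closes[1:])]
    gains = [max(c, 0.0) for c in changes]
    losses = [max(-c, 0.0) for c in changes]
    # Seed with simple averages, then apply Wilder's smoothing.
    avg_gain = sum(gains[:period]) / period
    avg_loss = sum(losses[:period]) / period
    for g, l in zip(gains[period:], losses[period:]):
        avg_gain = (avg_gain * (period - 1) + g) / period
        avg_loss = (avg_loss * (period - 1) + l) / period
    if avg_loss == 0:
        # No losses at all: pegged at 100, or 50 if the price never moved
        # (the flat case is a convention chosen here, not a standard).
        return 100.0 if avg_gain > 0 else 50.0
    rs = avg_gain / avg_loss
    return 100.0 - 100.0 / (1.0 + rs)
```

A price that has fallen almost every day for two weeks, like the fuel crash above, pushes the result toward 0, which is why the chart reads as deep in oversold territory.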

Under the hood, the tool is checking the closing average against the white-dotted moving average line.  Though the moving average will catch up, right now the distance from the trend is WAY out of whack.  Since it has voted "very abnormally low" for 5 days, along with 2-3 votes for "abnormally high", this was going to end up in the charting group regardless.

Great... but

Now I have a new problem... too much good data.  To keep the outlier segment inside 15 minutes, I have to filter the pick list to 15-25 items.  Etienne's new tool flagged 500 items, and the true-positive rate is astounding.  For those looking to get into some powerful market automation, this methodology is extremely powerful and should help boil down opportunities like nothing EVE has seen before.  Though we still lack the means to automate "black swan" events like expansion releases, the flagging methodology is very useful for the active trader.

Also, for all of its power, we're up against a problem where the show format and the goals don't match.  Two of the chief goals of the show are to showcase investment opportunities and general trend information going into the weekend.  Unfortunately, the flags are very good for a very short period.  Many of the flags show high pops during the week, after the action has expired.  So, if the trend isn't cooking on Tues-Weds, the show will miss the opportunity to report it.

Lastly, it's a little hard to use this as a direct day-to-day trading tool because of the way the CREST feeds update.  Rumors are that CCP is going to roll out a "one day" CREST feed to simplify keeping the dbs up to date.  

What's Next?

We have some tasks on the table, but the short term goals are:
  1. Bring in destruction data
  2. Do inter-hub analysis
  3. Build indexes
Things are moving along pretty well.  I expect that zkillboard data will be live by the first week of December, and a few more QOL updates should make the show prep move along easier.  Also, rumors of new CREST feeds should improve the quality of data we're pulling (or make our lives even more difficult).  Regardless, with new hardware coming in at home, the ability to automate more should make things move more smoothly.

4 comments:

Unknown said...

I'm sure you are, but make sure you're not just taking data and filtering it to get some form of standard deviation. The purpose of stats is to apply rules (of which there are many) to see if there is anything of interest, such as outliers.

Croda said...

perhaps if you also filtered by average volume. For example, an outlier that normally trades 2 per day is less useful than an outlier that trades 50 per day.

Also, given Eve is on one server then the moment a trade opportunity is highlighted on your show then it will be arbitraged out. For me, the best parts of your show are the information (i.e. highlighting the movements) and the explanations behind the movements.

Unknown said...

Blogger ate my first try...

The trends are built on a per-item basis. So as long as there are sufficient samples, even if they are low per-day, we should be able to build decent trending. Though there are hard filters for minimum volume, and number of samples, to help eliminate the most extreme outliers. The methods allow us to define a methodology and apply it to every item in the market, rather than hand-tuning 10k individual black boxes.

Now, we do have a problem of reducing false-positives. As we add more crunching methods, we will increase the number of items flagged. Right now, we're at ~500 to sift through by hand. If we cross the 750 threshold (~10% of items checked), flagging routines will need to be re-tuned to better hone in on the <5% we have time to cover in the show. We also have been having trouble with being "too late" to a trend, where the action is played out before the show even airs... but we will need better prediction methodologies to root that out.

Lastly, you're mostly right about the arbitrage point. I disagree about how efficient the EVE markets are. The existing toolset is extremely rudimentary, and only the most obvious arbitrage flags will be cashed out. Part of this show is to help encourage a new generation of market tools that will lead to an even more efficient market. We have already identified a few arbitrage opportunities that are not being exploited with our investigations and (hopefully) we will have reports on that in the near future once some more experimentation is done.

Agatir Solenth said...

Wow... I am glad I stumbled upon your efforts. I'm not a huge industry/trader, but you have introduced me to another aspect of the game that I'd like to get into. (I started in 05) You guys and your tools, blog, and weekly reporting are making this economic and statistical stuff understandable! Thanks!
