Looking at the 'Market price not available' problem again...
GetMarketValue and PDFs (Probability Density Functions)

Quick summary of how it works: GetMarketValue requests a price PDF for the item from each Stat module. Each Stat module also provides a price range for its PDF (lower and upper limits). All the useful values of the PDF fall within this range, and outside it the PDF returns (near) zero.
GetMarketValue samples all the PDFs at regular intervals to find the price midpoint of their combined distribution (using integration by the rectangle method).
Sampling starts from the lowest of the lower limits, and the initial sample interval is based on the overall range of all the PDFs: 1/100 of the span from the lowest lower limit to the highest upper limit.
Suppose one Stat module produces a PDF with a very wide range, and another produces a very narrow one. The sampling interval, derived from the very wide range, may then be larger than the entire range of the narrow PDF, and the sampling process can skip over the narrow PDF's 'useful' part entirely. This throws off the integration calculation, and can cause the 'Market Price: not available' problem, or produce wildly high incorrect 'midpoints' instead.
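To make the failure mode concrete, here is a rough Python sketch of the sampling described above (not the actual addon code: `bell_pdf`, `midpoint`, and all the numbers are invented for illustration):

```python
import math

def bell_pdf(mean, stddev):
    """Return a Gaussian PDF plus its 'useful' range (mean +/- 3*stddev)."""
    def pdf(x):
        return math.exp(-0.5 * ((x - mean) / stddev) ** 2) / (stddev * math.sqrt(2 * math.pi))
    return pdf, mean - 3 * stddev, mean + 3 * stddev

def midpoint(pdfs, steps=100):
    """Rectangle-method integration: sample the summed PDFs at fixed
    intervals, then walk through the samples to the 50% point."""
    lo = min(p[1] for p in pdfs)
    hi = max(p[2] for p in pdfs)
    step = (hi - lo) / steps
    samples = []
    for i in range(steps + 1):
        x = lo + i * step
        samples.append((x, sum(p[0](x) for p in pdfs) * step))
    total = sum(w for _, w in samples)
    acc = 0.0
    for x, w in samples:
        acc += w
        if acc >= total / 2:
            return x
    return hi

wide = bell_pdf(100.0, 50.0)    # range -50 .. 250, so the interval is 3
narrow = bell_pdf(10.7, 0.1)    # range 10.4 .. 11.0 -- narrower than one interval

# With 100 steps, every sample point misses the narrow PDF's useful range,
# so the 'midpoint' is driven almost entirely by the wide PDF (~100).
print(midpoint([wide, narrow], 100))
# With a much finer interval the narrow PDF is actually seen, and it
# pulls the midpoint down toward its own peak (~11).
print(midpoint([wide, narrow], 10000))
```

The same two inputs give wildly different answers depending only on the sampling resolution, which is the heart of the problem described above.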
BellCurve

Most of our Stat modules use a BellCurve model for their PDFs. This has the advantage of being simple and easy to calculate using just a mean and standard deviation. The lower/upper limits are usually set to mean - 3*stddev and mean + 3*stddev.
There are some disadvantages:
* It's symmetrical, while most price distributions are lopsided, with a longer tail to the right.
* It's non-zero for prices less than or equal to zero. Such prices are impossible, so the PDF should really return 0 probability for them.
If stddev exceeds mean/3, the lower limit will be negative, meaning that some of the 'useful' part of the PDF covers negative prices. That isn't really valid, but the symmetry of the PDF makes it unavoidable, and it means GetMarketValue must spend time sampling (impossible) negative prices.
When stddev reaches 2*mean/3 the lower limit will be -mean. Assuming sampling stops at approximately +mean, GetMarketValue then spends as much time sampling negative values as it spends sampling positive ones - a waste of CPU cycles.
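As a quick worked example of that arithmetic (Python; the mean of 100 is an invented number):

```python
mean = 100.0
stddev = 2 * mean / 3            # the case described above

lower = mean - 3 * stddev        # approximately -mean (-100)
upper = mean + 3 * stddev        # approximately 3*mean (300)
print(lower, upper)

# If sampling effectively stops around +mean, the window covered is
# [-mean, +mean]: half of all samples fall on impossible negative prices.
```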
StdDev type Stat modules

These record the last X auctions, and calculate mean and stddev from those records.
From watching tooltips it looks like Stat-StdDev normally produces stddevs in the range 1/3-2/3 mean, occasionally exceeding mean (and rarely exceeding several multiples of mean). Stat-iLevel seems to produce much higher stddevs, often exceeding mean.
For these modules the lower limit will usually be negative, sometimes hugely negative, and the range can be extremely wide.
EMA type Stat modules (e.g. Stat-Simple, Stat-Purchased)

These record 3 EMAs (plus a daily average), and calculate the mean and stddev of those values. The stddev is then a measure of price drift over time.
If 'time' (seen days) is low, the stddev will be very low - it can be near zero, and unfortunately our code only checks whether it is exactly 0. A PDF with a low stddev has a greater influence on GetMarketValue's calculations (effectively 'high confidence'), which is the opposite of what we want for a low sample size.
Also, I don't think this produces a stddev on the same scale as the other Stat modules; it usually seems to be much smaller.
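A sketch of why a near-zero stddev acts like artificial 'high confidence' (hedged Python; the `bell` helper and all numbers are invented, and it assumes module PDFs are combined by summing):

```python
import math

def bell(mean, stddev):
    """Plain Gaussian PDF."""
    def pdf(x):
        return math.exp(-0.5 * ((x - mean) / stddev) ** 2) / (stddev * math.sqrt(2 * math.pi))
    return pdf

# An EMA module with few seen days: stddev near zero but not exactly 0,
# so a plain 'stddev == 0' check does not catch it.
low_sample = bell(50.0, 0.001)
mature = bell(100.0, 20.0)

# Both PDFs integrate to 1, so the narrow one has a vastly taller peak;
# combined with other PDFs, it overwhelms the mature module's estimate.
print(low_sample(50.0))   # peak height around 399
print(mature(100.0))      # peak height around 0.02
```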
A rough gathering of suggestions from assorted places:
* Vary the size of the sample interval during the pass, using smaller intervals when we are within the bounds of narrower PDFs. (I forget who made this suggestion - sorry.) We don't want to add too much extra processing to the main calculation loop, but in theory this could be done with some extra pre-processing.
* Clamp the stddev used when making a PDF, such that the stddev never gets too small relative to the mean, and the lower limit never gets too negative. E.g. clamp the stddev between 0.01*mean and 1*mean (or some other values). Simple to implement, but really only a stopgap. It's pretty rough treatment for the stddev, but then I suspect not much rougher than trying to force prices into a bellcurve in the first place...
* If GetMarketValue detects a mix of very narrow and very wide PDFs, discard the widest ones. A wider PDF indicates lower confidence, so if we are to discard any, it should be those.
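Hedged Python sketches of the last two suggestions (`clamped_stddev`, `drop_widest`, and every threshold here are hypothetical illustrations, not settled design):

```python
def clamped_stddev(mean, stddev, lo_frac=0.01, hi_frac=1.0):
    """Clamp stddev into [lo_frac*mean, hi_frac*mean] before building a PDF.
    0.01 and 1.0 are the example bounds from the suggestion above."""
    return min(max(stddev, lo_frac * mean), hi_frac * mean)

def drop_widest(ranges, ratio=10.0):
    """Discard PDFs whose (upper - lower) range exceeds `ratio` times the
    narrowest range on offer.  `ranges` is a list of (lower, upper) limit
    pairs standing in for full PDF records; 10x is an invented threshold."""
    narrowest = min(hi - lo for lo, hi in ranges)
    return [(lo, hi) for lo, hi in ranges if hi - lo <= ratio * narrowest]

# A near-zero stddev is raised to 1% of mean, removing the artificial
# 'high confidence'; a huge stddev is capped at mean, so the lower limit
# (mean - 3*stddev) can't go below -2*mean.
print(clamped_stddev(100.0, 0.001))   # -> 1.0
print(clamped_stddev(100.0, 250.0))   # -> 100.0

# The wide (-50, 250) range is more than 10x the narrowest, so it's dropped.
print(drop_widest([(8.5, 11.5), (-50.0, 250.0), (5.0, 15.0)]))
```

Either of these would be cheap pre-processing steps before the main sampling loop, rather than extra work inside it.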