Aggregation of Non-Binary Predictions: Difference between revisions

add context to vertical and horizontal combinations of distributions
 
[[File:Combination of the CDF of different forecasts.png|thumb|Combination of the CDF of different forecasts]]
 
Probability distributions can be combined based either on their probability density functions (PDF) or on their cumulative distribution functions (CDF). Usually, forecasts are combined using the CDF.
 
=== Vertical combinations of several CDF ===
Several CDF can be combined either horizontally or vertically. A vertical combination of the CDF is equal to a mixture distribution that combines the cumulative probabilities of the individual forecasts at each value of the outcome. If the density of the ensemble is a linear combination (a weighted average) of the densities of the individual forecasts, then this ensemble is called a linear pool<ref>https://ewifo.econ.kit.edu/downloads/kk_revision.pdf</ref>.
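
A minimal sketch of a linear pool, assuming two hypothetical normal member forecasts and equal weights (the function names and the example distributions are illustrative and not taken from the cited paper):

```python
from statistics import NormalDist


def linear_pool_cdf(x, members, weights=None):
    """Vertical combination: weighted average of the member CDF values at x."""
    if weights is None:
        weights = [1.0 / len(members)] * len(members)
    return sum(w * m.cdf(x) for w, m in zip(weights, members))


def linear_pool_pdf(x, members, weights=None):
    """Density of the same pool: weighted average of the member PDF values at x."""
    if weights is None:
        weights = [1.0 / len(members)] * len(members)
    return sum(w * m.pdf(x) for w, m in zip(weights, members))


# Hypothetical member forecasts: N(0, 1) and N(4, 1).
members = [NormalDist(0.0, 1.0), NormalDist(4.0, 1.0)]

# Evaluate the pooled CDF and PDF halfway between the two forecast means.
pool_cdf_at_2 = linear_pool_cdf(2.0, members)
pool_pdf_at_2 = linear_pool_pdf(2.0, members)
print(pool_cdf_at_2, pool_pdf_at_2)
```

Because the pool averages probabilities at a fixed outcome value, the result is a mixture that can be bimodal when the member forecasts disagree strongly, as in this example.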
 
One known issue with linear pools is that they are not necessarily calibrated even if all individual member forecasts are calibrated (see Theorem 3.1(c) of the linked paper)<ref>https://projecteuclid.org/journals/electronic-journal-of-statistics/volume-7/issue-none/Combining-predictive-distributions/10.1214/13-EJS823.full</ref>; instead, they tend to be over-dispersed, that is, systematically under-confident. Linear pools may, of course, be well calibrated in instances where the member forecasts are over-confident.
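
The calibration issue can be illustrated with a small simulation. The setup below is an assumption made for illustration only: the outcome is the sum of three independent standard normal terms, and each of two hypothetical forecasters observes one term and issues the correct conditional distribution. Each member is then calibrated, so its probability integral transform (PIT) values should be roughly uniform with variance close to 1/12, while the PIT values of the equally weighted linear pool are not.

```python
import random
from statistics import NormalDist, fmean


def variance(values):
    """Population variance of a list of floats."""
    mu = fmean(values)
    return fmean([(v - mu) ** 2 for v in values])


random.seed(0)
n = 20000
sigma = 2 ** 0.5  # each forecaster is missing two unit-variance terms

pit_member, pit_pool = [], []
for _ in range(n):
    x1, x2, eps = (random.gauss(0.0, 1.0) for _ in range(3))
    y = x1 + x2 + eps                # realised outcome
    f1 = NormalDist(x1, sigma)       # forecaster 1 knows x1 only
    f2 = NormalDist(x2, sigma)       # forecaster 2 knows x2 only
    pit_member.append(f1.cdf(y))
    pit_pool.append(0.5 * f1.cdf(y) + 0.5 * f2.cdf(y))

var_member = variance(pit_member)    # expected to be near 1/12
var_pool = variance(pit_pool)        # expected to be clearly below 1/12
print(var_member, var_pool, 1 / 12)
```

In this setup the pooled PIT values concentrate around 0.5, so their variance falls below 1/12. This is the pattern produced by a predictive distribution that is wider than the outcomes warrant.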
 
=== Horizontal combinations of several CDF ===
A horizontal combination of several CDF is equal to a combination of the quantiles of the CDF. Such quantile ensembles are often used, for example, in epidemiology<ref>https://pubmed.ncbi.nlm.nih.gov/35791416/</ref>. No comparable theoretical result states that a quantile ensemble must be miscalibrated if all member forecasts are well calibrated.
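
A minimal sketch of a quantile ensemble, assuming two hypothetical normal member forecasts, equal weights, and an arbitrary set of quantile levels (all names and values are illustrative):

```python
from statistics import NormalDist


def quantile_average(p, members, weights=None):
    """Horizontal combination: weighted average of the member quantiles at level p."""
    if weights is None:
        weights = [1.0 / len(members)] * len(members)
    return sum(w * m.inv_cdf(p) for w, m in zip(weights, members))


# Hypothetical member forecasts: N(0, 1) and N(4, 1).
members = [NormalDist(0.0, 1.0), NormalDist(4.0, 1.0)]

# Quantile levels at which the ensemble is evaluated (an arbitrary choice).
levels = [0.025, 0.25, 0.5, 0.75, 0.975]
ensemble_quantiles = [quantile_average(p, members) for p in levels]

for p, q in zip(levels, ensemble_quantiles):
    print(f"{p:5.3f} -> {q:.3f}")
```

For normal members, averaging quantiles yields a normal distribution whose mean and standard deviation are the averages of the member means and standard deviations. The ensemble therefore stays unimodal and centred between the two forecasts, unlike a mixture of the same members.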
 
=== Combinations of PDF ===
When combining forecasts based on their PDF, only a vertical combination is sensible. When combining using the mean, it does not matter whether we combine the PDF or the CDF, because integration is linear: the integral of a sum of densities is the same as the sum of their integrals. For combinations based on, e.g., the median of the cumulative or non-cumulative density at a given point, differences may occur (although these will typically not be very large).
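
The following sketch checks this numerically for three hypothetical normal forecasts. It integrates the pointwise mean and the pointwise median of the PDF with a simple trapezoidal rule and compares the results with the mean and median of the CDF at one evaluation point. The forecasts, the evaluation point and the integration bounds are assumptions chosen for illustration.

```python
from statistics import NormalDist, median

# Hypothetical member forecasts.
members = [NormalDist(-1.0, 1.0), NormalDist(0.0, 1.0), NormalDist(2.0, 1.0)]


def integrate(f, lower, upper, steps=4000):
    """Trapezoidal rule for the integral of f over [lower, upper]."""
    h = (upper - lower) / steps
    total = 0.5 * (f(lower) + f(upper))
    for i in range(1, steps):
        total += f(lower + i * h)
    return total * h


def mean_pdf(x):
    return sum(m.pdf(x) for m in members) / len(members)


def mean_cdf(x):
    return sum(m.cdf(x) for m in members) / len(members)


def median_pdf(x):
    return median(m.pdf(x) for m in members)


def median_cdf(x):
    return median(m.cdf(x) for m in members)


x = 0.5
lower = -12.0  # far enough into the left tail for these forecasts

# Mean: integrating the combined PDF reproduces the combined CDF.
mean_via_pdf = integrate(mean_pdf, lower, x)
mean_via_cdf = mean_cdf(x)

# Median: the two routes are not guaranteed to agree.
median_via_pdf = integrate(median_pdf, lower, x)
median_via_cdf = median_cdf(x)

print(mean_via_pdf, mean_via_cdf)
print(median_via_pdf, median_via_cdf)
```

The two mean-based values agree up to numerical integration error, while the two median-based values can be compared directly to see how far they differ for a given set of forecasts.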