How to Start Forecasting

This article is a work in progress. The author is still working on it, and it is not yet ready for review.

Picking a Community

There are several active Prediction Platforms. They all have advantages and disadvantages. Sign up for one!

Picking a Question

There are a variety of types of questions. For this discussion, I'm going to leverage an imaginary question about the length of the Cold War. Of course, we already understand how this should resolve. The point is not to predict something novel, but to understand the variety of ways of looking at a forecasting topic.

Binary Questions

The simplest type of question is probably the binary question. Binary questions are typically structured like this:

Will the Cold War end before 2000?

There are two key components to note here. The first is the resolution criterion: the thing we're trying to forecast, in this case the end of the Cold War. The other is the resolution date: the date by which we're predicting the resolution criterion will happen (or not). We need a resolution date because over very long timelines, most questions stop making sense. For example, if we predict whether the Cold War will ever end, the answer is "almost certainly" and you should predict something very close to 100%. But this isn't particularly informative!

The forecaster's job is not to say whether or not this will happen, but to assess the probability that it will happen. So, we could respond by entering "85%". That is, we claim that at the time at which we make the forecast, we are 85% certain that it will resolve positively.

Categorical Questions

Categorical questions ask which of several mutually exclusive outcomes will occur. For example:

Who will win the Cold War? USSR, USA, Stalemate, No one

Here we are dividing the probability across four possible outcomes. (Note that binary questions are simply the special case of categorical questions in which the two outcomes are "the resolution criterion happened before the resolution date" and "the resolution criterion did not happen before the resolution date".)
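As a minimal sketch of the idea above, a categorical forecast can be represented as a mapping from outcomes to probabilities that sum to 1, with a binary forecast as the two-outcome case. The probabilities and the helper names `is_valid_forecast` and `binary_as_categorical` are made up for illustration; they are not taken from any forecasting platform.

```python
import math


def is_valid_forecast(probabilities: dict[str, float]) -> bool:
    """Return True if every probability is in [0, 1] and they sum to 1."""
    in_range = all(0.0 <= p <= 1.0 for p in probabilities.values())
    return in_range and math.isclose(sum(probabilities.values()), 1.0)


def binary_as_categorical(p_yes: float) -> dict[str, float]:
    """Express a binary forecast as a two-outcome categorical forecast."""
    return {"Yes": p_yes, "No": 1.0 - p_yes}


# Hypothetical numbers for "Who will win the Cold War?"
categorical = {"USSR": 0.10, "USA": 0.55, "Stalemate": 0.25, "No one": 0.10}

# The 85% binary forecast from the previous section, as a categorical forecast.
binary = binary_as_categorical(0.85)

print(is_valid_forecast(categorical), is_valid_forecast(binary))
```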

Continuous Questions

Another question type is continuous questions. These are structured a little bit differently:

How long will the Cold War last?

Here it is taken as given that the Cold War will end; the question is merely how long it will take.

Note that this asks about the same underlying event as the example binary question. Continuous questions yield significantly more information than binary questions, because the forecast is a distribution over outcomes rather than a probability at one specific date. Importantly, we can then use that distribution to provide probability estimates at any point or range within the outcome space.
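To illustrate how a distribution yields probabilities for any point or range, here is a rough sketch using Python's standard-library `statistics.NormalDist`. The normal shape, the mean of 45 years, the spread of 10 years, and the 1947 start year are all assumptions chosen for the example; a real forecast could have any shape.

```python
from statistics import NormalDist

# Assumed start year of the Cold War (an illustrative convention).
START_YEAR = 1947

# Hypothetical forecast for the length of the Cold War, in years.
# A normal distribution is a simplification: it assigns a tiny
# probability to negative lengths, which we ignore here.
length_forecast = NormalDist(mu=45.0, sigma=10.0)


def prob_ends_before(year: int) -> float:
    """Probability that the Cold War ends before the given year."""
    return length_forecast.cdf(year - START_YEAR)


def prob_ends_between(first_year: int, last_year: int) -> float:
    """Probability that the Cold War ends between two years."""
    return prob_ends_before(last_year) - prob_ends_before(first_year)


# The binary question "Will the Cold War end before 2000?" falls out
# of the continuous forecast as a single number.
print(f"P(end before 2000)   = {prob_ends_before(2000):.2f}")
print(f"P(end in 1985..1995) = {prob_ends_between(1985, 1995):.2f}")
```

The same distribution answers both questions, which is the sense in which a continuous forecast carries more information than a single binary one.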

Date Questions

Date questions are those in which we predict the date on which an event will occur. For example:

When will the Cold War end?

Again, the Date Question is merely a special case of a continuous question in which the forecast value is the measure of time between when the question is asked and when the question is resolved.

Predict the Question

Start from the Outside View

The Outside View is simply to observe that most questions have a measurable base rate. For questions of length, we could determine the base rate by asking: How many cold wars have there been? How long did each one last? The average length of a cold war is the base rate. Once you know this number, you can literally predict just that.

If you only ever do one thing to improve as a forecaster, do this.
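A base rate calculation can be as small as the sketch below. The list of past cold-war lengths is invented purely for illustration (it is not historical data), and the helper `share_ended_within` is a hypothetical name.

```python
from statistics import mean, median

# Hypothetical durations, in years, of past cold wars. Not real data.
past_lengths = [12, 30, 8, 45, 25]

# Base rate for a continuous question: the average length.
base_rate = mean(past_lengths)

# The median is a common alternative when a few cases are very long.
typical_length = median(past_lengths)


def share_ended_within(years: float) -> float:
    """Base rate for a binary question: the fraction of past
    cold wars that ended within the given number of years."""
    ended = sum(1 for length in past_lengths if length <= years)
    return ended / len(past_lengths)


print(f"Average length: {base_rate} years")
print(f"Median length:  {typical_length} years")
print(f"Share ended within 30 years: {share_ended_within(30):.0%}")
```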

Look for growth trends

Often, the topics of forecasting questions will exhibit trends over time. Perhaps ancient cold wars only lasted a few years, while more modern cold wars started lasting for decades. In this case, the average will probably be a less useful basis for prediction than the length of the last cold war. But that length is growing, so we should reasonably expect that the current cold war will be at least a little longer than the last cold war. To robustly determine how much longer requires regression, but an off-the-cuff estimate from a scatterplot or line chart is usually more than sufficient.
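For the case where an eyeballed estimate is not enough, the regression step might look like the following sketch: an ordinary least-squares line fitted to start year versus length. The data points and the assumed 1947 start year are invented for illustration, and `fit_line` is a hypothetical helper written with the standard library only.

```python
def fit_line(xs: list[float], ys: list[float]) -> tuple[float, float]:
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical history: start year and length (years) of past cold wars.
start_years = [1700, 1800, 1850, 1900]
lengths = [5, 10, 20, 30]

slope, intercept = fit_line(start_years, lengths)

# Extrapolate the trend to a cold war assumed to start in 1947.
predicted_length = slope * 1947 + intercept

print(f"Trend: {slope:.3f} extra years of length per calendar year")
print(f"Predicted length for a 1947 start: {predicted_length:.1f} years")
```

With a growing trend, the extrapolated length comes out a little longer than the most recent observation, which matches the intuition in the paragraph above.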

The Community is Smarter than You

Another powerful tool is the community estimate. Aggregates of many predictions tend to converge on well-calibrated predictions[citation needed], an effect colloquially known as the Wisdom of the Crowd. You probably weren't the first person to forecast this question. The aggregate of the earlier forecasts is usually available, so you can see what other people are predicting.
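As a rough sketch of what an aggregate can look like, the example below takes the median of several individual forecasts. Using the median is an assumption made for illustration; each platform defines its own aggregation method, and the forecasts listed are invented.

```python
from statistics import median

# Hypothetical individual forecasts on "Will the Cold War end before 2000?"
community_forecasts = [0.70, 0.80, 0.85, 0.90, 0.60]

# Assumed aggregation rule: the median of all forecasts.
community_estimate = median(community_forecasts)

# One extreme forecaster barely moves a median-based aggregate.
with_outlier = median(community_forecasts + [0.01])

print(f"Community estimate:       {community_estimate:.2f}")
print(f"With one extreme forecast: {with_outlier:.2f}")
```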

The Community is USUALLY Smarter than You

Sometimes the community can be foolish. There are a bunch of reasons for this. For example, it's common to have forecasting questions whose probabilities should decline over time (simply because there's less time left for the resolution to occur). The community often forgets about this, so the community estimate ends up vastly higher than appropriate. Alternatively, too many people relying on the community prediction as a basis for their own predictions creates conditions for groupthink, in which the community is merely predicting on itself and not paying attention to the actual topic of the forecast. So when you look at the community prediction, look skeptically! Ask yourself: why shouldn't you believe this bandwagon?
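The declining-probability effect can be made concrete with a simple model. The sketch below assumes the event has a constant chance of happening per unit of time (a constant hazard rate), which is a modeling assumption and not something every question satisfies. The starting numbers (85% with ten years remaining) are illustrative.

```python
import math


def prob_by_deadline(hazard_per_year: float, years_remaining: float) -> float:
    """Probability the event happens before the deadline, assuming a
    constant hazard rate and that it has not happened yet."""
    return 1.0 - math.exp(-hazard_per_year * years_remaining)


# Choose the hazard rate so that a forecast made with 10 years
# remaining equals 85% (illustrative numbers).
hazard = -math.log(1.0 - 0.85) / 10.0

# If nothing happens, the forecast should fall as the deadline nears.
for years_left in (10, 5, 1):
    p = prob_by_deadline(hazard, years_left)
    print(f"{years_left:>2} years left: {p:.0%}")
```

A community estimate that still sits near its original value with one year left would, under this model, be far too high.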

Wait for Reality to Catch up

In the meantime, pick another question and repeat!

References