Statistics to Predict Future

overview

On this page, learn how collected data can be used to *predict the future*. This is an important lesson, and it is not found in any other books or websites.

predict future

Consider the number of glasses of water students drink on a normal day and on a sunny day. Today is a hot, sunny day. Can one predict how many students would require $4$ glasses of water today?

One student says: As per the data in the table, we can expect that more or less $9$ students will drink $4$ glasses of water.

Another student says: The data represents the number of students who had $4$ glasses of water yesterday. There is no way we can predict how many will do so today.

The first student's deduction is the best we can use. Though the data is from yesterday, we can expect today's value to be more or less the value in the data. *The objective of data collection is to predict the future.*

predict data

Consider the number of glasses of water students drink on a normal day. The data is collected for $40$ students. The next day, another *class of $40$ students* is considered.

Even though a different class is considered, the data would be more or less the same. The counts may not be exactly equal, but one can expect them to be close.

scale up the prediction

Consider the number of glasses of water students drink on a normal day. The data is collected for $40$ students. The next day, another *class of $80$ students* is considered. For this set of $80$ students, the counts would be more or less double those for $40$ students. They may not be exactly double, but one can expect them to be close to double.

scale down the prediction

Consider the number of glasses of water students drink on a normal day. The data is collected for $40$ students. The next day, another *class of $20$ students* is considered. For this set of $20$ students, the counts would be more or less half those for $40$ students. They may not be exactly half, but one can expect them to be close to half.
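The scaling up and scaling down described above can be sketched in a few lines of Python. The counts are the normal-day figures for $40$ students listed further down this page; the function name `scale_counts` is an illustrative choice, and the scaled values are expectations ("more or less"), not guarantees.

```python
# Recorded data for a class of 40 students:
# glasses of water -> number of students who drank that many.
recorded = {1: 11, 2: 18, 3: 6, 4: 2, 5: 3}


def scale_counts(counts, new_size):
    """Scale each recorded count in proportion to the new class size."""
    old_size = sum(counts.values())
    return {glasses: n * new_size / old_size for glasses, n in counts.items()}


# Expected counts for a class of 80 (about double) and 20 (about half).
print(scale_counts(recorded, 80))
print(scale_counts(recorded, 20))
```

A scaled count can be fractional (for example, half of $11$ students), which is a reminder that the result is an expectation and the actual count will only be close to it.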

scale down to one

Consider the number of glasses of water students drink on a normal day. The data is collected for $40$ students at the end of the day. The students line up and, one by one, report how many glasses of water they drank.

• First student says, $2$ glasses

• Second student says, $3$ glasses

• Third student says, $1$ glass

Taken together, the answers can be understood in a different form: how many students out of the $40$ give each answer.

$11$ students out of $40$ would say $1$ glass

$18$ students out of $40$ would say $2$ glasses

$6$ students out of $40$ would say $3$ glasses

$2$ students out of $40$ would say $4$ glasses

$3$ students out of $40$ would say $5$ glasses

*Now only $1$ student is considered. The big question is:
Can one predict how many glasses one student would drink?
The prediction for one student can be given, but only in the context of a large amount of data.*

The prediction that the recorded data supports is given below.

If the data collection for one student is repeated $40$ times,

• the data-value $1$ glass would appear $11$ times out of the $40$ times

• the data-value $2$ glasses would appear $18$ times out of the $40$ times

• the data-value $3$ glasses would appear $6$ times out of the $40$ times

• the data-value $4$ glasses would appear $2$ times out of the $40$ times

• the data-value $5$ glasses would appear $3$ times out of the $40$ times.

This is referred to as the "**probability**": the probability of the data value $1$ glass is $11/40$.
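As a minimal sketch, the same calculation can be repeated for every data value by dividing each count by the total number of students. The variable names here are illustrative, and Python's `Fraction` prints each result in lowest terms (so $18/40$ appears as $9/20$).

```python
from fractions import Fraction

# Recorded data for 40 students: glasses of water -> number of students.
recorded = {1: 11, 2: 18, 3: 6, 4: 2, 5: 3}
total = sum(recorded.values())  # 40 students in all

# Probability of each data value = count for that value / total count.
probability = {glasses: Fraction(n, total) for glasses, n in recorded.items()}

for glasses, p in probability.items():
    print(f"P({glasses} glasses) = {p} = {float(p):.3f}")
```

Since every student gives exactly one answer, the probabilities add up to $1$.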

statistics of coin toss

Let us consider another form of data. A person tosses a coin and records the results. The data is shown in tally and tabular form for $40$ tosses.

If the coin is tossed $10$ times, it will show more or less $5$ heads and $5$ tails.

transition

Consider the data from tossing a coin.

If the coin is tossed once, the best one can say is that the result will come up heads *$20$ times in $40$ repetitions*.

This is referred to as the "**probability**": the probability of the data value "heads" is $20/40=1/2$.

This is a transition from statistics to probability.

Statistics presents the collective data as it is. For example, when a coin is tossed $40$ times, heads appears $20$ times and tails appears $20$ times.

Probability specifies the same for one event. For example, when a coin is tossed once, the probability of heads is $20/40=1/2$.
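The link between the two views can be illustrated with a simulated coin. This is only a sketch: it assumes a fair coin modelled with Python's pseudo-random generator, and the function name `heads_fraction` and the fixed seed are illustrative choices. For a small number of tosses the fraction of heads is only "more or less" $1/2$; over many tosses it settles close to $1/2$.

```python
import random


def heads_fraction(tosses, seed=0):
    """Toss a simulated fair coin `tosses` times; return the fraction of heads."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    heads = sum(rng.random() < 0.5 for _ in range(tosses))
    return heads / tosses


# Statistics: the collective result of many tosses.
# Probability: what that result says about a single toss.
for n in (10, 40, 10_000):
    print(n, "tosses -> fraction of heads:", heads_fraction(n))
```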

summary

**Predicting Based on Representative Data**: Data can be used to predict the outcome of events.

Data is collected over a large number of iterations/repetitions.

It is known that the result of one iteration can be one of many possibilities.

The result of one iteration is predicted in the context of the large number of repetitions.
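As a rough illustration of this summary, the sketch below draws a new class of $40$ students, one student at a time, using the recorded counts for glasses of water as weights. It assumes each student answers independently according to the recorded proportions, which is a simplification; `simulate_class` is a hypothetical helper name.

```python
import random
from collections import Counter

# Recorded data for 40 students: glasses of water -> number of students.
recorded = {1: 11, 2: 18, 3: 6, 4: 2, 5: 3}


def simulate_class(counts, size, seed=0):
    """Draw `size` answers, each chosen with weight equal to its recorded count."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    values = list(counts)
    weights = list(counts.values())
    return Counter(rng.choices(values, weights=weights, k=size))


new_class = simulate_class(recorded, 40)
for glasses in sorted(recorded):
    print(glasses, "glasses: recorded", recorded[glasses],
          "simulated", new_class[glasses])
```

Each single draw can be any of the five values, yet the simulated counts as a whole tend to resemble the recorded ones, more or less.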

Outline

The outline of material to learn "basics of statistics and probability" is as follows.

• Introduction

→ __Introduction to Statistics__

→ __Organizing Data : Tally Table__

→ __Pictograph__

→ __Bargraph__

• Data Analysis

→ __Cumulative Frequency__

→ __Representative Values of Data__

→ __Central Tendencies__

→ __Bargraphs & Piecharts__

• Probability Fundamentals

→ __Predicting Future__

→ __Random Experiment__

→ __Probability__

→ __Standard Experiments__

• Statistics Grouped Data

→ __Grouped Data__

→ __Probability in Grouped Data__

→ __Class Parameters of Grouped Data__

→ __Methods to find Mean of Grouped data__

→ __Mode of Grouped data__

→ __Median of Grouped Data__