Saturday, February 5, 2011

Predicting your Lunch with an Improper Linear Model

A friend asked me how prediction with a linear model works, so I want to explain it here with an example.


Klaus cares about his health and keeps an eye on his body weight. When he feels heavy, he eats salad for lunch. Roland often ate lunch with him when they worked at the same company. Eventually Roland moved to another city. A year later he visits Klaus and has the idea of predicting what Klaus will eat by glancing at his shape.
Fortunately, he recorded Klaus's past eating behavior:
  1. his estimated weight in a given week (estimated by guessing, therefore the model is improper)
  2. how often he ordered salad in that week
With this data he constructs a linear model in a spreadsheet. As soon as he sees Klaus, he guesses his current weight and uses the model to calculate the probability that Klaus will eat salad.
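For readers who prefer code to a spreadsheet, here is a minimal sketch of how such a line could be fitted with ordinary least squares. The function name fit_line is my own, and the data points are made up: Roland's real records are not shown in this post, so the sketch simply generates four points from the coefficients quoted in this post and checks that the fit recovers them.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = a*x + b for a single predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross-deviations of x and y
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx              # slope
    b = mean_y - a * mean_x    # intercept
    return a, b


# Hypothetical records, NOT Roland's real data: the points are generated
# from the coefficients quoted in this post, so they lie exactly on the line.
weights = [60.0, 70.0, 80.0, 90.0]
salad_freq = [0.0048842 * w + 0.1751 for w in weights]

a, b = fit_line(weights, salad_freq)
print(f"y = {a:.7f} x + {b:.4f}")
```

With real, noisy observations the points would scatter around the line, and the fit would give the line with the smallest sum of squared errors.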
The model is y = 0.0048842 x + 0.1751
and for y = 50% (deciding for salad is just as good as deciding for meat)
we have x = (0.5 - 0.1751) / 0.0048842 = ~67 kg. If Klaus weighs more than 67 kg, Roland will predict "salad".
Now let's say Roland believes Klaus currently weighs 86 kg; a quick calculation then tells him that the probability of salad is ~60%. But if he believes Klaus weighs only 56 kg, the probability of salad drops to ~45%.
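The quick calculation can be sketched in a few lines of Python. The function names are my own invention, and the "meat" label for the non-salad case is an assumption taken from the 50% remark in this post; only the two coefficients come from the model itself.

```python
# Coefficients of the model y = A*x + B quoted in this post
A = 0.0048842
B = 0.1751


def salad_probability(weight_kg):
    """Predicted probability of salad for a guessed weight in kg."""
    return A * weight_kg + B


def break_even_weight(p=0.5):
    """Weight at which the predicted salad probability equals p."""
    return (p - B) / A


def predict_lunch(weight_kg):
    """Predict "salad" when the model gives more than 50%, else "meat"."""
    return "salad" if salad_probability(weight_kg) > 0.5 else "meat"


for guess in (56, 86):
    print(guess, round(salad_probability(guess), 3), predict_lunch(guess))
print(round(break_even_weight(), 1))
```

For the two guesses this gives roughly 0.45 and 0.60, and a break-even weight just under 67 kg.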
The paper claims that if you don't have the formula y = 0.0048842 x + 0.1751 and predict only by intuition, you will do worse than with the model. Guessing is worse than the regression formula.


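N.B.: The most important parameter is the slope a = 0.0048842; it tells how strongly y responds to a change in x. With a = 1, every increase of one unit in x would increase y by one unit. With a = 0.0048842, each additional kilogram raises the predicted salad probability by only about half a percentage point, which is a much weaker relationship between x and y. (Strictly speaking, the size of the slope depends on the units of x and y, so a small slope is not the same thing as a weak "correlation".)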
