Pythagorean Formula

Baseball standings dataset:

The data represent MLB standings for 2018. There are 12 teams, and the variables are explained below:

 

GLOSSARY of the data:

  • W: Wins
  • L: Losses
  • PCT: Winning Percentage
  • GB: Games Back
  • HOME: Home Record
  • AWAY: Away Record
  • RS: Runs Scored
  • RA: Runs Allowed
  • DIFF: Run Differential
  • STRK: Current Streak
  • L10: Record in Last 10 Games
  • POFF: % Chance of Making the Playoffs
  • *: Clinched Best League Record
  • Y: Clinched Wild Card
  • X: Clinched Division

Pythagorean Formula and residual plot:

The first graph below is the scatter plot of log (W/L) against log (RS/RA), which demonstrates the Pythagorean Formula, and the second plot shows the residuals against log (RS/RA).
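For readers who want the reasoning behind plotting the two logs against each other, here is a short sketch of the standard derivation. It assumes the usual form of the Pythagorean Formula with an unknown exponent k; the symbol k and the notation below are introduced here for illustration.

```latex
% Pythagorean Formula with exponent k (assumed standard form)
\frac{W}{W+L} \approx \frac{RS^{k}}{RS^{k} + RA^{k}}
\quad\Longrightarrow\quad
\frac{W}{L} \approx \left(\frac{RS}{RA}\right)^{k}
\quad\Longrightarrow\quad
\log\frac{W}{L} \approx k \,\log\frac{RS}{RA}
```

Under this form, a straight line through the origin in the log-log scatter plot has slope k, which is why the fitted slope can be read as an estimate of the exponent.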

There appears to be a linear relationship between log (W/L) and log (RS/RA). Fitting the linear model log (W/L) ~ log (RS/RA) yields a slope of 1.4 and an intercept approximately equal to 0.

With an R-squared of about 82% and a very small p-value (5.359e-05), we conclude that the model fits the data well, so the best estimate of k is the slope, 1.4.
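The fit described above could be reproduced along the following lines. This is only a minimal sketch in plain Python, not the code used for the analysis in this post: the function name fit_log_ratio_line is hypothetical, and the five rows of numbers are made up for illustration (they are not the 2018 standings).

```python
import math


def fit_log_ratio_line(wins, losses, runs_scored, runs_allowed):
    """Ordinary least squares fit of log(W/L) = intercept + slope * log(RS/RA).

    Returns (slope, intercept, r_squared). The slope is the estimate of k.
    """
    x = [math.log(rs / ra) for rs, ra in zip(runs_scored, runs_allowed)]
    y = [math.log(w / l) for w, l in zip(wins, losses)]
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n

    # Closed-form simple linear regression.
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean

    # R-squared = 1 - (residual sum of squares / total sum of squares).
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared


# Made-up numbers for illustration only (NOT the real 2018 data).
wins = [95, 88, 80, 72, 65]
losses = [67, 74, 82, 90, 97]
runs_scored = [800, 760, 720, 690, 650]
runs_allowed = [680, 700, 730, 760, 790]

k, intercept, r_squared = fit_log_ratio_line(wins, losses, runs_scored, runs_allowed)
print(f"k = {k:.2f}, intercept = {intercept:.3f}, R^2 = {r_squared:.2f}")
```

Replacing the made-up lists with the W, L, RS and RA columns from the standings table should give the slope, intercept and R-squared for the real data; the p-value is not computed in this sketch.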

The Colorado Rockies and the Milwaukee Brewers are considered lucky teams because they have the highest residuals, meaning their wins/losses ratio is higher than the predicted value.

The Washington Nationals are considered an unlucky team because they have the lowest residual, meaning their wins/losses ratio is lower than the predicted value.
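The idea of ranking teams by residual can be sketched as follows. This is an illustration under stated assumptions: the helper log_ratio_residuals is hypothetical, the intercept is taken as 0 (the fit reports it as almost 0), and the three teams are invented placeholders rather than rows from the standings.

```python
import math


def log_ratio_residuals(teams, k, intercept=0.0):
    """Residual = observed log(W/L) minus predicted intercept + k * log(RS/RA).

    `teams` is a list of (name, wins, losses, runs_scored, runs_allowed).
    A positive residual means the team won more than its runs predict.
    """
    residuals = {}
    for name, w, l, rs, ra in teams:
        observed = math.log(w / l)
        predicted = intercept + k * math.log(rs / ra)
        residuals[name] = observed - predicted
    return residuals


# Invented placeholder teams (NOT real standings).
teams = [
    ("Team A", 90, 72, 750, 740),
    ("Team B", 82, 80, 760, 700),
    ("Team C", 75, 87, 700, 720),
]

residuals = log_ratio_residuals(teams, k=1.4)
luckiest = max(residuals, key=residuals.get)      # highest residual
unluckiest = min(residuals, key=residuals.get)    # lowest residual
print("Luckiest:", luckiest, "Unluckiest:", unluckiest)
```

With the real table in place of the placeholders, the team with the largest positive residual is the "lucky" one and the team with the most negative residual is the "unlucky" one, in the sense used above.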

 
