Analyzing Baseball Lineups

This is my statistics project for the end of the semester. I think it's pretty friggin sweet that I get to just look up baseball stats and call it school work. MBA = much easier than a Bachelor's in Electrical Engineering. Well, here's the ole project proposal. I'll be posting the results and analysis whenever I finish the project / do the work...

When constructing a baseball lineup in any league, whether it is the major leagues, a high school team or a little league team, there have always been general guidelines to go by. Through this regression test I would like to find what correlation, if any, these guidelines have with the overall production of a major league offense, or whether all the premises I have been taught my whole life about building the ideal lineup are actually erroneous. With new theories about ideal hitters and the statistics developed through the ‘Moneyball’ philosophy, it is important to see whether these age-old theories are still accurate.

Dependent Variable: Runs. The easiest way to analyze how a baseball team performed over the course of a season is to use how many runs the team scored.

Independent Variable #1: DH Rule. The first thing this regression test needs to account for is the difference between NL and AL teams: NL teams use the pitcher as a hitter, while AL teams use a DH. Each team's data set will be assigned a value of 1 if the team is in the AL and 0 if it is in the NL.

Independent Variable #2: Leadoff Hitter Pitches Per Plate Appearance. A common claim about leadoff hitters is that it is essential for them to take a lot of pitches, especially in their first at-bat. Taking several pitches is supposed to give the batters later in the lineup a better look at what the pitcher is throwing.

Independent Variable #3: Leadoff Hitter Stolen Bases Per Game. The ideal leadoff hitter is supposed to be fast. The best available measure of a player’s speed is the stolen base, which should also help his team score some cheap runs.

Independent Variable #4: Leadoff Hitter On Base Percentage. Another important job of the leadoff hitter is to get on base. If the leadoff hitter does not get on base, then the big hitters that ‘should’ be hitting 3 and 4 in the lineup will not have RBI opportunities.

Independent Variable #5: 2nd Hitter (Strikeouts - Sacrifices) Per Game. The second batter in the lineup is said to need a few characteristics. First, he should be able to put the ball in play. This can be measured by strikeouts: the fewer strikeouts, obviously, the better. Second, a #2 hitter should be adept at moving a runner over on the base paths via a bunt. To combine these characteristics into one number I came up with Strikeouts - Sacrifices, where a lower value is better.
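As a rough sketch of how that combined number could be computed for one hitter, here is a small Python helper. The function name and the season line in the example are made up for illustration and are not real stats.

```python
def strikeouts_minus_sacrifices_per_game(strikeouts: int, sacrifices: int, games: int) -> float:
    """Return (strikeouts - sacrifices) / games for one hitter; lower is better."""
    if games <= 0:
        raise ValueError("games must be positive")
    return (strikeouts - sacrifices) / games


# Hypothetical season line, invented purely to show the arithmetic.
example = strikeouts_minus_sacrifices_per_game(strikeouts=80, sacrifices=10, games=140)
print(example)  # (80 - 10) / 140 = 0.5
```

A hitter who bunts runners over more often than he strikes out would come out negative, which is fine for a regression input.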

Independent Variable #6: 3rd Hitter On Base Percentage + Slugging Percentage. The 3rd hitter is ‘supposed’ to be the best hitter in your lineup. The OPS stat (on base percentage plus slugging percentage) is commonly used as the best depiction of a hitter’s ability.

Independent Variable #7: Cleanup Hitter Slugging Percentage. The cleanup hitter should be the masher in your lineup: a HR hitter, a guy whose hits are mostly extra base hits.

Independent Variable #8: 5th Hitter Batting Average with Runners in Scoring Position. The #5 hitter is there to clean up whatever mess is left on base behind the masher. Batting Average with Runners in Scoring Position shows how effective he is at knocking home the runners left on base.
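To give a feel for what the regression does with variables like these, here is a minimal ordinary least squares sketch in plain Python. It is not the tool I will actually use for the project, and it assumes only two predictors (the DH dummy and leadoff OBP) to keep it short, though `fit_ols` takes any number of columns. The function names are my own, and the five sample rows are placeholder numbers built from a known line (runs = 600 + 40 * DH + 500 * OBP) just to check the arithmetic. They are not real team data.

```python
from typing import List, Sequence


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are not modified.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system; predictors may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(rows: Sequence[Sequence[float]], y: Sequence[float]) -> List[float]:
    """Return [intercept, b1, ..., bk] via the normal equations (X'X) b = X'y."""
    design = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Placeholder rows: (DH dummy, leadoff OBP). Not real teams.
rows = [(1, 0.350), (0, 0.330), (1, 0.320), (0, 0.360), (1, 0.340)]
# Runs generated from the known line above, so the fit should recover it.
runs = [600 + 40 * dh + 500 * obp for dh, obp in rows]

intercept, dh_coef, obp_coef = fit_ols(rows, runs)
print(intercept, dh_coef, obp_coef)
```

With the real data, each of the 8 variables would be one column per team, and the fitted coefficients would say how runs move with each guideline while holding the others fixed.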

Data Collection: All data will be gathered from online sources, most likely Sportsline.com and ESPN.com. Because it would be difficult to gather exact data for an entire team at each position in the lineup, I will use the data of the player who hit most often in each lineup spot for each team. This way I can get accurate data for the majority of the games played in the major leagues last season. All of my independent variables are rate statistics (percentages or per game figures), which will to an extent counteract the differences in games played. I will note games played and the player for each team at each lineup position on a separate spreadsheet.
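The "player that hit most in each lineup spot" rule can be sketched like this. The input layout (team, lineup spot, player, games started in that spot) is an assumption about how the spreadsheet might be organized, and the team and player labels are placeholders.

```python
from typing import Dict, Iterable, Tuple

Start = Tuple[str, int, str, int]  # (team, lineup spot, player, games in that spot)


def primary_hitters(starts: Iterable[Start]) -> Dict[Tuple[str, int], Tuple[str, int]]:
    """Map (team, spot) to the (player, games) pair with the most games in that spot."""
    best: Dict[Tuple[str, int], Tuple[str, int]] = {}
    for team, spot, player, games in starts:
        key = (team, spot)
        # Keep the first player seen on ties; only a strictly larger count replaces him.
        if key not in best or games > best[key][1]:
            best[key] = (player, games)
    return best


# Placeholder records, not real lineups.
starts = [
    ("Team A", 1, "Player X", 95),
    ("Team A", 1, "Player Y", 40),
    ("Team A", 2, "Player Y", 60),
    ("Team B", 1, "Player Z", 120),
]
chosen = primary_hitters(starts)
print(chosen[("Team A", 1)])  # ('Player X', 95)
```

Keeping the games count next to each chosen player matches the plan to note games played on the separate spreadsheet.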

In addition to using percentages for variables, I wanted to make sure I didn’t have variables that held a direct connection to the overall runs scored for a team. Stats like runs scored by the leadoff hitter or Runs Batted In for the 3rd batter or cleanup hitter would not bring much insight to the end results, as they are a portion of the team’s total run production.

Predictions: My guess is that none of the variables individually will explain the overall run production of a team but that a combination of the variables

Comments

Anonymous said…
Pretty slick brotha, make sure to post this when you're done so that I can read it at work and if people walk by it will appear as if I am doing something work related. Just make sure to have lots of charts.
