$Unique_ID{BAS01277}
$Pretitle{}
$Title{Appendix: Baseball, Computers, and New Statistics}
$Subtitle{}
$Author{
Gillette, Gary}
$Subject{Computers Statistics computer statistic statisticians number numbers
sabermetrics sabermetricians statistical computerized Epstein James Palmer
Elias Stats stat OPS gambling CD-ROM DataDiscman reference CMC Sony Franklin
Electronics statistician}
$Log{}
Total Baseball: Appendixes
Baseball, Computers, and New Statistics
Gary Gillette
With the coming of age of the personal computer, the sea of baseball
statistics in recent years has become a veritable flood. With the baseball
world seemingly inundated with numbers, there has been a backlash against
statistics and the statisticians. Many fans--especially those of the "old
school"--believe that baseball already had enough stats, and that the newer
numbers served mostly to obscure the game they love, rather than to illuminate
it.
Which is true? Certainly, baseball has been blessed over the years with
a wealth of numbers. Far more than any other major sport, baseball has been
described and preserved in the script of Arabic numerals--it is the primary
way in which the stars of yesterday are remembered. All sports have their
legends, with their apocryphal stories about their superstars. Where baseball
transcends other sports, however, is in the numbers. Baseball statistics give
clarity and perspective to the hallowed yet hazy pictures of the past; numbers
make almost tangible the exploits of players long gone. It is, in fact, the
accuracy and scope of these numbers which make meaningful comparisons of
players across eras at all possible.
Who is responsible for this flood of numbers? First, the fans, who have
always looked for evidence to buttress their opinions. Second, the media, who
have latched onto this trend and exploited it commercially. Third, Major
League Baseball, which has aided and abetted the trend for publicity. Fourth,
fantasy baseball players, many of whom were only casual fans before the advent
of Rotisserie leagues. Fifth, authors and analysts, who use these new numbers
to examine and explain the seemingly simple game of baseball.
Of all the new statistics, the type which has become most popular is the
situational. In contrast to traditional baseball totals or averages like home
runs and batting average, situational stats are an attempt to break baseball
statistics into how and when they happened. The most important situational
breakdowns are how batters perform versus lefthanded and righthanded pitchers
(and pitchers vs. same- and opposite-handed batters) and how players perform in
their home park as opposed to when they are on the road. In reality, these
situational stats have been part of the game for decades, as generations of
fans have argued about the effect of Yankee Stadium on sluggers like Ruth and
Maris, while generations of managers have platooned their hitters. Aside from
these two key categories, several other types of breakdowns have become
prominent: "clutch" hitting measures, grass/turf splits, and individual
pitcher-batter matchups.
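The arithmetic behind a situational breakdown is simple, and a short sketch
may make it concrete. The Python below tallies batting average by pitcher
handedness and by venue from a small at-bat log. The log, its field layout,
and the function name split_averages are all invented for illustration and
describe no real player.

```python
from collections import defaultdict

# Hypothetical at-bat log: (pitcher_hand, venue, is_hit).
# Every record here is made up for the example.
AT_BATS = [
    ("L", "home", True),
    ("L", "home", False),
    ("L", "road", False),
    ("R", "home", True),
    ("R", "road", False),
    ("R", "road", True),
    ("R", "road", False),
    ("R", "home", False),
]


def split_averages(at_bats, key_index):
    """Return {split value: batting average} for one column of the log."""
    hits = defaultdict(int)
    total = defaultdict(int)
    for record in at_bats:
        key = record[key_index]
        total[key] += 1
        hits[key] += int(record[2])  # True counts as one hit
    return {key: hits[key] / total[key] for key in total}


by_hand = split_averages(AT_BATS, 0)   # versus lefthanded / righthanded pitchers
by_venue = split_averages(AT_BATS, 1)  # home park versus road
```

Any other breakdown (grass/turf, day/night) is the same tally keyed on a
different column, which is part of why such numbers multiply so easily.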
The biggest problem with this flood of data is that very little analysis
has been done. For example, when a manager chooses to start a right-handed
batter (who would normally be platooned) against a right-handed pitcher, he
can justify it statistically in any number of ways. He could base his
decision on that hitter's past performance against the opposing team, or on
his performance against the opposing pitcher, or on his performance in day
games, or on his performance against ground-ball pitchers, or even on his
performance in the past week if he's been on a hot streak. All of these
statistics (and many more) are available to teams on a daily basis. Of course,
many of these numbers will be contradictory: a hitter might have done poorly
against right-handed pitchers, but have done well against that team. He could
also carry a pitiful average against that day's starting pitcher, but have
done excellently in day games.
The real problem, then, is not a lack of available data--the problem is
too little information. Literally everyone these days comes armed with reams
of statistics--managers, teams, players and their agents, the media, and the
fans. Whether you are managing the 1993 World Champion Toronto Blue Jays or
playing armchair manager while watching the home team on cable, which
statistics do you rely upon? Unfortunately, the answer is that no one really
knows.
Precious little work has been done in understanding what these numbers mean;
most of the work has been done in generating still more numbers. No one really
knows if individual pitcher-batter matchups are more reliable predictors of
performance than left/right breakdowns, and while everyone talks about
"clutch" hitters, there are a dozen ways of measuring what might be called
"clutch" hitting. Indeed, many studies have been done which fail to show any
consistent, measurable effect which could be labeled as "clutch" hitting. Just
what is true and what isn't?
The answer is that the numbers are both true and false. Most of the
statistics quoted and
published are literally true. That is, they are accounts of events which
actually happened on the field, and therefore must be true. Stating that a
player batted .390 with runners in scoring position doesn't leave anything to
argue over. What are manifestly false, however, are most of the claims made
about these statistics. If one argues that batting .390 with runners in
scoring position means that the player is a good "clutch" hitter, the validity
of that statement depends not on the actual numbers, but on the interpretation
of those numbers. The interpretation and misinterpretation of the numbers and
the predictions that flow from them are what cause the arguments.
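One way to see why the interpretation is shakier than the number itself is to
ask how much an average can move by chance alone. The sketch below is an
illustration, not a method taken from this essay: it assumes a hypothetical 39
hits in 100 at-bats with runners in scoring position, treats each at-bat as an
independent trial, and uses the textbook normal approximation to the binomial.

```python
import math


def average_with_margin(hits, at_bats, z=1.96):
    """Return (average, margin) under a normal approximation to the binomial.

    z=1.96 corresponds to a rough 95 percent interval. Treating at-bats as
    independent trials is a simplifying assumption made for this example.
    """
    avg = hits / at_bats
    std_err = math.sqrt(avg * (1 - avg) / at_bats)
    return avg, z * std_err


# Hypothetical sample: 39 hits in 100 at-bats (a .390 average).
avg, margin = average_with_margin(39, 100)
print(f"{avg:.3f} +/- {margin:.3f}")
```

Under those assumptions the rough 95 percent margin is close to a hundred
points of batting average, so the same .390 is compatible with quite different
underlying abilities.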
At their best, these new statistics illuminate the various aspects of the
game, making it easier to understand how and why players and teams win and
lose. The fact that they describe performances in specific situations (hence
their name) is both their strength and their weakness. At their worst,
situational stats divide up the game into irrelevant categories which hinder
understanding. The common parody of situational statistics--how a player hits
"in Tuesday night games at home when facing a southpaw in July with the bases
loaded"--is sometimes all too close to reality. In Game Seven of the 1992
World Series, for example, more than five hundred stats were bandied about by
the network broadcasters or displayed on the screen during the game. With so
many numbers coming fast and furious, the significant ones get lost in the
trivial, and the currency of all analytical statistics is debased.
The schizoid attitudes in baseball toward new statistics and analysis are
shown by the fact that there isn't even general agreement on the use of the
word sabermetrics: many fans have never heard of the term, and while some
professionals in the field call themselves sabermetricians, others eschew the
word and call themselves statistical analysts. The word sabermetrics was
coined by best-selling author Bill James, who took the acronym of the
Society for American Baseball Research (SABR) as the root of the word and
added the suffix -metrics to denote measurement.
While SABR may have lent these new statistics its name, the Society for
American Baseball Research is not the primary purveyor of these new
statistics. There are many outlets for statistics today, but most of these
come from only a few sources. The Major League Baseball-IBM Baseball
Information System, which operates out of the commissioner's office in New
York City, now compiles baseball's official statistics. (Computer giant IBM is
an official sponsor of major-league baseball and provides the computer
hardware.) This system was set up in the late 1980s to provide statistics to
major league teams, and it now also provides much of the material used by the
media. The Elias Sports Bureau, also headquartered in New York, remains the
official statistician for both major leagues and is another major source of
statistics. Howe News Bureau, which had previously served for many years as
the official statisticians of the American League, became the official
statisticians in the late 1980s for all minor leagues and most of the Latin
and Caribbean leagues. Signifying its new business orientation was a change in
name to Howe Sportsdata International, which now provides minor-league
statistics to numerous publications, making the records of tomorrow's stars
available to fans today.
In the front offices of baseball teams, there have been two primary
usages of these new statistics. Team Public Relations departments, aided by
the computerized MLB-IBM Baseball Information System, have become increasingly
proficient and prolific at churning out special stats about their
players. Most of these find their way to the fans via the media, who publicize
these stats in print and on the air. The other way in which the new statistics
have penetrated the business end of baseball is through salary arbitration, a
quasi-legal proceeding which directly sets salaries for a few dozen players
each year. Indirectly, however, salary arbitration has a much broader impact
on the game's salary structure. Newer statistical measures have been used by
both management and labor in arbitration in recent years, although many
analytical stats are still not admissible in arbitration proceedings by the
mutual agreement of the disputing parties.
Outside the hearing room, new analytical measures have had a greater
impact. Several major league teams have employed full-time professional
statistical analysts in the past decade, and other teams have employed
statistical analysts as consultants. Of these analysts, Eddie Epstein, now
Director of Research and Statistics for the Baltimore Orioles, has risen the
highest and had the most influence. Some other teams have employed computer
systems to compile and analyze performance data, with the Oakland Athletics
and manager Tony La Russa getting the most credit for successfully using these
tools. It is clear that the impact of computers (and the analytical statistics
they make possible) will continue to grow in baseball's executive suites in
the future.
In the publishing world, the main effect of the new statistics has been
to create a new subgenre of baseball books. Celebrity biographies still sell
the most sports books, but the number of statistically oriented titles
released in recent years is astounding. Bill James, a Kansan who was not a
sportswriter, made the best-seller lists year after year in the 1980s on the
strength of his detailed statistical analysis as well as his witty and
satirical prose. Pete Palmer, a computer programmer by trade, blazed the way
for accurate historical comparisons of players and teams by combining his
tireless research and top-notch computer skills to produce a comprehensive
historical data base. With its first publication in Total Baseball, a
comprehensive reference work, Palmer brought sabermetrics and serious
analytical measures to the general baseball public.
Many other authors in the last decade have published baseball books which
relied on the numbers. By 1990, well over a hundred baseball books were being
released by major publishers each spring; several dozen of these were devoted
to statistical analysis or intended as reference works. The Elias Sports
Bureau, longtime official statisticians of the National League, made their
mark by publishing their annual eponymous book of situational stats starting
in 1985. These stats, previously available only to major-league teams,
instantly became part of the baseball public's consciousness. The Elias
Baseball Analyst became the best-seller of the annual statistical tomes and is
very widely quoted by the media.
Just as James, Palmer, and Elias were preceded by the members of the
Society for American Baseball Research, they were also followed by many
others. Project Scoresheet, a nonprofit organization founded by James in 1984,
coordinated the efforts of hundreds of volunteers and produced the first and
only publicly available data base of contemporary baseball. Retrosheet,
another volunteer group founded by David Smith (a longtime SABR member and
professor at the University of Delaware), is now collecting scoresheets from
pre-1984 games. Armed with copies of more than 75,000 scoresheets donated by
teams, sportswriters and fans, Retrosheet will soon make public its first
computerized data (for the 1967 season).
As the official statisticians, Major League Baseball and the Elias Sports
Bureau are the primary sources from which both the electronic and the print
media get their baseball statistics. Two independent organizations also
maintain baseball data bases, providing both the media and the fans with ready
access to almost any conceivable baseball statistic: Stats, Inc., of Chicago
(run by John Dewan and Dick Cramer), and The Baseball Workshop of Philadelphia
(run by the author). Stats, Inc., has provided greatly improved and expanded
box scores to USA Today, Baseball Weekly, and other newspapers since 1990.
Stats also provides statistics to ESPN and other media outlets, produces the
annual Scouting Report, and self-publishes other baseball reference books. The
Baseball Workshop maintains and updates the former Project Scoresheet data
base while providing statistics and analysis to publishers and to media
clients. The Workshop's annual Great American Baseball Stat Book features
comprehensive situational statistics for all active major-league players.
On the periodical side, media conglomerate Gannett founded a weekly
newspaper in 1990 devoted solely to baseball--more specifically to baseball
statistics, which take up a large portion of the paper. While the
hundred-year-old Sporting News has deemphasized baseball, USA Today Baseball
Weekly has found a market hundreds of thousands strong by focusing exclusively
on baseball. It is true, though, that a very large share of the Baseball
Weekly audience is composed of fantasy baseball players, and the format of the
statistics published in BBW is designed for their convenience. Baseball
America, a biweekly newspaper which focuses on minor-league and college
baseball, has also gained a large following among dedicated fans and fantasy
baseball players. The ready availability of information about thousands of
minor-leaguers in Baseball America, and the regular publication by BBW in 1992
of OPS stats (On-Base plus Slugging, an analytical measure developed by Pete
Palmer) show just how far the new statistics have come.
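As a worked example, OPS is just the sum of two familiar averages. The sketch
below uses the standard definitions of on-base percentage and slugging
average; the season line fed to it is hypothetical, and the function names are
chosen here purely for illustration.

```python
def on_base_percentage(hits, walks, hit_by_pitch, at_bats, sac_flies):
    """Times on base divided by the plate appearances that count toward OBP."""
    reached = hits + walks + hit_by_pitch
    chances = at_bats + walks + hit_by_pitch + sac_flies
    return reached / chances


def slugging_average(singles, doubles, triples, homers, at_bats):
    """Total bases divided by at-bats."""
    total_bases = singles + 2 * doubles + 3 * triples + 4 * homers
    return total_bases / at_bats


def ops(obp, slg):
    """On-Base plus Slugging: the simple sum of the two averages."""
    return obp + slg


# Hypothetical season line, invented for the example:
# 500 at-bats, 150 hits (100 singles, 30 doubles, 5 triples, 15 home runs),
# 60 walks, 5 times hit by pitch, 5 sacrifice flies.
obp = on_base_percentage(hits=150, walks=60, hit_by_pitch=5,
                         at_bats=500, sac_flies=5)
slg = slugging_average(singles=100, doubles=30, triples=5, homers=15,
                       at_bats=500)
line_ops = ops(obp, slg)
print(f"OBP {obp:.3f}  SLG {slg:.3f}  OPS {line_ops:.3f}")
```

The appeal of the measure is visible in the code: it needs nothing beyond the
counting stats already printed in a standard batting line.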
Probably the most high-profile and controversial element of the new
statistics trend has been the explosion in popularity of fantasy baseball, a
pastime that occupies several million players. An undeniable attraction of
fantasy baseball is that it brings to baseball one element which football has
had for decades: gambling. Without the wager, there
would be very few fantasy leagues. The excitement of betting on football is
one of the main reasons for its spectacular growth in popularity, and fantasy
games give baseball fans a chance to partake of the action during the summer
as well as in the fall. Gamblers, both fantasy players and others, make extensive
use of the new stats. Fantasy baseball is firmly established and likely to
increase in popularity.
Another area of great growth has been in baseball games. Baseball board
games, using dice as the element of chance and statistics to recreate
performance, have been played by a small but devoted group since the birth of
APBA Baseball in the early 1950s. In the mid-1980s computer baseball games
took hold and now appear to be the future of baseball gaming. While board
games still have their audience, computer games are much more flexible and are
able to provide fans with a variety of simulated experiences which board games
cannot. Furthermore, the virtues of a computer opponent, when a
flesh-and-blood one is unavailable, are understandable.
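The general idea of turning a statistic into a die roll can be sketched in a
few lines. This is a deliberately crude illustration and not the card system
of APBA or any other commercial game: it assumes three ten-sided dice read as
a number from 000 to 999, and counts the at-bat as a hit when the roll falls
below the batter's average times one thousand.

```python
import random


def roll_three_d10(rng):
    """Read three ten-sided dice as one number from 0 to 999."""
    return rng.randrange(10) * 100 + rng.randrange(10) * 10 + rng.randrange(10)


def at_bat_result(roll, batting_average):
    """Hypothetical rule: rolls below average * 1000 are hits, the rest outs."""
    return "hit" if roll < round(batting_average * 1000) else "out"


def simulate_average(batting_average, at_bats, seed=1994):
    """Replay many at-bats and return the simulated batting average."""
    rng = random.Random(seed)  # fixed seed so a replay is repeatable
    hits = sum(
        at_bat_result(roll_three_d10(rng), batting_average) == "hit"
        for _ in range(at_bats)
    )
    return hits / at_bats


# A hypothetical .300 hitter replayed over 10,000 simulated at-bats.
simulated = simulate_average(0.300, 10_000)
```

Over a long run of rolls the simulated average settles near the real one,
which is all a statistics-driven game needs in order to recreate a player's
performance.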
The latest computer games simulate sophisticated opposing managers as
well as recreating player performance. Moreover, the era of almost real-time
baseball games has already dawned. In 1991 the computer service Prodigy
debuted a baseball game which used last night's real-life performances to play
simulated games. The participants in "Big League Manager" send in their
lineups by modem before they go to bed; the next morning, they dial up the
Prodigy computer and see a boxscore for last night's game for their team
displayed along with current league standings. CompuServe and other major
computer on-line services also provide an electronic forum which connects
baseball fans across the country. These services run hundreds of electronic
fantasy baseball leagues which attract thousands of players. Fantasy baseball
contests, made possible by the combination of computers and "800" and "900"
phone lines, have also become big business in the 1990s.
Yet another electronic milestone was reached in 1991-1992 with the debut
of the electronic baseball encyclopedia, in two forms. First, Total Baseball
became available on compact disc, read-only memory (CD-ROM): as a mini-disc
for Sony's palmtop DataDiscman player and, for MS-DOS or Macintosh desktop
computers, as a conventional-size disc published by CMC. Second, Franklin
Electronics published Big League Baseball, a handheld reference device which
fit into a shirt pocket. The most recent development, as of 1994, is Microsoft
Baseball, which incorporates the statistical database and prose features of
Total Baseball within a larger reference framework, created for the Windows
graphical environment. The advantage of an electronic baseball encyclopedia is
not simply its portability or its compact data storage and retrieval. These
electronic editions invite the user to manipulate the numbers, to make
customized lists and complicated research requests, all of which make the
numbers more meaningful and accessible to the fan than they are on the printed
page.
Last, and least, is the negative reaction to the new statistics. There
has been a backlash, and it's true that many fans believe the game has
suffered from these stats and their purveyors, but the new statistics don't
change the grand old game; they just provide new ways of looking at it. Almost
all traditional baseball statistics (e.g., batting average, earned run
average, errors) were invented in the nineteenth century or the early
twentieth, and they reflect the way the game was played at that time. The game
on the field is quite different now, and it is appropriate that the statistics
used to describe and analyze the game reflect the way the game is played now.
As with new strategies on the field, the old stats will persist while
their replacements become established. During this time, the improvement in
analysis will be obscured by the clash of stats and the arguments of the
analysts. Inevitably, though, the best of the new stats will oust the worst of
the old, and the game will look a little different in the future.
Not too long ago, you know, nobody bothered to count such silly things as
runs batted in, batter strikeouts, times caught stealing, or saves. Players
change, teams change, ballparks change, strategies change--even "the
unchanging game" itself changes. Why shouldn't baseball statistics change
along with them?