Sep 29, 2013; New York, NY, USA; New York Mets manager Terry Collins (10) waves to the fans after a game at Citi Field. The Mets won 3-2. Mandatory Credit: Brad Penner-USA TODAY Sports

Terry Collins, the New York Mets, and Manager WAR - A Research Experiment

As the title suggests, I’m doing research for a “Manager WAR” of sorts this offseason. Because of the preliminary status of my research, I’m not going to delve too deep into it. What research I have done, however, seems pertinent given yesterday’s decision to extend New York Mets manager Terry Collins. The question everyone’s been trying to solve is: Is Terry a good manager?

It’s complex. As most sabermetricians know, one of the big issues with the eye test is that we’re inherently predisposed to remember events that support our biases. We quickly recall times when a bullpen move leads to a loss, or when Justin Turner bats cleanup. We don’t see the decisions that occur behind closed doors, and we don’t see how lineup decisions or days off may positively impact the team long-term. And, of course, there is no easy way to quantify these things. But what if we could try?

We have two objective methods to assess how a team does over a full season. The first – Pythagorean Record – is a system that projects an expected record based on runs scored and allowed. The second – Wins Above Replacement – gives us an expected record based on a team’s talent level. Both sites that compute WAR, Baseball-Reference and Fangraphs, base their metrics on a “replacement level” .294 winning percentage (just under 48 wins). How they determine and allocate their WAR differs between the sites, but the theory is largely the same. Initial research suggests that Baseball-Reference’s figures are much more consistent (the standard deviation for bWAR is around 3.8 whereas for fWAR it’s around 5.4). My somewhat-educated guess is that using runs allowed (RA9) instead of Fielding-Independent Pitching (FIP) and Defensive Runs Saved (DRS) instead of Ultimate Zone Rating (UZR) provides a better measure of team performance in aggregate (or, in other words, that the primary difference comes from how each system measures and grades team defense).
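To make the two baselines concrete, here is a minimal Python sketch. It is an illustration, not the exact formulas used in this research: the classic squared-runs form of the Pythagorean formula (the `exponent=2.0` default) is an assumption, and the function names are made up for the example. The .294 replacement level and the 162-game season come from the paragraph above.

```python
GAMES = 162
REPLACEMENT_WIN_PCT = 0.294  # replacement level cited above


def pythagorean_wins(runs_scored, runs_allowed, exponent=2.0, games=GAMES):
    """Expected wins from run totals (classic form; the exponent is an assumption)."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return games * rs / (rs + ra)


def war_wins(team_war, games=GAMES, replacement_pct=REPLACEMENT_WIN_PCT):
    """Wins implied by talent: replacement-level wins plus team WAR."""
    return replacement_pct * games + team_war


print(round(war_wins(0), 1))                 # replacement-level team: 47.6
print(round(war_wins(30), 1))                # 30 WAR team: 77.6
print(round(pythagorean_wins(700, 700), 1))  # equal runs scored and allowed: 81.0
```

A team with zero WAR lands at just under 48 wins, which matches the replacement level quoted above.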

What I’ve done so far is use the two metrics to establish “Expected” (WAR) and “Actual” (Pythagorean Record) results and determine how a team performs relative to its talent level. A 30 WAR team should expect to win roughly 78 games. If its Pythagorean Record suggests 80 wins, then that team “overachieved” by 2 wins (note: this doesn’t necessarily show up in the final standings, as Win/Loss records often fall victim to luck and circumstance). After subtracting Expected from Actual for every team, the league average is determined (simply the mean of the calculated figures). That mean is then subtracted from each team’s score to determine a normalized Wins Above Average. Having done so for the years 2011-2013 (my research currently goes back to 1999, but we’re specifically concerning ourselves with the Collins years here), I then added each year to get team totals over the three-year period. Looking at the chart, we see that the Mets have performed quite well:
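The single-season calculation can be sketched in a few lines of Python. This is a rough illustration under stated assumptions: the three teams and their numbers are invented, the function name `normalized_waa` is hypothetical, and the only figures taken from the text are the .294 replacement level and the 30 WAR / 80 Pythagorean wins example.

```python
from statistics import mean

REPLACEMENT_WINS = 0.294 * 162  # about 47.6 wins, per the replacement level above


def normalized_waa(teams):
    """teams maps a team name to (team_war, pythagorean_wins) for one season."""
    # Raw score: Pythagorean wins minus the wins implied by team WAR.
    raw = {
        name: pyth_wins - (REPLACEMENT_WINS + war)
        for name, (war, pyth_wins) in teams.items()
    }
    # Subtract the league mean so the scores are centered on zero.
    league_mean = mean(raw.values())
    return {name: score - league_mean for name, score in raw.items()}


# Hypothetical three-team "league"; these numbers are made up.
season = {"AAA": (30.0, 80.0), "BBB": (40.0, 86.0), "CCC": (20.0, 68.0)}
for name, value in normalized_waa(season).items():
    print(name, round(value, 2))
```

Repeating this for each season and adding a team's yearly values gives the three-year total.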


TEAM TOTAL 2013 2012 2011
STL 16.75 11.09 2.42 3.24
ATL 14.65 5.69 5.62 3.34
NYM 14.15 4.39 4.22 5.54
ARI 12.25 2.09 4.42 5.74
WSN 10.35 5.19 2.72 2.44
MIL 7.75 1.09 5.22 1.44
SDP 7.55 -0.11 1.02 6.64
PHI 7.35 1.29 3.52 2.54
PIT 6.05 -2.11 4.42 3.74
CIN 4.05 1.39 0.02 2.64
MIA 3.85 -1.21 1.62 3.44
OAK 1.45 4.49 -2.28 -0.76
CLE 0.05 3.19 -3.18 0.04
CHC -0.15 -2.91 1.52 1.24
HOU -1.35 0.79 0.52 -2.66
SFG -1.75 -2.61 1.72 -0.86
KCR -2.85 0.79 -4.98 1.34
COL -3.35 -5.11 -1.18 2.94
BOS -5.05 -2.71 1.52 -3.86
DET -5.55 -1.51 -3.18 -0.86
TOR -6.45 -1.81 -1.28 -3.36
LAD -6.65 -4.51 3.32 -5.46
NYY -7.35 1.09 -5.58 -2.86
MIN -7.35 -4.11 -3.88 0.64
TBR -7.35 -0.81 -0.08 -6.46
BAL -7.95 -0.11 -4.18 -3.66
TEX -9.65 -4.71 -1.88 -3.06
CHW -10.45 -2.81 -2.48 -5.16
SEA -11.25 -4.01 -3.88 -3.36
LAA -11.85 -1.31 -5.88 -4.66

(Chart shows results through 162 games and does not include last night’s game 163 between Texas and Tampa Bay.)
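As a quick sanity check on the chart, the three-year totals can be recomputed from the yearly columns. A minimal Python sketch using the top five rows as printed above; the variable names are mine, and the rounding to two decimals is only there to absorb floating-point noise.

```python
# (2013, 2012, 2011) values copied from the top five rows of the chart
yearly = {
    "STL": (11.09, 2.42, 3.24),
    "ATL": (5.69, 5.62, 3.34),
    "NYM": (4.39, 4.22, 5.54),
    "ARI": (2.09, 4.42, 5.74),
    "WSN": (5.19, 2.72, 2.44),
}

# Sum the three seasons for each team, then rank from highest to lowest total.
totals = {team: round(sum(values), 2) for team, values in yearly.items()}
ranking = sorted(totals, key=totals.get, reverse=True)

for team in ranking:
    print(team, totals[team])
```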


Terry sits comfortably toward the top, alongside better-regarded managers Mike Matheny* (whose Cardinals have outperformed their talent by a staggering 11.05 games this season) and Fredi Gonzalez. Not only that, but we can see that he’s been largely consistent in his three seasons. This is interesting, as Terry has managed three largely different teams: among position players, only David Wright, Lucas Duda, and Daniel Murphy have played in 100 games in all three seasons; only Jonathon Niese has made at least 20 starts each season (he, Dillon Gee, and R.A. Dickey are the only three to make more than 36 starts over the three years); and only Bobby Parnell has pitched at least 50 innings in relief each year.

At this stage, Manager WAR is still a while from being a thing, and may not end up as one. And we don’t know how much of what I’ve defined as mWAA is actually the direct result of the manager. But the fact that the team has overachieved, and done so consistently, in all three years with Terry Collins at the helm is noteworthy. Though the methods may not be the same, this is exactly what Sandy Alderson was looking for when he said that his evaluation of the manager extended beyond wins and losses. What he saw was a skipper who gets the most out of his players, who rallies his troops consistently and effectively, and who is willing to buy into the system Alderson is working to implement.

Should Collins be able to do what he’s done with the competitive team he’s expected to be handed in 2014, the Mets could be in great shape to contend next year. While there may be bumps, and not all decisions will work out, the Mets appear to be better off in the long run than many currently think.


So what do you think? Is Terry underrated? Overrated? Will his managing strengths translate to a more talented team? Let us know in the comments, and continue the discussion.


If you have any questions regarding methodology or about the mWAR process, please feel free to Tweet me @danhaefeli. As the research develops, I’ll continue to post my findings and any breakthroughs here on Rising Apple.


*Mike Matheny became manager of the St. Louis Cardinals in 2012 following the retirement of Tony La Russa. Even so, in large part due to 2013, Matheny’s 2012-13 alone would rank third on the chart, behind Fredi Gonzalez and Terry Collins.


Thanks for reading! Be sure to follow @RisingAppleBlog on Twitter and Instagram, and Like Rising Apple’s Facebook page to keep up with the latest news, rumors, and opinion.

Tags: New York Mets Sandy Alderson Terry Collins

  • LoveYouGalsAlwaysHave

    This doesn’t make much sense. You are attributing the lumpiness of runs scored to the manager rather than to statistical randomness. I don’t really understand the basis for doing that at all.

    Your analysis of the Mets outperforming their expected record is valid and very interesting, but attributing it to the manager is probably not meaningful. Is the manager really “better” if he wins one game 1-0 and loses another 20-0? The expected W-L would be 0-2 but the actual would be 1-1. Based on this logic you are saying the manager added a win. I think that is a giant leap.

    The only way to really try and prove it is to look at all managers’ career numbers and see if “good” managers tend to be above the mean every season while bad managers are consistently below the mean. Kind of like BABIP. If you see strong patterns you may be onto something. Otherwise, it is not tied to manager performance.

    I think a more likely – and more difficult to measure – way that managers affect the team is actually adding or subtracting runs scored. Good bullpen management reduces runs allowed. Good lineup construction increases runs scored. Etc. But these effects are already fully captured by the Pythagorean W-L record, so they are pretty hard to extract and isolate.

    • Dan Haefeli

      There may be a misunderstanding here. If I’m following you correctly, you’re assuming that Pythagorean record is being compared to actual record, which it isn’t. Also, it seems that you’re considering Pythagorean record as the “expected” for comparison.

      At no point is win-loss record considered. In the scenario you described, Pythagorean record would be 0-2. But regardless of if the team won one of those games, the comparison is made to the bWAR accumulated by the players in those two games (which would likely be negative). In such a case, it’s realistic (and somewhat likely) for a manager to still have a positive mWAA.

      Like most statistics, this is more a sample size issue than anything.

  • ParisWilponCOO

    I think you’re doing some interesting work- but something is very wrong with your formula. It doesn’t pass the “common sense” test. Virtually all observers, from casual fan to veteran scout would say that the Nats seriously underperformed this year. Boston over last year and under this year- really? Baltimore has also been overperforming the last couple of years. If anything, you have discovered a REVERSE proportionality. That would put TC near the bottom of the pile- which also sounds about right to most Mets fans.

  • cliffy44

    In the words of (The Late) Johnny Maestro and the Brooklyn Bridge, re-hiring Terry Collins is “THE WORST THAT CAN HAPPEN”, for the New York Mets. No other manager in the history of organized baseball has had an escalating loss record, during his entire 3 year assignment; and yet retained his position.

    It’s almost as if Collins has taken a page from Pete Rose; and bet against his own team; which would make him a VERY rich person.

    Only Obama has a worse record than Terry Collins does; and he’s gone at the end of his “contract”.

    So should Collins be.

  • BringBackDaveTelgheder

    I like the beginning of the analysis here but there is a long way to go. I’ve seen similar work done elsewhere but it really seems like you have cancelled out some of the noise but there is way too much extra noise in the number you have. Too much is based on luck, circumstance etc to have been boiled down to a relevant number for manager evaluation.

    Anyone that watches the Mets on a regular basis knows TC is an utter dope when it comes to new age thinking. Utterly ridiculous bunts, not bringing in your best reliever into the correct situation, etc.

  • Destry

    That’s the dumbest thing I’ve ever heard. All you have to do is look at team WAR, and see where a team is picking in the draft. The reason he didn’t wanna use Fangraphs is because the Mets have the 19th best WAR in baseball according to Fangraphs. They finished 21st out of 30 teams as far as real life record. He clearly underachieved with the talent he had, according to the numbers. Subtract this and don’t add that and take the inverse of what they would’ve won, instead of what they actually won, and Terry Collins is a good manager. The author should be embarrassed.

  • Thomas Hughes

    If you watched or listened to almost every game, as I did, you should easily conclude that Collins may have cost the Mets as much as 10 or more games this season (I would guarantee at least six). He is not a good field manager by any stretch of the imagination — it is plain as day to anyone who has followed baseball for many years. This is just another example of why “advanced” metrics must be taken with a large grain of salt.

    • paqza

      And though I agree with you based on what I’ve seen, the numbers show exactly the opposite. That’s good to know. It’s another example of how “the plural of anecdote is not data”. I buy these numbers and thought it was a useful analysis to show that Terry is doing a much better job than we give him credit for.

  • Brian Reilly

    CNNSI Article on the Pirates: The day Hurdle took over as the Pittsburgh Pirates manager in December, 2010, he spoke about electrifying the city. He preached optimism then went out and practiced it every day while talking about a vision that went far beyond returning a moribund franchise to respectability.

    It’s why Hurdle isn’t satisfied after leading Pittsburgh to a 94-68 record and its first playoff berth in 21 years. It’s why he doesn’t view Tuesday night’s wild-card game against Cincinnati as the culmination of three years of patience, progress and pragmatism.

    Press Hurdle on how detailed he allowed his vision to get and he leans forward for emphasis.

    “To win a sixth World Series,” he said.

    Do you think we’d ever hear anything like this from Terry Collins? How about Sandy Alderson? He has had the same length of time to build a playoff caliber team, yet he’s been beaten to the punch by the formerly hapless Pirates.

    If we make another meek showing in 2014, we need to tell Alderson, Collins and the Wilpons, “Don’t let the door hit you in the butt on your way out of town.”