An earnest attempt at projecting Josh Donaldson’s next five seasons

Looking at a contract extension, Jays From the Couch goes to great lengths to figure out Josh Donaldson’s potential future value





It’s a long post, so here’s a quick summary: Josh Donaldson has just laid down the 20th most wins above replacement (35.6 fWAR) among position players in their age 27 to 31 seasons. At the market rate of $9 million per WAR, his production would have justified a 5 year/$320 million contract. If Donaldson’s skills age at a normal pace and if he gets a modest amount of playing time, a 5 year/$125 million contract would be a fair deal for the Toronto Blue Jays. A contract like that pays him to be less than half as good over the next five seasons as he was over the last five seasons. Obviously, there is a huge range of potential outcomes. A major injury or dramatic fall in quality could mean that he generates little to no value over the next five years, leaving the team with a large, underwater contract.


Or, he could put together an age 32 to 36 stretch similar to Chipper Jones, Adrian Beltre or Scott Rolen and generate anywhere from $150 to $250 million worth of value. The range of likely outcomes points to Donaldson being a very useful veteran. Players like him help elevate the floor and ceiling during the transition towards a youth movement and are worth the investment. This is especially true when the term is much less than the 10 year deals signed by Miguel Cabrera, Robinson Cano and Albert Pujols in their early-30s. While worth their salaries in the first half of their deals, each will likely struggle to play up to his contract in his late-30s and, theoretically, early-40s. In part, it is those contracts that have made front offices hesitant to give term to players in their mid-30s.


Let me preface this post by making an obvious point: trying to project a baseball player’s production five years into the future is a fool’s errand. That said, Josh Donaldson’s future seems to be the biggest Blue Jays story of the off-season. Having some sense of what his future might look like would help us all when thinking about whether it’s objectively a good idea to re-sign him. This post uses comps to project that future. I’ll be honest about the assumptions I make and the methodology I use.


The usefulness of this sort of exercise is in being able to make assumptions about his playing time (both at 3B and DH) and his production (hitting, baserunning and fielding) and plugging them into the WAR equation to build some baseline projections of Josh’s next five seasons.


The aging curve

The easiest approach to projecting Josh’s future is to take his projection for 2018 and apply an aging curve to it—in his case, subtracting 0.5 WAR per season. Fangraphs projects him to be about a 6 WAR player in 2018. With the aging curve, we’d expect him to produce about 25 WAR over the next five years. While that’s down from the 36 fWAR he produced from ages 27 to 31, it would be an impressive total—at $9 million per win, that type of production would justify a 5 year/$225 million contract.
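For anyone who wants to check the arithmetic, here is a minimal sketch of that aging-curve calculation. It uses only the figures quoted above (a 6 WAR starting point, a 0.5 WAR decline per season and $9 million per win); the variable names are mine.

```python
# Linear aging curve: start from the 2018 projection and subtract a fixed
# amount of WAR each season. All figures come from the paragraph above.
START_WAR = 6.0         # approximate Fangraphs projection for 2018
DECLINE_PER_YEAR = 0.5  # aging-curve penalty per season
DOLLARS_PER_WAR = 9.0   # market rate, in millions of dollars
SEASONS = 5

season_war = [START_WAR - DECLINE_PER_YEAR * i for i in range(SEASONS)]
total_war = sum(season_war)
contract_value = total_war * DOLLARS_PER_WAR

print(season_war)      # [6.0, 5.5, 5.0, 4.5, 4.0]
print(total_war)       # 25.0
print(contract_value)  # 225.0
```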


While it’s a high bar to reach (only 17 players have produced 25+ fWAR over their age 32-36 seasons since 1950), Donaldson has already had an exceptional five-year stretch—he ranks 12th in fWAR produced during ages 27-31 since 1950 (!). He is truly a special talent and I echo Shaun’s tweet:


A custom-fitted aging curve

The problem with this method is that it is overly simplistic and assumes that everyone ages in the same way. Different skills age at different rates, so ideally a long-term projection could account for those differences. This is where comps come in. Using comps is an inexact science (with big grains of salt attached to the process), but they can provide some useful context: namely, how players who performed similarly to Donaldson in their late-20s/early-30s ended up performing in their mid-30s. What follows is just one approach; there are certainly countless other ways to use comps to project future production.


In order to estimate Donaldson’s age 32-36 fWAR, I’ll need to calculate its various components: batting runs, baserunning runs, fielding runs and the positional adjustment. [I’ll use data from the last five years to calculate the remaining three components: the league adjustment, the number of replacement runs and the number of runs per win.] Fangraphs has a full explanation of how position player WAR is calculated.



Josh Donaldson is a great hitter. Among batters in their age 27-31 seasons since 1950, he produced the 40th most batting runs above average.


BB%, K%, BABIP and ISO

I think that comps work best when the stats being used capture something fairly fundamental to the game of baseball. Finding comps by way of AVG, OBP or SLG seems problematic because they each are capturing multiple skills at the same time. A high AVG is attained through a combination of avoiding strikeouts, hitting homers and turning balls in play into hits. A high OBP is attained through all of those things plus getting walks. A high SLG is attained through the same things necessary for a high AVG, replacing hitting homers with hitting for extra bases more generally.


Instead, I’ll focus on four stats that get to the root of hitting production: BB%, K%, BABIP and ISO. Accumulating walks and avoiding strikeouts are obviously important to hitting success. A high BABIP over a large sample size is less a reflection of luck and more a reflection of a player’s innate ability to turn balls in play into base hits. Over more than 3000 PA, Ichiro managed a .353 BABIP, while Jose Bautista managed a .262 BABIP. That chasm wasn’t driven by luck but by the way the two batters approached the game. And finally, a high ISO reflects a player’s pure power.


Importantly, these are also mutually exclusive stats (for the most part). Walks and strikeouts are not part of the BABIP or ISO equations. The only overlap (to the best of my understanding) is the doubles and triples that count as both base hits from balls in play (BABIP) and as extra bases (ISO). Together, these four stats account for every part of a hitter’s production at the plate: walks, strikeouts, base hits and extra base hits.


These four stats also effectively capture Donaldson’s incredible hitting ability. He has managed to accumulate many more walks than average and generate far more power than average, while also managing to strike out less than average and turn balls in play into hits at an average rate. Thus, using these four stats (along with wRC+ for an overall measure of hitting production) seems like the best way to find players like him.


Since I’m examining players across many seasons, I’ll use normalized versions of those four stats that take into account the player’s performance relative to the league average. [Though I should note that using the raw stats doesn’t change the results very much.] Josh himself illustrates why this is important. For example, his 18.4% strikeout rate is not impressive in an absolute sense—among 510 batters with 2500+ plate appearances across their age 27-31 seasons since 1950, Josh has the 431st lowest K%. However, Josh produced a K% that was much better than his average contemporary, so his normalized K% (let’s call it his K%-, since less is better) is the 290th lowest in that group of players.
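To make the idea of a normalized stat concrete, here is a small sketch. It assumes the usual index-style normalization (player rate divided by league rate, times 100), which may not be exactly how the numbers in this post were built. The league strikeout rate below is a made-up placeholder, not a real league average.

```python
def normalized_stat(player_rate: float, league_rate: float) -> float:
    """Index a rate stat to the league average (100 = league average)."""
    return 100.0 * player_rate / league_rate

JOSH_K_PCT = 18.4    # from the post
LEAGUE_K_PCT = 20.5  # hypothetical league average, for illustration only

k_minus = normalized_stat(JOSH_K_PCT, LEAGUE_K_PCT)
print(round(k_minus, 1))  # below 100 means fewer strikeouts than the league
```

In practice the league rate would be the average over the same seasons the player's rate covers, so a player is only ever compared to his contemporaries.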


Mahalanobis comps explained

If you’ve seen comps used in other posts, you may have seen the term “Mahalanobis Comp”. This is based on the mathematical concept of Mahalanobis distance, a way to measure how far apart two observations are relative to the spread of the larger group they belong to. In order to find comps for Josh, I find the Mahalanobis distance between each of his normalized BB%, K%, BABIP, ISO and wRC+ and those of the other 509 batters in the aforementioned data set (players with 2500+ PA over their age 27-31 seasons since 1950). Then, I add those distances up and rank the batters from lowest total Mahalanobis distance to highest.
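Here is a rough sketch of that ranking step. It assumes the distance is computed one stat at a time, in which case the Mahalanobis distance reduces to the absolute difference divided by the group's standard deviation, and then summed across stats. The player names and numbers are invented for illustration; they are not real data and this is not necessarily the exact calculation behind the comps in this post.

```python
from statistics import pstdev

STATS = ["bb", "k", "babip", "iso", "wrc"]

# Hypothetical normalized stats (100 = league average); not real players.
players = {
    "Target":   {"bb": 130, "k": 90,  "babip": 100, "iso": 160, "wrc": 147},
    "Player A": {"bb": 125, "k": 95,  "babip": 102, "iso": 150, "wrc": 140},
    "Player B": {"bb": 80,  "k": 60,  "babip": 115, "iso": 70,  "wrc": 105},
    "Player C": {"bb": 140, "k": 110, "babip": 98,  "iso": 170, "wrc": 150},
}

def rank_comps(target_name, pool):
    """Rank everyone else in the pool by total per-stat distance to the target."""
    # Spread of each stat across the whole group (target included).
    spread = {s: pstdev([p[s] for p in pool.values()]) for s in STATS}
    target = pool[target_name]
    totals = []
    for name, line in pool.items():
        if name == target_name:
            continue
        # One-dimensional Mahalanobis distance for each stat, then summed.
        total = sum(abs(line[s] - target[s]) / spread[s] for s in STATS)
        totals.append((name, total))
    return sorted(totals, key=lambda pair: pair[1])

ranked = rank_comps("Target", players)
for name, total in ranked:
    print(name, round(total, 2))
```

A full multivariate version would use the covariance matrix of all five stats at once, which also accounts for how the stats move together. The per-stat sum is simpler and follows the "add those up" description above.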



In general, his closest comps paint a positive picture. Their strong ability to accumulate walks held up very well into their mid-30s. Their ability to avoid strikeouts better than most only regressed as far as the league average. Their exceptional power diminished, but remained well above-average. Their league average ability at turning balls in play into base hits held up very well. Overall, well above-average batters in their age 27-31 seasons (mean wRC+ of 139) still managed to be above-average batters in their age 32-36 seasons (mean wRC+ of 119). For perspective, among qualified batters in 2017, a 139 wRC+ would have ranked 19th in the majors, while a 119 wRC+ would have ranked a still solid 47th.


These comps also bode fairly well for the continuation of quantity, in addition to quality. Seven of the ten closest comps averaged 350+ PA per season in their mid-30s. Three even managed to crack 550 PA per season. The average of the group of ten was 2069 PA over five years (414 PA per season). Remove Boog Powell from the mix and the remaining nine managed an average of 2246 PA (449 PA per season).


There may be some value in narrowing this list down to only those who played their age 27-31 seasons since 1995. A more recent stretch allows us to focus on those who played in an environment much closer to Josh’s. Statistically, there is a high degree of similarity between the BB%, K%, ISO and BABIP of 2017 and those of seasons dating back to 1995. But prior to 1995, things change a great deal. (I used Mahalanobis distance at the season level to check this out).



Moreover, there have been great improvements in the approach to physical fitness over the last twenty years, both in terms of the quality of fitness programs and players’ willingness to embrace better fitness practices. In the more distant past, Spring Training was when players got in shape after letting themselves go in the off-season. Nowadays, players tend to start pre-season training in November or December.



This more recent group of comps experienced a similar fall in production (mean wRC+ falls from 144 to 126), and the same pattern holds for each individual stat. The big difference is playing time: the recent group averaged roughly 350 more plate appearances over the five years (2426 versus 2069), which physical fitness improvements may help explain. All ten in this group saw at least 350 PA per season. Seven saw 449+ PA per season. Four even managed to crack an average of 575 PA per season.


Using comps to estimate Josh’s five-year wRC+ and plate appearances

I chose two ways to use this information to project Donaldson’s next five years of hitting value (in each case using both the full dataset and just the data since 1995):

1) Apply the average wRC+ and PA of his ten closest comps

2) Take Josh’s age 27-31 wRC+ and PA and apply the average decrease in wRC+/PA experienced by his ten closest comps




The first method is straightforward. Josh’s ten closest comps since 1950 ended up with an average of 2069 PA and a wRC+ of 119: above-average hitters who managed to maintain a decent amount of playing time. Plugging these numbers into the old fWAR machine results in 46.2 batting runs above average. Not bad. For context, this hypothetical mid-30s hitter would’ve ranked tenth in batting runs over the last five seasons among players aged 32-36.


Repeating this method with the recent group of players produces more positive results: greater quantity and quality. Josh’s ten closest comps averaged 2426 PA and a 126 wRC+ over their age 32-36 seasons. Inputting those numbers into the fWAR machine results in 73.6 batting runs, hypothetically good for fifth highest over the last five seasons.


The second method takes into account the fact that Josh outperformed his closest comps over his age 27-31 seasons. This applies to both his plate appearances and his wRC+. So, instead of simply taking the average PA and wRC+ produced by his closest comps in their mid-30s, I calculated the drop in PA and wRC+ that those comps experienced as they aged and then applied those to Josh’s age 27-31 seasons.


Josh accumulated 3270 PA and a 147 wRC+ over the last few seasons. On average, his closest comps since 1950 saw their PA fall by 963 and their wRC+ fall by 20. Applying that decline to his age 27-31 numbers results in a projection of 2307 PA and a 127 wRC+, which amounts to 71.9 batting runs. Josh’s more recent comps saw their PA (-651) and wRC+ (-18) fall by smaller amounts. Applying this decline results in a projection of 2619 PA and a 129 wRC+ (87.9 batting runs).
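The two methods can be sketched in a few lines. The PA and wRC+ figures come straight from the text above. The batting-runs conversion is my own simplification: it ignores park and league adjustments and uses an assumed league rate of 0.116 runs per PA, a number I backed out of the figures quoted here rather than the exact value used in the fWAR calculation. It lands within about a run of each figure in the post, but should be treated as approximate.

```python
LEAGUE_RUNS_PER_PA = 0.116  # assumption, roughly implied by the post's figures

def batting_runs(pa, wrc_plus):
    """Approximate batting runs above average from PA and wRC+."""
    return (wrc_plus / 100 - 1) * LEAGUE_RUNS_PER_PA * pa

JOSH_PA, JOSH_WRC = 3270, 147  # age 27-31 totals

# Age 32-36 averages of the ten closest comps: (PA, wRC+).
comp_average = {"since 1950": (2069, 119), "since 1995": (2426, 126)}
# Average decline of those comps from ages 27-31 to 32-36: (PA, wRC+).
comp_decline = {"since 1950": (963, 20), "since 1995": (651, 18)}

# Method 1: take the comps' mid-30s averages as they are.
method_1 = dict(comp_average)
# Method 2: subtract the comps' average decline from Josh's own numbers.
method_2 = {
    era: (JOSH_PA - pa_drop, JOSH_WRC - wrc_drop)
    for era, (pa_drop, wrc_drop) in comp_decline.items()
}

for label, projection in (("method 1", method_1), ("method 2", method_2)):
    for era, (pa, wrc) in projection.items():
        print(label, era, pa, wrc, round(batting_runs(pa, wrc), 1))
```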


This range of projections suggests that players with skills similar to Josh Donaldson tend to remain very productive at the plate into their mid-30s. They get enough plate appearances and hit well enough to generate more batting runs than most players. On the low end, 46.2 batting runs above average would have ranked 61st in the majors over the last five seasons (no age exclusions this time). On the high end, 87.9 batting runs above average would have ranked 27th.


Base running

Josh Donaldson is a slightly above-average base runner. Through his age 27-31 seasons, he maintained a BsR per 600 PA of 1.0. That figure ranks 35th among the 88 players who cracked 2500 PA in their age 27-31 seasons from 2002-2012. I focused on this range as advanced base running stats like UBR and wGDP go back only to 2002.


Focusing on this recent group of players allows me to find Mahalanobis comps using three components of base running: avoiding double plays (wGDP), stealing bases (wSB) and general base running skills (UBR). This seems useful because (as these stats make clear) there are a few different ways to be a valuable base runner. Comparing runners only by BsR would miss that nuance.


In Josh’s case, his value as a base runner has come predominantly from his general base running skills: knowing when to try to advance an extra base and when not to. Brian Roberts, for example, offers a contrast: he generated less value than Josh from general base running skills, but was far more effective at beating out double plays and stealing bases.


The base runners most similar to Josh saw declines across the board as they aged. There were decreases in their average UBR/600 PA (0.9 to -0.8), wGDP/600 PA (0 to -0.3) and wSB/600 PA (-0.1 to -0.4). Overall, their average BsR/600 PA fell from 0.8 to -1.5.



As was the case with batters, there are (at least) a couple of ways to use this information to project Josh’s base running value over each of the next five seasons. The simplest way is to assume that he will repeat the average performance of his closest comps (-1.5 BsR/600 PA). An alternative is to apply the average decrease in BsR/600 PA experienced by his closest comps (-2.3) to Josh’s age 27-31 BsR/600 PA (1.0). This would result in a projected BsR/600 PA of -1.3 for Josh’s age 32-36 seasons.



For the sake of being conservative in my assumptions, I’ll take the smaller of the two and project Josh Donaldson to produce -1.5 BsR per 600 PA through his age 32-36 seasons. For perspective, the league average BsR/600 PA among players aged 32-36 since 2002 is -1.1.
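The base running choice boils down to a couple of lines of arithmetic, sketched here with the figures from this section (the variable names are mine):

```python
# All values are BsR per 600 PA, taken from the text above.
COMP_BSR_AGE_27_31 = 0.8
COMP_BSR_AGE_32_36 = -1.5
JOSH_BSR_AGE_27_31 = 1.0

# Option 1: assume Josh simply matches his comps' mid-30s average.
option_1 = COMP_BSR_AGE_32_36
# Option 2: apply the comps' average decline to Josh's own starting point.
comp_decline = COMP_BSR_AGE_32_36 - COMP_BSR_AGE_27_31
option_2 = JOSH_BSR_AGE_27_31 + comp_decline

# The conservative pick is the lower (more negative) of the two.
conservative = min(option_1, option_2)

print(round(comp_decline, 1))  # -2.3
print(round(option_2, 1))      # -1.3
print(conservative)            # -1.5
```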


Defence and the Positional Adjustment

The last two components of fWAR that we need to project can be worked on together. We need to figure out how well Josh’s 3B defence might hold up and how many innings at 3B we can expect him to average in his mid-30s. When looking at the data, it seems that above-average third basemen often remain above-average third basemen into their mid-30s.


As with base running, in order to use advanced stats like UZR, I need to restrict my attention to seasons since 2002, as UZR only goes back that far. [I chose to focus on UZR over DRS as UZR represents fielding runs above average in the fWAR equation.] For the sake of consistency, I’d like to focus on third basemen who a) had their age 27 season no earlier than 2002, b) had their age 31 season no later than 2012 (to ensure that I have data on their age 32-36 seasons) and c) played at least half of the innings available (3645) at 3B over their age 27-31 seasons.


Josh Donaldson has been a very good third baseman. Through his age 27-31 seasons, he played 6120 innings and accumulated 37.8 UZR. As such, I’d like to see how other above-average starting third basemen fared in their mid-30s. So, in addition to the aforementioned criteria, I will also focus only on those third basemen who maintained a positive UZR in their age 27-31 seasons. Six players met all four of the criteria I’ve laid out.



This set of comps suggests that Josh Donaldson can continue to be an above-average starting third baseman through his mid-30s. Let’s break these comps into a few smaller groups.


Elite everyday third basemen

The three standouts include Adrian Beltre, Scott Rolen and Brandon Inge. Each continued to get more than half of the innings available at third base. Beltre averaged the equivalent of 129 games at 3B per season, with Rolen (104 3B games per season) and Inge (91 3B games per season) close behind. Importantly, each maintained a well above-average UZR, in spite of advancing ages. Defensively, they represent the best-case scenario for Josh’s mid-30s.


The cautionary tale

The whole point of these defensive comps is to figure out how confident we should be that Josh can remain good enough at third base to justify keeping him there for the next five seasons. Joe Crede represents the player in this set of comps who reflects the worst-case scenario. Crede dealt with serious back issues—his playing time was affected starting in his age-29 season and he retired after his age-31 season—and reminds us of the fact that, unfortunately, the careers of good players can be derailed at any time due to injuries.


The other guys

As far as I can tell, the other two comps on our list had much more benign reasons for not sticking at third base into their mid-30s. As such, they don’t really represent cautionary tales for Josh Donaldson’s future.


Morgan Ensberg produced a 36 wRC+ in his age-32 season and was unable to find a team the following off-season. Chone Figgins seems to have been ruined by the Mariners. They signed him to a 4 year/$36 million deal after an age-31 season in which he produced a 116 wRC+ and 16.8 UZR at 3B. Then, for some reason, they started him at 2B, where he proceeded to produce a -11.1 UZR in his age-32 season. Nevertheless, in limited action at 3B in his mid-30s, Figgins produced a positive UZR at 3B.


Josh Donaldson’s future at third base

Unlike with batting and base running, there’s no easy way to turn these comps into a clear projection, both in terms of innings and UZR at 3B. Rather than a spectrum of comps that range in terms of innings and UZR, we have a bimodal situation: three of the comps played a bunch of above-average innings, while the other three (very much) did not.


Another complicating factor is that the number of innings available to him depends on the number of plate appearances we projected for him earlier. On the low end, we projected him to see 2069 PA over his age 32-36 seasons. At about four plate appearances per game, that amounts to roughly 520 games or 4680 innings (like Scott Rolen). On the high end, we projected him to see 2619 PA in his mid-30s, which equates to about 655 games or 5890 innings (like Adrian Beltre).
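That conversion from plate appearances to games and innings is simple enough to sketch. It assumes four plate appearances and nine innings per game, as above. The figures in the text round the game totals before converting to innings, so the unrounded results here differ slightly.

```python
PA_PER_GAME = 4
INNINGS_PER_GAME = 9

def playing_time(pa):
    """Convert projected plate appearances into (games, innings)."""
    games = pa / PA_PER_GAME
    return games, games * INNINGS_PER_GAME

for projected_pa in (2069, 2619):
    games, innings = playing_time(projected_pa)
    print(projected_pa, round(games), round(innings))
```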


So, what I’ll do is defer to his projected PA to determine his overall playing time. Then, I’ll divide his defensive duties between 3B and DH: 80% of innings at 3B and 20% of innings at DH seems like a reasonable split. Finally, the quality of Josh’s defence. I’m going to take a conservative approach and assume that he produces average 3B defence in his mid-30s (UZR of 0).


Projecting Josh Donaldson’s mid-30s production value

It’s taken us a bit of time, effort and math, but we finally have the ingredients needed to calculate some baseline projections for Josh’s mid-30s. In each case, I will make the same assumptions regarding Josh’s base running, defence and playing time distribution. He will produce -1.5 BsR per 600 PA, just like his ten closest comps. He will produce league average defence at third base. He will spend 80% of his time at third base and the rest as the designated hitter.


What differs across each projection is how I arrived at Josh’s wRC+ and PA. As I showed earlier, I chose two different ways to calculate his projected wRC+ and PA and used two different sets of players to do so. In the first two projections below, I used the average PA and wRC+ of his ten closest comps—first including players going back to 1950, then focusing only on players since 1995. In the last two projections, I applied the average decrease in wRC+ and PA experienced by Josh’s closest comps to his age 27-31 wRC+ and PA (again, first including players since 1950, then focusing on players since 1995).



The four projections range from a low of 11 WAR to a high of 17 WAR. Here’s some perspective for these numbers. Since 1950, 316 players cracked 2000 PA in their age 32-36 seasons. Producing a WAR of 11 would put Josh right around the middle of the pack (tied for 163rd), surrounded by players like Carlos Delgado, David Ortiz, Alfonso Soriano and Fred McGriff. A WAR of 17 would put Josh in a tie with Lance Berkman for 63rd, sandwiched by Ian Kinsler and Tony Gwynn.
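For readers who want to see how the pieces fit together, here is a rough sketch of the WAR assembly. The batting runs, PA, base running rate, average defence and 80/20 split come from this post. Everything else is my own assumption, not a figure stated in the post: positional adjustments of +2.5 runs (3B) and -17.5 runs (DH) per 162 games, 20 replacement runs per 600 PA, 9.7 runs per win, and no league adjustment. With those assumptions the sketch lands on the same 11 to 17 WAR range, but it should not be read as the exact calculation.

```python
RUNS_PER_WIN = 9.7          # assumption
REPLACEMENT_PER_600 = 20.0  # assumption: replacement runs per 600 PA
POS_ADJ_3B = 2.5            # assumption: runs per 162 games at 3B
POS_ADJ_DH = -17.5          # assumption: runs per 162 games at DH
BSR_PER_600 = -1.5          # from the post
SHARE_3B = 0.8              # from the post: 80% 3B, 20% DH
DOLLARS_PER_WAR = 9.0       # from the post, in millions of dollars

def project_war(pa, batting_runs, fielding_runs=0.0):
    """Combine the run components into WAR (league adjustment omitted)."""
    games = pa / 4  # about four plate appearances per game
    baserunning = BSR_PER_600 * pa / 600
    positional = games / 162 * (
        SHARE_3B * POS_ADJ_3B + (1 - SHARE_3B) * POS_ADJ_DH
    )
    replacement = REPLACEMENT_PER_600 * pa / 600
    total_runs = (
        batting_runs + baserunning + fielding_runs + positional + replacement
    )
    return total_runs / RUNS_PER_WIN

# (PA, batting runs) for the four projections described earlier.
projections = {
    "comp average, since 1950": (2069, 46.2),
    "comp average, since 1995": (2426, 73.6),
    "comp decline, since 1950": (2307, 71.9),
    "comp decline, since 1995": (2619, 87.9),
}

for label, (pa, bat) in projections.items():
    war = project_war(pa, bat)
    print(label, round(war, 1), round(war * DOLLARS_PER_WAR))
```

Multiplying the low and high ends by $9 million per win gives roughly $100 million and $150 million, which is where the contract range comes from.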


It’s pretty easy to visualize what it would look like for Josh to underperform the lowest projection: some combination of injuries and a decline in his hitting and fielding abilities. It’s also very possible for him to overperform the highest of the four projections—after all, he has had one of the very best age 27-31 stretches of all-time. He could maintain excellent health and crack 3120 PA like Beltre did. His wRC+ might only fall into the 130s, let’s say 135. He could continue to be an above-average third baseman, posting a 5 UZR/150. Even if he still splits his time 80/20 between 3B and DH, that kind of production would generate 26 fWAR and be worth a 5 year/$230 million contract.


From the point of view of the Jays’ front office, a fair contract for these five years ranges from about $100 million to $150 million. That range overlaps with the kind of dollar figures people have been discussing regarding a Josh Donaldson extension. I think that this discussion can go on and on, for many reasons: contract-wise, Josh doesn’t have any clear comps; there seems to exist a growing hesitation among front offices when it comes to handing big contracts to players in their mid-30s; his free agent class is likely the most stacked in MLB history.


Discussing what he might get and the market factors at work is for another post. The point of this one was to show that Josh’s skill set should age well enough that the Blue Jays (and the Jays fans reading this) should feel comfortable extending him. It seems pretty likely that his performance over the next five seasons would justify the $125 million contract I think it’ll take to sign him. He just needs neutral luck. That’s a risk I’d be happy to take.














Jeff Quattrociocchi

I'm an economics professor in the GTA whose lifelong love for the Jays was reignited by that magical August of 2015 and the amazing moments since.