Twelve years and ninety-six days.

That is how long it took from Jackie Robinson's debut on April 15, 1947 to Pumpsie Green's debut on July 21, 1959. It is the gap between the league's first integration and its last. Across that span, fifteen other franchises made decisions about when to integrate. Some moved within months. Some waited more than a decade.

The Brooklyn Dodgers integrated in April 1947. The Cleveland Indians integrated eleven weeks later, on July 5. The St. Louis Browns followed twelve days after that, on July 17. By the end of 1947, three of sixteen teams had at least one Black player on the major league roster.

By the end of 1948, still three. By the end of 1949, four. By the end of 1953, eight. By the end of 1954, twelve. By the end of 1955, thirteen. By the end of 1958, fifteen.

The Boston Red Sox waited until July 21, 1959. Twelve years and ninety-six days.

This chapter asks three questions the standard integration narrative does not. What did each year of waiting predict? What did each year of waiting cost? And which franchises forfeited the most by waiting longest?

Fig 01. Robinson to Green: 12 yrs, 96 days. Cumulative count of integrated franchises by year, from 3 of 16 in 1947 to 16 of 16 in 1959 (Boston Red Sox).

The Risk Set Over Time

Sixteen franchises entered the risk set on April 15, 1947. One by one, each integrated. The timeline below animates the twelve years and ninety-six days it took to reach sixteen of sixteen.

Fig 02. Animated timeline of franchises integrated, starting at 0 of 16 on Apr 15, 1947.

Survival Curve and Hazard Function

The Kaplan-Meier survival curve shows the fraction of franchises remaining unintegrated at each point in time. The hazard function shows the conditional probability that a remaining holdout would integrate in each year. The shape of the hazard is the finding.

Fig 03. Panels: survival function S(t) and hazard function h(t).

Cox Proportional Hazards Forest Plot

A Cox regression identifies which franchise-level covariates predicted longer integration delay, controlling for the others. Hazard ratios greater than 1.0 indicate faster integration. Ratios below 1.0 indicate slower integration. The headline: the American League integrated at less than half the rate of the National League.

Fig 04

The Forfeited WAR Ledger

For each franchise, the ledger totals the WAR that was available in the unsigned Negro Leagues talent pool during the franchise's pre-integration window. The team with the highest forfeited WAR is not the team you expect.

Fig 05

The Counterfactual Standings

What if the late-integrating teams had integrated in 1947? A Monte Carlo simulation over 10,000 iterations estimates the range of counterfactual competitive outcomes. Select a team and season range below.

Methodology caveat
This model does not iterate second-order competitive equilibrium effects: a player added to one roster is removed from another. If Boston had signed Willie Mays in 1949, the Giants would not have had him in 1951. The figures are therefore upper-bound estimates for the late integrators, not point predictions. The model is speculative by design and is presented with explicit uncertainty.
Table columns: Season, Actual W-L, Finish, CF (counterfactual) Win Distribution, Pennant Prob.

Method

Five models, each documented below. All confidence intervals are 95% unless otherwise noted. The small-n caveat (n = 16 franchises) applies to every model in this chapter.

Model 1 -- Kaplan-Meier

Non-parametric estimation of the survival function S(t) and hazard function h(t). The event is first Black player rostered. Subjects are the sixteen original franchises. Confidence bands derived from bootstrap resampling (B = 10,000).

Confidence label: Modeled. Bands reflect small-sample uncertainty inherent in n = 16.
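
The product-limit calculation behind Model 1 can be sketched in a few lines. This is a minimal illustration, not the chapter's code. It assumes year-level resolution and uses the commonly listed year of each franchise's first Black major leaguer; the chapter's own dataset may be finer-grained. The names `INTEGRATION_YEAR` and `kaplan_meier` are hypothetical. Because every franchise eventually integrated, there is no censoring, and the estimator reduces to the fraction of franchises still unintegrated.

```python
from collections import Counter

# Year of each franchise's first Black major leaguer, as commonly listed.
# Assumption: year-level resolution; the chapter's data may use exact dates.
INTEGRATION_YEAR = {
    "Dodgers": 1947, "Indians": 1947, "Browns": 1947, "Giants": 1949,
    "Braves": 1950, "White Sox": 1951, "Athletics": 1953, "Cubs": 1953,
    "Pirates": 1954, "Cardinals": 1954, "Reds": 1954, "Senators": 1954,
    "Yankees": 1955, "Phillies": 1957, "Tigers": 1958, "Red Sox": 1959,
}


def kaplan_meier(event_years):
    """Return one row per event year: (year, at_risk, events, hazard, survival).

    With no censoring, the Kaplan-Meier product equals the share of
    franchises that have not yet integrated.
    """
    events = Counter(event_years)
    at_risk = len(event_years)
    survival = 1.0
    rows = []
    for year in sorted(events):
        d = events[year]
        hazard = d / at_risk          # discrete hazard: events / holdouts at risk
        survival *= 1.0 - hazard      # product-limit update
        rows.append((year, at_risk, d, hazard, survival))
        at_risk -= d                  # integrated teams leave the risk set
    return rows


if __name__ == "__main__":
    for year, n, d, h, s in kaplan_meier(list(INTEGRATION_YEAR.values())):
        print(f"{year}: at risk={n:2d} events={d} h(t)={h:.3f} S(t)={s:.3f}")
```

Bootstrap bands of the kind described above could be layered on by resampling the sixteen values with replacement and recomputing the curve for each resample.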

Model 2 -- Cox Proportional Hazards

Cox regression with the integration event as outcome and a vector of team-level covariates. Time-varying covariates handled via the Andersen-Gill counting process formulation. Schoenfeld residuals test checks the proportional hazards assumption.

Confidence label: Modeled. Every coefficient reported with 95% CI. Covariates whose interval crosses HR = 1.0 flagged as not statistically distinguishable from no effect.
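
The flagging rule above can be illustrated with a short helper that turns a Cox log-hazard coefficient and its standard error into a hazard ratio with a Wald-type 95% interval. This is a sketch under stated assumptions: the function name `hazard_ratio_ci` is hypothetical, and the coefficient and standard error in the example are invented for illustration, not fitted values from this chapter.

```python
import math


def hazard_ratio_ci(beta, se, z=1.96):
    """Convert a log-hazard coefficient and its standard error into
    (hazard_ratio, lower, upper, crosses_one)."""
    hr = math.exp(beta)
    lower = math.exp(beta - z * se)
    upper = math.exp(beta + z * se)
    # An interval containing 1.0 cannot be distinguished from no effect.
    crosses_one = lower <= 1.0 <= upper
    return hr, lower, upper, crosses_one


# Hypothetical coefficient for an American League indicator (not a fitted value).
hr, lo, hi, flagged = hazard_ratio_ci(beta=-0.8, se=0.5)
print(f"HR={hr:.2f}, 95% CI=({lo:.2f}, {hi:.2f}), flagged={flagged}")
```

With n = 16, a point estimate below 0.5 can still carry an interval that reaches past 1.0, which is why each ratio is reported together with its bounds.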

Model 3 -- Forfeited WAR

Multi-step accounting: for each year 1947--1959, identify the pool of Negro Leagues players with positive prior-year WAR unsigned by any MLB organization. Signability weighting via logistic regression trained on actual post-integration signings. Per-team forfeited WAR aggregated across pre-integration window.

Confidence label: Estimated. Bootstrap intervals reflect uncertainty in the signing model and underlying WAR data.
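
The accounting in Model 3 can be sketched as a signability-weighted sum. Everything below is illustrative. The logistic coefficients are placeholders rather than fitted values, signability is assumed to depend on WAR alone, the pre-integration window is assumed to exclude the integration year itself, and the names `sign_probability`, `forfeited_war`, and `pool_by_year` are hypothetical.

```python
import math


def sign_probability(war, intercept=-1.0, slope=0.5):
    """Logistic signability weight. Coefficients are placeholders,
    not values fitted to actual post-integration signings."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * war)))


def forfeited_war(integration_year, pool_by_year, start_year=1947):
    """Sum signability-weighted WAR over the seasons before integration.

    pool_by_year maps year -> list of prior-year WAR values for players
    unsigned by any MLB organization in that year.
    """
    total = 0.0
    # Assumption: the window runs from start_year up to, but not
    # including, the season in which the franchise integrated.
    for year in range(start_year, integration_year):
        for war in pool_by_year.get(year, []):
            if war <= 0:  # only positive prior-year WAR enters the pool
                continue
            total += sign_probability(war) * war
    return total


# Invented pool for illustration only.
pool = {1947: [5.0, 2.0, -0.5], 1948: [4.0, 1.0]}
print(forfeited_war(1947, pool))  # integrated in 1947: empty window
print(forfeited_war(1949, pool))  # waited through 1947 and 1948
```

In this sketch, a franchise that integrates in 1947 has an empty window, so its total is zero, and each additional year of delay can only add to the ledger.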

Model 4 -- Counterfactual Simulation

Monte Carlo simulation (10,000 iterations, fixed seed) over team-seasons for late-integrating teams. Counterfactual rosters drawn from signability-weighted available pool. Team WAR recomputed, converted to expected wins via Pythagorean expectation, standings recomputed.

Confidence label: AI-generated. This is the most speculative model. Outputs reported as distributions, never point estimates. Does not iterate second-order equilibrium effects.
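
A stripped-down version of the simulation loop might look like the following. It is a sketch, not the chapter's model. It assumes roughly 10 runs per win when converting added WAR to runs, credits all added value to runs scored, uses a 154-game schedule with a Pythagorean exponent of 2, and draws each available player independently with his signability weight. The team totals and the pool are invented, and all function names are hypothetical.

```python
import random


def pythagorean_wins(runs_scored, runs_allowed, games=154, exponent=2.0):
    """Expected wins from the Pythagorean expectation."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return games * rs / (rs + ra)


def simulate_season(runs_scored, runs_allowed, pool, iterations=10_000,
                    seed=1947, runs_per_war=10.0):
    """Return a sorted list of counterfactual win totals.

    pool is a list of (war, sign_probability) pairs for available players.
    """
    rng = random.Random(seed)  # fixed seed keeps runs reproducible
    wins = []
    for _ in range(iterations):
        # Each player joins the counterfactual roster with his signability weight.
        added_war = sum(war for war, p in pool if rng.random() < p)
        # Assumption: added WAR becomes runs scored at ~10 runs per win.
        boosted = runs_scored + runs_per_war * added_war
        wins.append(pythagorean_wins(boosted, runs_allowed))
    wins.sort()
    return wins


def percentile(sorted_values, q):
    """Nearest-rank percentile for q in [0, 1]."""
    idx = min(len(sorted_values) - 1, int(q * len(sorted_values)))
    return sorted_values[idx]


# Invented .500 team and pool, for illustration only.
dist = simulate_season(700, 700, [(6.0, 0.3), (3.0, 0.5)])
print(percentile(dist, 0.05), percentile(dist, 0.5), percentile(dist, 0.95))
```

Reporting the 5th, 50th, and 95th percentiles rather than a single number matches the chapter's rule that outputs appear as distributions. Like the model described above, the sketch ignores the fact that a player signed by one team would be missing from another.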

Model 5 -- Frailty Decomposition

Frailty extension of the Cox model with two random effects: team-level frailty (persistent across ownership) and manager-owner-period frailty (specific to decision-maker). Variance components estimated via penalized partial likelihood.

Confidence label: Modeled. Variance components with profile-likelihood intervals. Sample-limited (n = 16 teams, approximately 30 manager-owner periods).
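
One way to write down a two-level frailty model of this kind is shown below. It is a sketch under assumptions the text does not confirm: the log-frailties are taken to be Gaussian, a common choice when fitting by penalized partial likelihood, and the symbols $b_i$, $c_{ij}$, $\sigma_{\text{team}}$, and $\sigma_{\text{period}}$ are introduced here for illustration. Index $i$ denotes the franchise and $j$ the manager-owner period within it.

```latex
h_{ij}(t \mid x_{ij}) = h_0(t)\,\exp\!\bigl(\beta^{\top} x_{ij}(t) + b_i + c_{ij}\bigr),
\qquad
b_i \sim \mathcal{N}\bigl(0, \sigma_{\text{team}}^{2}\bigr),
\quad
c_{ij} \sim \mathcal{N}\bigl(0, \sigma_{\text{period}}^{2}\bigr)
```

Under this formulation, the share of unexplained variation that persists across ownership changes would be $\sigma_{\text{team}}^{2} / (\sigma_{\text{team}}^{2} + \sigma_{\text{period}}^{2})$. With about 30 periods nested in 16 teams, both components are weakly identified.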

Data Sources

MLB official "first Black player" list (August 2020), cross-referenced against NLBM Barrier Breakers timeline. SABR Bio Project Baseball Integration 1947--1986. Baseball Reference team pages for covariates. Seamheads Negro Leagues Database for player WAR. U.S. Census decadal data (1940, 1950, 1960) for metro-level Black population share.

Confidence Vocabulary
Small-N Caveat
Every model in this chapter operates on n = 16 franchises. This is the full population, not a sample, but the small number constrains the complexity of any regression model and widens all confidence intervals. The chapter treats this constraint explicitly throughout. No covariate effect is reported without its uncertainty bounds.