Simulating Miocene Warmth: Insights From an Opportunistic Multi-Model Ensemble (MioMIP1)

The Miocene epoch, spanning 23.03–5.33 Ma, was a dynamic climate of sustained, polar amplified warmth. Miocene atmospheric CO 2 concentrations are typically reconstructed between 300 and 600 ppm and were potentially higher during the Miocene Climatic Optimum (16.75–14.5 Ma). With surface temperature reconstructions pointing to substantial midlatitude and polar warmth, it is unclear what processes maintained the much weaker-than-modern equator-to-pole temperature difference. Here, we synthesize several Miocene climate modeling efforts together with available terrestrial and ocean surface temperature reconstructions. We evaluate the range of model-data agreement, highlight robust mechanisms operating across Miocene modeling efforts and regions where differences across experiments result in a large spread in warming responses. Prescribed CO 2 is the primary factor controlling global warming across the ensemble. On average, elements other than CO 2 , such as Miocene paleogeography and ice sheets, raise global mean temperature by ∼ 2°C, with the spread in warming under a given CO 2 concentration (due to a combination of the spread in imposed boundary conditions and climate feedback strengths) equivalent to ∼ 1.2 times a CO 2 doubling. This study uses an ensemble of opportunity: models, boundary conditions, and reference data sets represent the state-of-art for the Miocene, but are inhomogeneous and not ideal for a formal intermodel comparison effort. Acknowledging this caveat, this study is nevertheless the first Miocene multi-model, multi-proxy comparison attempted so far. This study serves to take stock of the current progress toward simulating Miocene warmth while isolating remaining challenges that may be well served by community-led efforts to coordinate modeling and data activities within a common analytical framework.

• A synthesis of Miocene modeling efforts, and surface temperature reconstructions, is presented within a single analysis framework • Miocene global mean surface temperature estimates span ∼5.3°C-11.5°Chigher than preindustrial, only ∼2°C is explained by non-CO 2 boundary conditions in climate models • Some simulations overlap with reconstructed global mean surface temperature estimates but fail to capture the weak temperature gradient

Supporting Information:
Supporting Information may be found in the online version of this article.

Introduction
The Miocene epoch  encompasses much of the range of climate states between the Cenozoic endmembers of peak Eocene warmth and the modern ice-house state with extensive glaciation at both poles (Figure 1, Steinthorsdottir et al., 2020).Terrestrial and marine records show that Miocene climate BURLS ET AL. 10.1029/2020PA004054 2 of 40 surface warming patterns simulated by a range of different climate models configured with Miocene paleogeography and CO 2 concentrations spanning 200-850 ppm.We also synthesize available Miocene surface temperature reconstructions.The primary factor controlling the amount of global warming seen across the Miocene simulations analyzed is the CO 2 concentration that was prescribed within a given simulation.On average, Miocene elements other than CO 2 , such as Miocene paleogeography and ice sheets, raise global mean temperature by ∼2°C.While some Miocene simulations with high CO 2 forcing overlap with the reconstructed global mean surface temperature estimates for their target Miocene interval, they still generally fail to capture the reconstructed pattern of warming.
Figure 1.The Miocene within the Cenozoic.The Late Miocene (Tortonian and Messinian, and Early Middle Miocene (Burdigalian,Langhian,and Serravallian, are shown, as well as the  shaded in pink.Multiproxy CO 2 reconstructions are from Foster et al. (2017) and Sosdian et al. (2018).CO 2 from leaf stomata is shown in green circles, pedogenic carbonate δ 13 C as black triangles, boron isotopes in foraminifera as pink stars, liverwort δ 13 C as blue octagons, and δ 13 C of alkenones as orange diamonds.The most likely fit through the data is shown as the black line and 68% and 95% confidence intervals are shown as dark and light gray bands.Benthic oxygen isotopic composition (δ 18 O c ) taken from Zachos et al. (2001Zachos et al. ( , 2008)).The time period abbreviations shown in the top bar are as follows: E, Eocene; M, Miocene, O, Oligocene; PE, Paleocene; P, Pliocene; Q, Quaternary.
was significantly warmer than today and highly dynamic.Oxygen isotope records of benthic foraminifera (δ 18 O c ) show that peak Miocene warmth and minimum ice volume occurred in the Early to Middle Miocene (16.75-14.5 Ma), known as the Miocene Climate Optimum (MCO) (Figure 1).During the MCO, current reconstructions suggest that sea surface temperatures (SSTs) were 8°C-10°C warmer than present in the high southern latitudes (Shevenell et al., 2004) and 10°C-15°C warmer in the high northern latitudes (Super et al., 2020), while deep ocean temperatures were ∼5°C-8°C warmer (Lear et al., 2000(Lear et al., , 2015)), leading to an estimated global mean surface temperature anomaly of 7.6°C  2.3°C compared to preindustrial (Goldner et al., 2014).The warmer conditions shifted the bioclimatic zones poleward (Pound et al., 2012).While the extent of the Greenland glaciation remains unclear (Thiede et al., 2011), it contributed negligibly to global ice volume and there is evidence to suggest that even during the late Miocene the Arctic experienced sea ice-free summers (Stein et al., 2016).Regions of Antarctica supported woody temperate vegetation that graded into tundra further inland with a much-reduced ice sheet extent (Pound et al., 2012;Sangiorgi et al., 2018;Warny et al., 2009).Although cooler than the peak warmth of the MCO, in the late , SSTs still ranged between 10°C and 15°C warmer than modern in the high latitudes, and 2°C-4°C warmer in the Tropics (Herbert et al., 2016;LaRiviere et al., 2012), while temperatures in the deep ocean are estimated to have been ∼4°C warmer than present (LaRiviere et al., 2012).Geochemical and ice-rafted debris records indicate an increasing prevalence of colder temperatures as cooling progressed into the late Miocene (Stein et al., 2016;Winkler et al., 2002).The wide dynamic range of Miocene climate implies either strong sensitivity to forcing or strong and highly variable forcings through the Miocene.
The driving forces behind Miocene warmth and its fluctuations remain enigmatic, and several aspects have been difficult to reconcile with CO 2 reconstructions that are generally similar to modern or end-of-century projections (Figure 1).Typical proxy estimates for the Miocene are in the 300-600 ppmv range (Figure 1, Foster et al., 2012Foster et al., , 2017;;Sosdian et al., 2018), although during the MCO, recent reconstructions suggest that CO 2 may have been as high as 800-1,100 ppm (Sosdian et al., 2018;Stoll et al., 2019).A full description of the Miocene CO 2 reconstructions and their uncertainties is given in Section 8 of (Steinthorsdottir et al., 2020).A prolonged carbon isotope excursion between ∼16.7 Ma and ∼13.5 Ma (the Monterey Excursion; Vincent & Berger, 1985) documents a marked perturbation of the carbon cycle at this time, modulated by orbital forcing.It is suggested that this excursion is due to enhanced organic matter burial on submerged continental shelves and higher CO 2 concentrations resulting in higher biological carbon isotope fractionation (Sosdian et al., 2020).A long-term decrease in the global carbon isotopic composition record for the late Miocene coincides with the long-term decrease in global temperatures, suggesting that carbon cycle changes were the driver for late Miocene global cooling.With CO 2 concentrations in the 300-600 ppmv range, it has not been possible to simulate the observed polar amplified warmth of the Miocene, as well as the lack of summer sea-ice more typical of mid-to-late Miocene conditions (Stein et al., 2016).
Numerous studies have therefore explored the role of other potential mechanisms.Although geologically recent, paleogeographic differences between the Miocene and today are substantial.There were major differences in ocean gateway configurations, such as connections between the Pacific and Atlantic Oceans, the Atlantic and Indian Oceans, and the Indian and Pacific Oceans, which today are either restricted or closed completely (Figure 2).Mountain building was occurring in all of the world's major mountain chains in the Miocene (see He et al., this issue), which transformed local hydrological regimes.Changes in the land surface itself may also be important for driving global warmth.Smaller ice sheets at both poles likely contributed to the polar amplified warmth and the weaker latitudinal temperature gradient of the time.Vegetation distributions resulting from a warmer global climate may itself provide a stabilizing feedback mechanism to maintain that warmth (e.g., Bradshaw et al., 2015;Knorr et al., 2011).As described in more depth in the following section, while these mechanisms have been shown to contribute to warming in models, they fall short of explaining the full extent, and the polar amplified spatial structure, of Miocene warmth (also see summary in Steinthorsdottir et al., 2020).
In this study, we synthesize available Miocene modeling efforts, together with available terrestrial and ocean surface temperature reconstructions, within a single analysis framework.We evaluate the current range of model-data agreement, highlight robust mechanisms operating across Miocene modeling efforts, as well as the regions where the differences across models (coming from a combination of model differences in imposed non-CO 2 Miocene boundary conditions and model feedbacks) result in a large spread in warming re-
The red boxes highlight prominent paleogeographic features that evolved over the Miocene and into the Pliocene, namely: the Panama Gateway, Bering Strait, Barents Sea Landmass, Indonesian Seaway, Tethys Seaway, Greenland-Scotland Ridge, and the Canadian Archipelago.See Table 4 which describes the key characteristics associated with the various baseline paleogeographic forcings used across the MioMIP simulations.
sponses.In Section 2, we provide some background on Miocene modeling and the motivation for this study.
In Section 3, we detail the Miocene modeling efforts analyzed, as well as the Miocene surface temperature records that we have synthesized to facilitate a model-data comparison and Miocene global mean surface temperature estimates.The intermodel analysis is presented in Section 4, where we discuss global mean warmth, simulated warming patterns and present the model-data comparison.We conclude with Section 5.

Background
Climate model simulations pertaining to the Miocene date to the early days of climate modeling itself.Experiments to assess the role of uplift of the Himalayas and the Tibetan Plateau and of the American Rockies found that lower orography resulted in marked changes in precipitation patterns and warmer winter temperatures, but that the magnitude of the temperature change simulated was insufficient to rule out the need for other climate forcings (Kutzbach et al., 1993;Ruddiman & Kutzbach, 1989).These early simulations used atmosphere-only models, however, and more recent coupled atmosphere-ocean models show that uplift of the Tibetan Plateau also impacts ocean circulation resulting in far-field warming (Su et al., 2018).
Several modeling studies have investigated the role of changes in ocean gateways.Miocene Ocean General Circulation Models (OGCMs) simulating an open Isthmus of Panama tend to have a net volume transport that is eastward from the Pacific to the Atlantic Ocean (Maier-Reimer et al., 1990;Mikolajewicz et al., 1993;X. Zhang et al., 2012).Water from the Pacific Ocean is fresher than that in the Atlantic Ocean, causing these modeling studies with an open Isthmus of Panama to simulate a freshening of the North Atlantic, a resulting weakening of North Atlantic Deep Water (NADW) formation and strength of the Atlantic Meridional Overturning Circulation (AMOC).This reorganization of ocean circulation results in cooling of the Northern Hemisphere and warming of the Southern Hemisphere, but the global mean annual temperature changes are small (∼+0.25°C or less, Lunt et al., 2008).With a wide-open Panama Gateway, there no longer needs to be a strong relationship between NADW formation and the AMOC, the North Atlantic can import water through the open Isthmus and through Tethys in the surface branch and export in through the Panama Gateway in the deep branch.Although the Isthmus would have been closing throughout the Miocene, most proxy records show that NADW did not strengthen significantly until after the Miocene (Bartoli et al., 2005;Burton et al., 1997;Haug & Tiedemann, 1998;Lear et al., 2003;Ling, 1997;O'Dea et al., 2016).That said, NADW probably initiated and fluctuated in strength since the late Eocene (Coxall et al., 2018).Therefore, although the timing of emergence of the Isthmus is still hotly debated (e.g., Montes et al., 2012), tectonic changes in this region are not thought to be the cause of Miocene warmth.Together with the Isthmus of Panama, the depth and width of other ocean gateways may have also played a role in the Miocene temperature distribution.It has been suggested that the closure of the Tethys tropical connection between Indian and Atlantic Oceans, together with a deepening or widening of the Drake Passage, induced a reversal of the net ocean flow through the Isthmus (Omta & Dijkstra, 2003;von der Heydt & Dijkstra, 2006) approximately in the Early Miocene.Before the Tethys closure, net transport through the Isthmus was toward the Pacific, while after closure, it was directed toward the Atlantic as several other Miocene GCMs have shown.
Nevertheless, independent of the net flow direction in the Isthmus, GCM simulations show export of Northern Component Water formed in the Atlantic (though much weaker than present) toward the Pacific (von der Heydt & Dijkstra, 2006), keeping intermediate-to-deep waters relatively warm compared to the present day (Sijp et al., 2014); the latter holds in particular for the closed-Tethys situation, which is representative of most of the middle-to-late Miocene.
Modeling studies that have taken all of the reconstructed paleogeographic, ice sheet, and vegetation differences into account simultaneously show that these changes alone (discounting any CO 2 differences or orbital variability) make the Miocene world globally warmer than today.For preindustrial CO 2 concentrations, the models estimate that annual mean temperatures are warmer than preindustrial conditions by 0.3°C-3°C (Bradshaw et al., 2012;Herold, Huber, Greenwood, et al., 2011;Herold, Huber, & Müller, 2011;Knorr et al., 2011) for the late Miocene and 3.1°C for the middle Miocene (Stärz et al., 2017) due to these nongreenhouse gas forcings.For middle Miocene boundary conditions, even with a slightly lower CO 2 concentration than preindustrial (at 200 ppm), a global annual mean temperature of 1.7°C warmer than modern has been simulated (Frigola et al., 2018).The anomaly between the Miocene temperatures and modern temperatures scales nonlinearly with the CO 2 concentration however, because of higher Miocene climate
The inability of Miocene simulations to capture the full extent of the polar amplified warmth and meridional temperature gradient reduction reflected in reconstructions is a problem that is not unique to the Miocene.Historically, it has been equally challenging to fully simulate the high-latitude warmth of the Eocene (Huber & Caballero, 2011;Huber & Sloan, 2001).That said, the latest DeepMIP effort which targets the Eocene has seen some improvement in the ability of models to simulate high-latitude warmth-three of the DeepMIP models (NorESM, GFDL, and CESM1.2) show improved skill in capturing elements of the polar amplification (Lunt et al., 2020) as reconstructed by a comprehensive synthesis of all the available Eocene data (Hollis et al., 2019).In the DeepMIP ensemble, the non-CO 2 component of warming from preindustrial to Eocene boundary conditions ranges from 3°C to 5°C (Lunt et al., 2020).This indicates that a significant contribution to warming comes from paleogeographic forcing, including vegetation changes, and ice-albedo effects.The improved model-data agreement seen in these Eocene simulations is likely also related to the better representation of cloud microphysics overall (Kiehl & Shields, 2001;Sagoo et al., 2013;Zhu et al., 2019).Interestingly, two of the DeepMIP models that best represented the proxy data were also models that implemented adapted aerosol concentrations, either coupling aerosols to a new cloud microphysics scheme (CESM; Zhu et al., 2019) or including aerosol forcing as a new fixed boundary condition (GFDL; Hutchinson et al., 2018;Lunt et al., 2020).These aerosol changes influence cloud microphysics and hence the radiative properties of these simulations and represent a major advance in their physical realism, but one that is largely unconstrained by geological data (Huber, 2013).
If the Eocene is the warmer endmember for the Miocene, the Late Pliocene which has been a target of the Pliocene Model Intercomparison Project represents the cooler endmember.This model intercomparison project is now in its second phase (PlioMIP2, Haywood et al. 2016), with a more tightly constrained time slice defined compared to the first phase aimed at reducing uncertainty in the proxy data.Because the Late Pliocene paleogeography is much more similar to the modern day, the non-CO 2 component of warming is much smaller than for the Eocene.Global annual mean surface air warming for the multi-model ensemble of PlioMIP2 is 3.2°C (with a range between 1.7°C and 5.2°C) relative to pre-industrial.All of these simulations use an atmospheric CO 2 concentration of 400 ppm.The multi-model-mean warming for PlioMIP2 is larger than seen in PlioMIP1 and is the result of contributions from recent models with modified aerosol and cloud microphysics, that have a relatively high climate sensitivity (Haywood et al., 2020).PlioMIP2 models exhibit polar amplification that is in better agreement with proxies than in PlioMIP1 models (Haywood et al., 2011(Haywood et al., , 2013)), at mid and high latitudes, in part due to re-assessment of high latitude proxy records (McClymont et al., 2020), and in part due to changes in the configuration of the Bering Straits (Haywood et al., 2020, Otto-Bliesner et al., 2017) in the model simulations.Pliocene model-data comparisons can also provide constraints on climate sensitivity (Hargreaves & Annan, 2016;Haywood et al., 2020;Renoult et al., 2020), the relationship between climate sensitivity and Earth system sensitivity (Haywood et al., 2020;Lunt et al., 2010), and the contribution to warming from non-CO 2 forcings (Chandan & Peltier, 2018;Lunt, Dunkley Jones, et al., 2012;Lunt, Haywood, et al., 2012).Overall, the Pliocene represents a world that is in equilibrium with near-modern CO 2 forcing, and exhibits a climate similar to that expected at the end of this century under relatively optimistic emissions scenarios (Burke et al., 2018).
Although the Miocene represents a climate state in-between that of the early Eocene and the mid-Pliocene, and presents unique modeling challenges, no formal MIP exists for the Miocene.The existing Miocene model-data comparison efforts have used different proxy data sets and validation techniques (Bradshaw et al., 2012;Goldner et al., 2014;Herold, Huber, Greenwood, et al., 2011;Herold, Huber, & Müller, 2011;Krapp & Jungclaus, 2011;Micheels et al., 2011), and are restricted to interpreting results in a single model framework.Here, we seek to take advantage of the fact that several Miocene modeling efforts have been conducted and evaluate them within a single, consistent, model-data comparison framework to establish not only the degree of model-data agreement in simulating Miocene warmth but also the spread in model responses.

Paleoceanography and Paleoclimatology
As a community, we have endeavored to take the first step toward the ideal solution of having a model-model and model-data intercomparison effort with uniform boundary conditions and standardized reference proxy data sets.Taking a similar approach as in the early steps of the DeepMIP process (Lunt, Dunkley Jones, et al., 2012;Lunt, Haywood, et al., 2012) (Lunt, Dunkley Jones, et al., 2012;Lunt, Haywood, et al., 2012).The results presented in this study serve to synthesize the current range of model-data agreement seen across Miocene modeling efforts to date, highlight robust mechanisms operating across Miocene modeling efforts, and take stock of the current progress toward simulating Miocene warmth while isolating remaining challenges.Our eventual aim is to catalyze a formalized and community-based MioMIP project.

Available Modeling Efforts
Several studies targeting the Miocene have been performed by modeling groups across the globe.In this section, we provide a brief overview of the simulations performed by these studies as summarized in Tables 1-3.Herold et al. (2008), that is, Herold et al. (2010) and You et al. (2009), or the updated boundary conditions of Herold, Huber, and Müller (2011), that is, Herold, Huber, Greenwood, et al. (2011) and Herold et al. (2012).The details of these CCSM3 simulations are summarized in Tables 1  and 2. Table 4 summarizes the key features associated with the Herold, Huber, and Müller (2011) paleogeography boundary conditions which have been widely used by other groups and have formed the basis for some other Miocene boundary condition efforts.For the sake of this intercomparison study, output from the last 100 years of the CCSM3-NH simulation (Herold, Huber, Greenwood, et al., 2011;Herold et al., 2012; Table 2) run for 1,100 years have been provided, at which point global mean ocean temperature varied by less than 0.01° per century.
CCSM3 T42 (MARUM): CCSM3 T42 (MARUM) simulations have been performed for the periods just before (Middle Miocene Climate Optimum; MMCO) and after (Middle Miocene Glaciation; MMG) the Middle Miocene Climate Transition around 13-15 Ma (Table 2).These two experiments differ in the atmospheric CO 2 concentration (400 ppmv in MMCO and 200 ppmv in MMG) and the volume of the Antarctic ice sheet (43-m difference in sea-level equivalent).In addition, two sensitivity experiments with respect to atmospheric CO 2 have been carried out: experiment MMCO_200 uses the MMCO Antarctic ice sheet but an atmospheric CO 2 level of 200 ppmv, while MMG_400 uses the larger MMG ice sheet but an atmospheric CO 2 level of 400 ppm.All other greenhouse gases, as well as ozone distribution, aerosols, solar constant, and orbital configuration, were set to pre-industrial levels.Table 2 Continued global topography, bathymetry, and vegetation as described in Frigola et al. (2018).The experiments were integrated for a total of 1,500 years each, where the last 100 years were used for analyses.
CCSM4: Following on from Goldner et al. (2014) which used a "slab ocean", fully coupled simulations have been conducted using version 1.0.5 of the Community Earth System Model with version 4 of the Community Atmosphere Model (the equivalent of CCSM4; Gent et al., 2011).The CESM1.0 model improvements, in comparison to older generation models, are described in previous study focused on both modern-day climate (Bitz et al., 2012;Neale et al., 2010) and paleoclimate (Goldner et al., 2014;Shields et al., 2012); >2,000 years were simulated.The simulations were initialized with initial conditions from the prior CCSM3-NH simulations so that they equilibrated quickly, and were run with Early to Middle Miocene boundary conditions updated from Herold et al. (2008), Herold, Huber, and Müller (2011), and Herold et al. (2012).These updates to the paleogeography were extensive (Figure 2b) and not previously described.We therefore provide a description of how these boundary conditions were constructed (supporting information S1) and make them available to the community (supporting information Data Set S1).These simulations used the bulk aerosol mode (BAM) with preindustrial aerosols prescribed.

CESM1-CAM5:
The CESM1-CAM5 simulations are run using Miocene boundary conditions that are essentially identical to those used in the CCSM4 simulations.The updated paleogeography described above is used throughout.Orbital parameters and other boundary conditions follow the pre-industrial configuration.
The main difference between the CCSM4 and CESM1-CAM5 cases lies in the physics of the atmospheric component.CAM5 has many improvements over its predecessors in its handling of clouds and aerosols (Bacmeister et al., 2012;Gettleman et al., 2008Gettleman et al., , 2010Gettleman et al., , 2015;;Neale et al., 2010), radiative transfer (Iacono et al., 2008), deep convection (Neale et al., 2010), and shallow convection and moist boundary layer processes (Bretherton & Park, 2009).The combination of these improvements in physical parameterization allows for the simulation of complete aerosol-cloud interactions of cloud droplet activation by aerosols, precipitation processes due to particle size-dependent behavior, and explicit radiative interaction of cloud particles.These simulations are run with the standard CAM5 3-mode modal aerosol mode (MAM3, X. Liu et al., 2012).The aerosol flux is tuned to be close to the preindustrial control flux (Dicks, 2019).Using MAM3 ensures self-consistent aerosol-cloud microphysical interaction.
As described in Zhou et al. (2018), these CESM1-CAM5 simulations were initialized from the end of the CCSM4 simulations (at 2,200 model years).The CCSM4 simulation was equilibrated with a surface radiative imbalance of 0.08 W m −2 .The CESM1-CAM5 simulation used here was continued for another 1,800 years at which point the global mean surface temperature was nearly unchanged and the long-term mean surface radiative imbalance was 0.07 W m −2 .These simulations were then extended, and the output was archived for this study.
COSMOS: COSMOS, with its ECHAM5 (Roeckner et al., 2003) atmospheric component and MPIOM (Marsland et al., 2003) oceanic component, has been used for simulating the Late Miocene in Knorr et al. (2011) and the Middle Miocene Climate transition in Knorr and Lohmann (2014).For further details regarding the paleogeography reconstruction and proxy-based reconstruction of the Late Miocene vegetation, refer to Micheels et al. (2007Micheels et al. ( , 2011) ) and the references therein.The corresponding boundary conditions are available at PANGAEA (Knorr et al., 2019).Further details on ECHAM5/MPIOM are described in Jungclaus et al. (2006).Furthermore, Middle to Early Miocene simulations have been performed in a COSMOS model configuration (Huang et al., 2017;Stäerz et al., 2017) that additionally include dynamical vegetation as part of the land surface scheme JSBACH (Brovkin et al., 2009;Raddatz et al., 2007).The model setup targets the Middle to Early Miocene time period (∼23-15 Ma) using the boundary conditions of Herold et al. (2008), including orography, paleobathymetry, and ice sheet adjustments.Additionally, a regional high-resolution bathymetric reconstruction comprising the North Atlantic/Arctic Ocean (Ehlers & Jokat, 2013) has been implemented.

Genesis-slab ocean:
The Genesis simulations were performed to provide boundary meteorology to an Antarctic regional climate model; they have different Antarctic ice sheet extents and have extremes of astronomical parameters that lead to highs (ecc.= 0.05, obliquity = 24.5°,longitude of precession = 270°) and lows (ecc.= 0.05, obliquity = 22.5°, longitude of precession = 90°) of insolation for Antarctica.Simu-

Paleoceanography and Paleoclimatology
lations were also performed with an astronomical configuration similar to the modern day (ecc.= 0, obliquity = 23.5°).The atmosphere is coupled to a 50-m slab diffusive mixed-layer ocean and dynamical sea-ice model (Thompson & Pollard, 1997).The runs include dynamic vegetation and oxygen isotope tracing.The paleogeography is from Herold et al. (2008)  HadCM3L-Marzocchi: These orbital parameter sensitivity simulations are initialized from the end of the 2,100-year late Miocene simulation by Bradshaw et al. (2012), so the model setup is identical to that described above as "HadCM3L-Bradshaw" for the late Miocene, apart from version HadCM3LB-M2.1aE(Valdes et al., 2017) was used.These simulations are all run for an additional 200 years with varying orbital parameters.Climatological means of the last 50 years of simulation are used for analyses.
HadCM3L-Farnsworth: These simulations use the HadCM3LB-M2.1aD(Valdes et al., 2017) coupled atmosphere-ocean general circulation model with a dynamic vegetation scheme as used for the late Miocene HadCM3L-Bradshaw simulations.The Getech Plc.paleogeographic (topography, bathymetry, and ice sheets) boundary conditions are the same as Farnsworth, Lunt, Robinson, et al. (2019) with stage specific solar luminosity for the Langhian, Tortonian, and Messinian (see Tables 1 and 2) as calculated by Gough et al. (1981).pCO 2 is set to 400 ppm (as well as two sensitivity studies at 280 and 560 ppm) in line with proxy estimates (Foster et al., 2017).Each simulation is integrated for 7,422 model years and has reached equilibrium in both the atmosphere and deep ocean with climate means taken from the last 100 years of each run.Each simulation was initialized from a stationary state in the ocean with the atmosphere initialized from a preindustrial state (for full details see Lunt et al. [2017] and Farnsworth, Lunt, O'Brien, et al. [2019]).
NorESM-L: These simulations use NorESM-L which couples the spectral Community Atmosphere Model (CAM4) (Eaton, 2010;Neale et al., 2013) and the Miami Isopycnic Coordinate Ocean Model (MICOM).NorESM-L performs well in simulating the pre-industrial climate (Z. S. Zhang et al., 2012) and has good skill in simulating paleoclimates (Z.Zhang et al., 2013;Z. Zhang et al., 2014).Detailed introduction of NorESM-L can be found in Z. S. Zhang et al., 2012;Bentsen et al., 2013, and the Miocene boundary conditions can be found in Z. Zhang et al. (2014).

Available Temperature Reconstructions
Here, we describe the proxy data sources used in the model-data comparison shown.

Late and Middle Miocene Terrestrial Mean Annual Temperature Estimates
The Bradshaw et al. (2012) et al., 2014).CLAMP and Leaf Margin Analysis use the morphological characteristics of fossil leaves to determine climatic parameters (Uhl et al., 2003;Wolfe, 1971;Yang et al., 2011).The new synthesis expands the original data sets of 52-110 sites with important additional localities in the southern hemisphere (Table S1).Additional data points have been synthesized from the published literature with a cutoff date of October 2019.To maximize geographic spread, some sites with published taxa lists and no climate reconstructions have had MAT estimates reconstructed.New MAT reconstructions are based upon the Co-existence Approach, following the advice of Utescher et al. (2014) and the updated data set of Pound and Salzmann (2017).All error margins and uncertainty ranges follow the published authors, unless none were provided in which case a 5°C uncertainty range was applied.A subset of the records within this new Middle Miocene MAT data set, with ages that fall between 14.5 and 16.75 Ma, are used to estimate MCO global mean surface temperature as outlined in Section 3.4.

Late, Middle Miocene and MCO Sea Surface Temperature Estimates
Published Miocene SST reconstructions (Mg/Ca, U k 37, and TEX 86 ) have been synthesized from the literature to provide Late and Middle Miocene mean annual SST estimates (Figure 2 and Table S2).The majority of the Miocene SST records are U k 37 records to which we assign a 95% confidence interval uncertainty of ±3°C (K.T. Lawrence et al., 2007).This value is aligned with one standard deviation uncertainty estimates of ±1.1°C (Conte et al., 2006) to ±1.5°C (Tierney et al., 2018).Care needs to be taken in tropical applications of the U k 37 ratio especially during past warm intervals because the ratio reaches saturation and thus the thermal limit of the paleothermometer at temperatures below the maximum surface water temperatures in the modern tropical ocean (Tierney et al., 2018).The remainder are Mg/Ca or TEX 86 records.Care needs to be taken when interpreting TEX 86 records as they have been shown to be influenced by temperature signals below the surface mixed layer (Lopes dos Santos et al., 2010).For the TEX 86 uncertainty estimates, we use the stated 90% or 95% confidence intervals published with a given record or convert the stated one standard deviation uncertainty estimate to a 95% confidence interval.Care also needs to be taken when using Miocene age Mg/Ca records given uncertainties in the Mg/Ca of seawater, salinity, and pH (Gray & Evans, 2019;Holland et al., 2020).Therefore, whilst most modern Mg/Ca-temperature calibrations have a standard deviation on the order of ±1.2°C (Anand et al., 2003), here we assign a 95% confidence interval uncertainty of ±4°C on absolute Mg/Ca SSTs.This Mg/Ca uncertainty estimate is based on the average 95% confidence interval assigned to the Mg/Ca records synthesized for the Eocene in Hollis et al. (2019), where the DeepMIP community did a thorough assessment of the impacts of the above-mentioned factors on Mg/ Ca uncertainty for the Eocene.While a comparable assessment for the Miocene Mg/Ca records is planned, this serves as an appropriate intermediate estimate.Given that the number of ocean surface temperature records from high latitude regions is limited, future analysis may consider comparing against published benthic temperature records.
Late Miocene reconstructed values are defined as all available estimates from a given site falling between 11.6 and 5.33 Ma and Middle Miocene reconstructed values are defined as all available estimates falling between 15.97 and 11.63 Ma.It is important to note that the records differ in temporal resolution and while some records span the entire interval, others cover only a portion.To partially address the caveat that some records are more representative of the end of the Middle Miocene interval while others the beginning, and given that significant cooling occurs toward the end of the Middle Miocene (end of the Langhian though the Serravallian), the sensitivity of the model-data SST bias results to focusing on a narrower interval within the MCO (14.5-16.75Ma) has been assessed-noting however that fewer sites are available for the comparison after this refinement.

Model-Data Comparison Methodology
The model-data comparison methodology employed in Section 4.3 is similar to that used in previous Miocene studies (Bradshaw et al., 2012(Bradshaw et al., , 2015)).For each site, the one standard deviation (68%) uncertainty intervals are added to the mean reconstruction values across the given timeslice and the site location is translated back to an estimated paleolocation.The method used to estimate these paleolocations uses plate reconstructions consistent with the paleogeographies described in Markwick and Valdes (2004) that were based on Rowley (1995, pers. comm.).All model output is interpolated, using bilinear interpolation, to a common 1-degree by 1-degree grid.Reconstructed values are compared to the minimum and maximum values simulated in the model grid cell containing the reconstructed data, and all adjacent model grid cells.This allows for data location uncertainties and model uncertainties such as the misplacement of large-scale climate features (it is unreasonable to expect a model to reconstruct the exact climate of a single grid cell, also note that by using degrees this encompasses a larger zonal distance in the tropics than high latitudes).Since the model simulations and data reconstructions are ranges of possible values rather than a single value, the term overlap is used to define where the two value ranges are consistent (see Figure S1).Bias between the model and data is determined where there is no overlap between the model and data uncertainty ranges and the magnitude of this bias is determined by the degree of separation between the uncertainty intervals of the two data sets (see Figure S1, note that as illustrated in Figure S1 this represents the minimum possible bias within the given uncertainty ranges).When, for a given model setup, a grid point that corresponds with a given terrestrial MAT (SST) site location, as well as its neighboring grid points, are in the ocean (over land) the site is removed from the bias calculation.
Note that rather than using the mean reconstructed SST values for the given time interval, ± the assigned data uncertainty, an alternative approach would be to fully encompass the variability in SST estimates seen across the Late Miocene, Middle Miocene, and MCO time intervals and use maximum and minimum recorded values plus uncertainty-as in Bradshaw et al. (2012Bradshaw et al. ( , 2015)).Given that the SST reconstructions have varying temporal resolutions, and are not collocated in time, this approach has the potential to mask spatial covariance signals (e.g., a higher resolution record can capture more of the sample variance even when the two samples have similar means, which in turn could lead to a smaller "bias" derived using the approach outlined above, even though the means are similar).Given that this paper focuses on the ability of Miocene simulations to represent gross features of Miocene warmth, we feel that the use of mean reconstructed SST values is more appropriate in this context.

Global Mean Surface Temperature Estimates
Global Mean Surface Temperature (GMST) estimates for the Late Miocene, Middle Miocene, and MCO are calculated using the proxy MAT and SST syntheses compiled (Section 3.2).We use a method that has been previously employed to provide Eocene GMST estimates.Despite its simplicity, when applied to the same set of Eocene temperature reconstructions (Hollis et al., 2019), this method produces values that are largely consistent with a range of other methods (Inglis et al., 2020).The zonal mean temperature (T) profile at sea level is approximated using a simple model: where coefficients a, b, and c are chosen to minimize the sum of the squared residuals relative to the Miocene surface temperature proxy data (Figure 3).As demonstrated in Figure 3 of Inglis et al. (2020), this model provides an accurate approximation of the modern    T profile.For both the MAT and SST data, we account for the uncertainty in each temperature estimate by assuming a normal probability distribution around each temperature estimate with the stated uncertainty interval treated as the 90% confidence interval.For the MAT data, we also account for the uncertainty in the elevation of each site by assuming a skew-normal distribution with a 90% confidence interval equal to the lowest and highest elevations within all paleotopographic grid points adjacent to the given site (with a lower bound of zero).For the Middle Miocene and MCO calculations, the paleotopography from the Middle Miocene paleogeography described in this article (updated Herold, Huber, and Müller [2011]) is used, while the Knorr et al. (2011) paleogeography is used for the Late Miocene.The temperature and elevation distribution at each site is then randomly sampled and corrected to sea level by applying a lapse-rate adjustment of 6 K/km.This implies that errors in the paleotopography may influence the estimate of global mean surface temperature, and the resolution and fidelity of the paleotopography used will be a possible source of errors when the terrestrial data is included.Using a standard Monte Carlo bootstrapping method, we resample the same number of data points with replacement and find the coefficients in Equation 1 that best fit the sub-sampled data.A probability distribution for    T is found by repeating this procedure 10,000 times.The result is shown in Figure 3

Zonal Mean Energy Balance Analysis
To establish the relative contribution of energy flux convergence and each radiative component to the meridional warming patterns within the Miocene simulations, we employ a 1-D zonal mean energy balance analysis framework (Feng et al., 2017;Heinemann et al., 2009;Hill et al., 2014;Lunt, Dunkley Jones, et al., 2012;Lunt, Haywood, et al., 2012;Lunt et al., 2020).1-D energy balance is used to approximate zonal mean surface temperature: where    T is the zonal mean surface temperature, a function of latitude, . , and C is the surface heat capacity.When applied to equilibrated or near equilibrated, annual-mean, climatological fields, the term on the left-hand side is negligible, and simulated zonal mean surface temperature profiles are well approximated by the balance on the right-hand side of Equation 2. This balance is between net downward shortwave, net outgoing longwave, and the convergence of energy by atmospheric and ocean energy transport, such that the solution for The change in    T between each Miocene experiment and its respective control is then , where the prime represents the control simulation.Given that changes in  ,  and H are small relative to their absolute values, the contribution of chang- es in emissivity, albedo and heat transport can be approximated as , where Using clear-sky (cs) radiative fluxes Δ emm T can be decomposed into the contribution due to the greenhouse effect (CO 2 , water vapor, and lapse rate), . Similarly Δ alb T can be decomposed into the contribution of clearsky changes due to surface albedo and aerosols, There are limitations in this diagnostic approach, particularly when it comes to separating out the changes due to clear-sky and cloud radiative forcing in the presence of sea-ice, and we have not separated the contribution of changes in surface elevation from the greenhouse effect.Nevertheless, it provides insight into the relative contribution of the five components, , based on what was feasible given the output available within our MioMIP data set.

Paleoceanography and Paleoclimatology
relative to the preindustrial control climate, explaining around 75% of the variance (Figure 4a).The best fit regression across these experiments is for a climate sensitivity of 0.97 Wm per CO 2 doubling (this is less valid when doubling at higher CO 2 ; Etminan et al., 2016) translates into ∼3.6°C per doubling.This first-order linear relationship can be used to provide a few additional insights based on this particular ensemble performed with a broad range of boundary conditions reflective of how conditions may have changed throughout the Miocene.First, the X-intercept suggests that the mean impact of the range of non-CO 2 Miocene boundary conditions used is to raise global mean surface temperature by 1.96°C (this statement ignores nonlinear interactions between CO 2 induced temperature changes and temperature changes induced by other changes in the boundary conditions but provides an insightful starting point).Second, deviations from this first order linear relationship range from −1.87°C to 2.39°C providing an indication of the spread in warming responses arising due to a combination of the spread in imposed Wm or 1.2 times the forcing from a CO 2 doubling.
As described in Section 3.4, GMST estimates ± one standard deviation have been calculated using the Late Miocene, Middle Miocene and MCO syntheses described in Section 3.2.Two different estimates have been derived for each interval, first an estimate derived using both the terrestrial MAT and SST data sets (Figure 3, top row), and second an estimate based on only the SST data (Figure 3, bottom row).When both the terrestrial MAT and SST are included, this method estimates Late Miocene GMST as 19.30°C ± 0.98°C, Middle Miocene GMST as 21.21°C ± 0.56°C and MCO GMST as 22.93°C ± 1.01°C.If the terrestrial sites are excluded, these GMST estimates based only on SST reconstructions are substantially warmer: Late Miocene GMST 21.95°C ± 0.81°C, Middle Miocene GMST 24.46°C ± 0.81°C and MCO GMST 25.47°C ± 1.17°C.There are several reasons why the terrestrial data might bias the estimate cold, including calibration techniques biased toward the modern temperature range and that many of these records are from mountain ranges (while we have applied a lapse rate correction, it is only as good as our elevation estimate derived from a low resolution paleotopography product which might lead to a cold/warm biasing).On the other hand, the number of SST reconstructions is limited such that increasing the sample size is needed to constrain the large uncertainty in GMST that falls out of our analysis.Using a preindustrial value of 14°C (Hansen et al., 2013), Figure 4a

Warming Patterns
Starting with the zonal-mean changes in surface temperature (Figures S2 and S3), across the entire ensemble the magnitude of tropical warming spans ∼0°C-8°C, while high-latitude warming (latitude > 60°N and °S) spans ∼0°C-38°C in the SH and ∼0°C-18°C in the NH.While there is substantial spread in its magnitude, polar amplified warming is a robust feature across the ensemble (Figure 5).The wide range of warming responses seen in the SH reflects the range of boundary conditions prescribed for the Antarctic Ice Sheet given the uncertainty in its evolution across the Miocene.Interestingly, once the zonal-mean responses are stratified by prescribed CO 2 concentration, there does not appear to be a consistent mapping between the degree of high-latitude warming and tropical warming, with the degree of polar amplification ranging substantially across the experiments (e.g., when looking at the 560 ppm experiments in Figure S3 widely different high latitude warming due to varying ice sheet extents has a limited impact on tropical temperatures).This result points to a somewhat limited impact of high-latitude glacial changes on the tropics relative to the more systematic influence of CO 2 on tropical temperature (Figure 5).This feature appears to hold not only across the different models, but within a given model, namely HadCM3L, for which experiments with a range of Antarctic ice-sheets display considerable variability in high-latitude surface temperature but limited variability in tropical temperature (Figure S3).
Stratifying the energy balance analysis by prescribed CO 2 concentration we see that a stronger polar greenhouse effect (due to CO 2 , water vapor and lapse rate) and decreased surface albedo, are the dominant contributors to the polar amplified warming-a result consistent with the literature on future climate change, for example, Pithan and Mauritsen (2014).One caveat with differencing the all-sky and clear-sky to separate the surface and cloud albedo components is that the masking of surface albedo changes by clouds is not adequately deconvolved, particularly in the presence of sea ice.For example, in a polar region where climatologically high cloud cover largely masks the influence of surface albedo, the method will overestimate the contribution of reduced surface albedo given that the cloud albedo is taken as the residual between total albedo and surface albedo.Therefore, the method can incorrectly indicate a cooling signal due to cloud albedo changes even if cloud properties remain similar to the control.Given this caveat, the high-latitude

Paleoceanography and Paleoclimatology
cooling/warming signal seen in Figure 5 due to changes in cloud/surface albedo may not be as strong as the analysis suggests.Regardless, the total albedo change due to the combined effect of surface and cloud albedo changes is a dominant driver of high-latitude warming.Interestingly, across the ensemble, changes in oceanic and atmospheric energy flux convergence tends to warm the southern high-latitudes and cool the northern high latitudes (there are of course exceptions, Figures S2 and S3).Cloud longwave forcing are generally second order.Equatorward of ∼60°N and °S the warming signal is generally more uniform with the greenhouse effect term dominating as a function of CO 2 concentrations.With this uniformity in mind it is not surprising that while the tropics to high-latitude temperature difference generally scales as a function of global warming across the ensemble (Figure 4b), the tropics to mid-latitude temperature difference only displays a weak sensitivity, with the exception of the NorESM (Figure 4c).
Moving beyond the zonal-mean, the two-dimensional annual mean surface temperature changes for each of the Late and Middle to Early Miocene simulations, relative to their preindustrial control simulations, are shown in Figures S4 and S5, respectively.The limited impact of high-latitude glacial changes relative to CO 2 forcing on tropical warming is further illustrated when looking at the two-dimensional warming patterns stratified according to imposed CO 2 concentration (Figure 6).The ensemble mean captures the average BURLS ET AL.  response to a given CO 2 concentration across a range of models and Miocene boundary conditions, and broadly indicates an increase in the degree of polar amplification with CO 2 .When looking at the variability in warming for a given CO 2 range, the influence of the range of boundary conditions prescribed for the Antarctic Ice Sheet is most evident, giving rise to a substantial range in warming responses over the Antarctic continent.Particularly striking is the limited footprint of this variability.Other dominant sources of spread in the surface temperature response for a given CO 2 concentration include differences in continental warm- ing.Hotspots like North Africa are presumably due to variability in vegetation changes, either prescribed or dynamic.While differences in prescribed orography may be more important in regions like Southern Asia.
Variability in ocean warming seen across the MioMIP ensemble, once stratified by CO 2 concentration, is muted relative to the spread in warming signals seen over the continents (Figure 7, note the difference in the standard deviation (STD) color bar range when compared to Figure 6), but still substantial.There are several hotspots of large spread in the response seen across the models.(STD between 2°C and 4°C) than in the Indo-Pacific and eastern Pacific warm pool regions (STD < 2°C).This result indicates that for a given CO 2 forcing there is considerable spread across models in the warming of equatorial upwelling regions relative to the rest of the tropics, and hence in the zonal gradient response.This disagreement in the response of the tropical Pacific to warming across models is not unique to simulating the Miocene.There is considerable disagreement across CMIP models in the historical and projected response of the tropical Pacific (Kociuba & Power, 2015), due to complex and competing atmosphere-ocean interactions operating on a range of timescales (Heede et al., 2020).Other hotspots in the spread in the SST response include the subpolar gyres in the North Atlantic and Pacific basins and the Southern Ocean.
Possible explanations for differing model responses in these regions include differing dynamical responses in wind-driven and thermohaline ocean circulation or regional cloud feedbacks.Alternatively, the model circulation responses might be dynamically similar but there are differences in the extent to which gyre boundaries have shifted.
Speaking specifically to the role of the spread in shortwave cloud feedbacks, Figure S7 confirms that the above-mentioned hotspots of variable SST responses do tend to correspond with hotspots of variable cloud albedo responses -clouds being the primary cause of TOA albedo changes over ocean regions too warm for sea-ice (Figure S7).There are several notable features in the model mean TOA albedo response stratified by CO 2 forcing.First, the north-south dipole signature seen in the eastern equatorial Pacific for all CO 2 concentrations indicates a southward shift in convection and the ITCZ.Similarly, increased albedo in the central Pacific relative to the western Pacific Maritime continent region is indicative of an eastward expansion of tropical convection in the Pacific basin, with a similar westward expansion in the Indian Ocean, a feature consistent with the enhanced ensemble mean SST warming in these regions relative to the western Pacific (Figure 7).Second, the model mean TOA albedo decreases over the subtropical to mid-latitude ocean, more so at higher CO 2 levels (Figure S7).This shortwave cloud feedback mechanism (Figure 5) contributes to weaker midlatitude to tropical surface temperature gradients with higher CO 2 , but as the model-data comparison results in the next section reveal, this does not manifest in enough midlatitude warming to match reconstructed SSTs from these latitudes-with perhaps the exception of the Late Miocene NH at 560 ppm.This in turn has implications for the ability of Miocene simulations to reproduce the reconstructed weakening of the meridional SST gradient between the tropics and the subtropics to mid-latitudes (Figure 4c); and potentially the enhanced warming seen in the eastern equatorial Pacific sites following the ocean tunnel mechanism -whereby the ocean wind-driven subtropical cells communicate subtropical warming signals to equatorial upwelling regions (Erfani & Burls, 2019;Fedorov et al., 2015).Third, Southern Ocean TOA albedo increases poleward of ∼50°S, a feature seen in future projections associated with a southward shift in the storm tracks (Ceppi et al., 2014;Shaw et al., 2016).Finally, reductions in both land and sea ice lead to model mean TOA albedo decreases in both the NH and SH high latitudes, with variability in the prescribed extent of ice sheets leading to the large STDs seen over both Antarctica and Greenland (Figure S7).
Sensitivity tests to differences in astronomical parameters were performed with two models in the MioMIP ensemble (Figure 8).Compared with the sensitivity to other uncertain boundary conditions, these results show important, yet relatively small (comparing Figure 8 with Figure 6), and regional, surface temperature sensitivity to changes in orbit.Simulations with differences in atmospheric CO 2 and Antarctic Ice Sheet extent show an increasing sensitivity to astronomical forcing with decreasing CO 2 and with decreasing ice sheet extent (Figure S6).The sensitivity to CO 2 is likely linked to sea ice extent.There is minimal sea ice in the high atmospheric CO 2 simulations, and with lower atmospheric CO 2 the sea ice extent increases as does the variability in sea ice area between simulations with different astronomical boundary conditions.Similarly, in the absence of an Antarctic Ice Sheet, there is greater variability in the Antarctic land surface albedo and increased sensitivity to changes in orbit.It is worth noting that orbital variability during the Miocene likely would have been accompanied by some sort of CO 2 variability (as in the Quaternary), and therefore the magnitude of orbitally induced climate change modeled here is a minimum estimate.

Model-Data Comparison
Starting with the Late Miocene, Figures 9 and 10 compare the simulated zonal-mean land and SST profiles for all the experiments targeting the Late Miocene against available proxy data estimates.The experiments are again stratified by CO 2 forcing.Both the terrestrial and SST records indicate a reduced meridional surface temperature gradient (Figures 9 and 10).Generally speaking, subtropical to midlatitude warmth tends to fall into better alignment with the proxy data as CO 2 levels increase.For the Northern Hemisphere, the NorESM-L simulation with 10 Ma boundary conditions and 560 ppm of CO 2 appears to provide a reasonable fit from a zonal mean SST perspective (Figure 10), but there are however large zonal asymmetries such that midlatitude SSTs are too warm in the Pacific basin, yet too cold in the Atlantic (Figures 13e and S9).Although still falling short of the proxy estimate, this experiment simulates one of the strongest relaxations of the meridional temperature gradients (Figure 4) with only a few degrees of tropical warming, between ∼4°C and 8°C of subtropical warming, and significant high latitude warming, exceeding well over 20°C in the Antarctic.It also provides one of the best fits with the terrestrial data (Figure 9).With Late Miocene CO 2 estimates spanning ∼300-450 ppm (Figure 1), one might expect the experiments forced with 350-450 ppm of atmospheric CO 2 to provide the best fit with that data.When looking at the mean SST bias across simulations (Figure 13), the root-mean-square error (RMSE) is lowest for 560 ppm (Figure 13e).That said, there is still the general indication that the deep tropics are warming too much while the mid-to high-latitudes are too cold (Figure 13c).It also appears that the zonal gradient in the equatorial Pacific is generally too strong in the models with not enough warming over the eastern Pacific cold tongue region and too much warming in the western Pacific (Figure 13c).From the perspective of the terrestrial MAT proxies, the 560ppm Late Miocene simulation leads to the lowest RMSE (Figure 13f).
Moving on to the Middle Miocene, both the terrestrial and SST proxies point to a further weakening of the meridional surface temperature gradient with a strikingly flat temperature gradient between the tropics and   (Figures 4c,11 and 12).Increasing CO 2 levels brings extra-tropical warmth into better alignment with the proxy data but the models still fall short (Figures 11 and 12), particularly on the regional scale, for example, cold SST biases persist in the North Atlantic and Southern Ocean even at 850 ppm (Figure 14g).Both the tropical Middle Miocene SST proxies and the tropical Middle Miocene terrestrial proxies indicate that 850 ppm of CO 2 leads to too much tropical warming (Figures 14g and 14h).That said, all the simulations with lower CO 2 concentrations generally suffer from cold biases everywhere outside of the tropics (Figures 14 and S11), supporting the conclusion that the models are generally failing to capture the weakening of meridional surface temperature gradients and the full extent of Middle Miocene warmth seen in the proxy data (Figure 4).This finding holds and intensifies when one narrows the data window to the MCO (Figure S12).

Conclusions
In this study, we have brought together a suite of Miocene climate simulations performed with a range of climate models and Late to Early Miocene boundary conditions.Serving as a first pass of this multi-model ensemble data set, the focus has been placed on surface warming, specifically the spread and sensitivity of warming responses to the range of boundary conditions imposed across the ensemble.Furthermore, a model-data comparison has been performed to assess the fidelity with which the ensemble is capturing large-scale Miocene warming patterns over both land and ocean, and to develop insight into how high CO 2 polar amplified warming • Strong regional warming is seen in response to prescribed changes in the Antarctic Ice Sheet but the influence beyond the Southern Hemisphere high latitudes is limited • The degree of polar amplified warming and weakening of the meridional temperature gradient increases with prescribed CO 2 forcing, but generally falls short of the reconstructed meridional gradient weakening seen in the proxies • "Hot spots" in which the SST response within a narrow range of CO 2 forcings is particularly variable across models including the eastern equatorial upwelling regions, the subpolar gyres, and the Southern Ocean, and appear to be linked to the spread in cloud albedo and ocean responses in these regions • Simulations with perturbed orbital parameters show important, yet relatively small and regional, surface temperature sensitivity with an increased sensitivity to orbital changes as CO 2 concentrations, or the extent of the Antarctic Ice Sheet, decrease The extent to which the Miocene surface temperatures within the ensemble align with available terrestrial and SST reconstructions have been assessed.Broadly speaking, the low meridional temperature gradient indicated by the proxies, specifically between the mid-latitudes and tropics, is difficult to reconcile with the simulations.Under Miocene-like CO 2 forcing (the 350-450 ppm experiments), the modeled climates still retain relatively large equator-to-pole temperature gradients, especially in the Northern Hemisphere.The models do not exhibit the estimated 5°C-10℃ warming of the high latitudes with these CO 2 levels.In practice, this means that to produce realistic warming of the northern high-latitudes, the models require higher CO 2 .Increasing CO 2 forcing leads to enhanced warming bringing the midlatitudes into better alignments with the data, but this in turn leads to overshooting in the tropics that become too warm.This pattern is surprisingly robust across both the Late Miocene, Middle Miocene, and MCO data sets, as well as the independent terrestrial and ocean data sets.Achieving the low meridional temperature gradient recorded in the Miocene thus remains an outstanding problem for most models.
That said, the latest generation of climate models with state-of-the-art parameterizations of cloud microphysics and cloud-aerosol interactions tend to support more positive cloud radiative feedbacks of the midlatitude oceans (Lunt et al., 2020;Zhu et al., 2019;Zhu & Poulsen, 2019).In addition, changes in aerosols may also be important as seen for the Eocene (Lunt et al., 2020) and Pliocene (Feng et al., 2019;Sagoo & Storelvmo, 2017).These model improvements may lead to enhanced midlatitude warming at lower CO 2 concentrations, but how they will then fare at simulating tropical Miocene warmth and the reduced meridional temperature gradient remains to be evaluated.More idealized sensitivity studies have revealed that both the meridional and the zonal equatorial Pacific SST gradients scale with the meridional gradient in cloud radiative forcing (specifically cloud albedo, Burls & Fedorov, 2014a, 2014b;Erfani & Burls, 2019), and provide a plausible mechanism for maintaining Early Pliocene SST patterns (Burls & Fedorov, 2014a, 2014b;Fedorov et al., 2015) and the hydrological cycle response (Burls & Fedorov, 2017).

Paleoceanography and Paleoclimatology
The model ensemble we have used here was an "post-hoc" model intercomparison using different boundary conditions and forcings for each model.Although we can still group the models into two timeframes (Early Middle and Late Miocene) and into different CO 2 forcing ranges, it is impossible to cleanly assess how much of the inter-model spread can be attributed to differences in boundary conditions versus model feedback strengths.In particular, even if we could assign one simulation as a good match with the proxies, the reasons for this better match would not necessarily be clear.Therefore, to more rigorously test the models' ability to simulate Miocene warmth, meridional gradients, and other features such as ocean circulation, cloud feedbacks, land surface behavior, and the hydrological cycle, carrying out a formal MIP is desirable.
The current study can guide a discussion on the careful design of the boundary conditions and model configurations to be used in such a formal model intercomparison.
I. In terms of paleogeography, the modeling groups participating in a formal MIP should use the same configuration for each model, and specific time slices (e.g., the MCO) should be chosen.II.A common set of prescribed ice sheet boundary conditions that sample the uncertainty in reconstructed estimate should be used.III.Sensitivity studies isolating the impact of gateway changes can then be encouraged by providing alternative continental geometries exploring different depths and widths of gateways based on minimum/ maximum estimates from plate tectonic reconstructions.Key gateways to be investigated include the Central American Seaway, the Tethys Seaway, the Indonesian Seaway, the Fram Strait as well as the Greenland-Scotland Ridge and Canadian archipelago.More detailed discussions about these gateway conditions can be found in He et al. (this issue) and Brierley and Fedorov (2016).IV.Models should be grouped into those with static or dynamic vegetation, and all models with static vegetation should use the same static vegetation distributions.Using a dynamic vegetation model is optimal as this allows for more direct comparison with vegetation cover indicated by the fossil flora record, for example, Henrot et al. (2017).Care needs to be taken to ensure that plant functional types are appropriate for the Miocene.V. Ideally, all models would treat aerosols in the same way, but if this is not feasible, differences should at least be well-documented (e.g., the PlioMIP2 protocol, Haywood et al. [2016], and DeepMIP protocol, Lunt et al. [2017]).VI.Finally, simulation should be run with a set of CO 2 concentrations that span the spread of uncertainty within Miocene CO 2 reconstructions.With the "best estimate" CO 2 concentration specified for when modeling groups can only carry out one simulation.
While a coordinated Miocene model intercomparison effort is an aspiration within the Miocene community, the analysis of the informal multi-model ensemble presented here has focused only on annual mean surface temperature and albedo responses and there remain several other aspects within this valuable data set for investigation in future study, for example, the hydrological cycle, subsurface ocean temperature, and circulation, differences in seasonality, and so on.We hope that the synthesis presented here will serve as a valuable starting point.We plan on describing updates and future community efforts on the DeepMIP Miocene website-https://www.deepmip.org/deepmip-miocene/.

Data Availability Statement
The new synthesis of terrestrial MATs generated here for the Middle Miocene can be found in the supporting information of this study (additional excel file Table S1).As described in Section 3 of the study, the Late, Middle and MCO SST estimates where compiled using published data sets, a summary of these data sets with links to their repositories has been complied in the supporting information, together with the average estimates derived from each data set for each time period (additional excel file Table S2).This effort to synthesis available Miocene SST will be continually updated as part of the Miocene Temperature Portal (https://bolin.su.se/data/miocene-temperature-portal) hosted at Stockholm University.These Tables S1 and S2 excel files have been archived in the Zenodo repository https://doi.org/10.5281/zenodo.4568897together with NetCDF files containing the updated Middle Miocene paleogeography and land surface data sets described in supporting information, and a NetCDF file containing the MioMIP variables used to make all the figures shown.The code used to make the figures is available on GitHub (https://github.com/nburls/MioMIP1).A description of each model's setup is provided in Section 3 of this study, together with the relevant configuration information and reference to the original modeling study in Tables 1 and 2.

Figure 2 .
Figure 2. (a) The Middle and Late Miocene sea surface temperature reconstructions synthesized and used within our model-data comparison.The cross (plus) symbols indicate the average Middle (Late) Miocene site location, while filled circles indicate the modern site location.The color-coding of each circle indicates the type of proxy record available for each site.The contoured sea surface temperature field shown is derived from the multimodel mean of all of the simulations listed in Table2with Early Middle Miocene boundary conditions.(b) The Middle Miocene paleogeography is described and made available in this study(updated Herold, Huber, and Müller [2011], see supporting information).The red boxes highlight prominent paleogeographic features that evolved over the Miocene and into the Pliocene, namely: the Panama Gateway, Bering Strait, Barents Sea Landmass, Indonesian Seaway, Tethys Seaway, Greenland-Scotland Ridge, and the Canadian Archipelago.See Table4which describes the key characteristics associated with the various baseline paleogeographic forcings used across the MioMIP simulations.

Figure 3 .
Figure 3. (Top row) Average zonal mean surface temperature estimate (red line), ± one standard deviation (red shading), computed from proxy MAT (green) and SST (purple) data for (a) the Late Miocene, (b) Middle Miocene, and (c) MCO.(Bottom row) Average zonal mean surface temperature estimate calculated using only the SST (purple) data for (d) the Late Miocene, (e) Middle Miocene, and (f) MCO.Latitude incoming shortwave radiation at the top of the atmosphere,  the planetary albedo defined as 

Figure 4 .
Figure 4. (a) The global mean surface temperature change (relative to the preindustrial control) versus prescribed atmospheric CO 2 concentration.The green, gray, and red shading indicate the Late Miocene, Middle Miocene, and MCO global mean surface temperature change estimates derived from the mean annual terrestrial and sea surface temperature reconstructions (Figure 3, top row), while the crosses show the equivalent global mean surface temperature change estimates based only on the sea surface temperature proxies (Figure 3, bottom row).(b and c) Global mean surface temperature versus (b) the tropics (30°S-30°N) to high latitude (60-90°N and °S) temperature difference and (c) the tropics (30°S-30°N) to middle latitude (30-60°N and °S) temperature difference.The green, gray, and red crosses indicate the Late Miocene, Middle Miocene, and MCO estimates derived from the mean annual terrestrial and sea surface temperature reconstructions (Figure 3, top row).
compares these Miocene GMST estimates with the simulated GMST changes.For the Middle Miocene and MCO, with the exception of the COSMOS Late Miocene simulation with 450 ppm CO 2 , only Middle to Early Miocene simulations with 840-850 ppm CO 2 intersect with the GMST change estimates based on both terrestrial MAT and SST.All the simulations fall below the higher SST-only Middle Miocene and MCO GMST change estimates.Of the Late Miocene simulations, the COSMOS simulation with 450 ppm and 25% of the modern Antarctic ice sheet height, and the NorESM-L simulation with 560 ppm CO 2 , intercept the reconstructed Late Miocene GMST change estimate based on both terrestrial MAT and SST.None of the Late Miocene simulations intercept the reconstructed Late Miocene GMST change estimate based only on SST.

Figure 5 .
Figure 5.The multi-model-mean of the energy balance analysis for the set of Late Miocene experiments with (a) 278 or 280 ppm, (b) 350-450 ppm, and (c) 560 ppm CO 2 , and the set of Middle-Early Miocene experiments with (d) 278 or 280 ppm, (e) 350-450 ppm, (f) 560 ppm, and (g) 840/850 ppm CO 2 .Note that the variables required for this analysis were not available for the HadCM3L Tortonia, Messinian, and Langhian experiments, see Figures S2 and S3 for the individual experiments.

Figure 6 .
Figure 6.(Left) The multi-model-mean of the annual mean surface temperature change relative to the preindustrial control for the set of experiments with (a) 278 or 280 ppm, (c) 350-450 ppm, (e) 560 ppm, and (g) 840/850 ppm CO 2 .(Right) The standard deviation of this temperature change across the experiments making up each set.The continental outline from the updated Herold, Huber, and Müller (2011) paleogeographic boundary conditions described in this study has been used (see supporting information).
First, the spread across models in the degree of warming experienced in the Pacific cold tongue region (typically the central to south-eastern equatorial Pacific, but depending on model biases, this can extend well into the western Pacific) is larger BURLS ET AL.

Figure 7 .
Figure 7. (Left) The multi-model-mean of the annual mean sea surface temperature change relative to the preindustrial control for the set of experiments with (a) 278 or 280 ppm, (c) 350-450 ppm, (e) 560 ppm, and (g) 840/850 ppm CO 2 .(right) The standard deviation in this temperature change across the experiments making up each set.The continental outline from the updated Herold, Huber, and Müller (2011) paleogeographic boundary conditions described in this study has been used (see supporting information).

Figure 8 .
Figure 8. (Left) The multi-model-mean of the annual mean surface temperature change relative to the preindustrial control for the Orbital sensitivity simulations outlined in Table 3 (a) HadCM3L, (c) Genesis (Table 3, only the simulations that have 280 ppm CO 2 ).(right) The standard deviation in this temperature change across the experiments making up each set of (b) HadCM3L, (d) Genesis 280 ppm CO 2 experiments.Note.That the same color bar as in Figure 5 has been used to facilitate comparison.See Figure S6 for a version of the standard deviation plots that has a colorbar that ranges from 1°C to 10°C.

Figure 9 .Figure 10 .
Figure 9. Zonal-mean, annual-mean land surface temperature for all Late Miocene experiments.Superimposed in gray are Late Miocene in age mean annual terrestrial temperature reconstructions.Each panel distills the Late Miocene experiments by prescribed atmospheric CO 2 concentration.

Figure 12 .
Figure 12.Zonal-mean, annual-mean, sea surface temperature for all Middle and Early Miocene experiments.Superimposed in black are Middle Miocene in age mean annual sea surface temperature reconstructions.Each panel distills the Middle to Early Miocene experiments by prescribed atmospheric CO 2 concentration.

Figure 13 .
Figure 13.Late Miocene multi-model-mean (left) sea surface, (right) terrestrial mean annual temperature bias.The average of individual model bias at each site (Figures S8 and S9) is calculated across the set of Late Miocene experiments with (a and b) 278 or 280 ppm, (c and d) 350-450 ppm, (e and f) 560 ppm, and (g and h) 840/850 ppm.The root-mean-square error across all the sites is shown in red on the bottom left.The continental outline from COSMOS T31 (Knorr et al., 2011) has been used.

Figure 14 .
Figure 14.Middle Miocene multi-model-mean (left) sea surface, (right) terrestrial mean annual temperature bias.The average of individual model bias at each site (Figures S10 and S11) is calculated across the set of Middle to Early Miocene experiments with (a and b) 278 or 280 ppm,(c and d) 350-450 ppm, (e and f) 560 ppm, and (g and h) 840/850 ppm.The root-mean-square error across all the sites is shown in red on the bottom left.The continental outline from the updatedHerold, Huber, and Müller (2011)  paleogeographic boundary conditions described in this article has been used (see supporting information).
, we have surveyed the existing modeling experiments and sorted them into two groups based on the Miocene time period they have targeted: (1) Late Miocene experiments with a paleogeography that falls within 11.6-5.33Ma (Tortonian and Messinian) and (2) Middle to Early Miocene experiments with paleogeography that falls within 20-11.6 Ma (Burdigalian, Langhian, and Serravallian).Unlike a formal model intercomparison where the same boundary conditions are used in each model, it is difficult to isolate the model dependence of the climate response to specific forcing (e.g., CO 2 vs. paleogeography).Nevertheless, the informal model intercomparison presented in this paper serves to provide a "multi-model-ensemble of opportunity" view of the spread in model responses to a range of Miocene boundary conditions and CO 2 forcings, within a single model-data comparison framework; an approach that has previously been used in deep-time paleoclimate modeling

Table 1 Late
All Miocene experiments use Middle MiocenePaleoceanography and PaleoclimatologyBURLS ET AL.Miocene Experiments With Paleogeography That Falls

Table 3
Orbital Parameter Sensitivity Experiments Markwick (2007) 2017) Antarctica.Simulations were run for 50 years with 30 years used for analysis.BURLS ET AL.HadCM3L-Bradshaw: These simulations use the HadCM3LB-M2.1aDandHadCM3LB-M2.1aE(Valdesetal., 2017)coupled atmosphere-ocean general circulation models with an interactive vegetation scheme.The late Miocene simulations (HadCM3LB-M2.1aD-vegetationmodelwitha 10-day coupling period) were run for 2,100 years and the middle Miocene simulations (HadCM3LB-M2.1aE-vegetationmodelwitha 5-year coupling period) continue on from the late Miocene simulations and have been run for a further 2,000 years.The last 50 years of the simulations are used for analyses.Both the late Miocene and the middle Miocene configurations use paleogeographies fromMarkwick (2007).

Table 4
Aumont et al., 2015)les for ocean dynamics (OPA8.2),biochemistry(PISCES,Aumontet al., 2015)and sea-ice (LIM2).Description of Key Characteristics Associated With the Various Baseline Paleogeographic Forcing Used (Mosbrugger & Utescher, 1997;s, to provide Late Miocene terrestrial Mean Annual Temperature (MAT) estimates.A new synthesis of global terrestrial MATs has been generated for the Middle Miocene through the updating and expansion of thePound et al. (2012)andGoldner et al. (2014)syntheses.Most estimates are from fossil plant data and use either the Co-existence Approach, CLAMP, or Leaf Margin Analysis.The Co-existence Approach uses the modern climatic range of a fossil taxons nearest living relative to determine the climatic envelope of co-existence for a fossil assemblage(Mosbrugger & Utescher, 1997; Utescher Hollis et al. (2019)2016) U k 37 synthesis was used as a starting point.Both the previously published and new records within this data set were recalibrated/calibrated by Herbert et al. (2016) using the Müller et al. (1998) calibration.Any additional Miocene SST reconstructions not included in Herbert et al. (2016) were then added, without any recalibration or age model adjustments (i.e., the original calibrations and age models were used).A community effort to place all available records within a common calibration and age model framework, similar to the recent Eocene effort ofHollis et al. (2019), is planned, for which the summary provided in TableS2can serve as a starting point together with new records published as part of this special issue.