Assessing the impact of physicochemical parameters in the 2 predictive capabilities of thermodynamics-based stoichiometric 3 approaches under mesophilic and thermophilic conditions 4 5

2 predictive capabilities of thermodynamics-based stoichiometric 3 approaches under mesophilic and thermophilic conditions 4 5 Claudio Tomi-Andrino1,2,3, Rupert Norman2, Thomas Millat2, Philippe Soucaille2,4,5,6, 6 Klaus Winzer2, David A. Barrett1, John King3, Dong-Hyun Kim1 7 8 1Centre for Analytical Bioscience, Advanced Materials and Healthcare Technology Division, 9 School of Pharmacy, University of Nottingham, Nottingham, United Kingdom. 10 2Nottingham BBSRC/EPSRC Synthetic Biology Research Centre (SBRC), School of Life 11 Sciences, BioDiscovery Institute, University of Nottingham, Nottingham, United Kingdom. 12 3Nottingham BBSRC/EPSRC Synthetic Biology Research Centre (SBRC), School of 13 Mathematical Sciences, University of Nottingham, Nottingham, United Kingdom. 14 4INSA, UPS, INP, Toulouse Biotechnology Institute, (TBI), Université de Toulouse, Toulouse, 15 France. 16 5INRA, UMR792, Toulouse, France. 17 6CNRS, UMR5504, Toulouse, France.

This systematic framework exploits information regarding the metabolic state, which 77 comprises the metabolome (set of low-molecular-weight metabolites (<1.5 kDa)) and the 78 fluxome (or metabolic activity, distribution of rates of conversion/transport in the metabolic 79 network) (5, 6). Kinetic modelling can yield metabolic fluxes from metabolomics data, but lack 80 of high-quality enzymatic parameters and computational limitations (e.g. time-consuming 81 processes) hinder its application (7-9). As an alternative, stoichiometric modelling provides a  (Table 1). 90 FBA solves the underdetermined system represented in Eq. 1 by maximising or ( 13 C-MFA) (11). Since this approach requires fewer assumptions and uses more experimental 101 information than FBA, 13 C-MFA is considered to be the gold standard in fluxomics (15).

104
The set of constraints characterising stoichiometric modelling approaches (Eq. 1) is 105 insufficient to guarantee thermodynamically feasible results in the flux solution space (17, 18).

106
Both FBA and 13 C-MFA assume most reactions to be reversible (11,19): in the first case 107 directionalities are dictated by the optimal flux distribution (which depends on the a priori 108 chosen objective function (12)), whereas in 13 C-MFA they are determined by the MIDs (20).
(18, 38) (39) 173 It is important to note that for E. coli the same strain was used for both the GSM and the 13 C-MFA, whereas for 174 T. thermophilus strain HB27 was used for constructing the GSM, and HB8 for the 13 C-MFA. The E. coli cells 175 were grown in glucose-limited chemostats, whereas batch culture was used for T. thermophilus instead. GAM, 176 growth-associated maintenance; NGAM, non-growth-associated maintenance; Y X/S , biomass yield.

178
TFA required a higher glucose uptake rate than the experimental one (S1 Appendix), 179 which provoked a difference between predicted and experimental growth rate (which is equal both bacteria (S1 Appendix). Using the default constraints from the metabolic networks also 184 allowed comparing the results with previously published ones.

185
In order to achieve compatibility with the COBRA toolbox (46) and matTFA (18), some 186 changes were applied to GSM iTT548: (i) the names of the metabolites were adapted to the 187 convention used in matTFA and associated to metSEED_IDs to enable access to the 188 thermodynamics database in matTFA (S1 Dataset) (18), and (ii) the fields CompartmentData, 189 metCompSymbol and rev were created in the model.  (Table 1). To address this potential deficiency, a modified matTFA was created (mod-matTFA) 194 as described below (Table 3). For reproducibility (47), the complete list of files used in this 195 study was collected in S3 Table. 196 Table 3. Parameters considered in mod-matTFA.

208
The parameter B was assumed to be constant, with a value of 1.6 mol -1/2 L 1/2 (25, 29). (3) Both formulas include terms correcting the pH and I, where is the number of ( ) 212 hydrogen atoms in species , R is the gas constant, T is the absolute temperature and refers 213 to the charge of the species (29). Applying the Gibbs-Helmholtz equation would be necessary 214 to account for temperature different from standard conditions, i.e. 25 , but the lack of ℃ 215 measured changes in enthalpy ( ) for all the metabolites prevents from doing so (48). Hence, ∆ 216 variations from 25 to 37 or to 72 were assumed to be small, as shown elsewhere (49).
The parameter A is normally assumed to be constant (25) or calculated using a 218 temperature-dependent function (Eq. 5) (18, 24), and the impact of using a 219 temperature/salinity-dependent function (Eq. 6) (48) was also tested in this study ( Fig. 1).

229
In general, consistency in units between parameters A (mol -1/2 kg 1/2 ) and B (mol -1/2 L 1/2 ) 230 is achieved by assuming 1 kg = 1 L. In this study, an expression for seawater (Eq. 7) (52) was for T. thermophilus) were used (Table 3).  (Table 3). Constraints regarding substrate uptake 239 rate, specific growth rate and energetic requirements were applied as explained in S1 Appendix, 240 and maximisation of Y X/S was selected as objective function. It is important to note that lower Kendall's W statistics was performed (S3 Table), where a value of 0 means no agreement of 250 ranking position with respect to each criterion, and a value of 1 indicates total agreement. In

270
In this study three questions were addressed: (i) how good available thermodynamic-based are.

275
To tackle these problems, the published matTFA (18) toolbox was modified as shown 276 in S3 Table to include more parameters and a broader range of parameters (Table 3). Two  parameters with 2 levels each were tested (Table 3), yielding 64 runs (Fig. 3). to identify the run with the best predictive capability at both levels, a joint ranking was 317 performed (Table 5).

362
The results for both FBA and TFA showed consistency between runs, with r ≈ 0.6 and 363 r ≈ 0.9 respectively, using a equivalent to 110% of an experimental value 364 (S1 Appendix). Even though the specific growth rate was constrained in the interval  (Fig. 2) (Fig. 3). Thus, we believe that eQuilibrator has proven to be ideal for small metabolic networks 396 or parts of pathways, whereas TFA-based approaches should be used when analysing GSM. In 397 this sense, differences in the problem definition (Table 1) should be further studied to identify 398 potential strategies allowing to improve TFA-based approaches.

Flux pattern changes between in vivo and in silico fluxes in the central carbon metabolism 400
In order to evaluate changes in reaction directionalities, the available in vivo fluxes were tested 401 against their equivalents in the simulated TFA flux distributions (S1-S2 Tables). Overall, the 402 'anaplerotic node' (Fig. 5) is particularly affected. For E. coli, changes in the flux pattern were 403 found for 12/40 of the central carbon metabolism reactions from 13 C-MFA (Table 6) (Table 7). used for 13 C-MFA, and orange when they are exclusively considered in the GSM. In the latter case no mapping 412 was possible (S1-S2 Tables).

413
Discrepancies in flux pattern between methods are caused by both differences in the 414 structure of the metabolic networks and the way the problem is defined (  (Fig. 5), and finally ICL and MALS (from isocitrate to malate, via glyoxylate). In contrast, the 419 metabolic network used for the 13 C-MFA did not consider PPCK and PPS (S1 Table), which 420 could have affected the determination of fluxes to/from phosphoenolpyruvate. Since 13 C-MFA 421 is based on lumped reaction, branched pathways are not taken into account (11). Thus, having 422 a smaller range of alternative pathways than FBA/TFA may affect the estimation of flux values.
423 Table 6. Flux pattern changes between 13 C-MFA data and matTFA predictions in E. coli.

Reaction
Where +, flux in the forward direction; -, flux in the reverse direction; 0, no flux. Corrected direction, refers to 425 the adjustments due to differences in the definition of the reaction between 13 C-MFA and GSM (S1 Table). For and pyruvate), R660 (between pyruvate and malate), and finally R425 and R420 (between 445 isocitrate to malate, via glyoxylate). In this case the PEP-carboxykinase activity (R014) was 446 included in the metabolic network for 13 C-MFA (S2 Table). As for E. coli, this reaction carried 447 no flux in the TFA (Table 7), and the pool of malate was also affected. Regarding the glyoxylate Where +, flux in the forward direction; -, flux in the reverse direction; 0, no flux. Corrected direction, refers to 454 the adjustments due to differences in the definition of the reaction between 13 C-MFA and GSM (S2 Table). The 455 directionality for R722 is the same: both the definition and the sign are opposed. *Glucose-6-P (G6P) is used 456 instead of glucose (glc-D) due to an incongruence between the metabolic networks (S2 Table). distributions. We hypothesise that this was due to the proven robustness of metabolic fluxes in 505 these pathways against changes in the metabolic state, as previously noted (32, 33).

506
Regarding the metabolomics level, our modified matTFA showed that widening the