Normalization in LCA: how to ensure consistency?



Overview
In this discussion paper, we focus on a topic that seems insufficiently addressed in Life Cycle Assessment (LCA): normalization (Andreas et al. 2020). More precisely, we revisit the need for the normalization step in LCA and the inconsistencies this step can cause. We highlight the importance of using the same data source for the system under study and for the normalization references. While this is possible with conventional normalization for Economic Input-Output LCA (EIO-LCA), it remains a challenge for process-based LCA. We show how to overcome this limitation and discuss how normalization by the geometric mean of the inventory database (Hélias et al. 2020) ensures consistency.

Context
In recent decades, the interconnections between people and nature have been increasingly highlighted.
Many concepts and approaches have been developed to address these relationships, such as the "One Health" concept, which brings together humans, animals, and their environments (Gibbs 2014); the seventeen Sustainable Development Goals for a single planet (UNEP 2020); global biodiversity affected by five drivers (Millennium Ecosystem Assessment 2005); the Product Environmental Footprint, which displays sixteen impacts (Fazio et al. 2018); or the three areas of protection (Verones et al. 2017). Preserving the environment as a whole does not escape the recurring problem of most decision-making processes: multiple criteria.
Arguing the relevance of impacts on a case-by-case basis, contextualizing them, and putting them into perspective with regard to the options available for the system under study are all attempts to avoid multi-attribute decision-making (MADM). Nevertheless, a generic, structured, repeatable, automatable, and objective decision-making process is expected. In the field of LCA, this justifies the normalization and weighting steps. The weight represents the importance of a criterion, and its design is a research topic in itself, but we are interested here in normalization.

Meaning of the normalization
Normalization in the LCA framework is the division of the impact computed for the system under study by the impact of a reference value, named the normalization reference. LCA normalization has been identified as a leading driver in the aggregation process, with strong consequences on the results (Myllyviita et al. 2014; Prado et al. 2019; Muhl et al. 2021).
Normalization has three main purposes (Pizzol et al. 2017): (1) to compare the results in order to check plausibility, (2) to facilitate communication, and (3) to free the weighting step from unit constraints, with the impacts expressed on a common scale. This last point is the focus of our attention here.
In LCA, normalization is also twofold (Laurent and Hauschild 2015): (1) internal, if the normalization reference is defined on the basis of the case studies, as when the result of a system "A" is expressed as a proportion of the result of a system "B"; and (2) external, if an independent reference is used, as when the results of systems "A" and "B" are expressed as a proportion of the average citizen's yearly impact.
An internal normalization ensures, by definition, consistency between the system under study and the reference. However, it remains context-dependent, which runs contrary to the objectives of LCA, and it cannot be used with generic weighting (Norris 2001). An external normalization makes the aggregation process reproducible and independent of the study. However, two modelling processes are then used, one for the system and the other for the reference. This raises questions about the representativeness of, and the consistency between, the two representations (Heijungs et al. 2007).
The normalization task (sometimes named standardization in some scientific communities) is not unique to LCA and is encountered by anyone who deals with data processing. For decision-making, most (though not all) MADM models require a normalization step. The choice of the normalization method is often part of the MADM process, and it affects the result: a different normalization method will lead to a different result.
There is no consensus on this step in the MADM community, and Jahan and Edwards (2015) reviewed 31 methods (sum, norm, max-min, Z-transformation…). All of them are based on the systems under study and, from an LCA viewpoint, they are internal normalizations. LCA was initially defined outside the MADM framework (Pizzol et al. 2017), and the choice was made to mainly use external normalization: as a result, the aggregation process is not determined by the specifics of the study, contrary to most MADM methods. This is undoubtedly an advantage for the genericity, repeatability, and independence of the process.
Global external normalizations are now recommended in the guidelines (Verones et al. 2017). This is the most relevant system of reference: in our globalized economy, any functional unit involves processes across all continents, and the problem has to be taken as a whole, using a normalization reference encompassing the entire world.

Consistency issue
LCA is a modelling exercise: for a functional unit, we draw up the associated impacts. This process is built with rules, and one of the most important is additivity: the impact of two functional units is the sum of the impacts of each. In this sense, external normalization means dividing the impact of the product or service (corresponding to the functional unit of the study) by the sum of the impacts of all products and services on a global scale (the reference). To ensure consistency in the modelling work, the way in which the products and services are represented should be the same. As an illustration, Figure 1 offers a simplified schematic view, with the system under study being an apple and the reference for normalization being the world. From real systems, the modelling processes provide representations, here blurred or partial. For the normalization, having a blur/blur or partial/partial ratio is preferable to a blur/partial or partial/blur one, whatever the soundness of the modelling.

Figure 1 about here
The raw data and their processing must be identical both for the system under study and for the normalization references (unless it can be proved that there is no bias or that the bias is identical). In other words, an LCA is a model of a part of the world and the reference is a model of the world.
Models and choices must be consistent.

Economic Input-Output LCA
An EIO-LCA uses aggregated sectoral data and their interdependencies. Fractions of the impacts associated with the sectors of the economy are then attributed to the system under study with Leontief's well-known equation. With this top-down approach, the same model covers the world (the totality of all economic sectors) and the system under study (parts of economic sectors). Consistency in the normalization process can thus be ensured (provided the reference value is calculated from the EIO database). Unfortunately, process-based LCA differs.
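To make this concrete, here is a minimal sketch of the top-down logic, with a toy two-sector economy (the matrices A and B and the demand vectors are illustrative values, not taken from any real EIO database): since the system and the reference share the same model, the normalization is consistent by construction.

```python
import numpy as np

# Toy EIO-LCA sketch. A: inter-sector technology coefficients,
# B: impact intensities per unit of sector output (illustrative values).
A = np.array([[0.1, 0.2],
              [0.3, 0.1]])
B = np.array([[2.0, 5.0],    # impact category 1
              [0.4, 1.1]])   # impact category 2

L = np.linalg.inv(np.eye(2) - A)  # Leontief inverse: total output x = L @ y

def impacts(final_demand):
    """Total impacts generated by a given final demand vector."""
    return B @ (L @ final_demand)

y_system = np.array([1.0, 0.0])    # final demand of the system under study
y_world = np.array([120.0, 80.0])  # final demand of the whole economy

# Same A and B for numerator and denominator: a consistent normalization.
print(impacts(y_system) / impacts(y_world))
```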

Process-based LCA
A process-based LCA is more like a Lego® construction. LCA practitioners take "bricks" from a Life Cycle Inventory (LCI) database and assemble them to build the system under study. Sometimes they create a "brick", but this is far from being their primary activity. For each type of "brick" required (a component or a process, called a dataset by LCI database providers), the practitioners take a certain number of them (i.e. the final demand). With this bottom-up approach, we do not have the information to shape the world as a whole.
All the datasets in the LCI database describe what constitutes the world, but the quantity of each is missing. As an example, the dataset "market for apple | apple (GLO)" from ecoinvent (Wernet et al. 2016) gives us the average impact of an apple at the world level, but it does not tell us how many apples are produced in all the orchards of the world.
With only a process-based LCA database, we cannot calculate the LCA of human activity at the global level. The consistency of the normalization therefore cannot be guaranteed.

Normal versus log-normal laws
In LCA, we normalize by the whole impact (or by a proportion of it when reduced to one citizen, but the reasoning remains the same). This is undeniably a relevant solution, the result being a fraction (studied system divided by the whole) that is easy to interpret and communicate. Dividing by the sum of the elements is one of the normalization methods identified by Jahan and Edwards (2015), but there are many others.
In data processing, looking at the distribution law of the data in order to select the descriptors is always relevant.
The best known is the normal distribution, described by the arithmetic mean and the standard deviation.
When, in LCA, we divide by the totality of the impact as normalization, we divide, up to a factor (the number of elements that make up the whole), by the arithmetic mean. Implicitly, this says that we consider the arithmetic mean to be a good descriptor of the impacts, that we are in a "normal" world.
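To make the "up to a factor" explicit, for a system result $s$ and $n$ elements with impacts $x_1, \dots, x_n$:

$$\frac{s}{\sum_{i=1}^{n} x_i} = \frac{1}{n} \cdot \frac{s}{\bar{x}} \qquad \text{with} \qquad \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$

so normalizing by the total is, up to the constant $n$, normalizing by the arithmetic mean.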
But the LCA practitioner is not working in a "normal" world. Uncertainty calculations are generally done with log-normal distribution laws. In an LCI database, the distribution of substance emissions commonly follows a log-normal law (Qin and Suh 2017). Some LCA results are interpreted in orders of magnitude (such as toxicities (Frischknecht and Jolliet 2019)) and therefore on a logarithmic scale. An LCA is thus a combination of elements following log-normal distributions. While no closed-form expression has been found for a sum of log-normal variables, there is general agreement that it is well approximated by a log-normal distribution (Beaulieu et al. 1995). Although the log-normal law is less intuitive than the normal law, this distribution frequently occurs in nature (abundance of species, concentrations of minerals in the earth's crust, concentrations of pollutants in the atmosphere, dose-response curves, etc.) (Limpert et al. 2001). We can reasonably assume that the impacts of the set of products and services follow log-normal distributions.
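A quick Monte Carlo sanity check of this approximation (with arbitrary parameters; nothing here comes from an actual LCI database):

```python
import numpy as np

rng = np.random.default_rng(42)

# Sum a few independent log-normal variables and test whether log(sum)
# looks normal, i.e. whether the sum itself is close to log-normal.
n_vars, n_samples = 5, 100_000
samples = rng.lognormal(mean=0.0, sigma=1.0, size=(n_vars, n_samples)).sum(axis=0)

log_sum = np.log(samples)
skew = np.mean((log_sum - log_sum.mean()) ** 3) / log_sum.std() ** 3
print(f"skewness of log(sum): {skew:.3f}")  # close to 0: log-normal is a fair fit
```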

Geometric mean as the normalization reference
This leads to thinking about normalization in a different way. The log-normal distribution is described by the geometric mean, not by the arithmetic one. The geometric mean is more robust to extreme values but, more importantly, it results from a multiplication, while the arithmetic mean is constructed from a sum. It therefore becomes possible to factorize and thus simplify the result.
As an illustration, consider a reference system resulting from $n$ processes ($i \in \{1, \dots, n\}$) with $m$ impacts ($j \in \{1, \dots, m\}$), let $f_i$ be the final demand of process $i$ and $h_{i,j}$ the impact $j$ of process $i$ ($h_{i,j}$ results from the multiplication of all elementary flows of the process by the associated characterisation factors). The arithmetic mean of each impact category for the reference system is

$$\bar{h}_j = \frac{1}{n} \sum_{i=1}^{n} f_i \, h_{i,j}$$

and the geometric mean is

$$\tilde{h}_j = \left( \prod_{i=1}^{n} f_i \, h_{i,j} \right)^{\frac{1}{n}} = \left( \prod_{i=1}^{n} f_i \right)^{\frac{1}{n}} \left( \prod_{i=1}^{n} h_{i,j} \right)^{\frac{1}{n}}$$

The term $\left( \prod_{i=1}^{n} f_i \right)^{\frac{1}{n}}$ is constant whatever the impact; it can therefore be removed without changing the proportions between the normalized impacts.
With $\left( \prod_{i=1}^{n} h_{i,j} \right)^{\frac{1}{n}}$, the geometric mean of the $h_{i,j}$, instead of $\sum_{i=1}^{n} f_i \, h_{i,j}$, we have a normalization reference value that can be calculated from the LCI database alone, without having to deal with the final demand for the reference. This approach ensures consistency between the modelling of the system under study and the normalization references. It is especially relevant for process-based LCA, but can also be used with EIO-LCA.
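A minimal sketch of this reference, assuming an impact matrix H holding $h_{i,j}$ for every dataset of an LCI database (the values below are purely illustrative):

```python
import numpy as np

def geometric_mean_reference(H):
    """Per-impact-category normalization references as the geometric mean
    of the dataset impacts h_{i,j}; H has shape (n_datasets, m_impacts).
    Computed in log space for numerical stability; no final demand needed."""
    return np.exp(np.log(H).mean(axis=0))

# Illustrative matrix: 3 datasets, 2 impact categories.
H = np.array([[1.0, 200.0],
              [4.0,  50.0],
              [2.0, 100.0]])

references = geometric_mean_reference(H)       # [2.0, 100.0]
system_impacts = np.array([3.0, 120.0])        # impacts of the system under study
print(system_impacts / references)             # normalized impacts
```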

Discussion
Normalizing by the geometric mean of the inventory database raises several considerations that are worthy of discussion. We present below elements of response to several remarks that can be made.

About the geometric mean
First remark: Even if there is no data bias, normalizing with the geometric mean gives a different result than normalizing with the overall impact. Indeed, normalizing by the geometric mean will not lead to the same conclusions as the "conventional" normalization; each normalization choice plays a key role in the aggregation/decision process, as shown in MADM. However, it is worth noting that the results are identical under the assumption of a constant $\bar{h}_j / \tilde{h}_j$ ratio regardless of $j$ (Hélias et al. 2020). They prove the results are highly correlated in some cases, despite data biases that can be explained. More precisely, assuming a log-normal distribution, $\bar{h}_j = \tilde{h}_j \sqrt{1 + CV_j^2}$ (see equations (10) and (11) in Limbrunner et al. (2000)), and the hypothesis of a constant ratio becomes that of a constant coefficient of variation $CV_j$, $\forall j$ (or at least small changes in the coefficients of variation). Note in addition that the variability of the ratio $\sqrt{1 + CV_j^2}$ is reduced by the square root with respect to $CV_j$. It would be interesting to investigate further the values of these ratios with the EIO-LCA databases, where $\bar{h}_j$ and $\tilde{h}_j$ are then computable from the same data for all impact categories.
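For reference, this ratio follows from standard log-normal identities, with $\mu_Y$ and $\sigma_Y$ the mean and standard deviation of the underlying normal law:

$$\bar{h} = e^{\mu_Y + \sigma_Y^2/2}, \qquad \tilde{h} = e^{\mu_Y}, \qquad CV^2 = e^{\sigma_Y^2} - 1 \quad \Rightarrow \quad \frac{\bar{h}}{\tilde{h}} = e^{\sigma_Y^2/2} = \sqrt{1 + CV^2}$$

so assuming the ratio constant across impact categories amounts to assuming a constant coefficient of variation.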
Second remark: The geometric mean does not make the quantities commensurable. Indeed, among the purposes of normalization, the objectives of comparing and communicating the results are not achieved by this normalization. This approach is interesting from the perspective of aggregating impacts. With some practice, reasoning with the geometric mean would be possible for an LCA practitioner, but it would obviously be uncomfortable for communication, the citizen-equivalent remaining the best solution for this purpose.

About the inventory database as data for normalization
Third remark: As the LCI database is used, this is not a true external normalization. In this approach, there is a link between the system studied and the reference, but building both from the same source does not make it an internal normalization. The normalization here is not based on the case under study but on the database used. It therefore goes beyond internal normalization and can be considered an external normalization.
Fourth remark: No LCI database is perfect; some elements are missing and none represents the world as a whole. Over time, LCI databases are improved, but they remain perfectible. Using them to represent the world implies that the world is imperfectly modelled. But it is as imperfectly modelled as the system under study. Consider a system modelled with a database in which infrastructures are not represented, such as the Agri-footprint database (e.g. see Corrado et al. (2018)). Using a reference that includes infrastructure is not really relevant: the system and the reference would then no longer be modelled in the same way.

Figure 1. Schematic representation of the consistency issue in the normalization step
