An overview of techniques for dealing with large numbers of independent variables in epidemiologic studies. - INRAE - Institut national de recherche pour l’agriculture, l’alimentation et l’environnement
Journal Articles Preventive Veterinary Medicine Year : 1996

An overview of techniques for dealing with large numbers of independent variables in epidemiologic studies.

I R Dohoo
Christian Ducrot
C Fourichon
A Donald
D Hurnik

Abstract

Many studies of health and production problems in livestock involve the simultaneous evaluation of large numbers of risk factors. These analyses may be complicated by a number of problems, including multicollinearity (which arises because many of the risk factors may be correlated with each other), confounding, interaction, problems related to sample size (and hence the power of the study), and the fact that many associations are evaluated from a single dataset. This paper focuses primarily on multicollinearity and discusses a number of techniques for dealing with it, although some of these techniques may also help with the other problems identified above. The first general approach to dealing with multicollinearity involves reducing the number of independent variables before investigating associations with the disease. Techniques to accomplish this include: (1) excluding variables after screening for associations among independent variables; (2) creating indices or scores which combine data from multiple factors into a single variable; and (3) creating a smaller set of independent variables through the use of multivariable techniques such as principal components analysis or factor analysis. The second general approach is to use appropriate steps and statistical techniques to investigate associations between the independent variables and the dependent variable. A preliminary screening of these associations may be performed using simple statistical tests. Subsequently, multivariable techniques such as linear or logistic regression or correspondence analysis can be used to identify important associations. The strengths and limitations of these techniques are discussed, and the techniques are demonstrated using a dataset from a recent study of risk factors for pneumonia in swine. Emphasis is placed on comparing correspondence analysis with other techniques, as it has been used less in the epidemiology literature.
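As a rough illustration of the first general approach described in the abstract, the sketch below screens independent variables for strong pairwise correlation and then derives a smaller set of uncorrelated variables by principal components analysis. It is not the authors' code or dataset: the data are synthetic, and the variable names (`ventilation`, `humidity`, `herd_size`), the 0.8 correlation threshold, and the helper functions `correlated_pairs` and `principal_components` are all assumptions made for the example.

```python
import numpy as np


def correlated_pairs(X, names, threshold=0.8):
    """Return (name_i, name_j, r) for variable pairs with |r| >= threshold.

    The threshold is an arbitrary illustrative choice, not a recommendation.
    """
    r = np.corrcoef(X, rowvar=False)  # columns of X are variables
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(r[i, j]) >= threshold:
                pairs.append((names[i], names[j], float(r[i, j])))
    return pairs


def principal_components(X, n_components):
    """Standardize each column, then project onto the leading components."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Rows of Vt are the principal axes; s**2 is proportional to variance.
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    scores = Z @ Vt[:n_components].T
    return scores, explained[:n_components]


# Synthetic herd-level risk factors (hypothetical names and values).
rng = np.random.default_rng(0)
n = 200
ventilation = rng.normal(size=n)
humidity = 0.9 * ventilation + 0.1 * rng.normal(size=n)  # built to be collinear
herd_size = rng.normal(size=n)

names = ["ventilation", "humidity", "herd_size"]
X = np.column_stack([ventilation, humidity, herd_size])

# Step 1: screen for associations among the independent variables.
pairs = correlated_pairs(X, names)

# Step 2: replace the three variables with two uncorrelated component scores.
scores, explained = principal_components(X, n_components=2)

for a, b, r in pairs:
    print(f"{a} and {b} are strongly correlated (r = {r:.2f})")
print("share of variance kept by 2 components:", round(float(explained.sum()), 3))
```

The component scores could then be used in place of the original variables in a regression model, at the cost of being harder to interpret than the measured risk factors.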

Dates and versions

hal-02698854 , version 1 (01-06-2020)

Identifiers

  • HAL Id : hal-02698854 , version 1
  • PRODINRA : 224094
  • PUBMED : 9234406
  • WOS : A1997WH58000005

Cite

I R Dohoo, Christian Ducrot, C Fourichon, A Donald, D Hurnik. An overview of techniques for dealing with large numbers of independent variables in epidemiologic studies. Preventive Veterinary Medicine, 1996, 29 (3), pp.221-239. ⟨hal-02698854⟩

Collections

INRA INRAE UMREPIA