Parametric and nonparametric propensity score estimation in multilevel observational studies. 2023

Marie Salditt and Steffen Nestler
Institute of Psychology, University of Münster, Münster, Germany.

There has been growing interest in using nonparametric machine learning approaches for propensity score estimation in order to foster robustness against misspecification of the propensity score model. However, the vast majority of studies have focused on single-level data settings, and research on nonparametric propensity score estimation in clustered data settings is scarce. In this article, we extend existing research by describing a general algorithm for incorporating random effects into a machine learning model, which we implemented for generalized boosted modeling (GBM). In a simulation study, we investigated the performance of logistic regression, GBM, and Bayesian additive regression trees for inverse probability of treatment weighting (IPW) when the data are clustered, the treatment exposure mechanism is nonlinear, and unmeasured cluster-level confounding is present. For each approach, we compared fixed and random effects propensity score models to single-level models and evaluated their use in both marginal and clustered IPW. We additionally investigated the performance of the standard Super Learner and the balance Super Learner. The results showed that when there was no unmeasured confounding, logistic regression resulted in moderate bias in both marginal and clustered IPW, whereas the nonparametric approaches were unbiased. In the presence of cluster-level confounding, fixed and random effects models greatly reduced bias compared to single-level models in marginal IPW, with fixed effects GBM and fixed effects logistic regression performing best. Finally, clustered IPW was overall preferable to marginal IPW, and the balance Super Learner outperformed the standard Super Learner, though neither performed as well as its best candidate model.
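
The difference between marginal and clustered IPW mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy data, and the choice to combine cluster-specific estimates using the sum of the weights in each cluster are assumptions made for illustration, and the propensity scores are simply supplied by hand rather than estimated with logistic regression, GBM, or BART.

```python
from collections import defaultdict


def ipw_weights(treat, ps):
    """IPW weights: 1/e for treated units, 1/(1 - e) for control units."""
    return [1.0 / e if z == 1 else 1.0 / (1.0 - e) for z, e in zip(treat, ps)]


def weighted_mean_diff(treat, y, w):
    """Weighted mean outcome of the treated minus that of the controls."""
    num1 = sum(wi * yi for zi, yi, wi in zip(treat, y, w) if zi == 1)
    den1 = sum(wi for zi, wi in zip(treat, w) if zi == 1)
    num0 = sum(wi * yi for zi, yi, wi in zip(treat, y, w) if zi == 0)
    den0 = sum(wi for zi, wi in zip(treat, w) if zi == 0)
    return num1 / den1 - num0 / den0


def marginal_ipw(treat, y, ps):
    """Marginal IPW: one weighted mean difference over the pooled sample."""
    return weighted_mean_diff(treat, y, ipw_weights(treat, ps))


def clustered_ipw(treat, y, ps, cluster):
    """Clustered IPW: weighted mean difference within each cluster,
    then averaged across clusters (assumed here: weighted by the sum of
    the IPW weights in the cluster)."""
    w = ipw_weights(treat, ps)
    by_cluster = defaultdict(list)
    for zi, yi, wi, ci in zip(treat, y, w, cluster):
        by_cluster[ci].append((zi, yi, wi))
    total, weight_sum = 0.0, 0.0
    for members in by_cluster.values():
        z, yy, ww = zip(*members)
        # Simplification: skip clusters lacking treated or control units.
        if 0 < sum(z) < len(z):
            cluster_weight = sum(ww)
            total += cluster_weight * weighted_mean_diff(z, yy, ww)
            weight_sum += cluster_weight
    return total / weight_sum


# Hypothetical toy data: two clusters, a treatment effect of 2 in each,
# but cluster B has a higher baseline outcome and a lower treatment rate,
# which mimics cluster-level confounding.
treat = [1, 1, 0, 1, 0, 0]
y = [3, 3, 1, 6, 4, 4]
cluster = ["A", "A", "A", "B", "B", "B"]

ps_single = [0.5] * 6               # ignores cluster membership
ps_fixed = [2 / 3] * 3 + [1 / 3] * 3  # cluster-specific treatment rates

print("marginal, single-level PS:", round(marginal_ipw(treat, y, ps_single), 3))
print("marginal, cluster-specific PS:", round(marginal_ipw(treat, y, ps_fixed), 3))
print("clustered, single-level PS:", round(clustered_ipw(treat, y, ps_single, cluster), 3))
```

On this constructed example, marginal IPW with propensity scores that ignore the cluster structure misses the within-cluster effect, whereas cluster-specific propensity scores or the clustered estimator recover it. The numbers are a toy illustration only and are not results from the simulation study.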

UI | MeSH Term | Description | Entries
D003198 | Computer Simulation | Computer-based representation of physical systems and phenomena such as chemical processes. | Computational Modeling,Computational Modelling,Computer Models,In silico Modeling,In silico Models,In silico Simulation,Models, Computer,Computerized Models,Computer Model,Computer Simulations,Computerized Model,In silico Model,Model, Computer,Model, Computerized,Model, In silico,Modeling, Computational,Modeling, In silico,Modelling, Computational,Simulation, Computer,Simulation, In silico,Simulations, Computer
D006801 | Humans | Members of the species Homo sapiens. | Homo sapiens,Man (Taxonomy),Human,Man, Modern,Modern Man
D001499 | Bayes Theorem | A theorem in probability theory named for Thomas Bayes (1702-1761). In epidemiology, it is used to obtain the probability of disease in a group of people with some characteristic on the basis of the overall rate of that disease and of the likelihood of that characteristic in healthy and diseased individuals. The most familiar application is in clinical decision analysis where it is used for estimating the probability of a particular diagnosis given the appearance of some symptoms or test result. | Bayesian Analysis,Bayesian Estimation,Bayesian Forecast,Bayesian Method,Bayesian Prediction,Analysis, Bayesian,Bayesian Approach,Approach, Bayesian,Approachs, Bayesian,Bayesian Approachs,Estimation, Bayesian,Forecast, Bayesian,Method, Bayesian,Prediction, Bayesian,Theorem, Bayes
D015982 | Bias | Any deviation of results or inferences from the truth, or processes leading to such deviation. Bias can result from several sources: one-sided or systematic variations in measurement from the true value (systematic error); flaws in study design; deviation of inferences, interpretations, or analyses based on flawed data or data collection; etc. There is no sense of prejudice or subjectivity implied in the assessment of bias under these conditions. | Aggregation Bias,Bias, Aggregation,Bias, Ecological,Bias, Statistical,Bias, Systematic,Ecological Bias,Outcome Measurement Errors,Statistical Bias,Systematic Bias,Bias, Epidemiologic,Biases,Biases, Ecological,Biases, Statistical,Ecological Biases,Ecological Fallacies,Ecological Fallacy,Epidemiologic Biases,Experimental Bias,Fallacies, Ecological,Fallacy, Ecological,Scientific Bias,Statistical Biases,Truncation Bias,Truncation Biases,Bias, Experimental,Bias, Scientific,Bias, Truncation,Biase, Epidemiologic,Biases, Epidemiologic,Biases, Truncation,Epidemiologic Biase,Error, Outcome Measurement,Errors, Outcome Measurement,Outcome Measurement Error
D016015 | Logistic Models | Statistical models which describe the relationship between a qualitative dependent variable (that is, one which can take only certain discrete values, such as the presence or absence of a disease) and an independent variable. A common application is in epidemiology for estimating an individual's risk (probability of a disease) as a function of a given risk factor. | Logistic Regression,Logit Models,Models, Logistic,Logistic Model,Logistic Regressions,Logit Model,Model, Logistic,Model, Logit,Models, Logit,Regression, Logistic,Regressions, Logistic
D055361 | Multilevel Analysis | The statistical manipulation of hierarchically and non-hierarchically nested data. It includes clustered data, such as a sample of subjects within a group of schools. Prevalent in the social, behavioral sciences, and biomedical sciences, both linear and nonlinear regression models are applied. | Analyses, Multilevel,Analysis, Multilevel,Multilevel Analyses
D057216 | Propensity Score | Conditional probability of exposure to a treatment given observed covariates. | Propensity Scores,Score, Propensity,Scores, Propensity
D064887 | Observational Studies as Topic | Works about clinical studies in which participants may receive diagnostic, therapeutic, or other types of interventions, but the investigator does not assign participants to specific interventions (as in an interventional study). | Natural Experiment as Topic,Natural Experiments as Topic,Naturalistic Observation Studies as Topic,Naturalistic Observation Study as Topic,Observational Study as Topic
