Input variable selection and calibration data selection for storm water quality regression models. 2013

Siao Sun and Jean-Luc Bertrand-Krajewski
University of Lyon, INSA Lyon, LGCIE, France. siao.sun@insa-lyon.fr

Storm water quality models are useful tools in storm water management. Interest has been growing in analyzing existing data to develop models for evaluating urban storm water quality. When many candidate explanatory variables are available, selecting appropriate model inputs is important. Model calibration and verification are essential steps in any storm water quality modeling exercise. This study investigates input variable selection and calibration data selection in storm water quality regression models. The two selection problems interact with each other. A procedure is developed to carry out the two selection tasks in sequence. The procedure first selects model input variables using a cross validation method. An appropriate number of variables is identified as model inputs so that the model is neither overfitted nor underfitted. Calibration data selection is then studied on the basis of the input selection results. The uncertainty in model performance due to calibration data selection is investigated with a random selection method. An approach based on a clustering method is applied to improve model calibration practice, following the principle of selecting representative data for calibration. A comparison of the cluster selection method with random selection shows that the former can significantly improve the performance of calibrated models. The information content of the calibration data is found to be important in addition to the size of the calibration data set.
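The two-step procedure described in the abstract can be illustrated with a minimal sketch. The abstract does not state the regression form, the cross validation scheme, or the clustering algorithm, so everything below is an assumption made for illustration: ordinary least squares, k-fold cross validation with greedy forward selection, k-means clustering with round-robin picking from clusters, and synthetic data in place of storm event measurements. All function names are hypothetical.

```python
import random

def fit_ols(X, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    # Augmented matrix [A^T A | A^T y]
    M = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(p)]
    for c in range(p):  # Gauss-Jordan elimination with partial pivoting
        piv = max(range(c, p), key=lambda k: abs(M[k][c]))
        M[c], M[piv] = M[piv], M[c]
        for k in range(p):
            if k != c:
                f = M[k][c] / M[c][c]
                M[k] = [a - f * b for a, b in zip(M[k], M[c])]
    return [M[i][p] / M[i][i] for i in range(p)]

def predict(beta, x):
    return beta[0] + sum(b * v for b, v in zip(beta[1:], x))

def cv_error(X, y, cols, k=5):
    """Mean squared prediction error from k-fold cross validation."""
    n, sq = len(y), 0.0
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        beta = fit_ols([[X[i][c] for c in cols] for i in train],
                       [y[i] for i in train])
        sq += sum((y[i] - predict(beta, [X[i][c] for c in cols])) ** 2
                  for i in range(fold, n, k))
    return sq / n

def forward_select(X, y, k=5):
    """Add the variable that most lowers the CV error; stop when none does."""
    chosen, best = [], cv_error(X, y, [], k)
    remaining = set(range(len(X[0])))
    while remaining:
        err, c = min((cv_error(X, y, chosen + [c], k), c)
                     for c in sorted(remaining))
        if err >= best:
            break  # a further variable would not improve predictions
        chosen.append(c)
        remaining.remove(c)
        best = err
    return chosen, best

def kmeans(points, k, iters=20, seed=0):
    """Plain k-means; returns a cluster label for each point."""
    centers = random.Random(seed).sample(points, k)
    labels = []
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: sum((a - b) ** 2
                      for a, b in zip(p, centers[j]))) for p in points]
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                centers[j] = [sum(v) / len(members) for v in zip(*members)]
    return labels

def cluster_calibration_set(X, cols, n_cal, k=4, seed=0):
    """Pick calibration events round-robin from clusters of the inputs."""
    rng = random.Random(seed)
    labels = kmeans([[row[c] for c in cols] for row in X], k, seed=seed)
    groups = [[i for i, l in enumerate(labels) if l == j] for j in range(k)]
    for g in groups:
        rng.shuffle(g)
    picked, n_cal = [], min(n_cal, len(X))
    while len(picked) < n_cal:
        for g in groups:
            if g and len(picked) < n_cal:
                picked.append(g.pop())
    return sorted(picked)

def verification_rmse(X, y, cols, cal):
    """Calibrate on the events in `cal`, verify on all remaining events."""
    beta = fit_ols([[X[i][c] for c in cols] for i in cal], [y[i] for i in cal])
    rest = [i for i in range(len(y)) if i not in set(cal)]
    return (sum((y[i] - predict(beta, [X[i][c] for c in cols])) ** 2
                for i in rest) / len(rest)) ** 0.5

# Synthetic stand-in data: 60 events, 5 candidate variables, 2 of them relevant.
rng = random.Random(1)
X = [[rng.uniform(0, 10) for _ in range(5)] for _ in range(60)]
y = [3.0 + 2.0 * r[0] - 1.5 * r[2] + rng.gauss(0, 0.5) for r in X]

chosen, err = forward_select(X, y)               # step 1: input selection
cal = cluster_calibration_set(X, chosen, 20)     # step 2: cluster-based choice
rand_cal = sorted(rng.sample(range(len(y)), 20)) # baseline: random choice
rmse_cluster = verification_rmse(X, y, chosen, cal)
rmse_random = verification_rmse(X, y, chosen, rand_cal)
```

Repeating the random draw many times would give a distribution of verification errors, which is one way to read the abstract's "uncertainty of model performances due to calibration data selection". The single comparison here is only illustrative and says nothing about the study's actual results.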

UI | MeSH Term | Description | Entries
D008962 | Models, Theoretical | Theoretical representations that simulate the behavior or activity of systems, processes, or phenomena. They include the use of mathematical equations, computers, and other electronic equipment. | Experimental Model, Experimental Models, Mathematical Model, Model, Experimental, Models (Theoretical), Models, Experimental, Models, Theoretic, Theoretical Study, Mathematical Models, Model (Theoretical), Model, Mathematical, Model, Theoretical, Models, Mathematical, Studies, Theoretical, Study, Theoretical, Theoretical Model, Theoretical Models, Theoretical Studies
D011891 | Rain | Water particles that fall from the ATMOSPHERE. | Rains
D004784 | Environmental Monitoring | The monitoring of the level of toxins, chemical pollutants, microbial contaminants, or other harmful substances in the environment (soil, air, and water), workplace, or in the bodies of people and animals present in that environment. | Monitoring, Environmental, Environmental Surveillance, Surveillance, Environmental
D060753 | Water Quality | A rating of a body of water based on measurable physical, chemical, and biological characteristics. | Quality, Water
