ESAIM: Probability and Statistics

Research Article

Detecting atypical data in air pollution studies by using shorth intervals for regression

Durot, Cécile (a1) and Thiébot, Karelle (a1, a2)

a1 Université Paris Sud, Bâtiment 425, 91405 Orsay Cedex, France; cecile.durot@math.u-psud.fr

a2 Air Pays de la Loire, 2 rue A. Kastler, BP 30723, 44307 Nantes Cedex 3, France.

Abstract

To validate pollution data, subject-matter experts at Airpl (an organization that maintains a network of air pollution monitoring stations in western France) visually examine the data every day and check their consistency. In this paper, we describe these visual examinations and propose a formalization of the problem. The examinations consist of comparisons of so-called shorth intervals, so we build a statistical test that compares such intervals in a nonparametric regression model. This test makes it possible to detect atypical data. A practical application of the test is given.
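For readers unfamiliar with the term, the classical shorth interval of a univariate sample is the shortest interval that contains about half of the observations. The sketch below illustrates only that basic notion; it is not the regression version or the test developed in the paper. The function name `shorth_interval`, the `fraction` parameter, and the convention of covering ceil(n/2) points are assumptions made for illustration (other conventions, such as floor(n/2) + 1, are also in use).

```python
import math


def shorth_interval(values, fraction=0.5):
    """Shortest interval covering ceil(fraction * n) of the sorted observations.

    Illustrative helper only (hypothetical name and convention), not the
    procedure from the paper.
    """
    xs = sorted(values)
    n = len(xs)
    if n == 0:
        raise ValueError("sample must not be empty")
    # Number of observations the interval has to contain (assumed convention).
    k = max(1, math.ceil(fraction * n))
    # Slide a window of k consecutive order statistics and keep the narrowest.
    start = min(range(n - k + 1), key=lambda i: xs[i + k - 1] - xs[i])
    return xs[start], xs[start + k - 1]


sample = [1, 2, 2.1, 2.2, 2.3, 9, 10, 50]
print(shorth_interval(sample))  # (2, 2.3)
```

Because the interval is driven by the densest half of the sample, extreme values such as 50 in the example do not affect it, which is one reason such intervals are convenient when looking for atypical observations.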

(Received December 15 2003)

(Revised April 8 2005)

(Online publication November 15 2005)

Key Words:

  • Air pollution;
  • validation;
  • regression;
  • bootstrap;
  • shorth.

Mathematics Subject Classification:

  • 62G08;
  • 62G09;
  • 62G10;
  • 62P12