Variable selection is one of the most common problems in statistical applications. Often referred to as the subset selection problem, it arises when one wants to model the relationship between a variable of interest and a subset of potential explanatory variables, or predictors, but it is uncertain which subset should be used. This work reviews some of the main developments that have led to the wide variety of approaches to this problem. A new method called BERDS is also presented.
Le tecniche statistiche per la selezione delle variabili (Statistical Techniques for Variable Selection)
GUERRIERO, Massimo
2006-01-01
Abstract
The problem of variable selection is one of the most pervasive model selection problems in statistical applications. Often referred to as the problem of subset selection, it arises when one wants to model the relationship between a variable of interest and a subset of potential explanatory variables or predictors, but there is uncertainty about which subset to use. This paper reviews some of the key developments that have led to the wide variety of approaches to this problem. In Section 3, for example, a new algorithm, backward elimination via repeated data splitting (BERDS), is proposed for variable selection in regression. We also discuss the following problem: given a random sample from an unknown probability distribution, estimate the sampling distribution of some prespecified random variable on the basis of the observed data. A general method, called the bootstrap, is introduced.

File | Size | Format |
---|---|---|
Relazione Guerriero AdR2006-2007.pdf (open access; type: post-print document; license: public domain) | 166.14 kB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.