Marginal models and pruning of association rules

MINOZZO, Marco
2004-01-01

Abstract

Association rules are a well-established tool in data mining software and are now used to describe statistical associations in many fields. Classical association rules (also called boolean rules), introduced in the context of market basket analysis by Agrawal, Imielinski and Swami, state that the presence of a subset of items, called the "antecedent", is likely to imply the presence of another set of items, called the "consequent". In market basket analysis, for instance, there is a transaction (that is, a nonempty set of items) for each customer (actually, for each single bill), each consisting of a selection from the set of the K products (items) present in the store. To reduce the mass of discovered rules to a manageable number of patterns, a number of selection and pruning methods have been proposed. The use of statistical measures and statistical tests has also been advanced to assess the "interestingness" of an association rule. Here, to test the "interestingness" of a rule, or of a set of rules, we outline how recent developments in the analysis of frequency data, in particular in the theory of marginal models, can be applied in this context. Marginal models are a rather recent extension of log-linear models intended to analyze several marginal distributions of interest simultaneously. As such, the approach is particularly suitable for investigating association rules, where we are mostly interested in low-dimensional marginal distributions because they provide a simple way of summarizing the most tangible and easily accessible structures in the data. In the following we outline this general approach and indicate how it could be applied to solve a few specific problems related to the pruning of association rules.
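To make the notions of antecedent, consequent and statistical "interestingness" concrete, the following is a minimal Python sketch, not the marginal-model method of the paper. It assumes a small, invented set of transactions and hypothetical helper names (`support`, `confidence`, `chi_square`), and it uses an ordinary Pearson chi-square test of independence on the 2x2 table of antecedent versus consequent as a simple stand-in for a statistical test of a rule.

```python
# Hypothetical toy transactions (one set of items per bill); invented for illustration.
transactions = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread"},
    {"milk"},
    {"bread", "butter"},
    {"milk", "butter"},
    {"bread", "milk"},
    {"bread", "butter", "milk"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    joint = support(set(antecedent) | set(consequent), transactions)
    return joint / support(antecedent, transactions)


def chi_square(antecedent, consequent, transactions):
    """Pearson chi-square statistic for independence of antecedent and consequent."""
    a, c = set(antecedent), set(consequent)
    n = len(transactions)
    # observed[i][j]: i = 1 if antecedent present, j = 1 if consequent present
    observed = [[0, 0], [0, 0]]
    for t in transactions:
        observed[int(a <= t)][int(c <= t)] += 1
    row = [sum(r) for r in observed]
    col = [observed[0][j] + observed[1][j] for j in range(2)]
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row[i] * col[j] / n
            if expected > 0:  # skip empty margins to avoid division by zero
                stat += (observed[i][j] - expected) ** 2 / expected
    return stat


if __name__ == "__main__":
    rule = ({"bread"}, {"butter"})
    print("support:   ", support(rule[0] | rule[1], transactions))
    print("confidence:", confidence(*rule, transactions))
    print("chi-square:", chi_square(*rule, transactions))
```

Under this simplified test, a rule whose statistic falls below 3.841 (the 5% critical value of the chi-square distribution with one degree of freedom) would be a candidate for pruning, since the data do not contradict independence of antecedent and consequent. With so few transactions the chi-square approximation is unreliable, so the numbers only illustrate the mechanics.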
2004
8846474406
Dependence structure; Marginal models; Market basket analysis; Mining association rules; Statistical tests

Use this identifier to cite or link to this document: https://hdl.handle.net/11562/325063