A Model-Based Frequency Constraint for Mining Associations from Transaction Data - Computer Science > Databases





Abstract: Mining frequent itemsets is a popular method for finding associated items in databases. For this method, support, the co-occurrence frequency of the items which form an association, is used as the primary indicator of the association's significance. A single user-specified support threshold is used to decide if associations should be further investigated. Support has some known problems with rare items, favors shorter itemsets and sometimes produces misleading associations. In this paper we develop a novel model-based frequency constraint as an alternative to a single, user-specified minimum support. The constraint utilizes knowledge of the process generating transaction data by applying a simple stochastic mixture model (the NB model) which allows for transaction data's typically highly skewed item frequency distribution. A user-specified precision threshold is used together with the model to find local frequency thresholds for groups of itemsets. Based on the constraint we develop the notion of NB-frequent itemsets and adapt a mining algorithm to find all NB-frequent itemsets in a database. In experiments with publicly available transaction databases we show that the new constraint provides improvements over a single minimum support threshold and that the precision threshold is more robust and easier to set and interpret by the user.
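
To make the baseline concrete, the following is a minimal sketch of support and of filtering with a single minimum support threshold, which is the approach the abstract proposes an alternative to. It does not implement the NB model or NB-frequent itemsets. The toy transactions, the function names `support` and `frequent_itemsets`, and the `max_size` cap are assumptions made for illustration only; the brute-force enumeration stands in for a real mining algorithm.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    return sum(items <= t for t in transactions) / len(transactions)


def frequent_itemsets(transactions, min_support, max_size=2):
    """Brute-force search: keep itemsets whose support meets one global threshold.

    Hypothetical helper for illustration; practical miners prune candidates
    instead of enumerating every combination.
    """
    items = sorted(set().union(*transactions))
    result = {}
    for k in range(1, max_size + 1):
        for candidate in combinations(items, k):
            s = support(candidate, transactions)
            if s >= min_support:
                result[candidate] = s
    return result


# Made-up toy data: two rare items that always occur together.
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"bread", "milk"}),
    frozenset({"bread"}),
    frozenset({"caviar", "champagne"}),
]

frequent = frequent_itemsets(transactions, min_support=0.4)
for itemset, s in frequent.items():
    print(itemset, s)
```

In this toy data, "caviar" and "champagne" always occur together, yet their joint support is only one transaction in five, so a single threshold of 0.4 discards the pair. This is one way to picture the rare-item problem the abstract mentions.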



Author: Michael Hahsler

Source: https://arxiv.org/






