In this work, we present a novel algorithm for extracting valuable knowledge from large databases. Rare events are difficult to mine because they have very little support. Our algorithm, SARG (Significant Association Rule Generator), mines significant patterns, including rare events, from large databases by defining the support fraction per cell of the contingency table rather than over the entire contingency table. It combines the support-confidence framework with the chi-square statistic to mine significant patterns from large volumes of raw data. We introduce the notions of a critical attribute and a critical attribute value, which are passed as input parameters to SARG to make the mining process more selective. We ran the algorithm on a large medical data file provided by the Cleveland Clinic Foundation, Cleveland, OH, and compared its results with those produced by Brin's chi-square algorithm. Some of the results produced by SARG are previously unknown medical facts that Brin's chi-square algorithm does not produce.
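To make the idea of per-cell support concrete, the following is a minimal sketch rather than the SARG algorithm itself, whose details are not given here. It builds a 2x2 contingency table for two boolean attributes, computes the standard Pearson chi-square statistic, and lists the cells whose share of all records meets a minimum support fraction. The function names, the attribute names `rare_event` and `outcome`, the threshold values, and the toy records are all assumptions made for illustration.

```python
from collections import Counter

CELLS = [(True, True), (True, False), (False, True), (False, False)]


def contingency_table(records, attr_a, attr_b):
    """Count the four cells of a 2x2 table for two boolean attributes."""
    counts = Counter((bool(r[attr_a]), bool(r[attr_b])) for r in records)
    return {cell: counts.get(cell, 0) for cell in CELLS}


def chi_square(table):
    """Pearson chi-square statistic for a 2x2 table of observed counts."""
    n = sum(table.values())
    row = {a: table[(a, True)] + table[(a, False)] for a in (True, False)}
    col = {b: table[(True, b)] + table[(False, b)] for b in (True, False)}
    stat = 0.0
    for (a, b), observed in table.items():
        expected = row[a] * col[b] / n  # count expected under independence
        if expected > 0:
            stat += (observed - expected) ** 2 / expected
    return stat


def cells_with_support(table, min_fraction):
    """Cells whose share of all records is at least min_fraction.

    Assumption: this is one plausible reading of "support fraction per
    cell"; the exact definition used by SARG is not stated in the text.
    """
    n = sum(table.values())
    return [cell for cell, count in table.items() if count / n >= min_fraction]


# Synthetic toy data (hypothetical): 100 records, the first attribute is rare.
records = (
    [{"rare_event": True, "outcome": True}] * 4
    + [{"rare_event": True, "outcome": False}] * 1
    + [{"rare_event": False, "outcome": True}] * 15
    + [{"rare_event": False, "outcome": False}] * 80
)

table = contingency_table(records, "rare_event", "outcome")
# 3.841 is the chi-square critical value at the 0.05 level, 1 degree of freedom.
is_significant = chi_square(table) > 3.841
supported = cells_with_support(table, min_fraction=0.03)
```

In this toy table the cell where both attributes are true holds only 4% of the records, yet it can still be checked against its own threshold, which is the intuition behind evaluating support cell by cell when rare events are of interest.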