Extracting Support Based k most Strongly Correlated Item Pairs in Large Transaction Databases
The support-confidence framework can be misleading when searching for
statistically meaningful relationships in market basket data. An
alternative is to find strongly correlated item pairs in the
basket data. However, strongly correlated pair queries suffer
from the problem of choosing a suitable threshold. To overcome this, the
top-k pairs problem has been introduced. Most existing
techniques are multi-pass and computationally expensive. This
work reports an efficient technique for finding the top-k most strongly
correlated item pairs in a transaction database without
generating any candidate sets. The proposed
technique uses a correlogram matrix to compute the support counts of
all 1- and 2-itemsets in a single scan over the database. From
the correlogram matrix, the positive correlation values of all
item pairs are computed and the top-k correlated pairs are extracted.
Its simple logic structure makes the proposed technique attractive
to implement. We experimented with real
and synthetic transaction datasets and compared the performance
of the proposed technique with that of its counterparts (TAPER,
TOP-COP and Tkcp), and found its performance satisfactory.
Keywords: Association mining, correlation coefficient,
correlogram matrix, top-k correlated item pairs
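
The single-scan idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`build_correlogram`, `phi`, `top_k_pairs`) are hypothetical, and the use of Pearson's phi coefficient is an assumption, since the abstract names only a "correlation coefficient". The matrix diagonal holds 1-itemset support counts and the upper triangle holds 2-itemset support counts.

```python
from itertools import combinations
from math import sqrt
import heapq


def build_correlogram(transactions, items):
    """Fill the matrix in one pass: diagonal = item counts, upper triangle = pair counts."""
    index = {item: i for i, item in enumerate(items)}
    n = len(items)
    matrix = [[0] * n for _ in range(n)]
    for transaction in transactions:
        ids = sorted({index[item] for item in transaction})
        for i in ids:
            matrix[i][i] += 1          # support count of the 1-itemset
        for i, j in combinations(ids, 2):
            matrix[i][j] += 1          # support count of the 2-itemset (i < j)
    return matrix


def phi(n_ab, n_a, n_b, n):
    """Phi coefficient from support counts (assumed measure, see lead-in)."""
    denom = sqrt(n_a * (n - n_a) * n_b * (n - n_b))
    if denom == 0:
        return 0.0                     # item absent from, or present in, every transaction
    return (n * n_ab - n_a * n_b) / denom


def top_k_pairs(transactions, items, k):
    """Return the k most positively correlated pairs as (score, item_a, item_b)."""
    matrix = build_correlogram(transactions, items)
    n = len(transactions)
    scored = []
    for i, j in combinations(range(len(items)), 2):
        score = phi(matrix[i][j], matrix[i][i], matrix[j][j], n)
        if score > 0:                  # keep positive correlations only
            scored.append((score, items[i], items[j]))
    return heapq.nlargest(k, scored)


# Toy data, invented purely for illustration.
transactions = [
    {"bread", "milk"},
    {"bread", "milk", "eggs"},
    {"eggs"},
    {"bread", "milk"},
    {"eggs", "tea"},
]
items = ["bread", "eggs", "milk", "tea"]
for score, a, b in top_k_pairs(transactions, items, 2):
    print(f"{a}-{b}: {score:.3f}")
```

No candidate itemsets are generated here: every pair score is derived from counts already stored in the matrix, at the cost of memory quadratic in the number of items.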








