Association Rule 4

  • October 2019
  • PDF

This document was uploaded by user and they confirmed that they have the permission to share it. If you are author or own the copyright of this book, please report to us by using this DMCA report form. Report DMCA


Overview

Download & View Association Rule 4 as PDF for free.

More details

  • Words: 3,327
  • Pages: 23
Association Rule Mining COMP 290-90 Seminar GNET 214 BCB Module Spring 2006

The UNIVERSITY of NORTH CAROLINA at CHAPEL HILL

Sequential Pattern Mining Why sequential pattern mining? GSP algorithm FreeSpan and PrefixSpan Boarder Collapsing Constraints and extensions

2

COMP 290-090 Data Mining: Concepts, Algorithms, and Applications

Sequence Databases and Sequential Pattern Analysis (Temporal) order is important in many situations Time-series databases and sequence databases Frequent patterns Æ (frequent) sequential patterns

Applications of sequential pattern mining Customer shopping sequences: First buy computer, then CD-ROM, and then digital camera, within 3 months.

Medical treatment, natural disasters (e.g., earthquakes), science & engineering processes, stocks and markets, telephone calling patterns, Weblog click streams, DNA sequences and gene structures COMP 290-090 Data Mining: Concepts, Algorithms, and Applications

3

What Is Sequential Pattern Mining? Given a set of sequences, find the complete set of frequent subsequences A sequence database SID

sequence

10



20

<(ad)c(bc)(ae)>

30

<(ef)(ab)(df)cb>

40

<eg(af)cbc>

A

sequence : < (ef) (ab)

(df) c b >

An element may contain a set of items. Items within an element are unordered and we list them alphabetically.

is a subsequence of

Given support threshold min_sup =2, <(ab)c> is a

sequential pattern 4

COMP 290-090 Data Mining: Concepts, Algorithms, and Applications

Challenges on Sequential Pattern Mining A huge number of possible sequential patterns are hidden in databases A mining algorithm should Find the complete set of patterns satisfying the minimum support (frequency) threshold Be highly efficient, scalable, involving only a small number of database scans Be able to incorporate various kinds of userspecific constraints COMP 290-090 Data Mining: Concepts, Algorithms, and Applications

5

A Basic Property of Sequential Patterns: Apriori A basic property: Apriori (Agrawal & Sirkant’94) If a sequence S is not frequent Then none of the super-sequences of S is frequent E.g, is infrequent Æ so do and <(ah)b> Seq. ID 10 20 30 40 50 6

Sequence <(bd)cb(ac)> <(bf)(ce)b(fg)> <(ah)(bf)abf> <(be)(ce)d>

Given support threshold min_sup =2

COMP 290-090 Data Mining: Concepts, Algorithms, and Applications

Basic Algorithm : Breadth First Search (GSP) L=1 While (ResultL != NULL) Candidate Generate Prune Test L=L+1

7

COMP 290-090 Data Mining: Concepts, Algorithms, and Applications

Finding Length-1 Sequential Patterns Initial candidates: all singleton sequences , , , , <e>, , ,

Scan database once, count support for candidates

min_sup =2 Seq. ID 10 20 30 40 50 8

Sequence <(bd)cb(ac)> <(bf)(ce)b(fg)> <(ah)(bf)abf> <(be)(ce)d>

Cand
<e>

Sup 3 5 4 3 3 2 1 1

COMP 290-090 Data Mining: Concepts, Algorithms, and Applications

The Mining Process 5th scan: 1 cand. 1 length-5 seq. pat.

Cand. cannot pass sup. threshold

<(bd)cba>

Cand. not in DB at all 4th scan: 8 cand. 6 length-4 seq. <(bd)bc> … pat. 3rd scan: 46 cand. 19 length-3 seq. … pat. 20 cand. not in DB at all 2nd scan: 51 cand. 19 length-2 seq. <(ab)> … <(ef)> pat. 10 cand. not in DB at all 1st scan: 8 cand. 6 length-1 seq.
<e> pat. Seq. ID Sequence

min_sup =2

10

<(bd)cb(ac)>

20

<(bf)(ce)b(fg)>

30

<(ah)(bf)abf>

40

<(be)(ce)d>

50



COMP 290-090 Data Mining: Concepts, Algorithms, and Applications

9

Generating Length-2 Candidates 51 length-2 Candidates

<e> 10









<e>























































<de>



<e>

<ea>

<eb>

<ec>

<ed>

<ee>

<ef>





















<e>



<(ab)>

<(ac)>

<(ad)>

<(ae)>

<(af)>

<(bc)>

<(bd)>

<(be)>

<(bf)>

<(cd)>

<(ce)>

<(cf)>

<(de)>

<(df)> <(ef)>

Without Apriori property, 8*8+8*7/2=92 candidates

Apriori prunes 44.57% candidates

COMP 290-090 Data Mining: Concepts, Algorithms, and Applications

Pattern Growth (prefixSpan) Prefix and Suffix (Projection)
, , and are prefixes of sequence Given sequence Prefix

Suffix (Prefix-Based Projection)



<(abc)(ac)d(cf)> <(_bc)(ac)d(cf)> <(_c)(ac)d(cf)> COMP 290-090 Data Mining: Concepts, Algorithms, and Applications

12

Example Sequence_id

10 20 30 40

13

Sequence

<(ad)c(bc)(ae)> <(ef)(ab)(df)cb> <eg(af)cbc>

An Example ( min_sup=2):

Prefix

Sequential Patterns



,,,,,,<(ab)>,<(ab)c>,<(a b)d>,<(ab)f>,<(ab)dc>,,,,,,,



, , , <(bc)>, <(bc)a>, , ,



, , ,



,,,

<e>

<e>,<ea>,<eab>,<eac>,<eacb>,<eb>,<ebc>,<ec>,<ecb>,<ef>,<efb >,<efc>,<efcb>



,,, , COMP 290-090 Data Mining: Concepts, Algorithms, and Applications

PrefixSpan

(the example to be continued)

Step1: Find length-1 sequential patterns;
:4, :4, :4, :3, <e>:3, :3 support pattern

Step2: Divide search space; six subsets according to the six prefixes; Step3: Find subsets of sequential patterns; By constructing corresponding projected databases and mine each recursively.

COMP 290-090 Data Mining: Concepts, Algorithms, and Applications

14

Example to be continued

15

Sequence_id

Sequence

Projected(suffix) databases

10 20 30 40

<(ad)c(bc)(ae)> <(ef)(ab)(df)cb> <eg(af)cbc>

<(ad)c(bc)(ae)> <(ef)(ab)(df)cb> <eg(af)cbc>

Prefix

Projected(suffix) databases

Sequential Patterns



<(abc)(ac)d(cf)>, <(_d)c(bc)(ae)>, <(_b)(df)cb>, <(_f)cbc>

,,,, ,,<(ab)>,<(ab)c>,<(ab )d>,<(ab)f>,<(ab)dc>,, ,,,,,

COMP 290-090 Data Mining: Concepts, Algorithms, and Applications

Example Find sequential patterns having prefix
: 1.

Scan sequence database S once. Sequences in S containing
are projected w.r.t to form the projected database.

2.

Scan
-projected database once, get six length-2 sequential patterns having prefix : :2 , :4, <(_b)>:2, :4, :2, :2 :2 , :4, <(ab)>:2, :4, :2, :2

3.

Recursively, all sequential patterns having prefix
can be further partitioned into 6 subsets. Construct respective projected databases and mine each. e.g. -projected database has two sequences : <(_bc)(ac)d(cf)> and <(_e)>. COMP 290-090 Data Mining: Concepts, Algorithms, and Applications

16

PrefixSpan Algorithm Main Idea: Use frequent prefixes to divide the search space and to project sequence databases. only search the relevant sequences. PrefixSpan(α, i, S|α) 1. Scan S|α once, find the set of frequent items b such that •

b can be assembled to the last element of α to form a sequential pattern; or



can be appended to α to form a sequential pattern.

2. For each frequent item b, appended it to α to form a sequential pattern α’, and output α’; 3. For each α’, construct α’-projected database S|α’, and call PrefixSpan(α’, i+1,S|α’). 17

COMP 290-090 Data Mining: Concepts, Algorithms, and Applications

Approximate match Compatibility Matrix

When you observe d1 Spread count as d1: 90%, d2: 5%, d3: 5% COMP 290-090 Data Mining: Concepts, Algorithms, and Applications

18

Match The degree to which pattern P is retained/reflected in S M(P,S) = P(P|S)=∏ C(p,s) when when lS=lP M(P,S) = max over all possible when lS>lP Example P S M d1d1 d1d3 0.9*0 d1d2 d1d2 0.9*0.8 d1d2 d1d3 0.9*0.05 d1d2 d2d3 0.1*0.05 d1d2 d1d2d3 0.9*0.8 19

COMP 290-090 Data Mining: Concepts, Algorithms, and Applications

Calculate Max over all Dynamic Programming M(p1p2..pi, s1s2…sj)= Max of M(p1p2..pi-1, s1s2…sj-1) * C(pi,sj) M(p1p2..pi, s1s2…sj-1)

O(lP*lS) When compatibility Matrix is sparse O(lS)

20

COMP 290-090 Data Mining: Concepts, Algorithms, and Applications

Match in D Average over all sequences in D

21

COMP 290-090 Data Mining: Concepts, Algorithms, and Applications

Spread of match If compatibility matrix is identity matrix Match = support

22

COMP 290-090 Data Mining: Concepts, Algorithms, and Applications

Anti-Monotone The match of a pattern P in a symbol sequence S is less than or equal to the match of any subpattern of P in S The match of a pattern P in a sequence database D is less than or equal to the match of any subpattern of P in D Can use any support based algorithm More patterns match so require efficient solution Sample based algorithms Border collapsing of ambiguous patterns 23

COMP 290-090 Data Mining: Concepts, Algorithms, and Applications

Chernoff Bound Given sample size=n, range R, with probability 1-δ true value: ± ε ε = sqrt([R2ln(1/δ)]/2n)

Distribution free More conservative Sample size : fit in memory Restricted spread :

Frequent Patterns min_match + ε min_match - ε Infrequent patterns

For pattern P= p1p2..pL R=min (match[pi]) for all 1 ≤ i ≤L 24

COMP 290-090 Data Mining: Concepts, Algorithms, and Applications

Algorithm Scan DB: O(N*min (Ls*m, Ls+m2)) Find the match of each individual symbol Take a random sample of sequences

Identify borders that embrace the set of ambiguous patterns O(mLp * |S| * Lp * n) Min_match ± ε existing methods for association rule mining

Locate the border of frequent patterns in the entire DB via border collapsing

25

COMP 290-090 Data Mining: Concepts, Algorithms, and Applications

Border Collapsing If memory can not hold the counters for all ambiguous counters Probe-and-collapse : binary search Probe patterns with highest collapsing power until memory is filled If memory can hold all patterns up to the 1/x layer the space of of ambiguous patterns can be narrowed to at least 1/x of the original one where x is a power of 2 If it takes a level-wise search y scans of the DB, only O(logxy) scans are necessary when the border collapsing technique is employed 26

COMP 290-090 Data Mining: Concepts, Algorithms, and Applications

Periodic Pattern Full periodic pattern ABC ABC ABC

Partial periodic pattern ABC ADC ACC ABC

Pattern hierarchy ABC ABC ABC DE DE DE DE ABC ABC ABC DE DE DE DE ABC ABC ABC DE DE DE DE 29

COMP 290-090 Data Mining: Concepts, Algorithms, and Applications

Periodic Pattern Recent Achievements Partial Periodic Pattern Asynchronous Periodic Pattern Meta Pattern InfoMiner/InfoMiner+/STAMP

30

COMP 290-090 Data Mining: Concepts, Algorithms, and Applications

Clustering Sequential Data CLUSEQ ApproxMAP

31

COMP 290-090 Data Mining: Concepts, Algorithms, and Applications

References: Frequent-pattern Mining Methods R. Agarwal, C. Aggarwal, and V. V. V. Prasad. A tree projection algorithm for generation of frequent itemsets. In Journal of Parallel and Distributed Computing (Special Issue on High Performance Data Mining), 2000. R. Agrawal, T. Imielinski, and A. Swami. Mining association rules between sets of items in large databases. SIGMOD'93, 207-216, Washington, D.C. R. Agrawal and R. Srikant. Fast algorithms for mining association rules. VLDB'94 487-499, Santiago, Chile. J. Han, J. Pei, and Y. Yin: “Mining frequent patterns without candidate generation”. In Proc. ACM-SIGMOD’2000, pp. 1-12, Dallas, TX, May 2000. H. Mannila, H. Toivonen, and A. I. Verkamo. Efficient algorithms for discovering association rules. KDD'94, 181-192, Seattle, WA, July 1994. 32

COMP 290-090 Data Mining: Concepts, Algorithms, and Applications

References: Frequent-pattern Mining Methods A. Savasere, E. Omiecinski, and S. Navathe. An efficient algorithm for mining association rules in large databases. VLDB'95, 432-443, Zurich, Switzerland. C. Silverstein, S. Brin, R. Motwani, and J. Ullman. Scalable techniques for mining causal structures. VLDB'98, 594-605, New York, NY. R. Srikant and R. Agrawal. Mining generalized association rules. VLDB'95, 407-419, Zurich, Switzerland, Sept. 1995. R. Srikant and R. Agrawal. Mining quantitative association rules in large relational tables. SIGMOD'96, 1-12, Montreal, Canada. H. Toivonen. Sampling large databases for association rules. VLDB'96, 134-145, Bombay, India, Sept. 1996. M.J. Zaki, S. Parthasarathy, M. Ogihara, and W. Li. New algorithms for fast discovery of association rules. KDD’97. August 1997.

33

COMP 290-090 Data Mining: Concepts, Algorithms, and Applications

References: Performance Improvements S. Brin, R. Motwani, J. D. Ullman, and S. Tsur. Dynamic itemset counting and implication rules for market basket analysis. SIGMOD'97, Tucson, Arizona, May 1997. D.W. Cheung, J. Han, V. Ng, and C.Y. Wong. Maintenance of discovered association rules in large databases: An incremental updating technique. ICDE'96, New Orleans, LA. T. Fukuda, Y. Morimoto, S. Morishita, and T. Tokuyama. Data mining using two-dimensional optimized association rules: Scheme, algorithms, and visualization. SIGMOD'96, Montreal, Canada. E.-H. Han, G. Karypis, and V. Kumar. Scalable parallel data mining for association rules. SIGMOD'97, Tucson, Arizona. J.S. Park, M.S. Chen, and P.S. Yu. An effective hash-based algorithm for mining association rules. SIGMOD'95, San Jose, CA, May 1995.

34

COMP 290-090 Data Mining: Concepts, Algorithms, and Applications

References: Performance Improvements G. Piatetsky-Shapiro. Discovery, analysis, and presentation of strong rules. In G. Piatetsky-Shapiro and W. J. Frawley, Knowledge Discovery in Databases,. AAAI/MIT Press, 1991. J.S. Park, M.S. Chen, and P.S. Yu. An effective hash-based algorithm for mining association rules. SIGMOD'95, San Jose, CA. S. Sarawagi, S. Thomas, and R. Agrawal. Integrating association rule mining with relational database systems: Alternatives and implications. SIGMOD'98, Seattle, WA. K. Yoda, T. Fukuda, Y. Morimoto, S. Morishita, and T. Tokuyama. Computing optimized rectilinear regions for association rules. KDD'97, Newport Beach, CA, Aug. 1997. M. J. Zaki, S. Parthasarathy, M. Ogihara, and W. Li. Parallel algorithm for discovery of association rules. Data Mining and Knowledge Discovery, 1:343-374, 1997. 35

COMP 290-090 Data Mining: Concepts, Algorithms, and Applications

References. Extensions of Scopes

36

S. Brin, R. Motwani, and C. Silverstein. Beyond market basket: Generalizing association rules to correlations. SIGMOD'97, 265-276, Tucson, Arizona. J. Han and Y. Fu. Discovery of multiple-level association rules from large databases. VLDB'95, 420-431, Zurich, Switzerland. M. Klemettinen, H. Mannila, P. Ronkainen, H. Toivonen, and A.I. Verkamo. Finding interesting rules from large sets of discovered association rules. CIKM'94, 401-408, Gaithersburg, Maryland. F. Korn, A. Labrinidis, Y. Kotidis, and C. Faloutsos. Ratio rules: A new paradigm for fast, quantifiable data mining. VLDB'98, 582-593, New York, NY. Wei Wang, Jiong Yang, Philip S. Yu: Efficient mining of weighted association rules (WAR). KDD 2000: 270-274 Wei Wang, Jiong Yang, Richard R. Muntz: TAR: Temporal Association Rules on Evolving Numerical Attributes. ICDE 2001: 283-292 COMP 290-090 Data Mining: Concepts, Algorithms, and Applications

References: Extensions of Scopes B. Lent, A. Swami, and J. Widom. Clustering association rules. ICDE'97, 220-231, Birmingham, England. R. Meo, G. Psaila, and S. Ceri. A new SQL-like operator for mining association rules. VLDB'96, 122-133, Bombay, India. R.J. Miller and Y. Yang. Association rules over interval data. SIGMOD'97, 452-461, Tucson, Arizona. A. Savasere, E. Omiecinski, and S. Navathe. Mining for strong negative associations in a large database of customer transactions. ICDE'98, 494-502, Orlando, FL, Feb. 1998. D. Tsur, J. D. Ullman, S. Abitboul, C. Clifton, R. Motwani, and S. Nestorov. Query flocks: A generalization of association-rule mining. SIGMOD'98, 1-12, Seattle, Washington. J. Pei, A.K.H. Tung, J. Han. Fault-Tolerant Frequent Pattern Mining: Problems and Challenges. SIGMOD DMKD’01, Santa Barbara, CA. 37

COMP 290-090 Data Mining: Concepts, Algorithms, and Applications

References: Mining Maxpatterns and Closed Itemsets R. J. Bayardo. Efficiently mining long patterns from databases. SIGMOD'98, 85-93, Seattle, Washington. J. Pei, J. Han, and R. Mao, "CLOSET: An Efficient Algorithm for Mining Frequent Closed Itemsets", Proc. 2000 ACM-SIGMOD Int. Workshop on Data Mining and Knowledge Discovery (DMKD'00), Dallas, TX, May 2000. N. Pasquier, Y. Bastide, R. Taouil, and L. Lakhal. Discovering frequent closed itemsets for association rules. ICDT'99, 398-416, Jerusalem, Israel, Jan. 1999. M. Zaki. Generating Non-Redundant Association Rules. KDD'00. Boston, MA. Aug. 2000. M. Zaki. CHARM: An Efficient Algorithm for Closed Association Rule Mining, TR99-10, Department of Computer Science, Rensselaer Polytechnic Institute. M. Zaki, Fast Vertical Mining Using Diffsets, TR01-1, Department of Computer Science, Rensselaer Polytechnic Institute. 38

COMP 290-090 Data Mining: Concepts, Algorithms, and Applications

References: Constraint-base Frequent-pattern Mining G. Grahne, L. Lakshmanan, and X. Wang. Efficient mining of constrained correlated sets. ICDE'00, 512-521, San Diego, CA, Feb. 2000. Y. Fu and J. Han. Meta-rule-guided mining of association rules in relational databases. KDOOD'95, 39-46, Singapore, Dec. 1995. J. Han, L. V. S. Lakshmanan, and R. T. Ng, "Constraint-Based, Multidimensional Data Mining", COMPUTER (special issues on Data Mining), 32(8): 46-50, 1999. L. V. S. Lakshmanan, R. Ng, J. Han and A. Pang, "Optimization of Constrained Frequent Set Queries with 2-Variable Constraints", SIGMOD’99

39

COMP 290-090 Data Mining: Concepts, Algorithms, and Applications

References: Constraint-base Frequent-pattern Mining R. Ng, L.V.S. Lakshmanan, J. Han & A. Pang. “Exploratory mining and pruning optimizations of constrained association rules.” SIGMOD’98 J. Pei, J. Han, and L. V. S. Lakshmanan, "Mining Frequent Itemsets with Convertible Constraints", Proc. 2001 Int. Conf. on Data Engineering (ICDE'01), April 2001. J. Pei and J. Han "Can We Push More Constraints into Frequent Pattern Mining?", Proc. 2000 Int. Conf. on Knowledge Discovery and Data Mining (KDD'00), Boston, MA, August 2000. R. Srikant, Q. Vu, and R. Agrawal. Mining association rules with item constraints. KDD'97, 67-73, Newport Beach, California.

40

COMP 290-090 Data Mining: Concepts, Algorithms, and Applications

References: Sequential Pattern Mining Methods R. Agrawal and R. Srikant. Mining sequential patterns. ICDE'95, 3-14, Taipei, Taiwan. R. Srikant and R. Agrawal. Mining sequential patterns: Generalizations and performance improvements. EDBT’96. J. Han, J. Pei, B. Mortazavi-Asl, Q. Chen, U. Dayal, M.-C. Hsu, "FreeSpan: Frequent Pattern-Projected Sequential Pattern Mining", Proc. 2000 Int. Conf. on Knowledge Discovery and Data Mining (KDD'00), Boston, MA, August 2000. H. Mannila, H Toivonen, and A. I. Verkamo. Discovery of frequent episodes in event sequences. Data Mining and Knowledge Discovery, 1:259-289, 1997. J. Pei, J. Han, H. Pinto, Q. Chen, U. Dayal, and M.-C. Hsu, "PrefixSpan: Mining Sequential Patterns Efficiently by PrefixProjected Pattern Growth", Proc. 2001 Int. Conf. on Data Engineering (ICDE'01), Heidelberg, Germany, April 2001. 41

COMP 290-090 Data Mining: Concepts, Algorithms, and Applications

References: Sequential Pattern Mining Methods B. Ozden, S. Ramaswamy, and A. Silberschatz. Cyclic association rules. ICDE'98, 412-421, Orlando, FL. S. Ramaswamy, S. Mahajan, and A. Silberschatz. On the discovery of interesting patterns in association rules. VLDB'98, 368-379, New York, NY. M.J. Zaki. Efficient enumeration of frequent sequences. CIKM’98. Novermber 1998. M.N. Garofalakis, R. Rastogi, K. Shim: SPIRIT: Sequential Pattern Mining with Regular Expression Constraints. VLDB 1999: 223-234, Edinburgh, Scotland. Wei Wang, Jiong Yang, Philip S. Yu: Mining Patterns in Long Sequential Data with Noise. SIGKDD Explorations 2(2): 28-33 (2000) Jiong Yang, Wei Wang, Philip S. Yu, Jiawei Han: Mining Long Sequential Patterns in a Noisy Environment. SIGMOD Conference 2002 42

COMP 290-090 Data Mining: Concepts, Algorithms, and Applications

References: Periodic Pattern Mining Methods Jiawei Han, Wan Gong, Yiwen Yin: Mining Segment-Wise Periodic Patterns in Time-Related Databases. KDD 1998: 214-218 Jiawei Han, Guozhu Dong, Yiwen Yin: Efficient Mining of Partial Periodic Patterns in Time Series Database. ICDE 1999: 106-115 Jiong Yang, Wei Wang, Philip S. Yu: Mining asynchronous periodic patterns in time series data. KDD 2000: 275-279 Wei Wang, Jiong Yang, Philip S. Yu: Meta-patterns: Revealing Hidden Periodic Patterns. ICDM 2001: 550-557 Jiong Yang, Wei Wang, Philip S. Yu: Infominer: mining surprising periodic patterns. KDD 2001: 395-400

43

COMP 290-090 Data Mining: Concepts, Algorithms, and Applications

References: Mining Various Databases K. Koperski, J. Han, and G. B. Marchisio, "Mining Spatial and Image Data through Progressive Refinement Methods", Revue internationale de gomatique (European Journal of GIS and Spatial Analysis), 9(4):425-440, 1999. A. K. H. Tung, H. Lu, J. Han, and L. Feng, "Breaking the Barrier of Transactions: Mining Inter-Transaction Association Rules", Proc. 1999 Int. Conf. on Knowledge Discovery and Data Mining (KDD'99), San Diego, CA, Aug. 1999, pp. 297-301.

44

COMP 290-090 Data Mining: Concepts, Algorithms, and Applications

References: Mining Various Databases H. Lu, L. Feng, and J. Han, "Beyond Intra-Transaction Association Analysis:Mining Multi-Dimensional Inter-Transaction Association Rules", ACM Transactions on Information Systems (TOIS’00), 18(4): 423-454, 2000. O. R. Zaiane, M. Xin, J. Han, "Discovering Web Access Patterns and Trends by Applying OLAP and Data Mining Technology on Web Logs," Proc. Advances in Digital Librar ies Conf. (ADL'98), Santa Barbara, CA, April 1998, pp. 19-29 O. R. Zaiane, J. Han, and H. Zhu, "Mining Recurrent Items in Multimedia with Progressive Resolution Refinement", Proc. 2000 Int. Conf. on Data Engineering (ICDE'00), San Diego, CA, Feb. 2000, pp. 461-470.

45

COMP 290-090 Data Mining: Concepts, Algorithms, and Applications

References: Frequent-pattern Mining for Classification and Data Cube K. Beyer and R. Ramakrishnan. Bottom-up computation of sparse and iceberg cubes. SIGMOD'99, 359-370, Philadelphia, PA, June 1999. M. Fang, N. Shivakumar, H. Garcia-Molina, R. Motwani, and J. D. Ullman. Computing iceberg queries efficiently. VLDB'98, 299-310, New York, NY, Aug. 1998. J. Han, J. Pei, G. Dong, and K. Wang, “Computing Iceberg Data Cubes with Complex Measures”, Proc. ACM-SIGMOD’2001, Santa Barbara, CA, May 2001. M. Kamber, J. Han, and J. Y. Chiang. Metarule-guided mining of multi-dimensional association rules using data cubes. KDD'97, 207210, Newport Beach, California. K. Beyer and R. Ramakrishnan. Bottom-up computation of sparse and iceberg cubes. SIGMOD’99 T. Imielinski, L. Khachiyan, and A. Abdulghani. Cubegrades: Generalizing association rules. Technical Report, Aug. 2000 46

COMP 290-090 Data Mining: Concepts, Algorithms, and Applications

THANK YOU!

The UNIVERSITY of NORTH CAROLINA at CHAPEL HILL

Related Documents