Association Rule Mining.pptx

  • Uploaded by: Shriprasad Jadhav
  • November 2019

ASSOCIATION RULE MINING

• The task of association rule mining is to find association relationships among sets of items in a dataset or database.
• A typical example of an association rule mined from what is often termed "market basket data" is: "80% of customers who purchase bread also purchase butter."

Suppose, as manager of an AllElectronics branch, you would like to learn more about the buying habits of your customers. Specifically, you wonder, “Which groups or sets of items are customers likely to purchase on a given trip to the store?” To answer your question, market basket analysis may be performed on the retail data of customer transactions at your store. You can then use the results to plan marketing or advertising strategies, or in the design of a new catalog. For instance, market basket analysis may help you design different store layouts. In one strategy, items that are frequently purchased together can be placed in proximity to further encourage the combined sale of such items.

If customers who purchase computers also tend to buy antivirus software at the same time, then placing the hardware display close to the software display may help increase the sales of both items.

• The association relationships are described by association rules.
• Association rule mining uses two measures: support and confidence.
• Confidence indicates the rule's strength, while support corresponds to the frequency of the pattern.

For example, the information that customers who purchase computers also tend to buy antivirus software at the same time is represented in the following association rule:

computer ⇒ antivirus software [support = 2%, confidence = 60%]    (6.1)

A support of 2% for Rule (6.1) means that 2% of all the transactions under analysis show that computer and antivirus software are purchased together. A confidence of 60% means that 60% of the customers who purchased a computer also bought the software.
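The two measures can be computed directly from a list of transactions. The following is a minimal sketch, not part of the original slides: the function names `support` and `confidence` and the ten toy baskets are assumptions chosen for illustration.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """support(antecedent and consequent together) / support(antecedent)."""
    both = support(transactions, set(antecedent) | set(consequent))
    base = support(transactions, antecedent)
    return both / base if base else 0.0


# Hypothetical toy data (not from the source): ten baskets.
baskets = [
    {"computer", "antivirus"},
    {"computer", "antivirus", "printer"},
    {"computer", "antivirus"},
    {"computer"},
    {"computer", "printer"},
    {"printer"},
    {"bread", "butter"},
    {"bread"},
    {"printer", "bread"},
    {"butter"},
]

rule_support = support(baskets, {"computer", "antivirus"})
rule_confidence = confidence(baskets, {"computer"}, {"antivirus"})
```

In this toy data, 3 of the 10 baskets contain both items, and 3 of the 5 baskets with a computer also contain antivirus software, so the rule has 30% support and 60% confidence.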

• The user specifies a minimum support and a minimum confidence.
• The problem of mining association rules is to find all association rules whose support and confidence are at least the minimum support and minimum confidence.
• This problem can be broken into two sub-problems:
  (1) Find the frequent itemsets, i.e. those whose support is at least the predetermined minimum support.
  (2) Derive all rules, from each frequent itemset, whose confidence is at least the minimum confidence.
• There are many ways to find the frequent (large) itemsets; we start with the Apriori algorithm.
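The second sub-problem, deriving rules from frequent itemsets, can be sketched as follows. This is an illustrative sketch under stated assumptions: the name `derive_rules`, the input format (a dict from frozenset to support count), and the bread/butter counts are hypothetical and do not come from the slides.

```python
from itertools import combinations


def derive_rules(support_counts, min_confidence):
    """Return (antecedent, consequent, confidence) for every rule that passes.

    `support_counts` maps frozenset itemsets to support counts and must
    contain every non-empty subset of each itemset. That holds for the
    output of a frequent-itemset miner, because every subset of a frequent
    itemset is itself frequent.
    """
    rules = []
    for itemset, count in support_counts.items():
        if len(itemset) < 2:
            continue  # a rule needs a non-empty antecedent and consequent
        for size in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), size):
                antecedent = frozenset(antecedent)
                conf = count / support_counts[antecedent]
                if conf >= min_confidence:
                    rules.append((antecedent, itemset - antecedent, conf))
    return rules


# Hypothetical support counts (not from the source).
counts = {
    frozenset({"bread"}): 5,
    frozenset({"butter"}): 4,
    frozenset({"bread", "butter"}): 4,
}
rules = derive_rules(counts, min_confidence=0.9)
```

With these counts, butter ⇒ bread has confidence 4/4 and is kept, while bread ⇒ butter has confidence 4/5 and is discarded at the 0.9 threshold.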

Apriori Algorithm

• Step 1: Start from the transaction data in the database
• Step 2: Calculate the support/frequency of all items
• Step 3: Discard the items with support less than the minimum support of 2
• Step 4: Combine two items
• Step 5: Calculate the support/frequency of all item pairs
• Step 6: Discard the pairs with support less than 2
• Step 7: Combine three items and calculate their support
• Step 8: Discard the triples with support less than 2
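The count, discard, and combine steps above can be sketched as a loop over itemset sizes. This is a minimal illustration rather than the exact procedure on the slides: the function name `apriori` and the subset-based pruning of candidates are assumptions, and the data reuses the nine item lists that appear later in this deck with a minimum support count of 2.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset: support_count} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]

    # Count single items, then discard those below min_support.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        # Combine: join frequent (k-1)-itemsets into k-item candidates and
        # keep only candidates whose (k-1)-subsets are all frequent.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in current for sub in combinations(union, k - 1)
            ):
                candidates.add(union)
        # Count each candidate, then discard those below min_support.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1
    return frequent


transactions = [
    ["I2", "I1", "I5"], ["I2", "I4"], ["I2", "I3"],
    ["I2", "I1", "I4"], ["I1", "I3"], ["I2", "I3"],
    ["I1", "I3"], ["I2", "I1", "I3", "I5"], ["I2", "I1", "I3"],
]
frequent_itemsets = apriori(transactions, min_support=2)
```

The loop stops as soon as a pass produces no frequent itemsets, since no larger itemset can then be frequent.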


FP-Growth Algorithm (Frequent Pattern Growth)

The FP-growth algorithm, proposed by J. Han, is an improvement on the Apriori algorithm. It finds frequent itemsets in a transaction database without candidate generation, representing the frequent items in a frequent pattern tree (FP-tree).

Advantages of the FP-growth algorithm:
1. Faster than the Apriori algorithm
2. No candidate generation
3. Only two passes over the dataset

Disadvantages of the FP-growth algorithm:
1. The FP-tree may not fit in memory
2. The FP-tree is expensive to build

FP Tree Algorithm

Input: A database DB, represented by an FP-tree constructed according to Algorithm 1, and a minimum support threshold ξ.
Output: The complete set of frequent patterns.
Method: call FP-growth(FP-tree, null).

Procedure FP-growth(Tree, α)
{
    if Tree contains a single prefix path then {  // Mining single prefix-path FP-tree
        let P be the single prefix-path part of Tree;
        let Q be the multipath part with the top branching node replaced by a null root;
        for each combination (denoted as β) of the nodes in the path P do
            generate pattern β ∪ α with support = minimum support of nodes in β;
        let freq_pattern_set(P) be the set of patterns so generated;
    }
    else let Q be Tree;
    for each item ai in Q do {  // Mining multipath FP-tree
        generate pattern β = ai ∪ α with support = ai.support;
        construct β's conditional pattern-base and then β's conditional FP-tree Tree_β;
        if Tree_β ≠ Ø then
            call FP-growth(Tree_β, β);
        let freq_pattern_set(Q) be the set of patterns so generated;
    }
    return(freq_pattern_set(P) ∪ freq_pattern_set(Q) ∪ (freq_pattern_set(P) × freq_pattern_set(Q)))
}
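The recursive idea in the multipath branch (pick an item, build its conditional pattern base, recurse) can be sketched in a simplified form. This is not the procedure above: it skips the FP-tree compression and the single prefix-path shortcut and works directly on lists of (items, count) pairs. The name `pattern_growth` and this representation are assumptions made for illustration.

```python
from collections import defaultdict


def pattern_growth(pattern_base, min_support, suffix=frozenset()):
    """Mine frequent patterns from a (conditional) pattern base.

    `pattern_base` is a list of (items, count) pairs. Returns a dict that
    maps each frequent itemset (frozenset) to its support count.
    """
    # Support of each item within this pattern base.
    support = defaultdict(int)
    for items, count in pattern_base:
        for item in items:
            support[item] += count

    # Frequent items by descending support (ties broken by name).
    order = [
        item
        for item in sorted(support, key=lambda i: (-support[i], i))
        if support[item] >= min_support
    ]
    rank = {item: r for r, item in enumerate(order)}

    patterns = {}
    for item in order:
        new_suffix = suffix | {item}
        patterns[new_suffix] = support[item]
        # Conditional pattern base of `item`: for every entry containing it,
        # keep only the frequent items ranked before it.
        conditional = []
        for items, count in pattern_base:
            if item in items:
                prefix = [i for i in items if i in rank and rank[i] < rank[item]]
                if prefix:
                    conditional.append((prefix, count))
        patterns.update(pattern_growth(conditional, min_support, new_suffix))
    return patterns


# The six transactions from this deck's example, minimum support 3.
base = [(list(t), 1) for t in ["BEAD", "BEC", "BEAD", "BEAC", "BEACD", "BCD"]]
frequent_patterns = pattern_growth(base, min_support=3)
```

Each frequent itemset is produced exactly once, at the step for its lowest-ranked item, which is why no candidate generation is needed. A real FP-growth implementation would store the pattern base as a shared-prefix tree to save memory.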

Consider the transactions below, with a minimum support of 3.

Step 2 - Find the frequency of occurrence (support) of each item

Step 3 - Prioritize the items

{ B(6), E(5), A(4), C(4), D(4) }

(a) Transaction 1: BEAD
(b) Transaction 2: BEC
(c) Transaction 3: BEAD
(d) Transaction 4: BEAC
(e) Transaction 5: BEACD
(f) Transaction 6: BCD
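Inserting the six ordered transactions above into an FP-tree, one at a time, can be sketched as follows. The class `FPNode` and the helpers `build_fp_tree` and `render` are hypothetical names for illustration. The sketch assumes each transaction is already sorted by item priority and omits the header table and node links that a full FP-tree would carry.

```python
class FPNode:
    """One FP-tree node: an item, its count, and its children by item."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(ordered_transactions):
    """Insert each priority-ordered transaction as a path from the root."""
    root = FPNode(None)
    for transaction in ordered_transactions:
        node = root
        for item in transaction:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, parent=node)
                node.children[item] = child
            child.count += 1  # shared prefixes only increment counts
            node = child
    return root


def render(node, depth=0):
    """Return the tree as indented 'item:count' lines, in insertion order."""
    lines = []
    for child in node.children.values():
        lines.append("  " * depth + f"{child.item}:{child.count}")
        lines.extend(render(child, depth + 1))
    return lines


tree = build_fp_tree(["BEAD", "BEC", "BEAD", "BEAC", "BEACD", "BCD"])
tree_lines = render(tree)
```

Because every transaction starts with B, the root ends up with a single child B with count 6. Transactions that share a prefix such as BEA reuse the same nodes, which is where the compression comes from.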

i(t), listed per transaction starting from T100:

  I2, I1, I5
  I2, I4
  I2, I3
  I2, I1, I4
  I1, I3
  I2, I3
  I1, I3
  I2, I1, I3, I5
  I2, I1, I3
