Quantitative Discriminant Rule
- general form: $\forall X,\ \text{target\_class}(X) \Leftarrow \text{condition}_1(X)\,[d: w_1] \lor \dots \lor \text{condition}_m(X)\,[d: w_m]$
- Discriminant rule: sufficient condition of the target class
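- d-weight = $\dfrac{\text{count}(q_a \in C_j)}{\sum_{i=1}^{m} \text{count}(q_a \in C_i)}$, where $q_a$ is a generalized tuple, $C_j$ is the target class, and $C_1, \dots, C_m$ are all classes being compared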
- d-weight: discriminability of each disjunct in the rule
- important
- Characteristic rule: necessary condition of the target class
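
The d-weight of a disjunct is the share of the tuples it covers that belong to the target class, as opposed to the contrasting classes. A minimal sketch of that ratio follows; the function name `d_weights` and the class counts are hypothetical, chosen only for illustration.

```python
def d_weights(counts):
    """Return the d-weight of one generalized tuple for every class.

    `counts` maps a class name to the number of tuples of that class
    covered by the generalized tuple (hypothetical input format).
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("the generalized tuple covers no tuples")
    return {cls: n / total for cls, n in counts.items()}


# Hypothetical counts: one generalized tuple covers 90 target-class
# tuples and 210 contrasting-class tuples.
weights = d_weights({"graduate": 90, "undergraduate": 210})
print(weights)
```

A d-weight close to 1 means the disjunct almost exclusively covers the target class, so it discriminates well; a low d-weight means the disjunct mostly covers contrasting classes.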
Association
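- Association rule: $A \Rightarrow B$ [support, confidence]
- Support: $\text{support}(A \Rightarrow B) = P(A \cup B)$, the fraction of transactions that contain every item of both $A$ and $B$
- Confidence (t-weight): $\text{confidence}(A \Rightarrow B) = P(B \mid A) = \text{support}(A \cup B) / \text{support}(A)$
- strong: support ≥ min_sup and confidence ≥ min_conf
- k-itemset: an itemset that contains k items
- frequency: support count, the number of transactions that contain the itemset
- frequent itemset: support ≥ min_sup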
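
Support and confidence can be computed directly from a transaction list. The sketch below assumes transactions are Python sets of item names; the data and the helper names `support` and `confidence` are made up for illustration.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimate P(consequent | antecedent) as a ratio of supports."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


# Hypothetical market-basket data.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
    {"bread", "milk", "beer"},
]

sup = support({"bread", "milk"}, transactions)
conf = confidence({"bread"}, {"milk"}, transactions)
print(f"support={sup:.2f} confidence={conf:.2f}")
```

With min_sup = 0.5 and min_conf = 0.7, the rule bread ⇒ milk would count as strong on this toy data.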
Apriori
- Two-step process
- find all frequent itemsets
- generate strong association rules (easy)
- INSIGHT: All non-empty subsets of a frequent itemset must also be frequent
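- frequent k-itemsets: $L_k$
- Join: generate the candidate k-itemsets $C_k$ by joining $L_{k-1}$ with itself
- Prune: determine which candidates in $C_k$ are frequent to get $L_k$; a candidate with an infrequent $(k-1)$-subset can be dropped before counting
- ordering: items within each itemset are kept in a fixed (e.g. lexicographic) order, so the join produces each candidate only once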
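
The join and prune steps above can be sketched as a short level-wise loop. This is an illustrative version, not an optimized one: it assumes an absolute support count threshold (`min_count`), items that can be sorted, and a database small enough to rescan per level. The transactions are hypothetical.

```python
from itertools import combinations


def apriori(transactions, min_count):
    """Return {itemset: support count} for all itemsets with count >= min_count."""
    transactions = [frozenset(t) for t in transactions]

    # First pass: count single items to get L1.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    level = {s: c for s, c in counts.items() if c >= min_count}
    frequent = dict(level)

    k = 2
    while level:
        # Keep each (k-1)-itemset as a sorted tuple so the join is well-defined.
        prev = sorted(tuple(sorted(s)) for s in level)
        candidates = set()
        for a, b in combinations(prev, 2):
            # Join: two itemsets that agree on their first k-2 items.
            if a[:-1] == b[:-1]:
                cand = frozenset(a) | frozenset(b)
                # Prune: every (k-1)-subset must itself be frequent.
                if all(frozenset(sub) in level
                       for sub in combinations(cand, k - 1)):
                    candidates.add(cand)
        # Count the surviving candidates against the database.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {s: c for s, c in counts.items() if c >= min_count}
        frequent.update(level)
        k += 1
    return frequent


# Hypothetical transactions over items A..E.
db = [{"A", "C", "D"}, {"B", "C", "E"}, {"A", "B", "C", "E"}, {"B", "E"}]
frequent = apriori(db, min_count=2)
for itemset, count in sorted(frequent.items(), key=lambda x: (len(x[0]), sorted(x[0]))):
    print(sorted(itemset), count)
```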
Apriori variant
- AIS: for each tuple, expand the frequent itemsets it contains by adding other items contained in the tuple to generate candidates $C_k$
- AprioriTid: calculate support in $\bar{C}_{k-1}$ (the previous pass's candidates, stored per transaction ID) instead of rescanning the database
- DHP (Direct Hashing and Pruning)
  - INSIGHT: the processing in the initial iterations of Apriori dominates the total execution cost
  - Knowledge 1: Any candidate frequent itemset must be hashed into a bucket whose count ≥ min_sup
  - Knowledge 2: Any tuple useful in determining $L_{k+1}$ must contain at least $k+1$ sets in $C_k$
  - Knowledge 3: For any item contained in a tuple, if it is useful in determining $L_{k+1}$, it must appear in at least $k$ sets in $C_k$
  - Knowledge 4: For any item contained in a tuple, if it is useful in determining $L_{k+1}$, it must appear in at least one (k+1)-itemset whose k-itemsets are all candidate frequent k-itemsets
- Improvement
  - partitioning
  - sampling
  - transaction reduction
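
DHP's Knowledge 1 can be illustrated for the step from 1-itemsets to 2-itemsets: while counting single items, every pair in each transaction is also hashed into a bucket, and a pair is kept as a candidate only if both items are frequent and its bucket count reaches the threshold. The sketch below is a simplification under stated assumptions: the toy hash function, the bucket count of 7, the helper name `dhp_candidate_pairs`, and the data are all illustrative, and item ranks are computed in a separate preliminary pass for clarity.

```python
from itertools import combinations


def dhp_candidate_pairs(transactions, min_count, n_buckets=7):
    """Return (candidate 2-itemsets, bucket counts) using a hash filter."""
    transactions = [sorted(set(t)) for t in transactions]
    all_items = sorted({x for t in transactions for x in t})
    rank = {item: i + 1 for i, item in enumerate(all_items)}

    def bucket(x, y):
        # Toy deterministic hash on the item ranks (an assumption).
        return (rank[x] * 10 + rank[y]) % n_buckets

    item_counts = {}
    buckets = [0] * n_buckets
    for t in transactions:
        for item in t:
            item_counts[item] = item_counts.get(item, 0) + 1
        # Hash every 2-subset of the transaction during the same scan.
        for x, y in combinations(t, 2):
            buckets[bucket(x, y)] += 1

    frequent_items = sorted(i for i, c in item_counts.items() if c >= min_count)
    # Keep a pair only if its bucket is heavy enough (Knowledge 1).
    candidates = [(x, y) for x, y in combinations(frequent_items, 2)
                  if buckets[bucket(x, y)] >= min_count]
    return candidates, buckets


# Hypothetical transactions over items A..E.
db = [{"A", "C", "D"}, {"B", "C", "E"}, {"A", "B", "C", "E"}, {"B", "E"}]
candidates, buckets = dhp_candidate_pairs(db, min_count=2)
print(buckets)
print(candidates)
```

Plain Apriori would generate all six pairs of the four frequent items here; the bucket filter discards the pairs whose buckets are too light before any candidate counting takes place. A bucket count is only an upper bound on a pair's true support, so the surviving candidates still have to be counted.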
FP-growth
- Mining from FP-tree
- FP-tree construction
- scan DB once, find frequent 1-itemset
- sort frequent items in frequency descending order to get F-list
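- scan DB again to construct the FP-tree, inserting each transaction's frequent items in F-list order
- for each frequent item, in reverse F-list order (least frequent first), construct its conditional pattern base and conditional FP-tree
- repeat on each newly created conditional FP-tree until the tree is empty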
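
The two-scan construction and the extraction of a conditional pattern base can be sketched as follows. This is a partial illustration only: it builds the tree and reads off one conditional pattern base, and does not implement the full recursive mining. The class and function names, the alphabetical tie-breaking in the F-list, and the data are assumptions.

```python
from collections import Counter


class Node:
    def __init__(self, item, parent):
        self.item = item
        self.parent = parent
        self.count = 0
        self.children = {}


def build_fp_tree(transactions, min_count):
    # Scan 1: count items and build the F-list (frequency descending,
    # ties broken alphabetically so the result is deterministic).
    freq = Counter(item for t in transactions for item in set(t))
    f_list = sorted((i for i, c in freq.items() if c >= min_count),
                    key=lambda i: (-freq[i], i))
    rank = {item: r for r, item in enumerate(f_list)}

    root = Node(None, None)
    header = {item: [] for item in f_list}  # item -> its nodes in the tree
    # Scan 2: insert each transaction's frequent items in F-list order.
    for t in transactions:
        node = root
        for item in sorted((i for i in set(t) if i in rank), key=rank.get):
            child = node.children.get(item)
            if child is None:
                child = Node(item, node)
                node.children[item] = child
                header[item].append(child)
            child.count += 1
            node = child
    return root, header, f_list


def conditional_pattern_base(item, header):
    """Prefix paths leading to `item`, each with the count of the item node."""
    base = []
    for node in header[item]:
        path = []
        parent = node.parent
        while parent is not None and parent.item is not None:
            path.append(parent.item)
            parent = parent.parent
        if path:
            base.append((path[::-1], node.count))
    return base


# Hypothetical transactions over items A..E.
db = [{"A", "C", "D"}, {"B", "C", "E"}, {"A", "B", "C", "E"}, {"B", "E"}]
root, header, f_list = build_fp_tree(db, min_count=2)
print(f_list)
print(conditional_pattern_base("A", header))
```

Building a conditional FP-tree amounts to calling the same construction on the pattern base, with each prefix path weighted by its count.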
Others
- closed frequent itemset
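- closed: an itemset $X$ is closed in a data set $D$ if there is no proper super-itemset $Y$ such that $Y$ has the same support count as $X$ in $D$
- Maximal frequent itemset: $X$ is maximal if $X$ is frequent, and there exists no proper super-itemset $Y$ such that $Y$ is frequent in $D$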
- Multilevel association rules: Rules generated from association rule mining with concept hierarchies
- same min_sup for all levels
- level-by-level independent
- level-cross filtering by k-itemset
- level-cross filtering by single item
- Controlled level-cross filtering by single item
- Cross-level association rules
- Quantitative Association rules
- Step 1. Binning
- Step 2. Finding frequent predicate sets
- Step 3. Clustering association rules
- Distance-based association rules
- Criticism: Strong association rules may not be interesting
- Correlation analysis
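- Lift: $\text{lift}(A, B) = \dfrac{P(A \cup B)}{P(A)\,P(B)}$; a value above 1 indicates positive correlation, below 1 negative correlation, and equal to 1 independence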
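
The criticism above is easiest to see with numbers: a rule can pass both thresholds while its two sides are negatively correlated. The sketch below computes lift as $P(A \cup B) / (P(A)\,P(B))$ from raw counts; the counts, thresholds, and function name are hypothetical.

```python
def lift(n_total, n_a, n_b, n_ab):
    """Lift of A and B from transaction counts.

    n_a / n_b: transactions containing A / B; n_ab: transactions containing both.
    """
    p_a = n_a / n_total
    p_b = n_b / n_total
    p_ab = n_ab / n_total
    return p_ab / (p_a * p_b)


# Hypothetical counts: 10,000 transactions, 6,000 contain A,
# 7,500 contain B, and 4,000 contain both.
n_total, n_a, n_b, n_ab = 10_000, 6_000, 7_500, 4_000

rule_support = n_ab / n_total    # support of A => B
rule_confidence = n_ab / n_a     # confidence of A => B
value = lift(n_total, n_a, n_b, n_ab)
print(f"support={rule_support:.2f} confidence={rule_confidence:.3f} lift={value:.3f}")
```

With min_sup = 0.3 and min_conf = 0.6 the rule A ⇒ B would be reported as strong, yet its lift is below 1: B is less likely in transactions that contain A than in the database overall.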