# 1. Introduction

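Supp(X=>Y) = P(X and Y)
Conf(X=>Y) = P(Y|X)
Lift(X=>Y) = Conf(X=>Y)/Supp(Y) = P(X and Y)/(P(X)P(Y))

Lift corrects for the bias that skewed item frequencies introduce into confidence: when Y is very common, Conf(X=>Y) is high even if X and Y are unrelated. A lift of 1 means X and Y occur independently. The further lift rises above 1, the stronger the association; a lift below 1 means X and Y appear together less often than independence would predict.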
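The three measures can be worked through by hand on a tiny example. The following is a minimal base R sketch; the five transactions and the helper `prob()` are made up for illustration and have nothing to do with the Groceries data used later.

```r
# Toy transactions (hypothetical data, for illustration only)
transactions <- list(
  c("bread", "butter"),
  c("bread", "butter", "milk"),
  c("bread"),
  c("milk"),
  c("butter", "milk")
)

# Fraction of transactions that contain every item in `items`
prob <- function(items) {
  mean(sapply(transactions, function(tr) all(items %in% tr)))
}

x <- "bread"   # antecedent X
y <- "butter"  # consequent Y

supp <- prob(c(x, y))   # P(X and Y)
conf <- supp / prob(x)  # P(Y|X)
lift <- conf / prob(y)  # P(X and Y) / (P(X) P(Y))

print(c(support = supp, confidence = conf, lift = lift))
```

Here bread and butter each appear in 3 of 5 transactions and together in 2, so the support is 0.4, the confidence is 2/3, and the lift is 10/9, slightly above 1.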

# 2. Data preparation and the unified arulesViz interface

> library("arulesViz")
> data("Groceries")
> summary(Groceries)

> rules <- apriori(Groceries, parameter = list(support = 0.001, confidence = 0.5))
> rules
set of 5668 rules

> inspect(head(sort(rules, by = "lift"), 3))
  lhs                        rhs              support     confidence lift
1 {Instant food products,
   soda}                  => {hamburger meat} 0.001220132 0.6315789  18.99565
2 {soda,
   popcorn}               => {salty snack}    0.001220132 0.6315789  16.69779
3 {flour,
   baking powder}         => {sugar}          0.001016777 0.5555556  16.40807
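As a quick sanity check, the columns of the first rule can be plugged back into the definitions from the introduction. The sketch below uses plain arithmetic in base R with values copied from the output above. It assumes the reported support column is P(X and Y), which is how arules defines rule support; the variable names are illustrative.

```r
# Values copied from the first rule in the inspect() output
support    <- 0.001220132  # P(X and Y), assuming the arules definition
confidence <- 0.6315789    # P(Y|X)
lift       <- 18.99565

# Lift = Conf / P(Y), so P(Y) = Conf / Lift
rhs_support <- confidence / lift

# Conf = P(X and Y) / P(X), so P(X) = P(X and Y) / Conf
lhs_support <- support / confidence

print(c(rhs_support = rhs_support, lhs_support = lhs_support))
```

The result suggests that roughly 3.3% of all transactions contain hamburger meat, while the antecedent {Instant food products, soda} occurs in only about 0.2% of them. That is why a confidence of 0.63 translates into such a high lift.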

# 3. Scatter plot

> plot(rules)

> head(quality(rules))
      support confidence     lift
1 0.001118454  0.7333333 2.870009
2 0.001220132  0.5217391 2.836542
3 0.001321810  0.5909091 2.312611
4 0.001321810  0.5652174 2.212062
5 0.001321810  0.5200000 2.035097
6 0.003660397  0.6428571 2.515917

> plot(rules, measure = c("support", "lift"), shading = "confidence")

> plot(rules, shading = "order", control = list(main = "Two-key plot"))

> sel <- plot(rules, measure = c("support", "lift"), shading = "confidence",
+ interactive = TRUE)

# 4. Grouped matrix-based visualization

> plot(rules, method = "grouped")

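Lift, encoded by color, decreases from the top-left corner of the plot toward the bottom-right. Here 3 rules contain "Instant food products" together with up to 2 other items in the LHS, and their RHS is "hamburger meat".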

# 5. Graph-based visualization

> subrules2 <- head(sort(rules, by = "lift"), 10)

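arulesViz includes several graph-based visualizations, some of which use interfaces from the Rgraphviz extension package. In the default version, vertices represent itemsets and directed edges between itemsets represent rules.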

> plot(subrules2, method = "graph")
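To make the vertex and edge reading concrete, the three rules printed in section 2 can be written out by hand as an edge list. This is only a base R sketch of the idea, not how arulesViz builds its graph internally; the names `rule_edges` and `vertices` are illustrative.

```r
# The three highest-lift rules from section 2, typed in by hand:
# each rule becomes one directed edge from its LHS itemset to its RHS itemset
rule_edges <- data.frame(
  from = c("{Instant food products, soda}",
           "{soda, popcorn}",
           "{flour, baking powder}"),
  to   = c("{hamburger meat}", "{salty snack}", "{sugar}"),
  lift = c(18.99565, 16.69779, 16.40807),
  stringsAsFactors = FALSE
)

# Every distinct itemset, on either side of a rule, is one vertex
vertices <- unique(c(rule_edges$from, rule_edges$to))

print(vertices)
print(rule_edges)
```

With itemsets as vertices, these three rules give six vertices and three edges. Note that {Instant food products, soda} and {soda, popcorn} are separate vertices even though both contain soda.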

> plot(subrules2, method = "graph", control = list(type = "items"))

# 8. Summary