TY - GEN
T1 - Discovering frequent subgraphs over uncertain graph databases under probabilistic semantics
AU - Zou, Zhaonian
AU - Gao, Hong
AU - Li, Jianzhong
PY - 2010
Y1 - 2010
N2 - Frequent subgraph mining has been extensively studied on certain graph data. However, uncertainty is inherent in graph data in practice, and there is very little work on mining uncertain graph data. This paper investigates frequent subgraph mining on uncertain graphs under probabilistic semantics. Specifically, a measure called φ-frequent probability is introduced to evaluate the degree of recurrence of subgraphs. Given a set of uncertain graphs and two numbers 0 < φ, τ < 1, the goal is to quickly find all subgraphs with φ-frequent probability at least τ. Because the problem is NP-hard, an approximate mining algorithm is proposed. Let 0 < δ < 1 be a parameter. The algorithm is guaranteed to find any frequent subgraph S with probability at least (1-δ/2)^s, where s is the number of edges of S. In addition, how to set δ to guarantee the overall approximation quality of the algorithm is discussed in detail. Extensive experiments on real uncertain graph data verify that the algorithm is efficient and that the mining results have very high quality.
AB - Frequent subgraph mining has been extensively studied on certain graph data. However, uncertainty is inherent in graph data in practice, and there is very little work on mining uncertain graph data. This paper investigates frequent subgraph mining on uncertain graphs under probabilistic semantics. Specifically, a measure called φ-frequent probability is introduced to evaluate the degree of recurrence of subgraphs. Given a set of uncertain graphs and two numbers 0 < φ, τ < 1, the goal is to quickly find all subgraphs with φ-frequent probability at least τ. Because the problem is NP-hard, an approximate mining algorithm is proposed. Let 0 < δ < 1 be a parameter. The algorithm is guaranteed to find any frequent subgraph S with probability at least (1-δ/2)^s, where s is the number of edges of S. In addition, how to set δ to guarantee the overall approximation quality of the algorithm is discussed in detail. Extensive experiments on real uncertain graph data verify that the algorithm is efficient and that the mining results have very high quality.
KW - Frequent subgraph
KW - Probabilistic semantics
KW - Uncertain graph
UR - https://www.scopus.com/pages/publications/77956193984
U2 - 10.1145/1835804.1835885
DO - 10.1145/1835804.1835885
M3 - Conference contribution
AN - SCOPUS:77956193984
SN - 9781450300551
T3 - Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
SP - 633
EP - 642
BT - KDD'10 - Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data
T2 - 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD-2010
Y2 - 25 July 2010 through 28 July 2010
ER -