Balanced Parallel Algorithm for Mining Frequent Pattern from Data Stream

Authors

  • Zakria Mahrousa
  • Dima Mufti Alchawafa
  • Hasan Kazzaz

Keywords:

Association Rule Mining (ARM), Frequent itemset, FP-growth, Directed Graph, Massive Data Stream, MapReduce, Hadoop, Partitioning data.

Abstract

Frequent pattern mining methods play a very important role in generating association rules from massive data streams such as customer click streams and network monitoring data. The continuous, unbounded, and high-speed nature of massive data streams poses a major challenge for current frequent pattern mining approaches. In this work, the complexity of finding frequent itemsets for association rule mining over a massive data stream is reduced by using a modified FP-growth algorithm and by parallelizing the mining task with the MapReduce technique in the Hadoop framework; performance is further improved by a load-balancing technique that exploits correlations among transactions. We introduce Balanced Parallel Graph Frequent Pattern growth (BPGFP-growth), a modified FP-growth algorithm with a one-pass scan based on a directed graph, combined with the Hadoop framework and a partitioning and load-balancing strategy, in order to reduce both the execution time on massive dynamic databases and the volume of data exchanged between computational nodes (computers). The algorithm was tested, and our experimental results demonstrate that it scales well and processes large dynamic datasets efficiently. In addition, it improves both the memory consumed to store frequent patterns and the time complexity.

Published

2020-07-31

How to Cite

1. Mahrousa Z, Mufti Alchawafa D, Kazzaz H. Balanced Parallel Algorithm for Mining Frequent Pattern from Data Stream. Engineering Sciences Series [Internet]. 2020 Jul 31 [cited 2020 Nov 23];42(3). Available from: http://journal.tishreen.edu.sy/index.php/engscnc/article/view/9783