Privacy preserving in association rules using a genetic algorithm

Authors: RAHAT ALI SHAH, SOHAIL ASGHAR

Abstract: Association rule mining is a data mining technique used to extract hidden knowledge from large datasets. This knowledge can include useful but confidential information that users want to keep private from the public. Privacy-preserving data mining techniques protect such confidential information, or restrictive patterns, from unauthorized access. A pattern can be represented as a frequent itemset or an association rule, and a rule or pattern is marked as sensitive if its disclosure risk is above a given threshold. Numerous techniques hide sensitive association rules by modifying the original dataset. These modifications have side effects: some nonrestrictive patterns may be lost (lost rules), and new patterns may be generated (ghost rules). In the current work, a genetic algorithm is used to counter the side effects of lost rules and ghost rules. Moreover, the technique can be applied to small as well as large datasets in the medical, military, and business domains.
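
The side effects named in the abstract can be made concrete with a small sketch. The Python below is not the authors' algorithm and contains no genetic algorithm; it only shows one common way to count hiding failures, lost rules, and ghost rules by comparing the rules mined before and after a dataset is modified. The function names, the toy transactions, and the thresholds (minimum support 0.3, minimum confidence 0.7) are all assumptions made for illustration, and the brute-force miner is only suitable for tiny datasets.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def mine_rules(transactions, min_support, min_confidence):
    """Brute-force rule mining (hypothetical helper, toy datasets only)."""
    items = sorted(set().union(*transactions))
    rules = set()
    for size in range(2, len(items) + 1):
        for candidate in combinations(items, size):
            itemset = frozenset(candidate)
            s = support(itemset, transactions)
            if s < min_support:
                continue
            # Try every non-empty proper subset as the antecedent.
            for k in range(1, size):
                for lhs in combinations(candidate, k):
                    antecedent = frozenset(lhs)
                    confidence = s / support(antecedent, transactions)
                    if confidence >= min_confidence:
                        rules.add((antecedent, itemset - antecedent))
    return rules


def side_effects(original, sanitized, sensitive, min_support, min_confidence):
    """Compare rule sets mined before and after sanitization."""
    before = mine_rules(original, min_support, min_confidence)
    after = mine_rules(sanitized, min_support, min_confidence)
    hiding_failures = sensitive & after   # sensitive rules still minable
    lost = (before - sensitive) - after   # nonsensitive rules that vanished
    ghost = after - before                # rules that did not exist before
    return hiding_failures, lost, ghost


# Toy data (assumed): the sensitive rule is {a} -> {b}.
original = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
# Sanitized copy: item "b" removed from the second transaction.
sanitized = [{"a", "b", "c"}, {"a"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
sensitive = {(frozenset({"a"}), frozenset({"b"}))}

failures, lost, ghost = side_effects(original, sanitized, sensitive, 0.3, 0.7)
print("hiding failures:", failures)  # empty: {a} -> {b} is hidden
print("lost rules:", lost)           # {b} -> {a} drops below the confidence threshold
print("ghost rules:", ghost)         # {a, b} -> {c} newly exceeds it
```

In this toy run a single item deletion hides the sensitive rule but also loses one legitimate rule and creates one ghost rule. Counts of this kind are what a search procedure could try to minimize; the abstract does not state how the authors' genetic algorithm scores candidate modifications.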

Keywords: Privacy preserving data mining, association rules, sensitive association rules hiding, genetic algorithm
