Authors: VINOTHSARAVANAN RAMAKRISHNAN, PALANISAMY CHENNIAPPAN
Abstract: Discovery of antipatterns from arbitrary SQL query log depends on the static code analysis used to enhance the quality and performance of software applications. The existence of antipatterns reduces the quality and leads to redundant SQL statements. SQL log includes a large load on the database and it is difficult for an analyst to extract large patterns in a minimal time. Existing techniques which discover antipatterns in SQL query face a lot of innumerable challenges to discover the normal sequences of queries within the log. In order to discover the antipatterns in the log, an efficient technique called Brown infomax boosted SQL query clustering (BIBSQLQC) technique is introduced. Initially, the number of patterns (i.e. queries) are extracted from the SQL query log. After extracting the patterns, the ensemble clustering process is carried out to find out the antipatterns from the given query log. The Brown infomax boost clustering is an ensemble learning method for grouping the patterns by constructing several weak learners. The Brown clustering is used as a weak learner for partitioning the patterns into 'k' number of clusters based on the Euclidean distance measure. Then the weak learner merges the two clusters with maximum information gained to minimize the time complexity. The clustering results of weak learners are combined into strong results with minimal error rate (ER). By this way, the antipattern in the SQL query log is detected with a higher accuracy. Experimental evaluation is conducted with different parameters namely detection accuracy (DA), false positive rate (FPR) and time complexity (TC) using the two SQL query log data-sets (DS). The experimental result shows that, the BIBSQLQC technique achieves higher DA with lower TC and FPR than the conventional methods.
Keywords: SQL log analysis, patterns and antipatterns, Brown infomax boosting, query, clustering
Full Text: PDF