An efficient storage-optimizing tick data clustering model

Authors: HALEH AMINTOOSI, MASOOD NIAZI TORSHIZ, YAHYA FORGHANI, SARA ALINEJAD

Abstract: Tick data are large volumes of data describing a phenomenon, such as the stock market or weather changes, whose values change rapidly over time. An important problem is storing a tick data table so that it occupies minimal storage space while still supporting fast query execution. In this paper, a mathematical model is proposed to partition tick data tables into clusters with the aim of minimizing the required storage space. A genetic algorithm is then used to solve this model, which is in effect a clustering model. The proposed method has been evaluated on a real-world weather tick dataset and compared with the storage-optimizing hierarchical agglomerative clustering (SOHAC) algorithm. The experiments show that the proposed method substantially outperforms SOHAC, achieving smaller compression ratios while also reducing execution time when the number of clusters is small.

Keywords: Tick data, compression, clustering
