Cost models for distributed pattern mining in the cloud

by Sabeur Aridhi, University of Trento

February 11th 2015 @ 14:30, in Room Ofek

In this DBTrento meeting, Sabeur will present his work on:

Cost models for distributed pattern mining in the cloud

Abstract:

Recently, distributed pattern mining approaches have become very popular, especially in domains such as bioinformatics, chemoinformatics and social networks. In most cases, distributing the pattern mining process causes a loss of information in the output results. Reducing this loss may affect the performance of the distributed approach and thus the monetary cost when using cloud environments. In this context, cost models are needed to help select the best parameters of the chosen approach in order to achieve better performance, especially in the cloud. In this paper, we address the multi-criteria optimization problem of tuning thresholds related to distributed frequent pattern mining in a cloud computing environment while optimizing the global monetary cost of storing and querying data in the cloud. To achieve this goal, we design cost models for managing and mining graph data with a large-scale pattern mining framework over a cloud architecture. We define four objective functions with respect to the needs of customers. We present an experimental validation of the proposed cost models in the case of distributed subgraph mining in the cloud.

Speaker:

Sabeur Aridhi has been a postdoctoral research fellow with the DBTrento group at the University of Trento since September 2014. He is currently working on large graph analysis. He obtained his PhD in Computer Science from Blaise Pascal University in Clermont-Ferrand, France.

Contact Info: Matteo Lissandrini (ml@disi.unitn.eu)