Parallel Evaluation of Large Scale Hierarchical Clustering Results

Cruz Rodríguez, David

Date

2012

Abstract

Data clustering refers to the automatic grouping of objects based on their similarity, i.e., similar objects should be in the same group and dissimilar objects should be in different groups. For hierarchical clustering algorithms in particular, there is also the notion of a hierarchy into which the objects and the clusters fit. Clustering is a fundamental task in data mining, machine learning, information retrieval, bioinformatics, and image analysis, among others, so it is important to evaluate the results of clustering algorithms. However, most evaluation approaches are geared towards nonhierarchical clustering; this research explores how to use traditional validity measures to evaluate and assess hierarchical clustering results.

Key Terms: Clustering, Data Clustering, Hierarchical Clustering, Validity Measures.

URI

http://hdl.handle.net/20.500.12475/426

Collections

Computer Engineering