Abstract
As the Internet grows and the volume of data increases, search engines have become increasingly important, and more users rely on them to find the information they need. To cluster search results more effectively, and so make it easier to locate information among otherwise unstructured results, this paper introduces a new label-based clustering algorithm. The key idea is to use dictionary resources and dependency syntax parsing from NLP to extract ontologies related to the query. The extracted ontologies then guide the choice of centroids in K-means clustering. In addition, the characteristics of the K-means algorithm are investigated, and an improvement that uses the cluster labels is proposed. Experiments show that the algorithm not only yields more effective clusters but also provides more informative descriptions of the results, while also substantially improving efficiency.
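The idea of letting extracted labels guide the choice of K-means centroids can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not give the extraction or weighting details, so the example assumes the labels are already available as single lowercase terms, uses plain bag-of-words counts with cosine similarity, and the names `label_seeded_kmeans`, `vectorize`, `cosine`, and `mean` are hypothetical.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words term counts for one search-result snippet."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-weight mappings."""
    dot = sum(weight * b.get(term, 0) for term, weight in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mean(vectors):
    """Component-wise mean of a non-empty list of sparse vectors."""
    total = Counter()
    for vector in vectors:
        total.update(vector)
    return {term: count / len(vectors) for term, count in total.items()}


def label_seeded_kmeans(snippets, labels, iterations=10):
    """K-means whose initial centroids come from labels, not random picks.

    Assumption: `labels` stands in for the ontology terms extracted for
    the query; each is a single lowercase token.
    """
    docs = [vectorize(s) for s in snippets]

    # Seed one centroid per label: the mean of the snippets containing it,
    # or a one-term vector if no snippet mentions the label.
    centroids = []
    for label in labels:
        members = [d for d in docs if label in d]
        centroids.append(mean(members) if members else {label: 1.0})

    assignment = None
    for _ in range(iterations):
        # Assignment step: each snippet goes to its most similar centroid.
        new_assignment = [
            max(range(len(centroids)), key=lambda k: cosine(d, centroids[k]))
            for d in docs
        ]
        if new_assignment == assignment:
            break  # converged
        assignment = new_assignment

        # Update step: recompute each non-empty cluster's centroid.
        for k in range(len(centroids)):
            members = [d for d, a in zip(docs, assignment) if a == k]
            if members:
                centroids[k] = mean(members)

    # Each cluster keeps the label that seeded it as its description.
    return {
        label: [s for s, a in zip(snippets, assignment) if a == k]
        for k, label in enumerate(labels)
    }


if __name__ == "__main__":
    results = [
        "jaguar car dealer price",
        "jaguar cat habitat jungle",
        "car engine price review",
        "big cat jungle predator",
    ]
    print(label_seeded_kmeans(results, ["car", "cat"]))
```

Because every cluster starts from a label, the label doubles as a readable description of the cluster, and the number of clusters is fixed by the number of labels rather than chosen by hand.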
| Original language | English |
|---|---|
| Pages (from-to) | 166-170+156 |
| Journal | Tien Tzu Hsueh Pao/Acta Electronica Sinica |
| Volume | 36 |
| Issue number | SUPPL. |
| State | Published - Dec 2008 |
| Externally published | Yes |
Keywords
- Label
- Ontology
- Search results clustering
Title
Search result clustering based on centroid optimization by ontology extraction