
Tree-based metric learning for distance computation in data mining

Ming Yan, Yan Zhang, Hongzhi Wang*
*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Distance is an essential measure in data mining, and a good metric often leads to good performance, so obtaining a proper metric systematically is critical. Distance metric learning is a classic approach to learning distances between instances in data sets with complex distributions. However, most research on distance metric learning is based on the Mahalanobis metric, which is equivalent to a linear transformation of the distance space and is therefore limited on complex data. To address this problem, we propose a metric learning method based on a non-linear transformation that is suitable for complex data. A tree model can handle non-linearly separable data by rearranging the input data and representing it in another form, and it can implicitly map the data to a new distance space through a non-linear activation function. However, a single tree model tends to overfit, which leads to higher generalization error. We therefore design a randomized algorithm that combines different tree models, which reduces the generalization error both in theory and in practice. We prove the correctness and effectiveness of our algorithm through theoretical analysis. Extensive experiments demonstrate that the algorithm is stable and suitable for data mining.
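The abstract's remark that a Mahalanobis metric amounts to a linear transformation can be checked numerically. The following is a minimal sketch, not taken from the paper: the matrix `L`, the points `x` and `y`, and all helper names are hypothetical and chosen only for illustration. It assumes the usual definition d_M(x, y) = sqrt((x - y)^T M (x - y)) with M = L^T L, and compares it with the plain Euclidean distance between L·x and L·y.

```python
import math

def mat_vec(L, x):
    """Multiply matrix L (a list of rows) by vector x."""
    return [sum(l_ij * x_j for l_ij, x_j in zip(row, x)) for row in L]

def gram(L):
    """Return M = L^T L, which is symmetric positive semidefinite."""
    n = len(L[0])
    return [[sum(row[i] * row[j] for row in L) for j in range(n)]
            for i in range(n)]

def mahalanobis(x, y, M):
    """Mahalanobis-style distance sqrt((x - y)^T M (x - y))."""
    d = [a - b for a, b in zip(x, y)]
    Md = mat_vec(M, d)
    return math.sqrt(sum(d_i * m_i for d_i, m_i in zip(d, Md)))

def euclidean(x, y):
    """Ordinary Euclidean distance between two vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

# Hypothetical linear map and sample points (illustration only).
L = [[2.0, 0.0],
     [1.0, 1.0]]
M = gram(L)
x, y = [1.0, 2.0], [3.0, -1.0]

d_metric = mahalanobis(x, y, M)                      # distance under M
d_linear = euclidean(mat_vec(L, x), mat_vec(L, y))   # distance after mapping by L
print(d_metric, d_linear)
```

Both values agree up to floating-point rounding, which is why any metric of this form can only rescale and rotate the input space. A non-linear model such as the tree ensemble described in the abstract is not restricted in this way.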

Original language: English
Title of host publication: Web Technologies and Applications - 17th Asia-Pacific Web Conference, APWeb 2015, Proceedings
Editors: Reynold Cheng, Bin Cui, Zhenjie Zhang, Ruichu Cai, Jia Xu
Publisher: Springer Verlag
Pages: 377-388
Number of pages: 12
ISBN (Print): 9783319252544
State: Published - 2015
Event: 17th Asia-Pacific Web Conference, APWeb 2015 - Guangzhou, China
Duration: 18 Sep 2015 to 20 Sep 2015

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 9313
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 17th Asia-Pacific Web Conference, APWeb 2015
Country/Territory: China
City: Guangzhou
Period: 18/09/15 to 20/09/15
