TY - GEN
T1 - ITCI: An information theory based classification algorithm for incomplete data
AU - Chen, Yicheng
AU - Li, Jianzhong
AU - Luo, Jizhou
PY - 2014
Y1 - 2014
AB - Classification is an important and widely studied task in data mining. However, most existing studies assume that the data to be classified is complete, whereas in practice much data contains missing values. When handling such data, deleting incomplete instances reduces the available information, and filling in missing values may introduce skew and errors. To avoid these problems, it is important to study how to classify incomplete data directly. This paper proposes ITCI, an information theory based classification algorithm. In the training stage, ITCI calculates the initial uncertainty of each class and each attribute's contribution to decreasing class uncertainty; in the testing stage, an instance is assigned to the class whose uncertainty is minimal after all attributes have been taken into consideration. Extensive experiments demonstrate the effectiveness and feasibility of the proposed method.
KW - classification
KW - incomplete data
KW - information theory
UR - https://www.scopus.com/pages/publications/84958536805
U2 - 10.1007/978-3-319-08010-9_19
DO - 10.1007/978-3-319-08010-9_19
M3 - Conference contribution
AN - SCOPUS:84958536805
SN - 9783319080093
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 167
EP - 179
BT - Web-Age Information Management - 15th International Conference, WAIM 2014, Proceedings
PB - Springer Verlag
T2 - 15th International Conference on Web-Age Information Management, WAIM 2014
Y2 - 16 June 2014 through 18 June 2014
ER -