A survey of open source data mining systems

  • Harbin Institute of Technology Shenzhen
  • Australian Taxation Office, Canberra

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

Abstract

Open source data mining software represents a new trend in data mining research, education and industrial applications, especially in small and medium enterprises (SMEs). With open source software, an enterprise can easily initiate a data mining project using the most current technology. Often the software is available at no cost, allowing the enterprise to focus instead on ensuring that its staff can freely learn the data mining techniques and methods. Open source ensures that staff can understand exactly how the algorithms work by examining the source code, if they so desire, and can also fine-tune the algorithms to suit the specific purposes of the enterprise. However, diversity, instability, scalability and poor documentation can be major concerns in using open source data mining systems. In this paper, we survey open source data mining systems currently available on the Internet. We compare 12 open source systems on several aspects, such as general characteristics, data source accessibility, data mining functionality, and usability. We discuss the advantages and disadvantages of these open source data mining systems.

Original language: English
Title of host publication: Emerging Technologies in Knowledge Discovery and Data Mining - PAKDD 2007 International Workshops, Revised Selected Papers
Publisher: Springer Verlag
Pages: 3-14
Number of pages: 12
ISBN (Print): 354077016X, 9783540770169
State: Published - 2007
Externally published: Yes
Event: Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2007 - Nanjing, China
Duration: 22 May 2007 to 22 May 2007

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 4819 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2007
Country/Territory: China
City: Nanjing
Period: 22/05/07 to 22/05/07

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

  1. SDG 8 - Decent Work and Economic Growth
  2. SDG 9 - Industry, Innovation, and Infrastructure

Keywords

  • Data mining
  • FLOSS
  • Open source software
