Abstract
Open source data mining software represents a new trend in data mining research, education and industrial applications, especially in small and medium enterprises (SMEs). With open source software, an enterprise can easily initiate a data mining project using the most current technology. Often the software is available at no cost, allowing the enterprise to focus instead on ensuring that its staff can freely learn the data mining techniques and methods. Open source also lets staff understand exactly how the algorithms work by examining the source code, if they so desire, and fine-tune the algorithms to suit the specific purposes of the enterprise. However, diversity, instability, scalability and poor documentation can be major concerns in using open source data mining systems. In this paper, we survey open source data mining systems currently available on the Internet. We compare 12 open source systems on several aspects, including general characteristics, data source accessibility, data mining functionality, and usability, and we discuss the advantages and disadvantages of these systems.
| Original language | English |
|---|---|
| Title of host publication | Emerging Technologies in Knowledge Discovery and Data Mining - PAKDD 2007 International Workshops, Revised Selected Papers |
| Publisher | Springer Verlag |
| Pages | 3-14 |
| Number of pages | 12 |
| ISBN (Print) | 354077016X, 9783540770169 |
| State | Published - 2007 |
| Externally published | Yes |
| Event | Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2007 - Nanjing, China. Duration: 22 May 2007 → 22 May 2007 |
Publication series
| Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
|---|---|
| Volume | 4819 LNAI |
| ISSN (Print) | 0302-9743 |
| ISSN (Electronic) | 1611-3349 |
Conference
| Conference | Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2007 |
|---|---|
| Country/Territory | China |
| City | Nanjing |
| Period | 22/05/07 → 22/05/07 |
UN SDGs
This output contributes to the following UN Sustainable Development Goals (SDGs)
- SDG 8 Decent Work and Economic Growth
- SDG 9 Industry, Innovation, and Infrastructure
Keywords
- Data mining
- FLOSS
- Open source software
Fingerprint
Dive into the research topics of 'A survey of open source data mining systems'. Together they form a unique fingerprint.