Parallel SVM For Large Data-set Mining
Price
Free (open access)
Volume
29
Pages
10
Published
2003
Size
471 kb
Paper DOI
10.2495/DATA030631
Copyright
WIT Press
Author(s)
L. Qian & T. Hung
Abstract
Support Vector Machines (SVMs) are gaining popularity as a data mining technique. As they are extended and applied to more industrial applications, performance limitations become a real concern in commercial settings, where data sets can be very large and fast analysis turnaround is needed for timely business decisions. This issue must be addressed for SVM to become a viable and practical commercial data mining tool. In this paper we present, evaluate and compare the performance of one particular parallel SVM training algorithm based on MPI programming. This parallel version of the SMO algorithm is proposed to speed up SVM training. A performance model based on a generalized Amdahl's formulation is applied to analyze scalability in terms of problem size and complexity. Further, it is used to provide some guidelines to determine the best
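As a rough illustration of the kind of scalability analysis the abstract refers to, the sketch below evaluates the classic form of Amdahl's law, speedup = 1 / ((1 - p) + p / n). This is not the generalized formulation used in the paper, which the abstract does not spell out. The function names and the parallel fraction of 0.95 are assumptions chosen only for illustration.

```python
# Classic Amdahl's law, shown only as an illustrative sketch.
# NOTE: the paper applies a *generalized* Amdahl formulation that is not
# given in the abstract; the names and numbers here are hypothetical.


def amdahl_speedup(parallel_fraction: float, n_procs: int) -> float:
    """Return the ideal speedup on n_procs processors.

    parallel_fraction is the share of the serial run time that can be
    parallelized (0.0 to 1.0); the rest is assumed strictly serial.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be in [0, 1]")
    if n_procs < 1:
        raise ValueError("n_procs must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_procs)


def parallel_efficiency(parallel_fraction: float, n_procs: int) -> float:
    """Return speedup divided by processor count."""
    return amdahl_speedup(parallel_fraction, n_procs) / n_procs


if __name__ == "__main__":
    p = 0.95  # assumed parallel fraction, for illustration only
    for n in (1, 2, 4, 8, 16, 32):
        s = amdahl_speedup(p, n)
        e = parallel_efficiency(p, n)
        print(f"procs={n:2d}  speedup={s:6.2f}  efficiency={e:5.2f}")
```

Under this simple model the speedup is bounded by 1 / (1 - p) regardless of processor count, which is why such a model can suggest a sensible number of processors for a given problem size.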
Keywords