WIT Press


Parallel SVM For Large Data-set Mining

Price

Free (open access)

Paper DOI

10.2495/DATA030631

Volume

29

Pages

10

Published

2003

Size

471 kb

Author(s)

L. Qian & T. Hung

Abstract

The Support Vector Machine (SVM) is gaining popularity as a data mining technique. As it is extended and applied to more industrial applications, performance limitations become a real concern in commercial settings, where data sets can be very large and business decisions require quick analysis turnaround. This issue needs to be addressed for SVM to become a viable and practical commercial data mining tool. In this paper we present, evaluate and compare the performance of one particular parallel SVM training algorithm based on MPI programming. This parallel version of the SMO algorithm is proposed to speed up SVM training. A performance model based on a generalized Amdahl's formulation is applied to analyze scalability in terms of problem size and complexity. Further, it is used to provide some guidelines to determine the best

Keywords