Effect of over-sampling versus under-sampling for SVM and LDA classifiers for activity recognition
Free (open access)
Volume 11 (2016), Issue 3
306 - 316
M.B ABIDINE, B. FERGANI & F.J ORDÓÑEZ
Accurately recognizing rare activities in sensor-network-based smart homes for monitoring elderly people is a challenging task. Activity recognition datasets are generally imbalanced, meaning that certain activities occur more frequently than others. Ignoring this class imbalance yields a misleading evaluation, which may have disastrous consequences for elderly persons. To overcome this problem, we evaluate two resampling methods, over-sampling (OS) and under-sampling (US), each combined with two discriminative classifiers: support vector machines (SVM) and linear discriminant analysis (LDA). Experimental results on multiple real-world smart home datasets demonstrate the feasibility of the proposal. In addition, a comparison with state-of-the-art techniques based on Conditional Random Fields (CRF) and Hidden Markov Models (HMM) shows that US-SVM and OS-LDA surpass HMM, CRF, SVM, LDA, OS-SVM and US-LDA. Overall, OS-LDA is the most effective method for recognizing activities.
human activity recognition, imbalanced data, LDA, machine learning, SVM
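
The two resampling ideas named in the abstract can be illustrated with a minimal sketch. It assumes the simplest variants, random duplication of minority samples for over-sampling and random removal of majority samples for under-sampling; the abstract does not say which OS and US variants the authors used, so this is not necessarily their method. The function names, the activity labels and the toy sensor vectors are hypothetical.

```python
import random
from collections import Counter


def group_by_class(X, y):
    """Collect feature vectors per class label, preserving first-seen label order."""
    by_class = {}
    for features, label in zip(X, y):
        by_class.setdefault(label, []).append(features)
    return by_class


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen samples until every class is as large as the largest one."""
    rng = random.Random(seed)
    by_class = group_by_class(X, y)
    target = max(len(rows) for rows in by_class.values())
    X_out, y_out = [], []
    for label, rows in by_class.items():
        # Draw with replacement to fill the gap up to the majority class size.
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        for features in rows + extra:
            X_out.append(features)
            y_out.append(label)
    return X_out, y_out


def random_undersample(X, y, seed=0):
    """Randomly drop samples until every class is as small as the smallest one."""
    rng = random.Random(seed)
    by_class = group_by_class(X, y)
    target = min(len(rows) for rows in by_class.values())
    X_out, y_out = [], []
    for label, rows in by_class.items():
        # Draw without replacement so no retained sample is repeated.
        for features in rng.sample(rows, target):
            X_out.append(features)
            y_out.append(label)
    return X_out, y_out


# Hypothetical toy data: binary sensor readings, one frequent and one rare activity.
X = [[1, 0, 0], [1, 0, 1], [1, 1, 0], [1, 0, 0], [1, 1, 1], [1, 0, 1],
     [0, 1, 1], [0, 0, 1]]
y = ["sleeping"] * 6 + ["toileting"] * 2

X_os, y_os = random_oversample(X, y)
X_us, y_us = random_undersample(X, y)

print("original:     ", Counter(y))
print("over-sampled: ", Counter(y_os))
print("under-sampled:", Counter(y_us))
```

In a setup like the one the abstract describes, the balanced set would then be passed to a classifier such as an SVM or LDA. Resampling is normally applied to the training split only, so that the test set keeps the original class distribution.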