WIT Press

Stroke Risk Factors Classification Modelling


Free (open access)

A Lourenço, A C Braga & O Belo


Medicine, like many other business and scientific areas, has begun to realise the advantages of Knowledge Discovery and Data Mining applications. As data volumes are increasing almost exponentially and physicians are not expected to perform data analysis themselves, new methodologies that shed light on the subject are more than welcome. This paper presents a highly topical medical problem: the behaviour of distinct types of stroke with respect to several well-known risk factors. The aim of this work is to evaluate those risk factors, determining which data mining techniques are best suited to the task and dealing with the conditioning of the problem and the data. This can be viewed as a feature selection problem, in which the most prevalent features for each type of stroke are to be selected. Several classification models relating stroke types were built, and their discriminative strength and the knowledge acquired from each were compared. Additionally, an association rules approach was taken, so that its results could be compared with those of the classification models and enrich the knowledge obtained.

Introduction

Knowledge Discovery in Databases (KDD) and data mining are among the most prominent research areas of the moment. Knowledge acquisition is the key to any organisation’s success, regardless of its particular domain, interests and resources. Adequate, timely data analysis plays a major role within organisational processes. Over the last decades, KDD has been enlarging its “domains” as it began to prove its real value in areas such as Finance, Stock Market Analysis, Business Analysis and Medicine. The idea of seeking knowledge through automatic, or semi-automatic, processing seemed quite appealing, especially because data volumes have been increasing almost exponentially. Today, there are an enormous number