WIT Press

Knowledge Discovery And Supervised Machine Learning In A Construction Project Database


Free (open access)

H Kim & L Soibelman


The construction industry is experiencing explosive growth in its capability to generate and collect data. Advances in data storage technology have allowed enormous amounts of data to be transferred into computerized database systems, and many efforts are now under way to convert these data into useful patterns or trends. Knowledge Discovery in Databases (KDD) is a process that combines Data Mining (DM) techniques from machine learning, pattern recognition, statistics, databases, and visualization to automatically extract concepts, interrelationships, and patterns of interest from a large database. This paper applies KDD and DM to the analysis of construction project data and presents the results of research that uses the KDD process to better identify recurring construction problems.

1 Introduction

The explosive growth of many business, government, and scientific databases has far outpaced our ability to interpret and digest the available data. Such volumes of data clearly overwhelm traditional methods of data analysis such as spreadsheets and ad-hoc queries. Traditional methods can create informative reports from data, but cannot analyze the contents of those reports. Thus, a significant need exists for a new generation of techniques and tools that can automatically assist humans in analyzing the mountains of data for useful knowledge (Soibelman & Kim, 2002).

As the construction industry adapts to new computer technologies in terms of hardware and software, computerized construction data are becoming more and more available. However, in most cases, these data may not be used, or