WIT Press

Association And Classification Models For Web Mining





E Blanc & P Giudici


We present methods for finding association rules in web clickstream data. We first introduce association rules and show how to draw data mining conclusions from them. We then compare association rules, which are local models, with tree-based models, and show how association rules can be derived from tree models. Our analysis has been conducted on a real e-commerce dataset.

1 The data

Every time a user connects to a web site, the server records all the actions performed in a log file. What is captured is the “click flow” (click-stream) of the mouse and the keys used by the user while navigating the site. Usually each mouse click corresponds to the display of a web page. Therefore, we can define a click-stream as the sequence of requested pages. The succession of pages viewed by a single user while navigating the Web identifies a user session. Typically, the analysis concentrates only on the part of each user session that concerns access to a specific site. The set of pages viewed within a user session that belong to a given site is known as a server session.

The data set that we consider for the analysis is the result of processing a log file from an e-commerce site. The source of the data cannot be specified; however, it is the website of a company that sells hardware and software products, and it will be referred to as “a web shop”. Accesses to the website were recorded in a log file for a period of about two years, from 30 September 1997 to 30 June 1999. The log file was then processed to produce