WIT Press

Dependency Networks And Bayesian Networks For Web Mining


C Tarantola & E Blanc


Following the approach described by Heckerman et al. [5], we present an application of Dependency Networks and Bayesian Networks to the analysis of a click-stream data set. Our aim is to discover which paths users follow most often. The relation between one web page and another is represented by a directed graph. Whereas Bayesian Networks use directed acyclic graphs, Dependency Networks may contain cycles. The analysis will be performed with the WinMine Toolkit software.

1 Introduction

Web mining is a methodology that applies data mining techniques to discover usage patterns from Web data, in order to design a web site optimally and better satisfy the needs of different visitors. The aim of this work is to use two graphical models, Bayesian Networks (BN hereafter) and Dependency Networks (DN hereafter), to analyse click-stream data. A click stream is a sequential series of page view requests. The click stream of page views for a single user across the entire Web is a user session. Typically, only the portion of each user session that accesses a specific site can be used for analysis (for more details see e.g. J. Srivastava et al. [8]). The methodology presented will be applied to a real data set from an e-commerce web site. All computations will be performed with the WinMine Toolkit software. WinMine is free software developed by the Machine Learning and Applied Statistics group of Microsoft Research that learns BN and DN from data. In our experience, this software is quite user friendly, even