Searching Relationships Between Enterprise Websites Using Graph Based Web Crawling
Free (open access)
61 - 69
R. C. F. De Souza, G. M. Caputo & N. F. F. Ebecken
The objective of this paper is to find explicit web relationships using enterprise websites as seeds. We apply a web crawler to find these relationships in a hierarchy starting from a given seed, using the external links to construct a tree weighted by Jaccard score. Starting from the root node, the proposed methodology follows these links to search for related enterprises, which are potential partners, suppliers, clients, etc. We crawl the whole site to find external links using the Breadth First Search (BFS) algorithm and build a tree structure containing only the external links of interest. The algorithms were implemented with very simple computational components and may produce interesting results for analyzing the domains of sites, their structure, and how they link to each other within their sphere of activity. Keywords: link analysis, BFS algorithm, web crawling.
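The crawl described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each edge is weighted by the Jaccard similarity between the external-link sets of the parent and child sites, which the abstract does not specify. It also replaces real crawling with a hypothetical in-memory mapping, `external_links`. The names `jaccard`, `build_tree`, `max_depth`, and the `*.example` hosts are illustrative only.

```python
from collections import deque


def jaccard(a, b):
    """Jaccard similarity |a & b| / |a | b|; 0.0 when both sets are empty."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_tree(seed, external_links, max_depth=2):
    """Expand the seed breadth-first and return weighted tree edges.

    external_links maps a site to the set of external sites it links to;
    it stands in for the result of crawling that site (an assumption).
    Each edge is (parent, child, depth, weight).
    """
    visited = {seed}
    queue = deque([(seed, 0)])
    edges = []
    while queue:
        site, depth = queue.popleft()
        if depth >= max_depth:
            continue  # do not expand beyond the depth limit
        for child in sorted(external_links.get(site, ())):
            if child in visited:
                continue  # each site appears once, so the result is a tree
            visited.add(child)
            # Assumed weighting: overlap of the two sites' external links.
            weight = jaccard(external_links.get(site, ()),
                             external_links.get(child, ()))
            edges.append((site, child, depth + 1, weight))
            queue.append((child, depth + 1))
    return edges


# Hypothetical toy link graph used only for illustration.
toy = {
    "a.example": {"b.example", "c.example"},
    "b.example": {"c.example", "d.example"},
    "c.example": {"a.example"},
}
tree = build_tree("a.example", toy)
```

The queue ensures that all sites at depth 1 are visited before any site at depth 2, which is what distinguishes BFS from a depth-first crawl. A real crawler would fill `external_links` by fetching pages and filtering out internal links.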