The Application of Sampling to the Design of Structural Analysis Web Crawlers
The World Wide Web (WWW) has grown into a rich information
resource that crawlers constantly traverse to harvest web content.
While collecting data, crawlers can cause denial of service at the
web servers they visit. This paper proposes sampling as a
selection strategy in the design of structural analysis web
crawlers. Sampling reduces the bandwidth cost that crawling imposes
on web servers while retaining the quality of the data that the
crawlers mine. The initial results of this study are promising and
are presented in this paper.
Keywords: web crawler, sampling, web server, denial of
service attacks