Design and implementation of a distributed high-performance web crawler

  • Ling Zhang
  • Yun Ming Ye
  • Hui Song
  • Shui Yu
  • Fan Yuan Ma*
  • *Corresponding author for this work
  • Shanghai Jiao Tong University

Research output: Contribution to journal › Article › peer-review

Abstract

The Web crawler is the core component of WWW search engines and information retrieval systems. This paper discusses the architecture of a distributed Web crawler and the design of its data structures, system modules, and related algorithms. It also comments on the key problems encountered during design and implementation and presents the solutions to those problems.

Original language: English
Pages (from-to): 59-61
Number of pages: 3
Journal: Shanghai Jiaotong Daxue Xuebao/Journal of Shanghai Jiaotong University
Volume: 38
Issue number: 1
State: Published - Jan 2004
Externally published: Yes

Keywords

  • Distributed system
  • Java
  • Search engine
  • Web crawler
