排名前50的开源爬虫

2016/06/30 13:20
阅读数 390

某英文站点整理

 

名字 开发语言 平台
Heritrix Java Linux
Nutch Java Cross-platform
Scrapy Python Cross-platform
DataparkSearch C++ Cross-platform
GNU Wget C Linux
GRUB C#, C, Python, Perl Cross-platform
ht://Dig C++ Unix
HTTrack C/C++ Cross-platform
ICDL Crawler C++ Cross-platform
mnoGoSearch C Windows
Norconex HTTP Collector Java Cross-platform
Open Source Server C/C++, Java PHP Cross-platform
PHP-Crawler PHP Cross-platform
YaCy Java Cross-platform
WebSPHINX Java Cross-platform
WebLech Java Cross-platform
Arale Java Cross-platform
JSpider Java Cross-platform
HyperSpider Java Cross-platform
Arachnid Java Cross-platform
Spindle Java Cross-platform
Spider Java Cross-platform
LARM Java Cross-platform
Metis Java Cross-platform
SimpleSpider Java Cross-platform
Grunk Java Cross-platform
CAPEK Java Cross-platform
Aperture Java Cross-platform
Smart and Simple Web Crawler Java Cross-platform
Web Harvest Java Cross-platform
Aspseek C++ Linux
Bixo Java Cross-platform
crawler4j Java Cross-platform
Ebot Erland Linux
Hounder Java Cross-platform
Hyper Estraier C/C++ Cross-platform
OpenWebSpider C#, PHP Cross-platform
Pavuk C Lunix
Sphider PHP Cross-platform
Xapian C++ Cross-platform
Arachnode.net C# Windows
Crawwwler C++ Java
Distributed Web Crawler C, Java, Python Cross-platform
iCrawler Java Cross-platform
pycreep Java Cross-platform
Opese C++ Linux
Andjing Java  
Ccrawler C# Windows
WebEater Java Cross-platform
JoBo Java Cross-platform

参考:http://bigdata-madesimple.com/top-50-open-source-web-crawlers-for-data-mining/

 

展开阅读全文
打赏
0
16 收藏
分享
加载中
更多评论
打赏
0 评论
16 收藏
0
分享
返回顶部
顶部