crawler | 17.5k | MIT | 2.0.2
Crawler is a ready-to-use web spider that works with proxies, asynchrony, rate limit, configurable request pools, jQuery, and HTTP/2 support.
bda-research | 3 months ago | crawler, javascript, spider, scraper

@nodelib/fs.walk | 238.4m | MIT | 3.0.1
A library for efficiently walking a directory recursively
nodelib | 9 months ago | crawler, NodeLib, fs, FileSystem

fdir | 131m | MIT | 6.5.0
The fastest directory crawler & globbing alternative to glob, fast-glob, & tiny-glob. Crawls 1m files in < 1s
thecodrr | 21 days ago | crawler, util, os, sys

isbot | 7m | Unlicense | 5.1.30
🤖/👨🦰 Recognise bots/crawlers/spiders using the user agent string.
omrilotan | 18 days ago | crawlers, bot, spiders, googlebot

pdf-parse | 5m | MIT | 1.1.1
Pure javascript cross-platform module to extract text from PDFs.
autokent | over 3 years ago | pdf-crawler, pdf-parse, xpdf, pdf.js