
This is a copy of the source for the Fresh Onions hidden service, which implements a Tor hidden service crawler / spider and web site.

Finds email addresses across hidden services.
Finds bitcoin addresses across hidden services.
Shows incoming / outgoing links to onion domains.
Up-to-date alive / dead hidden service status.
Search for "interesting" URL paths, useful 404 detection.
Fuzzy clone detection (requires elasticsearch, more advanced than superlist clone detection).

pip install:

Edit etc/database for your database setup
Edit etc/proxy for your TOR setup

script/push.sh someoniondirectory.onion
script/push.sh anotheroniondirectory.onion

Edit etc/uwsgi_only and set BASEDIR to wherever torscraper is installed.

Run:
init/scraper_service.sh # to start crawling
init/isup_service.sh # to keep site status up to date

The torscraper comes with optional elasticsearch capability (enabled by default). Edit etc/elasticsearch and set the variables there, or set ELASTICSEARCH_ENABLED=false to disable it. Run scripts/elasticsearch_migrate.sh to perform the initial setup after configuration. If elasticsearch is disabled there will be no fulltext search; crawling and discovering new sites will still work.

cronjobs

# harvest onions from various sources
1 18 * * * /home/scraper/torscraper/scripts/harvest.sh
1 4,16 * * * /home/scraper/torscraper/scripts/update_fingerprints.sh
# mark sites as genuine / fake from the /r/darknetmarkets superlist
1 9 * * 1 /home/scraper/torscraper/scripts/get_valid.sh
# scrape pastebin for onions (needs paid account / IP whitelisting)
*/5 * * * * /home/scraper/torscraper/scripts/pastebin.sh
1 */6 * * * /home/scraper/torscraper/scripts/portscan_up.sh
32 */2 * * * /home/scraper/torscraper/scripts/stronghold_paste_rip.sh
16 3 * * * /home/scraper/torscraper/scripts/detect_clones.sh

Fresh Onions runs on two servers: a frontend host running the database and the hidden service web site, and a backend host running the crawler. Probably most interesting to the reader is the setup for the backend.
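The feature list above mentions finding email addresses, bitcoin addresses and links to onion domains across hidden services. As a rough illustration of that kind of extraction, here is a minimal Python sketch using regular expressions. The function name `extract_indicators` and all three patterns are assumptions made for this example and are not taken from torscraper; the bitcoin pattern only covers legacy Base58 addresses and does not validate checksums.

```python
import re

# Hypothetical patterns, chosen for illustration only.
# v2 onion names are 16 base32 characters, v3 names are 56.
ONION_RE = re.compile(r"\b(?:[a-z2-7]{56}|[a-z2-7]{16})\.onion\b")
EMAIL_RE = re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b")
# Legacy Base58 bitcoin addresses (starting with 1 or 3); no checksum validation.
BITCOIN_RE = re.compile(r"\b[13][1-9A-HJ-NP-Za-km-z]{25,34}\b")


def extract_indicators(text):
    """Return sorted, de-duplicated onion domains, emails and bitcoin addresses."""
    return {
        "onions": sorted(set(ONION_RE.findall(text))),
        "emails": sorted(set(EMAIL_RE.findall(text))),
        "bitcoin": sorted(set(BITCOIN_RE.findall(text))),
    }


if __name__ == "__main__":
    sample = (
        "Contact admin@example.com, mirror at "
        "http://abcdefghij234567.onion/index.html"
    )
    print(extract_indicators(sample))
```

A real crawler would run something like this over every fetched page and store the results per domain; pattern matching alone produces false positives, so any hit is only a candidate until verified.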

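The elasticsearch note above says that setting ELASTICSEARCH_ENABLED=false removes fulltext search while crawling and discovery keep working. A minimal sketch of how a crawler could honour such a flag follows. It assumes the value reaches the process as an environment variable; the helper names `elasticsearch_enabled` and `handle_page`, and the `store_page` / `index_page` callables, are hypothetical and not part of torscraper.

```python
import os


def elasticsearch_enabled(environ=None):
    """Return False only for an explicit 'false', '0', 'no' or 'off' value.

    A missing variable counts as enabled, matching the stated default.
    """
    env = os.environ if environ is None else environ
    value = env.get("ELASTICSEARCH_ENABLED", "true")
    return value.strip().lower() not in ("false", "0", "no", "off")


def handle_page(url, body, store_page, index_page, environ=None):
    """Always store the crawl result; index it for fulltext search only when enabled."""
    store_page(url, body)  # crawling / discovery does not depend on the flag
    if elasticsearch_enabled(environ):
        index_page(url, body)  # optional fulltext indexing
        return True
    return False
```

Keeping the check in one helper means the rest of the crawler never needs to know whether a search backend is configured.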