TopAlter.com

Heritrix Alternatives

Heritrix

Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.

Heritrix (sometimes spelled heretrix, or misspelled or mis-said as heratrix/heritix/heretix/heratix) is an archaic word for heiress (a woman who inherits). Since our crawler seeks to collect and preserve the digital artifacts of our culture for the benefit of future researchers and generations, this name seemed apt.

Best Heritrix Alternatives

Need an alternative to Heritrix? Read on. We've looked at the best Heritrix alternatives available for Windows, Mac and Android.

Algolia

Free Personal, Web, Android SDK, Ruby, Python, JavaScript, AngularJS, cURL, Ruby on Rails, Node.JS, Objective-C

Algolia helps product teams connect their users with information by providing the building blocks they need to create fast, relevant, personalized search.

Features:

  • API
  • Developer Tools
  • Full text search
  • Indexed search
  • Real-time
  • REST API
  • Search engine
  • Search-server

Mixnode

Commercial, Web

Mixnode is a fast, flexible, massively scalable platform to extract and analyze data from the web. Mixnode allows you to think of all resources on the web as rows in...

Features:

  • Content-Type Filtering
  • Support for Amazon S3
  • URL Filtering
  • WARC Output

Google Custom Search Engine

Freemium, Web

With Google Custom Search, add a search box to your homepage to help people find what they need on your website.

Features:

  • Embeddable
  • Search engine

Expertrec Search Engine

Commercial, Software as a Service (SaaS)

Expertrec custom search started as a replacement for Google Site Search. It adds fast search autocomplete, spelling correction, and search listing pages to your website.

Features:

  • Ad-free
  • Full text search
  • Instant search
  • Multiple languages
  • Support for Right-to-Left
  • Search Analytics
  • Search engine
  • Voice Search
  • Autocompletion
  • Python
  • Software as a Service

ACHE Crawler

Free, Open Source, Mac, Windows, Linux

ACHE is a web crawler for domain-specific search.

Apache Nutch

Free, Open Source, Mac, Windows, Linux

Apache Nutch is a highly extensible and scalable open source web crawler software project. Nutch is coded entirely in the Java programming language, but data is...

Features:

  • Extensible by Plugins/Extensions
  • Scalable

StormCrawler

Free, Open Source, Mac, Windows, Linux

StormCrawler is an open source SDK for building distributed web crawlers with Apache Storm.

Apisearch

Freemium, Open Source, Self-Hosted, Instagram, Twitter, GitHub Pages

Search over millions of documents and give your users unique, memorable experiences.

Features:

  • Embeddable
  • Full text search
  • Indexed search
  • Search engine
  • Search-server


Copyright © 2021 TopAlter.com
