Release 1.3 of Norconex HTTP Collector is now available. This version of our open-source web crawler adds the following features:
- Now supports NTLM authentication. Experimental support added for SPNEGO and Kerberos.
- A checksum is now added to each document's metadata.
- HTTPClient creation has been refactored, with many new configuration options added (connection timeout, charset, maximum redirects, and several more).
- Can now optionally trust all SSL certificates.
- Integrates new features of Norconex Importer 1.2.0, such as support for WordPerfect document parsing, new filters and transformers, etc.
- Integrates new features of Norconex Committer 1.2.0, such as defining multiple committers, retrying upon commit failure, etc.
- Other third-party library upgrades.
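
To illustrate the document checksum item above, here is a minimal sketch of how a checksum stored in a document's metadata can be used to tell whether the document changed between two crawls. It is not the collector's actual implementation: the `DocumentChecksum` class, the `example.checksum` metadata key, and the choice of MD5 over the raw content are all assumptions made for this example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

public class DocumentChecksum {

    // Hypothetical metadata key, chosen for this example only.
    static final String CHECKSUM_KEY = "example.checksum";

    // Returns the MD5 digest of the content as a lowercase hex string.
    static String md5Hex(byte[] content) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest(content)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is required on every Java platform, so this is not expected.
            throw new IllegalStateException(e);
        }
    }

    // Convenience overload for text content, encoded as UTF-8.
    static String md5Hex(String text) {
        return md5Hex(text.getBytes(StandardCharsets.UTF_8));
    }

    // True when the checksum stored in the metadata matches the one
    // recorded during a previous crawl.
    static boolean isUnchanged(Map<String, String> metadata, String previousChecksum) {
        return previousChecksum != null
                && previousChecksum.equals(metadata.get(CHECKSUM_KEY));
    }

    public static void main(String[] args) {
        String previous = md5Hex("<html>version 1</html>");

        Map<String, String> metadata = new HashMap<>();
        metadata.put(CHECKSUM_KEY, md5Hex("<html>version 2</html>"));

        System.out.println("unchanged: " + isUnchanged(metadata, previous));
    }
}
```

Comparing short checksums instead of full content lets a crawler skip re-processing documents that have not changed since the last run.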