Norconex is currently showcasing its new Norconex Content Analytics product at the GTEC event in Ottawa.  Mike Clark and Khalid Alhomoud are having a good time meeting new faces and existing customers.  If you are near Ottawa, visit us at booth 908 in the Ottawa Convention Center (Shaw Center) for a free demo that could change the way you look at your data.  The event ends tomorrow (Wednesday, October 29th).

Khalid and Mike at Ottawa GTEC


GATINEAU, QC, CANADA – Thursday, September 22, 2014 – Norconex is excited to announce the launch of Norconex Content Analytics, enabling organizations to get deep insights on their current information assets.

Norconex believes its Content Analytics product will provide customers with valuable statistical reports on documents from all kinds of enterprise repository sources, ranging from local file systems to remote secure servers, at a fraction of the cost of compiling reports manually or with competing products.

“I can already assess that this affordable enterprise solution will save some of our customers a fortune on their data migration projects,” said David Gaulin, Vice President of Professional Services at Norconex.

Norconex Content Analytics Availability

Norconex Content Analytics is a product driven by customer feedback and is part of Norconex’s commitment to delivering quality commercial products. Norconex Content Analytics is available immediately for purchase. Additional information can be found at /enterprise-search-software/content-analytics/.

About Norconex

Founded in 2007, Norconex is a leader in enterprise search and data discovery. The company offers a wide range of products and services designed to help with the processing and analysis of structured and unstructured data.

Norconex Content Analytics

For more information on Norconex Content Analytics:

Website: /enterprise-search-software/content-analytics/



GATINEAU, QC, CANADA – Thursday, August 25, 2014 – Norconex is announcing the launch of Norconex Filesystem Collector, providing organizations with a free “universal” filesystem crawler. The Norconex Filesystem Collector enables document indexing into target repositories of choice, such as enterprise search engines.

Following on the success of the Norconex HTTP Collector web crawler, Norconex Filesystem Collector is the second open-source crawler contribution to the Norconex “Collector” suite. Norconex believes this crawler allows customers to adopt a full-featured, enterprise-class local or remote file system crawling solution that outlasts their enterprise search solution or other data repository.

“This not only facilitates any future migrations but also allows customer addition of their own ETL logic into a very flexible crawling architecture, whether using Autonomy, Solr/LucidWorks, ElasticSearch, or any other data repository,” said Norconex President Pascal Essiembre.

Norconex Filesystem Collector Availability

Norconex Filesystem Collector is part of Norconex’s commitment to deliver quality open-source products, backed by community or commercial support. Norconex Filesystem Collector is available for immediate download at /collectors/collector-filesystem/download.

Founded in 2007, Norconex is a leader in enterprise search and data discovery. The company offers a wide range of products and services designed to help with the processing and analysis of structured and unstructured data.

For more information on Norconex Filesystem Collector:

Website: /collectors/collector-filesystem




Release 1.3.0 of Norconex Importer is now available.  Release overview:

  • Now stores the content “family” for each document as “importer.contentFamily”.
  • New SplitTagger: splits values into multiple values using a separator of choice.
  • New CopyTagger: copies document metadata fields to other fields.
  • New HierarchyTagger: splits a field string into multiple segments representing each node of a hierarchical branch.
  • ReplaceTagger now supports regular expressions.
  • Improved MIME type detection.
  • More…
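
To give a feel for the kind of output the HierarchyTagger description implies, here is a minimal sketch in plain Java. It does not use the Importer API: the class name `HierarchySketch`, the method `expand`, and the sample value are all made up for illustration, and the real tagger is configured through the Importer rather than called like this.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical illustration only; not the Norconex HierarchyTagger API. */
final class HierarchySketch {

    /** Returns one entry per node along the branch, from root to leaf. */
    static List<String> expand(String value, String separator) {
        List<String> nodes = new ArrayList<>();
        StringBuilder path = new StringBuilder();
        // Remember whether the value starts with the separator so it is kept.
        boolean leading = value.startsWith(separator);
        // Quote the separator so characters such as "." or "|" are literal.
        for (String segment : value.split(Pattern.quote(separator))) {
            if (segment.isEmpty()) {
                continue; // skip empty segments from leading or doubled separators
            }
            if (path.length() > 0 || leading) {
                path.append(separator);
            }
            path.append(segment);
            nodes.add(path.toString());
        }
        return nodes;
    }

    public static void main(String[] args) {
        // prints [/vegetable, /vegetable/potato, /vegetable/potato/sweet]
        System.out.println(expand("/vegetable/potato/sweet", "/"));
    }
}
```

Storing every ancestor path as its own value is what makes it possible to filter or facet on any level of the branch, not just the leaf.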

Download it now.

Web site: /collectors/importer/

Norconex Commons Lang 1.4.0 was just released.

New features:

  • New DataUnit class to perform data unit (KB, MB, GB, etc.) conversions, much like Java's TimeUnit class.
  • New DataUnitFormatter to format any data unit to a human-readable format, taking into account locale and decimals.
  • New percentage formatter.
  • New ContentType class to represent a file media/MIME type and obtain its usual name, content family, and file extension(s).
  • New ContentFamily class to represent a group of files of similar content types. Useful for content categorization.
  • New ObservableMap class.
  • More…
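
For readers who have not used Java's `TimeUnit`, the sketch below shows what a `TimeUnit`-style conversion looks like when applied to data sizes. It is a stand-in written for this post, not the Commons Lang `DataUnit` class: the enum name `SimpleDataUnit`, its `convert` method, and the choice of 1024 as the step between units are assumptions.

```java
/** Hypothetical stand-in for illustration; not the Norconex DataUnit class. */
enum SimpleDataUnit {
    B(0), KB(1), MB(2), GB(3), TB(4);

    private final long bytes;

    SimpleDataUnit(int power) {
        long b = 1;
        for (int i = 0; i < power; i++) {
            b *= 1024; // assumption: binary units (1 KB = 1024 B)
        }
        this.bytes = b;
    }

    /**
     * Converts an amount expressed in sourceUnit into this unit, mirroring
     * the argument order of TimeUnit.convert. Results are truncated, and
     * very large amounts can still overflow a long.
     */
    long convert(long amount, SimpleDataUnit sourceUnit) {
        if (sourceUnit.bytes >= bytes) {
            return amount * (sourceUnit.bytes / bytes);
        }
        return amount / (bytes / sourceUnit.bytes);
    }

    public static void main(String[] args) {
        System.out.println(KB.convert(3, MB));    // prints 3072
        System.out.println(GB.convert(2048, MB)); // prints 2
    }
}
```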

Download it now.

Web site: /product/commons-lang/

Release 1.3 of Norconex HTTP Collector is now available.  Among new features added to our open-source web crawler, you can expect the following:

  • Now supports NTLM authentication. Experimental support added for SPNEGO and Kerberos.
  • Document checksums are added to each document's metadata.
  • Refactoring of HTTPClient creation with many new configuration options added (connection timeout, charset, maximum redirects, and several more).
  • Can now optionally trust all SSL certificates.
  • Integrates new features of Norconex Importer 1.2.0 such as support for WordPerfect document parsing, new filters and transformers, etc.
  • Integrates new features of Norconex Committer 1.2.0 such as defining multiple committers, retrying upon commit failure, etc.
  • Other third-party library upgrades.

Download it now!

Norconex Importer 1.2.0 was just released along with a new website for it.

New features:

  • Now supports text extraction from WordPerfect documents.
  • New transformer to reduce consecutive instances of the same string to only one instance.
  • New transformer to perform search and replace on document content using regular expressions.
  • New filter to exclude/include documents with no data for one or more specified metadata properties.
  • Now attempts to detect the character encoding of a character stream by looking at the Content-Type metadata. If none is present, defaults to UTF-8.
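
The last item describes a simple fallback rule, which can be sketched in a few lines of plain Java. This is not the Importer's actual code: the class `CharsetSketch`, the method `detect`, and the regular expression are illustrative, and falling back to UTF-8 for an unrecognized charset name is an assumption beyond what the release note states.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical illustration only; not the Norconex Importer implementation. */
final class CharsetSketch {

    // Matches the charset parameter of a Content-Type value, quoted or not.
    private static final Pattern CHARSET = Pattern.compile(
            "charset\\s*=\\s*\"?([^\\s;\"]+)", Pattern.CASE_INSENSITIVE);

    /** Returns the charset named in the Content-Type value, or UTF-8. */
    static Charset detect(String contentType) {
        if (contentType != null) {
            Matcher m = CHARSET.matcher(contentType);
            if (m.find()) {
                try {
                    return Charset.forName(m.group(1));
                } catch (IllegalArgumentException e) {
                    // Assumption: an unknown or malformed name also falls
                    // back to the default instead of failing.
                }
            }
        }
        return StandardCharsets.UTF_8;
    }

    public static void main(String[] args) {
        System.out.println(detect("text/html; charset=ISO-8859-1")); // prints ISO-8859-1
        System.out.println(detect("text/plain"));                    // prints UTF-8
    }
}
```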

Download it now!

Web site: /collectors/importer/


Norconex Committer and all its current concrete implementations (Solr, Elasticsearch, IDOL) have been upgraded, and their web sites have been redesigned.  Committers are libraries responsible for posting data to various repositories (typically search engines).  They are used in other products or projects, such as Norconex HTTP Collector. (more…)

Norconex Commons Lang 1.3.0 was just released along with a new website for it.

New features:

  • New YearMonthDay class for a local date without time.
  • New YearMonthDayInterval class for a local date range without time.
  • New FileMonitor and IFileChangeListener to be notified of file changes.
  • New methods on FileUtil to visit empty directories or delete empty directories older than a date.

Grab it while it is still warm!

Web site: /product/commons-lang/

Happy coding!

Norconex just released version 1.2 of Norconex HTTP Collector, its open-source web crawler.  Along with it comes a complete product web site redesign and a new logo: a lovely web crawling spider wearing a Norconex hat.

Some changes in this feature release: (more…)