public class ScriptTagger extends AbstractStringTagger
Tag incoming documents using a scripting language.
The default script engine is JavaScript.
Refer to ScriptRunner for more information on using a scripting language with Norconex Importer.
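The following variables are made available to your script for each document:
reference: Document unique reference, as a string.
content: Document content, as a string (of maxReadSize length).
metadata: Document metadata, as an ImporterMetadata object.
parsed: Whether the document was already parsed, as a boolean.
sectionIndex: Content section index if the content had to be split, as an integer.
No return value is expected from your script. Returning one has no effect.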
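Since the content variable holds at most maxReadSize characters, a long document may reach the script in several pieces, each identified by sectionIndex. The sketch below only illustrates that idea with a plain fixed-size split. The class SectionSplitSketch and the method splitIntoSections are hypothetical names, not part of Norconex Importer, and the real splitting logic may differ (for example in where it cuts).

```java
import java.util.ArrayList;
import java.util.List;

public class SectionSplitSketch {

    // Hypothetical helper (not a Norconex API): cuts content into
    // consecutive pieces of at most maxReadSize characters.
    static List<String> splitIntoSections(String content, int maxReadSize) {
        if (maxReadSize <= 0) {
            throw new IllegalArgumentException("maxReadSize must be positive");
        }
        List<String> sections = new ArrayList<>();
        for (int start = 0; start < content.length(); start += maxReadSize) {
            int end = Math.min(content.length(), start + maxReadSize);
            sections.add(content.substring(start, end));
        }
        return sections;
    }

    public static void main(String[] args) {
        // Each piece would correspond to one script invocation, with its
        // position in the list playing the role of sectionIndex.
        List<String> sections = splitIntoSections("apples and oranges", 8);
        for (int sectionIndex = 0; sectionIndex < sections.size(); sectionIndex++) {
            System.out.println(sectionIndex + ": [" + sections.get(sectionIndex) + "]");
        }
    }
}
```

A script that adds a metadata value unconditionally would therefore run once per section, so it may be worth checking sectionIndex before adding values that should appear only once.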
<tagger class="com.norconex.importer.handler.tagger.impl.ScriptTagger"
    engineName="(script engine name)"
    sourceCharset="(character encoding)"
    maxReadSize="(max content characters to read at once)" >
  <restrictTo caseSensitive="[false|true]"
      field="(name of header/metadata field name to match)">
    (regular expression of value to match)
  </restrictTo>
  <!-- multiple "restrictTo" tags allowed (only one needs to match) -->
  <script>(your script)</script>
</tagger>
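The restrictTo semantics described in the configuration (a regular expression tested against a metadata field, optional case sensitivity, and only one restriction needing to match) can be sketched in plain Java. This is an illustration under stated assumptions, not the library's implementation: the class names RestrictToSketch and Restriction are hypothetical, the field name contentType is made up for the example, the regular expression is assumed to have to match the whole value, and an empty restriction list is assumed to mean the tagger applies to every document.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

public class RestrictToSketch {

    // Hypothetical stand-in for a single <restrictTo> element.
    static final class Restriction {
        final String field;
        final Pattern pattern;

        Restriction(String field, String regex, boolean caseSensitive) {
            this.field = field;
            // caseSensitive="false" maps to a case-insensitive pattern.
            this.pattern = Pattern.compile(
                    regex, caseSensitive ? 0 : Pattern.CASE_INSENSITIVE);
        }

        // True if any value of the field matches the expression entirely
        // (whole-value matching is an assumption of this sketch).
        boolean matches(Map<String, List<String>> metadata) {
            List<String> values = metadata.get(field);
            if (values == null) {
                return false;
            }
            for (String value : values) {
                if (pattern.matcher(value).matches()) {
                    return true;
                }
            }
            return false;
        }
    }

    // Only one restriction needs to match. No restrictions at all is
    // treated as "always applicable" (an assumption of this sketch).
    static boolean isApplicable(
            List<Restriction> restrictions,
            Map<String, List<String>> metadata) {
        if (restrictions.isEmpty()) {
            return true;
        }
        for (Restriction restriction : restrictions) {
            if (restriction.matches(metadata)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Map<String, List<String>> metadata = Collections.singletonMap(
                "contentType", Arrays.asList("text/html"));
        List<Restriction> restrictions = Arrays.asList(
                new Restriction("contentType", "TEXT/.*", false),
                new Restriction("title", ".*fruit.*", true));
        System.out.println(isApplicable(restrictions, metadata));
    }
}
```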
The following examples add a new metadata field indicating which fruit a document is about.
JavaScript:
<tagger class="com.norconex.importer.handler.tagger.impl.ScriptTagger">
  <script><![CDATA[
    metadata.addString('fruit', 'apple');
  ]]></script>
</tagger>
Lua:
<tagger class="com.norconex.importer.handler.tagger.impl.ScriptTagger"
    engineName="lua">
  <script><![CDATA[
    metadata:addString('fruit', {'apple'});
  ]]></script>
</tagger>
See also: ScriptRunner
Constructor and Description |
---|
ScriptTagger() |
Modifier and Type | Method and Description |
---|---|
boolean | equals(Object other) |
String | getEngineName() |
String | getScript() |
int | hashCode() |
protected void | loadStringTaggerFromXML(org.apache.commons.configuration.XMLConfiguration xml) Loads configuration settings specific to the implementing class. |
protected void | saveStringTaggerToXML(EnhancedXMLStreamWriter writer) Saves configuration settings specific to the implementing class. |
void | setEngineName(String engineName) |
void | setScript(String script) |
protected void | tagStringContent(String reference, StringBuilder content, ImporterMetadata metadata, boolean parsed, int sectionIndex) |
String | toString() |
Methods inherited from AbstractStringTagger and its superclasses:
getMaxReadSize, loadCharStreamTaggerFromXML, saveCharStreamTaggerToXML, setMaxReadSize, tagTextDocument
getSourceCharset, loadHandlerFromXML, saveHandlerToXML, setSourceCharset, tagApplicableDocument
tagDocument
addRestriction, addRestriction, addRestrictions, clearRestrictions, detectCharsetIfBlank, getRestrictions, isApplicable, loadFromXML, removeRestriction, removeRestriction, saveToXML
public String getEngineName()
public void setEngineName(String engineName)
public String getScript()
public void setScript(String script)
protected void tagStringContent(String reference, StringBuilder content, ImporterMetadata metadata, boolean parsed, int sectionIndex) throws ImporterHandlerException
Specified by: tagStringContent in class AbstractStringTagger
Throws: ImporterHandlerException
protected void saveStringTaggerToXML(EnhancedXMLStreamWriter writer) throws XMLStreamException
Description copied from class: AbstractStringTagger
Saves configuration settings specific to the implementing class.
Specified by: saveStringTaggerToXML in class AbstractStringTagger
Parameters: writer - the XML writer
Throws: XMLStreamException - could not save to XML
protected void loadStringTaggerFromXML(org.apache.commons.configuration.XMLConfiguration xml) throws IOException
Description copied from class: AbstractStringTagger
Loads configuration settings specific to the implementing class.
Specified by: loadStringTaggerFromXML in class AbstractStringTagger
Parameters: xml - XML configuration
Throws: IOException - could not load from XML
public boolean equals(Object other)
Overrides: equals in class AbstractStringTagger
public int hashCode()
Overrides: hashCode in class AbstractStringTagger
public String toString()
Overrides: toString in class AbstractStringTagger
Copyright © 2009–2021 Norconex Inc. All rights reserved.