public class RegexReferenceFilter extends AbstractDocumentFilter
Accepts or rejects a document based on its reference (e.g. URL).
<filter class="com.norconex.importer.handler.filter.impl.RegexReferenceFilter"
    onMatch="[include|exclude]" caseSensitive="[false|true]">
  <restrictTo caseSensitive="[false|true]"
      field="(name of header/metadata field name to match)">
    (regular expression of value to match)
  </restrictTo>
  <!-- multiple "restrictTo" tags allowed (only one needs to match) -->
  <regex>(regular expression of reference to match)</regex>
</filter>
Can be used as either a pre-parse or a post-parse handler.
The following rejects documents whose reference contains "/login/".
<filter class="com.norconex.importer.handler.filter.impl.RegexReferenceFilter"
    onMatch="exclude">
  <regex>.*/login/.*</regex>
</filter>
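The include/exclude decision configured above can be sketched in plain Java. This is a minimal stand-in, not the Norconex implementation: `ReferenceFilterSketch`, its nested `OnMatch` enum, and `accept` are hypothetical names, and it assumes the expression must match the whole reference (which the leading and trailing `.*` in the example suggest).

```java
import java.util.regex.Pattern;

// Hypothetical stand-in, not the Norconex class: it models only the
// include/exclude decision made on a reference string.
class ReferenceFilterSketch {
    enum OnMatch { INCLUDE, EXCLUDE }

    private final Pattern pattern;
    private final OnMatch onMatch;

    ReferenceFilterSketch(String regex, OnMatch onMatch) {
        this.pattern = Pattern.compile(regex);
        this.onMatch = onMatch;
    }

    // Returns true when the document would be kept.
    boolean accept(String reference) {
        // Assumption: the whole reference must match the expression.
        boolean matched = pattern.matcher(reference).matches();
        return onMatch == OnMatch.INCLUDE ? matched : !matched;
    }

    public static void main(String[] args) {
        ReferenceFilterSketch filter =
                new ReferenceFilterSketch(".*/login/.*", OnMatch.EXCLUDE);
        // Contains "/login/", so it is rejected: prints false
        System.out.println(filter.accept("https://example.com/login/form"));
        // No "/login/" segment, so it is kept: prints true
        System.out.println(filter.accept("https://example.com/docs/index.html"));
    }
}
```

With the real class, the equivalent setup would presumably go through the `RegexReferenceFilter(String regex, OnMatch onMatch)` constructor listed below.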
Constructor and Description |
---|
RegexReferenceFilter() |
RegexReferenceFilter(String regex) |
RegexReferenceFilter(String regex, OnMatch onMatch) |
RegexReferenceFilter(String regex, OnMatch onMatch, boolean caseSensitive) |
Modifier and Type | Method and Description |
---|---|
boolean | equals(Object obj) |
String | getRegex() |
int | hashCode() |
boolean | isCaseSensitive() |
protected boolean | isDocumentMatched(String reference, InputStream input, ImporterMetadata metadata, boolean parsed) |
protected void | loadFilterFromXML(org.apache.commons.configuration.XMLConfiguration xml) |
protected void | saveFilterToXML(EnhancedXMLStreamWriter writer) |
void | setCaseSensitive(boolean caseSensitive) |
void | setRegex(String regex) |
String | toString() |
acceptDocument, getOnMatch, loadHandlerFromXML, saveHandlerToXML, setOnMatch
addRestriction, addRestriction, addRestrictions, clearRestrictions, detectCharsetIfBlank, getRestrictions, isApplicable, loadFromXML, removeRestriction, removeRestriction, saveToXML
public RegexReferenceFilter()
public RegexReferenceFilter(String regex)
public String getRegex()
public boolean isCaseSensitive()
public void setCaseSensitive(boolean caseSensitive)
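The effect of the case-sensitivity flag can be illustrated with `java.util.regex` alone. This is a sketch under assumptions, not the library's code: `CaseSensitivitySketch` and `matches` are hypothetical, and mapping the flag to `Pattern.CASE_INSENSITIVE` is an assumption about how such a setting is typically applied.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of the Norconex API.
class CaseSensitivitySketch {
    // Compiles the expression with or without CASE_INSENSITIVE and
    // tests it against the whole reference.
    static boolean matches(String regex, String reference, boolean caseSensitive) {
        int flags = caseSensitive ? 0 : Pattern.CASE_INSENSITIVE;
        return Pattern.compile(regex, flags).matcher(reference).matches();
    }

    public static void main(String[] args) {
        String regex = ".*/login/.*";
        String reference = "https://example.com/LOGIN/form";
        System.out.println(matches(regex, reference, true));  // prints false
        System.out.println(matches(regex, reference, false)); // prints true
    }
}
```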
public final void setRegex(String regex)
protected boolean isDocumentMatched(String reference, InputStream input, ImporterMetadata metadata, boolean parsed) throws ImporterHandlerException
Specified by: isDocumentMatched in class AbstractDocumentFilter
Throws: ImporterHandlerException
protected void loadFilterFromXML(org.apache.commons.configuration.XMLConfiguration xml) throws IOException
Specified by: loadFilterFromXML in class AbstractDocumentFilter
Throws: IOException
protected void saveFilterToXML(EnhancedXMLStreamWriter writer) throws XMLStreamException
Specified by: saveFilterToXML in class AbstractDocumentFilter
Throws: XMLStreamException
public String toString()
Overrides: toString in class AbstractDocumentFilter
public int hashCode()
Overrides: hashCode in class AbstractDocumentFilter
public boolean equals(Object obj)
Overrides: equals in class AbstractDocumentFilter
Copyright © 2009–2021 Norconex Inc. All rights reserved.