org.opencms.search.extractors
Interface I_CmsExtractionResult

All Known Implementing Classes:
CmsExtractionResult

public interface I_CmsExtractionResult

The result of a document text extraction.

This data structure contains the extracted text as well as (optional) meta information extracted from the document.

Since:
6.0.0
Version:
$Revision: 1.14 $
Author:
Alexander Kandzior

Field Summary
static java.lang.String ITEM_AUTHOR
          Key to access the document author name in the item map.
static java.lang.String ITEM_CATEGORY
          Key to access the document category in the item map.
static java.lang.String ITEM_COMMENTS
          Key to access the document comments in the item map.
static java.lang.String ITEM_COMPANY
          Key to access the document company name in the item map.
static java.lang.String ITEM_CONTENT
          Key for accessing the default (combined) content in getContentItems().
static java.lang.String ITEM_CREATOR
          Key to access the document creator name in the item map.
static java.lang.String ITEM_KEYWORDS
          Key to access the document keywords in the item map.
static java.lang.String ITEM_MANAGER
          Key to access the document manager name in the item map.
static java.lang.String ITEM_PRODUCER
          Key to access the document producer name in the item map.
static java.lang.String ITEM_RAW
          Key for accessing the raw content in getContentItems().
static java.lang.String ITEM_SUBJECT
          Key to access the document subject in the item map.
static java.lang.String ITEM_TITLE
          Key to access the document title in the item map.
 
Method Summary
 byte[] getBytes()
          Returns this extraction result serialized as a byte array.
 java.lang.String getContent()
          Returns the extracted content combined as a String.
 java.util.Map<java.lang.String,java.lang.String> getContentItems()
          Returns the extracted content as individual items.
 void release()
          Releases the information stored in this extraction result, to free up the memory used.
 

Field Detail

ITEM_AUTHOR

static final java.lang.String ITEM_AUTHOR
Key to access the document author name in the item map.

See Also:
Constant Field Values

ITEM_CATEGORY

static final java.lang.String ITEM_CATEGORY
Key to access the document category in the item map.

See Also:
Constant Field Values

ITEM_COMMENTS

static final java.lang.String ITEM_COMMENTS
Key to access the document comments in the item map.

See Also:
Constant Field Values

ITEM_COMPANY

static final java.lang.String ITEM_COMPANY
Key to access the document company name in the item map.

See Also:
Constant Field Values

ITEM_CONTENT

static final java.lang.String ITEM_CONTENT
Key for accessing the default (combined) content in getContentItems().

See Also:
Constant Field Values

ITEM_CREATOR

static final java.lang.String ITEM_CREATOR
Key to access the document creator name in the item map.

See Also:
Constant Field Values

ITEM_KEYWORDS

static final java.lang.String ITEM_KEYWORDS
Key to access the document keywords in the item map.

See Also:
Constant Field Values

ITEM_MANAGER

static final java.lang.String ITEM_MANAGER
Key to access the document manager name in the item map.

See Also:
Constant Field Values

ITEM_PRODUCER

static final java.lang.String ITEM_PRODUCER
Key to access the document producer name in the item map.

See Also:
Constant Field Values

ITEM_RAW

static final java.lang.String ITEM_RAW
Key for accessing the raw content in getContentItems().

See Also:
Constant Field Values

ITEM_SUBJECT

static final java.lang.String ITEM_SUBJECT
Key to access the document subject in the item map.

See Also:
Constant Field Values

ITEM_TITLE

static final java.lang.String ITEM_TITLE
Key to access the document title in the item map.

See Also:
Constant Field Values

Method Detail

getContent

java.lang.String getContent()
Returns the extracted content combined as a String.

Returns:
the extracted content combined as a String

getBytes

byte[] getBytes()
Returns this extraction result serialized as a byte array.

Returns:
this extraction result serialized as a byte array

getContentItems

java.util.Map<java.lang.String,java.lang.String> getContentItems()
Returns the extracted content as individual items.

The result Map contains all content items extracted by the extractor. The key is always a String, and contains the name of the item. The value is also a String and contains the extracted text.

The exact set of items depends on the type of resource that was indexed.

Returns:
the extracted content as individual items
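
As an illustration of reading individual items from the map described above, the following is a minimal sketch, not OpenCms code. It uses a stand-in interface (ExtractionResult) so that it compiles without OpenCms on the classpath, and the constant values in it are placeholders; the real values are the ones listed under "Constant Field Values". The helper name itemOrDefault is hypothetical. Since the meta information is optional, the sketch treats a missing or blank item as absent.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ContentItemsExample {

    // Stand-in for I_CmsExtractionResult so the sketch compiles without OpenCms.
    // The constant values below are placeholders, not the real OpenCms values.
    interface ExtractionResult {
        String ITEM_CONTENT = "content-key-placeholder";
        String ITEM_TITLE = "title-key-placeholder";
        String ITEM_AUTHOR = "author-key-placeholder";

        Map<String, String> getContentItems();
    }

    // Returns the item stored under the given key,
    // or the fallback if the item is missing or blank.
    static String itemOrDefault(ExtractionResult result, String key, String fallback) {
        Map<String, String> items = result.getContentItems();
        String value = (items == null) ? null : items.get(key);
        return (value == null || value.trim().isEmpty()) ? fallback : value;
    }

    public static void main(String[] args) {
        // Sample data for illustration only.
        Map<String, String> items = new LinkedHashMap<>();
        items.put(ExtractionResult.ITEM_CONTENT, "Sample document text");
        items.put(ExtractionResult.ITEM_TITLE, "Sample title");
        ExtractionResult result = () -> items;

        // The title is present, the author is not, so the second call uses the fallback.
        System.out.println(itemOrDefault(result, ExtractionResult.ITEM_TITLE, "(untitled)"));
        System.out.println(itemOrDefault(result, ExtractionResult.ITEM_AUTHOR, "(unknown author)"));
    }
}
```

With a real extraction result, the same lookup would use the I_CmsExtractionResult constants directly as keys.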

release

void release()
Releases the information stored in this extraction result, to free up the memory used.
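
One possible calling pattern is to copy out whatever is needed from the result and then call release() in a finally block, so the memory is freed even if processing fails. The sketch below is an illustration under stated assumptions: ExtractionResult is a stand-in for this interface, and InMemoryResult is a hypothetical implementation whose release() simply drops its reference. How an actual implementation behaves after release() is not specified here.

```java
public class ReleaseExample {

    // Stand-in for I_CmsExtractionResult, reduced to the two methods used here.
    interface ExtractionResult {
        String getContent();

        void release();
    }

    // Hypothetical in-memory implementation. Dropping the reference on release()
    // is an assumption about what "releasing" means, not documented behavior.
    static final class InMemoryResult implements ExtractionResult {
        private String content;

        InMemoryResult(String content) {
            this.content = content;
        }

        @Override
        public String getContent() {
            return content;
        }

        @Override
        public void release() {
            content = null;
        }
    }

    // Reads the value needed from the result, then releases the result
    // even if reading or processing throws.
    static int contentLength(ExtractionResult result) {
        try {
            String content = result.getContent();
            return (content == null) ? 0 : content.length();
        } finally {
            result.release();
        }
    }

    public static void main(String[] args) {
        InMemoryResult result = new InMemoryResult("extracted text");
        System.out.println(contentLength(result));
        // After contentLength returns, this stand-in no longer holds the text.
        System.out.println(result.getContent());
    }
}
```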