org.opencms.search
Class CmsVfsIndexer

java.lang.Object
  extended by org.opencms.search.CmsVfsIndexer
All Implemented Interfaces:
I_CmsIndexer

public class CmsVfsIndexer
extends java.lang.Object
implements I_CmsIndexer

An indexer that indexes CmsResource-based content from the OpenCms VFS.

Since:
6.0.0
Version:
$Revision: 1.45 $
Author:
Alexander Kandzior, Carsten Weinholz

Field Summary
protected  CmsObject m_cms
          The OpenCms user context to use when reading resources from the VFS during indexing.
protected  CmsSearchIndex m_index
          The search index to update.
protected  I_CmsReport m_report
          The report to write the indexing output to.
 
Constructor Summary
CmsVfsIndexer()
           
 
Method Summary
protected  void addResourceToUpdateData(CmsPublishedResource pubRes, CmsSearchIndexUpdateData updateData)
          Adds a given published resource to the provided search index update data.
protected  void deleteResource(org.apache.lucene.index.IndexWriter indexWriter, java.lang.String rootPath)
          Deletes a single resource from the given index.
 void deleteResources(org.apache.lucene.index.IndexWriter indexWriter, java.util.List<CmsPublishedResource> resourcesToDelete)
          Incremental index update - delete the index entry for all resources in the given list.
 CmsSearchIndexUpdateData getUpdateData(CmsSearchIndexSource source, java.util.List<CmsPublishedResource> publishedResources)
          Calculates the data for an incremental search index update.
protected  boolean isResourceInTimeWindow(CmsPublishedResource resource)
          Checks if the published resource is inside the time window set with release and expiration date.
 I_CmsIndexer newInstance(CmsObject cms, I_CmsReport report, CmsSearchIndex index)
          Creates and initializes a new instance of this indexer implementation.
 void rebuildIndex(org.apache.lucene.index.IndexWriter writer, CmsIndexingThreadManager threadManager, CmsSearchIndexSource source)
          Rebuilds the index for the given configured index source.
protected  void updateResource(org.apache.lucene.index.IndexWriter writer, CmsIndexingThreadManager threadManager, CmsResource resource)
          Updates (writes) a single resource in the index.
 void updateResources(org.apache.lucene.index.IndexWriter writer, CmsIndexingThreadManager threadManager, java.util.List<CmsPublishedResource> resourcesToUpdate)
          Incremental index update - create a new index entry for all resources in the given list.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

m_cms

protected CmsObject m_cms
The OpenCms user context to use when reading resources from the VFS during indexing.


m_index

protected CmsSearchIndex m_index
The search index to update.


m_report

protected I_CmsReport m_report
The report to write the indexing output to.

Constructor Detail

CmsVfsIndexer

public CmsVfsIndexer()

Method Detail

deleteResources

public void deleteResources(org.apache.lucene.index.IndexWriter indexWriter,
                            java.util.List<CmsPublishedResource> resourcesToDelete)
Description copied from interface: I_CmsIndexer
Incremental index update - delete the index entry for all resources in the given list.

Specified by:
deleteResources in interface I_CmsIndexer
Parameters:
indexWriter - the writer to the index to delete the entries from
resourcesToDelete - a list of CmsPublishedResource instances that must be deleted
See Also:
I_CmsIndexer.deleteResources(org.apache.lucene.index.IndexWriter, java.util.List)

getUpdateData

public CmsSearchIndexUpdateData getUpdateData(CmsSearchIndexSource source,
                                              java.util.List<CmsPublishedResource> publishedResources)
Description copied from interface: I_CmsIndexer
Calculates the data for an incremental search index update.

Specified by:
getUpdateData in interface I_CmsIndexer
Parameters:
source - the search index source to update
publishedResources - a list of CmsPublishedResource objects that are to be updated
Returns:
a container with the information about the resources to delete and/or update
See Also:
I_CmsIndexer.getUpdateData(org.opencms.search.CmsSearchIndexSource, java.util.List)

newInstance

public I_CmsIndexer newInstance(CmsObject cms,
                                I_CmsReport report,
                                CmsSearchIndex index)
Description copied from interface: I_CmsIndexer
Creates and initializes a new instance of this indexer implementation.

Specified by:
newInstance in interface I_CmsIndexer
Parameters:
cms - the OpenCms user context to use when reading resources from the VFS during indexing
report - the report to write the indexing output to
index - the search index to update
Returns:
a new instance of this indexer implementation
See Also:
I_CmsIndexer.newInstance(org.opencms.file.CmsObject, org.opencms.report.I_CmsReport, org.opencms.search.CmsSearchIndex)
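
The contract of newInstance can be illustrated with a minimal sketch: a configured indexer object acts as a prototype and hands out a fresh, initialized instance for each indexing run. The Indexer interface and the String-typed parameters below are hypothetical stand-ins for I_CmsIndexer, CmsObject, I_CmsReport and CmsSearchIndex, not the OpenCms types, and the body is only one plausible way to satisfy the contract.

```java
public class IndexerFactorySketch {

    /** Hypothetical, simplified stand-in for I_CmsIndexer. */
    public interface Indexer {
        Indexer newInstance(String cms, String report, String index);
    }

    /** Hypothetical stand-in for an indexer with the three documented fields. */
    public static class VfsIndexer implements Indexer {

        protected String m_cms;    // stand-in for the CmsObject user context
        protected String m_report; // stand-in for the I_CmsReport
        protected String m_index;  // stand-in for the CmsSearchIndex

        @Override
        public Indexer newInstance(String cms, String report, String index) {
            // Assumption: a fresh object is created and initialized,
            // and the prototype itself is left untouched.
            VfsIndexer indexer = new VfsIndexer();
            indexer.m_cms = cms;
            indexer.m_report = report;
            indexer.m_index = index;
            return indexer;
        }
    }

    public static void main(String[] args) {
        Indexer prototype = new VfsIndexer();
        Indexer worker = prototype.newInstance("cmsContext", "report", "index");
        System.out.println(worker != prototype);
    }
}
```

Keeping the prototype unmodified means one configured indexer object could serve several indexing runs without sharing state between them.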

rebuildIndex

public void rebuildIndex(org.apache.lucene.index.IndexWriter writer,
                         CmsIndexingThreadManager threadManager,
                         CmsSearchIndexSource source)
                  throws CmsIndexException
Description copied from interface: I_CmsIndexer
Rebuilds the index for the given configured index source.

This is used when the index is fully rebuilt, not when only some parts of an existing index are updated.

Specified by:
rebuildIndex in interface I_CmsIndexer
Parameters:
writer - the index writer to write the update to
threadManager - the thread manager to use when extracting the document text
source - the search index source to update
Throws:
CmsIndexException - if something goes wrong
See Also:
I_CmsIndexer.rebuildIndex(org.apache.lucene.index.IndexWriter, org.opencms.search.CmsIndexingThreadManager, org.opencms.search.CmsSearchIndexSource)

updateResources

public void updateResources(org.apache.lucene.index.IndexWriter writer,
                            CmsIndexingThreadManager threadManager,
                            java.util.List<CmsPublishedResource> resourcesToUpdate)
                     throws CmsIndexException
Description copied from interface: I_CmsIndexer
Incremental index update - create a new index entry for all resources in the given list.

Specified by:
updateResources in interface I_CmsIndexer
Parameters:
writer - the index writer to write the update to
threadManager - the thread manager to use when extracting the document text
resourcesToUpdate - a list of CmsPublishedResource instances that must be updated
Throws:
CmsIndexException - if something goes wrong
See Also:
I_CmsIndexer.updateResources(org.apache.lucene.index.IndexWriter, org.opencms.search.CmsIndexingThreadManager, java.util.List)

addResourceToUpdateData

protected void addResourceToUpdateData(CmsPublishedResource pubRes,
                                       CmsSearchIndexUpdateData updateData)
Adds a given published resource to the provided search index update data.

This method decides if the resource has to be included in the "update" or "delete" list.

Parameters:
pubRes - the published resource to add
updateData - the search index update data to add the resource to
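
The following sketch shows one way the "update" or "delete" decision could look. The State enum, the Published and UpdateData classes, and the rule itself (deleted resources are only deleted, changed resources are deleted and then re-added, new resources are only added) are assumptions made for illustration; they are not taken from the OpenCms source.

```java
import java.util.ArrayList;
import java.util.List;

public class UpdateDataSketch {

    /** Hypothetical publish states of a resource. */
    public enum State { NEW, CHANGED, DELETED }

    /** Hypothetical stand-in for CmsPublishedResource. */
    public static final class Published {
        public final String rootPath;
        public final State state;

        public Published(String rootPath, State state) {
            this.rootPath = rootPath;
            this.state = state;
        }
    }

    /** Hypothetical stand-in for CmsSearchIndexUpdateData. */
    public static final class UpdateData {
        public final List<String> toDelete = new ArrayList<>();
        public final List<String> toUpdate = new ArrayList<>();
    }

    /** Sorts a published resource into the delete and/or update list. */
    public static void addResourceToUpdateData(Published pubRes, UpdateData data) {
        switch (pubRes.state) {
            case DELETED:
                // Assumption: a deleted resource only loses its index entry.
                data.toDelete.add(pubRes.rootPath);
                break;
            case CHANGED:
                // Assumption: the stale entry is removed before the new one is written.
                data.toDelete.add(pubRes.rootPath);
                data.toUpdate.add(pubRes.rootPath);
                break;
            case NEW:
                // Assumption: a new resource has no existing entry to remove.
                data.toUpdate.add(pubRes.rootPath);
                break;
        }
    }

    public static void main(String[] args) {
        UpdateData data = new UpdateData();
        addResourceToUpdateData(new Published("/sites/default/a.html", State.NEW), data);
        addResourceToUpdateData(new Published("/sites/default/b.html", State.CHANGED), data);
        addResourceToUpdateData(new Published("/sites/default/c.html", State.DELETED), data);
        System.out.println("delete: " + data.toDelete);
        System.out.println("update: " + data.toUpdate);
    }
}
```

Under these assumptions, the two lists would then correspond to the arguments of deleteResources and updateResources.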

deleteResource

protected void deleteResource(org.apache.lucene.index.IndexWriter indexWriter,
                              java.lang.String rootPath)
Deletes a single resource from the given index.

Parameters:
indexWriter - the index to delete the resource from
rootPath - the root path of the resource to delete

isResourceInTimeWindow

protected boolean isResourceInTimeWindow(CmsPublishedResource resource)
Checks if the published resource is inside the time window set with release and expiration date.

Parameters:
resource - the published resource to check
Returns:
true if the published resource is inside the time window, otherwise false
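
A time window check of this kind can be sketched as below. PublishedStub is a hypothetical stand-in for the release and expiration dates of a published resource, and treating both boundaries as inclusive is an assumption, not the documented OpenCms behavior.

```java
public class TimeWindowSketch {

    /** Hypothetical stand-in holding release and expiration dates in milliseconds. */
    public static final class PublishedStub {
        public final long dateReleased;
        public final long dateExpired;

        public PublishedStub(long dateReleased, long dateExpired) {
            this.dateReleased = dateReleased;
            this.dateExpired = dateExpired;
        }
    }

    /**
     * Returns true if the given time lies between the release and expiration date.
     * Assumption: both boundaries are inclusive.
     */
    public static boolean isInTimeWindow(PublishedStub resource, long time) {
        return resource.dateReleased <= time && time <= resource.dateExpired;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        // Released in the past and never expiring.
        System.out.println(isInTimeWindow(new PublishedStub(0L, Long.MAX_VALUE), now));
        // Release date one hour in the future.
        System.out.println(isInTimeWindow(new PublishedStub(now + 3_600_000L, Long.MAX_VALUE), now));
    }
}
```

Passing the reference time as a parameter, rather than reading the clock inside the method, keeps the check easy to test.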

updateResource

protected void updateResource(org.apache.lucene.index.IndexWriter writer,
                              CmsIndexingThreadManager threadManager,
                              CmsResource resource)
                       throws CmsIndexException
Updates (writes) a single resource in the index.

Parameters:
writer - the index writer to use
threadManager - the thread manager to use when extracting the document text
resource - the resource to update
Throws:
CmsIndexException - if something goes wrong