tml.corpus
Class TextPassage

java.lang.Object
  extended by tml.corpus.TextPassage

public class TextPassage
extends java.lang.Object

This class represents a text passage that is part of a Corpus. It can be a sentence, a paragraph, a complete document, or any other piece of text of any length.

Author:
Jorge Villalon

Nested Class Summary
 class TextPassage.TextPassageStats
          This class represents the statistics for a TextPassage.
 
Constructor Summary
TextPassage(int id, Corpus corpus, java.lang.String content, java.lang.String title, java.lang.String url, java.lang.String type, java.lang.String externalId)
          Creates a new instance of a TextPassage.
 
Method Summary
 void addTerm(Term term, int frequency)
          Adds a Term to the passage. This updates the running statistics but does not calculate the final values.
 java.util.Hashtable<java.lang.String,java.lang.String> getAnnotations()
           
 java.lang.String getContent()
           
 Corpus getCorpus()
           
 java.lang.String getExternalId()
           
 int getId()
           
 Stats getStats()
           
 double[] getTermFreqs()
           
 java.util.Collection<Term> getTerms()
           
 int[] getTermsCorpusIndices()
           
 java.lang.String getTitle()
           
 java.lang.String getType()
           
 java.lang.String getUrl()
           
 boolean isEmpty()
           
 void removeTerm(Term term)
          Removes a Term from the passage
 java.lang.String toString()
          Basic output of a text passage
 void updateTermIndex(Term term, int oldIndex, int newIndex)
          Updates the index of a Term in the passage
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

TextPassage

public TextPassage(int id,
                   Corpus corpus,
                   java.lang.String content,
                   java.lang.String title,
                   java.lang.String url,
                   java.lang.String type,
                   java.lang.String externalId)
Creates a new instance of a TextPassage.

Parameters:
id - the id of the passage
corpus - the Corpus to which the passage belongs
content - the content of the passage
title - the title for the passage
url - the url for the passage
type - the type of the passage (document, paragraph or sentence)
externalId - Lucene id of the passage
Method Detail

getExternalId

public java.lang.String getExternalId()
Returns:
the externalId

getAnnotations

public java.util.Hashtable<java.lang.String,java.lang.String> getAnnotations()
Returns:
the annotations

addTerm

public void addTerm(Term term,
                    int frequency)
Adds a Term to the passage. This updates the running statistics but does not calculate the final values.

Parameters:
term - the Term to add
frequency - the frequency of the Term in the passage
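
The "accumulate now, finalize later" behaviour described for addTerm can be illustrated with a small stand-in. The sketch below does not use any TML classes: PassageSketch, its fields, and finalizeStats are hypothetical names, terms are keyed by plain strings for simplicity, and the mean frequency is only an example of a derived value. Which final statistics TML computes, and when, is not stated in this page.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a passage; NOT the TML implementation.
class PassageSketch {
    // Raw term frequencies keyed by term text (assumption: TML keys by Term objects).
    private final Map<String, Integer> freqs = new LinkedHashMap<>();
    // Running total, updated on every add or remove.
    private int totalFrequency = 0;
    // Example derived value; stays NaN until finalizeStats() is called.
    private double meanFrequency = Double.NaN;

    // Adds to the running counts only; derived values are left untouched.
    void addTerm(String term, int frequency) {
        freqs.merge(term, frequency, Integer::sum);
        totalFrequency += frequency;
    }

    // Removes a term and its contribution to the running total.
    void removeTerm(String term) {
        Integer removed = freqs.remove(term);
        if (removed != null) {
            totalFrequency -= removed;
        }
    }

    boolean isEmpty() {
        return freqs.isEmpty();
    }

    // Computes derived values once all terms have been added.
    void finalizeStats() {
        meanFrequency = freqs.isEmpty() ? 0.0 : (double) totalFrequency / freqs.size();
    }

    int getTotalFrequency() {
        return totalFrequency;
    }

    double getMeanFrequency() {
        return meanFrequency;
    }

    public static void main(String[] args) {
        PassageSketch passage = new PassageSketch();
        passage.addTerm("corpus", 3);
        passage.addTerm("passage", 1);
        System.out.println("total = " + passage.getTotalFrequency());
        passage.finalizeStats();
        System.out.println("mean = " + passage.getMeanFrequency());
    }
}
```

Keeping the derived value separate from the running counts means many terms can be added cheaply before a single finalization pass.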

getContent

public java.lang.String getContent()
Returns:
the content of the passage

getCorpus

public Corpus getCorpus()
Returns:
the Corpus to which the passage belongs

getId

public int getId()
Returns:
the id of the passage

getStats

public Stats getStats()
Returns:
basic statistics for the passage

getTermFreqs

public double[] getTermFreqs()
Returns:
a packed array with the frequencies of the passage terms

getTerms

public java.util.Collection<Term> getTerms()
Returns:
all the Terms in the passage

getTermsCorpusIndices

public int[] getTermsCorpusIndices()
Returns:
an array with the corpus indices of the terms in the passage
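
One way to read getTermFreqs and getTermsCorpusIndices together is as a sparse vector: one array holds term indices and the other holds the matching frequencies. The sketch below expands such a pair into a dense vector. It assumes, without confirmation from this page, that the two arrays are parallel (position i in both refers to the same term) and that the indices are positions in the corpus vocabulary. PackedVectorSketch, toDense, and vocabularySize are hypothetical names, not part of TML.

```java
import java.util.Arrays;

// Hypothetical helper; NOT part of TML.
class PackedVectorSketch {
    // Expands parallel (index, frequency) arrays into a dense vector.
    // Assumption: corpusIndices[i] and termFreqs[i] describe the same term.
    static double[] toDense(int[] corpusIndices, double[] termFreqs, int vocabularySize) {
        if (corpusIndices.length != termFreqs.length) {
            throw new IllegalArgumentException("arrays must have the same length");
        }
        double[] dense = new double[vocabularySize];
        for (int i = 0; i < corpusIndices.length; i++) {
            dense[corpusIndices[i]] = termFreqs[i];
        }
        return dense;
    }

    public static void main(String[] args) {
        int[] indices = {4, 0, 2};      // stand-in for getTermsCorpusIndices()
        double[] freqs = {2.0, 1.0, 3.0}; // stand-in for getTermFreqs()
        System.out.println(Arrays.toString(toDense(indices, freqs, 6)));
        // prints [1.0, 0.0, 3.0, 0.0, 2.0, 0.0]
    }
}
```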

getTitle

public java.lang.String getTitle()
Returns:
the title of the passage

getType

public java.lang.String getType()
Returns:
the type of the passage (document, paragraph or sentence)

getUrl

public java.lang.String getUrl()
Returns:
the url of the passage

isEmpty

public boolean isEmpty()
Returns:
true if the TextPassage contains no Terms

removeTerm

public void removeTerm(Term term)
Removes a Term from the passage

Parameters:
term - the Term to remove

toString

public java.lang.String toString()
Basic output of a text passage

Overrides:
toString in class java.lang.Object

updateTermIndex

public void updateTermIndex(Term term,
                            int oldIndex,
                            int newIndex)
Updates the index of a Term in the passage

Parameters:
term - the Term whose index will be updated
oldIndex - the old index
newIndex - the new index
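
As a rough illustration of what an index update of this kind involves, the sketch below rewrites one entry in an array of term indices. It is a simplified stand-in under stated assumptions: the real method also takes the Term itself, and how TML stores term indices internally is not described in this page. IndexUpdateSketch and updateIndex are hypothetical names.

```java
import java.util.Arrays;

// Hypothetical helper; NOT the TML implementation.
class IndexUpdateSketch {
    // Replaces the first occurrence of oldIndex with newIndex.
    // Returns true if an entry was updated, false if oldIndex was not present.
    static boolean updateIndex(int[] termIndices, int oldIndex, int newIndex) {
        for (int i = 0; i < termIndices.length; i++) {
            if (termIndices[i] == oldIndex) {
                termIndices[i] = newIndex;
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        int[] indices = {4, 0, 2};
        updateIndex(indices, 4, 3);
        System.out.println(Arrays.toString(indices));
        // prints [3, 0, 2]
    }
}
```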