java.lang.Object
  tml.corpus.Dictionary
public class Dictionary
This class represents a group of Terms (words or symbols), usually
obtained from a set of documents or text passages. It is the common set of
words for a group of documents.

The Dictionary can filter Terms based on a selection criterion and its
threshold. By default, the TermSelection criterion is a minimum DF
(Document Frequency): a Term must appear in at least the number of
different TextPassages given by the threshold.

A Dictionary also maintains the list of Terms inside each
TextPassage. When the TermSelection criterion is applied, the
Dictionary also removes the discarded Terms from the
TextPassages that contain them.
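
The minimum-DF selection described above can be illustrated with a small, self-contained sketch. It is not the tml.corpus.Dictionary implementation: the class DfFilterSketch, its string passage ids, and the explicit minDf argument are all assumptions made for illustration. It only shows the idea of counting the distinct passages that contain each term, then dropping low-DF terms from both the dictionary and the passages that held them.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for illustration only; not the tml.corpus.Dictionary code.
class DfFilterSketch {
    // term text -> ids of the passages containing it (set size = document frequency)
    private final Map<String, Set<String>> passagesByTerm = new HashMap<>();
    // passage id -> terms the passage currently holds
    private final Map<String, Set<String>> termsByPassage = new HashMap<>();

    void addTerms(String passageId, String... terms) {
        Set<String> held = termsByPassage.computeIfAbsent(passageId, k -> new HashSet<>());
        for (String term : terms) {
            held.add(term);
            passagesByTerm.computeIfAbsent(term, k -> new HashSet<>()).add(passageId);
        }
    }

    int documentFrequency(String term) {
        Set<String> passages = passagesByTerm.get(term);
        return passages == null ? 0 : passages.size();
    }

    // Drops every term whose DF is below minDf, both from the dictionary
    // and from the passages that contained it.
    void removeTerms(int minDf) {
        Iterator<Map.Entry<String, Set<String>>> it = passagesByTerm.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Set<String>> entry = it.next();
            if (entry.getValue().size() < minDf) {
                for (String passageId : entry.getValue()) {
                    termsByPassage.get(passageId).remove(entry.getKey());
                }
                it.remove();
            }
        }
    }

    Set<String> terms() {
        return passagesByTerm.keySet();
    }

    Set<String> termsOf(String passageId) {
        Set<String> held = termsByPassage.get(passageId);
        return held == null ? Collections.<String>emptySet() : held;
    }

    public static void main(String[] args) {
        DfFilterSketch dictionary = new DfFilterSketch();
        dictionary.addTerms("p1", "cat", "dog");
        dictionary.addTerms("p2", "cat", "fish");
        dictionary.removeTerms(2);                    // keep terms found in at least 2 passages
        System.out.println(dictionary.terms());       // [cat]
        System.out.println(dictionary.termsOf("p1")); // [cat]
    }
}
```

In the real class the threshold is presumably configured elsewhere, since removeTerms() takes no arguments; the sketch passes it explicitly to stay self-contained.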
| Constructor Summary | |
|---|---|
| Dictionary(Corpus corpus) | Basic constructor of a Dictionary; initialises the list and index of Terms. |
| Method Summary | |
|---|---|
| void | addTerms(java.lang.String[] newTerms, int[] termFreqs, TextPassage document): Adds an array of Terms and their frequencies to the Dictionary. |
| Corpus | getCorpus(): Gets the Corpus to which the Dictionary belongs. |
| Term | getTermByText(java.lang.String word): Returns the Term that represents a word, or null if it is not in the Dictionary. |
| java.util.Collection<Term> | getTerms(): Returns the collection of Terms in the Dictionary. |
| void | removeTerms(): Removes from the Dictionary the Terms that do not meet the TermSelection criterion according to the threshold. |
| Methods inherited from class java.lang.Object |
|---|
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Constructor Detail |
|---|
public Dictionary(Corpus corpus)
Basic constructor of a Dictionary; initialises the list and index of Terms.
Parameters:
corpus - the Corpus to which the Dictionary belongs
| Method Detail |
|---|
public void addTerms(java.lang.String[] newTerms, int[] termFreqs, TextPassage document)
Adds an array of Terms and their frequencies to the Dictionary. Both must come from a specific TextPassage.
Parameters:
newTerms - the Terms to add
termFreqs - the frequencies of those Terms
document - the TextPassage the Terms and frequencies come from

public void removeTerms()
Removes from the Dictionary the Terms that do not meet the TermSelection criterion according to the threshold.

public java.util.Collection<Term> getTerms()
Returns the collection of Terms in the Dictionary.
Returns:
Collection of Terms

public Corpus getCorpus()
Gets the Corpus to which the Dictionary belongs.
Returns:
Corpus

public Term getTermByText(java.lang.String word)
Returns the Term that represents a word, or null if it is not in the Dictionary.
Parameters:
word - the word to look for
Returns:
Term
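
Since addTerms takes the terms and their frequencies as two separate arrays, a caller has to build both from the same passage. The sketch below shows one way to do that from a token list. It assumes the arrays are index-aligned, so that termFreqs[i] is the frequency of newTerms[i]; the documentation above does not state this explicitly. The class TermArraysSketch and its fromTokens helper are hypothetical and not part of the tml API.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of the tml library.
class TermArraysSketch {
    final String[] terms;
    final int[] freqs;

    private TermArraysSketch(String[] terms, int[] freqs) {
        this.terms = terms;
        this.freqs = freqs;
    }

    // Counts each distinct token and returns two arrays where
    // freqs[i] is the number of occurrences of terms[i] (assumed alignment).
    static TermArraysSketch fromTokens(List<String> tokens) {
        // LinkedHashMap keeps first-seen order, so the output is deterministic.
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String token : tokens) {
            counts.merge(token, 1, Integer::sum);
        }
        String[] terms = new String[counts.size()];
        int[] freqs = new int[counts.size()];
        int i = 0;
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            terms[i] = entry.getKey();
            freqs[i] = entry.getValue();
            i++;
        }
        return new TermArraysSketch(terms, freqs);
    }

    public static void main(String[] args) {
        TermArraysSketch arrays = fromTokens(Arrays.asList("to", "be", "or", "not", "to", "be"));
        System.out.println(Arrays.toString(arrays.terms)); // [to, be, or, not]
        System.out.println(Arrays.toString(arrays.freqs)); // [2, 2, 1, 1]
    }
}
```

The resulting arrays would then be passed as newTerms and termFreqs together with the TextPassage they came from. Because getTermByText returns null for unknown words, callers should check its result before using it.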