java.lang.Object
  tml.corpus.Dictionary
public class Dictionary
This class represents a group of Terms (words or symbols), usually obtained from a set of documents or text passages. It is the common set of words for a group of documents.
The dictionary can filter Terms based on a selection criterion and its specific threshold. By default, the TermSelection criterion is a minimum DF (Document Frequency), i.e. a Term must appear in at least a certain number of different TextPassages, as indicated by the threshold.
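The default minimum-DF criterion can be illustrated with a small, self-contained sketch. It does not use the tml classes: passages are modelled as plain lists of strings, and the class and method names (`MinDfSketch`, `documentFrequency`, `select`) are hypothetical, chosen only for this example.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration of a minimum document-frequency filter.
// This is NOT the tml implementation; passages are plain lists of words.
class MinDfSketch {

    // Counts, for each term, the number of passages that contain it at least once.
    static Map<String, Integer> documentFrequency(List<List<String>> passages) {
        Map<String, Integer> df = new HashMap<>();
        for (List<String> passage : passages) {
            // A set makes repeated occurrences inside one passage count once.
            for (String term : new HashSet<>(passage)) {
                df.merge(term, 1, Integer::sum);
            }
        }
        return df;
    }

    // Keeps the terms whose document frequency is at least the threshold.
    static Set<String> select(Map<String, Integer> df, int threshold) {
        Set<String> kept = new TreeSet<>();
        for (Map.Entry<String, Integer> e : df.entrySet()) {
            if (e.getValue() >= threshold) {
                kept.add(e.getKey());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<List<String>> passages = Arrays.asList(
                Arrays.asList("cat", "sat", "cat"),
                Arrays.asList("cat", "ran"),
                Arrays.asList("dog", "ran"));
        Map<String, Integer> df = documentFrequency(passages);
        System.out.println(select(df, 2)); // prints [cat, ran]
    }
}
```

Here "cat" occurs twice in the first passage but still counts once towards its document frequency, which is what distinguishes DF from a raw term count.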
A Dictionary also maintains the list of Terms inside each TextPassage. When the TermSelection criterion is applied, the Dictionary removes the unused Terms from the TextPassages that contain them.
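The pruning step described above, where discarded Terms are also removed from the passages that contain them, might look roughly like the following sketch. Each passage is modelled as a mutable term-to-frequency map, which is an assumption made for illustration; `PruneSketch` and `prune` are hypothetical names, not part of tml.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical illustration: dropping unselected terms from every passage.
// Passages are modelled as term -> frequency maps (an assumption).
class PruneSketch {

    // Removes from every passage the terms that did not survive selection.
    static void prune(List<Map<String, Integer>> passages, Set<String> keptTerms) {
        for (Map<String, Integer> passage : passages) {
            // The key set is a live view, so this removes the map entries too.
            passage.keySet().retainAll(keptTerms);
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> first = new TreeMap<>();
        first.put("cat", 2);
        first.put("sat", 1);
        Map<String, Integer> second = new TreeMap<>();
        second.put("cat", 1);
        second.put("ran", 1);

        List<Map<String, Integer>> passages = Arrays.asList(first, second);
        prune(passages, new HashSet<>(Arrays.asList("cat")));
        System.out.println(passages); // prints [{cat=2}, {cat=1}]
    }
}
```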
Constructor Summary | |
---|---|
Dictionary(Corpus corpus) Basic constructor of a Dictionary; initialises the list and index of Terms. |
Method Summary | |
---|---|
void | addTerms(java.lang.String[] newTerms, int[] termFreqs, TextPassage document) Adds an array of Terms and their frequencies to the Dictionary. |
Corpus | getCorpus() Gets the Corpus to which the Dictionary belongs. |
Term | getTermByText(java.lang.String word) Returns the Term that represents a word, or null if the word is not in the Dictionary. |
java.util.Collection<Term> | getTerms() Returns the collection of Terms in the Dictionary. |
void | removeTerms() Removes from the Dictionary the Terms that do not meet the TermSelection criterion according to the threshold. |
Methods inherited from class java.lang.Object |
---|
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public Dictionary(Corpus corpus)
Basic constructor of a Dictionary; initialises the list and index of Terms.
Parameters:
corpus
Method Detail |
---|
public void addTerms(java.lang.String[] newTerms, int[] termFreqs, TextPassage document)
Adds an array of Terms and their frequencies to the Dictionary. Both arrays must come from a specific TextPassage.
Parameters:
newTerms
termFreqs
document
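The parallel-array shape of addTerms (one frequency per term, both tied to one passage) can be sketched as follows. This is only an illustration under stated assumptions: the passage is identified by a plain string instead of a TextPassage, the length check and the bookkeeping maps are guesses at reasonable behaviour, and `AddTermsSketch`, `documentFrequency` and `frequency` are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of registering parallel term/frequency arrays.
// NOT the tml implementation; a passage is identified by a String id here.
class AddTermsSketch {
    // term -> number of distinct passages containing it
    private final Map<String, Integer> documentFrequency = new HashMap<>();
    // passage id -> (term -> frequency inside that passage)
    private final Map<String, Map<String, Integer>> passageTerms = new HashMap<>();

    void addTerms(String[] newTerms, int[] termFreqs, String passageId) {
        // Assumption: the arrays are parallel, so their lengths must match.
        if (newTerms.length != termFreqs.length) {
            throw new IllegalArgumentException("newTerms and termFreqs differ in length");
        }
        Map<String, Integer> counts =
                passageTerms.computeIfAbsent(passageId, k -> new HashMap<>());
        for (int i = 0; i < newTerms.length; i++) {
            // A term raises the document frequency only the first time
            // it is seen in this passage.
            if (!counts.containsKey(newTerms[i])) {
                documentFrequency.merge(newTerms[i], 1, Integer::sum);
            }
            counts.merge(newTerms[i], termFreqs[i], Integer::sum);
        }
    }

    int documentFrequency(String term) {
        return documentFrequency.getOrDefault(term, 0);
    }

    int frequency(String passageId, String term) {
        Map<String, Integer> counts = passageTerms.get(passageId);
        return counts == null ? 0 : counts.getOrDefault(term, 0);
    }

    public static void main(String[] args) {
        AddTermsSketch d = new AddTermsSketch();
        d.addTerms(new String[] {"cat", "sat"}, new int[] {2, 1}, "p1");
        d.addTerms(new String[] {"cat"}, new int[] {1}, "p2");
        System.out.println(d.documentFrequency("cat")); // prints 2
        System.out.println(d.frequency("p1", "cat"));   // prints 2
    }
}
```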
public void removeTerms()
Removes from the Dictionary the Terms that do not meet the TermSelection criterion according to the threshold.
public java.util.Collection<Term> getTerms()
Returns the collection of Terms in the Dictionary.
Returns:
Collection of Terms
public Corpus getCorpus()
Gets the Corpus to which the Dictionary belongs.
Returns:
Corpus
public Term getTermByText(java.lang.String word)
Returns the Term that represents a word, or null if the word is not in the Dictionary.
Parameters:
word - the word to look for
Returns:
Term
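Because getTermByText returns null for an unknown word, callers need a null check before using the result. The sketch below shows that contract with a hypothetical stand-in: `TermLookupSketch`, its nested `Term` class and the `add` method are assumptions for illustration and do not reproduce the tml classes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of a term list plus a text index.
// NOT the tml implementation.
class TermLookupSketch {

    // Stand-in for Term: it only carries its text.
    static final class Term {
        final String text;

        Term(String text) {
            this.text = text;
        }
    }

    private final List<Term> terms = new ArrayList<>();
    private final Map<String, Term> index = new HashMap<>();

    // Registers a word once; repeated words reuse the existing Term.
    void add(String word) {
        if (!index.containsKey(word)) {
            Term term = new Term(word);
            terms.add(term);
            index.put(word, term);
        }
    }

    // Mirrors the documented contract: the Term for the word, or null if absent.
    Term getTermByText(String word) {
        return index.get(word);
    }

    public static void main(String[] args) {
        TermLookupSketch dictionary = new TermLookupSketch();
        dictionary.add("corpus");

        Term hit = dictionary.getTermByText("corpus");
        Term miss = dictionary.getTermByText("unknown");
        System.out.println(hit != null ? hit.text : "not found");   // prints corpus
        System.out.println(miss != null ? miss.text : "not found"); // prints not found
    }
}
```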