Class SemanticSpace

java.lang.Object
  extended by SemanticSpace

public class SemanticSpace
extends java.lang.Object

This class implements the integration of TML with Matlab.

Author:
Jorge Villalon

Constructor Summary
SemanticSpace(java.lang.String pathToRepository, java.lang.String pathToMatlab)
          Creates a new instance of a SemanticSpace. The space is built from all text documents found in the given folder.
 
Method Summary
 java.lang.String[] getDocuments()
          Returns all the documents in the semantic space
 double[][] getTermDocMatrix()
          Returns the matrix that represents the semantic space
 java.lang.String[] getTerms()
          Returns all terms in the semantic space
 void load()
          Loads the semantic space
 void setDimensionalityReductionCriteria(int reduction, double threshold)
          Sets the criteria to select how many dimensions will be kept after SVD
 void setTermSelectionCriteria(int selection, double threshold)
          Sets the criteria to select which terms will be included in the LSA space
 void setTermWeightingCriteria(int local, int global)
          Sets the term weighting scheme that will be used to calculate the LSA space
 java.lang.String stemWords(java.lang.String phrase)
          Stems the words of a phrase with Lucene, making the result available in Matlab
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

SemanticSpace

public SemanticSpace(java.lang.String pathToRepository,
                     java.lang.String pathToMatlab)
              throws java.lang.Exception
Creates a new instance of a SemanticSpace. The space is built from all text documents found in the given folder.

Parameters:
pathToRepository - folder containing the documents to be processed
pathToMatlab - path to the user's Matlab folder
Throws:
java.lang.Exception
Method Detail

stemWords

public java.lang.String stemWords(java.lang.String phrase)
Stems the words of a phrase with Lucene, making the result available in Matlab

Parameters:
phrase - the phrase whose words will be stemmed
Returns:
the stemmed phrase

load

public void load()
          throws java.lang.Exception
Loads the semantic space

Throws:
java.lang.Exception

setTermSelectionCriteria

public void setTermSelectionCriteria(int selection,
                                     double threshold)
Sets the criteria to select which terms will be included in the LSA space

Parameters:
selection - the selection criterion
threshold - the threshold above which the criterion is satisfied
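
This page does not list the integer codes accepted for selection, so the following is only a minimal sketch of one common criterion: keeping terms that occur in at least a threshold number of documents. The class TermSelectionSketch, its method, the row-per-term layout, and the inclusive comparison are all assumptions made for illustration; none of them is part of the TML API.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: a hypothetical helper, not part of the TML API.
class TermSelectionSketch {

    // Returns the indices of the terms (rows) that occur in at least
    // `threshold` documents (columns) of a raw term-count matrix.
    // Assumption: the comparison is inclusive.
    static List<Integer> selectByDocumentFrequency(double[][] counts, double threshold) {
        List<Integer> kept = new ArrayList<>();
        for (int t = 0; t < counts.length; t++) {
            int docFreq = 0;
            for (double c : counts[t]) {
                if (c > 0) {
                    docFreq++;
                }
            }
            if (docFreq >= threshold) {
                kept.add(t);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        double[][] counts = {
            {1, 0, 2},  // term 0 appears in 2 documents
            {0, 0, 1},  // term 1 appears in 1 document
            {3, 1, 1}   // term 2 appears in 3 documents
        };
        System.out.println(selectByDocumentFrequency(counts, 2)); // prints [0, 2]
    }
}
```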

setDimensionalityReductionCriteria

public void setDimensionalityReductionCriteria(int reduction,
                                               double threshold)
Sets the criteria to select how many dimensions will be kept after SVD

Parameters:
reduction - the reduction criterion
threshold - the threshold above which the criterion is satisfied
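
The accepted values for reduction are not documented on this page. As a hedged illustration of how a threshold can decide the number of dimensions, the sketch below keeps the smallest number of singular values whose squared sum reaches a given fraction of the total. DimensionSelectionSketch and dimensionsToKeep are hypothetical names, and this particular rule is an assumption, not necessarily what TML does.

```java
// Illustrative only: a hypothetical helper, not part of the TML API.
class DimensionSelectionSketch {

    // Given singular values sorted in descending order, returns the smallest k
    // such that the first k squared singular values account for at least
    // `threshold` (a fraction in (0, 1]) of the total squared sum.
    static int dimensionsToKeep(double[] singularValues, double threshold) {
        double total = 0.0;
        for (double s : singularValues) {
            total += s * s;
        }
        if (total == 0.0) {
            return 0; // nothing to keep in an all-zero spectrum
        }
        double running = 0.0;
        for (int k = 0; k < singularValues.length; k++) {
            running += singularValues[k] * singularValues[k];
            if (running / total >= threshold) {
                return k + 1;
            }
        }
        return singularValues.length;
    }

    public static void main(String[] args) {
        double[] s = {4.0, 2.0, 1.0, 1.0};
        // Squared values are 16, 4, 1, 1 (total 22); 20/22 is the first share >= 0.9.
        System.out.println(dimensionsToKeep(s, 0.9)); // prints 2
    }
}
```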

setTermWeightingCriteria

public void setTermWeightingCriteria(int local,
                                     int global)
Sets the term weighting scheme that will be used to calculate the LSA space

Parameters:
local - local weight criterion
global - global weight criterion
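
To illustrate what a local and a global weight mean in an LSA setting, here is a minimal sketch that combines a logarithmic local weight with an inverse-document-frequency global weight. Which schemes the integer codes of local and global actually select is not stated on this page; the class TermWeightingSketch, the formula log(1 + tf) * log(N / df), and the row-per-term layout are assumptions chosen for illustration.

```java
// Illustrative only: a hypothetical helper, not part of the TML API.
class TermWeightingSketch {

    // Applies weight = log(1 + tf) * log(N / df) to a term-by-document count
    // matrix (assumed layout: rows are terms, columns are documents).
    // The local factor depends on one cell; the global factor on the whole row.
    static double[][] logTfIdf(double[][] counts) {
        int terms = counts.length;
        int docs = terms == 0 ? 0 : counts[0].length;
        double[][] weighted = new double[terms][docs];
        for (int t = 0; t < terms; t++) {
            int df = 0;
            for (int d = 0; d < docs; d++) {
                if (counts[t][d] > 0) {
                    df++;
                }
            }
            // A term that never occurs gets a global weight of zero.
            double global = df == 0 ? 0.0 : Math.log((double) docs / df);
            for (int d = 0; d < docs; d++) {
                double local = Math.log(1.0 + counts[t][d]);
                weighted[t][d] = local * global;
            }
        }
        return weighted;
    }

    public static void main(String[] args) {
        double[][] counts = {
            {1, 0},  // occurs in one of two documents
            {2, 2}   // occurs in every document, so its global weight is log(1) = 0
        };
        double[][] w = logTfIdf(counts);
        System.out.println(w[0][0] + " " + w[1][0]);
    }
}
```

A term that appears in every document ends up with zero weight under this scheme, which is why global weighting tends to suppress very common words.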

getTermDocMatrix

public double[][] getTermDocMatrix()
                            throws java.lang.Exception
Returns the matrix that represents the semantic space

Returns:
a matrix of doubles
Throws:
java.lang.Exception

getTerms

public java.lang.String[] getTerms()
                            throws java.lang.Exception
Returns:
all terms in the semantic space
Throws:
java.lang.Exception

getDocuments

public java.lang.String[] getDocuments()
                                throws java.lang.Exception
Returns:
all the documents in the semantic space
Throws:
java.lang.Exception
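
The three getters are naturally used together: the arrays from getTerms() and getDocuments() label the axes of the matrix from getTermDocMatrix(). The sketch below looks up one cell by label using plain arrays in place of a live SemanticSpace. It assumes that rows correspond to terms and columns to documents, in the same order as the two label arrays; this page does not confirm that orientation. MatrixLookupSketch and its methods are hypothetical names.

```java
// Illustrative only: a hypothetical helper, not part of the TML API.
class MatrixLookupSketch {

    // Returns the matrix entry for the given term and document labels,
    // or NaN if either label is unknown.
    // Assumption: matrix[i][j] belongs to terms[i] and documents[j].
    static double valueFor(String[] terms, String[] documents, double[][] matrix,
                           String term, String document) {
        int row = indexOf(terms, term);
        int col = indexOf(documents, document);
        if (row < 0 || col < 0) {
            return Double.NaN;
        }
        return matrix[row][col];
    }

    // Linear search for a label; returns -1 when it is absent.
    static int indexOf(String[] labels, String wanted) {
        for (int i = 0; i < labels.length; i++) {
            if (labels[i].equals(wanted)) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Stand-ins for the results of getTerms(), getDocuments() and getTermDocMatrix().
        String[] terms = {"cat", "dog"};
        String[] documents = {"a.txt", "b.txt"};
        double[][] matrix = {{1, 0}, {2, 3}};
        System.out.println(valueFor(terms, documents, matrix, "dog", "b.txt")); // prints 3.0
    }
}
```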