pkg.gl

Gets the dictionary definitions for a slice of strings. Parameters: terms, the Chinese (simplified or traditional) text of the words. Returns: hws, an array of word senses.

NewDocumentFrequency

Initializes a DocumentFrequency struct.

ReadDocumentFrequency

ReadDocumentFrequency reads a document frequency object from a CSV file.

SortByWeight

Orders the keywords in a document by tf-idf weight, given their frequencies. Parameters: vocab, the word frequencies for a particular document.
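
As an illustration of ordering by tf-idf weight, here is a minimal sketch. It assumes the common weighting tf × log10(N / df); the package may use a different variant. The names `weightedTerm` and `sortByTfIdf`, the choice to skip terms with no document frequency, and the sample data are all hypothetical.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// weightedTerm is a hypothetical type pairing a term with its tf-idf weight.
type weightedTerm struct {
	Term   string
	Weight float64
}

// sortByTfIdf is a hypothetical helper, not the package's SortByWeight.
// vocab maps term -> count in one document, docFreq maps term -> number of
// documents containing the term, and nDocs is the corpus size.
// ASSUMPTION: weight = tf * log10(nDocs / df).
func sortByTfIdf(vocab, docFreq map[string]int, nDocs int) []weightedTerm {
	terms := make([]weightedTerm, 0, len(vocab))
	for term, tf := range vocab {
		df := docFreq[term]
		if df == 0 || nDocs == 0 {
			continue // no document frequency available: skip the term
		}
		idf := math.Log10(float64(nDocs) / float64(df))
		terms = append(terms, weightedTerm{Term: term, Weight: float64(tf) * idf})
	}
	// Highest weight first; break ties by term so the order is deterministic.
	sort.Slice(terms, func(i, j int) bool {
		if terms[i].Weight != terms[j].Weight {
			return terms[i].Weight > terms[j].Weight
		}
		return terms[i].Term < terms[j].Term
	})
	return terms
}

func main() {
	// Sample counts are invented for illustration.
	vocab := map[string]int{"佛": 3, "法": 3, "的": 10}
	docFreq := map[string]int{"佛": 10, "法": 50, "的": 100}
	for _, t := range sortByTfIdf(vocab, docFreq, 100) {
		fmt.Printf("%s\t%.3f\n", t.Term, t.Weight)
	}
}
```

In this sample a term that appears in every document gets an idf of zero, so it sorts last even though its raw count in the document is the highest.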

SortedFreq

Sorts Word structs by frequency.
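
A frequency sort of this kind can be sketched with `sort.Interface`. The `wordCount` and `byFreqDesc` types and the `sortedFreq` helper below are hypothetical stand-ins, and descending order is an assumption; the listing does not state the direction.

```go
package main

import (
	"fmt"
	"sort"
)

// wordCount is a hypothetical stand-in for the package's word structs.
type wordCount struct {
	Word string
	Freq int
}

// byFreqDesc implements sort.Interface.
// ASSUMPTION: most frequent first, ties broken alphabetically.
type byFreqDesc []wordCount

func (b byFreqDesc) Len() int      { return len(b) }
func (b byFreqDesc) Swap(i, j int) { b[i], b[j] = b[j], b[i] }
func (b byFreqDesc) Less(i, j int) bool {
	if b[i].Freq != b[j].Freq {
		return b[i].Freq > b[j].Freq
	}
	return b[i].Word < b[j].Word
}

// sortedFreq flattens a word -> count map into a sorted slice.
func sortedFreq(counts map[string]int) []wordCount {
	words := make([]wordCount, 0, len(counts))
	for w, n := range counts {
		words = append(words, wordCount{Word: w, Freq: n})
	}
	sort.Sort(byFreqDesc(words))
	return words
}

func main() {
	// Sample counts are invented for illustration.
	for _, w := range sortedFreq(map[string]int{"法": 5, "佛": 7, "僧": 2}) {
		fmt.Println(w.Word, w.Freq)
	}
}
```

The alphabetical tie-break matters because Go map iteration order is random; without it, words with equal counts could appear in a different order on each run.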

UpdateDictIndex

UpdateDictIndex writes a list of dictionary words with substring arrays.

UpdateDocTitleIndex

UpdateDocTitleIndex writes a list of document titles from the hierarchical corpus with substring arrays.

WriteDocLengthToFile

Appends document analysis to a plain text file in the index directory.

WriteWFCorpus

Writes corpus analysis to plain text files in the index directory.

# Constants

BF_DOC_FILE

Bigram frequencies for each file.

BigramDocFreqFile

No description provided by the author

DocFreqFile

File name for document index.

DocLengthFile

Word frequencies for each document.

KeywordIndexFile

File name for keyword index.

NgramCorpusFile

N-gram frequencies for the corpus.

UnknownCharsFile

Unknown characters file.

WfCorpusFile

Word frequencies for corpus.

WfDocFile

Word frequencies for each document.

# Structs

CorpusWord

A word with corpus entry label.

CorpusWordFreq

A word frequency with corpus entry label.

DocLength

Records the document length for each document in the corpus.

DocumentFrequency

Map from term to number of documents referencing the term.
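
To make the idea of a term-to-document-count map concrete, here is a small sketch. The `docFreq` type and its `addDoc` method are hypothetical and do not reflect the actual fields or methods of DocumentFrequency.

```go
package main

import "fmt"

// docFreq is a hypothetical stand-in: term -> number of documents containing it.
type docFreq map[string]int

// addDoc records one document. Each distinct term is counted once per
// document, however many times it occurs in that document.
func (df docFreq) addDoc(terms []string) {
	seen := make(map[string]bool, len(terms))
	for _, t := range terms {
		if !seen[t] {
			seen[t] = true
			df[t]++
		}
	}
}

func main() {
	df := docFreq{}
	df.addDoc([]string{"佛", "法", "佛"}) // "佛" occurs twice but counts once
	df.addDoc([]string{"法", "僧"})
	fmt.Println(df["佛"], df["法"], df["僧"]) // prints: 1 2 1
}
```

Counting documents rather than occurrences is what separates document frequency from term frequency, and it is the quantity the idf part of a tf-idf weight depends on.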

IndexConfig

IndexConfig encapsulates parameters for index configuration.

IndexState

A word frequency entry record.

IndexStore

Storage for the keyword index.

Keyword

A keyword in a document.

RetrievalResult

A document-specific word frequency entry record.

SortedWF

Sorted list of word frequencies.

SortedWordItem

An entry in a sorted word array.

TermFreqDocRecord

Remembers the word frequency for each term for each document in the corpus.

WFDocEntry

A document-specific word frequency entry record.

WFEntry

A word frequency entry record.

WordFreqStore

Storage for word frequency data.

# Interfaces

FsClient

FsClient defines the Firestore interfaces needed.

# Type aliases

ByFrequencyDoc

No description provided by the author

Keywords

No description provided by the author

TermFreqDocMap

Remembers the word frequency for each term for each document in the corpus.