Module: github.com/nycmonkey/stringy (package)
Version: 1.0.0
Repository: https://github.com/nycmonkey/stringy.git
Documentation: pkg.go.dev

# README

stringy

String analysis functions for search indexing and fuzzy matching


# Functions

- `Analyze` normalizes and tokenizes a given input stream.
- `AnalyzeBytes` normalizes and tokenizes a given input stream.
- `Bigrams` returns the unique token bigrams for a given ordered list of string tokens.
- `MSAnalyze` normalizes and tokenizes a given input stream according to rules reverse-engineered to match the behavior of the MS SQL Server full-text indexer.
- `MSAnalyzeBytes` normalizes and tokenizes a given input according to rules reverse-engineered to match the behavior of the MS SQL Server full-text indexer.
- `NGramSimilarity` calculates the Jaccard similarity of the token n-grams of two input strings.
- `Shingles` returns a sorted array of shingle combinations for the given input.
- `TokenNGrams` turns an input like "abcd" into a series of trigrams such as ("abc", "bcd"). If the input is empty, the result is empty; if the input is one or two characters, the output is padded with '$'.
- `UnigramsAndBigrams` returns the unique token unigrams and bigrams for a given ordered list of string tokens.
- `URLAnalyze` attempts to normalize a URL to a simple host name, or returns an empty slice.
- `URLAnalyzeOrEmpty` attempts to normalize a URL to a simple host name, or returns an empty string.
- `VisitAnalyzedShingles` applies the provided tokenizer to the input, then calls the supplied visit function for each shingle of the tokenized input.
- `VisitShingles` calls the supplied visit function once per shingle, stopping if the visit function returns true.