pkg.gl

package

0.0.0-20231012084221-3c562c5921af

Documentation: pkg.go.dev

# Functions

Extract processes an HTML document using an HTML tokenizer and a set of extractors.

ExtractFromPage processes an HTML document as a string using a set of extractors.

ExtractUrlsFromPage parses an HTML page represented as a string and extracts valid URLs.

ExtractWordsFromPage parses an HTML page represented as a string and extracts valid words.

No description provided by the author


# Interfaces

Extractor is an interface that defines the extraction behavior for processing HTML content.