# README
# Proto Files
sentencepiece_model.proto is downloaded from the original C++ source at https://github.com/google/sentencepiece/, and it should match the one used by the github.com/eliben/go-sentencepiece library.
Because of protoc's unique file naming requirement (!?), described in the email thread at https://groups.google.com/g/protobuf/c/UWWuoRWz1Uk, we compile by first creating a directory with a unique prefix and placing the proto file inside it. See the gen_protos.sh script.
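The unique-prefix idea can be illustrated with a minimal Go sketch. It is not a transcription of gen_protos.sh: the directory names, the prefix `example_prefix`, and the helper `protocArgs` are all hypothetical. Only the flags `--proto_path`, `--go_out`, and `--go_opt=paths=source_relative` are standard protoc / protoc-gen-go options. The point is that the file is handed to protoc under a path that begins with the prefix, so the name it is registered under no longer collides with another copy of the same proto.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// protocArgs builds a protoc command line for a proto file that has been
// copied to <root>/<prefix>/<file>. The layout is an assumption made for
// illustration; gen_protos.sh may organize things differently.
func protocArgs(root, prefix, protoFile, outDir string) []string {
	// Path of the proto relative to the include root. This relative path,
	// prefix included, is the file name protoc sees.
	rel := filepath.ToSlash(filepath.Join(prefix, filepath.Base(protoFile)))
	return []string{
		"--proto_path=" + root,
		"--go_out=" + outDir,
		"--go_opt=paths=source_relative",
		rel,
	}
}

func main() {
	// Hypothetical values; nothing is executed, the arguments are only printed.
	args := protocArgs("build", "example_prefix", "sentencepiece_model.proto", ".")
	fmt.Println("protoc", args)
}
```

The sketch only assembles arguments. Running protoc itself, and copying the file into the prefix directory, is left to the script.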
# Constants
Default values for ModelProto_SentencePiece fields.
Default values for NormalizerSpec fields.
Default values for TrainerSpec fields.
Byte symbols.
Control symbols.
Normal symbol.
Unknown symbol.
This piece is not used.
User-defined symbols.
Byte Pair Encoding.
Tokenizes into a character sequence.
Unigram language model with dynamic algorithm.
Delimited by whitespace.
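The default-value constants exist because the schema uses optional fields with declared defaults: when a field is unset, the generated getter falls back to the matching default. The sketch below mirrors that pattern with local, hypothetical types (`trainerSpec`, `modelType`, `defaultVocabSize`, `defaultModelType`) instead of importing the generated package. The values 8000 and UNIGRAM are taken from the upstream sentencepiece_model.proto as best I recall; check the generated code before relying on them.

```go
package main

import "fmt"

// modelType is a local stand-in for the generated TrainerSpec_ModelType enum.
type modelType int32

const (
	modelUnigram modelType = 1 // Unigram language model.
	modelBPE     modelType = 2 // Byte Pair Encoding.
	modelWord    modelType = 3 // Delimited by whitespace.
	modelChar    modelType = 4 // Tokenizes into a character sequence.
)

// Hypothetical names standing in for the generated default constants.
const (
	defaultVocabSize = int32(8000)
	defaultModelType = modelUnigram
)

// trainerSpec mirrors two optional fields; pointers model "unset".
type trainerSpec struct {
	VocabSize *int32
	ModelType *modelType
}

// GetVocabSize returns the field value, or the default when it is unset.
func (t *trainerSpec) GetVocabSize() int32 {
	if t != nil && t.VocabSize != nil {
		return *t.VocabSize
	}
	return defaultVocabSize
}

// GetModelType returns the field value, or the default when it is unset.
func (t *trainerSpec) GetModelType() modelType {
	if t != nil && t.ModelType != nil {
		return *t.ModelType
	}
	return defaultModelType
}

func main() {
	size := int32(32000)
	spec := &trainerSpec{VocabSize: &size} // ModelType left unset.
	fmt.Println(spec.GetVocabSize(), spec.GetModelType() == modelUnigram)
}
```

Because the getters accept a nil receiver, callers can read defaults without first checking whether the message itself is present.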
# Variables
No description provided by the author
Enum value maps for ModelProto_SentencePiece_Type (one maps numbers to names, the other names to numbers).
Enum value maps for TrainerSpec_ModelType (one maps numbers to names, the other names to numbers).
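As a rough illustration of how enum value maps are typically used, the sketch below defines two local maps shaped like the ones protoc-gen-go emits for an enum (number to name, and name to number). The map names `modelTypeName` and `modelTypeValue` and both helper functions are hypothetical; the numeric values follow the upstream ModelType enum as I understand it.

```go
package main

import "fmt"

// Local stand-ins for the generated maps of TrainerSpec_ModelType.
var modelTypeName = map[int32]string{
	1: "UNIGRAM",
	2: "BPE",
	3: "WORD",
	4: "CHAR",
}

var modelTypeValue = map[string]int32{
	"UNIGRAM": 1,
	"BPE":     2,
	"WORD":    3,
	"CHAR":    4,
}

// modelTypeString converts an enum number to its name, with a readable
// fallback for numbers that are not in the map.
func modelTypeString(n int32) string {
	if name, ok := modelTypeName[n]; ok {
		return name
	}
	return fmt.Sprintf("ModelType(%d)", n)
}

// parseModelType converts a name to its enum number; ok is false for
// unknown names.
func parseModelType(name string) (n int32, ok bool) {
	n, ok = modelTypeValue[name]
	return n, ok
}

func main() {
	fmt.Println(modelTypeString(2))
	n, ok := parseModelType("UNIGRAM")
	fmt.Println(n, ok)
}
```

Parsing through the name map is handy when the model type arrives as text, for example from a command-line flag or a configuration file.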
# Structs
ModelProto stores model parameters.
No description provided by the author
NormalizerSpec encodes various parameters for string normalization.
Proto to store samples for self-testing.
No description provided by the author
TrainerSpec encodes various parameters for SentencePiece training.
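To give a feel for how the model structure is consumed, here is a small sketch that builds a piece-to-id vocabulary from a list of pieces, where a piece's id is its index in the list. The types `modelProto`, `sentencePiece`, and `pieceType` are local, hypothetical mirrors of the generated structs, not the package's actual API, and the sample pieces are made up for the example.

```go
package main

import "fmt"

// pieceType is a local stand-in for ModelProto_SentencePiece_Type.
type pieceType int32

const (
	pieceNormal      pieceType = 1 // Normal symbol.
	pieceUnknown     pieceType = 2 // Unknown symbol.
	pieceControl     pieceType = 3 // Control symbols.
	pieceUserDefined pieceType = 4 // User-defined symbols.
	pieceUnused      pieceType = 5 // This piece is not used.
	pieceByte        pieceType = 6 // Byte symbols.
)

// sentencePiece and modelProto mirror only the fields this sketch needs.
type sentencePiece struct {
	Piece string
	Score float32
	Type  pieceType
}

type modelProto struct {
	Pieces []sentencePiece
}

// buildVocab maps each piece to its id (its index in Pieces) and reports
// the id of the unknown symbol, or -1 if the model has none.
func buildVocab(m *modelProto) (vocab map[string]int, unkID int) {
	vocab = make(map[string]int, len(m.Pieces))
	unkID = -1
	for id, p := range m.Pieces {
		vocab[p.Piece] = id
		if p.Type == pieceUnknown {
			unkID = id
		}
	}
	return vocab, unkID
}

func main() {
	m := &modelProto{Pieces: []sentencePiece{
		{Piece: "<unk>", Type: pieceUnknown},
		{Piece: "<s>", Type: pieceControl},
		{Piece: "hello", Score: -1.5, Type: pieceNormal},
	}}
	vocab, unkID := buildVocab(m)
	fmt.Println(len(vocab), vocab["hello"], unkID)
}
```

A real tokenizer would also use the scores and treat control, user-defined, and byte pieces differently; the sketch stops at the lookup table.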