package
0.0.0-20200818183458-d966d878d120
Repository: https://github.com/grailbio/bio.git
Documentation: pkg.go.dev
# Functions
FaiToReferenceLengths reads a FASTA index (*.fai) file and returns a map from reference name to reference length.
GenerateIndex generates an index (*.fai) from FASTA data.
New creates a new Fasta that holds all the FASTA data from the given reader in memory.
NewIndexed creates a new Fasta that can perform efficient random lookups using the provided index, without reading the data into memory.
OptClean specifies that returned FASTA sequences should be cleaned as described in biosimd.CleanASCIISeq*.
OptEncoding specifies the encoding of the in-memory FASTA sequences.
OptIndex makes New read the FASTA file using the provided index, like NewIndexed.
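To illustrate what a *.fai index holds and why it allows random lookups without loading the sequence data, here is a minimal standalone sketch. It assumes the usual samtools faidx layout of five tab-separated columns (name, length, offset, bases per line, bytes per line). The names faiEntry, parseFai, and byteOffset are hypothetical and are not part of this package; the package's own parsing may differ.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// faiEntry holds the numeric columns of one .fai line.
// Hypothetical type for this sketch; not the package's own.
type faiEntry struct {
	length    uint64 // number of bases in the sequence
	offset    uint64 // byte offset of the sequence's first base
	lineBases uint64 // bases per line
	lineWidth uint64 // bytes per line, including the line terminator
}

// parseFai reads tab-separated .fai text and returns entries keyed by name.
func parseFai(text string) (map[string]faiEntry, error) {
	entries := make(map[string]faiEntry)
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		line := sc.Text()
		if line == "" {
			continue
		}
		fields := strings.Split(line, "\t")
		if len(fields) < 5 {
			return nil, fmt.Errorf("malformed .fai line: %q", line)
		}
		var nums [4]uint64
		for i := range nums {
			n, err := strconv.ParseUint(fields[i+1], 10, 64)
			if err != nil {
				return nil, fmt.Errorf("bad column %d in %q: %v", i+2, line, err)
			}
			nums[i] = n
		}
		if nums[2] == 0 {
			return nil, fmt.Errorf("zero bases per line in %q", line)
		}
		entries[fields[0]] = faiEntry{
			length:    nums[0],
			offset:    nums[1],
			lineBases: nums[2],
			lineWidth: nums[3],
		}
	}
	return entries, sc.Err()
}

// byteOffset returns the file offset of the 0-based base position pos.
func (e faiEntry) byteOffset(pos uint64) uint64 {
	return e.offset + (pos/e.lineBases)*e.lineWidth + pos%e.lineBases
}

func main() {
	// A tiny FASTA file and the index that describes it.
	fasta := ">chr1\nACGT\nAC\n>chr2\nGGG\n"
	fai := "chr1\t6\t6\t4\t5\nchr2\t3\t20\t3\t4\n"

	entries, err := parseFai(fai)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// Reference lengths come straight from the second column.
	fmt.Println(entries["chr1"].length, entries["chr2"].length) // prints "6 3"

	// The fifth base of chr1 (0-based position 4) sits on the second line.
	e := entries["chr1"]
	fmt.Printf("%c\n", fasta[e.byteOffset(4)]) // prints "A"
}
```

With offsets like these, a reader can seek directly to the bytes for a requested range, which is the general idea behind index-based lookup.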
# Constants
CleanASCII encoding capitalizes all lowercase 'a'/'c'/'g'/'t', and converts all non-ACGT characters to 'N'.
TODO(cchang): Add 'Base5' encoding, where 'A'/'a' = 0, 'C'/'c' = 1, 'G'/'g' = 2, 'T'/'t' = 3, anything else = 4.
RawASCII encoding preserves the original bytes, including case.
Seq8 encoding is 'A'/'a' = 1, 'C'/'c' = 2, 'G'/'g' = 4, 'T'/'t' = 8, anything else = 15.
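The CleanASCII and Seq8 mappings described above can be sketched with two small lookup functions. This is only an illustration of the documented byte mappings, not the package's implementation (which refers to biosimd for cleaning); the function names cleanASCII and seq8 are hypothetical.

```go
package main

import "fmt"

// cleanASCII uppercases a/c/g/t and maps every non-ACGT byte to 'N'.
func cleanASCII(seq []byte) []byte {
	out := make([]byte, len(seq))
	for i, b := range seq {
		switch b {
		case 'A', 'a':
			out[i] = 'A'
		case 'C', 'c':
			out[i] = 'C'
		case 'G', 'g':
			out[i] = 'G'
		case 'T', 't':
			out[i] = 'T'
		default:
			out[i] = 'N'
		}
	}
	return out
}

// seq8 maps A/a to 1, C/c to 2, G/g to 4, T/t to 8, and anything else to 15.
func seq8(seq []byte) []byte {
	out := make([]byte, len(seq))
	for i, b := range seq {
		switch b {
		case 'A', 'a':
			out[i] = 1
		case 'C', 'c':
			out[i] = 2
		case 'G', 'g':
			out[i] = 4
		case 'T', 't':
			out[i] = 8
		default:
			out[i] = 15
		}
	}
	return out
}

func main() {
	raw := []byte("acgTNx")
	fmt.Printf("%s\n", cleanASCII(raw)) // prints "ACGTNN"
	fmt.Println(seq8(raw))              // prints "[1 2 4 8 15 15]"
}
```

RawASCII needs no function here, since it leaves the input bytes unchanged.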
# Interfaces
Fasta represents FASTA-formatted data, consisting of a set of named sequences.