github.com/Destaq/cedict
Module, package
Version: 1.0.2
Repository: https://github.com/destaq/cedict.git
Documentation: pkg.go.dev

# README

cedict: Chinese-English Dictionary Go Package

Overview

A Go library for CC-CEDICT, the community-maintained Chinese-English dictionary published by MDBG.

https://www.mdbg.net/chinese/dictionary?page=cedict

The basic format of a CEDICT entry is:

Traditional Simplified [pin1 yin1] /American English equivalent 1/equivalent 2/
漢字 汉字 [han4 zi4] /Chinese character/CL:個|个/
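
As a rough illustration of the entry format above, the following sketch splits one line into its four parts using only the standard library. The `Entry` struct and `parseLine` function here are hypothetical names for this example; they are not the package's own types or parser, which may handle more cases (comments, header metadata, malformed lines).

```go
package main

import (
	"fmt"
	"strings"
)

// Entry is a hypothetical struct for this sketch; the package's Entry type may differ.
type Entry struct {
	Traditional string
	Simplified  string
	Pinyin      string
	Definitions []string
}

// parseLine splits one CC-CEDICT line of the form
// "Traditional Simplified [pin1 yin1] /def 1/def 2/" into its parts.
func parseLine(line string) (Entry, error) {
	line = strings.TrimSpace(line)
	open := strings.Index(line, "[")
	closeIdx := strings.Index(line, "]")
	if open < 0 || closeIdx < open {
		return Entry{}, fmt.Errorf("missing pinyin brackets: %q", line)
	}
	// Everything before the first "[" holds the two headwords.
	heads := strings.Fields(line[:open])
	if len(heads) != 2 {
		return Entry{}, fmt.Errorf("expected traditional and simplified forms: %q", line)
	}
	// Everything after the first "]" holds the slash-delimited definitions.
	defs := strings.Trim(strings.TrimSpace(line[closeIdx+1:]), "/")
	if defs == "" {
		return Entry{}, fmt.Errorf("no definitions: %q", line)
	}
	return Entry{
		Traditional: heads[0],
		Simplified:  heads[1],
		Pinyin:      line[open+1 : closeIdx],
		Definitions: strings.Split(defs, "/"),
	}, nil
}

func main() {
	e, err := parseLine("漢字 汉字 [han4 zi4] /Chinese character/CL:個|个/")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", e)
}
```

The sketch treats the first bracketed span as the pinyin field, so brackets that appear later inside a definition are left untouched.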

Install

First, grab the latest version of the package:

go get -u github.com/Destaq/cedict

Next, include it in your application:

import "github.com/Destaq/cedict"

Getting Started

d := cedict.New()
fmt.Printf("%s\n", cedict.PinyinTones(d.HanziToPinyin("你好,世界!")))

Contributing

  1. Fork the repo
  2. Clone your fork (git clone https://github.com/<username>/cedict && cd cedict)
  3. Create your own branch (git checkout -b my-patch)
  4. Make changes and add them (git add .)
  5. Commit your changes (git commit -m 'Fixed #123')
  6. Push to the branch (git push origin my-patch)
  7. Create a new pull request

License

Copyright 2020 John Cramb. All rights reserved.

Licensed under the MIT License. See LICENSE in the project root for license information.

# Functions

ConvertSymbols replaces common hanzi symbols with Latin symbols.
Download returns a Dict using the latest CC-CEDICT archive from MDBG.
FixSymbolSpaces removes spaces added by HanziToPinyin conversion and makes the string look more natural.
IsHanzi returns true if the string contains only Han characters.
Load returns a Dict loaded from a CC-CEDICT formatted file.
New returns a Dict immediately but downloads the latest CC-CEDICT data in the background.
Parse creates a Dict instance from an io.Reader. It expects text input in the format described at https://cc-cedict.org/wiki/format:syntax.
PinyinPlaintext returns the pinyin string without tone marks or tone numbers.
PinyinToneNums returns the pinyin string with tone marks converted to tone numbers.
PinyinTones returns the pinyin string with tone numbers converted to tone marks.
StripDigits returns the string with all Unicode digits removed.
StripTones returns the string with all nonspacing marks (Unicode category Mn) removed.
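
To make the behaviour of two of these helpers more concrete, here is a minimal standard-library sketch of a Han-only check and of digit stripping, in the spirit of IsHanzi and StripDigits. The names `onlyHan` and `dropDigits` are hypothetical, and this is not the package's implementation; in particular, returning false for the empty string is an assumption made for this example.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// onlyHan reports whether every rune in s belongs to the Han script.
// Assumption for this sketch: the empty string is not considered Han.
func onlyHan(s string) bool {
	if s == "" {
		return false
	}
	for _, r := range s {
		if !unicode.Is(unicode.Han, r) {
			return false
		}
	}
	return true
}

// dropDigits removes every Unicode decimal digit (category Nd) from s.
func dropDigits(s string) string {
	return strings.Map(func(r rune) rune {
		if unicode.IsDigit(r) {
			return -1 // a negative rune tells strings.Map to drop the character
		}
		return r
	}, s)
}

func main() {
	fmt.Println(onlyHan("你好"))         // true
	fmt.Println(onlyHan("你好!"))        // false: "!" is punctuation, not Han
	fmt.Println(dropDigits("han4 zi4")) // han zi
}
```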

# Constants

LineEnding is the line ending used by Save(); it defaults to "\r\n" to match the original content.
MaxLD controls the maximum Levenshtein distance allowed for matches.
MaxResults sets the maximum number of entries returned by any Dict method.
URL of the latest CC-CEDICT data in .tar.gz archive format.
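
For readers unfamiliar with the distance that MaxLD bounds, the sketch below computes the Levenshtein (edit) distance between two strings and checks it against a limit. The function names and the `limit` parameter are hypothetical; the package's actual matching code and the value of MaxLD are not shown here.

```go
package main

import "fmt"

// min3 returns the smallest of three ints.
func min3(a, b, c int) int {
	if b < a {
		a = b
	}
	if c < a {
		a = c
	}
	return a
}

// levenshtein returns the number of single-rune insertions, deletions,
// and substitutions needed to turn a into b.
func levenshtein(a, b string) int {
	ra, rb := []rune(a), []rune(b)
	prev := make([]int, len(rb)+1) // distances for the previous row
	curr := make([]int, len(rb)+1) // distances for the current row
	for j := range prev {
		prev[j] = j
	}
	for i := 1; i <= len(ra); i++ {
		curr[0] = i
		for j := 1; j <= len(rb); j++ {
			cost := 1
			if ra[i-1] == rb[j-1] {
				cost = 0
			}
			curr[j] = min3(prev[j]+1, curr[j-1]+1, prev[j-1]+cost)
		}
		prev, curr = curr, prev
	}
	return prev[len(rb)]
}

// withinLimit reports whether a and b are at most limit edits apart.
func withinLimit(a, b string, limit int) bool {
	return levenshtein(a, b) <= limit
}

func main() {
	fmt.Println(levenshtein("kitten", "sitting")) // 3
	fmt.Println(levenshtein("汉字", "漢字"))          // 1
	fmt.Println(withinLimit("ni hao", "ni ha", 1)) // true
}
```

Working on runes rather than bytes matters for this data: a single hanzi occupies several bytes in UTF-8, so a byte-wise distance would overcount substitutions.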

# Structs

Dict represents an instance of the CC-CEDICT entries.
Entry represents a single entry in the CC-CEDICT dictionary.
Metadata represents information embedded in the CC-CEDICT header.