# README
cedict
Chinese-English dictionary package for Go
Overview
Go library for CC-CEDICT, the community-maintained Chinese-English dictionary published by MDBG.
The basic format of a CEDICT entry is:
Traditional Simplified [pin1 yin1] /American English equivalent 1/equivalent 2/
漢字 汉字 [han4 zi4] /Chinese character/CL:個|个/
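To make the entry format concrete, here is a minimal sketch of splitting one line into its fields using only the standard library. It is not this package's parser: the `Entry` struct and the `parseLine` function are hypothetical names introduced for illustration, and the package's own types and error handling may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// Entry is a hypothetical type for this sketch, not the package's own type.
type Entry struct {
	Traditional string
	Simplified  string
	Pinyin      string
	Definitions []string
}

// parseLine splits one CC-CEDICT line of the form
//   Traditional Simplified [pin1 yin1] /definition 1/definition 2/
// into its fields. Blank lines and "#" comment lines are rejected.
func parseLine(line string) (Entry, error) {
	line = strings.TrimSpace(line)
	if line == "" || strings.HasPrefix(line, "#") {
		return Entry{}, fmt.Errorf("not an entry: %q", line)
	}
	open := strings.Index(line, "[")
	end := strings.Index(line, "]")
	if open < 0 || end < open {
		return Entry{}, fmt.Errorf("missing pinyin brackets: %q", line)
	}
	// Everything before "[" holds the two headword forms.
	words := strings.Fields(line[:open])
	if len(words) != 2 {
		return Entry{}, fmt.Errorf("expected two headword forms: %q", line)
	}
	// Everything after "]" holds the slash-delimited definitions.
	defs := strings.Trim(strings.TrimSpace(line[end+1:]), "/")
	if defs == "" {
		return Entry{}, fmt.Errorf("no definitions: %q", line)
	}
	return Entry{
		Traditional: words[0],
		Simplified:  words[1],
		Pinyin:      line[open+1 : end],
		Definitions: strings.Split(defs, "/"),
	}, nil
}

func main() {
	e, err := parseLine("漢字 汉字 [han4 zi4] /Chinese character/CL:個|个/")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", e)
}
```

The sketch relies on the headword forms containing no spaces or brackets, which holds for the example entry above.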
Install
First, grab the latest version of the package:
go get -u github.com/Destaq/cedict
Next, include it in your application:
import "github.com/Destaq/cedict"
Getting Started
d := cedict.New()
fmt.Printf("%s\n", cedict.PinyinTones(d.HanziToPinyin("你好,世界!")))
Contributing
- Fork the repo
- Clone your fork (git clone https://github.com/<username>/cedict && cd cedict)
- Create your own branch (git checkout -b my-patch)
- Make changes and add them (git add .)
- Commit your changes (git commit -m 'Fixed #123')
- Push to the branch (git push origin my-patch)
- Create a new pull request
License
Copyright 2020 John Cramb. All rights reserved.
Licensed under the MIT License. See LICENSE in the project root for license information.
# Functions
ConvertSymbols replaces common hanzi symbols with Latin symbols.
Download returns a Dict using the latest CC-CEDICT archive from MDBG.
FixSymbolSpaces removes spaces added by HanziToPinyin conversion and makes the string look more natural.
IsHanzi returns true if the string contains only Han characters.
Load returns a Dict loaded from a CC-CEDICT formatted file.
New returns a Dict immediately but downloads the latest CC-CEDICT data in the background.
Parse creates a Dict instance from an io.Reader. It expects text input in the CC-CEDICT format described at https://cc-cedict.org/wiki/format:syntax.
PinyinPlaintext returns the pinyin string without tone marks or tone numbers.
PinyinToneNums returns the pinyin string with tone marks converted to tone numbers.
PinyinTones returns the pinyin string with tone numbers converted to tone marks.
StripDigits returns the string with all unicode digits removed.
StripTones returns the string with all nonspacing marks (Unicode category Mn) removed.
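Several of the helpers above are described in terms of Unicode character classes. The following sketch shows how such checks can be written with the standard `unicode` package. It is an illustration under assumptions, not the package's source: the names `isHan`, `stripDigits`, and `stripMarks` are hypothetical, and treating the empty string as non-Han is a choice made here, not documented behavior.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// isHan reports whether s is non-empty and every rune belongs to the Han script.
// Returning false for "" is an assumption of this sketch.
func isHan(s string) bool {
	if s == "" {
		return false
	}
	for _, r := range s {
		if !unicode.Is(unicode.Han, r) {
			return false
		}
	}
	return true
}

// stripDigits drops every Unicode decimal digit, e.g. pinyin tone numbers.
func stripDigits(s string) string {
	return strings.Map(func(r rune) rune {
		if unicode.IsDigit(r) {
			return -1 // a negative value tells strings.Map to drop the rune
		}
		return r
	}, s)
}

// stripMarks drops every nonspacing mark (category Mn). It only removes tone
// marks that are separate combining runes; precomposed letters such as "à"
// would first need to be decomposed, which this sketch does not do.
func stripMarks(s string) string {
	return strings.Map(func(r rune) rune {
		if unicode.Is(unicode.Mn, r) {
			return -1
		}
		return r
	}, s)
}

func main() {
	fmt.Println(isHan("漢字"), isHan("漢字!"))
	fmt.Println(stripDigits("han4 zi4"))
	fmt.Println(stripMarks("ha\u0300n zi\u0300"))
}
```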
# Constants
LineEnding is the line ending used by Save(). It defaults to "\r\n" to match the original content.
MaxLD controls the maximum Levenshtein distance allowed for matches.
MaxResults sets the maximum number of entries returned by any Dict method.
URL of the latest CC-CEDICT data in .tar.gz archive format.
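MaxLD refers to Levenshtein distance: the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into another. The sketch below shows the standard two-row dynamic-programming computation over runes, so that each hanzi counts as one character. It is not taken from this package; `levenshtein`, `min3`, and the limit of 2 used in `main` are hypothetical and chosen only for illustration.

```go
package main

import "fmt"

// min3 returns the smallest of three ints.
func min3(a, b, c int) int {
	if b < a {
		a = b
	}
	if c < a {
		a = c
	}
	return a
}

// levenshtein returns the edit distance between a and b, counted in runes.
func levenshtein(a, b string) int {
	ra, rb := []rune(a), []rune(b)
	prev := make([]int, len(rb)+1) // distances for the previous row
	cur := make([]int, len(rb)+1)  // distances for the row being filled
	for j := range prev {
		prev[j] = j // turning "" into rb[:j] takes j insertions
	}
	for i := 1; i <= len(ra); i++ {
		cur[0] = i // turning ra[:i] into "" takes i deletions
		for j := 1; j <= len(rb); j++ {
			cost := 1
			if ra[i-1] == rb[j-1] {
				cost = 0
			}
			cur[j] = min3(prev[j]+1, cur[j-1]+1, prev[j-1]+cost)
		}
		prev, cur = cur, prev
	}
	return prev[len(rb)]
}

func main() {
	const limit = 2 // hypothetical threshold, standing in for a value like MaxLD
	for _, pair := range [][2]string{{"kitten", "sitting"}, {"汉字", "漢字"}} {
		d := levenshtein(pair[0], pair[1])
		fmt.Println(pair[0], pair[1], d, d <= limit)
	}
}
```

A lower limit keeps fuzzy matches closer to the query, while a higher one admits more distant candidates.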