From 624cb14e00f771f734bab10d3e08282418b668a6 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?=E5=BE=90=E4=BD=B3=E5=86=9B?=
Date: Wed, 17 Oct 2018 09:09:22 +0800
Subject: [PATCH] add https://github.com/xujiajun/gotokenizer library (#2160)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Please check if what you want to add to `awesome-go` list meets [quality standards](https://github.com/avelino/awesome-go/blob/master/CONTRIBUTING.md#quality-standard) before sending pull request. Thanks!

**Please provide package links to:**

- github.com repo: https://github.com/xujiajun/gotokenizer
- godoc.org: https://godoc.org/github.com/xujiajun/gotokenizer
- goreportcard.com: https://goreportcard.com/report/github.com/xujiajun/gotokenizer
- coverage service link: https://coveralls.io/github/xujiajun/gotokenizer?branch=master

Very good coverage

**Note**: that new categories can be added only when there are 3 packages or more.

**Make sure that you've checked the boxes below before you submit PR:**

- [✔️ ] I have added my package in alphabetical order.
- [✔️ ] I have an appropriate description with correct grammar.
- [✔️ ] I know that this package was not listed before.
- [✔️ ] I have added godoc link to the repo and to my pull request.
- [✔️ ] I have added coverage service link to the repo and to my pull request.
- [✔️ ] I have added goreportcard link to the repo and to my pull request.
- [✔️] I have read [Contribution guidelines](https://github.com/avelino/awesome-go/blob/master/CONTRIBUTING.md#contribution-guidelines), [maintainers note](https://github.com/avelino/awesome-go/blob/master/CONTRIBUTING.md#maintainers) and [Quality standard](https://github.com/avelino/awesome-go/blob/master/CONTRIBUTING.md#quality-standard).

Thanks for your PR, you're awesome! :+1:
---
 README.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/README.md b/README.md
index e6bcdb3e..509bd494 100644
--- a/README.md
+++ b/README.md
@@ -926,6 +926,7 @@ See [go-hardware](https://github.com/rakyll/go-hardware) for a comprehensive lis
 * [go2vec](https://github.com/danieldk/go2vec) - Reader and utility functions for word2vec embeddings.
 * [gojieba](https://github.com/yanyiwu/gojieba) - This is a Go implementation of [jieba](https://github.com/fxsjy/jieba) which a Chinese word splitting algorithm.
 * [golibstemmer](https://github.com/rjohnsondev/golibstemmer) - Go bindings for the snowball libstemmer library including porter 2.
+* [gotokenizer](https://github.com/xujiajun/gotokenizer) - A tokenizer based on the dictionary and Bigram language models for Golang. (Now only support chinese segmentation)
 * [gounidecode](https://github.com/fiam/gounidecode) - Unicode transliterator (also known as unidecode) for Go.
 * [gse](https://github.com/go-ego/gse) - Go efficient text segmentation; support english, chinese, japanese and other.
 * [icu](https://github.com/goodsign/icu) - Cgo binding for icu4c C library detection and conversion functions. Guaranteed compatibility with version 50.1.