Commit bcdb416

monthly update
1 parent 66acacc commit bcdb416

2 files changed: +13 -13 lines changed

README-zh_CN.md

Lines changed: 7 additions & 7 deletions
@@ -188,7 +188,7 @@ A股 今日 迎来 4 月 开门红 三大 指数 集体 收涨 其中
 ### 2. Metrics and scores
 #### 2.1 Industry dataset tests
 ##### 2.1.1 Financial industry (banking), word segmentation test
-###### CaCl2 banking dictionary word segmentation(code example
+###### Code example (python): word segmentation test on a bank's "Bank Loan Contract" using the CaCl2 dictionary
 ```python
 import jieba
 dict_name = '480100.txt'
@@ -198,9 +198,9 @@ print("cacl2: " + "/ ".join(seg_list))
 ```
 ![Financial industry (banking) word segmentation test](https://github.com/limccn/cacl2/blob/master/docs/images/480100.png)

-[Detailed word segmentation results](https://github.com/limccn/cacl2/docs/480100_cacl2_seg.txt)
+[Detailed word segmentation results](https://github.com/limccn/cacl2/raw/master/docs/480100_cacl2_seg.txt)
 ##### 2.1.2 Financial industry (excluding banking), word segmentation test
-###### CaCl2 standard financial dictionary word segmentation(code example
+###### Code example (python): word segmentation test on the "Moving Averages" chapter of a securities textbook using the CaCl2 dictionary
 ```python
 import jieba
 dict_name = '490000.txt'
@@ -210,7 +210,7 @@ print("cacl2: " + "/ ".join(seg_list))
 ```
 ![Financial industry (excluding banking) word segmentation test](https://github.com/limccn/cacl2/blob/master/docs/images/490000.png)

-[Detailed word segmentation results](https://github.com/limccn/cacl2/docs/490000_cacl2_seg.txt)
+[Detailed word segmentation results](https://github.com/limccn/cacl2/raw/master/docs/490000_cacl2_seg.txt)
 #### 2.2 Standard dataset tests
 ##### 2.2.1 Word segmentation test on the standard dataset Chinese Treebank (CTB5), [reference link](https://www.cs.brandeis.edu/~clp/ctb/)
 ![Word segmentation test on the standard dataset CTB5]()
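The link changes in these hunks follow one pattern: a path placed directly after `limccn/cacl2` gains a `raw/master/` segment, because GitHub serves file contents only under a view segment such as `raw/<branch>/` or `blob/<branch>/`. A minimal sketch of that rewrite, assuming the branch is `master` as in the new URLs; `to_raw_url` is a hypothetical helper for illustration and is not part of the commit:

```python
from urllib.parse import urlsplit, urlunsplit

# Path segments that already select a GitHub file view.
VIEW_SEGMENTS = {"raw", "blob", "tree"}


def to_raw_url(url: str, branch: str = "master") -> str:
    """Insert raw/<branch>/ after owner/repo when no view segment is present.

    Hypothetical helper: the branch name is an assumption taken from the
    corrected links in the diff.
    """
    parts = urlsplit(url)
    segments = parts.path.strip("/").split("/")
    # Leave non-GitHub URLs and bare owner/repo URLs untouched.
    if parts.netloc != "github.com" or len(segments) < 3:
        return url
    owner, repo, rest = segments[0], segments[1], segments[2:]
    if rest[0] in VIEW_SEGMENTS:
        return url
    new_path = "/" + "/".join([owner, repo, "raw", branch, *rest])
    return urlunsplit(parts._replace(path=new_path))


old_link = "https://github.com/limccn/cacl2/docs/score.txt"
print(to_raw_url(old_link))
```

Links that already contain `blob/master/`, such as the image links in the context lines, pass through unchanged.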
@@ -232,9 +232,10 @@ Scoring results for the word segmentation test on the ICWB2 standard dataset:
 === IV Recall Rate: 0.795
 ### pku_cacl2_seg.txt 1796 10090 12567 24453 104372 96078 0.783 0.851 0.815 0.058 0.582 0.795
 ```
+
 ![Word segmentation test on the standard dataset ICWB2](https://github.com/limccn/cacl2/blob/master/docs/images/score.png)

-[Detailed scoring results](https://github.com/limccn/cacl2/docs/score.txt)
+[Detailed scoring results](https://github.com/limccn/cacl2/raw/master/docs/score.txt)

 ## 5. History and changelog
 ### 1. Regular releases
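@@ -247,7 +248,7 @@ Scoring results for the word segmentation test on the ICWB2 standard dataset:
 ### 2. Automatic releases
 | Latest version | Release cycle | Release date | Changelog |
 | :----: | :----: | :----: | :---- |
-| v0.2.21.04 | monthly | 2021-05-07 | Added standard dataset test results |
+| v0.2.21.04 | monthly | 2021-05-07 | Added ICWB2 standard dataset test results |
 | v0.2.21.03 | monthly | 2021-04-06 | Published financial industry test data results |
 | v0.2.21.02 | monthly | 2021-03-01 | Added about 1 million candidate entries across 28 industries |
 | v0.2.21.01 | monthly | 2021-02-01 | Released financial industry (banking and non-bank financials) dictionaries |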
@@ -273,7 +274,6 @@ The CaCl2 source code is under [Apache License 2.0](https://www.apache.org/licenses/LICENSE
 See the License for the specific language governing permissions and
 limitations under the License.
 ```
-
 ### 2. Creative Commons license
 CaCl2's open dictionaries, corpora, models, and other materials are released under the [Creative Commons BY-NC-SA 4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/) license.
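The `###` summary line in the `@@ -232,9 +232,10 @@` hunk above packs the whole ICWB2 score into one row. The sketch below parses that row and cross-checks its numbers. The field names are assumptions based on the usual column order of the ICWB2 `score` script output (only the final IV recall value is labeled in the visible hunk), and `Icwb2Summary`, `parse_summary`, and `f_measure` are hypothetical names used for illustration:

```python
from typing import NamedTuple


class Icwb2Summary(NamedTuple):
    # Column names are assumed, not confirmed by the diff itself.
    name: str
    insertions: int
    deletions: int
    substitutions: int
    nchange: int
    true_words: int
    test_words: int
    recall: float
    precision: float
    f_measure: float
    oov_rate: float
    oov_recall: float
    iv_recall: float


def parse_summary(line: str) -> Icwb2Summary:
    """Split a '### file n n n ...' summary row into typed fields."""
    fields = line.lstrip("#").split()
    if len(fields) != 13:
        raise ValueError(f"expected 13 fields, got {len(fields)}")
    counts = [int(value) for value in fields[1:7]]
    rates = [float(value) for value in fields[7:]]
    return Icwb2Summary(fields[0], *counts, *rates)


def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure: harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


row = ("### pku_cacl2_seg.txt 1796 10090 12567 24453 104372 96078 "
       "0.783 0.851 0.815 0.058 0.582 0.795")
summary = parse_summary(row)

# Under the assumed layout, the three edit counts add up to the change total.
edits = summary.insertions + summary.deletions + summary.substitutions
print(edits == summary.nchange)

# F recomputed from the rounded rates; it can differ from the reported value
# in the last digit because the inputs are already rounded.
print(abs(f_measure(summary.precision, summary.recall) - summary.f_measure) < 0.001)
```

Both checks agree with the reported row, which supports the assumed column order without proving it.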

README.md

Lines changed: 6 additions & 6 deletions
@@ -197,7 +197,7 @@ A股 今日 迎来 4 月 开门红 三大 指数 集体 收涨 其中
 #### 2.1 industrial test dataset
 Word segmentation test use for different industrial textual data
 ##### 2.1.1 Word segmentation use financial industry(banking industry Only)dictionary
-###### CaCl2 Word segmentation(demo
+###### Sample: Word segmentation using CaCl2(source code written in python
 ```python
 import jieba
 dict_name = '480100.txt'
@@ -207,9 +207,9 @@ print("cacl2: " + "/ ".join(seg_list))
 ```
 ![Financial industry(banking industry Only) Word segmentation](https://github.com/limccn/cacl2/blob/master/docs/images/480100.png)

-[Word segmentation output](https://github.com/limccn/cacl2/docs/480100_cacl2_seg.txt)
+[Word segmentation output](https://github.com/limccn/cacl2/raw/master/docs/480100_cacl2_seg.txt)
 ##### 2.1.2 Word segmentation use Financial industry(Except banking industry)dictionary
-###### CaCl2 Word segmentation(demo
+###### Sample: Word segmentation using CaCl2(source code written in python
 ```python
 import jieba
 dict_name = '490000.txt'
@@ -219,7 +219,7 @@ print("cacl2: " + "/ ".join(seg_list))
 ```
 ![Financial industry(Except banking industry) Word segmentation](https://github.com/limccn/cacl2/blob/master/docs/images/490000.png)

-[Word segmentation output](https://github.com/limccn/cacl2/docs/490000_cacl2_seg.txt)
+[Word segmentation output](https://github.com/limccn/cacl2/raw/master/docs/490000_cacl2_seg.txt)

 #### 2.2 Standard test dataset
 Word segmentation test use Standard Chinese test dataset
@@ -245,7 +245,7 @@ Score for ICWB:
 ```
 ![Test word segmentation with ICWB2](https://github.com/limccn/cacl2/blob/master/docs/images/score.png)

-[Score for ICWB](https://github.com/limccn/cacl2/docs/score.txt)
+[Score for ICWB](https://github.com/limccn/cacl2/raw/master/docs/score.txt)

 ## History and changelogs
 ### 1.Regular releases
@@ -258,7 +258,7 @@ Score for ICWB:
 ### 2.Monthly/Quarterly releases
 | Version | Circle | Date | Changelogs |
 | :----: | :----: | :----: | :---- |
-| v0.2.21.04 | monthly | 2021-05-07 | industrial test and code added |
+| v0.2.21.04 | monthly | 2021-05-07 | ICWB2 test and code added |
 | v0.2.21.03 | monthly | 2021-04-06 | Comparsion test and code added |
 | v0.2.21.02 | monthly | 2021-03-01 | Candidate entries added |
 | v0.2.21.01 | monthly | 2021-02-01 | Release: banking and financials dictionary |
