cnsenti: a Python package for Chinese sentiment analysis #53
chengjun started this conversation in Show and tell
https://github.com/hiDaDeng/cnsenti
cnsenti has now been integrated into cntext. Stars are welcome!
cnsenti
Chinese documentation
The cnsenti library performs sentiment analysis and emotion analysis on Chinese text.
Features
Notes
The emotion analysis in the code uses the emotion ontology library of Dalian University of Technology. If you publish a paper, please pay attention to its user license agreement.
Installation
method 1
method 2
Quick Start
Count the number of positive and negative emotional words in Chinese text
Run
Count the number of words with different emotions in Chinese text
Run
Documents
cnsenti includes two classes: Emotion and Sentiment.
3.1 emotion_count(text)
emotion_count(text) counts the words in the text that fall into each emotion category. It uses the Dalian University of Technology Emotion Ontology dictionary and supports seven emotions: 好 (good), 乐 (happy), 哀 (sad), 怒 (angry), 惧 (fear), 恶 (disgust), 惊 (shock).
Run
detail
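To make the idea of per-category counting concrete, here is a minimal, self-contained sketch. It is not cnsenti's implementation: the lexicon `TOY_LEXICON` and the function `toy_emotion_count` are hypothetical names invented for illustration, the word list is a tiny stand-in for the real dictionary, and the input is segmented by hand rather than by a word segmenter.

```python
from collections import Counter

# The seven emotion categories named in the text above.
CATEGORIES = ["好", "乐", "哀", "怒", "惧", "恶", "惊"]

# Hypothetical toy lexicon (word -> category); the real dictionary is far larger.
TOY_LEXICON = {
    "开心": "乐",
    "高兴": "乐",
    "难过": "哀",
    "生气": "怒",
    "害怕": "惧",
}


def toy_emotion_count(tokens, lexicon=TOY_LEXICON):
    """Count how many tokens fall into each emotion category."""
    counts = Counter(lexicon[t] for t in tokens if t in lexicon)
    result = {"words": len(tokens)}
    for category in CATEGORIES:
        result[category] = counts.get(category, 0)
    return result


# Tokens are pre-segmented by hand; a real pipeline would run a word segmenter first.
tokens = ["我", "很", "开心", "，", "但是", "他", "很", "生气"]
print(toy_emotion_count(tokens))
```

Every category appears in the result, including those with a count of zero, so results from different texts can be compared key by key.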
3.2 sentiment_count(text)
sentiment_count(text) belongs to the Sentiment class and counts the positive and negative words in Chinese text. The Hownet dictionary is used by default; the Sentiment class also supports custom positive and negative sentiment dictionaries in txt files, as described later.
Here we count words using the default Hownet dictionary.
Run
Detail
words: the number of words in the Chinese text
sentences: the number of sentences in the Chinese text
pos: the number of positive words in the Chinese text
neg: the number of negative words in the Chinese text
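The counting described in this section can be approximated with a short standalone sketch. This is an illustration under stated assumptions, not the library's code: `POS_WORDS`, `NEG_WORDS`, and `toy_sentiment_count` are hypothetical names, the word lists are not the Hownet dictionary, and matching is done by plain substring search. The `words` field is left out because it needs a Chinese word segmenter.

```python
import re

# Hypothetical toy word lists; these are not the Hownet dictionary.
POS_WORDS = {"开心", "高兴", "喜欢"}
NEG_WORDS = {"难过", "讨厌"}


def toy_sentiment_count(text, pos_words=POS_WORDS, neg_words=NEG_WORDS):
    """Count sentences plus positive and negative lexicon hits in text."""
    # Split on sentence-ending punctuation and drop empty fragments.
    sentences = [s for s in re.split(r"[。！？!?]+", text) if s.strip()]
    # Substring counting: each lexicon word is counted wherever it occurs.
    pos = sum(text.count(word) for word in pos_words)
    neg = sum(text.count(word) for word in neg_words)
    return {"sentences": len(sentences), "pos": pos, "neg": neg}


text = "我好开心啊，非常高兴！今天我很难过。"
print(toy_sentiment_count(text))
```

Substring matching keeps the sketch free of dependencies, but it can double count when one lexicon word is contained in another; matching against segmented tokens avoids that.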
3.3 sentiment_calculate(text)
sentiment_calculate(text) belongs to the Sentiment class and computes the sentiment of Chinese text more precisely. Whereas sentiment_count only counts the positive and negative sentiment words in the text, sentiment_calculate also considers
For example:
Run
3.4 Custom dictionary
Let's first look at a sentence that carries sentiment but contains no emotional adjectives.
Run
As expected, pos=0 even though the sentence is positive: cnsenti's built-in sentiment dictionary (Hownet) contains only adjectives, which limits its applicability in many scenarios.
3.4.1 Custom dictionary format
cnsenti supports importing custom dictionaries, but currently only the Sentiment class supports custom positive and negative sentiment dictionaries. Custom dictionaries need to meet
3.4.2 Sentiment custom dictionary parameters
3.4.3 Custom dictionary use case
I put this part in the test folder. The code and the custom dictionaries are both there, so I use relative paths to point to the custom dictionaries.
正面词自定义.txt: the custom positive dictionary (txt)
Run
With the parameters above, we passed in the custom positive and negative dictionaries and used fusion mode (merge=True), so sentiment is calculated with both cnsenti's built-in dictionary and the newly imported custom dictionaries.
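The effect of fusion mode can be sketched without the library. The code below is only an illustration: `BUILTIN_POS`, `load_words`, and `build_dictionary` are hypothetical names, the file is assumed to be UTF-8 text with one word per line, and treating merge=False as "use the custom words only" is an assumption rather than documented behavior.

```python
import os
import tempfile

# Hypothetical stand-in for a built-in positive word list.
BUILTIN_POS = {"开心", "高兴"}


def load_words(path, encoding="utf-8"):
    """Read one word per line, skipping blank lines (assumed file format)."""
    with open(path, encoding=encoding) as f:
        return {line.strip() for line in f if line.strip()}


def build_dictionary(builtin, custom_path, merge=True):
    """Return builtin plus custom words when merge is True, else custom only."""
    custom = load_words(custom_path)
    return builtin | custom if merge else custom


with tempfile.TemporaryDirectory() as tmp:
    # Write a small custom positive dictionary to a temporary txt file.
    pos_path = os.path.join(tmp, "pos_custom.txt")
    with open(pos_path, "w", encoding="utf-8") as f:
        f.write("引领者\n中流砥柱\n\n")
    merged = build_dictionary(BUILTIN_POS, pos_path, merge=True)
    custom_only = build_dictionary(BUILTIN_POS, pos_path, merge=False)

print(sorted(merged))
print(sorted(custom_only))
```

Using sets makes the merge a simple union, so a word that appears in both the built-in and the custom list is stored once.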
Notes:
The library currently supports only two classes (for example, pos and neg). If your research question is a binary classification problem, such as good vs. bad, beautiful vs. ugly, or friendly vs. hostile, you can define two txt files and assign them to pos and neg. With that setup, you can use cnsenti for your research question.