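Sentiment analysis in natural language processing is fairly involved. Two handy Python libraries make it approachable: TextBlob, which targets English, and SnowNLP, which is modeled on TextBlob and targets Chinese.
Using TextBlob: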
    from textblob import TextBlob

    # Score each line of the review file separately
    with open("review3.txt", "r", encoding="utf-8") as source:
        for line in source:
            blob = TextBlob(line)
            first = blob.sentiment.polarity
            print(first)
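Here polarity is the sentiment score, in the range [-1, 1]: values above 0 indicate positive sentiment and values below 0 indicate negative sentiment. Besides polarity, sentiment also carries subjectivity, a subjectivity coefficient in the range [0, 1].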
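The sign rule above can be sketched as a small helper. The name label_polarity is hypothetical, and treating a score of exactly 0 as "neutral" is an assumption, since the text only covers values above and below 0. The sample scores are made up and stand in for blob.sentiment.polarity values.

```python
def label_polarity(polarity: float) -> str:
    """Map a polarity score in [-1, 1] to a coarse label (hypothetical helper)."""
    if not -1.0 <= polarity <= 1.0:
        raise ValueError("polarity is expected to lie in [-1, 1]")
    if polarity > 0:
        return "positive"
    if polarity < 0:
        return "negative"
    # Assumption: the text does not say how to treat exactly 0, so call it neutral.
    return "neutral"


# Made-up scores standing in for blob.sentiment.polarity values.
for score in (0.8, -0.35, 0.0):
    print(score, label_polarity(score))
```

In practice the helper would be applied to each blob.sentiment.polarity value produced by the loop over the review file.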
Using SnowNLP:
First, how to score a single sentence:
    from snownlp import SnowNLP

    # SnowNLP targets Chinese text; this English string only demonstrates the call
    text = 'very good! amazing!'
    s = SnowNLP(text)
    print(s.sentiments)
Reading the text from a file and plotting the scores:
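    import numpy as np
    import matplotlib.pyplot as plt
    from snownlp import SnowNLP

    # Configure fonts before drawing so the Chinese labels render
    plt.rcParams['font.sans-serif'] = ['SimHei']  # font with Chinese glyphs
    plt.rcParams['axes.unicode_minus'] = False    # render the minus sign correctly

    # Read one text per line, skipping blank lines
    with open('./Data/mumachengshi.csv', 'r', encoding='utf-8') as f:
        lines = [ln.strip() for ln in f if ln.strip()]

    sentimentslist = []
    for line in lines:
        s = SnowNLP(line)
        print(s.sentiments)
        sentimentslist.append(s.sentiments)

    # 100 equal-width bins covering all of [0, 1], with 1.0 included
    plt.hist(sentimentslist, bins=np.linspace(0, 1, 101), facecolor='b')
    plt.xlabel('情绪指数')   # sentiment score
    plt.ylabel('分词数量')   # count
    plt.title('情感分析图')  # sentiment analysis plot
    plt.show()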
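The same scores can also be drawn as a line chart centered on zero:
    from snownlp import SnowNLP
    import matplotlib.pyplot as plt
    import numpy as np

    # Compute the sentiment score of each non-blank line
    with open("review1.txt", "r", encoding="utf-8") as source:
        lines = [ln.strip() for ln in source if ln.strip()]

    sentimentslist = []
    for line in lines:
        s = SnowNLP(line)
        print(s.sentiments)
        sentimentslist.append(s.sentiments)

    # Shift the scores from [0, 1] to [-0.5, 0.5] so that 0 marks the midpoint
    results = [score - 0.5 for score in sentimentslist]

    # Plot; the x positions follow the number of scores instead of a hard-coded 47
    plt.rcParams['font.sans-serif'] = ['SimHei']  # font with Chinese glyphs
    plt.rcParams['axes.unicode_minus'] = False    # render the minus sign correctly
    plt.plot(np.arange(len(results)), results, 'k-')
    plt.xlabel('分词数量')   # index of the text line
    plt.ylabel('情绪指数')   # sentiment score (centered)
    plt.title('情感分析图')  # sentiment analysis plot
    plt.show()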
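Note that SnowNLP returns its sentiment score in the range [0, 1] rather than [-1, 1]. The score is the estimated probability that the text is positive, so values near 1 are positive and values near 0 are negative.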
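To put a SnowNLP score on the same axis as a TextBlob polarity, one option is a linear rescaling from [0, 1] to [-1, 1]. This is only a sketch: the helper name snownlp_to_polarity is hypothetical, and the linear mapping is an assumption that aligns the ranges without making the two libraries' scores equivalent.

```python
def snownlp_to_polarity(sentiment: float) -> float:
    """Linearly rescale a [0, 1] score to [-1, 1] (hypothetical helper)."""
    if not 0.0 <= sentiment <= 1.0:
        raise ValueError("sentiment is expected to lie in [0, 1]")
    # Assumption: 0.5 is treated as the neutral midpoint and mapped to 0.
    return 2.0 * sentiment - 1.0


# Made-up scores standing in for SnowNLP(...).sentiments values.
for score in (0.0, 0.5, 0.75, 1.0):
    print(score, snownlp_to_polarity(score))
```

After rescaling, the same "above 0 is positive, below 0 is negative" reading applies to both libraries' outputs.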
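Both libraries rely on resources bundled with the package. TextBlob's default analyzer scores text against a built-in English lexicon, while SnowNLP uses a Naive Bayes classifier trained on a bundled Chinese corpus, mostly shopping reviews. TextBlob therefore handles only English and SnowNLP only Chinese; feeding either one the other language gives very poor results.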
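Since each library suits only one language, a script that receives mixed input may want to route each text to the matching library first. The sketch below is one simple way to do that using only the standard library. The function names cjk_ratio and pick_library, the 0.5 threshold, and the use of the basic CJK Unified Ideographs block as a stand-in for "Chinese" are all assumptions made for illustration.

```python
def cjk_ratio(text: str) -> float:
    """Fraction of alphabetic characters in the basic CJK Unified Ideographs block."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    cjk = sum(1 for ch in letters if "\u4e00" <= ch <= "\u9fff")
    return cjk / len(letters)


def pick_library(text: str, threshold: float = 0.5) -> str:
    """Return the name of the library to use (threshold is an assumed cut-off)."""
    return "snownlp" if cjk_ratio(text) >= threshold else "textblob"


for sample in ("very good! amazing!", "这部电影很好看"):
    print(sample, "->", pick_library(sample))
```

The returned name can then decide whether the text goes to TextBlob(text).sentiment.polarity or to SnowNLP(text).sentiments.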