[Kaggle] Company Rating Prediction #03 (Sentiment Analysis)

01. Sentiment Analysis

Compute sentiment scores with the VADER sentiment analyzer

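import nltk
from nltk.sentiment.vader import SentimentIntensityAnalyzer

# Download the VADER lexicon from NLTK (only needed once per environment)
nltk.download('vader_lexicon')

# VADER sentiment analyzer
sid = SentimentIntensityAnalyzer()

# Compute sentiment scores on a copy so the original df stays untouched
df_sentiment = df.copy()
# polarity_scores returns a dict with 'neg', 'neu', 'pos', and 'compound' keys
df_sentiment['sentiment_scores'] = df_sentiment['description'].astype(str).apply(sid.polarity_scores)
# Keep only the compound score, rounded to two decimals
df_sentiment['compound'] = df_sentiment['sentiment_scores'].apply(lambda score_dict: round(score_dict['compound'], 2))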
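
To see why the compound score stays between -1 and 1, here is a minimal sketch of the normalization step VADER applies to the summed word valences. It is a simplification: the function name `normalize_compound` is my own, `alpha = 15` is assumed to be the library default, and the real analyzer also adjusts for punctuation, capitalization, and negation before this step.

```python
import math


def normalize_compound(raw_score: float, alpha: float = 15.0) -> float:
    """Squash an unbounded valence sum into the open range (-1, 1).

    Assumption: this mirrors the normalization used inside VADER,
    with alpha=15 taken as the default smoothing constant.
    """
    return raw_score / math.sqrt(raw_score * raw_score + alpha)


# The larger the summed valence, the closer the result gets to +1 or -1.
for raw in (-8.0, -1.0, 0.0, 1.0, 8.0):
    print(raw, round(normalize_compound(raw), 2))
```

If this holds, a long description with many mildly positive words can accumulate a large raw sum, which would push its compound score toward 1.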

Visualize the distribution of the sentiment score (compound)
- Most scores are concentrated near 1

import matplotlib.pyplot as plt
import seaborn as sns
# Plot the distribution of compound scores
plt.figure(figsize=(8, 6))
sns.histplot(df_sentiment['compound'], bins=20, kde=True)
plt.show()

02. Visualization

Visualize the distribution by sentiment (positive/neutral/negative)
- Positive reviews account for about 90% of the data

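# Label each review by its compound score (threshold of ±0.1)
df_sentiment['sentiment'] = df_sentiment['compound'].apply(lambda c: 'positive' if c >= 0.1 else ('negative' if c <= -0.1 else 'neutral'))
# Fix the category order so the bar colors (green/blue/red) line up with positive/neutral/negative
sentiment_counts = df_sentiment['sentiment'].value_counts().reindex(['positive', 'neutral', 'negative'], fill_value=0)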

plt.figure(figsize=(8, 6))
sentiment_counts.plot(kind='bar', color=['green', 'blue', 'red'], alpha=0.5)
plt.title('Sentiment Analysis')
plt.xticks(rotation=0)
plt.show()

print(sentiment_counts)
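
The labeling rule and the share of each label can also be checked without a DataFrame. The sketch below is only an illustration: `label_sentiment`, `sentiment_shares`, and the sample scores are hypothetical names and values, and the ±0.1 threshold is the one used in this post.

```python
from collections import Counter

LABELS = ("positive", "neutral", "negative")


def label_sentiment(compound: float, threshold: float = 0.1) -> str:
    """Map a compound score to a label using a symmetric threshold."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


def sentiment_shares(compounds, threshold: float = 0.1) -> dict:
    """Return the fraction of scores that fall under each label."""
    labels = [label_sentiment(c, threshold) for c in compounds]
    if not labels:
        return {name: 0.0 for name in LABELS}
    counts = Counter(labels)
    return {name: counts.get(name, 0) / len(labels) for name in LABELS}


# Made-up scores for illustration only, not taken from the dataset.
sample_compounds = [0.95, 0.80, 0.10, 0.05, -0.02, -0.10, -0.60, 0.99, 0.70, 0.30]
shares = sentiment_shares(sample_compounds)
print(shares)
```

On the real data, the same shares should be available from `df_sentiment['sentiment'].value_counts(normalize=True)`.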

Visualize ratings against sentiment (positive/neutral/negative)
- ~~It is hard to find a correlation...~~

# Draw a scatter plot of rating vs. compound score
plt.figure(figsize=(8, 6))
plt.scatter(df_sentiment['rating'], df_sentiment['compound'], c=df_sentiment['compound'], cmap='plasma', alpha=0.5)

plt.colorbar(label='Compound')  # add a colorbar
plt.xlabel('Rating')
plt.ylabel('Compound')
plt.title('Review Ratings by Compound')
plt.show()
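
A scatter plot alone makes it hard to judge how weak the relationship is, so one option is to put a number on it with a Pearson correlation coefficient. The sketch below is an assumption about how that could be done, not something the original notebook does; `pearson_r` and the sample values are hypothetical.

```python
import math


def pearson_r(xs, ys) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Made-up ratings and compound scores for illustration only.
sample_ratings = [3.1, 3.5, 3.8, 4.0, 4.4]
sample_compounds = [0.98, 0.95, 0.99, 0.96, 0.97]
r = pearson_r(sample_ratings, sample_compounds)
print(round(r, 3))
```

With pandas, `df_sentiment['rating'].corr(df_sentiment['compound'])` should give the same statistic for the full dataset. A value near 0 would support the impression from the plot that the two are barely related.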

📌 Reference.

https://www.kaggle.com/code/yaelman/company-review-rating-factors


🗂️ Dataset.
https://www.kaggle.com/datasets/vaghefi/company-reviews


License: Attribution-NonCommercial-NoDerivs
