Project: Building a Korean Word Cloud

1. Install the Korean NLP library

# !pip install KoNLPy

from konlpy.tag import Twitter  # newer KoNLPy releases rename this tagger to Okt
from collections import Counter

2. Load the data

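# Replace the placeholder with the path to your text file.
# UTF-8 is assumed here; change the encoding if your file uses another one.
with open('path/to/text_file', 'r', encoding='utf-8') as file:
    lists = file.readlines()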

lists
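`readlines()` keeps the trailing newline on every line and returns blank lines unchanged, so a small cleanup pass before tagging can help. The following is only a minimal sketch, assuming UTF-8 text; `load_sentences` is a hypothetical helper name and the sample sentences are made up for the demo.

```python
import os
import tempfile


def load_sentences(path, encoding="utf-8"):
    """Read a text file and return its non-empty lines without surrounding whitespace."""
    with open(path, "r", encoding=encoding) as f:
        return [line.strip() for line in f if line.strip()]


# Demo with a throwaway file (hypothetical sample text).
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", encoding="utf-8", delete=False
) as tmp:
    tmp.write("첫 번째 문장입니다.\n\n두 번째 문장입니다.\n")
    demo_path = tmp.name

demo_sentences = load_sentences(demo_path)
os.remove(demo_path)
print(demo_sentences)
```

The returned list can be used in place of `lists` in the tagging step.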

3. Morphological analysis

twitter = Twitter()

# Tag every line; each result is a list of (word, tag) pairs.
morphs = []
for sentence in lists:
    morphs.append(twitter.pos(sentence))
print(morphs)

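# Keep nouns only, skipping any word that contains one of these characters
# (substring match, so a longer noun containing one of them is dropped too).
excluded_chars = ['것', '내', '나', '수', '게', '말']

noun_adj_adv_list = []  # despite the name, only nouns are collected
for sentence in morphs:
    for word, tag in sentence:
        if tag == 'Noun' and not any(ch in word for ch in excluded_chars):
            noun_adj_adv_list.append(word)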

print(noun_adj_adv_list)

count = Counter(noun_adj_adv_list)

words = dict(count.most_common())
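To see what the filter-and-count step produces without installing KoNLPy, here is a small sketch on hand-written data. The `(word, tag)` pairs are hypothetical and only mimic the shape of `pos()` output; `top_n` is an assumed cutoff that the original notebook does not use.

```python
from collections import Counter

# Hypothetical (word, tag) pairs shaped like the output of `pos()`;
# hand-written for illustration, not produced by KoNLPy.
sample_morphs = [
    [("데이터", "Noun"), ("를", "Josa"), ("분석", "Noun"), ("하다", "Verb")],
    [("데이터", "Noun"), ("가", "Josa"), ("좋다", "Adjective")],
    [("분석", "Noun"), ("할", "Verb"), ("것", "Noun"), ("데이터", "Noun")],
]

# Characters whose presence excludes a noun (substring match).
excluded_chars = ["것", "내", "나", "수", "게", "말"]

nouns = [
    word
    for sentence in sample_morphs
    for word, tag in sentence
    if tag == "Noun" and not any(ch in word for ch in excluded_chars)
]

top_n = 2  # hypothetical cutoff: keep only the most frequent nouns
sample_words = dict(Counter(nouns).most_common(top_n))
print(sample_words)
```

Passing a number to `most_common()` is one way to keep the cloud readable when the text contains many rare nouns.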

4. Build the word cloud

Install the word cloud library

# !pip install WordCloud

from wordcloud import WordCloud
import matplotlib.pyplot as plt

%matplotlib inline 

import matplotlib 
from matplotlib import rc
rc('font', family='NanumBarunGothic')


wordcloud = WordCloud(
    font_path = '/Library/Fonts/NanumBarunGothic.ttf',    # on macOS, the Korean font path must be set correctly
    background_color='white',                             # background color
    colormap = 'Accent_r',                                # colormap used for the words
    width = 800,
    height = 800
)
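The font path above is specific to macOS. On other systems, one option is to probe a few likely locations and use the first one that exists. This is only a sketch: `find_font` is a hypothetical helper, and the Linux and Windows paths are assumptions that may not match your installation.

```python
import os


def find_font(candidates):
    """Return the first path in `candidates` that is an existing file, or None."""
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None


# Candidate locations are assumptions; adjust them to your system.
candidate_fonts = [
    "/Library/Fonts/NanumBarunGothic.ttf",                   # macOS (path used above)
    "/usr/share/fonts/truetype/nanum/NanumBarunGothic.ttf",  # common Linux location (assumption)
    "C:/Windows/Fonts/malgun.ttf",                           # Windows, Malgun Gothic (assumption)
]

font_path = find_font(candidate_fonts)
if font_path is None:
    print("No Korean font found; install one or pass its path explicitly.")
else:
    print("Using font:", font_path)
```

The resulting `font_path` can then be passed to `WordCloud(font_path=...)` instead of a hard-coded string.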

wordcloud_words = wordcloud.generate_from_frequencies(words)

array = wordcloud.to_array()
print(type(array)) # numpy.ndarray
print(array.shape) # (800, 800, 3)

fig = plt.figure(figsize=(10, 10))
plt.imshow(array, interpolation="bilinear")
plt.axis('off')
fig.savefig('business_analytics_wordcloud.png')  # save before show() so the figure is still open
plt.show()

<class 'numpy.ndarray'>
(800, 800, 3)

