
matplotlib: word cloud

by yj-data 2025. 6. 6.

 

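  • Word cloud: a technique that visualizes the key words of a document so that its keywords and concepts can be grasped at a glance
  • from wordcloud import WordCloud: the class that generates the word cloud
  • from PIL import Image: used to load the image that gives the word cloud its shape
  • WordCloud().generate(text): generates a word cloud from the given text
    • text conversion: wordcloud expects a single str, so a df column is first converted to a list and then to a str
    • mask: sets where words may be drawn; pure white entries are treated as masked out
    • cmap = plt.matplotlib.colors.LinearSegmentedColormap.from_list("", ['color1', 'color2', ...]): a custom colormap, passed to WordCloud through the colormap argument
  • plt.imshow(): displays an array as an image
  • plt.axis('off'): hides the axes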
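
The text conversion step in the list above can be tried without a DataFrame. In this small sketch, `rows` is a hypothetical stand-in for `list(df['column_name'])`, and the `" ".join(...)` line is an alternative added here for comparison, not something the original run used. It keeps the list's brackets and quote characters out of the string passed to `generate()`.

```python
# Hypothetical rows standing in for list(df['column_name'])
rows = ["salt route", "fig tree", "salt river"]

# Approach described in this post: the list's repr becomes the text
as_repr = str(rows)

# Alternative (an assumption, not from the original run): plain joined text
as_joined = " ".join(rows)

print(as_repr)    # ['salt route', 'fig tree', 'salt river']
print(as_joined)  # salt route fig tree salt river
```

Both results are a single str, which is the type `generate()` is given in the example below.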

 

example: step by step

The example text comes from a page found by searching Google for "long english text sample".

Link: https://catdir.loc.gov/catdir/enhancements/fy0711/2006051179-s.html

 


 

For the actual run I used the full text; it is shortened here to keep the post from getting too long.
sample_text = '''
Chapter One
A Stop on the Salt Route
1000 B.C.
As they rounded a bend in the path that ran beside the river, Lara recognized the silhouette of a fig tree atop a nearby hill. The weather was hot and the days were long. The fig tree was in full leaf, but not yet bearing fruit.
...id not begrudge others the use of the rafts, and the island was large enough to share. Nonetheless, the situation required caution. He cupped his hands to his mouth and gave a shout. It was not long before a man appeared on the bank of the island. The man waved.
...
“Fascinus . . . ,” He did not finish the thought, but she understood. She had never seen Fascinus, but he had told her about it. Many times in the past, Fascinus had given guidance to her father. Now, once again, Fascinus had made its will known.
The darkness did not deter her. She knew every twist and turn of every path on the little island. When she came to the metal trader’s camp, she found Tarketios lying in a leafy nook secluded from the others; she recognized him by his brawny silhouette. He was awake and waiting, just as she had been lying awake, waiting, when her father came to her.
At her approach, Tarketios rose onto his elbows. He spoke her name in a whisper. There was a quiver of something like desperation in his voice; his neediness made her smile. She sighed and lowered herself beside him. By the faint moonlight, she saw that he wore an amulet of some sort, suspended from a strap of leather around his neck. Nestled amid the hair on his chest, the bit of shapeless metal seemed to capture and concentrate the faint moonlight, casting back a radiance brighter than the moon itself.
His arms—the arms she had so admired earlier—reached out and closed around her in a surprisingly gentle embrace. His body was as warm and naked as her own, but much bigger and much harder. She wondered if Fascinus was with them in the darkness, for she seemed to feel the beating of wings between their legs as she was entered by the thing that gave origin to life.
'''
type(sample_text)  # str

import matplotlib.pyplot as plt
import numpy as np
from wordcloud import WordCloud
from PIL import Image

plt.figure(figsize=(15,5))
text = sample_text  # when pulling the text from a df, use text = str(list(df['column_name']))
mask = np.array(Image.open('/content/book.png'))

cmap = plt.matplotlib.colors.LinearSegmentedColormap.from_list('',['#8cb2ff','#208fff'])

wordcloud = WordCloud(background_color='white', width=1400, height=1400, max_words=200, mask=mask, colormap = cmap).generate(text)
#with the mask, width and height are ignored
#max_words: the number of words included in the final wordcloud, default is 200

plt.suptitle('The WordCloud', fontweight='bold', fontfamily='serif', fontsize=15)

plt.imshow(wordcloud)
plt.show()

book.png
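
If no image file is at hand, a mask can also be built directly as a NumPy array. This is a minimal sketch, assuming a 300x300 canvas and a circular shape (both values are made up for illustration). Entries set to 255 (white) are masked out, and entries set to 0 are free to draw on.

```python
import numpy as np

# Hypothetical canvas size and circle radius, chosen only for illustration
size, radius = 300, 130
center = size // 2

# Open grids of row and column indices covering the canvas
yy, xx = np.ogrid[:size, :size]

# True for every pixel that lies outside the circle
outside = (xx - center) ** 2 + (yy - center) ** 2 > radius ** 2

# White (255) outside the circle = masked out, 0 inside = drawable
mask = (255 * outside).astype(np.uint8)

print(mask.shape, mask[center, center], mask[0, 0])
```

The resulting array can be passed as mask=mask in place of the array loaded from the image file.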

 

 

delete axis

import matplotlib.pyplot as plt
import numpy as np
from wordcloud import WordCloud
from PIL import Image

plt.figure(figsize=(15,5))
text = sample_text  # when pulling the text from a df, use text = str(list(df['column_name']))
mask = np.array(Image.open('/content/book.png'))  # file uploaded temporarily for the runtime session

cmap = plt.matplotlib.colors.LinearSegmentedColormap.from_list('',['#8cb2ff','#208fff'])

wordcloud = WordCloud(background_color='white', width=1400, height=1400, max_words=200, mask=mask, colormap = cmap).generate(text)
#with the mask, width and height are ignored
#max_words: the number of words included in the final wordcloud, default is 200

plt.suptitle('The WordCloud', fontweight='bold', fontfamily='serif', fontsize=15)

plt.imshow(wordcloud)
plt.axis('off')
plt.show()
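
The custom colormap used above can be inspected on its own to see which colors it produces. A small sketch with the same two hex colors follows; the names `start`, `middle` and `end` are introduced here only for illustration.

```python
from matplotlib.colors import LinearSegmentedColormap, to_rgba

# Same two-color gradient as in the word cloud example
cmap = LinearSegmentedColormap.from_list('', ['#8cb2ff', '#208fff'])

# Calling a colormap with a value in [0, 1] returns an RGBA tuple
start = cmap(0.0)   # the first color of the list
middle = cmap(0.5)  # a blend of the two colors
end = cmap(1.0)     # the last color of the list

print(start)
print(middle)
print(end)
```

As far as I understand, WordCloud draws each word's color from this colormap, so the words come out in shades between the two blues.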