Written in Python, this small program grabs the subtitles from a YouTube video, then generates a word cloud showing how frequently each word is used in that video. It ignores stop words such as ‘a,’ ‘an,’ ‘the,’ ‘this’ and ‘that.’ The stop-word list comes from NLTK.
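The counting step can be sketched roughly as follows. This is a minimal illustration, not the program's actual code: it assumes the subtitle text has already been fetched as a plain string, the function name `word_frequencies` is hypothetical, and the small hard-coded stop-word set only stands in for NLTK's list (available as `nltk.corpus.stopwords.words('english')`).

```python
import re
from collections import Counter

# Stand-in stop-word set (assumption); the program itself takes its list from NLTK.
STOP_WORDS = {"a", "an", "the", "this", "that"}


def word_frequencies(transcript, stop_words=STOP_WORDS):
    """Count how often each non-stop word appears in the transcript text."""
    # Lowercase the text and treat runs of letters/apostrophes as words.
    words = re.findall(r"[a-z']+", transcript.lower())
    return Counter(w for w in words if w not in stop_words)


# Made-up sample text, not the real subtitles of any video.
sample = "Remember this: Loretta loved the sea. Remember that Loretta laughed."
freqs = word_frequencies(sample)
print(freqs.most_common(2))
```

If the `wordcloud` package is used for rendering (an assumption, since the library is not named here), a frequency mapping like `freqs` can be passed to `WordCloud().generate_from_frequencies(freqs)` to draw the cloud.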
I wanted to analyze the words used in Super Bowl commercials to see which ones appear most often.
For example, this word cloud was generated from the video Loretta Google Super Bowl Commercial 2020.
‘Remember,’ ‘Loretta,’ and ‘Google’ are frequent words in this video.