Sentiment Analysis of News Outlets

The pandemic brought on a wave of negativity that shook the world, with horrifying stories arriving from all over the globe. This continuous wave of negative news was not just a direct result of the difficult times we were facing, but also a fabrication by content strategists who took advantage of the circumstances.

I myself was locked in a habitual cycle of constantly checking multiple news apps on my phone for incoming news about COVID-19. It got out of hand when I realized that as soon as I felt hope and optimism, the news headlines would boost their negative content and pull me back into a fearful state. Once in such a state, one falls back into the cycle of continuous news consumption.

Aside from the obvious factors that distinguish news outlets, such as their ability to dissuade or reinforce political attitudes, or their economic motives for driving more consumption, there is also a process by which they can help control behavior for many other purposes.

Political Spectrum is a data visualization that uses a web scraper to extract content from popular media outlets such as CNN, Fox News, and the NY Times, among many others. The content is then fed into a machine learning algorithm, which analyzes the text for sentiment and generates a score. The sentiment score ranges from -1 to +1, with -1 being most negative and +1 being most positive. Each score is then mapped to a color swatch, which is extracted from a color spectrum between red (-1) and blue (+1), with grey (0) as neutral.

The use of primary colors such as red and blue helps condense the many attributes into a simple, easily digestible view. Each headline has a specific swatch, and viewing the swatches in sequence on a grid paints a picture of the overall sentiment direction and patterns of each news outlet.

Technical Details

A web scraper scans the different websites multiple times a day (most news websites update their content a few times daily). The scraper pulls the headings and paragraphs from different pages (homepage, politics page, business page, etc.), and a sentiment score is generated for each heading or string of text.
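
The extraction step could look roughly like the sketch below. It is not the project's actual scraper: it assumes headlines live in h1 to h3 tags and body text in p tags, uses only Python's standard-library HTMLParser, and runs on a hard-coded sample page instead of fetching a live site. The names HeadlineParser, extract_text, and SAMPLE are hypothetical.

```python
from html.parser import HTMLParser


class HeadlineParser(HTMLParser):
    """Collects the visible text of heading and paragraph tags."""

    # Assumption: headlines are in h1-h3 and body text is in p tags.
    TAGS = {"h1", "h2", "h3", "p"}

    def __init__(self):
        super().__init__()
        self._current = None  # tag currently being captured, if any
        self._buffer = []     # text fragments seen inside that tag
        self.items = []       # collected (tag, text) pairs

    def handle_starttag(self, tag, attrs):
        # Start capturing only when we are not already inside a target tag.
        if tag in self.TAGS and self._current is None:
            self._current = tag
            self._buffer = []

    def handle_data(self, data):
        if self._current is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self._current:
            # Collapse runs of whitespace left over from the markup.
            text = " ".join("".join(self._buffer).split())
            if text:
                self.items.append((tag, text))
            self._current = None


def extract_text(html):
    """Return a list of (tag, text) pairs for headings and paragraphs."""
    parser = HeadlineParser()
    parser.feed(html)
    parser.close()
    return parser.items


# Hypothetical sample page standing in for a fetched news homepage.
SAMPLE = """<html><body>
<h1>Markets rally as <a href="#">vaccine</a> news spreads</h1>
<p>Investors   welcomed the announcement.</p>
<div>ignored</div>
</body></html>"""

if __name__ == "__main__":
    for tag, text in extract_text(SAMPLE):
        print(tag, "|", text)
```

Each string returned here would then be passed to the sentiment model. Fetching the pages on a schedule and handling each outlet's own page layout are left out of this sketch.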

The sentiment score is produced by a machine learning algorithm that has been trained to determine the sentiment of text. Based on the string, a number in the range -1 to +1 is calculated, with -1 being most negative and +1 being most positive. Each score is then mapped to one of 10 color swatches, with red being most negative (-1), blue being most positive (+1), and grey being neutral (0).
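
One way to sketch the score-to-swatch mapping is shown below. The exact colors and binning used in the project are not stated, so this assumes pure red, mid grey, and pure blue as anchor colors, linear interpolation between them, and 10 equal-width bins over the range -1 to +1. All function names are hypothetical.

```python
# Assumed anchor colors as RGB triples; the project's real palette may differ.
RED = (255, 0, 0)
GREY = (128, 128, 128)
BLUE = (0, 0, 255)


def lerp(a, b, t):
    """Linearly interpolate between two RGB colors, t in [0, 1]."""
    return tuple(round(x + (y - x) * t) for x, y in zip(a, b))


def score_to_rgb(score):
    """Map a sentiment score in [-1, 1] to a color on the red-grey-blue scale."""
    s = max(-1.0, min(1.0, score))  # clamp out-of-range scores
    if s < 0:
        return lerp(GREY, RED, -s)
    return lerp(GREY, BLUE, s)


def swatch_index(score, n=10):
    """Return which of n equal-width bins the score falls into (0 = most negative)."""
    s = max(-1.0, min(1.0, score))
    return min(int((s + 1) / 2 * n), n - 1)


def build_palette(n=10):
    """Build n swatches, each colored by the score at the center of its bin."""
    return [score_to_rgb(-1 + (2 * i + 1) / n) for i in range(n)]


def to_hex(rgb):
    """Format an RGB triple as a CSS-style hex string."""
    return "#{:02x}{:02x}{:02x}".format(*rgb)


if __name__ == "__main__":
    palette = build_palette()
    for score in (-0.85, -0.1, 0.4, 0.95):
        idx = swatch_index(score)
        print(score, idx, to_hex(palette[idx]))
```

Note that with an even number of swatches and equal-width bins, no single swatch sits exactly at 0, so a score of 0 lands in the first slightly blue bin. An odd swatch count, or a dedicated neutral bin, would give grey its own swatch.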