The saddest day of them all, based on Twitter posts

On May 31, the most commonly used words on Twitter included "terrorist", "violence" and "racist". PHOTO: NYTIMES

NEW YORK (NYTIMES) - Which was the saddest day of them all? This is the question you may be asking yourself, surveying the wreckage of 2020 thus far.

There are so many contenders: Was it March 12, the day after Tom Hanks announced he was sick and the NBA announced it was cancelled? Was it June 1, the day peaceful protesters were tear-gassed so that US President Donald Trump could comfortably stroll to his Bible-wielding photo op?

Actually, it was neither, according to the Computational Story Lab of the University of Vermont.

Instead, its answer is May 31.

That was not only the saddest day of 2020 so far, but it was also the saddest day recorded by the lab in the last 13 years. Or at least, the saddest day on Twitter.

The researchers call their measuring instrument the Hedonometer. It is the invention of lab co-directors Chris Danforth and Peter Dodds, both mathematicians and computer scientists.

The Hedonometer has been up and running for more than a decade now, measuring word choices across millions of tweets, every day, the world over, to come up with a moving measure of well-being.

In fact, the last time The New York Times checked in with the Hedonometer team in 2015, the main finding was the tendency towards relentless positivity on social media.

"One of the happiest years on Twitter, at least for English," Professor Danforth said. "Since then it has been a long decline."

What has remained constant is this: "Happiness is hard to know. It's hard to measure," he said.

The lab is part of a small but growing field of researchers who try to parse the nation's mental health through the prism of online life.

After all, never before have we had such an incredible stockpile of real-time data to choose from. And never has that stockpile towered as high as it does now: In the first months of the pandemic, Twitter reported a 34 per cent increase in daily average user growth. Without normal social life as antidote and anchor, social media now feels more like real life than ever before.

Since 2008, the Hedonometer has gathered a random 10 per cent of all public tweets, every day, across a dozen languages. The tool then looks for words that have been ranked for their happy or sad connotation, counts them and calculates a kind of national happiness average based on which words are dominating the discourse.
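The counting-and-averaging step described above can be sketched in a few lines of Python. This is only an illustration, not the lab's actual code: the word ratings below are invented for the example, the 1 (sad) to 9 (happy) scale is an assumption, and the names WORD_SCORES, tokenize and average_happiness are hypothetical.

```python
import re
from collections import Counter

# Hypothetical happiness ratings on an assumed 1 (sad) to 9 (happy) scale.
# These numbers are made up for illustration; they are not the lab's ratings.
WORD_SCORES = {
    "love": 8.4,
    "happy": 8.3,
    "laughter": 8.5,
    "racist": 2.0,
    "violence": 1.9,
    "terrorist": 1.3,
}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def average_happiness(tweets, scores=WORD_SCORES):
    """Frequency-weighted mean rating of the rated words found in the tweets.

    Words without a rating are ignored. Returns None if no rated word appears.
    """
    counts = Counter()
    for tweet in tweets:
        counts.update(word for word in tokenize(tweet) if word in scores)
    total = sum(counts.values())
    if total == 0:
        return None
    return sum(scores[word] * n for word, n in counts.items()) / total
```

Because each word's rating is weighted by how often the word appears, a day on which low-rated words such as "violence" dominate pulls the average down, which is the effect the article describes for May 31.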

On May 31, the most commonly used words on English-language Twitter included "terrorist", "violence" and "racist". This was about a week after black American George Floyd was killed, near the start of the protests that would last all summer.

Since the beginning of the pandemic, the Hedonometer's sadness readings have set multiple records.

This year, "there was a full month of days that the Hedonometer was reading sadder than the Boston Marathon day", Prof Danforth said. "Our collective attention is very ephemeral. So it was really remarkable that the instrument, for the first time, showed this sustained, depressed mood, and it got even worse when the protests started."

Professor James Pennebaker, founder of online language analysis and a social psychologist at the University of Texas at Austin, became interested in what the choice of words reveals about our moods and characters - exactly at the moment when the Internet was first supplying such an enormous stockpile of text to draw from and consider.

"These digital traces are markers that we're not aware of, but they leave marks that tell us the degree to which you are avoiding things, the degree to which you are connected to people," said Prof Pennebaker, author of The Secret Life Of Pronouns, among others. "They are telling us how you are paying attention to the world."

Professor Munmun De Choudhury, from the School of Interactive Computing at Georgia Tech, is also examining digital data for insights into well-being. In 2013, she and her colleagues found that by looking at new mothers on social media, they could help predict which ones might develop postpartum depression, based on their posts before the babies' births.

One of the most telling signs?

The use of first-person singular pronouns, like "I" and "me". "If I'm constantly talking about 'me', it means my attention has inward focus," she said. "In the context of other markers, it can be a correlate of mental illness."

This finding first emerged in the work of Prof Pennebaker.
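As a rough illustration of the pronoun signal, the share of first-person singular pronouns in a piece of text can be computed as below. This is a minimal sketch and not the method used in either researcher's studies: the article names only "I" and "me", so the longer pronoun list, the handling of contractions and the function name first_person_rate are all assumptions.

```python
import re

# Assumed pronoun list; the article itself mentions only "I" and "me".
FIRST_PERSON_SINGULAR = {"i", "me", "my", "mine", "myself"}


def first_person_rate(text):
    """Return the fraction of words that are first-person singular pronouns."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    # Count contractions such as "I'm" by looking at the part before the apostrophe.
    hits = sum(1 for word in words if word.split("'")[0] in FIRST_PERSON_SINGULAR)
    return hits / len(words)
```

A rate like this means little on its own; as Prof De Choudhury notes, it is informative only in the context of other markers.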

You may be wondering if Twitter is really a representative place to check the state of the population's mental health. After all, many users tend to refer to it by nicknames such as "hellsite" and "sewer". Some studies have shown that frequent social media use is correlated with depression and anxiety.

Can we really discern national happiness based on this particular digital environment and the fraction of the population - one in five last year - that regularly uses Twitter? Dr Angela Xiao Wu, an assistant professor of media, culture and communication at New York University, thinks we cannot.

She argues that in the rush to embrace data, many researchers ignore the distorting effects of the platforms themselves. We know Twitter's algorithms are designed to keep us hooked on our timelines, emotionally invested in the content and coaxed towards remaining in a certain mental state.

"If social scientists then take your resulting state, after all these interventions that these platforms have worked on you, and derive from that a national mood? There's a huge part of platform incitement that's embedded in the data, but is not being identified," she said.

Indeed, Dr Johannes Eichstaedt, a computational social scientist at Stanford and a founder of the World Well-Being Project, concedes that methods like the ones his own lab uses are far from perfect. "I would say it's about a C+," he said. "It's not that accurate, but it's better than nothing."