Chinese social-media platform WeChat saw spikes in the terms 'SARS,' 'coronavirus,' and 'shortness of breath,' weeks before the first cases were confirmed, a study suggests


A man wears a mask while walking in the street in Wuhan, Hubei province, China.

  • A new paper from Chinese scientists found that posts on WeChat, China's main social-media site, used keywords related to the new coronavirus weeks before the Chinese government confirmed cases.
  • The study, which has not yet been peer reviewed, raises new questions about the timeline of the coronavirus outbreak.
  • The researchers suggest that social-media keyword tracking could be a useful tool in efforts to detect emerging infectious diseases.

A research paper released this week describes a surprising trend on the Chinese social-media platform WeChat: Usage of keywords related to the new coronavirus spiked more than two weeks before officials confirmed the first cases.

The authors, five infectious-disease researchers in China, analyzed the prevalence of the terms "SARS," its Chinese equivalent "Feidian," "coronavirus," "shortness of breath," "dyspnea," and "diarrhea" in posts and searches on WeChat from November 17 to December 31. Their findings suggest "abnormal spikes and increases" in the usage of all the keywords during that time.

If confirmed, the findings might indicate that the coronavirus started circulating weeks before the first cases were officially diagnosed and reported.

The researchers also suggest that social media could be used as a tool for early detection of new infectious diseases.


An inconsistent timeline of coronavirus cases

The first cases of the novel coronavirus were reported in Wuhan, China, at the end of December. The virus has infected more than 85,000 people and killed nearly 3,000, mostly in China. Cases have been reported in at least 56 other countries.

The coronavirus is zoonotic, meaning that it jumped from animals to humans. Researchers think the virus originated in bats, and many have suggested that it spilled over to people in a market in Wuhan.

But the precise timeline is still unclear. One paper published by Chinese researchers in The Lancet analyzed the first 41 clinical cases and found that the first patient came down with flu-like symptoms on December 1.

An exhibition center converted into a hospital in Wuhan on February 5, 2020.

The new analysis relied on an open-source index of posts from WeChat, and also used an index of searches from China's primary search engine, Baidu. The paper is still undergoing peer review, however.


The data showed that usage of "shortness of breath" and "dyspnea" both peaked on December 22. "Diarrhea" peaked on December 18 (gastrointestinal issues are an early sign of the coronavirus in a minority of patients).

"The index for SARS behaved abnormally in the first three days in December with a peak on December 1, 2019," the report says.

Usage of "Feidian," meanwhile, began to rise on December 15 and stayed at relatively high levels through December 29. Usage "rose rapidly on December 29, 2019 with a peak on December 30, 2019," the researchers wrote.

December 30 was the day that Dr. Li Wenliang, an ophthalmologist at a Wuhan hospital, shared a note with fellow medical-school alumni about a SARS-like illness that had stricken several patients. Police later made Li sign a letter acknowledging that he was "making false comments." He contracted the coronavirus and died about a month later.

The World Health Organization and Chinese Center for Disease Control and Prevention first confirmed cases of the new coronavirus in Wuhan on December 31.


The Chinese population is especially attuned to illnesses that resemble SARS, the researchers wrote.

"Of these keywords, Feidian is especially worthy of attention," the report says. "In 2003, the SARS outbreak caused mass panic among people in China and approximately half of the victims were health care workers. Since then, Chinese physicians are alert to SARS and similar diseases."

The researchers added that if Chinese doctors were to see a cluster of patients with symptoms of viral pneumonia, it would be natural for them to think of SARS and mention Feidian on WeChat.

However, they emphasized that correlation is not causation.

"Whether there was an inner link between the word activity in WeChat and early patients is unknown," they wrote.


A new type of medical surveillance

Despite lingering questions about the observed WeChat trends, the researchers suggest their new study demonstrates a way in which social-media keyword usage could be a medical-surveillance tool.

"Future studies can prospectively gather and analyze data from WeChat to early detect SARS-CoV-2-like outbreaks as well as outbreaks of other diseases in China," they wrote. (SARS-CoV-2 is the official name of the virus.) "Tracing the source of keywords that behave abnormally in frequencies in WeChat, following rapid response, may become a promising approach to control a disease outbreak in its very early stage."
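To make the idea of keywords that "behave abnormally in frequencies" concrete, here is a minimal Python sketch of one common way to flag a spike: compare each day's count with the average of the preceding week. This is not the researchers' method, which the paper excerpts above do not spell out. The function name `find_spikes`, the seven-day window, the threshold of three standard deviations, and the daily counts are all hypothetical and chosen only for illustration.

```python
from statistics import mean, stdev


def find_spikes(counts, window=7, threshold=3.0):
    """Return indices of days whose count is far above the trailing baseline.

    counts    -- list of daily keyword counts (hypothetical data)
    window    -- number of preceding days used as the baseline (assumed)
    threshold -- standard deviations above the baseline mean that
                 count as a spike (assumed)
    """
    spikes = []
    for i in range(window, len(counts)):
        baseline = counts[i - window:i]
        mu = mean(baseline)
        sigma = stdev(baseline)
        if sigma == 0:
            # Perfectly flat baseline: treat any increase as a spike.
            if counts[i] > mu:
                spikes.append(i)
            continue
        if (counts[i] - mu) / sigma > threshold:
            spikes.append(i)
    return spikes


# Made-up daily counts for one keyword; day 9 jumps well above the rest.
daily_counts = [10, 12, 11, 9, 10, 11, 10, 12, 11, 40, 12, 10]
print(find_spikes(daily_counts))  # → [9]
```

A real surveillance tool would also have to account for weekly cycles, news coverage, and overall platform growth, any of which can produce a spike unrelated to illness.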

Currently, the Chinese government relies on a countrywide medical-surveillance system created after the SARS outbreak to catch emerging diseases.

When a cluster of patients with the same pneumonia-like symptoms came to Wuhan hospitals in December, medics entered their locations, demographic information, and infection statuses into that database. When the system finds a higher-than-normal rate of illness in a particular region, that tells government analysts and officials to take a closer look and perhaps order additional tests.



Many countries, however, have less robust medical-surveillance systems, Raina MacIntyre, head of the biosecurity-research program at Sydney's Kirby Institute, told Business Insider.

"Some countries have fewer resources and more rudimentary systems, which can result in delays and missed outbreaks," she said. "One way to overcome this is to use rapid epidemic intelligence from open-source data such as social media and news feeds."