A new academic study shows that Facebook is so plugged into its user base that the social media giant can accurately predict 21 or so user diseases or conditions, based on their Facebook usage.

The study, conducted by researchers at Stony Brook University and Penn Medicine and covering 999 Facebook users, says the social media network can predict common maladies like anxiety, depression, diabetes, and hypertension.

The researchers analyzed about 20 million words, clusters, and phrases from Facebook users and, according to the study, matched them against 21 standard categories of medical record diagnoses.

Researchers say the data is as useful as common health checkups in determining the likelihood of disease.

“Our predictions from language captures diagnosis of diabetes about as well as predictions based on one’s body mass index,” says H. Andrew Schwartz, Ph.D., the primary author of the study and assistant professor of computer science in the College of Engineering and Applied Sciences at Stony Brook University. “(For example), we can treat language pattern analogous to a genome and see similar diseases seem to have similar linguistic patterns.”

Schwartz notes that many studies have demonstrated a bond between language patterns and specific diseases, such as language predictive of depression or language that gives insights into whether someone is living with cancer. “However, by looking across many medical conditions, we get a view of how conditions relate to each other, which can enable new applications for AI for medicine,” he says.

For some conditions, particularly mental health issues and diseases like diabetes, Facebook posts do a better job of predicting disease than demographic data.

“Our study is also important because most of the data tracked in medicine does not capture our everyday ecological and psychological factors that relate to health,” Schwartz says. “With that, Facebook posts have potential as a tool for monitoring many common and widespread diseases.”

Facebook — not a great track record of privacy

While data can reveal much about a social media user’s personal life, gauging accuracy isn’t so simple, experts say.

“Whenever there is an accumulation of personal information, it’s always shocking what level of secondary inferences can be deduced about people,” says Ray Walsh, a privacy analyst at ProPrivacy, an online resource for helping everyone reclaim their digital privacy and stay safe online. “Seemingly trivial information has been proven to allow artificial intelligence algorithms to figure out extremely invasive facts about people.”

Walsh views data found on social media sites as a “good news-bad news” scenario.

“On the one hand, the idea that data amassed by social media platforms such as Facebook could be used to help in predicting illnesses can definitely be seen as an advantage,” he says. “After all, it is hard to criticize any technological breakthrough that stands to help people to be healthier – or that could allow them to catch an illness earlier and begin receiving treatment.”

“However, the idea that Facebook is already able to find out sensitive medical information about users is concerning, particularly due to the fact that Facebook has a terrible track record of protecting consumer privacy,” he adds.

How Facebook uses and monetizes user data comes into play as well.

“While using information about social media users to help them to be healthier is a commendable concept, the concern is that Facebook will also use that data to extract profit and this could lead to complications,” Walsh notes. “Medical information must always be considered sensitive and is normally highly protected.”

“The idea that Facebook could circulate medical information about users – that perhaps even they are not aware of – to third parties such as advertisers is extremely troubling,” he says.

As it stands, there are no regulations in place to restrict Facebook from analyzing the data on its platform to make these kinds of medical inferences, Walsh says.

“This creates a moral gray area because it could allow Facebook to make a profit by selling medical information to third parties such as life insurance or medical insurance companies,” he notes.

Slowing our roll on Facebook and health care predictions

Some medical experts warn consumers not to get too hyped up over health care predictions drawn from social media activity.

“There must be clarity about usage of the word ‘prediction,’ which means different things to different people and different types of scientists,” says Eric Feigl-Ding, an epidemiologist and faculty member at the Harvard School of Public Health. “Prediction of having a disease (i.e., discerning current status) is very different from prediction of likelihood of getting a future disease (i.e., to predict the future). The former is much easier than actually predicting the future risk of a specific disease.”

Feigl-Ding notes that even when identifying a current diagnosis, a test could be 99% accurate in terms of sensitivity and specificity and still have a low positive predictive value (PPV), the share of positive results that are true positives, because PPV also depends on how common the disease is. “The PPV is actually what people want to know, but the PPV is often low in everyday settings even if other data shows high accuracy levels.”

Underdiagnosis, or missing people who actually have a disease, is another risk when using social media sites to predict specific diseases.

“You can get everything right, including a high PPV value but you miss a lot if your criteria is too stringent,” he says. “Also, the PPV can be artificially high if one pre-loads a high prevalence of diseased individuals in the dataset instead of analyzing general populations in which the disease is rare.”
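To see why a test can be 99% accurate and still produce mostly false alarms, it helps to run the arithmetic. The short Python sketch below applies Bayes' rule to compute PPV from sensitivity, specificity, and prevalence. The prevalence figures (1 in 1,000 for a rare disease, 50% for a dataset pre-loaded with diseased individuals) and the function name are illustrative assumptions, not numbers or methods taken from the study.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Share of positive results that are true positives (Bayes' rule)."""
    true_positives = sensitivity * prevalence
    false_positives = (1.0 - specificity) * (1.0 - prevalence)
    return true_positives / (true_positives + false_positives)


# Hypothetical inputs for illustration only; none of these come from the study.
# A test that is 99% sensitive and 99% specific, applied to two populations:
rare = positive_predictive_value(0.99, 0.99, 0.001)    # disease in 1 of 1,000 people
enriched = positive_predictive_value(0.99, 0.99, 0.5)  # half the dataset has the disease

print(f"PPV at 0.1% prevalence: {rare:.1%}")      # about 9.0%
print(f"PPV at 50% prevalence:  {enriched:.1%}")  # 99.0%
```

Under these assumed numbers, roughly nine out of ten positive results in the general population would be false alarms, while the same test looks nearly perfect on the pre-loaded dataset. That gap is the artificially high PPV Feigl-Ding describes.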

Thus, Facebook likely faces a trade-off between a poor PPV and high levels of underdiagnosis, and may end up with both. “Not everyone reports symptoms, and those that do, those symptoms could be significant, but the data could miss a lot, too,” says Feigl-Ding.

Going beyond health metrics

While trying to gauge just how accurate Facebook may be in predicting user diseases and maladies, it’s easy to forget that social media usage does generate a ton of data that may be useful in areas outside of health and wellness.

“Facebook can predict far more than just health issues,” says Dr. Tim Lynch, president of Psychsoftpc, a Boston-based gaming computer manufacturer. “It can also determine political persuasion, sexual orientation, liberal vs conservative political views, religious perspective, socioeconomic class, education level and much more.”

Still, while people are putting a positive spin on Facebook’s prediction of health issues and personality, data based on public posts isn’t covered by HIPAA, and may be disseminated to whoever has the ability to pay, Lynch says. “We cannot and should not assume that this information will be used for the public good given the track record of social media companies,” he notes.

A chance to do good

For a company that is battling image problems, Facebook has an opportunity to turn that negative public profile around by properly handling users’ health data.

“I think there’s a nice chance for Facebook to pivot or expand on this area, and one could argue there’s a certain social responsibility on their part,” says Jamie Cambell, founder of GoBestVPN, a digital privacy company. “They’ve been destroying privacy for years, perhaps they could make it up by saving lives.”