From Tom Insel’s perspective, the future of psychiatry had never looked so bright. It was May 2015, and Insel, then the director of the National Institute of Mental Health (NIMH), the world’s largest and best-funded research institute focused on mental disorders, had traveled to Portland, Oregon, to speak with the parents of young children grappling with serious mental disorders. He had good news: Researchers were making rapid progress in uncovering the biological basis of serious mental disorders. NIMH researchers were studying high-resolution brain scans of people with depression and had found abnormal neural branching in stem cells from children with schizophrenia. They now also understood how stress results in genetic changes in mice. But as soon as he opened the floor for questions after his presentation, he found that not everyone was impressed.
The first person to grab the microphone was a tall, bearded man in a flannel shirt. Insel had noticed him growing increasingly agitated during his presentation. “He said, ‘Man, you just don’t get it,’” Insel recalls. “‘I have a 23-year-old son with schizophrenia. He’s been hospitalized five times, incarcerated three times, which led to suicide attempts, and he’s currently homeless. Our house is on fire, and you’re talking about the chemistry of the paint.’”
The father’s remark left Insel speechless. His initial reaction was to defend NIMH’s work by offering the man some platitudes about how scientific revolutions take time and basic research was required for better treatment. But deep down, he agreed with him. Despite the remarkable progress made by mental-health researchers at the NIMH and around the world, during his 13 years at the helm of the institute, deaths from suicide had increased by 33 percent, deaths from addiction had tripled, and the number of homeless and incarcerated people with serious mental disorders had doubled.
“That was a wake-up call for me,” Insel says. “It’s not a knock on the NIMH because it’s not their job to keep people with serious mental illness out of the criminal justice system, and it was obvious that what we were doing may help in the long run. But we’re in a mental-health crisis, and it became my calling to figure out what it would take to put out that fire.”
Four months after that fateful presentation, Insel resigned from his position as director of the NIMH and turned his attention to using AI and other digital technologies to help address the mental-health crisis. “It was really exciting to look at how we could use AI to bend the curve for people with schizophrenia, bipolar illness, kids who are suicidal, and really try to have an impact at the public-health level,” Insel says.
In the years since, Insel has co-founded three AI-driven mental-health start-ups and advised several others. But his focus has remained on finding ways to correlate digital behaviors with mental disorders, a technique known as digital phenotyping. Digital phenotyping draws on the reality that both AI and mental-health professionals are fundamentally in the business of pattern recognition. Psychiatrists, therapists, and social workers study the behaviors of their patients for evidence that their mental health is improving or deteriorating so they can adapt their diagnoses and treatments. But the data available to the human professionals who make these judgments is confined to patients’ notoriously unreliable self-reports and a limited amount of behavioral observation. Digital phenotyping, by contrast, can passively analyze digital behavioral data from consenting patients around the clock to identify patterns that would otherwise escape the notice of a human therapist.
The promise that digital phenotyping could bring data-driven “objectivity” to the diagnosis and treatment of mental disorders was a seductive one for Insel and many of his peers. While traditional mental-health research struggles to bridge the gap between the lab and the clinic, digital phenotyping could—in theory—be immediately applied in the real world, where the vast majority of people already carry sophisticated computers in their pockets. The data produced by these devices could, for instance, help flag the onset of depressive or suicidal episodes, which would allow mental-health professionals to make real-time interventions when needed. It could also help therapists, psychiatrists, and other mental-health professionals monitor the progress of their patients. If a mental-health provider notices that a patient has stopped sleeping or socializing, for example, they can work with the patient to correct behaviors or adjust treatments.
Digital phenotyping may hold the key to improving outcomes for people living with a broad range of mental disorders. Now Insel and other researchers are trying to develop the tools that can deliver on this promise.
*
Over the past few decades, mental-health researchers have increasingly focused on uncovering the biological basis of the cognitive, emotional, and behavioral dysregulation characteristic of mental disorders. This fixation on biology stems largely from the fact that psychiatry remains the only medical field that has yet to uncover a single unambiguous biomarker for any of the nearly 300 disorders listed in the Diagnostic and Statistical Manual of Mental Disorders (DSM), the official taxonomy of mental disorders. In the absence of any biomarkers, it’s hard for researchers to know what, exactly, they are looking for.
The hundreds of disorders in the DSM are defined by clusters of symptoms and a threshold for how many of those symptoms must be present for a patient to be diagnosed. Although this system helps mental-health professionals standardize their approach to diagnosis and treatment, decades’ worth of data shows that symptom-based diagnosis is unreliable because of its dependence on subjective clinician judgment.
“I like to say that developing a new antidepressant with the diagnostic system we have today would be like developing a new antibiotic for someone with a fever when you don’t know if it’s caused by a bacterial or viral illness,” Insel says. “It’s really critical that we get better precision around diagnostic categories if we want to do better interventions. Otherwise, we’re just treating the fever, and we’ll never really know what’s in front of us.”
If there are 227 possible symptom combinations for a diagnosis of depression—which pales in comparison to the roughly 60,000 possible symptom combinations for post-traumatic stress disorder—how can researchers looking at the brains of depressed patients be sure that these patients have the same disorder? Although the DSM is still widely used, Insel announced in 2013 that the NIMH would no longer be funding research based purely on DSM diagnostic criteria. Insel is adamant that subjectivity—particularly the lived experience of patients—has a critical role to play in treatment, but it’s clear that the field of mental health has an acute objectivity problem, in part because researchers and clinicians lack data, which makes it difficult to better patient outcomes.
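The figure of 227 can be reproduced with a little counting. The sketch below assumes the DSM-5 rule for major depressive disorder, which the article does not spell out: at least five of nine listed symptoms must be present, and at least one of them must be one of the two core symptoms (depressed mood or loss of interest). The function and constant names are illustrative only.

```python
from math import comb

# Assumed DSM-5 rule for major depressive disorder (not stated in the article):
# at least 5 of 9 symptoms, including at least 1 of the 2 core symptoms.
TOTAL_SYMPTOMS = 9
CORE_SYMPTOMS = 2
MIN_REQUIRED = 5


def qualifying_combinations(total=TOTAL_SYMPTOMS, core=CORE_SYMPTOMS,
                            minimum=MIN_REQUIRED):
    """Count symptom sets that meet the threshold and include a core symptom."""
    # Every way to have `minimum` or more of the `total` symptoms.
    any_set = sum(comb(total, k) for k in range(minimum, total + 1))
    # Sets drawn only from the non-core symptoms do not qualify.
    non_core = total - core
    without_core = sum(comb(non_core, k) for k in range(minimum, non_core + 1))
    return any_set - without_core


print(qualifying_combinations())  # 227
```

Two patients can therefore both meet the threshold while sharing only one symptom, which is the heterogeneity problem the question above points to.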
It’s a challenge that John Torous knows all too well. As the director of the digital psychiatry division at Harvard Medical School’s Beth Israel Deaconess Medical Center, Torous splits his time between clinical practice and academic research largely focused on the application of digital tools in mental health. During Torous’s psychiatry residency at Harvard, he witnessed firsthand the struggles people with serious mental disorders face and the shortcomings of conventional approaches to diagnosis and treatment. His background in computer science led him to seek digital solutions, leveraging the capabilities of the internet, mobile phones, AI, and other technologies to improve patient outcomes. In 2015, Torous and three of his colleagues published a paper in which they first defined digital phenotyping as “the moment-by-moment quantification of the individual-level human phenotype in-situ using data from smartphones and other personal digital devices.”
By the mid-1960s, less than a decade after the term artificial intelligence was coined, a handful of psychiatrists were already experimenting with using AI systems to simulate mental disorders for research purposes. But despite decades of research on AI in psychiatry, all of these pioneering experiments fell short of bettering outcomes for patients. They were all missing one crucial ingredient that is the grist of modern AI systems: data.
The proliferation of internet connectivity and mobile devices over the past two decades changed everything. “I think the potential of a new tool and new data sources to understand human behavior is of paramount importance to the field,” Torous says. “It’s not that we have more objective data, but we have new complementary sources of data to better understand our patients’ behavior.”
Torous and his colleagues recognized that it might be possible to draw from this digital exhaust—the way a patient types on their computer or scrolls on their smartphone, the biometric data collected by their wearables, and so on—for valuable insights into people’s mental health.
Theoretically, that information could allow mental-health professionals to identify the onset of a patient crisis and stage an early intervention, help patients better understand their own mental-health status, and assist researchers in developing a more refined picture of mental disorders.
Digital phenotyping involves a broad range of digital technologies and data types, but the basic idea behind all digital-phenotyping systems—regardless of the technologies or data types used—is the same. Certain patient behaviors are associated with certain mental disorders, and many of these behaviors can be measured by how we interact with digital devices. For example, AI can analyze the volume and duration of a patient’s call records—disregarding the content of those calls—or their geolocation data as proxies for their social isolation. If the data shows that the patient has stopped leaving their home and answering calls, it may indicate the onset of a depressive episode. The digital phenotyping system can then flag these behaviors for a mental-health professional, who can contact the patient to provide support.
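The basic logic described above can be sketched in a few lines of code. This is a minimal illustration, not any research group’s or company’s actual method: the `DailySummary` fields, the `isolation_flag` function, the window lengths, and the 50 percent drop threshold are all hypothetical and have no clinical validation. It compares a patient’s most recent week of call metadata and time away from home against their own earlier baseline.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class DailySummary:
    """One day of passively collected metadata (no call content)."""
    call_count: int               # calls placed or answered
    call_minutes: float           # total call duration
    hours_away_from_home: float   # derived from geolocation


def isolation_flag(days, baseline_days=14, recent_days=7, drop_ratio=0.5):
    """Return True when recent calling and mobility both fall far below baseline.

    drop_ratio is an arbitrary illustrative threshold, not a clinical one.
    """
    if len(days) < baseline_days + recent_days:
        return False  # not enough history to compare against
    baseline = days[-(baseline_days + recent_days):-recent_days]
    recent = days[-recent_days:]

    def dropped(field):
        base = mean(getattr(d, field) for d in baseline)
        now = mean(getattr(d, field) for d in recent)
        return base > 0 and now < drop_ratio * base

    calls_down = dropped("call_count") and dropped("call_minutes")
    mobility_down = dropped("hours_away_from_home")
    return calls_down and mobility_down


typical = [DailySummary(6, 45.0, 8.0)] * 14
withdrawn = [DailySummary(1, 3.0, 0.5)] * 7
print(isolation_flag(typical + withdrawn))    # True
print(isolation_flag(typical + typical[:7]))  # False
```

A flag here is only a prompt for a clinician to check in. Comparing each patient against their own baseline, rather than a population norm, is one design choice among many, and real systems would need to handle missing data and far noisier signals.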
Aside from the types and volume of data, one of the key attributes that differentiates digital phenotyping from other digital approaches to monitoring patient behavior is the way the data is collected. In contrast to, say, an ecological momentary assessment—essentially a patient survey that can be periodically delivered by phone or computer—digital phenotyping is passive and doesn’t rely on patients’ self-reports. With permission, it can monitor digital behavior 24/7 without intruding into the patient’s life. It collects behavioral data as patients go about their day in their normal environment, which should theoretically provide better behavioral insights than patient surveys or data collected in an artificial clinical environment.
Over the past decade, a growing body of evidence has suggested that digital-phenotyping systems are capable of reliably identifying some disorders based on digital behaviors, which is the critical first step toward improving patient outcomes. In 2017, for instance, a group of researchers from Harvard, MIT, and other Boston-area universities ran a digital-phenotyping study with 73 participants that predicted symptoms of depression and PTSD using a variety of digital behavioral data, including the way a phone was handled, messaging frequency, GPS location, and vocal cues. In 2023, Torous and a team of international collaborators published a study showing that it was possible to predict relapse in people with schizophrenia using a mix of passive data sources such as geolocation and screen state along with active data such as surveys. Despite these promising results, some questions remain about the efficacy of digital phenotyping.
One active area of research, for example, concerns which types of data correlate best with symptoms of a given disorder. Consider a study published in 2017 in the journal IEEE Transactions on Biomedical Engineering by researchers from the University of Oxford that showed that “it is possible to detect depressive episodes in individuals with bipolar disorder with 85 percent accuracy using geographic location recordings alone.” Could the accuracy be increased by incorporating other types of data such as vocal cues and call-record data? If a single data type is sufficient for detecting depressive episodes in bipolar patients, will it apply to other disorders, or are different data types more relevant to some disorders than to others?
The answers to these questions have real consequences for the technology and its users. More targeted data collection could help further protect the privacy of the patients using these systems. Already it’s clear that not all data types are created equal. Both Torous and Insel pointed to sleep data as an example of a data stream that has proved beneficial for helping patients with a broad range of disorders. Sleep behavior is well characterized, and research has established strong connections between sleep dysregulation and mental disorders. But what about something more experimental? Can researchers tell if an individual is depressed based on changes in the way they scroll on their phone or type on their computer?
Answering that question is remarkably challenging. In several recent meta-analyses of digital-phenotyping studies on patients with psychotic disorders, researchers found that the accuracy and the effectiveness of these systems vary widely depending on the types of machine learning methods used. The lack of standardized research protocols makes it difficult to generalize results across studies. Moreover, most digital-phenotyping studies use relatively small patient populations, last only a few months, and are plagued with methodological shortcomings—including providing patients with a study-specific smartphone, which could skew the data in ways that using a participant’s own phone would not, or failing to collect basic patient information like age. Because some studies show that AI can indeed detect mental disorders based on digital behavior while others show that it cannot, larger standardized trials still need to be conducted.
If digital phenotyping one day proves to work, the most important question is whether it will make a difference in the mental-health crisis. Properly identifying and tracking mental disorders is a massive challenge—but it’s not the only one. Many people with these disorders don’t have access to mental-health professionals or these forms of care at all. For those who do, there are doubts about the technique itself. “What these methods are trying to do is take someone’s qualitative lived experience and reduce them to a number that can tell you if they’re about to have some mental-health episode,” says Gabrielle Samuel, a lecturer in the department of global health and social medicine at King’s College London. “But there are so many different possible reasons [for a person’s behavior], and it’s making assumptions about the way people live. My concern is that automation moves you further away from the person in front of you, and that distance is what’s problematic.”
Samuel is also skeptical that digital-phenotyping technologies will help the patients most in need. For example, mental disorders are far more prevalent in incarcerated and homeless populations, many of whom don’t have access to smartphones or may reject digital-phenotyping systems because they are concerned about being constantly monitored. “We’re throwing a huge amount of money into these technologies as though they’re going to solve mental health issues,” Samuel says. “But, actually, it’s not going to be a solution to problems with mental health because so much of mental health is socially, economically, and politically determined.”
Insel is the first to admit that the technology has a long road ahead of it before it can really deliver on its promise. “We’re in Act One of a five-act play,” he says. “The first act has shown us that there’s real potential here, but we’re still not yet fully realizing it.”
Torous and Insel acknowledge these shortcomings of digital phenotyping and emphasize that they don’t see the tool as a silver bullet for the mental-health crisis. But while researchers like Torous are intent on refining these technologies in the lab and conducting foundational research, digital-phenotyping start-up founders like Insel are racing to get these technologies to patients in need. The sense of urgency is understandable given what these tools, once fully developed, could do: Just-in-time interventions for patients in crisis, treatments that adapt to the lived realities of patients, and consensual monitoring to supplement patients’ self-reports are all within reach.
As with so many issues in mental health, both approaches have upsides and downsides in service of the same goal: solving society’s mental-health crisis and delivering relief to millions of people living with mental disorders. “A lot of people say that it’s not a perfect system, and I get that, but for me the question is ‘Compared to what?’” Insel says. “Patient outcomes for the past two decades have gotten worse despite more treatment and more money being spent. So if we continue to ignore data on how people think, feel, and live, I don’t think patient outcomes will get any better. You can’t improve the quality of care until you start to measure it.”
Daniel Oberhaus is a science writer and the founder of HAUS Biographics, a marketing and communications agency for deep tech organizations. He is the author of The Silicon Shrink, a forthcoming book from MIT Press about the past, present, and future of AI in psychiatry, and was previously a staff writer at Wired magazine.