The Human Genome Project, the groundbreaking international scientific undertaking that deciphered the 3 billion DNA base pairs that make up the human genome, was officially completed in 2003. The project had begun 13 years earlier and promised to provide valuable insights into human biology, disease, and evolution, but enterprising companies focused on another area where the findings could be potentially useful (and profitable): diet.
At the time, American culture was awash in talk of dieting: In 2001, former Surgeon General David Satcher declared obesity an epidemic in the United States, sparking a flurry of news, documentaries, and television shows focused on fitness and nutrition, from “The Biggest Loser” and “You Are What You Eat” to “Super Size Me” and MTV’s “Fat Camp.” Not all of these media pieces have aged well in the two decades since, but their existence speaks to society’s enduring preoccupation with how we should nourish our bodies.
When companies like Nutrigenomics, DNA Fit and Habit started offering expensive nutrition plans based on genetic tests and biomarkers, it was just one example of how new scientific technologies and knowledge tend to be pitched as personal health solutions. Digital watches, for example, soon began to double as heart rate monitors, and smartphones could track steps, sleep and menstrual cycles.
Now the question is whether language-based artificial intelligence models like the popular ChatGPT could serve as a tool for creating professional nutrition plans more cheaply and quickly than a visit to a nutritionist.
Last year, researchers published a paper in the Journal of Nutrition and Metabolism comparing ChatGPT’s responses to common nutrition questions with those of human nutritionists.
“Dietitians were asked to provide their most frequently asked nutrition questions and their own answers to those questions. We then asked ChatGPT the same questions and sent both sets of answers to other dietitians, nutritionists, and experts in the field of each question, who scored them based on scientific accuracy, actionability, and understandability,” the study authors wrote. “We also averaged the scores to produce an overall score and compared group means of responses to each question using permutation tests.”
Surprisingly, ChatGPT’s responses often outperformed the nutritionists’ responses across a range of criteria.
“ChatGPT’s overall rating exceeded the nutritionists’ overall rating in five of the eight questions we received,” the researchers continued. “ChatGPT received higher ratings for scientific accuracy five times, for feasibility four times, and for understandability five times. In contrast, the nutritionists’ responses did not receive a higher average score than ChatGPT in any question, both in the overall rating and in each of the rating components.”
These findings were echoed in a recent paper in “Frontiers in Nutrition.” The study aimed to evaluate the feasibility of using AI-generated personalized weight loss diet plans in clinical practice, through a questionnaire-based evaluation by experts in obesity medicine and clinical nutrition. As in the earlier study, the researchers used ChatGPT to generate the plans, which the experts then evaluated for effectiveness, balance, comprehensiveness, flexibility, and applicability.
Results from 67 participants showed no significant differences between the plans, and AI-generated plans were often indistinguishable from human-created plans. Although some experts were able to tell the AI plans apart, scores for the AI-generated personalized plans were generally positive.
“Distinguishing AI-generated output from human text, especially that produced by ChatGPT, is a significant challenge,” the study authors wrote. “Our study confirmed this observation, as only 5 out of 67 experts were able to correctly identify and select AI-generated diet plans. These experts highlighted features such as the comprehensiveness of the diet plans and their inclusion of atypical recommendations.”
The researchers continued: “Furthermore, we found an interesting finding that 24 experts who initially reported being unable to identify the AI-created plan correctly selected the AI-created plan. Their reasoning revolved around non-tangible features, such as the absence of a brand name and the perceived unrealistic preparation of the meals. Thus, despite the complexity of the task of identifying an AI-created diet plan, some experts were able to pinpoint it due to factors that are usually not directly related to the quality of the diet plan.”
AI-generated diet plans show a lot of promise, but beyond concerns about a lack of specificity and unrealistic preparation suggestions, they have some clear shortcomings that must be addressed before the plans can be considered safe and effective.
For example, when evaluating the plans created by ChatGPT, the researchers noticed that tomatoes were frequently recommended. Tomatoes are an important part of Spanish cuisine and were specified in the prompt as desired by subjects, but they may conflict with dietary restrictions for conditions such as gastroesophageal reflux disease (GERD) and chronic kidney disease (CKD). Similarly, the plans ChatGPT created often emphasized protein intake for weight loss, even though excess protein can have adverse effects in CKD patients.
This highlights the challenge AI faces in balancing diverse considerations for patients with multiple, potentially conflicting chronic health issues. ChatGPT also seemed to struggle to provide specific portion sizes, macro- and micronutrient breakdowns, and serving suggestions (although, as nutritionist Eliza Savage astutely pointed out, “it’s not very good at math or science. It’s a language model, after all.”).
But while the researchers remain optimistic, they suggest more expertise is needed before any such plans can be proposed or implemented.
“Current AI models like ChatGPT lack the ability to fact-check their outputs,” the researchers wrote, “and therefore it remains the responsibility of human experts to verify these outputs.”
