Understanding Obesity Through the Gut Microbiome
A Machine Learning Project
What problem would you dedicate decades of your life to solving?
Personally, that’s a hard question to answer. It’s easier for me to simply work on interesting things. Having done research on the gut microbiome in the past, and having gone through a simple yet enriching Artificial Intelligence journey, I found out that I could intersect these two areas to build something “interesting”.
You’ve probably heard of the human microbiome in the past. It’s the community of trillions of microorganisms living in our mouths, on our skin, in the vagina, and yes, in our gut!
What are they useful for? What do they do? Well, apparently the answer is “more than we can imagine”. Scientists have found correlations between the diversity of these microbes and conditions like insomnia, depression, obesity, cancer, HIV, Alzheimer’s disease, and more.
These discoveries have been so compelling that startups like Viome have already created products and services to analyze your gut microbiome and then recommend personalized probiotics to improve your health. Further, some scientists think that the gut microbiome could be as important and as complex as our genome.
Going back to the correlations, there was one that especially caught my attention: our gut microbiome can influence our food cravings through the Gut-Brain Microbiome Axis (GBMA). To say that in English, these microbes in our gut are controlling what we want to eat by sending signals to our brain.
After reading this, the first question that came to my mind (and surely to the minds of many researchers around the world too) was whether this mechanism could somehow be used to treat obesity.
The idea that I then wanted to work on was an app that would analyze the gut microbiome of people with a high BMI, detect which bacteria are out of balance, and finally recommend a specific diet to improve the composition of their gut microbiome.
Such a solution would promise to help users feed the bacteria that make them healthy, and stop feeding the bacteria that contribute negatively.
At the time of writing this article, I have only created a simple AI model to classify healthy and obese microbiomes. My programming skills will definitely need to improve before I can build the whole idea described above, but I will consider this a start.
Microbes controlling your brain
This makes sense: each species is fighting for survival, so it will tell our brains to feed it the food it wants. The larger the population of a particular type of bacteria, the more power it will have to manipulate us through stronger signals.
In this sense, lower diversity should be associated with more unhealthy eating behavior and greater obesity. I see this as “balance will always be good”. If you have only a few types of bacteria trying to tell you what to do, chances are that you won’t have a healthy diet.
Apparently, this doesn’t only apply to what kinds of foods you crave, but also to how much you want to eat. In one study, germ-free mice were shown to have lower levels of leptin, cholecystokinin, and other satiety peptides. Researchers have also found that auto-antibodies are another means for microbes to control appetite.
Since diversity matters so much, researchers have also studied how it varies with many different factors, such as BMI, white blood cell count, how often we exercise, and the consumption of sugary drinks, foods like rice and potatoes, or vitamin A.
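To make "diversity" a bit more concrete, here is a minimal sketch of one common way to put a number on it, the Shannon index. The article does not say which diversity measure the studies used, so this choice is an assumption, and the read counts below are made up for illustration.

```python
import math


def shannon_diversity(counts):
    """Shannon index H = -sum(p_i * ln(p_i)) over groups with non-zero counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    proportions = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in proportions)


# Hypothetical read counts per bacterial group (made-up numbers).
balanced = [25, 25, 25, 25]   # four groups, evenly spread
dominated = [97, 1, 1, 1]     # one group crowds out the rest

print("balanced: ", round(shannon_diversity(balanced), 3))
print("dominated:", round(shannon_diversity(dominated), 3))
```

A gut where one group dominates scores lower than a gut where the same number of groups are evenly spread, which matches the "few types of bacteria telling you what to do" picture described earlier.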
AI helping you control those microbes
To create the aforementioned app, my initial reasoning was literally this one:
- Give AI a random microbiome so it tells if it’s healthy or not (model 1)
- It will create a list of the bacteria that are misaligned
- It will tell in which way these bacteria are misaligned (too much or too little)
- Based on that list, we may try reinforcement learning, giving it the healthy microbiome baseline as its goal, so it learns how to get there using what it knows about bacteria’s food preferences
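The second and third steps of this list (which bacteria are off, and in which direction) could be sketched as a simple comparison against a healthy baseline. This is only an illustration under stated assumptions: the function name `find_imbalances`, the `tolerance` threshold, and all abundance values are hypothetical, and the planned app would derive the list from a trained model rather than from a fixed rule.

```python
def find_imbalances(sample, baseline, tolerance=0.5):
    """Flag bacteria whose abundance deviates too far from a baseline.

    sample and baseline map a name to a relative abundance.
    tolerance is the allowed relative deviation (hypothetical default).
    Returns {name: "too much" | "too little"} for flagged bacteria only.
    """
    flagged = {}
    for name, expected in baseline.items():
        observed = sample.get(name, 0.0)  # missing bacteria count as zero
        if observed > expected * (1 + tolerance):
            flagged[name] = "too much"
        elif observed < expected * (1 - tolerance):
            flagged[name] = "too little"
    return flagged


# Made-up relative abundances, for illustration only.
baseline = {"Bacteroides": 0.30, "Faecalibacterium": 0.40, "Akkermansia": 0.05}
sample = {"Bacteroides": 0.10, "Faecalibacterium": 0.38, "Akkermansia": 0.12}

for name, direction in find_imbalances(sample, baseline).items():
    print(f"{name}: {direction}")
```

The output of a step like this is what a later recommendation stage would take as input.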
As of now, I’ve only completed the first phase, which looks something like this:
- Getting datasets of obese microbiomes and labeling them as such
- Getting datasets of healthy microbiomes and labeling them as such
- Joining and shuffling those two
- AI will train on the specific numbers for individual bacteria and will learn to tell if that concentration is healthy or obese
- AI will also train on whole data sets of obese and healthy microbiomes to learn about them as a whole system
So that’s what I did.
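The labeling, joining, and shuffling steps above can be sketched in a few lines. This assumes pandas, which the article does not name, and the column names and abundance values are placeholders rather than the real datasets.

```python
import pandas as pd

# Made-up relative abundances; one row per microbiome sample.
obese = pd.DataFrame({"Bacteroides": [0.10, 0.12], "Akkermansia": [0.01, 0.02]})
healthy = pd.DataFrame({"Bacteroides": [0.30, 0.28], "Akkermansia": [0.05, 0.06]})

# Label each dataset (1 = obese, 0 = healthy).
obese["label"] = 1
healthy["label"] = 0

# Join the two datasets, then shuffle the rows so the classes are mixed.
combined = pd.concat([obese, healthy], ignore_index=True)
shuffled = combined.sample(frac=1, random_state=42).reset_index(drop=True)

print(shuffled)
```

Fixing `random_state` makes the shuffle repeatable, which helps when comparing models on the same data.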
The data
This was, without a doubt, the longest step. At first, I was trying to get datasets from research papers. After looking at many different websites, I found this one. I love it: it’s simple to use and has a lot of different data.
Next, I had to clean that data. That included getting rid of columns that I wouldn’t use (like run IDs) and rows of strains that weren’t useful either. Then I also had to “reshape” the dataset to make clear which were my dependent and independent variables. Finally, I converted a lot of the text values to numbers. As far as I know, I had to do that for the models to accept the data.
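As a rough illustration of those cleaning steps, here is a minimal pandas sketch. The column names (`run_id`, `sample`, `taxon`, `abundance`, `phenotype`), the "unclassified" filter, and every value are assumptions made up for the example; the real dataset's layout is not shown in the article.

```python
import pandas as pd

# Made-up "long" table: one row per (sample, bacterium) measurement.
raw = pd.DataFrame({
    "run_id": ["r1", "r1", "r1", "r2", "r2", "r2"],
    "sample": ["s1", "s1", "s1", "s2", "s2", "s2"],
    "taxon": ["Bacteroides", "Akkermansia", "unclassified"] * 2,
    "abundance": [0.30, 0.05, 0.02, 0.10, 0.01, 0.03],
    "phenotype": ["healthy"] * 3 + ["obese"] * 3,
})

# 1. Drop columns that will not be used, and rows that are not useful.
cleaned = raw.drop(columns=["run_id"])
cleaned = cleaned[cleaned["taxon"] != "unclassified"]

# 2. Reshape: one row per sample, one column per bacterium.
wide = cleaned.pivot_table(
    index=["sample", "phenotype"], columns="taxon",
    values="abundance", fill_value=0.0,
).reset_index()
wide.columns.name = None

# 3. Turn the text label into a number (0 = healthy, 1 = obese).
wide["label"] = wide["phenotype"].map({"healthy": 0, "obese": 1})

X = wide.drop(columns=["sample", "phenotype", "label"])  # independent variables
y = wide["label"]                                        # dependent variable
print(X)
print(y.tolist())
```

After this, `X` holds only numeric abundance columns and `y` holds the numeric labels, which is the shape most classifiers expect.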
The models
Maybe I shouldn’t have been that lazy when choosing a model. However, I had worked on some simple Machine Learning projects in the past, so the easiest thing was to simply adapt those to this project.
A Decision Tree Classifier gave me the best results when classifying specific strains as belonging to a healthy or an obese person. I honestly didn’t have a lot of data, so I think that the 100% accuracy my model reached was due to luck.
When classifying whole microbiomes, the best-performing model was logistic regression, with 75% accuracy.
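With little data, a single accuracy number can be misleading, so one option is to compare the two model types with cross-validation. The sketch below assumes scikit-learn (the article does not name the library) and uses randomly generated stand-in data, so its scores say nothing about real microbiomes or about the results reported above.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data: 30 "healthy" and 30 "obese" samples, 4 features each.
rng = np.random.default_rng(0)
n_per_class = 30
healthy = rng.normal(loc=0.30, scale=0.05, size=(n_per_class, 4))
obese = rng.normal(loc=0.25, scale=0.05, size=(n_per_class, 4))
X = np.vstack([healthy, obese])
y = np.array([0] * n_per_class + [1] * n_per_class)

models = {
    "decision tree": DecisionTreeClassifier(max_depth=3, random_state=0),
    "logistic regression": LogisticRegression(max_iter=1000),
}

# 5-fold cross-validation: each model is trained and scored on 5 different splits.
scores = {
    name: cross_val_score(model, X, y, cv=5).mean()
    for name, model in models.items()
}
for name, score in scores.items():
    print(f"{name}: mean accuracy {score:.2f}")
```

Averaging over several splits makes a lucky 100% on one small test set much less likely to go unnoticed.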
What this means
In short, I coded an AI model that could classify both bacterial strains and whole microbiomes as “healthy or obese”. Meaning, AI could tell if a specific bacterial strain or a whole microbiome belonged to an obese or a healthy person.
My intention behind that first phase of the project was to get familiar with gut microbiome data, and to be able to obtain that list of bacteria that play a key role in obesity (if there is one).
Next steps
What I’m already trying to do is obtain that list of important criteria by visualizing the decision tree. My main areas of improvement for that now are handling data with Python quickly and solving syntax errors.
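One possible way to read the important criteria out of a decision tree is to print its rules and rank its feature importances. This is a sketch assuming scikit-learn, with made-up data built so that only one column separates the classes; it implies nothing about which bacteria matter in real microbiomes.

```python
from sklearn.tree import DecisionTreeClassifier, export_text

feature_names = ["Bacteroides", "Akkermansia", "Faecalibacterium"]

# Made-up abundances: only the first column differs between the classes.
X = [
    [0.30, 0.05, 0.20],
    [0.28, 0.06, 0.22],
    [0.32, 0.04, 0.18],
    [0.12, 0.05, 0.21],
    [0.10, 0.06, 0.19],
    [0.14, 0.04, 0.20],
]
y = [0, 0, 0, 1, 1, 1]  # 0 = healthy, 1 = obese

tree = DecisionTreeClassifier(random_state=0).fit(X, y)

# Text view of the learned rules.
print(export_text(tree, feature_names=feature_names))

# Rank features by how much each one contributed to the splits.
ranked = sorted(
    zip(feature_names, tree.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, importance in ranked:
    print(f"{name}: {importance:.2f}")
```

On real data the ranking would be spread across many bacteria, and it would be worth checking that it stays stable across different train and test splits.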
Note: If you’re working on something similar or know someone who could give me professional feedback on this project, I would highly appreciate it. You can find my social media links at the bottom of this article :)
Having learned about AI for some months, I wouldn’t say it’s my passion (at least not yet), but I feel proud of what I’ve built so far, and I want to stay up to date with everything that happens in this field, especially things that have to do with biotech too!
One insight I got is that a lot of scientists are already using these tools, and the fact that companies like Viome exist makes it a bit challenging for me to come up with something truly innovative.
However, I will follow a friend’s advice: just because someone is already doing something doesn’t mean it’s the best it can be. You can do it too, and you should try to improve it.