Last Updated on September 4, 2022
Table of contents:
- What are Mental Models in Data Science?
- Why should I use Mental Models in Data Science?
- Why shouldn’t I use Mental Models in Data Science?
- Why is improving thinking and decision-making in Data Science important?
- What decisions do Data Scientists often need to make?
- What are the most common issues with decision-making in Data Science?
- Getting started
- Episode 1: A new problem
- What are some good thinking Mental Models for Data Science?
- Episode 2: The problem strikes back
- What are some good techniques for decision-making in Data Science?
- Episode 3: The return of the Data Scientist
- What are some good problem-solving techniques for Data Science?
- Episode 4: The Intern Menace
- What are some good techniques for improving communication skills in Data Science?
- Where can I learn more?
- Closing thoughts
- References
What are Mental Models in Data Science?
Mental Models are our inner representations of external reality that we use to interact with the world around us.
In Data Science, you often need to solve problems, make decisions, and communicate… and knowing which Mental Models to use and when to use them will make you stand apart.
Why should I use Mental Models in Data Science?
- Mental Models help us make better decisions
- They allow for easier problem-solving
- They make our thinking more efficient
- They help us avoid logical fallacies
- They simplify problems
- They make us better thinkers
- They improve the quality of our work
Why shouldn’t I use Mental Models in Data Science?
- Mental Models aren’t perfect
- Mental Model types are situational
- Some problems might not be solvable by a Mental Model (or by a human at all)
- You might use the wrong ones
- You might spend too much time thinking about Mental Models instead of the problem
- They aren’t immune to logical fallacies
Why is improving thinking and decision-making in Data Science important?
Thinking and decision-making are integral parts of Data Science, and their outputs can make or break a project, a potential solution, an idea, and much more. They are also skills that can be honed, and cultivating them will make you a better Data Scientist.
What decisions do Data Scientists often need to make?
Data Scientists often need to juggle the business and technical sides, which often don’t get along. Their decisions span modeling, structure, communication, goal setting, scope of work, responsibility, and much more.
What are the most common issues with decision-making in Data Science?
Issues with decision-making in Data Science often revolve around stakeholders asking for the impossible, unusable data, obscure requirements, navigating a team, finding the right data, having too much or too little information, and more.
Knowing how to navigate this sea of decisions and face its issues can be tricky, but Mental Models can make your life much easier.
Getting started
The inspiration for this article came from noticing that many Data Scientists I’ve worked with tend to map much of the problem-solving process onto previous solutions, “jump the gun”, or throw Deep Learning at everything even when the solution was a “simple” one.
I’ve often found that each new problem requires a fresh look and that the usage of Mental Models can speed up the process of problem solving and decision-making. Moreover, Mental Models can alleviate much stress and uncertainty, and are also transferable to all parts of life.
Thus, with my background in Psychology and more than 4 years of experience as a Data Scientist, I think I can offer an interesting and practical perspective on different Mental Models and their usage.
We’ll sort Mental Models into 4 major categories, showcase a couple of models from each, and apply them to example scenarios. Feel free to play along with the example scenarios, as each problem has multiple solutions.
The 4 categories which will be shown are the following:
- Thinking
- Decision-making
- Problem-solving
- Communication
Keep in mind that these categories aren’t perfect and that the Mental Models can easily overlap, be borrowed, and/or combined.
Note that the example scenarios are made up, boiled down to keep the article interesting, and may or may not have happened to me. Any resemblance to real-world cases is coincidental and unintentional.
Episode 1: A new problem
You’re a Lead Data Scientist at a well-established company that offers various 360° Data Science solutions to banks, health-based companies, startups, hedge funds, and more. You are in charge of the decision-making process and are leading a data team.
You receive two emails about the next exciting problems that await your attention. Here are the boiled-down versions of those emails:
The first one deals with a startup offering AI-based mental health treatment to users through music. They do this via an app where the user selects the mood they want to be in, and their algorithm steers the user’s mood in the desired direction.
They’ve found issues with their app: users experience an amplification of their current mood and/or don’t reach the target mood at all. The startup is confused, as its models have been validated by scientific research.
The second email is from a bank that wants a solution that will predict the amount of money taken out from their ATMs, 30 days in advance, and daily.
Are you spotting some potential issues and/or overlooked factors with these problems?
What are some good thinking Mental Models for Data Science?
Some good thinking Mental Models for Data Science are the following:
- Concept Map
- The Iceberg Model
- Reinforcement Feedback Loop
- Balancing Feedback Loop
Now, let’s cover each of them and see how to apply them to our problem.
Concept Map
A Concept Map is a useful Mental Model, as it allows us to visually display a system and pinpoint the linkages between its parts. Having a Concept Map laid out for almost every project you deal with will be quite beneficial.
Joseph Novak and Alberto Cañas, the creators of the Concept Map, say the following about it:
“Concept mapping has been shown to help learners learn, researchers create new knowledge, administrators to better structure and manage organizations, writers to write, and evaluators assess learning.”
There are 3 main steps to go through to implement a Concept Map.
Step 1: Formulate a problem question
What are the exact questions that you need to answer to be able to visually represent the system in which the problem is situated? Start by asking “How does X work?”, “What’s the context of X in which it exists?” and “How is X linked to Y?”.
For example, we might want to ask the following:
- How does your algorithm work exactly?
- What’s the app ecosystem like in which the algorithm exists?
- How are the algorithm’s music curations linked to the user’s current mood?
Step 2: Identify key entities and sort them
Now that you have the context of the problem, try to create a list of the key entities that impact the problem and are linked to it. These entities might be people, algorithms, processes, places, protocols, and more.
Compile these entities into a list of roughly 20 items. Keep in mind that you might have fewer or more depending on the problem.
Now that you have a list of entities, try sorting it by specificity and/or importance. This will help you uncover the hierarchy needed to create the Concept Map.
Step 3: Outline the map and fill it in
Use a whiteboarding tool like Miro and start adding entities according to your hierarchy and understanding of the problem. Then link them together with arrows that show the direction of impact.
Make sure to write out the actual action of each connection by adding phrases to the arrows like “adds to”, “creates”, “selects from”, “picks according to”…
When done, you should end up with something like this:
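Since every map looks different, here’s a minimal sketch of how such a map could be captured in code. The entities and link phrases below are illustrative assumptions, not the startup’s actual architecture:

```python
# Concept Map as labeled edges: (source, link phrase, target).
# All entities and links here are hypothetical examples.
concept_map = [
    ("User", "selects", "Mood goal"),
    ("User", "opens", "App"),
    ("App", "sends mood goal to", "Algorithm"),
    ("Algorithm", "picks according to mood goal", "Music curation"),
    ("Music curation", "impacts", "Current mood"),
    ("Current mood", "feeds back into", "Algorithm"),
]

for source, link, target in concept_map:
    print(f"{source} --[{link}]--> {target}")
```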
This Concept Map is a simple one, but it’s enough to get a sense of what the startup is doing and what questions we might ask them to see how we can help.
Some of these boxes (e.g., the algorithm) might have their own, more specific Concept Maps. It all depends on the level of specificity you need or want.
The Iceberg Model
The Iceberg Model helps us notice the underlying causes and implications of a system or event that aren’t apparent at first. It is very useful when approaching problems in Data Science, as they often prove more complex than they first appear.
The Iceberg Model is composed of four main components that we should work through:
- Events
- Patterns
- Structures
- Mental Models
Each of these components should be explored by asking questions like these:
Events: What is happening right now? What is being asked? What do we know for sure?
Patterns: Are there any trends? Has this happened before? What historical data do we have?
Structures: What might produce the patterns? How are the parts connected? Where do the parts originate from and where do they end?
Mental Models: Are there any beliefs, assumptions, or other mental models on which the system is built? What kind of Mental Models are they and how do they behave?
For example, the bank that approached us was pretty clear about what they wanted, and the problem might have appeared simple to some of you.
You might have thought: “Oh, banks have a lot of data and this is obviously a time series problem. Let’s get that data and start making time series for each ATM.”
But wait! Let’s use the Iceberg Model and uncover things underneath this request.
They want us to “…predict the amount of money taken out from their ATMs, 30 days in advance, and daily”. Why? For what use case? Why 30 days? Why daily? How often do they provision the ATM?
After asking these questions in an email we got the following response:
- 30 days seemed like a convenient number, so they picked it
- knowing when an ATM will run out would let them refill it sooner and/or prioritize it, hence the daily request
- they provision the ATMs every week
Knowing this, it makes more sense to model 7 days out, not 30. Do we still need daily predictions, then? Did the person asking know that each additional prediction step adds to the overall uncertainty, because every prediction carries its own uncertainty? A rough illustration of that error growth follows.
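Here’s a back-of-the-envelope sketch with an assumed error size (not the bank’s data): if each daily forecast has an independent error with standard deviation `sigma`, the error of the summed h-day forecast grows roughly like sigma·√h.

```python
import numpy as np

sigma = 1_000.0  # hypothetical std of a single day's forecast error

# Assuming independent daily errors, the std of the h-day total
# grows like sigma * sqrt(h).
for h in (7, 30):
    print(f"{h}-day horizon: total error std ~ {sigma * np.sqrt(h):,.0f}")
# 7-day horizon: total error std ~ 2,646
# 30-day horizon: total error std ~ 5,477
```

In other words, under these assumptions the 30-day total is about twice as noisy as the 7-day one, which supports modeling the shorter horizon.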
Moreover, we need to keep in mind that treating each ATM as a separate time series forgoes the benefit of sharing predictors across ATMs. Would a single model covering all ATMs, or clusters of ATMs, be more sound?
As for the patterns and their structures, we can think of the following (a small feature sketch follows the list):
- salary patterns – ATMs get used more often when people get paid
- location patterns – ATMs in urban locations are used more often
- weather patterns – people go out less when the weather is horrible
- calendar patterns – before holidays like Christmas, people might use ATMs more. What about holidays of other cultures and religions?
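To make these patterns concrete, here’s a minimal feature-engineering sketch. The column names (`atm_id`, `date`, `withdrawn`) and the payday dates are assumptions, not the bank’s actual schema:

```python
import pandas as pd

# Hypothetical withdrawal records for one ATM.
df = pd.DataFrame({
    "atm_id": ["A1", "A1", "A1"],
    "date": pd.to_datetime(["2022-08-30", "2022-08-31", "2022-09-01"]),
    "withdrawn": [12_000, 15_500, 9_800],
})

df["day_of_week"] = df["date"].dt.dayofweek        # weekly usage cycle
df["is_payday"] = df["date"].dt.day.isin([1, 15])  # salary pattern (assumed paydays)
df["month"] = df["date"].dt.month                  # calendar/seasonal pattern
# Location and weather features would be joined in from external sources, e.g.:
# df = df.merge(atm_locations, on="atm_id").merge(daily_weather, on="date")
```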
Reinforcement Feedback Loop
A Reinforcement Feedback Loop (RFL) is a loop that behaves in such a way that it amplifies itself. The best way to imagine it is if you placed a microphone next to a big speaker and shouted into the microphone. Soon enough you would regret doing that.
This is more a phenomenon than a pure “Mental Model”, but it is quite useful to notice, as it is often present in many systems and even in your own thinking! Its output grows exponentially.
If you look at the Concept Map above and the startup’s information, you can spot the RFL. What if users who are depressed just want to listen to depressing music, or pick that sort of mood to amplify the feeling?
This may well be what is happening here. The music you listen to might be bad for you at certain times, and picking the right music can be harder than one might think.
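To see how quickly such a loop runs away, here’s a toy simulation with made-up numbers:

```python
# A reinforcing loop: each listen nudges the mood further in its
# current direction, compounding step after step.
mood = -1.0          # hypothetical mood score; negative = low mood
amplification = 1.2  # assumed per-listen reinforcement factor

for listen in range(1, 6):
    mood *= amplification
    print(f"after listen {listen}: mood = {mood:.2f}")
# The magnitude grows every step -- the loop feeds on its own output.
```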
Another example of an RFL can be observed when a team makes decisions. It is known in the psychological literature that the more people there are in a group, the more extreme and risky the final decision tends to be.
This is because people tend to agree with or follow people they like, often adding a bit of boldness at the end as a signal. This signal then goes around and amplifies the loop that leads to the final decision.
To combat this, have your data team write down their opinions on the problem in advance and anonymously. Once the answers are in, hold a meeting and see what the best decision might be. Also, be sure to always have a devil’s advocate when making decisions.
Balancing Feedback Loop
A Balancing Feedback Loop (BFL) is a loop that brings a system toward balance, often countering a Reinforcement Feedback Loop by pushing in the opposite direction. It can be used either to fully counter an RFL or to introduce a plateau.
A BFL can be spotted in many systems, and it is often paired with an RFL. When your data team is making decisions, the devil’s advocate is the one doing the balancing.
To help the startup from our example, we’ll ask them to add continuous checks of the user’s current mood and of the mood goal the user is heading toward. The algorithm should also know what kind of listening style the user has.
For example, does the user have self-destructive tendencies and pick music that amplifies their current state, or do they pick that kind of music for its cathartic effect? If it’s the former, we need a BFL to warn the user and shuffle the playlist in a more beneficial direction, since the goal is mental health.
This essentially means that the algorithm should use more data and check, more often while the user is listening, how close they are to the goal mood. Moreover, how much can the chosen mood goal actually help the user’s mental health?
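As a rough sketch of what such a balancing check could look like in code (the mood scale, threshold, and action strings are all assumptions):

```python
def balance_playlist(current_mood: float, goal_mood: float,
                     threshold: float = 0.5) -> str:
    """Return a corrective action when the mood drifts from the goal."""
    drift = current_mood - goal_mood
    if abs(drift) <= threshold:
        return "keep current playlist"
    # The balancing loop pushes against the drift, back toward the goal.
    return "lift the playlist toward the goal" if drift < 0 else "calm the playlist down"

print(balance_playlist(current_mood=-2.0, goal_mood=1.0))
# -> "lift the playlist toward the goal"
```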
Now we have a few questions for the startup and a few possible solutions.
Episode 2: The problem strikes back
It seems the two problems are deeper than we anticipated. After some more communication, we see them unfolding:
The startup’s members implemented our suggestion and are now arguing about which approach to take to make the algorithm better. One member wants to add frequent surveys, while another wants to introduce biometric data tracking.
They’re asking us what would work better, as we have a solid track record with wellbeing companies.
The bank has let us do whatever we want as long as they get a useful model. Our data team is struggling to gauge exactly how big this problem might be and at what stage of it we are.
What are some good techniques for decision-making in Data Science?
Some good techniques for decision-making in Data Science are:
- Six Thinking Hats
- The Cynefin Framework
- The Ladder Of Inference
- Second-Order Thinking
Six Thinking Hats
Six Thinking Hats makes us approach the problem from different standpoints, which should lead us to the right decision. It can be used by a single person or by a data team (recommended). Here is how to use it:
Each team member gets a hat that represents a different mode of reasoning. Keep in mind that team members can also shuffle their hats if you wish (a small rotation sketch follows the list). The hat options are the following:
- Green = creativity – brainstorm ideas and let them run wild in many directions.
- Yellow = positivity – ponder all the benefits of an approach/decision.
- Black = negativity – ponder all the downsides and look for weaknesses.
- Red = emotions – how do you feel about this? What does your gut say? Why?
- White = analytical – focus on the data and information, and be VERY rational.
- Blue = controlling – moderate the other hats so that you make progress. Watch out if one of them becomes too prominent and blocks the others from speaking.
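As a small illustration of how the rotation could be organized (the team names are hypothetical):

```python
# Rotate the hats each round so every member argues from each standpoint.
HATS = ["green", "yellow", "black", "red", "white", "blue"]
team = ["Ana", "Marko", "Iva", "Luka", "Petra", "Jan"]

for round_no in range(len(HATS)):
    rotated = HATS[round_no:] + HATS[:round_no]
    print(f"round {round_no + 1}:", dict(zip(team, rotated)))
```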
Since it’s not our job to meddle in startup affairs, and since we aren’t domain experts, I’ll introduce the startup to this model and ask them to contact us when they reach a decision. We need to know when to push back, and we shouldn’t always take on more load.
Because “Data Science” is more often than not a murky term, some companies might expect you to be a part of everything, pulling you in many directions. It is important to know when to say no and/or deflect a request.
The Cynefin Framework
The Cynefin Framework helps us make sense of a problem situation and decide what steps to go through. The framework states that there are five categories of problem situations:
- Clear
- Complicated
- Complex
- Chaotic
- Disorder
Each category has key characteristics that we use for classification. This model will help with our bank example, so that our data team knows the next general steps. Let’s go over each category and its characteristics (a condensed code summary follows the descriptions):
The Clear category is characterized by problems where everything is clearly defined, with straightforward cause-and-effect relationships. It often requires only the use of best practices, and the solutions are easy to spot.
The course of action for this category is sense-categorize-respond:
- sense – understand the problem
- categorize – regression, classification, clustering…?
- respond – implement the apparent solution
The Complicated category is characterized by problems that require some pondering and might have multiple competing solutions that aren’t clear at first. We can call it the category of known unknowns. These problems often require some guidance from domain experts.
The course of action is sense-analyze-respond. As the solution isn’t apparent at first, we need to analyze the problem, the data, the context, and more to find an adequate response.
The Complex category holds obscure problems that aren’t clear at first and require investigating the problem and its context to get a handle on them. These problems often get solved through discussion and experimentation.
The main course of action is to investigate and move the problem into the Complicated category.
The Chaotic category has no stability, and the causal relationships are unclear. These problems require us to introduce stability before doing anything else. The goal is to bring the problem down to the Complex category.
The main course of action is act-sense-respond, where the “act” is the stabilization.
The Disorder domain applies when you don’t know which category your problem is in. The best thing to do here is to dissect the problem into multiple smaller ones and try categorizing those.
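Condensed into code, the categories and their courses of action might look like this lookup table (my own summary, not official Cynefin notation):

```python
# Each Cynefin category mapped to its general course of action.
CYNEFIN = {
    "clear":       ["sense", "categorize", "respond"],
    "complicated": ["sense", "analyze", "respond"],
    "complex":     ["investigate", "move toward complicated"],
    "chaotic":     ["act (stabilize)", "sense", "respond"],
    "disorder":    ["split into sub-problems", "categorize each"],
}

print(" -> ".join(CYNEFIN["complex"]))  # investigate -> move toward complicated
```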
Now we can see that our bank problem was in the Complex category when we first got it, and it’s now starting to cross into the Complicated category. But we still need to solve the problem of what kind of a problem this problem is. 😀
Episode 3: The return of the Data Scientist
We started with two problems and have come a long way toward finding optimal solutions for them. Truth be told, we’re not done yet and are still pondering the best options.
The startup has returned with a solution they agreed upon and is asking us to implement it to enhance their algorithm. We also need to nail down what kind of problem the bank one actually is.
As the Lead Data Scientist, it’s time to pull the trigger and make some final decisions before putting your data team to work.
What are some good problem-solving techniques for Data Science?
Some good problem-solving techniques for Data Science are:
- First Principles
- Inversion Approach
- Ishikawa Diagram
- Issue Trees
- Abstraction Laddering
First Principles
First Principles lets us break a problem down into its most basic components, ones that don’t need to and can’t be boiled down further. There are two main steps to this technique:
- Break the problem down into basic truths
- Use the basic truths to find the solution
To arrive at the First Principles of a problem, we can follow the Socratic method of constant questioning: start with one question, pose another once you get the answer, and continue until you reach a basic truth.
We can also ask the 5 Whys: ask “Why is this the case?” five times in a row, which is similar to the Socratic method. Keep in mind that First Principles aren’t the same for everyone; what is a first principle for a domain expert might not be one for you, and vice versa.
In our bank example here are some basic truths:
- ATMs need to be refilled
- some ATMs need to be refilled sooner than others
- ATMs differ in the amount of money in them
- the bank wants something that works
- the problem we’re facing is (?)
Let’s use these basic truths to frame our problem as different Data Science modeling problems:
Regression:
- Total money left in an ATM daily
- Daily prediction of taken out money
Classification:
- Daily % chance for the ATM to empty out
Aggregation:
- Summed amount of predicted ATM money take-outs
Optimization:
- The best route to take for money transportation
Survival:
- How many days until an ATM is empty?
Because there’s no free lunch in Data Science, we’ll need to experiment with multiple solutions to see which one is best. We can email this list to the bank to hear their opinion, as their domain knowledge might narrow it down to the top 3 approaches.
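To make one of the framings concrete, here’s a toy sketch of the survival framing with made-up numbers: given forecasted daily withdrawals, we can derive the number of days until the ATM runs dry:

```python
import numpy as np

cash_in_atm = 50_000.0  # hypothetical cash currently in the ATM
forecast_daily = np.array([9_000, 7_500, 8_200, 12_000, 6_800, 9_500, 11_000])

cumulative = np.cumsum(forecast_daily)      # running total of withdrawals
runs_out = cumulative >= cash_in_atm
days_until_empty = int(np.argmax(runs_out)) + 1 if runs_out.any() else None
print(days_until_empty)  # -> 6: the ATM is predicted to empty on day 6
```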
Inversion Approach
The Inversion Approach helps us view a problem from a different angle and/or consider the worst possible scenarios. To implement it, start thinking of bad solutions and ask yourself why each one is bad and how it can be improved upon.
This Mental Model will help us see whether the startup’s agreed-upon improvement is something we can implement and whether there are any outstanding problems with it.
I’ll use this approach by asking the team to imagine that the implementation failed. Then we brainstorm together why it failed, what went wrong, what mistakes we made, what we didn’t consider, and more.
Episode 4: The Intern Menace
Your data team is making good progress in solving the problems and things are in motion. While inspecting the team metrics and reviews, you’ve noticed that one of your interns is underperforming.
For example, he recently force-pushed to the development branch, which caused many issues in the codebase. His mentor gives him tasks on which he underperforms, and the intern often reacts aggressively when you try to reason with him.
In Data Science, communication is key and all of us will make mistakes that will require communication to be resolved. Moreover, strong communication skills are quite beneficial as you’ll be often negotiating with stakeholders and your team.
What are some good techniques for improving communication skills in Data Science?
Some good techniques for improving communication skills in Data Science are:
- SBI
- The Minto Pyramid
- Assertive Communication
- Active Listening
SBI
SBI (Situation-Behavior-Impact) helps us give better feedback by removing emotion from it and making it clear and concise. It is mostly used when you need to give hard and/or negative feedback, but it can also be used for positive feedback.
The technique consists of 3 main steps, often extended with a fourth (Intent). Let’s go through them and apply them:
Situation
Start the feedback with the specific situation that occurred; it serves as a common reference point.
For example: Remember when you had that task to implement an analysis pipeline last Friday?
Behavior
Refer to a specific behavior that you observed and want to talk about. Make sure to not give any judgments and leave the interpretation out of it.
For example: I saw that you used a git force push to land your solution.
Impact
Talk about the impact that behavior had and what you think and feel about it. Feel free to address what other people think and how it impacted things.
For example: That caused many issues in the codebase, and other people’s commits were overwritten. It led to negative reviews and underperformance on your task. Moreover, the implementation was sub-optimal.
Intent
Ask about the person’s intention and try to uncover whether they are aware of what they did and why they did it. Then work together with them to see how things can be improved and how to overcome the issues.
For example: Can I ask why you chose to force-push? Have you used git before, or is this your first time? How can we work together to improve your performance? What do you think of your mentor?
The Minto Pyramid
Nobody likes reading enormous amounts of text when the same thing can be laid out clearly and concisely. The Minto Pyramid helps by organizing a message so that it starts with the conclusion, follows with the arguments that support it, and ends with detailed information.
To implement it, just follow the Minto Pyramid order when writing a message: keep the first (conclusion) part short, 2-3 sentences max; make the next two parts concise bullet points; and link a more detailed explanation at the bottom if need be.
I once had a guy on my team who would write humongous, wordy, overly technical messages to convey something that could be written in, I kid you not, 3 sentences. Imagine waking up, opening Slack, and seeing a book waiting to be read…
Where can I learn more?
While writing this article, I stumbled upon an interesting website that houses many of the Mental Models we’ve covered. It is called Untools, and I highly recommend checking it out. I also recommend reading about logical fallacies, as these Mental Models and techniques aren’t immune to them.
Closing thoughts
Being a good Data Scientist doesn’t primarily revolve around being a good coder who knows how to get high scores on model metrics. This is becoming ever more apparent as AutoML solutions get better each day.
The most important tool of a Data Scientist is his mind and the way he uses it. Knowing how to address problems, think about them, solve them, and communicate effectively are the main things that make you stand apart and drive business goals.
Thus, working on these skills each day and using Mental Models and techniques like these is useful and practical for you, your career, and the people who count on you.
References
“The Minto Pyramid Principle” by Barbara Minto
“Six Thinking Hats” by Edward de Bono
“A Leader’s Framework for Decision Making” by David J. Snowden and Mary E. Boone in Harvard Business Review
“Mental models for designers” by Wes O’Haire