A decision tree is a versatile tool that can be applied to a wide range of problems. Decision trees are commonly used in business for analyzing customer data and making marketing decisions, but they can also be used in fields such as medicine, finance, and machine learning.
The most detailed decision trees can be incredibly complex, but simple decision trees are easy to create and interpret. They’re built around a series of yes/no questions that gradually narrow down your options until the most sensible decision is reached.
What is a decision tree?
A decision tree is a flowchart-like diagram that maps out the potential solutions to a given problem. Organizations often use decision trees to determine the best course of action by comparing the possible consequences of a set of decisions.
For example, a decision tree could help a company decide which city to move its headquarters to, or whether to open a satellite office. Decision trees are also a popular tool in machine learning, where they are used to build predictive models, such as a model that predicts whether a customer will buy a product based on their previous purchase history.
What are decision tree nodes and symbols?
Decision trees are made up of various connected nodes and branches, expanding outward from an initial node. The three types of nodes are decision nodes, chance nodes, and outcome nodes.
- Decision nodes are square-shaped and represent a point on the tree at which a decision can be made. For example: Should you host a barbecue on Saturday?
- Chance nodes are circle-shaped and represent a point on the tree at which there are multiple uncertain consequences. For example: What are the chances that it will rain on Saturday?
- Outcome nodes are triangle-shaped and represent the final endpoint of a series of decisions. For example: A sudden downpour ruins your barbecue.
Connecting these nodes are the branches of the decision tree, which link decisions and chances to their potential consequences. To evaluate the best course of action, follow each branch to its logical endpoint, tally up the costs, risks, and benefits along each path, and reject any branches that lead to negative outcomes.
How to make a decision tree step-by-step
You can use software tools or online collaboration platforms to create a decision tree, but all you really need is a whiteboard or a pen and paper.
- Draw your initial node. This square node represents the main decision you’re trying to make. For every possible action you can take at this point, draw a branch and label it with the name of that action. You can include additional information here, such as the financial cost of making that decision.
- Add nodes to the end of each branch. Now consider what happens in each labeled scenario. Would following that course of action lead to another decision point? If so, add another square and repeat the process. If the decision leads to a chance outcome, draw a circle node and try to determine the possible outcomes and the probabilities of each one occurring. In our simplified barbecue example, that would be the chance of it raining on the day.
- Expand the tree until every endpoint is reached. Continue adding decision nodes, chance nodes, and branches until there are no more choices you can make. Then cap off each branch with an outcome node. This outcome node describes the end result of following that path and should include some kind of value or score so that comparisons can be made between each endpoint.
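The steps above can also be sketched in code. The example below is a minimal illustration rather than a definitive method: it models the three node types as small Python classes and scores the tree by working backward from the outcome nodes, weighting chance branches by their probabilities and picking the best-scoring action at each decision. The class names, the probabilities, and the scores for the barbecue example are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class Outcome:
    """Triangle node: an endpoint with a score used for comparison."""
    label: str
    value: float


@dataclass
class Chance:
    """Circle node: uncertain branches as (probability, child) pairs."""
    label: str
    branches: List[Tuple[float, "Node"]]


@dataclass
class Decision:
    """Square node: available actions as (action name, cost, child) triples."""
    label: str
    options: List[Tuple[str, float, "Node"]]


Node = Union[Outcome, Chance, Decision]


def evaluate(node: Node) -> float:
    """Roll the tree back from its endpoints to a single expected score."""
    if isinstance(node, Outcome):
        return node.value
    if isinstance(node, Chance):
        # Weight each possible consequence by its probability.
        return sum(p * evaluate(child) for p, child in node.branches)
    # Decision node: assume we take the action with the best net score.
    return max(evaluate(child) - cost for _, cost, child in node.options)


def best_action(node: Decision) -> str:
    """Name of the action with the highest expected score at this decision."""
    return max(node.options, key=lambda o: evaluate(o[2]) - o[1])[0]


# Hypothetical numbers for the barbecue example (scores are arbitrary units).
weather = Chance("Will it rain on Saturday?", [
    (0.3, Outcome("A sudden downpour ruins the barbecue", -50.0)),
    (0.7, Outcome("A sunny barbecue with friends", 100.0)),
])
barbecue = Decision("Should you host a barbecue on Saturday?", [
    ("host", 20.0, weather),  # 20 is an assumed cost of food and drink
    ("skip", 0.0, Outcome("A quiet day at home", 10.0)),
])

print(best_action(barbecue), evaluate(barbecue))
```

With these made-up numbers, hosting comes out ahead: the chance node is worth 0.3 × -50 + 0.7 × 100 = 55, which leaves 35 after the cost of food, compared with 10 for skipping. Change the probabilities or scores and the recommendation can change with them.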
Decision tree advantages and disadvantages
Depending on when and how they’re used, decision trees can come with certain advantages and disadvantages:
- They’re clear and easy to understand. From an infographic point of view, a well-constructed decision tree can condense a huge amount of data into an accessible format that every member of the organization can understand. Marketing teams, for example, don’t need to know the nitty-gritty detail of the statistical analysis behind a potential decision. A decision tree cuts through the noise to bring people the information they need to determine the most effective course of action.
- They’re only as good as the underlying data. Decision trees aren’t a crystal ball. If you don’t have high-quality data to begin with, or you can’t determine the probabilities of certain chance nodes occurring, the tree becomes less reliable with every level you add, because errors in early estimates carry through to everything that follows. Biases in your data or incomplete data sets will also introduce inaccuracies.
- They’re quick to design and simple to refine. Decision trees don’t need to be overly complex to be useful. Mapping out your options on paper can be done in minutes and helps to bring clarity to the decision-making process.
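To see how much a tree depends on the quality of its inputs, it can help to vary one estimate and watch whether the recommendation changes. The sketch below does this for the barbecue example, under assumed values: the scores, the cost, and the function names are hypothetical and chosen only to illustrate the idea.

```python
# Hypothetical scores for the barbecue example (arbitrary units).
RUINED, SUNNY, FOOD_COST = -50.0, 100.0, 20.0
SKIP_SCORE = 10.0  # assumed score for staying at home instead


def host_score(p_rain: float) -> float:
    """Expected score of hosting, given the estimated probability of rain."""
    return p_rain * RUINED + (1 - p_rain) * SUNNY - FOOD_COST


def recommended(p_rain: float) -> str:
    """The action the tree would recommend for a given rain estimate."""
    return "host" if host_score(p_rain) > SKIP_SCORE else "skip"


# Sweep the rain estimate to see where the recommendation flips.
for p in (0.1, 0.3, 0.5, 0.7):
    print(f"p_rain={p:.1f}: {recommended(p)} (hosting scores {host_score(p):.1f})")
```

Under these assumptions the recommendation flips from hosting to skipping somewhere between a 30% and a 50% chance of rain, so a weather estimate that is off by that margin would lead to a different decision.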
The role of decision trees in data science
We’ve mostly focused on the use of decision trees in choosing the most effective course of action in business, but this type of informational mapping also has practical applications in data mining and machine learning.
In this context, decision trees aren’t drawn by hand to determine some optimal course of action. Instead, an algorithm builds the tree automatically from a given dataset and uses it as a predictive model. These algorithms take in large amounts of existing information and use the resulting tree to make predictions about new data points. For example, the medical data of thousands of hospital patients could be used to predict the likelihood of a person developing a disease.
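As a rough illustration of how such a tree can be built from data, the sketch below repeatedly picks the yes/no question that best separates the labels, using Gini impurity as the measure of how mixed a group is. This is one common criterion, not the only one, and the code is a simplified teaching example rather than a production algorithm. The patient records (age and blood pressure) and the labels are invented toy data.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 0.0 when every label in the group is the same."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, labels):
    """Return (impurity, feature index, threshold) for the cleanest split."""
    best = None
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue  # a split must send rows down both branches
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best


def build(rows, labels, depth=0, max_depth=2):
    """Grow a tree of nested (feature, threshold, left, right) tuples."""
    split = None
    if depth < max_depth and gini(labels) > 0:
        split = best_split(rows, labels)
    if split is None:
        # Leaf: predict the most common label among the rows that got here.
        return Counter(labels).most_common(1)[0][0]
    _, f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (
        f,
        t,
        build([r for r, _ in left], [y for _, y in left], depth + 1, max_depth),
        build([r for r, _ in right], [y for _, y in right], depth + 1, max_depth),
    )


def predict(tree, row):
    """Follow the yes/no questions down to a leaf label."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree


# Invented toy records: (age, systolic blood pressure) and a risk label.
patients = [(25, 110), (32, 118), (41, 135), (47, 122),
            (53, 150), (58, 128), (64, 160), (70, 145)]
risk = ["low risk", "low risk", "low risk", "low risk",
        "at risk", "low risk", "at risk", "at risk"]

tree = build(patients, risk)
print(predict(tree, (45, 155)), "|", predict(tree, (60, 120)))
```

On this toy data, a single question about blood pressure separates the two groups cleanly, so the tree stops after one split. Real datasets are rarely that tidy, which is why libraries add safeguards such as depth limits and pruning.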
Types of decision trees
There are two main types of decision trees in data science:
- Classification trees. A classification tree is a decision tree where each endpoint node corresponds to a single label. For example, a classification tree could take a bank transaction, test it against known fraudulent transactions, and classify it as either “legitimate” or “fraudulent.”
- Regression trees. A regression tree is a decision tree where the values at the endpoint nodes are continuous rather than discrete. That is, the regression tree predicts a real-valued output rather than a class label—for example, predicting a person’s salary based on their age and occupation.
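The practical difference between the two types shows up at the endpoints. In the minimal sketch below, a classification leaf returns the most common label among the training examples that reached it, while a regression leaf returns their average. The one-split regression tree, the function names, and the age and salary figures are all assumptions for illustration, and occupation is left out to keep the example short.

```python
from collections import Counter
from statistics import mean


def classification_leaf(labels):
    """A classification leaf predicts the most common label that reached it."""
    return Counter(labels).most_common(1)[0][0]


def regression_leaf(values):
    """A regression leaf predicts the average of the values that reached it."""
    return mean(values)


def squared_error(values):
    """Total squared distance of the values from their own average."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def regression_stump(xs, ys):
    """One-split regression tree: choose the threshold with the least error."""
    best = None
    # Skip the largest x so the right-hand branch is never empty.
    for t in sorted(set(xs))[:-1]:
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        err = squared_error(left) + squared_error(right)
        if best is None or err < best[0]:
            best = (err, t, regression_leaf(left), regression_leaf(right))
    _, threshold, left_value, right_value = best
    return lambda x: left_value if x <= threshold else right_value


# Invented toy data: ages and salaries.
ages = [22, 25, 30, 41, 45, 52]
salaries = [30000, 32000, 34000, 60000, 62000, 64000]

predict_salary = regression_stump(ages, salaries)
print(predict_salary(28), predict_salary(50))
print(classification_leaf(["legitimate", "fraudulent", "legitimate"]))
```

Note that a regression tree still outputs one of a limited set of values, one per leaf. It approximates a continuous quantity in steps, and deeper trees produce finer steps.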
Decision tree examples
Some examples of when you might use a decision tree include:
- Predicting whether a customer will leave (churn)
- Analyzing credit card data to identify fraudulent transactions
- Identifying which patients are at risk of developing a certain disease
- Forecasting stock market movements
Steve Hogarty is a writer and journalist based in London. He is the travel editor of City AM newspaper and the deputy editor of City AM Magazine, where his work focuses on technology, travel, and entertainment.