From Binary Tree Traversal to Backtracking

1652 words

This article focuses on backtracking algorithms, with two example problems:

Example 1, LeetCode 113 - Path Sum II (Medium)

Given a binary tree and a target sum, find all root-to-leaf paths where the sum of all node values along the path equals the target sum.

Example:

Given the following binary tree, with target sum = 22.

Problem example
Problem example

Return:

[
 [5,4,11,2],
 [5,8,4,5]
]

Example 2, LeetCode 78 - Subsets (Medium)

Given an integer array nums that contains no duplicate elements, return all possible subsets (the power set).

Example:

Input: nums = [1,2,3]

Output:

[
    []
    [1],
    [2],
    [3],
    [1,2],
    [1,3],
    [2,3],
    [1,2,3],
]

You might be wondering: why is the Path Sum problem from the previous article showing up again? And why use a binary tree problem to explain backtracking?

In fact, the process of solving problems with backtracking is closely related to tree traversal. In this article, we’ll look at the Path Sum problem from the perspective of traversal, use it to introduce the fundamental ideas behind backtracking, and then apply those ideas to solve a classic backtracking problem: Subsets.

This article will cover:

  • The binary tree traversal process: using Path Sum as an example
  • Recursion and traversal strategies in backtracking
  • Backtracking example: Subsets
  • Related problems

Looking at Path Sum from a Traversal Perspective

We just covered the Path Sum example in the previous article, and Path Sum II in this article is a variant of it. The two problems are nearly identical, except the required output is slightly different. Previously we only needed to output the number of paths; now we need to output all possible paths.

This small change in requirements also shifts our perspective on the problem. In the original Path Sum, we viewed the binary tree problem from the perspective of subproblems.

Viewing Path Sum from the subproblem perspective
Viewing Path Sum from the subproblem perspective

In Path Sum II, to obtain the actual paths, we need to view the binary tree problem from the perspective of traversal.

Viewing Path Sum from the traversal perspective
Viewing Path Sum from the traversal perspective

To output all possible paths, we need to track the current path during traversal, and save the path whenever we find it meets the condition.

The order of path traversal is essentially the order of DFS. When DFS enters a node, a node is added to the path; when DFS exits a node, a node is removed from the path. The following animated GIF illustrates this process.

Path changes during DFS (animated)
Path changes during DFS (animated)

For the Path Sum problem, we save the desired paths when we reach a leaf node, at the point when the path is at its longest.

The Fundamental Idea of Backtracking

How does this binary tree traversal problem connect to backtracking? Let’s look at the definition of backtracking:

Backtracking uses a trial-and-error approach. When it discovers through exploration that the current partial solution cannot lead to a valid answer, it undoes the last step — or even several steps — and tries other possible partial solutions to find the answer. — Backtracking - Wikipedia

Literally speaking, backtracking means “stepping back.” In binary tree DFS traversal, exiting a node is a form of backtracking. Backtracking and DFS are closely related. For example, when using DFS to traverse paths in this binary tree, after reaching node 7, the next step is to go back one node (the recursive function returns) and then enter node 2. This step of going back is backtracking.

Returning from a recursive call in DFS is a form of backtracking
Returning from a recursive call in DFS is a form of backtracking

Given the nature of the backtracking operation, we use a stack to record the current path during traversal. When entering a node, we push; when exiting a node, we pop — performing backtracking.

Language Tips: What data structure should you use as a stack in different languages?

In C++, vector is typically used.

vector<int> path;
path.push_back(13);
path.pop_back();

In Java, many people use ArrayList. However, ArrayList doesn’t have a convenient pop operation, so I recommend ArrayDeque instead.

Deque<Integer> path = new ArrayDeque<>();
path.addLast(13);
path.removeLast();

In Python, you can simply use a list.

path = []
path.append(13)
path.pop()

The final solution code is:

public List<List<Integer>> pathSum(TreeNode root, int sum) {
    List<List<Integer>> res = new ArrayList<>();
    Deque<Integer> path = new ArrayDeque<>();
    traverse(root, sum, path, res);
    return res;
}

void traverse(TreeNode root, int sum, Deque<Integer> path, List<List<Integer>> res) {
    if (root == null) {
        return;
    }
    path.addLast(root.val);
    if (root.left == null && root.right == null) {
        if (root.val == sum) {
            res.add(new ArrayList<>(path));
        }
    }
    int target = sum - root.val;
    traverse(root.left, target, path, res);
    traverse(root.right, target, path, res);
    path.removeLast();
}

The overall structure of the code is similar to the solution from the previous article, with the addition of a stack path to record the current path. There are two things to note about the push and pop operations on the stack:

  • Make sure to push immediately upon entering a node and only pop right before exiting the node, so that the current path stays in sync with the traversal progress.
  • After the leaf node check, do not return early — otherwise the pop operation at the end would be skipped, causing errors.

Both of these points require hands-on practice to fully understand. I recommend working through the example problem yourself.

The Traversal Approach for the Subsets Problem

Above we explored the backtracking idea within a binary tree problem. You might still feel it doesn’t quite look like classic backtracking. So let’s look at a quintessential backtracking problem: Subsets.

The Subsets problem asks us to enumerate all subsets of a set. There’s a simple strategy for generating subsets: for each subset, choose to include or exclude the first element, then make the same choice for the second element, and so on. This is the essence of backtracking.

Each element has two choices, so nn elements yield 2n2^n possible combinations, corresponding to 2n2^n subsets. We can draw out the decision process. The diagram below shows the decision process for the set {1,2,3}\{1, 2, 3\}, yielding 8 results.

Search tree for the Subsets problem
Search tree for the Subsets problem

This is once again a tree structure. In general, backtracking algorithms can represent their decision paths as a tree — called a search tree. The execution of backtracking is essentially a traversal of this tree. It just so happens that this one is a binary tree, which ties back to binary tree traversal.

So we can try writing the backtracking code using the tree traversal approach. Here, the stack holds the elements currently in the subset. The push operation adds an element to the subset (“choose 1”, “choose 2”, “choose 3” in the diagram above), and the pop operation removes an element from the subset (undoing the choice).

The complete code is:

public List<List<Integer>> subsets(int[] nums) {
    Deque<Integer> current = new ArrayDeque<>(nums.length);
    List<List<Integer>> res = new ArrayList<>();
    backtrack(nums, 0, current, res);
    return res;
}

void backtrack(int[] nums, int k, Deque<Integer> current, List<List<Integer>> res) {
    if (k == nums.length) {
        res.add(new ArrayList<>(current));
        return;
    }

    // Don't choose the k-th element
    backtrack(nums, k+1, current, res);

    // Choose the k-th element
    current.addLast(nums[k]);
    backtrack(nums, k+1, current, res);
    // Undo the choice
    current.removeLast();
}

This code looks very similar to the Path Sum II code — both use a stack, and the recursive parameters are alike. However, there are some subtle differences in how the recursive calls and push/pop operations are arranged.

Here, we perform push/pop before and after the recursive calls. This is because the array itself doesn’t have a recursive structure, so we need the push/pop operations to create different choices. The two recursive calls are actually identical, but since the contents of current differ, they represent two different decision paths.

The complexity of backtracking algorithms is generally quite high. Taking the Subsets problem as an example, you can see from the size of the search tree that the time complexity is a very high O(2n)O(2^n). However, this level of complexity is acceptable for backtracking — most backtracking problems don’t have more efficient solutions. In fact, this code beats 100% of submissions in terms of runtime.

Submission result
Submission result

Summary

Through these two examples, we’ve seen the close relationship between backtracking algorithms and binary tree traversal. When solving backtracking problems, we can first construct a search tree and then traverse it to solve the problem recursively.

It’s worth noting that the search tree in the Subsets example happens to be a binary tree — this is just a coincidence. In practice, search trees can be multi-way trees, and multi-way trees are actually more common.

This article covered the basic ideas of backtracking. There are many more techniques in backtracking, such as the Permutation and Combination series of problems, which will be explored in future articles.

Binary tree traversal problems (to deepen your understanding of traversal):

Backtracking problems (only two simpler ones are listed here; you can find more on LeetCode under the backtracking tag):