Path Sum: Subproblem Decomposition in Binary Trees

1330 words

Today’s problem: LeetCode 112 - Path Sum (Easy)

Given a binary tree and a target sum, determine whether there exists a root-to-leaf path such that the sum of all node values along the path equals the target sum. Return true or false.

Example:

Given the following binary tree and the target sum = 22, the tree contains a path 5->4->11->2 that satisfies the target sum.

Problem example
Problem example

Linked lists and binary trees are the two data structures you’ll encounter most often in interviews. In the previous article, we covered the traversal framework for linked lists. This time, we’ll move on to the traversal framework for binary trees. Binary tree problems are more common and have more variations than linked list problems, but once you grasp the fundamental ideas, they’re not difficult at all.

When it comes to the binary tree traversal framework, many people immediately think of preorder, inorder, and postorder traversal. But mechanically memorizing these three traversal orders isn’t very meaningful. What we really need to master first is the recursive thinking behind binary trees. So how exactly should we use recursion to solve binary tree problems quickly?

This article will cover:

  • The recursive structure of binary trees
  • Subproblem decomposition for binary tree problems
  • Solution to today’s problem: Path Sum
  • Related problems

Binary Trees and Recursion

How do we define a binary tree? A binary tree is a tree in which each node has at most two children. This is a correct definition, but it doesn’t help much with solving problems. What we need is the recursive definition of a binary tree:

  • An empty tree is a binary tree
  • If T1T_1 and T2T_2 are binary trees, then connecting them with a root node also produces a binary tree

As you can see, binary trees are inherently recursive. To traverse a binary tree, we first process the root node, and then the left and right subtrees — which are themselves binary trees — can be processed recursively. This is the principle behind recursive preorder traversal.

Processing a binary tree recursively
Processing a binary tree recursively

We learn about recursive functions early on and think they’re too simple, often forgetting their significance. Recursion is essentially about breaking a problem down into subproblems of the same kind, solving them by repeatedly calling itself. You may be more familiar with subproblems in the context of dynamic programming, but in fact, subproblems exist wherever there are recursive functions.

Many binary tree problems can be solved by decomposing them into subproblems. Once we figure out how to decompose the subproblems, determining which traversal order to use for the recursion becomes straightforward.

Now, let’s use today’s problem, Path Sum, to learn how subproblem decomposition works.

Subproblem Decomposition for Path Sum

As we all know, recursion has two key elements:

  • Repeatedly calling itself
  • A termination condition

When performing recursion on a binary tree structure, these two elements become:

  • Recursively calling itself on the two subtrees
  • Terminating the recursion at leaf nodes

The subtree calls are the critical part. We need to ensure that the subproblem being solved on the subtrees is the same type of problem as the original, so that we can recursively call the same function. The termination condition can be addressed later as a detail.

Looking at this Path Sum problem again, our goal is to find a root-to-leaf path with a sum of 22. If one of the subtrees contains a root-to-leaf path with a sum of 17, adding the root node’s value of 5 gives exactly 17 + 5 = 22. We’ve transformed the overall sum = 22 problem into two subproblems of sum = 17 on the subtrees, enabling recursive self-calls.

The Path Sum problem has several variations of varying difficulty. In this problem, we only care about paths from root to leaf, which is the simplest case and can be solved with the most straightforward recursive approach.

The path sum problem on the entire tree
The path sum problem on the entire tree
The path sum problem on a subtree
The path sum problem on a subtree

Based on the approach above, we can write the following recursive code:

boolean hasPathSum(TreeNode root, int sum) {
    int target = sum - root.val;
    return hasPathSum(root.left, target)
        || hasPathSum(root.right, target);
}

This code is obviously incomplete — we still need to handle the recursion termination conditions and some edge cases. The following sections discuss several details regarding null pointers and leaf nodes.

Details of the Recursive Solution

Detail 1: What does root == null mean?

void processTree(TreeNode root) {
    if (root == null) {
        // empty tree
    }
}

In a binary tree, root being null means an empty tree. But “empty tree” here has two meanings:

The first meaning is that the entire tree is empty. Binary tree problems generally require handling this case; otherwise, the interviewer will consider your edge case coverage insufficient.

The second meaning is that a particular subtree is empty. Since the function is called recursively, the parameter root can represent any subtree. In particular, both subtrees of a leaf node are empty, so the recursion will naturally encounter two root == null cases (as shown in 3 and 4 in the figure below). Typically, we use root == null as the recursion termination condition.

Terminating recursion at root == null
Terminating recursion at root == null

For our Path Sum problem, we can check at root == null whether the target sum has been met — that is, whether sum has been reduced to 0:

boolean hasPathSum(TreeNode root, int sum) {
    if (root == null) {
        return sum == 0;
    }
    int target = sum - root.val;
    return hasPathSum(root.left, target)
        || hasPathSum(root.right, target);
}

Detail 2: Do we need to check for leaf nodes?

However, submitting the code above on LeetCode gives a Wrong Answer. When the tree is empty and sum = 0, our code incorrectly returns true!

The error is that our recursion termination condition was written too hastily. A valid path does reduce sum to 0, but sum being 0 doesn’t necessarily mean a valid path has been found.

Let’s revisit the problem description:

Given a binary tree and a target sum, determine whether there exists a root-to-leaf node path such that the sum of all node values along the path equals the target sum.

We should check and terminate the recursion at leaf nodes, not at the leaf nodes’ two empty subtrees.

boolean hasPathSum(TreeNode root, int sum) {
    if (root == null) {
        return false;
    }
    if (root.left == null && root.right == null) {
        return root.val == sum;
    }
    int target = sum - root.val;
    return hasPathSum(root.left, target)
        || hasPathSum(root.right, target);
}

A useful rule of thumb: whenever the problem description mentions leaf nodes, you need to explicitly check for leaf nodes and terminate the recursion there.

Terminating recursion at leaf nodes
Terminating recursion at leaf nodes

Summary

Most binary tree problems are solved using recursion. When tackling binary tree problems, the steps to follow are:

  1. Determine whether the problem can be decomposed into subproblems, and what those subproblems should look like
  2. Decide whether to use preorder or postorder traversal
  3. Handle details such as null pointers and leaf nodes

Here are some related problems — only those closely related to this article’s example are listed:

Binary trees are a problem category with many patterns and techniques. What we’ve discussed here is only the simplest class of problems. Future articles will cover more advanced binary tree topics, including using global variables during traversal, iterative traversal, and more — stay tuned.