Island Problems: DFS on Grid Structures

2902 words

The example problems in this article are from the LeetCode “Island Problems” series:

The DFS (Depth-First Search) problems we’re familiar with are typically performed on tree or graph structures. But the DFS problems we’ll discuss today operate on a “grid” structure. Island problems are classic representatives of this type of grid DFS problem. Traversing a grid is somewhat more complex than traversing a binary tree, and without the right approach, DFS code can easily become long and convoluted.

This article uses island problems as examples to demonstrate general DFS approaches for grid-based problems and how to keep the code concise. The main topics include:

  • Basic properties of grid-based problems
  • Methods and techniques for DFS traversal on grids
  • Solutions to three island problems
  • Related problems

DFS Traversal Methods for Grid-Based Problems

Basic Concepts of Grid Problems

Let’s first clarify how the grid structure in island problems is defined, so we have a common foundation for our discussion.

A grid problem consists of an m×nm \times n grid of small squares, where each square is considered adjacent to its four neighbors above, below, left, and right. The task is to perform some kind of search on such a grid.

Island problems are a classic type of grid problem. Each cell contains either 0 or 1. We treat cells with 0 as ocean cells and cells with 1 as land cells. Adjacent land cells connect to form an island.

Island problem example
Island problem example

With this setup, various island problem variants arise, including counting islands, calculating area, computing perimeters, and so on. However, most of these problems can be solved using DFS traversal.

Basic Structure of DFS

The grid structure is slightly more complex than a binary tree structure — it’s essentially a simplified graph structure. To write DFS traversal on a grid well, we should first understand DFS traversal on a binary tree, then draw analogies to write DFS traversal on a grid. A typical binary tree DFS traversal looks like this:

void traverse(TreeNode root) {
    // Check base case
    if (root == null) {
        return;
    }
    // Visit two adjacent nodes: left child, right child
    traverse(root.left);
    traverse(root.right);
}

As we can see, binary tree DFS has two key elements: “visiting adjacent nodes” and “checking the base case”.

The first element is visiting adjacent nodes. The adjacent nodes of a binary tree are straightforward — just the left child and right child. A binary tree is itself a recursively defined structure: a binary tree’s left subtree and right subtree are also binary trees. So our DFS traversal simply needs to recursively call on the left and right subtrees.

The second element is checking the base case. Generally, the base case for binary tree traversal is root == null. This condition actually carries two meanings: on one hand, it means the subtree pointed to by root is empty and we don’t need to traverse further. On the other hand, returning promptly when root == null prevents null pointer exceptions in the subsequent root.left and root.right operations.

For DFS on a grid, we can directly reference binary tree DFS and derive the two key elements for grid DFS:

First, how many adjacent nodes does a cell in the grid have? The answer is four: up, down, left, and right. For cell (r, c) (where r and c represent the row and column coordinates respectively), the four adjacent cells are (r-1, c), (r+1, c), (r, c-1), and (r, c+1). In other words, the grid structure is “four-branching”.

Four adjacent cells in a grid structure
Four adjacent cells in a grid structure

Second, what is the base case for grid DFS? Drawing from the binary tree base case, it should be cells where we don’t need to continue traversal and where grid[r][c] would cause an array index out-of-bounds exception — in other words, cells that are outside the grid boundaries.

Base case for grid DFS
Base case for grid DFS

This might seem slightly counterintuitive — can coordinates temporarily go outside the grid boundaries? I call this approach “act first, ask questions later” — regardless of which cell we’re currently on, we take a step in all four directions first, then check if we’ve gone out of the grid boundaries and return if so. This is the same as binary tree traversal: we make the recursive call first, then return when we find root == null.

With this, we arrive at the framework code for grid DFS traversal:

void dfs(int[][] grid, int r, int c) {
    // Check base case
    // If coordinates (r, c) are outside the grid, return immediately
    if (!inArea(grid, r, c)) {
        return;
    }
    // Visit the four adjacent nodes: up, down, left, right
    dfs(grid, r - 1, c);
    dfs(grid, r + 1, c);
    dfs(grid, r, c - 1);
    dfs(grid, r, c + 1);
}

// Check whether coordinates (r, c) are within the grid
boolean inArea(int[][] grid, int r, int c) {
    return 0 <= r && r < grid.length 
        	&& 0 <= c && c < grid[0].length;
}

How to Avoid Revisiting Cells

The biggest difference between grid DFS and binary tree DFS is that during traversal, we may encounter previously visited nodes. This is because the grid structure is essentially a “graph” — we can think of each cell as a node in the graph, with each node having four edges going up, down, left, and right. When traversing a graph, it’s natural to encounter nodes that have already been visited.

At that point, DFS may keep “going in circles” and never stop, as shown in the figure below:

DFS traversal may go in circles (animated)
DFS traversal may go in circles (animated)

How do we avoid such repeated traversal? The answer is to mark cells that have already been visited. Taking island problems as an example, we need to perform DFS traversal on all land cells with value 1. Each time we pass through a land cell, we change its value to 2, so when we encounter a 2, we know this cell has already been visited. In other words, each cell can have one of three values:

  • 0 — ocean cell
  • 1 — land cell (not yet visited)
  • 2 — land cell (already visited)

We add the statements to avoid repeated traversal to our framework code:

void dfs(int[][] grid, int r, int c) {
    // Check base case
    if (!inArea(grid, r, c)) {
        return;
    }
    // If this cell is not an island, return immediately
    if (grid[r][c] != 1) {
        return;
    }
    grid[r][c] = 2; // Mark the cell as "already visited"
    
    // Visit the four adjacent nodes: up, down, left, right
    dfs(grid, r - 1, c);
    dfs(grid, r + 1, c);
    dfs(grid, r, c - 1);
    dfs(grid, r, c + 1);
}

// Check whether coordinates (r, c) are within the grid
boolean inArea(int[][] grid, int r, int c) {
    return 0 <= r && r < grid.length 
        	&& 0 <= c && c < grid[0].length;
}
Marking visited cells
Marking visited cells

With this, we have a general DFS traversal method for island problems and various grid problems alike. The example problems discussed below only require minor modifications to this DFS traversal framework.

Tip:

In some solutions, you might see “already visited land cells” marked as 0, the same value as ocean cells, sometimes called the “land sinking method” — after visiting a land cell, it “sinks” into the ocean. While this approach may seem clever, it actually has significant risks because we can no longer distinguish between “ocean cells” and “already visited land cells.” If the problem is slightly more complex, this can easily lead to bugs.

Solutions to Island Problems

Once you understand the DFS traversal method for grid structures, island problems become straightforward. Let’s look at how to solve each of the three problems using DFS traversal.

Example 1: Max Area of Island

LeetCode 695. Max Area of Island (Medium)

Given a non-empty 2D array grid of 0’s and 1’s, an island is a group of adjacent 1’s (representing land), where “adjacent” means horizontally or vertically adjacent. You may assume all four edges of the grid are surrounded by 0’s (representing ocean).

Find the maximum area of an island in the given 2D array. If there is no island, return 0.

This problem simply requires performing DFS traversal on each island and calculating the area of each one. The method for calculating island area is also straightforward — as shown in the code below, we add one to the area for each cell we visit.

int area(int[][] grid, int r, int c) {  
    return 1 
        + area(grid, r - 1, c)
        + area(grid, r + 1, c)
        + area(grid, r, c - 1)
        + area(grid, r, c + 1);
}

The complete solution code is as follows:

public int maxAreaOfIsland(int[][] grid) {
    int res = 0;
    for (int r = 0; r < grid.length; r++) {
        for (int c = 0; c < grid[0].length; c++) {
            if (grid[r][c] == 1) {
                int a = area(grid, r, c);
                res = Math.max(res, a);
            }
        }
    }
    return res;
}

int area(int[][] grid, int r, int c) {
    if (!inArea(grid, r, c)) {
        return 0;
    }
    if (grid[r][c] != 1) {
        return 0;
    }
    grid[r][c] = 2;
    
    return 1 
        + area(grid, r - 1, c)
        + area(grid, r + 1, c)
        + area(grid, r, c - 1)
        + area(grid, r, c + 1);
}

boolean inArea(int[][] grid, int r, int c) {
    return 0 <= r && r < grid.length 
        	&& 0 <= c && c < grid[0].length;
}

Example 2: Making A Large Island

LeetCode 827. Making A Large Island (Hard)

On a 2D map, 0 represents ocean and 1 represents land. We may change at most one 0 (ocean) to 1 (land). After the land reclamation, what is the maximum area of an island on the map?

This problem is an upgraded version of the max area of island problem. Now we have the ability to reclaim land — we can turn one ocean cell into a land cell, potentially connecting two islands into one. So after land reclamation, what’s the largest possible island we can create?

The general idea isn’t hard to come up with: first, we calculate the area of all islands and mark each cell with its island’s area. Then we search for the ocean cell whose two adjacent islands have the largest combined area. For example, in the figure below, the ocean cell in the red box is adjacent to islands on its top and left sides. We can calculate that after converting it to land, the connected island area would be 7+1+2=107+1+2=10.

An ocean cell connecting two islands
An ocean cell connecting two islands

However, this approach can run into a problem. As shown in the figure below, the ocean cell in the red box is adjacent to islands on its top and left sides — but does the connected island area equal 7+1+77+1+7? Obviously not. Both 7’s come from the same island, so the island area after land reclamation should only be 7+1=87+1 = 8.

An ocean cell adjacent to the same island on two sides
An ocean cell adjacent to the same island on two sides

As you can see, for the algorithm to be correct, we need to be able to distinguish whether two adjacent 7’s of an ocean cell come from the same island. Therefore, instead of marking cells with island areas, we should mark them with island indices, and use a separate array to record each island’s area, as shown in the figure below. This way we can detect that the ocean cell in the red box has “two” adjacent islands that are actually the same one.

Marking each island with its index
Marking each island with its index

As we can see, this problem actually requires two passes of DFS on the grid: the first DFS pass traverses land cells, calculating each island’s area and labeling the islands; the second DFS pass traverses ocean cells, examining the land cells adjacent to each ocean cell.

That’s the basic idea for this problem. The actual code has some additional details to watch out for, but they’re not closely related to the main theme of this article. I’ll leave it to you to think about how to convert the above approach into code.

Example 3: Island Perimeter

LeetCode 463. Island Perimeter (Easy)

Given a 2D grid map of 0’s and 1’s, where 1 represents land and 0 represents ocean. Grid cells are connected horizontally and vertically (not diagonally). The entire grid is completely surrounded by water, but there is exactly one island (one or more connected land cells form the island).

There are no “lakes” within the island (a “lake” is water inside the island that is not connected to the water surrounding the island). Each cell is a square with side length 1. Calculate the perimeter of the island.

Problem example
Problem example

To be honest, DFS isn’t the optimal approach for this problem. For island perimeters, a direct mathematical method would be easier. However, this problem serves as an excellent example for understanding the DFS traversal process — just follow along and you’ll see.

Let’s revisit the basic framework for grid DFS traversal:

void dfs(int[][] grid, int r, int c) {
    // Check base case
    if (!inArea(grid, r, c)) {
        return;
    }
    // If this cell is not an island, return immediately
    if (grid[r][c] != 1) {
        return;
    }
    grid[r][c] = 2; // Mark the cell as "already visited"
    
    // Visit the four adjacent nodes: up, down, left, right
    dfs(grid, r - 1, c);
    dfs(grid, r + 1, c);
    dfs(grid, r, c - 1);
    dfs(grid, r, c + 1);
}

// Check whether coordinates (r, c) are within the grid
boolean inArea(int[][] grid, int r, int c) {
    return 0 <= r && r < grid.length 
        	&& 0 <= c && c < grid[0].length;
}

We can see that the dfs function returns immediately in several cases:

  • !inArea(grid, r, c) — coordinates (r, c) are outside the grid boundaries, which is the “act first, ask questions later” scenario I mentioned
  • grid[r][c] != 1 — the current cell is not an island cell, which further breaks down into two cases:
    • grid[r][c] == 0 — the current cell is an ocean cell
    • grid[r][c] == 2 — the current cell is an already-visited land cell

So what does this have to do with island perimeters? In fact, the perimeter of an island is the sum of all its “edges,” and these edges correspond exactly to the positions where the dfs function returns during DFS traversal. Looking at the problem example, we can classify the edges of the island’s perimeter into two categories, as shown in the figure below. Yellow edges are perimeter segments adjacent to grid boundaries, and blue edges are perimeter segments adjacent to ocean cells.

Classifying island perimeter edges into two categories When our dfs function returns because “coordinates (r, c) are outside the grid boundaries,” it has actually crossed a yellow edge; when the function returns because “the current cell is an ocean cell,” it has actually crossed a blue edge. This is how we connect island perimeters with DFS traversal, and our solution code practically writes itself:

public int islandPerimeter(int[][] grid) {
    for (int r = 0; r < grid.length; r++) {
        for (int c = 0; c < grid[0].length; c++) {
            if (grid[r][c] == 1) {
                // The problem guarantees only one island, so computing one is enough
                return dfs(grid, r, c);
            }
        }
    }
    return 0;
}

int dfs(int[][] grid, int r, int c) {
    // Returning because "coordinates (r, c) are outside the grid" corresponds to a yellow edge
    if (!inArea(grid, r, c)) {
        return 1;
    }
    // Returning because "the current cell is an ocean cell" corresponds to a blue edge
    if (grid[r][c] == 0) {
        return 1;
    }
    // Returning because "the current cell is an already-visited land cell" has nothing to do with the perimeter
    if (grid[r][c] != 1) {
        return 0;
    }
    grid[r][c] = 2;
    return dfs(grid, r - 1, c)
        + dfs(grid, r + 1, c)
        + dfs(grid, r, c - 1)
        + dfs(grid, r, c + 1);
}

// Check whether coordinates (r, c) are within the grid
boolean inArea(int[][] grid, int r, int c) {
    return 0 <= r && r < grid.length 
        	&& 0 <= c && c < grid[0].length;
}

Summary

After comparing the solution code for all three examples, you’ll notice that grid problem code is remarkably similar across problems. This category of problems is really the “easy once you know how” type. Once you understand the basic traversal methods for trees and graphs, and pick up a few small techniques, mastering grid DFS traversal is not difficult at all.

Island problems are classic representatives of grid problems. After solving a few island problems, you’ll find that other problems are essentially the same — for example, LeetCode 130. Surrounded Regions replaces the island’s 1’s and 0’s with the letters O and X, but the underlying approach is exactly the same.

However, some grid problems can’t be solved with DFS and require BFS traversal instead. In the next article, I’ll cover BFS application scenarios and coding techniques — stay tuned!