Trees - Programming Problems in Java: A Primer for The Technical Interview (2014)

Programming Problems in Java: A Primer for The Technical Interview (2014)

Chapter 3. Trees

We continue our review of data types with a discussion on binary search trees.

As a data structure, the binary tree overcomes the inefficiencies of retrieval in a linked-list. However, the logarithmic running time for retrieval comes at the cost of increasing the time for insertion and deletion from constant to logarithmic time. Algorithms for binary tree operations are also more complex to write than the analogous algorithms for linked-lists. For instance, for operations where a linked-list used iteration, the operation on a binary tree may rely upon recursion. In this section we will review the basic operations on binary search trees, and begin to solve more complex interview questions.

3.1 Definition

A binary tree is composed of tree nodes. Tree nodes consist of a data member variable and the address of two other nodes. These other nodes are called children, and are distinguished as left and right.


Listing 3.1: Binary Tree Node


class Node {
int value;
Node left;
Node right;

Node(int value) {
this.value = value;
}
}


As was the case with a linked-list, a binary tree is a referred to by a reference to a designated node. This node is called the root. Empty trees are those whose root isnull.

A binary search tree is a binary tree that maintains a sort order between the parent and children nodes. The order invariant is whenever a node has a left or right child,node.value > left.valueandnode.value <= right.value. A binary search tree is balanced when for every node, the depth of the sub-trees from the left child and right child differ by at most one. Most of the complexity bounds discussed below hold only for balanced binary search trees

In this section, we will focus our attention on operations and algorithms concerning a binary search trees. We further assume that the binary search trees are balanced. Finally, as before we assume that the tree operations below are responsible for memory management but the caller is responsible for updating the root.

3.2 Operations

The basic operations of a binary search tree areinsert,find, andremove. In the following we will also look at the operations offind_parentandfind_successor, as we will see that they are necessary precursors for node removal.

3.2.1 Insertion

Insertion in a binary search tree is a logarithmic time operation. This is because to maintain the order invariant, an insertion requires search to identify which node in a tree will become the parent of the inserted node. After a parent node is identified by a find operation, only a constant number of operations are needed to complete insertion.


Listing 3.2: Insert into a Binary Search Tree


Node insert(Node root, int val) {
if (null == root) {
return new Node(val);
}
if (val < root.value) {
root.left = insert(root.left, val);
} else {
root.right = insert(root.right, val);
}
return root;
}


Also notice the use of recursion during the retrieval phase of insertion which is characteristic of binary search trees. Whereas algorithms on linked-lists are usually iterative, operations on trees are often recursive.

3.2.2 Find

The utility of a binary search tree is derived from the logarithmic time search guarantee of the data structure. This guarantee is only for balanced binary search trees. Search is the motivation for use of a binary tree, and as such is the fundamental operation on binary search trees.


Listing 3.3: Find in a Binary Search Tree


Node find(Node root, int val) {
if (null == root || root.value == val) {
return root;
}
if (val < root.value) {
return find(root.left, val);
}
return find(root.right, val);
}


Note that the logarithmic time for search is guaranteed because when we branch left or right, we are decreasing the search space by a constant fraction. Each branch cuts the search space in half. We can only divide the search space by two a logarithmic number of times before we are left with a single element.

3.2.3 Removal

As with insertion, removal of a value from a binary search tree has logarithmic running time. The search is required for more than the care needed to maintain the order invariant. Indeed, search is required to find the next value in the binary search tree that is compatible with the position of the node being removed. When this is found, that value is replaced and the node found is removed.

In order to carefully write the algorithm for removal, we will first define two ancillary functionsfind_parentandfind_successor. Finding the parent of a node is a straightforward operation. It is simply a modified find operation in which the parent node is returned, not the child. The solution below returnsnullif the node is the root.


Listing 3.4: Find Parent in a Binary Search Tree


Node find_parent(Node root, Node target) {
if (root == target) {
return null;
}
while (root.left != target
&& root.right != target) {
if (target.value < root.value) {
root = root.left;
} else {
root = root.right;
}
}
return root;
}


The find parent operation is used to find the successor of a node in a binary search tree. Every node in binary search tree has a data member with a comparison operation. Suppose the data in a binary tree is sorted in increasing order. The successor of a target node is a node in a binary search binary tree whose data would immediately follow the target node’s data in a sorted list of all the data members associated with the binary tree. The successor plays a role in removal from a binary search tree in that it can replace the target node and maintain the order invariant.


Listing 3.5: Find Successor in a Binary Search Tree


Node find_successor(Node root, Node target) {
Node successor = target.right;
if (null != successor) {
while (null != successor.left) {
successor = successor.left;
}
return successor;
}
do {
if (null != successor) {
target = successor;
}
successor = find_parent(root, target);
} while (null != successor
&& successor.right == target);
return successor;
}


Removal from a binary search tree has a number of cases that need to be considered to maintain the order invariant. Removing a leaf node is simple, since the node has no children it can be deallocated and the parent updated. If a node has a single child, then that child can replace the node in the binary search tree and the node deallocated. In the final case when a node has two children, we want to replace the node with the value of its successor. Doing so would maintain the order invariant at the node’s location in a binary search tree. We then need to recursively remove the successor from the tree to avoid having a duplicate node. Altogether, this provides the following algorithm.


Listing 3.6: Remove a Node from a Binary Search Tree


Node remove(Node root, Node target) {
if (null != target.left
&& null != target.right) {
Node next = find_successor(root, target);
int data = next.value;
remove(root, next);
target.value = data;
return root;
}
if (null != target.left
|| null != target.right) {
Node temp = (null != target.left)
? target.left
: target.right;
target.value = temp.value;
target.left = temp.left;
target.right = temp.right;
return root;
}
if (root == target) {
return null;
}
Node parent = find_parent(root, target);
if (parent.left == target) {
parent.left = null;
} else {
parent.right = null;
}
return root;
}


3.3 Tree properties

Tree algorithms rely on many properties of a tree. For instance a tree’s size, depth and whether or not it is balanced. Below we provide algorithms for calculating commonly asked properties of a binary tree.

3.3.1 Size

The size of a binary tree is its number of nodes. Notice that from any node, the size of the sub-tree rooted at that node is one more than the sum of the size of the sub-trees rooted at its children. We can easily calculate the size of a binary tree recursively.


Listing 3.7: Size of a Binary Tree


int size(Node root) {
if (null == root) {
return 0;
}
return 1 + size(root.left) + size(root.right);
}


As an exercise, consider how to calculate the size of a binary tree without recursion. In this alternative we can count the number of nodes while iterating through the tree.


Listing 3.8: Size of a Binary Tree Iterative


int size_iterative(Node root) {
if (null == root) {
return 0;
}
int count = 0;
Queue<Node> q = new LinkedList<Node>();
q.add(root);
while (!q.isEmpty()) {
Node front = q.poll();
count++;
if (null != front.left) {
q.add(front.left);
}
if (null != front.right) {
q.add(front.right);
}
}
return count;
}


The traversal used above is called a breadth first. We will examine it in a more general setting later.

3.3.2 Depth

The depth of a binary tree is the length of the longest simple path from the root to a leaf node. The depth from a node is one more than the depth of that node’s deepest sub-tree, so there is an immediate recursive solution.


Listing 3.9: Depth of a Binary Tree


int depth(Node root) {
if (null == root) {
return 0;
}
return 1 + Math.max(depth(root.left),
depth(root.right));
}


Let’s consider an iterative solution. We will again traverse the tree, but this time repeatedly to the leaf nodes. In order to determine if we are branching left or right in a depth first enumeration, we keep track of the list of visited nodes in a set. At any point in the algorithm below, the stack contains the path from the root to the node on top of the stack, so the current depth is equal to the size of the stack. The max depth of any leaf is equal to the depth of the binary search tree.


Listing 3.10: Iterative Depth of a Binary Tree


int depth_iterative(Node root) {
if (null == root) {
return 0;
}
int max_depth = 0;
Set<Node> visited = new HashSet<Node>();
Stack<Node> s = new Stack<Node>();
s.push(root);
while (!s.isEmpty()) {
Node top = s.peek();
if (null != top.left
&& !visited.contains(top.left)) {
s.push(top.left);
} else if (null != top.right
&& !visited.contains(top.right)) {
s.push(top.right);
} else {
visited.add(top);
max_depth = Math.max(max_depth, s.size());
s.pop();
}
}
return max_depth;
}


The traversal used above is called a depth first traversal. As with the breadth first traversal, we will examine it in a more general setting later.

3.3.3 Balance

A binary tree is balanced if for every node the difference between the depth of its left and right sub-trees is at most one. To determine whether or not a binary tree is balanced we must evaluate whether or not every sub-tree is balanced. Doing so naively would require multiple calculations of depth.


Listing 3.11: Determine if a Binary Tree is Balanced Naive


boolean balanced_naive(Node root) {
if (null == root) {
return true;
}
if (!balanced(root.left)
|| !balanced(root.right)) {
return false;
}
int left_depth = depth(root.left);
int right_depth = depth(root.right);
return Math.abs(right_depth - left_depth) <= 1;
}


However, this solution is wasteful. We are repeatedly recalculating the depth, and in an unbalanced tree this is a quadratic solution.

A linear solution would require modifying the recursion to return the depth of the sub-tree as well as if it is balanced or not. The run time gains come from reusing the partial calculations made in determining if the left and right sub-trees are balanced.


Listing 3.12: Determine if a Binary Tree is Balanced


boolean balanced_depth(Node root, int[] depth) {
if (null == root) {
depth[0] = 0;
return true;
}
int[] left = new int[1];
int[] right = new int[1];
if (!balanced_depth(root.left, left)) {
return false;
}
if (!balanced_depth(root.right, right)) {
return false;
}
depth[0] = 1 + Math.max(left[0], right[0]);
if (Math.abs(left[0] - right[0]) > 1) {
return false;
}
return true;
}

boolean balanced(Node root) {
int[] depth = new int[1];
return balanced_depth(root, depth);
}


In this solution we use a helper method for the recursive depth calculation. Note that sinceJavadoes not allow primitives or immutable objects by reference, we wrap the primitive integer within anint[]to pass the depth by reference.

3.3.4 Ancestry

The ancestry of a node in a binary tree is the path from the root to the node. A mathematical property of a tree is that there is exactly one simple path between any two nodes in a tree. The calculation of ancestry in a binary search tree is extremely efficient. We need only store the search path used in the find operation.


Listing 3.13: Find Path from Root to Node


boolean find_path(Node root,
Node target,
List<Node> path) {
if (target == null) {
return true;
}
while (null != root
&& (path.isEmpty()
|| path.get(path.size() - 1)
!= target)) {
path.add(root);
if (target.value < root.value) {
root = root.left;
} else {
root = root.right;
}
}
if (path.get(path.size() - 1) == target) {
return true;
} else {
path.clear();
return false;
}
}


Notice that in this solution, we provide the caller with more information than is necessary. If the node is found in the tree, then we return true. When we return false, the path provided is the ancestry of the node if it were to be inserted in the binary search tree.

Ancestry in a binary tree requires more computation, but could be accomplished with a depth first search.

3.3.5 Lowest common ancestor

The lowest common ancestor of two nodes in a binary tree is the last common node in the paths from the root to each node. This node can be found by using our previous implementation offind_path. Having calculated the path from the root to a node, we can compare the two paths until they diverge.


Listing 3.14: Lowest Common Ancestor from Path to Root


Node lca_from_path(Node root, Node x, Node y) {
List<Node> x_path = new ArrayList<Node>();
List<Node> y_path = new ArrayList<Node>();
find_path(root, x, x_path);
find_path(root, y, y_path);
Node lca = null;
Iterator<Node> x_path_iterator =
x_path.iterator();
Iterator<Node> y_path_iterator =
y_path.iterator();
Node current_x, current_y;
while (x_path_iterator.hasNext()
&& y_path_iterator.hasNext()) {
current_x = x_path_iterator.next();
current_y = y_path_iterator.next();
if (current_x != current_y) {
break;
}
lca = current_x;
}
return lca;
}


This solution works for binary trees, but in binary search trees there is a solution that does not require pre-generating either path from the root to the nodes. If we were to consider calculating the paths in tandem, a node would be in both paths only if both parameters were left children or right children of the target node. This observation leads to a recursive solution.


Listing 3.15: Lowest Common Ancestor Recursive


Node lca_recursive(Node root, Node x, Node y) {
if (root.value == x.value
|| root.value == y.value) {
return root;
}
if ((x.value < root.value
&& y.value >= root.value)
|| (y.value < root.value
&& x.value >= root.value)) {
return root;
}
if (x.value < root.value) {
return lca_recursive(root.left, x, y);
}
return lca_recursive(root.right, x, y);
}


An iterative solution to this problem is easier than iterative solutions to calculating size or paths. Since determining the least common ancestor does not require keeping state, we have no need of an ancillary queue or stack.


Listing 3.16: Lowest Common Ancestor Iterative


Node lca_iterative(Node root, Node x, Node y) {
if (null != x
&& null != y
&& x.value > y.value) {
Node temp = y;
y = x;
x = temp;
}
while (null != root
&& root.value != x.value
&& root.value != y.value
&& (y.value < root.value
|| x.value >= root.value)) {
if (y.value < root.value) {
root = root.left;
} else {
root = root.right;
}
}
return root;
}


3.4 Reconstruction

This final discussion of trees deals with the question of how to reconstruct a binary tree from a traversal. This question was a favorite of interviewers at Google for some time. We begin with a discussion of the types of traversals available for a binary tree.

3.4.1 Traversal

As we have seen, many algorithms on binary trees require enumeration of the nodes instead of a search. An enumeration of the nodes in which each data member is visited is called a traversal. We have seen two iterative traversals of a tree above, breadth-first traversal was used to iteratively calculate the size of a tree and depth-first traversal was used to iteratively calculate the depth of a tree.

Traversal of a binary tree is easily accomplished by recursion. At any node recursively visit the children nodes. It is the order in which nodes are visited which differentiates the three binary tree traversals. These are in-order traversal, post-order traversal, and pre-order traversal.

An in-order traversal guarantees that the data members of a binary search tree will be visited in their sort order. This is accomplished by first recursing left, visiting the current node, and then recursing right.


Listing 3.17: In-Order Traversal


void inorder(Node root) {
if (null == root) {
return;
}
inorder(root.left);
System.out.println(root.value);
inorder(root.right);
}


A post-order traversal evaluates a current node’s data value only after recursively visiting both children.


Listing 3.18: Post-Order Traversal


void postorder(Node root) {
if (null == root) {
return;
}
postorder(root.left);
postorder(root.right);
System.out.println(root.value);
}


A pre-order traversal is the opposite of a post-order traversal. A pre-order traversal evaluates a current nodes data value prior to recursively visiting either children.


Listing 3.19: Pre-Order Traversal


void preorder(Node root) {
if (null == root) {
return;
}
System.out.println(root.value);
preorder(root.left);
preorder(root.right);
}


Traversals are useful in serializing a tree to a list of values, and every specific type of traversal produces a deterministic list for a given tree. However suppose we have a list containing a traversal of a general binary tree. Even if that list contains only two values, we can reconstruct multiple trees from an in-order traversal. The reason is that the traversal does not specify which element is the root. In the case where the list contains 3 or more values, pre-order and post-order have ambiguity. Pre-order traversals are ambiguous as to whether or not the second element is a right or left child of the root. Post-order traversals are ambiguous as to whether or not the second to last element is a right or left child of the root. Thus, no single traversal is enough to reconstruct a binary tree. This leads us to the next problem.

3.4.2 Reconstructing a tree from a traversal

We show how to reconstruct a binary tree given any two distinct recursive traversals. The key to solving this problem is recognizing the root from one traversal, and then dividing the traversals into left and right sub-trees using the other. In the following, we provide an algorithm for tree reconstruction from an in-order traversal and a pre-order traversal.

In a pre-order traversal, the first element is always the root of the next sub-tree. The ambiguity is whether the next value is a left or right child of the root. In the in-order traversal, given an index in the traversal, all values to the left are in the left sub-tree and all values to the right are in the right sub-tree. In the solution below, we first extract the root from the pre-order traversal. We then find the index of this value in the in-order traversal. If the value does not appear at the beginning of the in-order traversal there is a left sub-tree. If it does not appear at the end, there is a right sub-tree. In either case we extract from each traversal the subsequences corresponding to a sub-tree, and recurse.


Listing 3.20: Reconstruction of a Binary Tree


Node reconstruct_tree(List<Integer> inorder,
List<Integer> preorder) {
Node node = null;
int inorder_pos;
int preorder_pos;
List<Integer> left_preorder;
List<Integer> right_preorder;
List<Integer> left_inorder;
List<Integer> right_inorder;
if ((preorder.size() != 0)
&& (inorder.size() != 0)) {
node = new Node(preorder.get(0));
inorder_pos = inorder.indexOf(preorder.get(0));
left_inorder = inorder.subList(0, inorder_pos);
right_inorder = inorder.subList(
inorder_pos + 1,
inorder.size());
preorder_pos = left_inorder.size();
left_preorder = preorder.subList(
1,
preorder_pos + 1);
right_preorder = preorder.subList(
preorder_pos + 1,
preorder.size());
node.left = reconstruct_tree(left_inorder,
left_preorder);
node.right = reconstruct_tree(right_inorder,
right_preorder);
}
return node;
}


Similar algorithms can be rewritten for in-order and post-order, as well as pre-order and post-order.