Lists - Programming Problems in Java: A Primer for The Technical Interview (2014)

Programming Problems in Java: A Primer for The Technical Interview (2014)

Chapter 2. Lists

We begin our reference section on data types with the linked-list.

A linked-list is a basic data structure often used as the underlying data type of other structures such as stacks and queues. Linked-lists benefit from constant time insertion and removal, but come at the cost of linear time search. This is because linked-lists do not support random access; hence, search and removal require a full scan of the linked-list.

Common programming problems for lists are often exercises in identifying boundary cases, clever traversal techniques, and judicious use of temporary variables. In this section we will implement a linked-list and its operations, and then solve many of the basic questions about linked-lists.

2.1 Definition

A linked-list is a structure composed of list nodes. A list node contains some data for the node, as well as a reference to the next node in the list. In the following sections, we will use the definition below for a node.


Listing 2.1: Linked-List Node


class Node {
Node next;
int data;

Node(int data) {
this(null, data);
}

Node(Node head, int data) {
this.next = head;
this.data = data;
}
}


The first node, referred to as thehead, identifies the list. A reference to the head is enough to identify the entire list. Empty lists are identified as those whose head isnull.

2.2 Operations

The basic operations of a linked-list areinsert,find, andremove. More complex operations can be built up from these operations. In the next section we implement each of the basic operations.

2.2.1 Insertion

Insertion in a linked-list is constant time because the data type supports only a push-front operation. That is the current head of the list is replaced with a new node with the specified value. There are two special cases for insertion, inserting into an empty list and inserting into the head of a list. However, with a complete listing of the constructors of the data structure, this complexity is easily removed at the cost of requiring an update to head with the result of the insertion.


Listing 2.2: Insert into a Linked-List


Node insert(Node head, int data) {
return new Node(head, data);
}


Interviews often ask for insertion into a sorted list. A clever question for a candidate to ask is whether or not a list was the right choice for a container if the data needs to remain sorted. But for answering this question, the boundary cases are inserting into the head or tail of the list. To insert into a sorted linked-list, we look for the predecessor of where the data should be inserted. When found, we construct a new node, and splice it into the list by updating the propernextmember variables.


Listing 2.3: Insert into a Sorted Linked-List


Node insert_sorted(Node head, int data) {
if (head == null || data <= head.data) {
return new Node(head, data);
}
Node current = head;
while (null != current.next
&& current.next.data < data) {
current = current.next;
}
if (null == current.next) {
current.next = new Node(data);
} else {
current.next = new Node(current.next, data);
}
return head;
}


2.2.2 Find

Because random access is not supported, the only traversal supported considers each node in turn. Hence search in a linked-list is a linear time operation. As such, when retrieving a node of a linked-list by key, we must scan until a target has been located or every node has been considered. This is implemented in thefind operation.


Listing 2.4: Find in a Linked-List


Node find(Node head, int value) {
while (null != head && head.data != value) {
head = head.next;
}
return head;
}


2.2.3 Removal

Removal of data from a linked-list takes a reference to a node as an identifier. The boundary cases are removal of the head or tail, and deletion from a single-item list. Removal of the head requires special care, because lists are identified by their head element. If that is removed, then the head needs to be updated. A simpler method is to copy data of the successor into the head and remove the successor. However this trick requires special casing the single item list. As above, we require special handling for the head after calling remove. Removal from the tail requires maintenance of the tail predecessor.


Listing 2.5: Remove from a Linked-List


Node remove(Node head, Node target) {
while (head != null && head == target) {
head = head.next;
}
if (null == head) {
return null;
}
Node current = head;
while (null != current.next) {
if (current.next == target) {
current.next = current.next.next;
}
current = current.next;
}
return head;
}


Note that as the target of removal was a parameter, we can assume that it was notnulland the list was not empty. For this reason we did not verify that the head of the list was notnull. We will follow this pattern in the subsequent examples as well.

2.3 Circular linked-lists

A common modification to linked-lists is called a circular linked-list. The structure defining a node in a circular linked-list is identical to that of a node in a simple linked-list. However, the update and remove rules are modified to maintain that the tail of the list points back to the head.

Insertion into a circular linked-list is similar to that in a standard list except the head element must reference itself in a list of length 1. We also lose the idea of simply replacing the head with a new reference, since the tail refers to the head.


Listing 2.6: Insert into a Circular Linked-List


Node circular_insert(Node head, int data) {
Node insertion = new Node(data);
if (null == head) {
insertion.next = insertion;
return insertion;
}
if (head.data == head.next.data) {
head.next = insertion;
insertion.next = head;
return insertion;
}
int headData = head.data;
insertion.data = headData;
head.data = data;
insertion.next = head.next;
head.next = insertion;
return head;
}


Retrieval in a circular linked-list is almost as simple as that in a simple linked-list. Again, the entire list must be enumerated until the value has been found or every node has been visited. Unlike in a simple linked-list, termination of the search cannot rely on findingnull. Instead we must cache a reference to the head and use it to determine when the list has looped back.


Listing 2.7: Find in a Circular Linked-List


Node circular_find(Node head, int value) {
Node current = head;
while (null != current
&& current.data != value) {
current = current.next;
if (current.data == head.data) {
return null;
}
}
return current;
}


Removal from a circular linked-list is more complex than removal in a simple linked-list. As with insertion, this is because the tail element refers to the head element. To get around this, we do not remove the desired node, but instead shift all the elements of a list around it. This avoids the problems of updating the value of the predecessor’snext member variable.

Removal also requires checking the boundary case where there is but a single element in the circular linked-list. Then we must clear the list entirely.


Listing 2.8: Remove from a Circular Linked-List


Node circular_remove(Node head, Node target) {
if (null == head
|| (head == target && head.next == target)) {
return null;
}
Node next = target.next;
target.data = next.data;
target.next = next.next;
return target;
}


2.4 Doubly linked-lists

Doubly linked-lists are lists in which each node has both a reference to the next element in the list as well as the previous element in the list. A simple doubly linked-list has a head withnullas itsprevmember, and a tail withnullas itsnextmember. Doubly linked-lists can also be circular, with single element lists having both the head and tail referencing the same node. In the following examples we implement circular doubly linked-lists.

The constructor for a doubly linked-list is only a simple modification of a simple linked-listNodewhere aprevmember variable is added.


Listing 2.9: Doubly Linked-List Node


class DllNode {
DllNode prev;
DllNode next;
int data;

DllNode(int data) {
this(null, null, data);
}

DllNode(DllNode prev, DllNode next, int data) {
this.prev = prev == null ? this : prev;
this.next = next == null ? this : next;
this.data = data;
}
}


Inserting and removing from a circular doubly linked-list integrates inserting into a standard linked-list and the semantics of a circular linked-list. The implementation below pushes the list forward and inserts a new head. This is done by first setting thenextandprev members appropriately, and then correcting the list structure before returning.


Listing 2.10: Insert into a Doubly Linked-List


DllNode insert(DllNode head, int data) {
if (null == head) {
return new DllNode(data);
}
DllNode insertion =
new DllNode(head.prev, head, data);
insertion.prev.next = insertion;
insertion.next.prev = insertion;
return insertion;
}


The find operation for a doubly linked-list is identical to that of the find operation for a circular linked-list, aside from the parameter and return type. The previous code will operate on doubly linked- lists with only that modifcation.


Listing 2.11: Find in a Doubly Linked-List


DllNode find(DllNode head, int value) {
DllNode current = head;
while (null != current
&& current.data != value) {
current = current.next;
if (current.data == head.data) {
return null;
}
}
return current;
}


Removal from a circular doubly linked-list is simpler than that of a simple linked-list. This is because each node of doubly linked-list has a reference to its previous element. This makes splicing a node out of a doubly linked-list trivial, where in a simple linked-list we needed to remove the next node and swap data values.


Listing 2.12: Remove from a Doubly Linked-List


DllNode remove(DllNode head, DllNode target) {
if (head != null
&& head.next == head
&& head == target) {
return null;
}
target.prev.next = target.next;
target.next.prev = target.prev;
if (head == target) {
head = target.next;
}
return head;
}


2.5 Selection

Since a linked-list is unordered and does not support random access, the question of finding the value at interesting positions within the list requires clever solutions. These are called selection problems. To being, consider finding the middle element of a linked-list. A straightforward method is to count the number of elements and then scan for the element at that index.


Listing 2.13: Find the Midpoint of a Linked-List by Counting


Node mid_by_counting(Node head) {
if (null == head) {
return head;
}
Node trailing = head;
int size = 0;
while (null != head) {
size++;
head = head.next;
}
int mid = size / 2;
while (mid-- != 0) {
trailing = trailing.next;
}
return trailing;
}


This operation obviously requires one and a half full scans of the list. But we can do better. Consider what happens when we move two indices through the list in tandem. Concretely, we move one index a single node at a time and another two nodes at a time. When the first index is at node x, the second reference is at position 2x. So when the second node reaches the end of the list the first reference is at the midpoint! So indeed, we can do this with a single scan.


Listing 2.14: Find the Midpoint of Linked-List


Node mid(Node head) {
Node trailing = head;
while (null != head) {
head = head.next;
if (null != head) {
head = head.next;
trailing = trailing.next;
}
}
return trailing;
}


This trick can be generalized to find any element at some fraction of the length of the list. So for any k we can find the element at position length∕k. But often the question is finding the element at the fixed position k from the end. We can find the node at position k from the end of the list by modifying the trick above in that we move two references in tandem, but give one a head start.


Listing 2.15: Find the Node kth from the End


Node kth_from_end(Node head, int k) {
Node trailing = head;
while (k-- > 0 && null != head) {
head = head.next;
}
while (null != head && null != head.next) {
head = head.next;
trailing = trailing.next;
}
return trailing;
}


2.6 Modifications

Along with selection, lack of random access gives rise to a number of problems that ask for modifications of a list. The most basic is the removal of all nodes matching a certain value. The only modification required of theremovemethod is to maintain the current scan position in order to continue iterating after a node has been removed. Maintaining a trailing reference and removing leading nodes that match the value can accomplish this. In this approach, care must be taken to remove all leading nodes that match, else successive matches will not be handled correctly. In the implementation, successive matches are handled by the innerwhileloop in the listing below.


Listing 2.16: Remove Values from a Linked-List


Node remove_values(Node head, int value) {
while (head != null && head.data == value) {
head = head.next;
}
if (null == head) {
return null;
}
Node current = head;
while (null != current.next) {
if (current.next.data == value) {
current.next = current.next.next;
}
current = current.next;
}
return head;
}


Only a bit more complex is the issue of reversing the order of the nodes in a linked-list. In a linked-list, this can be done without constructing a new list by maintaining the current scan position while simultaneously modifying the node previously scanned. As the algorithm iterates over the list, it is reversing the successor-predecessor relationships. After full iteration, the list is reversed.


Listing 2.17: Reverse a Linked-List


Node reverse(Node head) {
Node prev = null;
while (null != head) {
Node temp = head.next;
head.next = prev;
prev = head;
head = temp;
}
return prev;
}


Reversing a circular doubly linked-list can be done with the same technique. However as is usual with circular linked-lists we must keep track of the terminal condition.


Listing 2.18: Reverse a Doubly Linked-List


DllNode reverse(DllNode head) {
if (null == head) {
return null;
}
if (head.next == head.prev) {
return head;
}
DllNode tail = head;
do {
DllNode temp = tail.next;
tail.next = tail.prev;
tail.prev = temp;
tail = temp;
} while (tail.data != head.data);
return head.next;
}


2.7 Multiple lists

Many problems require working with multiple lists simultaneously.

Above, we saw how we can find the midpoint of a list by moving two references in tandem. Using the same idea we can solve the problem of comparing two lists with a single pass over each. Comparison is only difficult in that there is no random access. We use multiple references, one for each list in question.

We first check that both lists are either empty or not empty. Afterward, we take it in turns to compare each element.


Listing 2.19: Compare Linked-Lists


boolean equals(Node x, Node y) {
while (null != x && null != y) {
if (x.data != y.data) {
return false;
}
x = x.next;
y = y.next;
}
return null == x && null == y;
}


The listing above compares two lists, but can easily be generalized to compare an arbitrary number simultaneously.

The other common operation on multiple linked-lists is merging, in which two or more lists become one. Since linked-lists are dynamically bound, merging two linked-lists can be done without the allocation of any new memory. A naive method to merge is to append one list to another by iteratively walking one list and inserting each node into the other. This method requires as many writes as there are elements in the list.

We can do better. We need only append one list to another, as in the listing below. Appending in this manner is a single write operation, although one list must be read to the end.


Listing 2.20: Merge Two Linked-Lists


Node merge(Node h1, Node h2) {
if (null == h1 || null == h2) {
return (null != h1) ? h1 : h2;
}
Node head = h1;
while (null != h1.next) {
h1 = h1.next;
}
h1.next = h2;
return head;
}


2.8 Identification

The final set of problems for linked-lists involves the identification of list properties. Below we consider two problems: determining whether or not a list is a palindrome, and determining if it has self-loops.

A palindrome is a sequence that is the same both forward and backward. Deciding if a linked-list is a palindrome is an exercise in combining state information and selection. At first glance, one can be perplexed how to efficiently compare the first and last nodes, the second and second from last nodes, et cetera until the midpoint. Let’s first consider this question with a doubly linked-list. In this case we have forward and backward enumeration, and a solution is easily found. We iterate forward and backward until the two iterators cross.


Listing 2.21: Identify a Doubly Linked-List as a Palindrome


boolean is_palindrome(DllNode head) {
if (null == head
|| head.data == head.next.data) {
return true;
}
DllNode tail = head.prev;
do {
if (head.data != tail.data) {
return false;
}
head = head.next;
tail = tail.prev;
} while (head.data != tail.data
&& head.data != tail.next.data);
return true;
}


Returning to a simple linked-list, we found ourselves without a reverse iterator. But a single scan with extra memory makes this simpler. We can put the nodes in a stack, and then inspect the list back to front. This allows us to iterate in reverse as we unroll the stack.


Listing 2.22: Identify a Linked-List as a Palindrome


boolean is_palindrome(Node head) {
Node temp = head;
Stack<Node> s = new Stack<Node>();
while (null != temp) {
s.push(temp);
temp = temp.next;
}
while (!s.empty()) {
if (head.data != s.peek().data) {
return false;
}
s.pop();
head = head.next;
}
return true;
}


The most famous interview question regarding lists is the identification of self-loops in a list. On first thought, the idea is simply to collect the nodes until we find two that are identical. Often a candidate will jump on the hash table solution. The reason is hash tables have constant time access, and collision detection is a basic problem in that data type. If any collisions are found, the first one is guaranteed to be the head of the loop. However, since the question was posed as a decision problem, we are doing too much work. This solution requires the allocation of new memory at least as large as the number of nodes in the list. A better option is a modification of our solution to the selection problem.

Consider what happens when we try to find the midpoint of a circularly linked-list. Instead of terminating, eventually the lead reference will catch up to the tail reference. So if ever the lead and trailing references coincide, we know there is a loop. But if the lead reference finds the terminatingnull, then we know there is no loop. We can turn this insight into a solution to the problem of identifying self-loops in a linked-list.


Listing 2.23: Find Loops in a Linked-List


boolean detect_loop(Node head) {
if (null == head) {
return false;
}
Node trailing = head;
Node leading = head;
while (null != leading) {
leading = leading.next;
if (trailing == leading) {
return true;
}
trailing = trailing.next;
if (null != leading) {
leading = leading.next;
}
}
return false;
}


This solves the problem without needing to allocate extra memory and with the added cost savings of not having to compute a hash value, resolve hash collisions, or dynamically resize a hash table. The elegance of this solution is the reason it has been asked as a Microsoft interview question for nearly 30 years.