# 10 Algorithm Puzzles about Sequences

Some of my favorite problems on LeetCode.

LeetCode is an excellent website for practicing programming skills and preparing tech interviews. In this post, I will share with you 10 interesting problems from LeetCode about array or string algorithms. Most of them are concise in description but not intuitive for one to come up with the best solution at once. Meanwhile, the ideas and techniques used in these problems are applicable to many other similar ones.

## LC 373 Find K Pairs with Smallest Sums

Given two sorted integer arrays A and B, find the k pairs of (A[i], B[j]) with the smallest sums, where 0≤i<|A|, 0≤j<|B|.

Apparently, the possible candidates come from the first m elements in A and the first n elements in B, where m = min(|A|, k), n = min(|B|, k). A naive solution would enumerate all the m x n pairs and meanwhile maintain the smallest k pairs in a heap, which requires O(m n log k) time.

Instead, we could incrementally find the smallest k pairs in order by reducing it to a *k-way merge* problem as in LC 23 Merge k Sorted Lists. A key observation is that each time we find a pair (A[i], B[j]), since B is in the ascending order, the next greater pair containing A[i], if it exists, must be (A[i], B[j+1]). In other words, for each A[i] (0≤i<m), the candidate pairs containing it form a sorted list: (A[i], B[0]), (A[i], B[1]), …, (A[i], B[n-1]). Thus, the problem boils down to selecting the smallest k elements from the m sorted lists, which can be solved in O(k log m) time.

## LC 719 Find K-th Smallest Pair Distance

Given an integer array A of size n, find the k-th smallest distance among all the pairs, where the distance of a pair (A[i], A[j]) is defined as |A[i]-A[j]|.

Again, it would be easy to come up with an O(n² log n) heap-based answer, but a better solution can lead to O(n log n + n log W) time, where W is the largest possible distance. First, we sort A in the ascending order in O(n log n) time. Then we do binary search to find the smallest number d such that there are at least k pairs having distances not greater than d. For each 0≤d≤W, it takes O(n) time to check whether there are at least k pairs having distances not greater than d by using the *sliding window* technique. Basically, for each 0≤j<n, we track the smallest i such that A[i]≥A[j]-d, and as j increases, such smallest i could only move in the same direction as j.

To practice more about this sliding window technique, please check out LC 992 Subarray with K Different Integers.

## LC 239 Sliding Window Maximum

Given an integer array A of size n and a window size k, for each 0≤i<n-k, find the maximum value among A[i], A[i+1], …, A[i+k-1].

This is a classic application of the *monotonic queue* technique. In short, as we move the sliding window forward, we properly maintain all the future candidate indices in a deque, so that we don’t need to repeatedly examine k elements for each window. Each index j is pushed to the back of the deque as we encounter it, and will be popped from the front as it leaves the window. Whenever we want to add A[j] as a candidate, we would only keep an existing candidate i (i < j) in the deque if A[i]>A[j], otherwise A[j] will always be a better candidate than A[i] in future windows. Thus the candidate indices kept in the deque should always correspond to descending array elements, and we could pop from the back of the deque all the existing indices whose values are not greater than A[j]. Since each index is enqueued and dequeued at most once, the overall time complexity is O(n), regardless of k.

*Sliding Window Maximum* is a common primitive that repeatedly occur in many problems, especially in dynamic programming. A similar construct is *Next Greater Element*, which will be introduced in the next example.

## LC 84 Largest Rectangle in Histogram

Given an array A of n non-negative integers, return the maximum value of (j-i+1) * h(i, j) for any 0≤i≤j<n, where h(i, j) is the minimum value among A[i], A[i+1], …, A[j].

The brute force approach would be to enumerate all pairs of i and j, and for each i and j scan the whole interval [i, j] in search of the minimum value. This will result in an undesirable O(n³) time complexity. The best approach is to enumerate each element A[i] in the array, and find the largest index interval in which A[i] is the minimum. That requires us to efficiently precompute the closest larger elements for each A[i] to its both sides. The trick is to apply the same technique in the last question. By keeping a monotonic queue (or rather, a stack), in which candidate indices correspond to decreasing array elements, we could find the previous greater element for each element in O(n) time. Likewise, we could scan backwards to find the next greater element for each element in another O(n) time. Finally, enumerating each minimum element also takes O(n) time, so the overall time complexity is O(n).

Another problem that relies on this idea is LC 907 Sum of Subarray Minimums.

## LC 962 Maximum Width Ramp

Given an integer array A of size n, find the largest j-i such that 0≤i<j<n and A[i]≤A[j].

It would not be difficult to think of an O(n log n) time solution given we have learned the two monotonic queue examples. For each j, we want to find the smallest i such that A[i]≤A[j]. So at each j, we could use O(1) time to maintain a deque of candidates for i in the descending value order, and use O(log n) time to do binary search for the first candidate i satisfying A[i]≤A[j].

However, there exists an impressive O(n) time solution. Let’s define that a pair of i and j (i≤j) is a best pair if A[i]≤A[j] and has the largest value of j-i. If i₁<i₂, i₂ is only possible to be in a best pair if A[i₂]<A[i₁]. By the same token, if j₁<j₂, j₁ is only possible to show up in a best pair if A[j₁]>A[j₂]. So we could do a forward scan through A to collect all candidates for i in a monotonic queue Q1, and then do a backward scan to collect all candidates for j in another monotonic queue Q2. Conceivably, Q1 contains increasing indices corresponding to decreasing array values, and Q2 contains decreasing indices corresponding to increasing array values. Then we use *two pointers* to find a best pair from Q1 and Q2. Specifically, we reversely iterate through j-candidates in Q2, and for each j, we track the last i-candidate in Q1 satisfying i≤j and A[i]≤A[j]. As j increases, A[j] will decrease, and thus the pointer in Q1 should always move forward to make A[i] decrease accordingly. The overall time complexity is O(n).

As an exercise, one may want to try LC 862 Shortest Subarray with Sum at Least K. It might be easier than this one although it’s labeled as hard.

## LC 315 Count of Smaller Numbers After Self

Given an integer array A of size n, count the total number of pairs of A[i] and A[j] such that i<j and A[i]>A[j].

Let’s start with a brute force approach. We enumerate j from 0 to n-1, and for each j enumerate i from 0 to i-1 and count all the pairs of (i, j) satisfying A[i]>A[j]. Apparently, the total time complexity will be O(n²). What if we could use some efficient data structure to keep all the values we’ve seen as we scan through the array? Is it possible to make each query and each update an O(log n) operation so that the total time complexity is reduced to O(n log n) time?

The answer is positive. The data structure we might want to use is a *segment tree** *(alternatively, a *binary indexed tree*). To begin with, we need to discretize (or squash) the values in the array into integers from 0 to n-1 without changing the relative order between each pair of them. This can be done in O(n log n) time by mapping each element to the number of elements in the array less than itself. Then we iterate through the discretized array A’, maintaining a segment tree for counting the number of times that we’ve seen each value in [0, n-1]. Specifically, for each j, we first query the sum over the interval [0, A’[j]-1] to accumulate the final result, and then increment the value at A’[j] by one. Both the range query and the point update have an O(log n) time complexity.

LC 327 Count of Range Sum is a very similar question. Both of them can also be solved by divide-and-conquer algorithms in the same time complexity.

## LC 1674 Minimum Moves to Make Array Complementary

Given a positive integer M and an array A containing n integers in the range [1, M], find the minimum number of elements that need to be changed into another value in [1, M] so that A[i]+A[n-i-1] are equal for all 0≤i<n.

Let’s consider a particular pair of A[i] and A[n-i-1]. Suppose we want to make them sum up to S, then there will be three cases: (1) if A[i]+A[n-i-1] is already S, zero changes are needed; (2) otherwise, if 1≤S-A[i]≤M or 1≤S-A[n-i-1]≤M, exactly one change is needed; (3) otherwise, they require two changes. An optimal S will always be one of the unmodified A[i]+A[n-i-1] for some i. So if M is very large, we could enumerate candidates for S from all the unmodified pairs, and for each S, we sum up the changes required for all the pairs, which requires O(n²) total time.

But when M is relatively small, there’s a more efficient O(M+n) solution. As we already noticed, each pair of A[i] and A[n-i-1] contributes to 0, 1 or 2 required changes for different ranges of S. We want to maintain the total required changes for each possible S, and update values for different ranges of S as we scan through all the pairs of A[i] and A[n-i-1]. To efficiently achieve that, we need to employ the *difference array *used in LC 370 Range Addition: keep an array of V[i]-V[i-1] (1≤i≤M), and only update on two endpoints for each range addition. It takes O(n) time to update for all the pairs since each update takes a constant time. In the end, to restore V[i], we just need to calculate the prefix sum of the difference array in O(M) time.

LC 798 Smallest Rotation with Highest Score might be a good exercise on this technique.

## LC 730 Count Different Palindromic Subsequences

Given a string S that consists of n letters among {‘a’, ‘b’, ‘c’, ‘d’}, count the number of distinct palindromic subsequences of it.

This is a typical *dynamic programming* problem. Specifically, we define dp(i, j, c) as the number of distinct palindromic subsequences of S[i:j+1] (Python slice notation) starting and ending at character c . By summing up dp(0, n-1, c) for all possible c, we get the answer to the original question. The state transition is as follow (note that the edge case includes both i=j and i+1=j):

For string problems, the size of the alphabet is considered as a constant (4 in this problem). The total number of states is thus O(n²) and each state transition takes O(1) time, so the overall time complexity is O(n²).

As an exercise, you could check out LC 940 Distinct Subsequences II.

## LC 1147 Longest Chunked Palindrome Decomposition

Given a string S of length n, return the maximum number m such that S can be split into an array A of m substrings and for each 0≤i<m, A[i]=A[m-1-i].

This is another DP problem, but a bit more complicated than the last one. We define dp(i) as the maximum number k such that S[0:i+1] can be split into an array B of k substrings, S[n-i-1:n] can also be split into an array C of k substrings, and for all 0≤j<k, B[j]=C[k-1-j]. For each i satisfying 2(i+1)<n, S can be partitioned into a non-empty prefix S[0:i+1], a non-empty middle part S[i+1:n-i-1], and a non-empty suffix S[n-i-1:n]. In other words, it corresponds to a decomposition of length 2*dp(i)+1 for the original problem. When n is even, there’s also a special case where the middle part is empty, so the corresponding decomposition has a length of 2*dp(n/2–1). In summary, the final answer will be:

Now let’s consider the state transition following the pattern in LC 139 Word Break. For each i, we could enumerate the suffices of S[0:i+1] and check whether it equals to the same-length prefix of S[n-i-1:n]. In other words, for each 0≤j≤i, we can reduce the problem of dp(i) to dp(j-1) when S[j:i+1] equals to S[n-i-1:n-j].

But naively checking whether S[j:i+1] equals to S[n-i-1:n-j] adds another O(n) time to the state transition. That could be avoided by solving another DP beforehand. Let’s define q(i, j) as whether S[i:j+1] is equal to S[n-j-1:n-i] for i≤j≤n/2, and we can easily formulate its state transition as follow:

Solving this DP requires an extra O(n²) time, but it helps to reduce the time complexity of the original DP to O(n²):

A similar question is LC 132 Palindrome Partitioning II, where we also accelerate one DP by solving another auxiliary DP.

## LC 1397 Find All Good Strings

Given strings S₁ and S₂ of the same length n, and an evil string P of length m, count the number of good strings that are in the lexicographical range [S₁, S₂]. A string is good if it doesn’t contain P as a substring.

This problem is a good exercise on the *KMP algorithm*. Given a string S of length n, the KMP algorithm could help to check whether S contains P as a substring in O(n+m) time. To begin with, we use O(m) time to precompute an auxiliary array, which, for each 0≤i<m, contains the largest j≤i such that P[0:j] equals to P[i+1-j:i+1]. Based on this auxiliary array, we could develop a helper function p(j, c) (c is a character), which returns the largest k such that P[0:k] equals to a suffix of P[0:j]+c. Then we use O(n) time to iterate through S, and for each 0≤i<n, we track the longest prefix of P that matches any suffix of S[0:i+1].

Back to this problem, let’s define f(S) as the number of good strings of length n that are lexicographically smaller than S. By using the KMP algorithm, we can check whether S₂ contains P as a substring. If the result is true, the final answer would be f(S₂)-f(S₁)+1; otherwise, the final answer would be f(S₂)-f(S₁).

Now the problem boils down to computing f(S). Let’s define dp(i, j) as the number of strings s of length i (i≥0), such that P[0:j]+s is a good string. By using the helper function p defined earlier, the state transition can be formulated as follow:

Then we enumerate all the prefixes of S, and for each prefix S[0:i+1], we count the number of good strings of length n, which have the same prefix as S[0:i+1] but a lexicographically smaller suffix than S[i+1:n]. That number is actually dp(n-i-1, j), where j is the length of the longest prefix of P that matches any suffix of S[0:i+1] as tracked by the helper function p. The result of f(S) is the sum of dp(n-i-1, j) for all the prefixes.

The pattern of this solution (i.e. *digit DP*) can be applied to many other counting problems. As an exercise, I would recommend LC 1067 Digit Count in Range .

Thanks for reading through this post. I really wish you enjoy this.

Happy coding!