Solution: Minimum Window Subsequence
Let's solve the Minimum Window Subsequence problem using the Sliding Window pattern.
Statement
Given two strings, str1 and str2, find the shortest substring in str1 such that str2 is a subsequence of that substring.
A substring is defined as a contiguous sequence of characters within a string. A subsequence is a sequence that can be derived from another sequence by deleting zero or more elements without changing the order of the remaining elements.
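To make the subsequence definition concrete, here is a minimal Python sketch of a subsequence check. The helper name is_subsequence and the sample strings in the comments are illustrative assumptions, not part of the problem statement.

```python
def is_subsequence(pattern: str, text: str) -> bool:
    """Return True if pattern can be derived from text by deleting
    zero or more characters without reordering the rest."""
    remaining = iter(text)
    # `ch in remaining` consumes the iterator up to and including the first
    # occurrence of ch, so later characters must appear after earlier ones.
    return all(ch in remaining for ch in pattern)


# Hypothetical inputs: "ace" keeps the order of "abcde", "aec" does not.
keeps_order = is_subsequence("ace", "abcde")
breaks_order = is_subsequence("aec", "abcde")
```

Note that a substring is always a subsequence, but not the other way around: "ace" is a subsequence of "abcde" without being a contiguous part of it.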
Let’s say you have the following two strings:
str1 = “”
str2 = “”
In this example, “” is a substring of str1, from which we can derive str2 simply by deleting both the instances of the character . Therefore, str2 is a subsequence of this substring. Since this substring is the shortest among all the substrings in which str2 is present as a subsequence, the function should return this substring, that is, “”.
If no substring of str1 contains str2 as a subsequence, return an empty string.
If there are multiple minimum-length substrings that meet the subsequence requirement, return the one with the left-most starting index.
Constraints:
- str1.length
- str2.length
- str1 and str2 consist of uppercase and lowercase English letters.
Solution
So far, you’ve probably brainstormed some approaches and have an idea of how to solve this problem. Let’s explore some of these approaches and figure out which one to follow based on considerations such as time complexity and any implementation constraints.
Naive approach
The naive approach would be to generate all possible substrings of str1 and then check which substrings contain str2 as a subsequence. Out of all the substrings in str1 that contain str2 as a subsequence, we’ll choose the one with the shortest length. Now, let’s look at the cost of this solution. We need two nested loops to get all possible substrings and another loop to check whether each substring contains str2 as a subsequence. This brings the time complexity to O(n^3), where n is the length of str1. Since we’re not using any extra space, the space complexity is O(1).
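The brute-force idea can be sketched as follows. This is only an illustration under stated assumptions: the function name min_window_naive is hypothetical, and returning an empty string for an empty str2 is an assumed convention that the problem statement does not spell out.

```python
def min_window_naive(str1: str, str2: str) -> str:
    # Assumption: an empty str2 yields an empty result.
    if not str2:
        return ""

    n = len(str1)
    best_start, best_len = 0, float("inf")

    # Two nested loops enumerate every substring str1[start..end].
    for start in range(n):
        for end in range(start, n):
            # Third loop: check whether str2 is a subsequence of the window.
            matched = 0
            for i in range(start, end + 1):
                if str1[i] == str2[matched]:
                    matched += 1
                    if matched == len(str2):
                        break
            # A strict comparison keeps the left-most window on ties.
            if matched == len(str2) and end - start + 1 < best_len:
                best_start, best_len = start, end - start + 1

    if best_len == float("inf"):
        return ""
    return str1[best_start:best_start + best_len]
```

The check works on indexes rather than slices, so apart from the returned string the sketch uses only a constant number of variables.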
Optimized approach using sliding window
The algorithm finds the shortest window (substring) of the first string that contains the second string as a subsequence. It works as follows:
- One pointer traverses the first string character by character, while a second pointer tracks progress in the second string.
I. When a character in the first string matches the current character in the second string, the second pointer moves to the next character in the second string.
II. Regardless of whether a match is found, the first pointer continues traversing the first string.
- Once all the characters of the second string are matched, the algorithm backtracks. Starting from the position where the last character matched, it walks leftward in the first string, matching the second string in reverse, to find the latest possible start of a window that ends at that position.
I. Backtracking is necessary because the forward pass may pick up unnecessary characters at the beginning of the window; scanning backward trims them and minimizes the window size.
- After a window is found, the algorithm searches for other candidates: it moves the first pointer to the position just after the start of the window it has just found, resets the second pointer to the beginning of the second string, and resumes matching.
- The algorithm keeps track of the shortest valid window found so far, updating it only when a strictly shorter window is found, so that ties go to the left-most window.
- The process continues until the entire first string is traversed, and the shortest window is then returned.
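The steps above can be sketched in Python as follows. This is one possible reading of the description rather than a definitive implementation: the function name min_window and the local names start and end are assumptions, and returning an empty string when str2 is empty is an assumed convention.

```python
def min_window(str1: str, str2: str) -> str:
    size_str1, size_str2 = len(str1), len(str2)
    if size_str2 == 0:  # assumed convention for an empty pattern
        return ""

    min_sub_len = float("inf")
    min_subsequence = ""
    index_s1 = index_s2 = 0

    while index_s1 < size_str1:
        if str1[index_s1] == str2[index_s2]:
            index_s2 += 1
            if index_s2 == size_str2:
                # All of str2 is matched and the window ends at index_s1.
                # Walk left, matching str2 in reverse, to trim the start.
                start = end = index_s1
                index_s2 -= 1
                while index_s2 >= 0:
                    if str1[start] == str2[index_s2]:
                        index_s2 -= 1
                    start -= 1
                start += 1  # undo the extra step taken after the last match

                # Strict comparison keeps the left-most window on ties.
                if end - start + 1 < min_sub_len:
                    min_sub_len = end - start + 1
                    min_subsequence = str1[start:end + 1]

                # Resume just after the start of the window found.
                index_s1 = start
                index_s2 = 0
        index_s1 += 1

    return min_subsequence
```

Restarting one position after the window's start, rather than after its end, matters because a shorter window may begin inside the one just found.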
Note: In the following section, we will gradually build the solution.
Step-by-step construction
The first step of the solution is to initialize the variables. We begin by creating two variables, size_str1 and size_str2, to store the lengths of str1 and str2, respectively. We then initialize min_sub_len to infinity; it will store the length of the shortest valid window found so far.
To help us traverse the two strings, we create two indexes, index_s1 and index_s2, which initially point to the first characters of str1 and str2, respectively. These indexes will be incremented as we scan through the strings to find the subsequence.
Finally, we initialize min_subsequence to an empty string. This variable will store the output, which is the shortest substring of str1 that contains str2 as a subsequence.
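A minimal sketch of this initialization step might look like the following. The variable names come from the text; the sample values of str1 and str2 are hypothetical placeholders chosen only so the snippet runs on its own.

```python
# Hypothetical inputs, used only to make the snippet self-contained.
str1 = "abcdebdde"
str2 = "bde"

# Lengths of the two strings.
size_str1, size_str2 = len(str1), len(str2)

# Length of the shortest valid window found so far; infinity means "none yet".
min_sub_len = float("inf")

# Indexes into str1 and str2, both starting at the first character.
index_s1, index_s2 = 0, 0

# The result string; it stays empty if no valid window exists.
min_subsequence = ""
```

Starting min_sub_len at infinity means the first valid window always replaces it, with no special case for "nothing found yet".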