Repeated DNA Sequences
Explore how to apply the sliding window method to find repeated 10-letter substrings in DNA sequences. This lesson helps you understand problem constraints, analyze substrings, and implement solutions efficiently in coding interviews.
Statement
A DNA sequence consists of nucleotides represented by the letters ‘A’, ‘C’, ‘G’, and ‘T’ only. For example, “ACGAATTCCG” is a valid DNA sequence.
Given a string, s, that represents a DNA sequence, return all the 10-letter-long sequences (continuous substrings of exactly 10 characters) that appear more than once in s. You can return the output in any order.
Constraints:
… ≤ s.length ≤ …
s[i] is either 'A', 'C', 'G', or 'T'.
Examples
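As a small illustration, take the valid sequence "ACGAATTCCG" from the statement and append one more 'A'. The resulting 11-character string contains exactly two 10-letter sequences, one starting at each of its first two positions. The sketch below lists those sequences for any input; the helper name tenLetterWindows is hypothetical and is not part of the lesson's playground.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not from the lesson): returns every substring of
// exactly 10 consecutive characters, in left-to-right order.
std::vector<std::string> tenLetterWindows(const std::string& s) {
    const std::size_t k = 10;  // sequence length fixed by the problem statement
    std::vector<std::string> windows;
    // Writing the bound as i + k <= s.size() avoids unsigned underflow
    // that s.size() - k would cause on short inputs.
    for (std::size_t i = 0; i + k <= s.size(); ++i) {
        windows.push_back(s.substr(i, k));
    }
    return windows;
}
```

Calling tenLetterWindows("ACGAATTCCGA") yields "ACGAATTCCG" and "CGAATTCCGA", which differ, so nothing repeats in that input. For a string of eleven 'A's, both windows are "AAAAAAAAAA", so that sequence appears more than once and belongs in the output.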
Understand the problem
Now, let’s take a moment to make sure you’ve correctly understood the problem. The quiz below helps you check if you’re solving the correct problem:
Given the input s = "TTTTTTTTT", what is the output of finding all 10-letter-long substrings that appear more than once?
["TTTTTTTTT"]
["TTTTTTTTTT"]
[]
["TTTTTTTTT", "TTTTTTTTTT"]
Figure it out!
We have a game for you to play. Rearrange the logical building blocks to develop a clearer understanding of how to solve this problem.
Try it yourself
Implement your solution in the following coding playground.
#include <iostream>
#include <unordered_set>
#include <unordered_map>
#include <vector>
#include <string>
using namespace std;

vector<string> findRepeatedDnaSequences(string s) {
    // Replace this placeholder return statement with your code
    return {"test"};
}
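Once you have attempted the problem, you may want something to compare against. The following is a minimal sketch of one possible way to fill in the placeholder, assuming a fixed-size sliding window combined with two hash sets. It is not necessarily the lesson's reference solution, and the variable names seen and repeated are illustrative choices.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// One possible implementation (an assumption, not the lesson's official answer):
// slide a 10-character window across s and record windows seen more than once.
std::vector<std::string> findRepeatedDnaSequences(std::string s) {
    const std::size_t k = 10;                  // window length from the statement
    std::unordered_set<std::string> seen;      // windows encountered at least once
    std::unordered_set<std::string> repeated;  // windows encountered more than once

    // i + k <= s.size() also covers inputs shorter than k without underflow.
    for (std::size_t i = 0; i + k <= s.size(); ++i) {
        std::string window = s.substr(i, k);
        // insert() returns {iterator, bool}; the bool is false if the
        // window was already in the set, meaning this is a repeat.
        if (!seen.insert(window).second) {
            repeated.insert(window);
        }
    }
    // The statement allows any output order, so set iteration order is fine.
    return std::vector<std::string>(repeated.begin(), repeated.end());
}
```

Each window is copied with substr, which costs time proportional to the window length. Because that length is fixed at 10, the total work stays linear in the length of s on average. A rolling hash, or a 2-bit encoding of each nucleotide, would avoid the copies at the cost of slightly more involved code.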