Repeated DNA Sequences

Explore how to apply the sliding window method to find repeated 10-letter substrings in DNA sequences. This lesson helps you understand problem constraints, analyze substrings, and implement solutions efficiently in coding interviews.

Statement

A DNA sequence consists of nucleotides represented by the letters ‘A’, ‘C’, ‘G’, and ‘T’ only. For example, “ACGAATTCCG” is a valid DNA sequence.

Given a string, s, that represents a DNA sequence, return all the 10-letter-long sequences (continuous substrings of exactly 10 characters) that appear more than once in s. You can return the output in any order.

Constraints:

  • $1 \leq$ s.length $\leq 10^{3}$

  • s[i] is either 'A', 'C', 'G', or 'T'.
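
The statement above can be approached with a fixed-size sliding window. The following is a minimal sketch, not the lesson's reference solution: it assumes a hash set of previously seen windows is acceptable, and the function name repeatedWindows and its parameter k are hypothetical names chosen for illustration.

```cpp
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical helper (not part of the lesson's playground): returns every
// length-k substring of s that occurs more than once, in unspecified order.
std::vector<std::string> repeatedWindows(const std::string& s, std::size_t k = 10) {
    // A string shorter than k has no window of length k at all.
    if (s.size() < k) return {};

    std::unordered_set<std::string> seen;      // windows encountered at least once
    std::unordered_set<std::string> repeated;  // windows encountered more than once

    // Slide a window of width k one character at a time.
    for (std::size_t i = 0; i + k <= s.size(); ++i) {
        std::string window = s.substr(i, k);
        // insert() reports false in .second when the window was already present.
        if (!seen.insert(window).second) {
            repeated.insert(window);
        }
    }
    return std::vector<std::string>(repeated.begin(), repeated.end());
}
```

For instance, repeatedWindows("AAAAAAAAAAAAA") returns a single element, "AAAAAAAAAA", because the four overlapping windows are identical. Each window is copied and hashed, so this sketch runs in roughly O(n · k) time for a string of length n, which is comfortable for the stated limit of 10^3 characters.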

Examples

[Slideshow: worked examples, 5 slides]

Understand the problem

Before moving on, take a moment to check that you’ve understood the problem correctly. The quiz below tests your reading of the statement:

1.

Given the input s = "TTTTTTTTT", what is the output of finding all 10-letter-long substrings that appear more than once?

A.

["TTTTTTTTT"]

B.

["TTTTTTTTTT"]

C.

[]

D.

["TTTTTTTTT", "TTTTTTTTTT"]

Figure it out!

We have a game for you to play. Rearrange the logical building blocks to develop a clearer understanding of how to solve this problem.

Try it yourself

Implement your solution in the following coding playground.

C++
usercode > Solution.cpp
#include <iostream>
#include <unordered_set>
#include <unordered_map>
#include <vector>
#include <string>
using namespace std;

vector<string> findRepeatedDnaSequences(string s) {
    // Replace this placeholder return statement with your code
    return {"test"};
}