Longest Substring Without Repeating Characters
This is my first post in a series of algorithm problems. Breaking it down and sharing my solution helps me learn & is an easy way to go back and review how to solve certain types of code problems.
In this challenge we receive a string, and we’re supposed to find the longest substring that contains no repeating characters. Let’s look at a few examples of this so we know what exactly we want to achieve:
- Input “abc” should result in 3, since the whole string has no repeating characters and it’s 3 characters long.
- Input “abcda” should result in 4, since we can only find substrings of length 4 that have all unique characters: “abcd”, and “bcda”.
- Input “abcabc” results in 3, since we can only find substrings of length 3 that have all unique characters: “abc”, “bca”, “cab”, “abc”.
- Input “” should result in zero.
- Input “abba” should result in 2.
If you do a lot of algorithm problems you may have already decided that the optimal solution will use a “sliding window”. Congratulations if you did! There are many problems that can be solved with a sliding window strategy, and there are a few more strategies that cover many of the coding problems we encounter. In upcoming posts I’ll cover those strategies in an attempt to help us all recognize which to use for a variety of problems. For now, let’s get back to solving this thing.
Essentially we use two pointers to define a window of characters. One pointer represents the start of the window, and the other pointer represents the end. We also need to keep track of the longest window we’ve seen. As well, we need to remember the characters we’ve seen & their positions (hopefully you’re thinking a dictionary is the perfect way to do that). To find the solution, we start with the window start and end pointers on the first character (index 0). Then we move the window end pointer forward & look at the new character. If we haven’t seen it before, we save the window length if it’s larger than any previous window. If we encounter a letter we’ve already seen, we check to see if that letter is inside the current window. If it is, we move the window start pointer to the character after that duplicate character. In this way we only have to iterate over the characters once, achieving a complexity of O(n).
Here’s the solution with a few comments and descriptive variable names: