Skip to main content

Leetcode 30: Substring with Concatenation of All Words

1. Problem Statement (Simple Explanation):


You are given:

  • A string s.

  • An array of strings words, where:

    • All words[i] have the same length.

concatenated string in this problem is:

  • A string formed by concatenating all the words in words in any order, using each word exactly as many times as it appears in words (i.e., including duplicates).

You must:

  • Find all starting indices i in s such that the substring:

s[i .. i + totalLen - 1]

is a concatenated string of all words (in some permutation).

Return the list of indices (order doesn’t matter).

Where:

  • wordLen = length(words[0])

  • wordCount = len(words)

  • totalLen = wordLen * wordCount


2. Examples:


Example 1:

Input:

s = "barfoothefoobarman"

words = ["foo","bar"]

  • wordLen = 3

  • wordCount = 2

  • totalLen = 6

Valid substrings:

  • Starting at index 0: "barfoo" → "bar" + "foo" (a permutation of ["foo","bar"])

  • Starting at index 9: "foobar" → "foo" + "bar"

Output: [0, 9]


Example 2:

Input:

s = "wordgoodgoodgoodbestword"

words = ["word","good","best","word"]

No substring is a concatenation of all words (with required counts).

Output: []


Example 3:

Input:

s = "barfoofoobarthefoobarman"

words = ["bar","foo","the"]

  • Valid substrings (length 3 words × 3 = 9):

    • i = 6: "foobarthe" → ["foo","bar","the"]

    • i = 9: "barthefoo" → ["bar","the","foo"]

    • i = 12: "thefoobar" → ["the","foo","bar"]

Output: [6, 9, 12]


Constraints:

  • 1 <= |s| <= 104

  • 1 <= words.length <=5000

  • 1 <= |words[i]| <= 30

  • All words[i] have the same length.

  • s and words[i] contain only lowercase English letters.


3. Approach – Sliding Window with Word Frequency (O(n · wordLen)):


This is a classic hard string + hashmap + sliding window problem.


Key Observations:


  1. All words have the same length L = wordLen.

  2. We are looking for substrings of fixed length totalLen = L * wordCount.

  3. Each valid substring can be thought of as wordCount consecutive "blocks" of size L:

    • word1 = s[i .. i+L-1]

    • word2 = s[i+L .. i+2L-1]

    • ...

  4. The order of words does not matter, only their frequency matters.

Thus, for each candidate starting index i, we need to check if the wordCount blocks match words in terms of word counts.

Brute force would be O(n.wordCount) per starting index ⇒ too slow.

Instead, we use a sliding window aligned by word boundaries with hashing.


High-level Strategy:


  1. Precompute a frequency map targetCount from words:

    • targetCount[word] = number of times word appears in words.

  2. Because words are length L, valid window starts are aligned at offsets:

    • offset = 0, 1, 2, ..., L-1

    • We use L separate sliding windows, each stepping by L.

  3. For each offset:

    • Initialize:

      • left = offset

      • currentCount = {} (map from word → count in current window)

      • countWords = 0 (number of valid words in the current window)

    • For right from offset to len(s)-L in steps of L:

      • Extract word = s[right .. right+L-1].

      • If word is in targetCount:

        • Add it to currentCount[word].

        • Increment countWords.

        • If currentCount[word] > targetCount[word]:

          • Window has too many of this word.

          • Shrink window from the left:

            • While currentCount[word] > targetCount[word]:

              • leftWord = s[left .. left+L-1]

              • Decrement currentCount[leftWord]

              • left += L

              • countWords--

        • If countWords == wordCount:

          • We have a valid window of wordCount words.

          • Record left as an answer.

          • Then shrink left by 1 word to continue searching the next window:

            • leftWord = s[left .. left+L-1]

            • Decrement currentCount[leftWord]

            • left += L

            • countWords--

      • Else (word not in targetCount):

        • Reset window:

          • currentCount.clear()

          • countWords = 0

          • left = right + L (start after this junk word)


Why Offsets?


We must ensure that words are always taken in aligned chunks of length L. By iterating over starting positions offset in [0..L-1] and then stepping by L, we inspect all possible word-aligned substrings exactly once.


Pseudo-code (Sliding Window by Offset):



Complexity:


Let:

  • n = |s|

  • L = wordLen

  • k = len(words)

We do:

  • Up to L different offset scans.

  • Each scan is O(n/L) steps.

  • Each step performs O(1) hashmap work on constant-size words.

Overall complexity:

  • Time: O(n.k) in worst interpretations, but practically O(n)

  • Space: O(k)

for the maps.

Given constraints (n ≤ 104), this is efficient.


4. Java code:



5. C code:
 



6. C++ code:



7. Python code:



8. JavaScript code:



Comments

Popular posts from this blog

Leetcode 1: Two Sum

  Leetcode Link: https://leetcode.com/problems/two-sum/ 1. Problem Statement (Simple Explanation) You are given: An array of integers nums An integer target You need to  return the indices of two numbers in the array  such that: Conditions: Each input has  exactly one solution . You  cannot use the same element twice . You can return the answer in  any order . 2. Examples Example 1 Input:                                    nums = [2, 7, 11, 15], target = 9 Output:                                 [0, 1] Explanation:                     nums[0]+nums[1]=2+7=9              So we return [0, 1]. Example 2 Input:           ...

Leetcode 3: Longest Substring Without Repeating Characters

  Leetcode Link 1. Problem Statement (Simple Explanation) You are given a string s. You need to find the length of the longest substring of s that does not  contain any repeated characters. A substring is a contiguous sequence of characters in the string. You must return an integer length, not the substring itself. 2. Examples Example 1 Input: s = "abcabcbb" Output: 3 Explanation: Some substrings without repeating characters: "abc", "bca", "cab"The longest length is 3. Example 2 Input: s = "bbbbb" Output: 1 Explanation: All substrings that are valid are just "b". The longest length is 1. Example 3 Input:   s = "pwwkew" Output:   3 Explanation: Possible substrings without repetition: "pw", "wke", "kew". "wke" or "kew" have length 3. "pwke" is not valid because it’s not contiguous in the original string (p and k are separated by ww). Constraints...

Leetcode 2: Add Two Numbers

Leetcode Link: https://leetcode.com/problems/add-two-numbers/ 1. Problem Statement (Simple Explanation) You are given  two non-empty linked lists , l1 and l2, that represent two non-negative integers. Each node contains a  single digit . The digits are stored in  reverse order  (i.e., the 1’s digit is at the head). You need to  add the two numbers  and return the  sum as a linked list , also in reverse order. You can assume: There are  no leading zeros , except when the number itself is 0. 2. Examples Example 1 Input:                l1 = [2,4,3]                l2 = [5,6,4]           Interpreting linked lists as numbers (reverse order): l1 represents 342 l2 represents 465 Output:                [7,0,8] Explanation:        ...