feat(algorithms, dynamic programming): word break puzzle using backtracking

BrianLusina · BrianLusina · commit 09786c03bec6 · 2025-12-17T10:19:30.000+03:00
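
For comparison with the list-based `word_break_backtrack` added in this commit, here is a minimal sketch of the string-based variant that the README's Algorithm section describes. The function name `word_break_backtrack_str` is hypothetical and is not part of this change. Because Python strings are immutable, building a new string for each recursive call plays the role of saving and restoring `original_sentence`.

```python
from typing import List


def word_break_backtrack_str(s: str, word_dict: List[str]) -> List[str]:
    """Hypothetical string-based variant of the README steps; not part of this commit."""
    word_set = set(word_dict)  # set membership checks are O(1) on average
    results: List[str] = []

    def backtrack(current_sentence: str, start_index: int) -> None:
        # Base case: the whole string has been consumed, so the sentence is valid
        if start_index == len(s):
            results.append(current_sentence)
            return

        # Try every substring that starts at start_index
        for end_index in range(start_index + 1, len(s) + 1):
            word = s[start_index:end_index]
            if word in word_set:
                # Building a new string leaves current_sentence untouched,
                # so the next end_index starts from the original sentence
                extended = f"{current_sentence} {word}" if current_sentence else word
                backtrack(extended, end_index)

    backtrack("", 0)
    return results


print(sorted(word_break_backtrack_str("catsanddog", ["cat", "cats", "and", "sand", "dog"])))
# → ['cat sand dog', 'cats and dog']
```

The list-based version in the diff avoids creating a new string at every level by appending and popping words, and only joins them when a full segmentation is found.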
diff --git a/algorithms/dynamic_programming/word_break/README.md b/algorithms/dynamic_programming/word_break/README.md
@@ -26,6 +26,12 @@ sequences of words (sentences). The order in which the sentences are listed is n
 
 ## Solutions
 
+1. [Naive Approach](#naive-approach)
+2. [Backtracking](#backtracking)
+3. [Dynamic Programming - tabulation](#optimized-approach-using-dynamic-programming---tabulation)
+4. [Dynamic Programming - memoization](#dynamic-programming---memoization)
+5. [Trie Optimization](#trie-optimization)
+
 ### Naive Approach
 
 The naive approach to solve this problem is to use a traditional recursive strategy in which we take each prefix of the 
@@ -58,6 +64,68 @@ of the string, and `m` is the length of the longest word in the dictionary.
 
 The space complexity is O(k^n * n), where k is the number of words in the dictionary and `n` is the length of the string.
 
+### Backtracking
+
+Initially, we might think of a brute-force approach where we systematically explore all possible ways to break the 
+string into words from the dictionary. This leads us to the backtracking strategy, where we recursively try to form 
+words from the string and add them to a current sentence if they are in the dictionary. If the current prefix doesn't 
+lead to a valid solution, we backtrack by removing the last added word and trying the next possible word. This ensures 
+we explore all possible segmentations of the string.
+
+At each step, we consider all possible end indices for substrings starting from the current index. For each substring, 
+we check if it exists in the dictionary. If the substring is a valid word, we append it to the current sentence and 
+recursively call the function with the updated index, which is the end index of the substring plus one.
+
+If we reach the end of the string, it means we have found a valid segmentation, and we can add the current sentence to 
+the results. However, if we encounter a substring that is not a valid word, we backtrack by returning from that 
+recursive call and trying the next possible end index.
+
+The backtracking approach will be inefficient due to the large number of recursive calls, especially for longer strings. 
+To increase efficiency, we will convert the word dictionary into a set for constant-time lookups. However, the overall 
+time complexity remains high because we explore all possible partitions.
+
+The process is visualized below:
+
+![Backtracking Solution](./images/solution/word_break_backtracking_solution_1.png)
+
+#### Algorithm
+
+- Convert the `word_dict` array into an unordered set `word_set` for efficient lookups.
+- Initialize an empty array `results` to store valid sentences.
+- Initialize an empty string `current_sentence` to keep track of the sentence being constructed.
+- Call the `backtrack` function with the input string `s`, `word_set`, `current_sentence`, `results`, and a starting 
+  index set to 0, the beginning of the input string.
+  - Base case: If the `start_index` is equal to the length of the string, add the `current_sentence` to `results` and 
+    return as it means that `current_sentence` represents a valid sentence.
+  - Iterate over possible `end_index` values from `start_index` + 1 to the end of the string.
+    - Extract the substring `word` from `start_index` to `end_index - 1`.
+    - If `word` is found in `word_set`:
+      - Store the current `current_sentence` in `original_sentence`.
+      - Append `word` to `current_sentence` (with a space if needed).
+      - Recursively call `backtrack` with the updated `current_sentence` and `end_index`. 
+      - Reset `current_sentence` to its original value (`original_sentence`) to backtrack and try the next `end_index`. 
+  - Return from the `backtrack` function once all `end_index` values have been tried.
+- Return `results`.
+
+#### Complexity Analysis
+
+Let n be the length of the input string.
+
+##### Time complexity: O(n⋅2^n)
+
+The algorithm explores all possible ways to break the string into words. In the worst case, where each character can be 
+treated as a word, the recursion tree has 2^n leaf nodes, resulting in an exponential time complexity. For each leaf 
+node, O(n) work is performed, so the overall complexity is O(n⋅2^n).
+
+##### Space complexity: O(2^n)
+
+The recursion stack can grow up to a depth of n, where each recursive call consumes additional space for storing the 
+current state.
+
+Each position in the string can either be a split point or not, so for n positions there are up to 2^n possible 
+combinations of splits. In the worst case, each combination generates a different sentence that needs to be stored, 
+leading to exponential space complexity.
+
 ### Optimized approach using dynamic programming - tabulation
 
 Since the recursive solution to this problem is very costly, let’s see if we can reduce this cost in any way. Dynamic 
@@ -162,3 +230,7 @@ combinations
 
 The space complexity is O(n * v), where n is the length of the string and v is the number of valid combinations stored in
 the `dp` array.
+
+### Dynamic Programming - Memoization
+
+### Trie Optimization
diff --git a/algorithms/dynamic_programming/word_break/__init__.py b/algorithms/dynamic_programming/word_break/__init__.py
@@ -1,4 +1,4 @@
-from typing import List, Dict
+from typing import List, Dict, Set
 from datastructures.trees.trie import AlphabetTrie
 
 
@@ -158,3 +158,42 @@ def word_break_dp_2(s: str, word_dict: List[str]) -> List[str]:
 
     # returning all the sentences formed from the complete string s
     return dp.get(0, [])
+
+
+def word_break_backtrack(s: str, word_dict: List[str]) -> List[str]:
+    """
+    This adds spaces to s to break it up into a sequence of valid words from word_dict.
+
+    Uses backtracking to solve the problem.
+
+    Args:
+        s: The input string
+        word_dict: The dictionary of words
+    Returns:
+        List of valid sentences
+    """
+    # convert word dict into a set for O(1) lookups
+    word_set = set(word_dict)
+    results = []
+
+    def backtrack(sentence: str, words_set: Set[str], current_sentence: List[str], result: List[str], start_index: int):
+        # If we've reached the end of the string, add the current sentence to results
+        if start_index == len(sentence):
+            result.append(" ".join(current_sentence))
+            return
+
+        # Iterate over possible end indices
+        for end_index in range(start_index + 1, len(sentence) + 1):
+            word = sentence[start_index:end_index]
+            # If the word is in the set, proceed with backtracking
+            if word in words_set:
+                current_sentence.append(word)
+                # Recursively call backtrack with the new end index
+                backtrack(
+                    sentence, words_set, current_sentence, result, end_index
+                )
+                # Remove the last word to backtrack
+                current_sentence.pop()
+
+    backtrack(s, word_set, [], results, 0)
+    return results
diff --git a/algorithms/dynamic_programming/word_break/images/solution/word_break_backtracking_solution_1.png b/algorithms/dynamic_programming/word_break/images/solution/word_break_backtracking_solution_1.png
diff --git a/algorithms/dynamic_programming/word_break/test_word_break.py b/algorithms/dynamic_programming/word_break/test_word_break.py
@@ -1,7 +1,7 @@
 import unittest
 from typing import List
 from parameterized import parameterized
-from algorithms.dynamic_programming.word_break import word_break_trie, word_break_dp, word_break_dp_2
+from algorithms.dynamic_programming.word_break import word_break_trie, word_break_dp, word_break_dp_2, word_break_backtrack
 
 
 class WordBreakTestCases(unittest.TestCase):
@@ -131,6 +131,48 @@ def test_word_break_dp_2(self, s: str, word_dict: List[str], expected: List[str]
         expected.sort()
         self.assertListEqual(expected, actual)
 
+    @parameterized.expand(
+        [
+            (
+                "magiclly",
+                ["ag", "al", "icl", "mag", "magic", "ly", "lly"],
+                ["mag icl ly", "magic lly"],
+            ),
+            (
+                "raincoats",
+                ["rain", "oats", "coat", "s", "rains", "oat", "coats", "c"],
+                ["rain c oats", "rain c oat s", "rain coats", "rain coat s"],
+            ),
+            (
+                "highway",
+                ["crash", "cream", "high", "highway", "low", "way"],
+                ["highway", "high way"],
+            ),
+            ("robocat", ["rob", "cat", "robo", "bo", "b"], ["robo cat"]),
+            (
+                "cocomomo",
+                ["co", "mo", "coco", "momo"],
+                ["co co momo", "co co mo mo", "coco momo", "coco mo mo"],
+            ),
+            (
+                "catsanddog",
+                ["cat", "cats", "and", "sand", "dog"],
+                ["cats and dog", "cat sand dog"],
+            ),
+            (
+                "pineapplepenapple",
+                ["apple", "pen", "applepen", "pine", "pineapple"],
+                ["pine apple pen apple", "pineapple pen apple", "pine applepen apple"],
+            ),
+            ("catsandog", ["cats", "dog", "sand", "and", "cat"], []),
+        ]
+    )
+    def test_word_break_backtrack(self, s: str, word_dict: List[str], expected: List[str]):
+        actual = word_break_backtrack(s, word_dict)
+        actual.sort()
+        expected.sort()
+        self.assertListEqual(expected, actual)
+
 
 if __name__ == "__main__":
     unittest.main()