Commit 09786c0

feat(algorithms, dynamic programming): word break puzzle using backtracking
1 parent 2b2681a commit 09786c0

4 files changed: +155 -2 lines changed

algorithms/dynamic_programming/word_break/README.md

Lines changed: 72 additions & 0 deletions
@@ -26,6 +26,12 @@ sequences of words (sentences). The order in which the sentences are listed is n
 
 ## Solutions
 
+1. [Naive Approach](#naive-approach)
+2. [Backtracking](#backtracking)
+3. [Dynamic Programming - tabulation](#optimized-approach-using-dynamic-programming---tabulation)
+4. [Dynamic Programming - memoization](#dynamic-programming---memoization)
+5. [Trie Optimization](#trie-optimization)
+
 ### Naive Approach
 
 The naive approach to solve this problem is to use a traditional recursive strategy in which we take each prefix of the
@@ -58,6 +64,68 @@ of the string, and `m` is the length of the longest word in the dictionary.
 
 The space complexity is O(k^n * n), where k is the number of words in the dictionary and `n` is the length of the string.
 
+### Backtracking
+
+Initially, we might think of a brute-force approach where we systematically explore all possible ways to break the
+string into words from the dictionary. This leads us to the backtracking strategy, where we recursively try to form
+words from the string and add them to a current sentence if they are in the dictionary. If the current prefix doesn't
+lead to a valid solution, we backtrack by removing the last added word and trying the next possible word. This ensures
+we explore all possible segmentations of the string.
+
+At each step, we consider all possible end indices for substrings starting from the current index. For each substring,
+we check if it exists in the dictionary. If the substring is a valid word, we append it to the current sentence and
+recursively call the function with the updated index, which is the end index of the substring plus one.
+
+If we reach the end of the string, it means we have found a valid segmentation, and we can add the current sentence to
+the results. However, if we encounter a substring that is not a valid word, we backtrack by returning from that
+recursive call and trying the next possible end index.
+
+The backtracking approach will be inefficient due to the large number of recursive calls, especially for longer strings.
+To increase efficiency, we will convert the word dictionary into a set for constant-time lookups. However, the overall
+time complexity remains high because we explore all possible partitions.
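As a small illustration of a single step of this search, the sketch below lists every dictionary word that starts at a given index, using a set for membership tests. The helper name `words_starting_at` is hypothetical and is not part of this commit.

```python
from typing import List, Set


def words_starting_at(s: str, word_set: Set[str], start_index: int) -> List[str]:
    """Return every dictionary word that begins at start_index in s.

    Hypothetical helper for illustration only; the commit inlines this
    loop inside its backtrack function.
    """
    candidates = []
    # end_index is exclusive, so it runs up to and including len(s)
    for end_index in range(start_index + 1, len(s) + 1):
        word = s[start_index:end_index]
        # Set membership is O(1) on average, versus O(k) for a list of k words
        if word in word_set:
            candidates.append(word)
    return candidates


word_set = set(["cat", "cats", "and", "sand", "dog"])
print(words_starting_at("catsanddog", word_set, 0))
print(words_starting_at("catsanddog", word_set, 3))
```

Each word returned for an index is one branch that the backtracking search follows before undoing the choice and trying the next one.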
+
+The process is visualized below:
+
+![Backtracking Solution](./images/solution/word_break_backtracking_solution_1.png)
+
+#### Algorithm
+
+- Convert the `word_dict` array into an unordered set `word_set` for efficient lookups.
+- Initialize an empty array `results` to store valid sentences.
+- Initialize an empty string `current_sentence` to keep track of the sentence being constructed.
+- Call the `backtrack` function with the input string `s`, `word_set`, `current_sentence`, `results`, and a starting
+  index set to 0, the beginning of the input string.
+  - Base case: If `start_index` is equal to the length of the string, add `current_sentence` to `results` and
+    return, since `current_sentence` then represents a valid sentence.
+  - Iterate over possible `end_index` values from `start_index + 1` to the end of the string.
+    - Extract the substring `word` from `start_index` to `end_index - 1` (inclusive).
+    - If `word` is found in `word_set`:
+      - Store the current `current_sentence` in `original_sentence`.
+      - Append `word` to `current_sentence` (with a space if needed).
+      - Recursively call `backtrack` with the updated `current_sentence` and `end_index`.
+      - Reset `current_sentence` to its original value (`original_sentence`) to backtrack and try the next `end_index`.
+  - Return from the `backtrack` function.
+- Return `results`.
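The steps above can be sketched almost literally with a string accumulator. This is a minimal illustration, not the code in this commit: the committed `word_break_backtrack` keeps a list of words and joins them at the end, while the hypothetical `word_break_backtrack_str` below saves and restores `current_sentence` as the list describes.

```python
from typing import List, Set


def word_break_backtrack_str(s: str, word_dict: List[str]) -> List[str]:
    """String-accumulator variant of the listed steps (illustrative name)."""
    word_set: Set[str] = set(word_dict)
    results: List[str] = []

    def backtrack(current_sentence: str, start_index: int) -> None:
        # Base case: the whole string has been consumed
        if start_index == len(s):
            results.append(current_sentence)
            return

        for end_index in range(start_index + 1, len(s) + 1):
            word = s[start_index:end_index]
            if word in word_set:
                original_sentence = current_sentence
                # Add a separating space only if the sentence already has a word
                current_sentence = f"{current_sentence} {word}" if current_sentence else word
                backtrack(current_sentence, end_index)
                # Backtrack: restore the sentence before trying the next end_index
                current_sentence = original_sentence

    backtrack("", 0)
    return results


print(sorted(word_break_backtrack_str("catsanddog", ["cat", "cats", "and", "sand", "dog"])))
```

Because Python strings are immutable, restoring `original_sentence` is cheap to express but copies the sentence on every append; the list-based `append`/`pop` used in the commit avoids that copying.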
+
+#### Complexity Analysis
+
+Let n be the length of the input string.
+
+##### Time complexity: O(n⋅2^n)
+
+The algorithm explores all possible ways to break the string into words. In the worst case, where every substring is a
+valid word, each gap between characters can be a split point or not, so the recursion tree has O(2^n) leaf nodes. For
+each leaf node, O(n) work is performed, so the overall complexity is O(n⋅2^n).
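To make the exponential growth concrete, the sketch below counts segmentations for an assumed worst-case input: a string of `n` copies of `"a"` with a dictionary holding every run of `"a"` up to length `n`. The helper `count_segmentations` is hypothetical and only counts sentences; it does not build them.

```python
from typing import List


def count_segmentations(s: str, word_dict: List[str]) -> int:
    """Count valid segmentations of s without constructing them."""
    word_set = set(word_dict)
    n = len(s)
    # ways[i] = number of ways to segment the suffix s[i:]
    ways = [0] * (n + 1)
    ways[n] = 1  # the empty suffix has exactly one (empty) segmentation
    for start in range(n - 1, -1, -1):
        for end in range(start + 1, n + 1):
            if s[start:end] in word_set:
                ways[start] += ways[end]
    return ways[0]


for n in range(1, 9):
    # Assumed worst case: every substring of "aaa...a" is a dictionary word
    worst_case_dict = ["a" * k for k in range(1, n + 1)]
    print(n, count_segmentations("a" * n, worst_case_dict))
```

For this input the count doubles with each extra character (2^(n-1) sentences), which is the number of leaves the backtracking search has to visit.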
+
+##### Space complexity: O(2^n)
+
+The recursion stack can grow up to a depth of n, where each recursive call consumes additional space for storing the
+current state.
+
+Each position in the string can either be a split point or not, so for n positions there are up to 2^n possible
+combinations of splits. In the worst case, each combination generates a different sentence that needs to be stored,
+leading to exponential space complexity.
+
 ### Optimized approach using dynamic programming - tabulation
 
 Since the recursive solution to this problem is very costly, let’s see if we can reduce this cost in any way. Dynamic
@@ -162,3 +230,7 @@ combinations
 
 The space complexity is O(n * v), where n is the length of the string and v is the number of valid combinations stored in
 the `dp` array.
+
+### Dynamic Programming - Memoization
+
+### Trie Optimization

algorithms/dynamic_programming/word_break/__init__.py

Lines changed: 40 additions & 1 deletion
@@ -1,4 +1,4 @@
-from typing import List, Dict
+from typing import List, Dict, Set
 from datastructures.trees.trie import AlphabetTrie
 
 
@@ -158,3 +158,42 @@ def word_break_dp_2(s: str, word_dict: List[str]) -> List[str]:
 
     # returning all the sentences formed from the complete string s
     return dp.get(0, [])
+
+
+def word_break_backtrack(s: str, word_dict: List[str]) -> List[str]:
+    """
+    This adds spaces to s to break it up into a sequence of valid words from word_dict.
+
+    Uses backtracking to solve the problem.
+
+    Args:
+        s: The input string
+        word_dict: The dictionary of words
+    Returns:
+        List of valid sentences
+    """
+    # convert word dict into a set for O(1) lookups
+    word_set = set(word_dict)
+    results = []
+
+    def backtrack(sentence: str, words_set: Set[str], current_sentence: List[str], result: List[str], start_index: int):
+        # If we've reached the end of the string, add the current sentence to results
+        if start_index == len(sentence):
+            result.append(" ".join(current_sentence))
+            return
+
+        # Iterate over possible end indices
+        for end_index in range(start_index + 1, len(sentence) + 1):
+            word = sentence[start_index:end_index]
+            # If the word is in the set, proceed with backtracking
+            if word in words_set:
+                current_sentence.append(word)
+                # Recursively call backtrack with the new end index
+                backtrack(
+                    sentence, words_set, current_sentence, result, end_index
+                )
+                # Remove the last word to backtrack
+                current_sentence.pop()
+
+    backtrack(s, word_set, [], results, 0)
+    return results

algorithms/dynamic_programming/word_break/test_word_break.py

Lines changed: 43 additions & 1 deletion
@@ -1,7 +1,7 @@
 import unittest
 from typing import List
 from parameterized import parameterized
-from algorithms.dynamic_programming.word_break import word_break_trie, word_break_dp, word_break_dp_2
+from algorithms.dynamic_programming.word_break import word_break_trie, word_break_dp, word_break_dp_2, word_break_backtrack
 
 
 class WordBreakTestCases(unittest.TestCase):
@@ -131,6 +131,48 @@ def test_word_break_dp_2(self, s: str, word_dict: List[str], expected: List[str]
         expected.sort()
         self.assertListEqual(expected, actual)
 
+    @parameterized.expand(
+        [
+            (
+                "magiclly",
+                ["ag", "al", "icl", "mag", "magic", "ly", "lly"],
+                ["mag icl ly", "magic lly"],
+            ),
+            (
+                "raincoats",
+                ["rain", "oats", "coat", "s", "rains", "oat", "coats", "c"],
+                ["rain c oats", "rain c oat s", "rain coats", "rain coat s"],
+            ),
+            (
+                "highway",
+                ["crash", "cream", "high", "highway", "low", "way"],
+                ["highway", "high way"],
+            ),
+            ("robocat", ["rob", "cat", "robo", "bo", "b"], ["robo cat"]),
+            (
+                "cocomomo",
+                ["co", "mo", "coco", "momo"],
+                ["co co momo", "co co mo mo", "coco momo", "coco mo mo"],
+            ),
+            (
+                "catsanddog",
+                ["cat", "cats", "and", "sand", "dog"],
+                ["cats and dog", "cat sand dog"],
+            ),
+            (
+                "pineapplepenapple",
+                ["apple", "pen", "applepen", "pine", "pineapple"],
+                ["pine apple pen apple", "pineapple pen apple", "pine applepen apple"],
+            ),
+            ("catsandog", ["cats", "dog", "sand", "and", "cat"], []),
+        ]
+    )
+    def test_word_break_backtrack(self, s: str, word_dict: List[str], expected: List[str]):
+        actual = word_break_backtrack(s, word_dict)
+        actual.sort()
+        expected.sort()
+        self.assertListEqual(expected, actual)
+
 
 
 if __name__ == "__main__":
     unittest.main()
