LeetCode 140. Word Break II Solution

阿新 • Published: 2019-01-10

LeetCode 140. Word Break II

@(LeetCode)[LeetCode | Algorithm | DP | Backtracking | DFS]

Given a non-empty string s and a dictionary wordDict containing a list of non-empty words, add spaces in s to construct a sentence where each word is a valid dictionary word. Return all such possible sentences.

Note:
The same word in the dictionary may be reused multiple times in the segmentation. You may assume the dictionary does not contain duplicate words.

Example 1:

Input:
s = "catsanddog"
wordDict = ["cat", "cats", "and", "sand", "dog"]

Output:
[
  "cats and dog",
  "cat sand dog"
]

The original problem is LeetCode problem 140, Word Break II.

Solution #1 DP

Time: $O (2^{n})$
Space: $O (2^{n})$

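One approach is dynamic programming (DP), and the idea is simple. dp[i] holds the sentences that can be formed from the prefix of the string that starts at position 0 and ends just before position i. dp[i] can be expressed as:

$$dp[i] = \{\, dp[s] + str[s:i) \mid str[s:i) \in wordDict,\ dp[s] \neq \emptyset,\ s \in [0, i) \,\}$$

Here $dp[s] + str[s:i)$ means appending the word $str[s:i)$ to every sentence in $dp[s]$.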

The figure below illustrates how the DP table is built.
[Figure: illustration of the DP construction]

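With DP, the sentences that can be built up to each earlier position are computed once and reused instead of being searched again.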
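To make the recurrence concrete, here is a hand trace on Example 1 (s = "catsanddog"), using the same dp and str notation as above. Positions that are not listed have an empty dp entry. This is only an illustrative trace, not part of the original post.

$$
\begin{aligned}
dp[0] &= \{\texttt{""}\} \\
dp[3] &= \{\texttt{"cat"}\} && str[0:3) = \texttt{cat} \\
dp[4] &= \{\texttt{"cats"}\} && str[0:4) = \texttt{cats} \\
dp[7] &= \{\texttt{"cat sand"},\ \texttt{"cats and"}\} && str[3:7) = \texttt{sand},\ str[4:7) = \texttt{and} \\
dp[10] &= \{\texttt{"cat sand dog"},\ \texttt{"cats and dog"}\} && str[7:10) = \texttt{dog}
\end{aligned}
$$

The answer is dp[10], the entry for the full string.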

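Note: the implementation adds a breakable function, which checks whether the string can be broken at all. The last section of this article explains why this function is needed.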

class Solution {
public:
    vector<string> wordBreak(string s, vector<string>& wordDict) {
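        // dp[i] holds every sentence that segments the prefix s[0, i).
        vector<vector<string> > dp(s.length() + 1, vector<string>());
        // Base case: the empty prefix has exactly one (empty) sentence.
        dp[0].push_back(string());

        unordered_set<string> dict;
        for(const string &str : wordDict)
            dict.insert(str);

        // If s cannot be segmented at all, return early; dp.back() is still empty here.
        if( !breakable(s, dict) )
            return dp.back();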

        for(int i = 1; i <= s.length(); i++){
            for(int j = 0; j < i; j++){
                string ss = s.substr(j, i - j);
                if(dp[j].size() > 0 && dict.find(ss) != dict.end()){
                    for(string prefix : dp[j]){
                        dp[i].push_back(prefix + (prefix == "" ? "" : " ") + ss);
                    }
                }
            }
        }

        return dp.back();
    }

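    // Checks whether s can be segmented at all, without building any sentence.
    bool breakable(const string &s, unordered_set<string> &dict){
        // dp[i] is true if the prefix s[0, i) can be segmented.
        vector<bool> dp(s.length() + 1, false);
        dp[0] = true;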

        for(int i = 1; i <= s.length(); i++){
            for(int j = 0; j < i; j++){
                string ss = s.substr(j, i - j);
                if(dp[j] && dict.find(ss) != dict.end()){
                    dp[i] = true;
                    break;
                }
            }
        }
        return dp.back();
    }
};

Solution #2 DFS (Backtracking) with memoization

Time: $O (2^{n})$
Space: $O (2^{n})$

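Another approach is backtracking. If the word formed by the characters in [start, end] is in wordDict, recursively call dfs(end + 1, ...) to find the sentences that can be formed starting at end + 1. Then concatenate the word [start, end] with each sentence returned by dfs(end + 1, ...) to form complete sentences. (Note: if dfs returns an empty list, no sentence can be formed from that position.)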
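Before any caching is added, the recursion just described can be sketched on its own. The function name dfsNaive is a hypothetical helper introduced here for illustration; it follows the description above but is not the author's final code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical helper (not from the original post): plain backtracking
// with no caching. Returns every sentence that can be formed from the
// suffix of s starting at `start`, or an empty vector if there is none.
std::vector<std::string> dfsNaive(const std::string& s, std::size_t start,
                                  const std::unordered_set<std::string>& dict) {
    assert(start <= s.size());
    if (start == s.size())
        return {std::string()};  // the empty suffix has one (empty) sentence

    std::vector<std::string> sentences;
    for (std::size_t end = start; end < s.size(); ++end) {
        // Candidate word covering s[start, end], both ends inclusive.
        std::string word = s.substr(start, end - start + 1);
        if (dict.count(word) == 0)
            continue;
        // Combine the word with every sentence for the rest of the string.
        for (const std::string& suffix : dfsNaive(s, end + 1, dict))
            sentences.push_back(suffix.empty() ? word : word + " " + suffix);
    }
    return sentences;
}
```

Calling dfsNaive("catsanddog", 0, dict) with the dictionary from Example 1 yields the two expected sentences. The same start position can be explored many times in this version, which is the cost the next paragraph addresses.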

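A key point is using memoization to avoid repeated computation. For example, in the figure below, the result for the yellow-highlighted dog can be taken directly from the earlier search at the same position. Here memo[7] denotes the sentences that can be formed starting at index 7.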

[Figure: DFS search tree with the memoized position highlighted]

class Solution {
public:
    vector<string> wordBreak(string s, vector<string>& wordDict) {
        unordered_set<string> dict;
        for(string str : wordDict)
            dict.insert(str);

        vector<string >ret = dfs(s, 0, dict);

        return ret;
    }

    vector<string> dfs(string &s, int start, unordered_set<string> &dict){
        // Reuse the result if this start position was already explored.
        if(memo.find(start) != memo.end())
            return memo[start];

        // Reached the end of s: exactly one (empty) sentence.
        if(start == s.length()){
            memo[start] = vector<string>(1, string());
            return memo[start];
        }

        vector<string> ret;
        for(int i = start; i < s.length(); i++){
            string ss = s.substr(start, i - start + 1);
            if(dict.find(ss) != dict.end()){
                vector<string> suffixes = dfs(s, i + 1, dict);

                for(string suffix : suffixes)
                    ret.push_back(ss + (suffix == "" ? "" : " ") + suffix);
            }
        }

        // Cache the result, which may be empty, so this position is never searched again.
        memo[start] = ret;
        return ret;
    }

private:
    // memo[i]: all sentences that can be formed starting at index i (empty if none).
    map<int, vector<string> > memo;
};
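To see what the memo buys, the sketch below counts sentences instead of building them and records how many times the search does real work, with and without a cache. The names countNaive, countMemo and the calls counter are assumptions made for this illustration and do not appear in the solution above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

using Dict = std::unordered_set<std::string>;

// Hypothetical helper: number of sentences for the suffix starting at
// `start`, with no caching. `calls` counts every invocation.
long long countNaive(const std::string& s, std::size_t start,
                     const Dict& dict, long long& calls) {
    assert(start <= s.size());
    ++calls;
    if (start == s.size()) return 1;
    long long total = 0;
    for (std::size_t end = start; end < s.size(); ++end)
        if (dict.count(s.substr(start, end - start + 1)))
            total += countNaive(s, end + 1, dict, calls);
    return total;
}

// Same search with a cache indexed by start position.
// memo must have s.size() + 1 entries, all initialised to -1 (unknown).
long long countMemo(const std::string& s, std::size_t start, const Dict& dict,
                    std::vector<long long>& memo, long long& calls) {
    assert(start <= s.size() && memo.size() == s.size() + 1);
    if (memo[start] != -1) return memo[start];  // cache hit: no new work
    ++calls;  // counts only invocations that actually search
    long long total = 0;
    if (start == s.size()) {
        total = 1;
    } else {
        for (std::size_t end = start; end < s.size(); ++end)
            if (dict.count(s.substr(start, end - start + 1)))
                total += countMemo(s, end + 1, dict, memo, calls);
    }
    memo[start] = total;
    return total;
}
```

For a string of twenty a's with the dictionary {"a", "aa"}, both functions return 10946, but countMemo searches each of the 21 start positions only once, while countNaive revisits the same positions repeatedly.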

Conclusion

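Although DP and backtracking have the same time complexity, backtracking is somewhat better. The reason is that when the whole string cannot be segmented, backtracking does not waste computation enumerating the sentences that its prefixes can form; it determines directly that no sentence exists.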

For example:

Input:
"aaaaaaaaaaaaabaaaaaaaaaaaaa"
["a", "aa", "aaa", "aaaa", "aaaaa", "aaaaaa", "aaaaaaa", "aaaaaaaa", "aaaaaaaaa", "aaaaaaaaaa", "aaaaaaaaaaa", "aaaaaaaaaaaa", "aaaaaaaaaaaaa"]

Output:
[]

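The string aaaaaaaaaaaaabaaaaaaaaaaaaa cannot be broken.

With DFS, backtracking quickly determines that this string cannot be broken.

[Figure: DFS search on the unbreakable string]

DP behaves differently. It spends a great deal of time computing the sentences that the prefix aaaaaaaaaaaaa can be split into (this prefix yields $2^{12}$ sentences, which costs a lot of time and space), and only when it reaches the end of the string does it discover that the original string cannot be broken. This is why the DP implementation in Solution #1 calls breakable before building any sentence.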
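The $2^{12}$ figure can be checked with a small counting sketch. The helper countPrefixSentences is hypothetical and only counts how many sentences each dp[i] would hold if the DP ran without the breakable guard; it does not build the sentences.

```cpp
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical helper: ways[i] is the number of sentences that segment
// the prefix s[0, i), i.e. the size dp[i] would have in Solution #1
// without the breakable() early exit.
std::vector<long long> countPrefixSentences(
        const std::string& s, const std::unordered_set<std::string>& dict) {
    std::vector<long long> ways(s.size() + 1, 0);
    ways[0] = 1;  // the empty prefix has one (empty) sentence
    for (std::size_t i = 1; i <= s.size(); ++i)
        for (std::size_t j = 0; j < i; ++j)
            // Every sentence for s[0, j) extends by the word s[j, i).
            if (ways[j] > 0 && dict.count(s.substr(j, i - j)))
                ways[i] += ways[j];
    return ways;
}
```

On the input above, ways[13] is 4096 (that is, $2^{12}$) for the prefix of thirteen a's, while the entry for the full string is 0, so all of those sentences would be built and then discarded.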
