
LeetCode 839. Similar String Groups: Solution

Problem

Two strings X and Y are similar if we can swap two letters (in different positions) of X, so that it equals Y.

For example, “tars” and “rats” are similar (swapping at positions 0 and 2), and “rats” and “arts” are similar, but “star” is not similar to “tars”, “rats”, or “arts”.

Together, these form two connected groups by similarity: {“tars”, “rats”, “arts”} and {“star”}. Notice that “tars” and “arts” are in the same group even though they are not similar. Formally, each group is such that a word is in the group if and only if it is similar to at least one other word in the group.

We are given a list A of strings. Every string in A is an anagram of every other string in A. How many groups are there?

Example 1:

Input: ["tars","rats","arts","star"]
Output: 2

Note:

  1. A.length <= 2000
  2. A[i].length <= 1000
  3. A.length * A[i].length <= 20000
  4. All words in A consist of lowercase letters only.
  5. All words in A have the same length and are anagrams of each other.
  6. The judging time limit has been increased for this question.

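Approach and Solution

By the problem statement, a group is a set of strings in which, for any string X, there is another string Y in the set that X can be turned into by swapping two letters. In other words, every string X in a group has some similar string Y in that group (a string with no similar partner, such as "star" in the example, forms a group on its own). We can think of this as a graph: represent each string as a node V and connect every pair of similar strings with an edge E, which gives a graph G. Any two strings in the same group are then joined by a path, so the subgraph for that group is connected. Conversely, every connected component corresponds to a set of strings that satisfies the problem's definition of a group. The original question, "how many distinct groups of strings are there", therefore becomes "how many connected components does the graph built from all strings have". The implementation has two steps: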
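To make the node-and-edge view concrete, here is a minimal sketch that lists the edges for the example input. The helper name `isSimilar` is an assumption introduced for illustration and does not appear in the solution below; it also treats identical strings as not similar, since the solution removes duplicates before building the graph.

```go
package main

import "fmt"

// isSimilar reports whether x and y differ in exactly two positions
// and the letters at those two positions are swapped.
// Hypothetical helper name, used only for this illustration.
func isSimilar(x, y string) bool {
	if len(x) != len(y) {
		return false
	}
	diff := make([]int, 0, 2)
	for k := 0; k < len(x); k++ {
		if x[k] != y[k] {
			if len(diff) == 2 {
				return false // a third mismatch cannot be fixed by one swap
			}
			diff = append(diff, k)
		}
	}
	return len(diff) == 2 &&
		x[diff[0]] == y[diff[1]] && x[diff[1]] == y[diff[0]]
}

func main() {
	words := []string{"tars", "rats", "arts", "star"}
	// Print every undirected edge of the similarity graph.
	for i := 0; i < len(words); i++ {
		for j := i + 1; j < len(words); j++ {
			if isSimilar(words[i], words[j]) {
				fmt.Printf("%s -- %s\n", words[i], words[j])
			}
		}
	}
}
```

For the example this prints the two edges `tars -- rats` and `rats -- arts`. "star" has no edge, so the graph has two connected components.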

  1. Build the graph. Use a double loop (0 ~ A.length) to enumerate every pair of strings (nodes) A[i], A[j]. For each pair, run a third loop (0 ~ min(A[i].length, A[j].length)) to find the positions k where A[i][k] != A[j][k], and record each such k in the array diffIndex. If diffIndex.length == 2, with the two recorded indices k_0 and k_1 satisfying A[i][k_0] == A[j][k_1] and A[i][k_1] == A[j][k_0], then the strings (nodes) A[i] and A[j] are similar, and we add an edge from A[i] to A[j]. The time complexity is O(A.length^2 * A[i].length).
  2. Traverse the graph with DFS. Starting from an unvisited node, use depth-first search to visit and mark every node reachable from it; the marked set is one connected component. Then repeat from another unvisited node until every node has been visited. The answer is the number of connected components found.
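As an alternative to the DFS in step 2, the same count can be obtained with a union-find (disjoint-set) structure that merges the two endpoints of each edge as soon as it is discovered. This is a swapped-in technique, not the method used in this post, and the names `find`, `similarOrEqual`, and `countGroups` are assumptions made for the sketch. Identical strings are merged into one group, which mirrors the deduplication step in the solution below.

```go
package main

import "fmt"

// find returns the representative of i's set, shortening the path as it goes.
func find(parent []int, i int) int {
	for parent[i] != i {
		parent[i] = parent[parent[i]] // path halving
		i = parent[i]
	}
	return i
}

// similarOrEqual is true when a and b are identical, or differ in exactly
// two positions whose letters are swapped.
func similarOrEqual(a, b string) bool {
	if len(a) != len(b) {
		return false
	}
	first, second, count := -1, -1, 0
	for k := 0; k < len(a); k++ {
		if a[k] != b[k] {
			count++
			switch count {
			case 1:
				first = k
			case 2:
				second = k
			default:
				return false // more than two mismatches
			}
		}
	}
	if count == 0 {
		return true
	}
	return count == 2 && a[first] == b[second] && a[second] == b[first]
}

// countGroups starts with one group per word and decrements the count
// each time an edge joins two previously separate groups.
func countGroups(words []string) int {
	parent := make([]int, len(words))
	for i := range parent {
		parent[i] = i
	}
	groups := len(words)
	for i := 0; i < len(words); i++ {
		for j := i + 1; j < len(words); j++ {
			if !similarOrEqual(words[i], words[j]) {
				continue
			}
			ri, rj := find(parent, i), find(parent, j)
			if ri != rj {
				parent[ri] = rj
				groups--
			}
		}
	}
	return groups
}

func main() {
	fmt.Println(countGroups([]string{"tars", "rats", "arts", "star"})) // prints 2
}
```

The pairwise comparison still costs O(A.length^2 * A[i].length), so this changes only how components are counted, not the overall complexity.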

Code Implementation

I implemented the algorithm in Go:


func numSimilarGroups(A []string) int {
	// Use Go's built-in map to store the graph: graph[str1] holds the strings adjacent to str1.
	// visited records whether each node has already been visited.
	var graph = make(map[string][]string)
	var visited = make(map[string]bool)
	// Preprocessing: drop duplicate strings so that building the graph wastes less time and space.
	B := make([]string, 0)
	for _, str1 := range A {
		flag := false
		for _, str2 := range B {
			if str2 == str1 {
				flag = true
				break
			}
		}
		if !flag {
			B = append(B, str1)
		}
	}
	// Triple loop to build the graph.
	for i, str1 := range B {
		for j, str2 := range B {
			if i == j || str1 == str2 { // i == j skips self-loops (the str1 == str2 check is redundant)
				continue
			}
			if length := len(str1); length == len(str2) {
				diffIndex := make([]int, 0)
				for k := 0; k < length; k++ {
					if str1[k] != str2[k] {
						diffIndex = append(diffIndex, k)
					}
				}
				if len(diffIndex) != 2 || !(str1[diffIndex[0]] == str2[diffIndex[1]] &&
					str1[diffIndex[1]] == str2[diffIndex[0]]) {
					continue
				}
				// Store the edge str1 -> str2.
				graph[str1] = append(graph[str1], str2)
			}
		}
	}
	// num holds the number of connected components.
	num := 0
	for _, str := range B {
		if !visited[str] {
			num++
			visited[str] = true
			// Depth-first traversal of the graph.
			dfs(str, visited, graph)
		}
	}
	return num
}

func dfs(str string, visited map[string]bool, graph map[string][]string) {
	for _, strNext := range graph[str] {
		// Visit only nodes that have not been visited yet.
		if !visited[strNext] {
			visited[strNext] = true
			dfs(strNext, visited, graph)
		}
	}
}

Problems Encountered

When I declared graph and visited as global variables, the last test case failed:

var graph = make(map[string][]string)
var visited = make(map[string]bool)
func numSimilarGroups(A []string) int {
    ...
    num := 0
	for _, str := range B {
		if !visited[str] {
			...
			dfs(str)
		}
	}
	return num
}

func dfs(str string) {
	...
}

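The LeetCode Frequently Asked Questions page covers a case where a user's Submit result differs from the Run Code result for the same data entered as a custom testcase. The explanation is as follows: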

First, please check if you are using any global or static variables. They are Evil, period. If you must declare one, reset them in the first line of your called method or in the default constructor. Why? Because the judger executes all test cases using the same program instance, global/static variables affect the program state from one test case to another. See this Discuss thread for more details.

Are you using C or C++? If the answer is yes, chances are your code has bugs in it which cause one of the earlier test cases to trigger an undefined behavior. See this Discuss thread for an example of undefined behavior. These bugs could be hard to debug, so good luck. Or just give up on C/C++ entirely and code in a more predictable language, like Java. Just kidding.
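The first point of the FAQ can be illustrated with a small sketch. It assumes, as the FAQ states, that the judge calls the solution function several times within one program instance; whether that is what caused the failure described here is not confirmed. `countUnvisited` and its `reset` parameter are hypothetical stand-ins for `numSimilarGroups`, reduced to the part that reads and writes the package-level `visited` map.

```go
package main

import "fmt"

// Package-level state, as in the failing variant. It survives between calls.
var visited = make(map[string]bool)

// countUnvisited counts the words not yet marked in visited and marks them.
// Hypothetical stand-in for numSimilarGroups; when reset is true the map is
// re-created first, as the FAQ recommends doing at the start of the method.
func countUnvisited(words []string, reset bool) int {
	if reset {
		visited = make(map[string]bool)
	}
	n := 0
	for _, w := range words {
		if !visited[w] {
			visited[w] = true
			n++
		}
	}
	return n
}

func main() {
	input := []string{"ab", "ba"}
	fmt.Println(countUnvisited(input, false)) // prints 2
	fmt.Println(countUnvisited(input, false)) // prints 0: entries from the first call are still marked
	fmt.Println(countUnvisited(input, true))  // prints 2: the state was reset
}
```

Under that assumption, a string that already appeared in an earlier test case would be treated as visited in a later one, and a global graph would keep edges appended during earlier cases. Passing the maps as parameters, as in the working version above, avoids both.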

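My situation does not exactly match the ones described above, so I am not sure of the exact cause. If anyone knows, please let me know in the comments.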