Maximum Bipartite Matching
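An algorithm aims to solve a problem with the simplest idea possible, and understanding an algorithm should likewise get simpler the more you look at it. When a string of concepts or a big pile of code makes an algorithm feel complicated at first, it is best to start from an example. Working through a simple example and implementing it is usually enough to make the core idea clear. From there you can extend to the algorithm's lemmas and more complex cases, then improve the algorithm, and finally consider time and space complexity.
I came to this algorithm through the Network Alignment problem, which relies heavily on graph algorithms. Alignment, and pairwise alignment in particular, often runs into the maximum bipartite matching problem, which can be solved with the methods used for the Network Flow problem.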
I. Network Flow
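A flow network is made up of the paths from a source to a destination (the sink). Each edge has a capacity, the maximum amount of information that can pass along it. The Network Flow problem asks for the largest flow that can get from the source to the destination: the Maximum Flow. A flow has to obey two rules:
1. For every node other than the source and the destination, the information flowing in must equal the information flowing out.
2. The information flowing through an edge must not exceed that edge's capacity.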
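The two rules can be checked mechanically. The following is a minimal sketch, assuming the network is stored as an adjacency matrix of capacities and the flow as a matrix of the same shape; the `Matrix` alias and the `isFeasibleFlow` helper are hypothetical names introduced here for illustration.

```cpp
#include <cassert>
#include <vector>

using Matrix = std::vector<std::vector<int>>;

// Hypothetical helper: returns true if `flow` obeys both rules on the
// network `cap` with source s and sink t.
bool isFeasibleFlow(const Matrix& cap, const Matrix& flow, int s, int t) {
    const int n = static_cast<int>(cap.size());
    assert(static_cast<int>(flow.size()) == n);
    for (int u = 0; u < n; ++u) {
        int in = 0, out = 0;
        for (int v = 0; v < n; ++v) {
            // Rule 2: the flow on an edge lies between 0 and its capacity.
            if (flow[u][v] < 0 || flow[u][v] > cap[u][v]) return false;
            out += flow[u][v];
            in += flow[v][u];
        }
        // Rule 1: inflow equals outflow, except at the source and the sink.
        if (u != s && u != t && in != out) return false;
    }
    return true;
}
```

On a three-node chain 0 -> 1 -> 2 with capacities 3 and 2, sending 2 units along both edges passes the check, while sending 3 units into node 1 and only 2 out of it breaks rule 1.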
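The maximum flow problem and the minimum cut problem are equivalent: the value of the maximum flow equals the capacity of a minimum cut (the max-flow min-cut theorem), so finding one gives the other. A minimum cut is defined as follows:
We delete some edges from the network so that no path from the source to the destination remains, and we want the deleted edges to have the smallest possible total capacity. That set of edges is a minimum cut. When every capacity is 1, this is the same as deleting as few edges as possible.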
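One way to see the connection is to compute a maximum flow and then read a minimum cut off the final residual capacities: the nodes still reachable from the source form one side, and every original edge leaving that side belongs to the cut. The sketch below assumes an adjacency-matrix network with integer capacities; `reachableFrom`, `CutResult` and `minCut` are hypothetical names used only for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>
#include <queue>
#include <utility>
#include <vector>

using Matrix = std::vector<std::vector<int>>;

// BFS over edges with positive residual capacity. Fills parent[] and
// returns, for each node, whether it is reachable from s.
std::vector<bool> reachableFrom(const Matrix& res, int s, std::vector<int>& parent) {
    const int n = static_cast<int>(res.size());
    std::vector<bool> seen(n, false);
    parent.assign(n, -1);
    std::queue<int> q;
    q.push(s);
    seen[s] = true;
    while (!q.empty()) {
        int u = q.front();
        q.pop();
        for (int v = 0; v < n; ++v) {
            if (!seen[v] && res[u][v] > 0) {
                seen[v] = true;
                parent[v] = u;
                q.push(v);
            }
        }
    }
    return seen;
}

struct CutResult {
    int maxFlow;
    std::vector<std::pair<int, int>> cutEdges;
};

CutResult minCut(const Matrix& cap, int s, int t) {
    assert(s != t);
    const int n = static_cast<int>(cap.size());
    Matrix res = cap;  // residual capacities, initially the capacities
    std::vector<int> parent;
    int flow = 0;
    std::vector<bool> side = reachableFrom(res, s, parent);
    while (side[t]) {
        // Push the bottleneck amount along the path found by the BFS.
        int b = INT_MAX;
        for (int v = t; v != s; v = parent[v]) b = std::min(b, res[parent[v]][v]);
        for (int v = t; v != s; v = parent[v]) {
            res[parent[v]][v] -= b;
            res[v][parent[v]] += b;
        }
        flow += b;
        side = reachableFrom(res, s, parent);
    }
    // side[] now marks the source side; original edges leaving it form the cut.
    std::vector<std::pair<int, int>> edges;
    for (int u = 0; u < n; ++u)
        for (int v = 0; v < n; ++v)
            if (side[u] && !side[v] && cap[u][v] > 0) edges.push_back({u, v});
    return {flow, edges};
}
```

Summing `cap[u][v]` over the returned `cutEdges` should give the same number as `maxFlow`, which is what the theorem states.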
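II. The Ford-Fulkerson Algorithm
The first time I read this algorithm I did not understand it; reading it now, it seems very clear. Here is the idea in my own words, though I cannot tell whether a first-time reader will find it easy to follow: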
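1. Build the residual graph. The original network already has capacities. Given a flow f on the network, for each edge (u,v) compute capacity - f as the residual capacity of (u,v), and give the reverse edge (v,u) the value f (assuming (v,u) is not an edge of the original network, so its capacity is 0).
Keep an edge if its value is positive; otherwise delete it.
2. Augmenting paths. The result of step 1 is the residual graph. Any path from the source to the destination in this graph is called an augmenting path.
3. While an augmenting path exists, pick one and update the residual capacity of every edge on it. For an edge (u,v) on the path, the rule is:
find the smallest residual capacity on the path (the bottleneck), f;
decrease the capacity of u->v by f and increase the capacity of v->u by f.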
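Steps 1 and 3 above can be sketched as two small functions. This is only an illustration under stated assumptions: the network and the flow are adjacency matrices, the path is given as a list of node indices, and `buildResidual` and `augment` are hypothetical names. An entry of 0 in the residual matrix stands for a deleted edge.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <vector>

using Matrix = std::vector<std::vector<int>>;

// Step 1: residual capacities for capacities `cap` and a feasible flow `flow`.
Matrix buildResidual(const Matrix& cap, const Matrix& flow) {
    const int n = static_cast<int>(cap.size());
    assert(static_cast<int>(flow.size()) == n);
    Matrix res(n, std::vector<int>(n, 0));
    for (int u = 0; u < n; ++u) {
        for (int v = 0; v < n; ++v) {
            // Unused capacity on (u,v) plus the flow on (v,u) that could be undone.
            res[u][v] = cap[u][v] - flow[u][v] + flow[v][u];
        }
    }
    return res;
}

// Step 3: push the bottleneck amount along `path` (a list of nodes from
// the source to the destination) and return that amount.
int augment(Matrix& res, const std::vector<int>& path) {
    assert(path.size() >= 2);
    int bottleneck = std::numeric_limits<int>::max();
    for (std::size_t i = 0; i + 1 < path.size(); ++i)
        bottleneck = std::min(bottleneck, res[path[i]][path[i + 1]]);
    for (std::size_t i = 0; i + 1 < path.size(); ++i) {
        res[path[i]][path[i + 1]] -= bottleneck;  // less room on u->v
        res[path[i + 1]][path[i]] += bottleneck;  // more room on v->u
    }
    return bottleneck;
}
```

Augmenting the residual graph of an empty flow along a path gives the same matrix as building the residual graph of the flow that sends the bottleneck amount along that path, which is why an implementation can keep only the residual matrix and never store the flow itself.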
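Time complexity: O((m+n)f), where m is the number of edges, n is the number of nodes, and f is the value of the maximum flow (assuming integer capacities, so that each augmentation adds at least 1 to the flow).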
Code:
// C++ program for implementation of Ford Fulkerson algorithm
#include <algorithm>
#include <iostream>
#include <limits.h>
#include <string.h>
#include <queue>
using namespace std;

// Number of vertices in given graph
#define V 6

/* Returns true if there is a path from source 's' to sink 't' in
   residual graph. Also fills parent[] to store the path */
bool bfs(int rGraph[V][V], int s, int t, int parent[])
{
    // Create a visited array and mark all vertices as not visited
    bool visited[V];
    memset(visited, 0, sizeof(visited));

    // Create a queue, enqueue source vertex and mark source vertex
    // as visited
    queue<int> q;
    q.push(s);
    visited[s] = true;
    parent[s] = -1;

    // Standard BFS loop
    while (!q.empty())
    {
        int u = q.front();
        q.pop();

        for (int v = 0; v < V; v++)
        {
            if (visited[v] == false && rGraph[u][v] > 0)
            {
                q.push(v);
                parent[v] = u;
                visited[v] = true;
            }
        }
    }

    // If we reached sink in BFS starting from source, then return
    // true, else false
    return (visited[t] == true);
}

// Returns the maximum flow from s to t in the given graph
int fordFulkerson(int graph[V][V], int s, int t)
{
    int u, v;

    // Create a residual graph and fill the residual graph with
    // given capacities in the original graph as residual capacities
    // in residual graph.
    // rGraph[i][j] indicates the residual capacity of the edge from
    // i to j (if rGraph[i][j] is 0, then there is no such edge)
    int rGraph[V][V];
    for (u = 0; u < V; u++)
        for (v = 0; v < V; v++)
            rGraph[u][v] = graph[u][v];

    int parent[V];    // This array is filled by BFS to store the path
    int max_flow = 0; // There is no flow initially

    // Augment the flow while there is a path from source to sink
    while (bfs(rGraph, s, t, parent))
    {
        // Find minimum residual capacity of the edges along the
        // path filled by BFS. Or we can say find the maximum flow
        // through the path found.
        int path_flow = INT_MAX;
        for (v = t; v != s; v = parent[v])
        {
            u = parent[v];
            path_flow = min(path_flow, rGraph[u][v]);
        }

        // Update residual capacities of the edges and reverse edges
        // along the path
        for (v = t; v != s; v = parent[v])
        {
            u = parent[v];
            rGraph[u][v] -= path_flow;
            rGraph[v][u] += path_flow;
        }

        // Add path flow to overall flow
        max_flow += path_flow;
    }

    // Return the overall flow
    return max_flow;
}

// Driver program to test above functions
int main()
{
    // Let us create an example graph (entry [i][j] is the capacity of edge i->j)
    int graph[V][V] = { {0, 16, 13, 0, 0, 0},
                        {0, 0, 10, 12, 0, 0},
                        {0, 4, 0, 0, 14, 0},
                        {0, 0, 9, 0, 0, 20},
                        {0, 0, 0, 7, 0, 4},
                        {0, 0, 0, 0, 0, 0} };

    cout << "The maximum possible flow is " << fordFulkerson(graph, 0, 5);
    return 0;
}
III. Maximum Bipartite Matching
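With maximum flow in hand, this problem becomes easy. First add a source node and a destination node. For a task-assignment problem, the source gets an edge to every person and every task gets an edge to the destination, with capacity 1 on every edge of the network. We are looking for the best matching between people and tasks, and the size of a maximum matching equals the value of the maximum flow.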
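The construction just described can be written out directly. The sketch below assumes a particular node layout (0 is the source, then the people, then the tasks, then the destination) and an adjacency matrix with capacity 1 on every edge; `buildMatchingNetwork`, `pushUnit` and `maxMatchingViaFlow` are hypothetical names. Because every capacity is 1, each augmenting path carries exactly one unit, so a plain depth-first search is enough here.

```cpp
#include <cassert>
#include <vector>

using Matrix = std::vector<std::vector<int>>;
using BoolMatrix = std::vector<std::vector<bool>>;

// Assumed layout: 0 = source, 1..M = people, M+1..M+N = tasks, M+N+1 = sink.
Matrix buildMatchingNetwork(const BoolMatrix& wants) {
    const int M = static_cast<int>(wants.size());
    assert(M > 0);
    const int N = static_cast<int>(wants[0].size());
    const int sink = M + N + 1;
    Matrix cap(sink + 1, std::vector<int>(sink + 1, 0));
    for (int i = 0; i < M; ++i) {
        cap[0][1 + i] = 1;  // source -> person i
        for (int j = 0; j < N; ++j)
            if (wants[i][j]) cap[1 + i][1 + M + j] = 1;  // person i -> task j
    }
    for (int j = 0; j < N; ++j) cap[1 + M + j][sink] = 1;  // task j -> sink
    return cap;
}

// Depth-first search for one augmenting path from u to t; on success it
// moves one unit of flow along the path by updating residual capacities.
bool pushUnit(Matrix& res, int u, int t, std::vector<bool>& seen) {
    if (u == t) return true;
    seen[u] = true;
    for (int v = 0; v < static_cast<int>(res.size()); ++v) {
        if (!seen[v] && res[u][v] > 0 && pushUnit(res, v, t, seen)) {
            --res[u][v];
            ++res[v][u];
            return true;
        }
    }
    return false;
}

// Size of a maximum matching, computed as the maximum flow of the network.
int maxMatchingViaFlow(const BoolMatrix& wants) {
    Matrix res = buildMatchingNetwork(wants);
    const int t = static_cast<int>(res.size()) - 1;
    int matched = 0;
    while (true) {
        std::vector<bool> seen(res.size(), false);
        if (!pushUnit(res, 0, t, seen)) break;
        ++matched;
    }
    return matched;
}
```

The adjacency matrix keeps the sketch short but uses (M+N+2)^2 entries; adjacency lists would be the usual choice for larger inputs.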
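Code. Rather than building the flow network explicitly, the recursive bpm function searches for augmenting paths directly on the bipartite graph: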
// A C++ program to find maximum Bipartite matching.
#include <iostream>
#include <string.h>
using namespace std;

// M is number of applicants and N is number of jobs
#define M 6
#define N 6

// A DFS based recursive function that returns true if a
// matching for vertex u is possible
bool bpm(bool bpGraph[M][N], int u, bool seen[], int matchR[])
{
    // Try every job one by one
    for (int v = 0; v < N; v++)
    {
        // If applicant u is interested in job v and v is
        // not visited
        if (bpGraph[u][v] && !seen[v])
        {
            seen[v] = true; // Mark v as visited

            // If job 'v' is not assigned to an applicant OR
            // previously assigned applicant for job v (which is matchR[v])
            // has an alternate job available.
            // Since v is marked as visited in the above line, matchR[v]
            // in the following recursive call will not get job 'v' again
            if (matchR[v] < 0 || bpm(bpGraph, matchR[v], seen, matchR))
            {
                matchR[v] = u;
                return true;
            }
        }
    }
    return false;
}

// Returns maximum number of matching from M to N
int maxBPM(bool bpGraph[M][N])
{
    // An array to keep track of the applicants assigned to
    // jobs. The value of matchR[i] is the applicant number
    // assigned to job i, the value -1 indicates nobody is
    // assigned.
    int matchR[N];

    // Initially all jobs are available
    memset(matchR, -1, sizeof(matchR));

    int result = 0; // Count of jobs assigned to applicants
    for (int u = 0; u < M; u++)
    {
        // Mark all jobs as not seen for next applicant.
        bool seen[N];
        memset(seen, 0, sizeof(seen));

        // Find if the applicant 'u' can get a job
        if (bpm(bpGraph, u, seen, matchR))
            result++;
    }
    return result;
}

// Driver program to test above functions
int main()
{
    // Let us create an example bpGraph (entry [i][j] is 1 if
    // applicant i is interested in job j)
    bool bpGraph[M][N] = { {0, 1, 1, 0, 0, 0},
                           {1, 0, 0, 1, 0, 0},
                           {0, 0, 1, 0, 0, 0},
                           {0, 0, 1, 1, 0, 0},
                           {0, 0, 0, 0, 0, 0},
                           {0, 0, 0, 0, 0, 1} };

    cout << "Maximum number of applicants that can get job is "
         << maxBPM(bpGraph);
    return 0;
}
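IV. For the task-assignment problem there is also the Hungarian algorithm, which I will cover later. The time and space complexity of the algorithm above, and ways to improve it, are also worth discussing.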