
Copy-on-write

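Copy-on-write page protection is an optimization that the memory manager uses to conserve memory.
    When a process maps a copy-on-write view of a section object that contains read/write pages, the memory manager does not create a process-private copy at the time the view is mapped (as the Hewlett-Packard OpenVMS operating system does). Instead, it defers copying each page until that page is written to. All modern UNIX systems use this technique as well. For example, suppose two processes share three pages, each marked copy-on-write, and neither process has yet attempted to modify any data on them.
    If a thread in either process writes to one of the pages, a memory management fault is generated. The memory manager sees that the write targets a copy-on-write page, so instead of reporting the fault as an access violation it allocates a new read/write page in physical memory, copies the contents of the original page into it, updates the page-mapping information of that process to point to the new page, and dismisses the exception so that the faulting instruction is executed again. This time the write succeeds. The newly copied page is now private to the process that performed the write and is invisible to the other processes that still share the copy-on-write page. Every process that writes to a shared page gets its own private copy.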
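The fault-and-copy sequence described above can be sketched as a toy model in user-space C++. This is only an illustration under stated assumptions, not how any real memory manager is written: `ToyProcess`, `Page`, and `kPageSize` are hypothetical names, a `std::shared_ptr` stands in for a physical page shared between processes, and the `use_count()` check stands in for the page fault on a copy-on-write page.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical toy model: a "page" is a small fixed-size byte array.
constexpr std::size_t kPageSize = 16;
using Page = std::array<unsigned char, kPageSize>;

// Hypothetical stand-in for a process: maps page numbers to shared pages.
class ToyProcess {
public:
    // Map n fresh, zero-filled pages.
    explicit ToyProcess(std::size_t n) {
        for (std::size_t i = 0; i < n; ++i)
            pages_.push_back(std::make_shared<Page>());
    }

    // Create a second process that shares every page with this one.
    // Only the shared_ptr handles are copied, not the page contents.
    ToyProcess share() const { return *this; }

    unsigned char read(std::size_t page, std::size_t offset) const {
        return pages_.at(page)->at(offset);
    }

    void write(std::size_t page, std::size_t offset, unsigned char value) {
        assert(offset < kPageSize);
        std::shared_ptr<Page>& p = pages_.at(page);
        if (p.use_count() > 1) {
            // Stand-in for the fault path: the page is still shared, so
            // make a private copy and remap this process to it.
            p = std::make_shared<Page>(*p);
        }
        (*p)[offset] = value;  // the "retried" write now succeeds
    }

    // True if both processes still map the same underlying page.
    bool shares_page_with(const ToyProcess& other, std::size_t page) const {
        return pages_.at(page) == other.pages_.at(page);
    }

private:
    std::vector<std::shared_ptr<Page>> pages_;
};
```

In this model, after `ToyProcess b = a.share();` a call such as `b.write(0, 1, 42)` copies only page 0 for `b`. The other pages stay shared, and `a` keeps seeing the original contents of page 0.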
    One application of copy-on-write is implementing breakpoint support in debuggers.

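For example, code pages start out as execute-only by default (that is, read-only). If a programmer sets a breakpoint while debugging a program, however, the debugger must insert a breakpoint instruction into the code. It does so by first changing the protection of the page to PAGE_EXECUTE_READWRITE and then modifying the instruction stream. Because the code page is part of a mapped section, the memory manager creates a private copy for the process in which the breakpoint was set, while other processes continue to use the original, unmodified code page.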
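    Copy-on-write is one example of an evaluation technique called lazy evaluation, which the memory manager uses extensively. Lazy evaluation performs an expensive operation only when it is absolutely required. If the operation is never needed, no time is wasted on it.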
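Lazy evaluation in general can be shown with a small sketch. The `Lazy` class below is a hypothetical helper, not something from the original text: it assumes `T` is default-constructible, and it runs the supplied computation only on the first call to `get()`.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical helper: defers an expensive computation until its result
// is first requested, then caches the result.
template <typename T>
class Lazy {
public:
    explicit Lazy(std::function<T()> compute) : compute_(std::move(compute)) {
        assert(compute_ && "a computation must be supplied");
    }

    // Runs the computation on the first call only; later calls reuse it.
    const T& get() {
        if (!ready_) {
            value_ = compute_();
            ready_ = true;
        }
        return value_;
    }

    // True once the computation has actually run.
    bool evaluated() const { return ready_; }

private:
    std::function<T()> compute_;
    T value_{};          // assumes T is default-constructible
    bool ready_ = false;
};
```

If `get()` is never called, the computation never runs, which is the same trade the memory manager makes when it postpones copying a page until the first write.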
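    The POSIX subsystem uses copy-on-write to implement the fork function. When a UNIX application calls fork to create another process, the first thing the new process typically does is call exec to reinitialize its address space with an executable program. Rather than copying the entire address space during fork, the new process shares the pages with its parent by marking them copy-on-write. If the child writes to one of these pages, a process-private copy is made. If there are no writes, the two processes keep sharing the pages and nothing is copied. Either way, the memory manager copies only the pages a process actually tries to write to, not the whole address space.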
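The visible effect of fork can be checked with a short sketch, assuming a POSIX system that provides `fork`, `waitpid`, and `_exit`. The function name `parent_value_after_child_write` is made up for this example. Note that the isolation shown here is guaranteed by fork itself. Copy-on-write is the mechanism a kernel may use to provide it cheaply, and this code cannot observe whether pages were actually shared.

```cpp
#include <cassert>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical demo: returns the value the parent sees after a forked
// child has overwritten its own copy of `value`.
// Returns -1 if fork or waitpid fails.
int parent_value_after_child_write() {
    int value = 1;

    pid_t pid = fork();
    if (pid < 0) return -1;

    if (pid == 0) {
        // Child: this write goes to the child's private copy of the page.
        value = 2;
        _exit(value == 2 ? 0 : 1);
    }

    // Parent: wait for the child, then look at our own copy.
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) return -1;
    assert(WIFEXITED(status));
    return value;  // still the parent's original value
}
```

The child's write never reaches the parent, so the parent still reads its original value after the child exits.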

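Copy-on-write is a fairly common technique in programming, and it occasionally comes up in interviews (written tests for Java apparently often include questions on string copy-on-write). While reading the reference-counting item in More Effective C++ today, I came across copy-on-write, so here is a brief introduction. We will assume that the STL string supports copy-on-write. This is only an assumption that I have not verified in detail. Microsoft Visual Studio 6.0 is used as the example here, and if you are interested you can read its source code yourself.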

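How does copy-on-write work?
Programmers with some experience will know that copy-on-write relies on reference counting: a variable stores the number of references. When the first object is constructed, the string constructor allocates memory on the heap according to the argument passed in. When another object needs the same memory, the count is incremented. When an object is destructed, the count is decremented. Only when the last object is destructed, at which point the count is 1 or 0 depending on how the implementation counts, does the program actually free the heap memory.
Reference counting is the principle behind copy-on-write in the string class.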
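The life cycle of the count can be sketched with a hand-written class. `SharedText` and its nested `Rep` are hypothetical names, not the VC6 implementation. This sketch starts the count at 1 and frees the buffer when it reaches 0, is not thread-safe, and only covers sharing on copy (it offers no way to modify the text).

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical minimal string whose copies share one heap buffer.
class SharedText {
    // Shared representation: the characters plus the number of owners.
    struct Rep {
        char* data;
        std::size_t refs;
    };

public:
    // First owner: allocate the buffer and start the count at 1.
    explicit SharedText(const char* s) : rep_(new Rep) {
        std::size_t n = std::strlen(s);
        rep_->data = new char[n + 1];
        std::memcpy(rep_->data, s, n + 1);
        rep_->refs = 1;
    }

    // Copying shares the buffer and bumps the count.
    SharedText(const SharedText& other) : rep_(other.rep_) { ++rep_->refs; }

    SharedText& operator=(const SharedText& other) {
        if (rep_ != other.rep_) {
            ++other.rep_->refs;  // take the new reference first
            release();           // then drop the old one
            rep_ = other.rep_;
        }
        return *this;
    }

    // Destruction drops one reference; the last owner frees the memory.
    ~SharedText() { release(); }

    const char* c_str() const { return rep_->data; }
    std::size_t ref_count() const { return rep_->refs; }

private:
    void release() {
        assert(rep_->refs > 0);
        if (--rep_->refs == 0) {
            delete[] rep_->data;
            delete rep_;
        }
    }

    Rep* rep_;
};
```

Constructing one `SharedText` and copying it twice leaves all three objects returning the same `c_str()` pointer with `ref_count()` equal to 3. Once the copies go out of scope, the count is back to 1.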

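When is copy-on-write triggered?
Clearly, it is triggered only when one of the objects sharing the same memory changes its contents. Examples are the string operators [], =, +=, and +, and member functions such as insert, replace, and append. Destruction is involved as well, because it updates the reference count.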
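The distinction between reads and writes can be sketched as follows. `CowString` is a hypothetical wrapper written for this example, not a real library class. It assumes single-threaded use, keeps the text in a `std::shared_ptr<std::string>`, and uses `use_count()` as the reference count.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical copy-on-write wrapper around a shared std::string.
class CowString {
public:
    explicit CowString(const std::string& s)
        : buf_(std::make_shared<std::string>(s)) {}

    // Read-only access: never copies.
    char at(std::size_t i) const { return buf_->at(i); }
    const char* data() const { return buf_->c_str(); }

    // Mutating operations: detach from other owners before writing.
    void set(std::size_t i, char c) {
        detach();
        buf_->at(i) = c;
    }

    void append(const std::string& tail) {
        detach();
        buf_->append(tail);
    }

private:
    // If the buffer is shared, replace it with a private copy.
    void detach() {
        assert(buf_ != nullptr);
        if (buf_.use_count() > 1)
            buf_ = std::make_shared<std::string>(*buf_);
    }

    std::shared_ptr<std::string> buf_;
};
```

A real copy-on-write string also has to treat a non-const `operator[]` as a write, because it hands out a mutable reference that the caller may write through later. This sketch avoids the problem by exposing `set` instead.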

Example code:

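// Author: 程式碼瘋子
// Blog: http://www.programlife.net/
// Reference counting & copy-on-write
#include <cstdio>   // printf
#include <string>
using namespace std;
 
int main(int argc, char **argv)
{
	string sa = "Copy on write";
	string sb = sa;   // copy construction: may share sa's buffer
	string sc = sb;
	// %p expects a void pointer, so cast the buffer address explicitly.
	printf("sa char buffer address: %p\n", static_cast<const void*>(sa.c_str()));
	printf("sb char buffer address: %p\n", static_cast<const void*>(sb.c_str()));
	printf("sc char buffer address: %p\n", static_cast<const void*>(sc.c_str()));
 
	sc = "Now writing...";   // writing to sc must not affect sa or sb
	printf("After writing sc:\n");
	printf("sa char buffer address: %p\n", static_cast<const void*>(sa.c_str()));
	printf("sb char buffer address: %p\n", static_cast<const void*>(sb.c_str()));
	printf("sc char buffer address: %p\n", static_cast<const void*>(sc.c_str()));
 
	return 0;
}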
Copied from 程式人生
Home page: http://www.programlife.net
Source URL: http://www.programlife.net/copy-on-write.html

The output is as follows (VC 6.0):

[Screenshot: program output under VC 6.0]

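As the output shows, string in VC6 does support copy-on-write, but my Visual Studio 2008 does not support this feature (in both Debug and Release builds):

[Screenshot: program output under Visual Studio 2008, which does not use copy-on-write]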
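To repeat the comparison on another compiler without reading addresses by eye, the check can be wrapped in two small functions. This is a sketch with made-up function names. As far as I know, the requirements that C++11 places on std::string are generally understood to rule out copy-on-write implementations, so on a current standard library `copy_shares_buffer()` should be expected to return false.

```cpp
#include <cassert>
#include <string>

// True if copying a std::string leaves both objects pointing at the same
// character buffer, i.e. the library shares storage on copy.
bool copy_shares_buffer() {
    std::string a = "Copy on write";
    std::string b = a;
    return a.c_str() == b.c_str();
}

// True if, after the copy has been assigned a new value, the two strings
// use different buffers. This must hold with or without copy-on-write.
bool write_separates_buffers() {
    std::string a = "Copy on write";
    std::string b = a;
    b = "Now writing...";
    assert(a == "Copy on write");  // the original is never affected
    return a.c_str() != b.c_str();
}
```

A library that shares buffers on copy makes `copy_shares_buffer()` return true. `write_separates_buffers()` should return true everywhere, because the write must never be visible through the original string.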
Further reading (excerpted from Windows via C/C++, 5th Edition. If you prefer not to read English, see the Chinese PDF, page 442 of the Chinese edition):
Static Data Is Not Shared by Multiple Instances of an Executable or a DLL

When you create a new process for an application that is already running, the system simply opens another memory-mapped view of the file-mapping object that identifies the executable file’s image and creates a new process object and a new thread object (for the primary thread). The system also assigns new process and thread IDs to these objects. By using memory-mapped files, multiple running instances of the same application can share the same code and data in RAM.

Note one small problem here. Processes use a flat address space. When you compile and link your program, all the code and data are thrown together as one large entity. The data is separated from the code but only to the extent that it follows the code in the .exe file. (See the following note for more detail.) The following illustration shows a simplified view of how the code and data for an application are loaded into virtual memory and then mapped into an application’s address space.

[Illustration: an application's code and data pages loaded into virtual memory and mapped into its address space]

As an example, let’s say that a second instance of an application is run. The system simply maps the pages of virtual memory containing the file’s code and data into the second application’s address space, as shown next.

[Illustration: the same code and data pages mapped into the address space of a second instance]

If one instance of the application alters some global variables residing in a data page, the memory contents for all instances of the application change. This type of change could cause disastrous effects and must not be allowed.

The system prohibits this by using the copy-on-write feature of the memory management system. Any time an application attempts to write to its memory-mapped file, the system catches the attempt, allocates a new block of memory for the page containing the memory the application is trying to write to, copies the contents of the page, and allows the application to write to this newly allocated memory block. As a result, no other instances of the same application are affected. The following illustration shows what happens when the first instance of an application attempts to change a global variable in data page 2:

[Illustration: the first instance writes to data page 2 and receives a new page]

The system allocated a new page of virtual memory (labeled as “New page” in the image above) and copied the contents of data page 2 into it. The first instance’s address space is changed so that the new data page is mapped into the address space at the same location as the original address page. Now the system can let the process alter the global variable without fear of altering the data for another instance of the same application.

A similar sequence of events occurs when an application is being debugged. Let’s say that you’re running multiple instances of an application and want to debug only one instance. You access your debugger and set a breakpoint in a line of source code. The debugger modifies your code by changing one of your assembly language instructions to an instruction that causes the debugger to activate itself. So you have the same problem again. When the debugger modifies the code, it causes all instances of the application to activate the debugger when the changed assembly instruction is executed. To fix this situation, the system again uses copy-on-write memory. When the system senses that the debugger is attempting to change the code, it allocates a new block of memory, copies the page containing the instruction into the new page, and allows the debugger to modify the code in the page copy.

This article is from 程式人生 >> Copy On Write
Author: 程式碼瘋子