《設計資料密集型應用》——第一章原文+翻譯(上)
Reliable,Scalable,and Maintainable Applications(可靠,可伸縮,可維護的應用)
Many applications today are data-intensive, as opposed to compute-intensive. Raw CPU power is rarely limiting factor for these application—bigger problems are usually the amount of data, the complexity of data, the speed at which it is changing.
與計算密集型應用相對的,當今的很多應用都是資料密集型應用。這些應用很少受限於原始的CPU計算能力,更多的受限於資料的體量、資料的複雜度以及資料改變的速度。
A data-intensive application is typically built from standard building blocks that provide commonly need functionality. For example, many application need to:
- Store data so that they, or another application, can find it again later (databases)
- Remember the result of an expensive operation, so speed up reads (caches)
- Allow users to search data by keyword or filter it in various ways. (Search indexes)
- Send a message to another process, to be handled asynchronously (stream process)
- Periodically crunch a large amount of accumulated data (batch process)
資料密集型應用通常由一些能夠提供常見功能的標準構建塊構建而成。例如,很多應用都需要下面的功能:
- 儲存資料方便日後供自己或者其他應用訪問(資料庫)。
- 記住一些開銷大的操作,加快資料讀取(快取)。
- 允許使用者根據關鍵字搜尋,使用各種方式查詢資料(搜尋)
- 將訊息傳送至其他的程序,進行非同步處理(流式處理)
- 間斷性的處理累積的歷史資料(批處理)
If that sounds painfully obvious, that’s just because these data systems are such a successful abstractions: we use them all the time without thinking too much. When building an application, most engineers wouldn’t dream of writing a new data storage engine from scratch, because databases are a good tool for the job.
如果上面的這些功能聽起來極其的平淡無奇,只是因為這些資料系統做到了非常成功的抽象:我們不需要考慮太多問題就可以非常輕鬆的使用它們。在開發應用時,大部分的軟體工程師不會幻想著開發一個新的資料儲存引擎,因為有現成的資料庫供我們直接使用。
But reality is not that simple. There are many database systems with different characteristics, because different applications have different requirements. There are various approaches to caching, several ways to search indexes, and so on. When building an application, we still need to figure out which tools and which approaches are the most appropriate for the job at hand. And it can be hard to combine tools when you need to do something that a single tool cannot do alone.
但是現實不是那麼簡單。因為不同的應用有不同的需求,導致出現了不同種類的資料庫系統。例如實現快取的方法有很多,實現搜尋索引的方式也有一些,其他情況也類似。所以在構建應用時,我們必須明確哪些才是最適合系統的工具和方法。另外,當單一的工具不能滿足需求時就需要組合多種工具,工具的組合使用也不是那麼容易。
This book is a journey through both the principles and the practicalities of data systems, and how you can use them to build data-intensive applications. We will explore what different Tools have in common, what distinguish them, and how they achieve their characteristics.
這本書自始至終講解的是資料系統的原理以及實踐,告訴你怎麼使用這些資料系統構建資料密集型應用。我們將會探討不同的資料系統之間的相同點和不同點,以及它們的實現原理。
In this chapter, we will start by exploring the fundamentals of what we are trying to achieve: reliable,scalable,and maintainable Data systems. We’ll clarify what those things mean, outline some ways of thinking about them, and go over the basics we will need for later chapters.
在本章,我們首先探索一些的系統的基本原則:穩定性,伸縮性以及可維護性。我們會闡述它們具體的含義,概述一些思考這些問題的方式,以及準備一些後面將會用到的基礎知識。
Thinking about data systems
We typically think of databases,queues,caches,etc. as being very different categories of tools. Although a database and a message queue have some superficial similarity—both store data for some time—they have different access patterns, which means different performance characteristics, and thus different implementations.
通常情況下,資料庫、佇列、快取等工具被劃為為不同種類的工具。儘管一個數據庫和訊息佇列有著一些非常膚淺的相似點——都能儲存資料——但是它們有著完全不同的資料訪問模式,不同的效能追求,不同的實現。
So why should we lump them all together under an umbrella term like data systems?
那為什麼我們會把這些截然不同的資料工具都放到資料系統這個術語下面?
Many new tools for data storage and processing have emerged in recent years. They are optimizied for a variety of different use cases, and they no longer neatly fit into traditional categories. For example, there are message queues with database-like durability guarantees(Apache Kafka).The boundaries between the categories are becoming blurred.
近些年出現了許多針對於資料儲存和資料處理的新工具。它們針對於一系列不同的用例而設計,不再滿足於傳統的目錄分類。舉個例子,現在的訊息佇列能夠提供和資料庫類似的持久化保證(Apache Kafka)。不同目錄的界限正在變得模糊。
Secondary,increasingly many applications now have such demanding or wide-ranging requirements that a single tool can no longer meet all of its data processing and storing need. Instead, the work is broken down into tasks that can be preformed efficiently by a single tool, and those different tools are stitched together using application code.
第二,越來越多的應用的需求不再是單一的工具能夠解決的。代替的,應用被拆分成了多個部分,分別交給不同的工具進行處理,最後由應用層的程式碼將這些工具粘合起來。
For example, if you have an application-managed caching layer(using Memcached or similar), or a full-text search server(such as ELasticsearch or Solr) separate from your main database, it is normally the application code’s responsibility to keep those caches and indexes in sync with the main database.
舉個例子,你的應用除了主資料庫,可能還會有快取服務(Memcached或者其他快取),全文搜尋服務(ElasticSearch或者Solr)。最終靠應用層的程式碼來保證快取、全文索引和主資料庫的資料同步。
When you combine several tools in order to provide a service, the service’s interface or application programming interface usually hides those implementations details from clients. Now you have essentially created a new, special-purpose data systems From smaller, general-purpose components. Your composite data system may provide certain guarantees:e.e., that the cache will be correctly invalidated or updated on writes so that outside clients can see consistent results. You are now not only an application developer, but also a data system designer.
當結合幾種工具編寫出一個服務時,服務的介面或者應用程式介面會掩蓋掉具體的實現。到此,你基本上使用一些小的、通用的元件創造出了一個嶄新的、專用的資料系統。這個組合出來的系統也可以提供某些保證:例如當資料更新時,快取也應該及時更新,這樣客戶端才能查詢到一致的結果。
If you are designing a data system or service, a lot of tricky questions arise. How do you ensure that data remains correct and complete even when things go wrong internally? How do you provide consistently good performance to clients, even when parts of your system are degraded? How do you scale to handle an increase in load? What does a good API for service look like?
如果你正在設計一個數據系統,必須要考慮一些棘手的問題。當系統內部出錯時,怎麼保證資料的正確和完整?當系統部分降級後,怎麼確保效能不受影響?當負載增加後,怎麼相應的擴充套件系統?怎麼設計一個良好的服務API?
There are many factors that may influence the design of a data system, including the skills and experiences of people involved, legacy system dependencies, the time-scale for delivery, your organization’s tolerance of different kinds of risk, regulatory constraints, etc. Those factors depend very much on situations.
很多因素會影響到系統的設計,例如參與其中的人員的能力和工作經驗,遺產系統的依賴,交付的時間限制,企業對風險的容忍力度,規章的限制等等。這些因素需要根據具體的情況進行討論。
In this book, we focus on three concerns that are important in most software systems
本書主要關注三個軟體設計中的重要原則:
Reliability 穩定性
The system should continue to work correctly even in the face of adversity(hardware or software faults, and even human error)
在發生了某些異常(硬體錯誤,軟體錯誤,人為操作錯誤)時,系統仍能持續正常工作。
Scalability 伸縮性或擴充套件性
As the system grows(in data volume, traffic volumes, or complexity), there should be reasonable ways of dealing with that growth,
伴隨著系統的增長,包括資料體量、流量、或者複雜性的增長,系統應該存在合理的方式應對這些增長。
Maintainability 可維護性
Over time, many different people will work on the system(engineering and operations, both maintaining current behavior and adapting the system to new use cases), and they should all be able to work on it productively.
隨時間推移,會有不同職責的人在這個系統上工作,包括研發人員和運維人員,可能維護當前的系統功能也可能引入新的用例。他們應該能夠高效的在這個系統上工作。
These words are often cast around without a clear understanding of what they mean.In the interest of thoughtful engineering, we will spend the rest of this chapter exploring ways of thinking about reliability, scalability, maintainability. Then, in the following chapter, we will look at various techniques, architectures, and algorithms that are used in order to achieve those goals.
我們經常能夠搜到這些詞語,但是沒有關於這些詞的清晰解釋。考慮到要做一個有想法的軟體工程師,我們會在本章剩餘的部分探討關於穩定性、伸縮性、可維護性的思考方式。在接下來的章節中,討論一些實現這些目標的技術、架構和演算法。
Reliability 穩定性
Everybody has an intuitive idea of what it means for something to be reliable or unreliable. For software, typical expectations include:
- The application performs the function that the user expected.
- It can tolerate the user making mistake or using the software in unexpected ways
- Its performance is good enough for the required use case, under the expected load an data volume.
- The system prevents unauthorized access and abuse.
每個人對於一件東西是否穩定都有直觀的理解。對於軟體來說,穩定的系統應該是:
- 系統能夠按照使用者的預期執行功能。
- 系統能夠允許使用者犯錯、或者按照未預期的方式使用軟體。
- 在預期到的負載情況下,系統能夠提供良好的效能。
- 系統能夠阻止未授權的訪問和濫用
If all those things together mean “working correctly”, then we can understand reliability as meaning, roughly,”continuing to work correctly, even when things go wrong”
如果上面的這些要求結合在一起意味著一個系統能正常工作,那麼我們可以得出一個粗略的穩定性定義“在系統發生某些錯誤的時候,也能持續性的工作”。
The things that can go wrong was called faults,and systems that anticipate faults and can cope with them called fault-tolerant or resilient. The former term is slightly misleading: it suggests that we could make a system tolerant of every possible kind of fault, which in reality is not feasible. If the entire planet Earth was swallowed by a black hole, tolerance of that fault would require web hosting in space. So it only makes sense to talk about tolerating contain types of faults.
系統執行中發生錯誤的東西我們成為fault, 能夠預測到這些錯誤並且處理錯誤的系統我們稱為能夠“容忍錯誤”或者有”彈性”。容錯這個詞語可能會令人產生誤解:字面上看,我們可以使得系統容忍各種可能的錯誤,但實際是不可能的。設想一下,如果整個地球被一個黑洞吞沒,我們必須把服務架設在太空中才能容忍這種錯誤。因此在這裡討論容忍某些型別的錯誤才有意義。
Note that a fault is not the same as a failure. A fault is usually defined as one component of the system deviating from its spec, whereas a failure is when the system as a a whole stops providing the required service to the user. It is impossible to reduce the probability of a fault to zero; therefore it is usually best to design fault-tolerance mechanisms that prevent faults from causing failures. In this book we cover several techniques for building reliable systems from unreliable parts.
注意錯誤不等同於失敗。錯誤被定義為系統的某個元件偏離了其正常的執行,但是失敗代表著整個系統停止了服務。錯誤是不可避免的,所以通常情況下會設計容錯機制來阻止錯誤造成更嚴重的失敗。
Counterintuitively, in such fault-tolerant systems, it can make sense to increase the rate of faults by triaging them deliberately—for example, by randomly killing individual processes without warning. Many Many bugs are actually due to poor error handing;by deliberately inducing faults, you ensure that the fault-tolerate machinery is continually exercised and tested, which can increase you confidence that faults will be handled correctly when they occur naturally.
在容錯系統中,故意觸發錯誤(例如故意殺死某些程序)增加出錯的概率很有必要。因為很多bug都是因為錯誤處理不當引起的,通過故意引入錯誤來考驗我們系統的容錯機制,這樣當錯誤真正發生時才能正確處理錯誤。
Although we generally prefer to tolerating faults over preventing faults, there are cases where prevention is better than tolerance. This is the case with security matters, for example: if an attacker has compromised a system and gained access to sensitive data, that event cannot be undone. However, this book mostly deals with the kinds of faults that can be cured, as described in following sections.
儘管通常情況下容忍錯誤優於阻止錯誤,但是有一些情況阻止錯誤優於容忍錯誤。例如在安全領域,一個黑客攻破了一個系統獲得了資料的操作許可權,這種事情是無法容忍的。然而, 本書主要關注於那些能夠被解決的錯誤。
Hardware faults 硬體錯誤
When we think of causes of system failure, hardware faults quickly come to mind. Hard disks crash, RAM becomes faulty, the power grid has a blackout, someone unplugs the wrong network cable. Anyone who has worked with large data center can tell you that these things happen all the time when you have a lot of machines.
一提起系統失敗的原因,映入腦海的首先是硬體錯誤。例如硬碟故障,記憶體故障,電源故障,人為拔下網線等。有過在大型資料中心工作經驗的人會告訴你這些硬體故障一直在發生。
Hard disks are reported as having a mean time to failure(MTTF) of about 10 to 50 years. Thus, on a storage cluster with 10,000 Disks, we should expect on average one disk to die per day.
硬碟擁有著大約10~50年的平均使用壽命。在一個擁有著10000塊硬碟的儲存叢集中,可以預測平均每天壞一塊硬碟。
Our first response is usually to add redundancy to the individual hardware components in order to reduce the failure rate of the system. Disk may be set up in a RAID configuration, servers may have dual power and hot-swapped CPUs, and data enters may have diesel generators for backup power. When one component dies, redundant component can take its place while the broken component is replaced. This approach cannot completely prevent hardware problems from causing failures, but it is well understood and can often keep a machine running uninterrupted for years.
應對硬體故障的第一反應通常是為每個硬體元件增加冗餘。例如,硬碟可以配置RAID,伺服器可以配置雙電源和熱插拔的CPU,一些資料中心配有柴油發電機作為備份電源。當某個元件故障時,後備的元件接替故障的元件繼續提供服務。這種增加冗餘的技術無法完全阻止硬體故障造成系統失敗,但是這種技術非常成熟,通常能夠保證機器不間斷的執行數年。
Until recently, redundancy of hardware components was sufficient for most applications, since it makes total failure of a single machine fairly rare. As long as you restore a backup onto a new machine fairly quickly, the downtime in case of failure is not catastrophic in most applications. Thus, multi-machine redundancy was only required by a small number of applications for which high availability is essential.
一直到現在,增加硬體冗餘的技術仍然可以滿足大多數的軟體系統。而且對大多數軟體系統來說,發生硬體故障以後,只要能夠快速把系統在新的機器上恢復,短暫的宕機時間仍然可以接受。因此,只有少量的高可用系統才需要多機冗餘。
However, as data volumes and applications’ computing demands have increased, more applications have begun using larger numbers of machines, which proportionally increase the rate of hardware failure. Moreover, in some clout platforms such as Amazon Web Services(AWS) it is fairly common for virtual machine to become unavailable without warning, as the platforms are designed to prioritize flexibility and elasticity over single-machine reliability.
然而,隨著資料體量的增減和計算需要的增加,更多的應用開始使用多臺機器提供服務,這也成比例的增加了硬體故障的可能性。而且一些類似於AWS的雲平臺,它們更加關注於提供伸縮性和靈活性而非單機的可靠性,所以在這些平臺中虛擬機器硬體故障是很常見的。
Hence there is a move toward systems that can tolerate the loss of entire machines, by using software fault-tolerance techniques in preference or in addition to hardware redundancy. Such systems also have operational advantages: a single-server system requires planned downtime if you need to reboot the machine(to apply operating system security patches , for example), whereas a system that can tolerate machine failure can be patched one node at a time, without downtime of the entire system (a rolling upgrade).
因此,除了剛才提到的硬體冗餘之外,出現了一種趨勢:通過優先使用軟體容錯技術來容忍整個叢集故障。這樣的系統還具有操作優勢:如果需要重新啟動機器(例如,作業系統修補補丁),則單機的系統必須停機一段時間,而能夠容忍機器故障的系統可以一次修補一個節點,而不會造成整個系統的故障(即滾動升級)。
Software Errors 軟體錯誤
We usually think of hardware failure as being random and independent from each other: one machine’s disk failing does not imply that another machine’s disk is going to failure.There may be weak correlation(for example due to a common cause, such as the temperature in the server rack), but otherwise it is unlikely that a larger number of hardware components will fail at the same time.
通常情況下,硬體故障是隨機的,獨立於其他的機器而發生:某臺機器硬碟故障不能說明其他機器的硬碟也發生故障。儘管這其中有很弱的相關性,例如因為機架的溫度造成了多臺機器故障,但是一般情況下不可能大量的機器同時發生故障。
Another class of fault is a systematic error within the system. Such faults are hard to anticipate, and because they are correlated across nodes, they tend to cause many more system failures than hardware faults. Examples include:
另外一類錯誤是軟體系統內部的錯誤。軟體錯誤往往很難預測,而且因為節點之間的相關性,軟體錯誤會比硬體錯誤造成更多的系統失敗。軟體錯誤如下:
- A software bug that cause each instance of an system to crash when given a error input. For example, consider the leap second on June 30,2012, that caused many applications to hang simultaneously due to a bug in Linux kernel.
- A runaway process that uses up some shared resources—CPU time, memory, disk space , or network bandwidth.
- A service that a system depends on that slows down, becomes unresponsive,or starts returning corrupted response.
- Cascading failures, where a small fault in one component triggers a fault in another component, which in turn triggers future faults.
The bugs that cause these kinds of software faults lie dormant for a long time until they are triggered by an unusual set of circumstances. In those circumstances, it is revealed that the software is making some kind of assumption about its environment—and while that assumption is usually true, it eventually stops bring true for some reason.
造成軟體故障的bug被觸發之前可能會潛伏很長時間。這也顯示了軟體系統會對它所允許的環境做出某些假設,這些假設通常是正確的,但是最終會因為各種原因變得不正確。
There is no quick solution to the problem of systematic failure of system. Lots of small things can help: carefully thinking about assumptions and interactions in the system;thorough testing;process isolation;allowing processes to crash and restart;measuring,monitoring,and analyzing system behavior in production.If a system is expected to provide some guarantee(for example,in a message queue, that the number of incoming messages equals the number of outgoing messages), it can constantly check itself while it is running and raise an alert if a discrepancy is found.
解決軟體系統失敗沒有捷徑。有著一些建議:仔細考慮系統中的假設和互動;細緻的進行測試;程式隔離;程式崩潰後重啟機制;測量、監控、分析系統的執行情況。如果系統還被要求提供某些保證,例如訊息佇列中的入隊和出隊訊息數量必須一致,那麼系統必須一直自我檢查確保及時發現不一致的情況。
Human Errors 人為錯誤
Humans design and build software systems, and the operators who keep the systems running are also human.Even when they have the best intentions, humans are known to be unreliable. For example, one study of large internet services found that configuration errors by operators were the leading cause of outages, whereas hardware faults played a role in only 10-25% of outages.
是人類設計和夠建了軟體系統,也是人類操作員維持著系統的正常執行。即使人類的意圖再好,也是不可靠的。例如,一項關於大型網際網路服務的研究發現,運維人員的配置錯誤是導致系統故障的主要原因,而硬體問題只佔據了其中的10~25%。
How do we make our system reliable,in spite of unreliable human?The best systems combine several approaches:
- Design systems in a way that minimizes opportunities for error. For example, well-designed abstraction, APIs, and Amin interfaces make it easy to do “the right thing” and discourage “the wrong thing.” However, if the interfaces are too restrictive people will work around them, negating their benefits, so this is a tricky balance to get right.
- Decouple the places where people make the most mistakes from the places where they can cause failures. In particular, provide full featured non-production sandbox environments where people can explore and experiment safely, using real data, without affecting real users.
- Test thoroughy at all levels, from unit tests to whole system integration tests and manual tests. Automated testing is widely used, well understood, and especially valuable for covering corner cases that rarely arise in normal operations.
- Allow quickly and easy recovery from human errors,to minimize the impact in the case of a failure. For example, make it fast to roll back configuration changes, roll out new code gradually, and provide tools to recompute data(in case it turns out that the old computation was incorrect)
- Set up detailed and clear monitoring, such as performance metrics and error rates. In other engineering disciplines this is referred to as telemetry.(once a rocket has left the ground, telemetry is essential for tracking what is happening and for understanding failures) Monitoring can show us early warning signals and allow to us to check whether any assumption or constraints are being violated. When a problem occurs, metrics are invaluable in diagnosing the issue.
- Implementing good management practice and training—a complex and important aspect, and beyond the scope of this book.
How important is reliability? 穩定性的重要性
Reliability is not just for nuclear power station and air traffic control software—more mundane application are also expected to work reliably. Bugs in business applications cause lost productivity(and legal risks if figures are reported incorrectly), and outages of e-commerce sites can have huge costs in terms of lost revenue and damage to reputation.
不是隻有核電站或者航空工業才對系統穩定性有要求,民用的軟體也需要穩定性的保證。業務系統的bug會造成生產損失(如果結果出錯,可能承擔法律責任),電商網站的故障會造成收入和聲譽的損失。
Even in “noncritical” applications we have a responsibility to our users. Consider a parent who stores all their pictures and videos of their children in your photo application.How would they feel if that database was suddenly corrupted?Would they know how to restore it from a backup?
即使開發的是一些“不重要”的軟體,我們也需要對我們的使用者負責。設想一對父母把他們的家庭照片和視訊存放在了你的網站掛上,如果網站資料庫崩潰這對父母會有什麼樣的感受?他們知道怎麼樣從備份中恢復嗎?
There are situations in which we may sacrifice reliability in order to reduce development cost or operational cost—but we should be very conscious of when we are cutting corner.
也存在一些為了減少開發成本而犧牲穩定性的情況——但是我們一定要考慮清楚什麼情況下才能這麼做。