Course Seminar | Database Principles 1 | Week 2-4

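Seminar topic

What is a relational database? What is a non-relational database? How do they differ? Give one typical product of each and briefly describe its characteristics.

Seminar content

Relational databases

A relational database is a database built on the relational model.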

What is a Relational Database?

Relational databases maintain data in tables, providing an efficient, intuitive, and flexible way to store and access structured information. Tables, also known as relations, consist of columns containing one or more data categories, and rows, also known as table records, containing a set of data defined by the category. Applications access data by specifying queries, which use operations such as project to identify attributes, select to identify tuples, and join to combine relations. The relational model for database management was developed by IBM computer scientist Edgar F. Codd in 1970.
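
The three operations named above can be tried out with Python's built-in `sqlite3` module. This is only a minimal sketch: the `employees` and `departments` tables and their contents are invented for illustration and do not come from the quoted source.

```python
import sqlite3

# In-memory database; all table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, dept_id INTEGER);
    INSERT INTO departments VALUES (1, 'Sales'), (2, 'Research');
    INSERT INTO employees VALUES (1, 'Ann', 1), (2, 'Bob', 2), (3, 'Cid', 2);
""")

# Project: keep only one attribute (column) of the relation.
projected = conn.execute(
    "SELECT name FROM employees ORDER BY id"
).fetchall()

# Select: keep only the tuples (rows) that satisfy a predicate.
selected = conn.execute(
    "SELECT id, name, dept_id FROM employees WHERE dept_id = 2 ORDER BY id"
).fetchall()

# Join: combine two relations on a shared value.
joined = conn.execute("""
    SELECT e.name, d.name
    FROM employees AS e
    JOIN departments AS d ON e.dept_id = d.id
    ORDER BY e.id
""").fetchall()

print(projected)
print(selected)
print(joined)
```

Note that SQL's `SELECT` keyword covers both projection (the column list) and selection (the `WHERE` clause), which is a common source of confusion when mapping relational algebra onto SQL.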

How do Relational Databases Work?

Relational databases provide an environment from which data can be accessed or reassembled in a variety of different ways without needing to reorganize the database tables. Each table has a unique identifier, or primary key, which identifies the information in the table, and each row contains a unique instance of data for the categories defined by the columns. For instance, the table might have a primary key of ‘First Names’ and rows with specific examples such as ‘John, Paul, George and Ringo.’

The logical connection between different tables can then be established with the use of foreign keys - a field in a table that connects to the primary key data of another table. Relational Database Management Systems often employ SQL or structured query language for gathering data for reports and for interactive queries. So in our example, First Names might be linked to a Role table with data roles of Lead Vocals, Bass Guitar, Drums and Lead Guitar.
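
As a rough illustration of primary and foreign keys, the sketch below rebuilds the example with `sqlite3`. The table names, the `role_id` column, and the assignment of roles to members are assumptions made for the example. Note that SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
    CREATE TABLE roles (
        role_id INTEGER PRIMARY KEY,
        title   TEXT NOT NULL
    );
    CREATE TABLE members (
        first_name TEXT PRIMARY KEY,                           -- primary key
        role_id    INTEGER NOT NULL REFERENCES roles(role_id)  -- foreign key
    );
    INSERT INTO roles VALUES (1, 'Lead Vocals'), (2, 'Bass Guitar'),
                             (3, 'Lead Guitar'), (4, 'Drums');
    INSERT INTO members VALUES ('John', 1), ('Paul', 2), ('George', 3), ('Ringo', 4);
""")

def try_insert(first_name, role_id):
    """Insert one member; report whether the database accepted the row."""
    try:
        with conn:  # commit on success, roll back on error
            conn.execute("INSERT INTO members VALUES (?, ?)", (first_name, role_id))
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"

duplicate = try_insert("John", 2)   # violates the primary key (John already exists)
orphan = try_insert("Pete", 99)     # violates the foreign key (no role 99)
accepted = try_insert("Pete", 4)    # valid: new key, existing role

print(duplicate)
print(orphan)
print(accepted)
```

A first name is a poor primary key in practice because names are not unique; it is kept here only to mirror the example in the text.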

How is Data in a Relational Database System Organized?

The relational model of the relational database separates logical data structures from physical storage structures, enabling database administrators to manage physical data storage without affecting access to that data as a logical structure. The distinction also applies to database operations: logical operations allow an application to specify the content it needs, and physical operations determine how that data should be accessed and then carry out the task.

What are the Advantages of a Relational Database?

The main advantage of a relational database is its formally described, tabular structure, from which data can be easily stored, categorized, queried, and filtered without needing to reorganize database tables. Further benefits of relational databases include:

  • Scalability: New data may be added independent of existing records.
  • Simplicity: Complex queries are easy for users to perform with SQL.
  • Data Accuracy: Normalization procedures eliminate design anomalies.
  • Data Integrity: Strong data typing and validity checks ensure accuracy and consistency.
  • Security: Access to data in tables within an RDBMS can be limited to specific users.
  • Collaboration: Multiple users can access the same database concurrently.

What is a Relational Database Management System?

A Relational Database Management System is a tabular based collection of programs and capabilities that provides an interface between users and applications and the database, offering a systematic way to create, update, delete, manage, and retrieve data. Most relational database management systems use the SQL programming language to access the database and many follow the ACID (Atomicity, Consistency, Isolation, Durability) properties of the database:

  • Atomicity: If any statement in the transaction fails, the entire transaction fails and the database is left unchanged.
  • Consistency: The transaction must meet all protocols defined by the system – no partially completed transactions.
  • Isolation: No transaction has access to any other transaction that is unfinished. Each transaction is independent.
  • Durability: Once a transaction has been committed, it will remain committed through the use of transaction logs and backups.

In other words:

  • Atomicity

    All operations either succeed together or fail together and are rolled back; a failed operation must leave no effect on the database.

  • Consistency

    The database must be in a consistent state both before and after a transaction executes.

  • Isolation

    When multiple users access the database concurrently, the transaction opened for each user must not be disturbed by the operations of other transactions; concurrent transactions are isolated from one another.

  • Durability

    Once a transaction is committed, its changes to the data are permanent and are not lost even if the database system fails.
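
Atomicity is the easiest of these properties to observe directly. The following sketch uses `sqlite3` with a made-up `accounts` table whose `CHECK` constraint forbids negative balances (both are assumptions for illustration). When the second statement of a transfer fails, the first statement is rolled back as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE accounts (
        name    TEXT PRIMARY KEY,
        balance INTEGER NOT NULL CHECK (balance >= 0)
    );
    INSERT INTO accounts VALUES ('alice', 100), ('bob', 50);
""")

def transfer(src, dst, amount):
    """Move `amount` from src to dst in one transaction; return True if committed."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False

def balances():
    return dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))

first = transfer("alice", "bob", 30)    # succeeds
second = transfer("alice", "bob", 500)  # debit would go negative, so both updates are undone

print(first, second, balances())
```

The credit to `bob` is deliberately issued first, so the failed transfer shows that an already-executed statement is undone and not merely skipped.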

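Non-relational databases

Non-relational databases store data in a non-tabular form and are more flexible than traditional SQL-based relational database structures. They do not follow the relational model provided by traditional relational database management systems.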

Non-relational databases (often called NoSQL databases) are different from traditional relational databases in that they store their data in a non-tabular form. Instead, non-relational databases might be based on data structures like documents. A document can be highly detailed while containing a range of different types of information in different formats. This ability to digest and organize various types of information side-by-side makes non-relational databases much more flexible than relational databases.

Non-relational databases are often used when large quantities of complex and diverse data need to be organized. For example, a large store might have a database in which each customer has their own document containing all of their information, from name and address to order history and credit card information. Despite their differing formats, each of these pieces of information can be stored in the same document.

Non-relational databases often perform faster because a query doesn’t have to view several tables in order to deliver an answer, as relational datasets often do. Non-relational databases are therefore ideal for storing data that may be changed frequently or for applications that handle many different kinds of data. They can support rapidly developing applications requiring a dynamic database able to change quickly and to accommodate large amounts of complex, unstructured data.
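
To make the "one document per customer" idea concrete, here is a minimal sketch that uses plain Python dictionaries as a stand-in for a document collection. The field names and the `insert`/`find` helpers are hypothetical and are not the API of any particular database.

```python
import json

customers = {}  # stand-in "collection": document id -> document

def insert(doc):
    """Store a document under its _id; no schema is checked."""
    customers[doc["_id"]] = doc

def find(predicate):
    """Return every document for which predicate(doc) is true."""
    return [doc for doc in customers.values() if predicate(doc)]

# Two documents in the same collection with different shapes.
insert({
    "_id": "c1",
    "name": "Ada",
    "address": {"city": "London", "postcode": "N1"},
    "orders": [{"sku": "book", "qty": 2}, {"sku": "pen", "qty": 10}],
})
insert({
    "_id": "c2",
    "name": "Lin",
    "orders": [],
    "loyalty_tier": "gold",  # a field the first document does not have
})

# Everything about a customer is read from one document, without a join.
with_orders = find(lambda d: len(d.get("orders", [])) > 0)
total_qty = sum(item["qty"] for item in customers["c1"]["orders"])
serialized = json.dumps(customers["c2"], sort_keys=True)

print([d["name"] for d in with_orders], total_qty)
print(serialized)
```

The flexibility has a cost: because nothing enforces a schema, code that reads the documents has to tolerate missing fields, as the `d.get("orders", [])` call does here.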

A NoSQL (originally referring to “non-SQL” or “non-relational”) database provides a mechanism for storage and retrieval of data that is modeled in means other than the tabular relations used in relational databases. Such databases have existed since the late 1960s, but the name “NoSQL” was only coined in the early 21st century, triggered by the needs of Web 2.0 companies. NoSQL databases are increasingly used in big data and real-time web applications. NoSQL systems are also sometimes called “Not only SQL” to emphasize that they may support SQL-like query languages or sit alongside SQL databases in polyglot-persistent architectures.

Motivations for this approach include: simplicity of design, simpler “horizontal” scaling to clusters of machines (which is a problem for relational databases), finer control over availability and limiting the object-relational impedance mismatch. The data structures used by NoSQL databases (e.g. key–value pair, wide column, graph, or document) are different from those used by default in relational databases, making some operations faster in NoSQL. The particular suitability of a given NoSQL database depends on the problem it must solve. Sometimes the data structures used by NoSQL databases are also viewed as “more flexible” than relational database tables.

The benefits of a non-relational database

Today’s applications collect and store increasingly vast quantities of ever-more complex customer and user data. The benefit of this data to businesses, of course, lies in its potential for analysis. Using a non-relational database can unlock patterns and value even within masses of variegated data.

There are several advantages to using non-relational databases, including:

  • Massive dataset organization
    In the age of Big Data, non-relational databases can not only store massive quantities of information, but they can also query these datasets with ease. Scale and speed are crucial advantages of non-relational databases.

  • Flexible database expansion
    Data is not static. As more information is collected, a non-relational database can absorb these new data points, enriching the existing database with new levels of granular value even if they don’t fit the data types of previously existing information.

  • Multiple data structures
    The data now collected from users takes on a myriad of forms, from numbers and strings, to photo and video content, to message histories. A database needs the ability to store these various information formats, understand relationships between them, and perform detailed queries. No matter what format your information is in, non-relational databases can collate different information types together in the same document.

  • Built for the cloud
    A non-relational database can be massive. And because they can in some cases grow exponentially, they need a hosting environment that can grow and expand with them. The cloud’s inherent scalability makes it an ideal home for non-relational databases.

Non-relational databases and application development

Applications must be able to query data efficiently and deliver results almost instantly. Non-relational databases are a natural choice for this kind of environment. They offer both security and agility, allowing for rapid development of applications in an agile environment. Easier and less complex to manage than relational databases, they can also yield lower data management costs while providing superior performance and speed.

A natural fit for agile development, non-relational databases can accommodate the complexity of data inputs more efficiently than structured databases. In an age of increasing data complexity, non-relational databases provide the flexibility in database design that has become increasingly indispensable. Especially when paired with the cloud, non-relational databases lift the limits on your data collection, organization, and analysis, allowing you to get the most out of your data.

Types of NoSQL Databases

Over time, four major types of NoSQL databases emerged: document databases, key-value databases, wide-column stores, and graph databases. Let’s examine each type.

  • Document databases store data in documents similar to JSON (JavaScript Object Notation) objects. Each document contains pairs of fields and values. The values can typically be a variety of types including things like strings, numbers, booleans, arrays, or objects, and their structures typically align with objects developers are working with in code. Because of their variety of field value types and powerful query languages, document databases are great for a wide variety of use cases and can be used as a general purpose database. They can horizontally scale-out to accommodate large data volumes. MongoDB is consistently ranked as the world’s most popular NoSQL database according to DB-engines and is an example of a document database.

  • Key-value databases are a simpler type of database where each item contains keys and values. A value can typically only be retrieved by referencing its key, so learning how to query for a specific key-value pair is typically simple. Key-value databases are great for use cases where you need to store large amounts of data but you don’t need to perform complex queries to retrieve it. Common use cases include storing user preferences or caching. Redis and DynamoDB are popular key-value databases.

  • Wide-column stores store data in tables, rows, and dynamic columns. Wide-column stores provide a lot of flexibility over relational databases because each row is not required to have the same columns. Many consider wide-column stores to be two-dimensional key-value databases. Wide-column stores are great for when you need to store large amounts of data and you can predict what your query patterns will be. Wide-column stores are commonly used for storing Internet of Things data and user profile data. Cassandra and HBase are two of the most popular wide-column stores.

  • Graph databases store data in nodes and edges. Nodes typically store information about people, places, and things while edges store information about the relationships between the nodes. Graph databases excel in use cases where you need to traverse relationships to look for patterns such as social networks, fraud detection, and recommendation engines. Neo4j and JanusGraph are examples of graph databases.

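  • Document databases

    • Typical use cases

      Web applications.

    • Unsuitable use cases

      Transactions that span multiple documents. Document-oriented databases have traditionally offered limited or no support for transactions across documents, so if this is a hard requirement, check the product carefully or choose a different solution.

  • Key-value databases

    Some products also support search.

    Key-value stores use the associative array (also called a map or dictionary) as their fundamental data model. In this model, data is represented as a collection of key-value pairs, such that each possible key appears at most once in the collection.

    The key-value model is one of the simplest non-trivial data models, and richer data models are often implemented as extensions of it. The key-value model can be extended to a discretely ordered model that maintains keys in lexicographic order. This extension is computationally powerful because it can efficiently retrieve selective key ranges.

    Key-value stores can use consistency models ranging from eventual consistency to serializability. Some databases support ordering of keys. There are various hardware implementations: some users keep data in memory, while others store it on solid-state drives or rotating disks.

    • Typical use cases

      Content caching, mainly to handle high access loads over large amounts of data; also used in some logging systems.

    • Unsuitable use cases

      1. Querying by value instead of by key. A key-value database generally offers no way to look data up by value.
      2. Storing relationships between data. In a key-value database you cannot relate data through two or more keys.
      3. Transaction support. Many key-value databases cannot roll back when a failure occurs.
  • Wide-column stores

    • Typical use cases

      Distributed file systems.

    • Unsuitable use cases

      1. When you need ACID transactions. Cassandra, for example, does not support full ACID transactions.
      2. Prototyping. If we analyze Cassandra's data structures, we find that they are designed around the queries we expect to run. Early in the design of a model it is hard to predict how the data will be queried, and once the query patterns change the column families have to be redesigned.
  • Graph databases

    Graph databases are designed for data whose relations are well represented as a graph consisting of elements connected by a finite number of relations. Examples include social relations, public transport links, road maps, and network topologies.

    • Typical use cases

      Social networks, recommendation systems, and similar applications that focus on building relationship graphs.

    • Unsuitable use cases

      Data models that do not fit a graph. The range of application of graph databases is fairly narrow, because few operations involve the whole graph.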
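
The "traverse relationships to look for patterns" use case for graph databases can be sketched in a few lines of plain Python. The people, the friendship edges, and the helpers `within_hops` and `suggest_friends` are all invented for illustration. A real graph database would run a comparable traversal through its own query language.

```python
from collections import deque

# Undirected "friend" edges; names are illustrative only.
edges = [("ann", "bob"), ("bob", "cid"), ("cid", "dee"), ("ann", "eve")]

# Adjacency list: node -> set of neighbouring nodes.
graph = {}
for a, b in edges:
    graph.setdefault(a, set()).add(b)
    graph.setdefault(b, set()).add(a)

def within_hops(start, max_hops):
    """Breadth-first traversal; returns {node: distance} for nodes within max_hops."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if dist[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for nxt in graph.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist

def suggest_friends(person):
    """Friends of friends who are not already direct friends."""
    dist = within_hops(person, 2)
    return sorted(n for n, d in dist.items() if d == 2)

print(suggest_friends("ann"))
print(suggest_friends("bob"))
```

The traversal only touches the neighbourhood of the starting node, which is why this style of query suits graph stores. The same question in a relational schema would typically need a self-join per hop.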

Data model              | Performance | Scalability     | Flexibility | Complexity | Functionality
Key–value store         | high        | high            | high        | none       | variable (none)
Column-oriented store   | high        | high            | moderate    | low        | minimal
Document-oriented store | high        | variable (high) | high        | low        | variable (low)
Graph database          | variable    | variable        | high        | high       | graph theory
Relational database     | variable    | variable        | low         | moderate   | relational algebra

Differences and comparison

Non Relational Databases, or NoSQL databases, store and organize data in means other than the tabular relations model used in relational databases. Where relational databases store data in rows and columns, have strict rules concerning data variety and table relationships, and follow strict ACID properties, non relational databases offer a more flexible data structure based on the BASE (Basically Available, Soft state, Eventual consistency) model: Basically Available guarantees the availability of the data - there will be a response to any request, but without any consistency guarantee; Soft State guarantees that the state of the system could change over time; and Eventual Consistency guarantees that the system will eventually become consistent once it stops receiving inputs.

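  • Different data storage models

    The main difference between relational and non-relational databases is how data is stored. Relational data is naturally tabular, so it is stored in the rows and columns of tables. Tables can be related to one another and stored cooperatively, and data is easy to extract.

    Non-relational data, by contrast, does not fit into the rows and columns of tables and is instead grouped together in larger chunks. It is usually stored in data sets such as documents, key-value pairs, or graph structures. Your data and its characteristics are the primary factor in choosing how to store and retrieve it.

  • Different scaling approaches

    Perhaps the biggest difference between SQL and NoSQL databases is how they scale, and scaling is unavoidable when demand keeps growing.

    To support more concurrent load, a SQL database scales vertically: you increase processing power by using a faster machine, so that the same data set is processed more quickly.

    Because data is stored in relational tables, a performance bottleneck may involve many tables, and it has to be overcome by making the machine more powerful. SQL databases have plenty of room to scale, but eventually they reach the ceiling of vertical scaling. NoSQL databases, by contrast, scale horizontally.

    Non-relational data stores are naturally distributed, so a NoSQL database can scale by adding more commodity database servers (nodes) to the resource pool to share the load.

  • Different transaction support

    If your data operations are highly transactional, or complex queries require control over the execution plan, a traditional SQL database is the best choice in terms of performance and stability. SQL databases support fine-grained control over transaction atomicity and make it easy to roll back transactions.

    NoSQL databases can also use transactional operations, but in terms of stability they generally cannot match relational databases, so their real value lies in operational scalability and large-volume data processing.

  • Storage format

    SQL data is stored in tables with a specific structure; NoSQL is more flexible and scalable, and data can be stored as JSON documents, hash tables, or in other ways.

  • Relationship between tables/collections and their data

    In SQL, tables and field structures must be defined before data can be added. A table structure can be updated after it is defined, but large structural changes become complicated. In NoSQL, data can be added at any time and anywhere without defining a table first. NoSQL can create indexes on a data set, and it is better suited to projects where the initial data is still unclear or undecided.

  • External data storage

    In SQL, if you need to add externally related data, the normalized approach is to add a foreign key to the original table that references the external table. In NoSQL, besides this normalized approach, you can also denormalize and put the external data directly into the original data set to improve query efficiency.

  • JOIN queries in SQL

    In SQL, you can use JOIN to link tables and fetch data from multiple relational tables with a single, simple query. NoSQL databases generally do not provide a direct equivalent of JOIN for querying across multiple data sets.

  • Data coupling

    SQL does not allow deleting external data that is still referenced; NoSQL has no such strong coupling, and any data can be deleted at any time.

  • Transactions

    In SQL, when data in several tables must be updated as one batch, a failure in updating one table means the others must not be updated either. This scenario can be controlled with a transaction, which is committed only after all commands have completed. Many NoSQL systems have no comparable multi-table transaction; instead, each operation on a single data set is atomic.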
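
The trade-off between the normalized (foreign key plus JOIN) and denormalized (embedded copy) approaches from the list above can be sketched side by side. The `authors`/`posts` schema and the dictionary-based "documents" are assumptions for illustration only.

```python
import sqlite3

# Normalized: author data lives in one place and posts reference it by key.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (
        id        INTEGER PRIMARY KEY,
        title     TEXT,
        author_id INTEGER REFERENCES authors(id)
    );
    INSERT INTO authors VALUES (1, 'Ada');
    INSERT INTO posts VALUES (10, 'Hello', 1), (11, 'World', 1);
""")
normalized = conn.execute("""
    SELECT p.title, a.name
    FROM posts AS p JOIN authors AS a ON p.author_id = a.id
    ORDER BY p.id
""").fetchall()

# Denormalized: each post document carries its own copy of the author data,
# so reading a post needs no join.
posts = [
    {"_id": 10, "title": "Hello", "author": {"id": 1, "name": "Ada"}},
    {"_id": 11, "title": "World", "author": {"id": 1, "name": "Ada"}},
]
denormalized = [(p["title"], p["author"]["name"]) for p in posts]

# The cost appears on update: one row in the normalized schema...
conn.execute("UPDATE authors SET name = 'Ada L.' WHERE id = 1")

# ...but every embedded copy in the denormalized documents.
touched = 0
for p in posts:
    if p["author"]["id"] == 1:
        p["author"]["name"] = "Ada L."
        touched += 1

print(normalized, denormalized, touched)
```

Both layouts answer the read query identically. The difference is where the work goes: the normalized form pays at read time (the join), the denormalized form pays at write time (updating every copy).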

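Relational databases are suited to storing structured data, such as user accounts and addresses:

  1. This data usually needs structured queries (admittedly, that sounds obvious), such as joins, and here relational databases have the edge
  2. The size and growth rate of this data are usually predictable
  3. Transactions and consistency

NoSQL is suited to storing unstructured data, such as articles and comments:

  1. This data is usually used for fuzzy processing, such as full-text search and machine learning
  2. The data is massive, and its growth rate is hard to predict
  3. Given the nature of this data, NoSQL databases usually offer unlimited (or at least nearly unlimited) scalability
  4. Fetching data by key is very efficient, but support for joins and other structured queries is relatively poor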

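CAP theorem

A distributed computing system cannot satisfy all three of the following at the same time:

  • Consistency

    Every node sees the same, most recent copy of the data.

    Every node in the cluster responds with the most recent data, even if the system must block requests until all replicas are updated. If you query a consistent system for an item that is currently being updated, you wait for the response until all replicas have been updated successfully, but you receive the most current data.

  • Availability

    Every request receives a non-error response, but with no guarantee that the data returned is the most recent.

    Every node returns an immediate response, even if that response is not the most recent data. If you query an available system for an item that is being updated, you get the best answer the service can provide at that moment.

  • Partition tolerance

    In practical terms, a partition amounts to a time bound on communication. If the system cannot reach data consistency within that bound, a partition has occurred, and the system must choose between C and A for the current operation.

    The system continues to operate even if a replicated data node fails or loses connectivity with other replicated data nodes.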

Relational databases typically provide consistency and availability, but not partition tolerance. They are typically provisioned to a single server and scale vertically by adding more resources to the machine.

Many relational database systems support built-in replication features where copies of the primary database can be made to other secondary server instances. Write operations are made to the primary instance and replicated to each of the secondaries. Upon a failure, the primary instance can fail over to a secondary to provide high availability. Secondaries can also be used to distribute read operations. While write operations always go against the primary replica, read operations can be routed to any of the secondaries to reduce system load.

Data can also be horizontally partitioned across multiple nodes, such as with sharding. But sharding dramatically increases operational overhead by splitting data across many pieces that cannot easily communicate. It can be costly and time consuming to manage. It can end up impacting performance, table joins, and referential integrity.

If data replicas were to lose network connectivity in a “highly consistent” relational database cluster, you wouldn’t be able to write to the database. The system would reject the write operation as it can’t replicate that change to the other data replica. Every data replica has to update before the transaction can complete.
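
Horizontal partitioning across nodes, as discussed above, can be sketched as hash-based sharding. The `ShardedStore` class below is a hypothetical toy in which each "node" is just a Python dictionary; it is not how any specific database implements sharding.

```python
import hashlib

class ShardedStore:
    """Toy key-value store that spreads keys over a fixed number of shards."""

    def __init__(self, num_shards):
        # Each dict stands in for one database node.
        self.shards = [{} for _ in range(num_shards)]

    def _shard_for(self, key):
        # Use a stable hash so the same key always maps to the same shard.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.shards)

    def put(self, key, value):
        self.shards[self._shard_for(key)][key] = value

    def get(self, key):
        return self.shards[self._shard_for(key)].get(key)

store = ShardedStore(num_shards=4)
for i in range(1000):
    store.put(f"user:{i}", {"id": i})

sizes = [len(shard) for shard in store.shards]
print(sizes)
print(store.get("user:42"))
```

A lookup by key touches exactly one shard, which is what makes this layout scale. A query that spans keys (a join, or a filter on a value) has to visit every shard, which is the operational cost the paragraph above describes. With plain modulo placement, changing the number of shards also remaps most keys, which is why production systems tend to use schemes such as consistent hashing or range partitioning instead.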

NoSQL databases typically support high availability and partition tolerance. They scale out horizontally, often across commodity servers. This approach provides tremendous availability, both within and across geographical regions at a reduced cost. You partition and replicate data across these machines, or nodes, providing redundancy and fault tolerance. The downside is consistency. A change to data on one NoSQL node can take some time to propagate to other nodes. Typically, a NoSQL database node will provide an immediate response to a query - even if the data that is presented is stale and hasn’t updated yet.

If data replicas were to lose connectivity in a “highly available” NoSQL database cluster, you could still complete a write operation to the database. The database cluster would allow the write operation and update each data replica as it becomes available.

This kind of result is known as eventual consistency, a characteristic of distributed data systems where ACID transactions aren’t supported. It’s a brief delay between the update of a data item and time that it takes to propagate that update to each of the replica nodes. Under normal conditions, the lag is typically short, but can increase when problems arise. For example, what would happen if you were to update a product item in a NoSQL database in the United States and query that same data item from a replica node in Europe? You would receive the earlier product information, until the cluster updates the European node with the product change. By immediately returning a query result and not waiting for all replica nodes to update, you gain enormous scale and volume, but with the possibility of presenting older data.
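
The United States/Europe example can be simulated with a toy cluster in which a write is applied to one replica immediately and reaches the others only when `propagate()` is called. The `Replica` and `Cluster` classes are hypothetical and leave out everything a real system needs, such as conflict resolution, ordering, and failure handling.

```python
class Replica:
    def __init__(self, name):
        self.name = name
        self.data = {}

class Cluster:
    """Toy cluster: writes land on one replica and reach the others later."""

    def __init__(self, names):
        self.replicas = {n: Replica(n) for n in names}
        self.pending = []  # (target replica, key, value) not yet delivered

    def write(self, at, key, value):
        # Acknowledge immediately after updating the local replica.
        self.replicas[at].data[key] = value
        for name in self.replicas:
            if name != at:
                self.pending.append((name, key, value))

    def read(self, at, key):
        # Answer from local state, even if it may be stale.
        return self.replicas[at].data.get(key)

    def propagate(self):
        # Deliver queued updates in order; afterwards all replicas agree.
        for name, key, value in self.pending:
            self.replicas[name].data[key] = value
        self.pending.clear()

cluster = Cluster(["us", "eu"])
cluster.write("us", "price", 10)
cluster.propagate()

cluster.write("us", "price", 12)     # update made in the US
stale = cluster.read("eu", "price")  # Europe still sees the earlier value
cluster.propagate()
fresh = cluster.read("eu", "price")  # replicas have converged

print(stale, fresh)
```

The read from `eu` never blocks and never fails, which is the availability side of the trade-off. The price is the window in which it returns the older value.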

Consider a NoSQL datastore when: | Consider a relational database when:
You have high volume workloads that require large scale | Your workload volume is consistent and requires medium to large scale
Your workloads don’t require ACID guarantees | ACID guarantees are required
Your data is dynamic and frequently changes | Your data is predictable and highly structured
Data can be expressed without relationships | Data is best expressed relationally
You need fast writes and write safety isn’t critical | Write safety is a requirement
Data retrieval is simple and tends to be flat | You work with complex queries and reports
Your data requires a wide geographic distribution | Your users are more centralized
Your application will be deployed to commodity hardware, such as with public clouds | Your application will be deployed to large, high-end hardware

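Typical products

SQL Server

A relational database management system developed by Microsoft. As a database server, it is a software product whose primary function is to store and retrieve data as requested by other software applications, which may run on the same computer or on another computer across a network.

  • Ease of use
  • Scalability suited to distributed organizations
  • Data warehousing features for decision support
  • Tight integration with related software
  • Good price/performance ratio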

MongoDB

MongoDB is a cross-platform document-oriented database program. Classified as a NoSQL database program, MongoDB uses JSON-like documents with optional schemas. MongoDB is developed by MongoDB Inc. and licensed under the Server Side Public License (SSPL).

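MongoDB is a cross-platform, document-oriented database that offers scalability and flexibility together with the querying and indexing you need.

  • Stores data in flexible, JSON-like documents, meaning fields can vary from document to document and the data structure can change over time
  • The document model maps to the objects in application code, making data easy to work with
  • Ad hoc queries, indexing, and real-time aggregation provide powerful ways to access and analyze data
  • MongoDB is a distributed database at its core, so high availability, horizontal scaling, and geographic distribution are built in and easy to use
  • MongoDB is free to use

High performance, easy to deploy, easy to use, and very convenient for storing data.

  • Schema-less

    MongoDB is a document database in which one collection holds different documents. The number of fields, the content, and the size of the documents can differ from one document to another.

  • The structure of a single object is clear

  • No complex joins

  • Deep query ability

    MongoDB supports dynamic queries on documents using a document-based query language that is nearly as powerful as SQL.

  • Easy to scale out

  • No conversion or mapping of application objects to database objects is needed

  • Uses internal memory to store the working set, enabling faster access to data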
