1. 程式人生 > 程式設計 >流式計算準確性語義分析

流式計算準確性語義分析

本篇文章是對Exactly once is NOT exactly the same翻譯和分析,對流式計算中衡量準確性的三類語義進行了初步的理解。

1、Stream Processing Engines

Distributed event stream processing has become an increasingly hot topic in the area of Big Data. Notable Stream Processing Engines (SPEs) include Apache Storm,Apache Flink,Heron,Apache Kafka (Kafka Streams),and Apache Spark (Spark Streaming). One of the most notable and widely discussed features of SPEs is their processing semantics,with “exactly-once” being one of the most sought after and many SPEs claiming to provide “exactly-once” processing semantics.

There exists a lot of misunderstanding and ambiguity,however,surrounding what exactly “exactly-once” is,what it entails,and what it really means when individual SPEs claim to provide it. The label “exactly-once” for describing processing semantics is also very misleading. In this blog post,I’ll discuss how “exactly-once” processing semantics differ across many popular SPEs andwhy “exactly-once” can be better described as **effectively-once**

. I’ll also explore the tradeoffs(權衡) between common techniques used to achieve what is often called “exactly-once.”

注意:精確一次更準確的說法應該是有效一次。

2、Background

Stream processing,sometimes referred to as event processing,can be succinctly described as continuous processing of an unbounded series of data or events.(流處理,有時也稱為事件處理,可以簡單地描述為對一系列無界資料或事件的連續處理) A stream- or event-processing application can be more or less described as a directed graph and often,but not always,as a directed acyclic graph (DAG).(流或事件處理應用程式可以或多或少地描述為有向圖,也可以經常(但不總是)描述為有向無環圖(DAG)) In such a graph,each edge represents a flow of data or events and each vertex represents an operator that uses application-defined logic to process data or events from adjacent(鄰近的) edges.(每個邊表示一個資料流或事件流,每個頂點表示一個操作符,該操作符使用應用程式定義的邏輯來處理來自相鄰邊的資料或事件) There are two special types of vertices,commonly referenced as sources and sinks. Sources consume external data/events and inject them into the application while sinks typically gather results produced by the application.(有兩種特殊型別的頂點,通常稱為 sources 和 sinks。sources讀取外部資料/事件到應用程式中,而 sinks 通常會收集應用程式生成的結果) Figure 1 below depicts an example a streaming application.

image-20191117182750009

Figure 1. A typical Heron processing topology

An SPE that executes a stream/event processing application usually allows users to specify a reliability mode or processing semantics that indicates which guarantees it will provide for data processing across the entirety of the application graph.(執行流/事件處理應用程式的SPE通常允許使用者指定一種可靠性模式或處理語義,該可靠性模式或處理語義指示哪些保證它將為整個應用程式圖提供資料處理) These guarantees are meaningful since you can always assume the possibility of failures via network,machines,etc. that can result in data loss.(這些保證是有意義的,因為您總是可以假設通過網路、機器等可能導致資料丟失的故障的可能性) Three modes/labels,at-most-once,at-least-once,and exactly-once,are generally used to describe the data processing semantics that the SPE should provide to the application.

Here’s a loose definition of those different processing semantics(通常使用三種模式/標籤(最多一次,至少一次和完全一次)來描述SPE應該提供給應用程式的資料處理語義)

3、At-most-once(最多一次)

This is essentially a “best effort” approach(這本質上是一種“盡力而為”的方法). Data or events are guaranteed to be processed at most once by all operators in the application.(應用程式中的所有操作符保證最多一次處理資料或event) This means that no additional attempts will be made to retry or retransmit events if it was lost before the streaming application can fully process it(這意味著,如果流應用程式在完全處理事件之前丟失了事件,則不會再嘗試重試或重新傳輸事件). Figure 2 illustrates an example of this.

image-20191117182839716

Figure 2. At-most-once processing semantics

4、At-least-once(至少一次)

Data or events are guaranteed to be processed at least once by all operators in the application graph(這意味著,如果流應用程式在完全處理事件之前丟失了事件,則不會再嘗試重試或重新傳輸事件). This usually means an event will be replayed or retransmitted from the source if the event is lost before the streaming application fully processed it.(這通常意味著如果事件在流應用程式完全處理它之前丟失,則將從源重播或重新傳輸該事件) Since it can be retransmitted,an event can sometimes be processed more than once,thus the at-least-once term.(然而,由於可以將其重新傳輸,因此事件有時可以進行多次處理,因此,至少一次) Figure 3 illustrates an example of this. In this case,the first operator initially fails to process an event,then succeeds upon retry,then succeeds upon a second retry that turns out to have been unnecessary.(圖3舉例說明瞭這一點。在這種情況下,第一個操作符最初無法處理事件,然後重試成功,然後再重試成功,最後發現沒有必要進行第二次重試)

image-20191117182916410

Figure 3. At-least-once processing semantics

5、Exactly-once(精確一次)

Events are guaranteed to be processed “exactly once” by all operators in the stream application,even in the event of various failures.(即使在發生各種故障的情況下,流應用程式中的所有操作符也保證只處理一次事件)

Two popular mechanisms(機制) are typically used to achieve “exactly-once” processing semantics.

  1. Distributed snapshot/state checkpointing(分散式快照或者是狀態檢查點)

  2. At-least-once event delivery plus message deduplication(至少一次事件傳遞以及訊息重複資料刪除)

5.1、Distributed snapshot/state checkpointing

The distributed snapshot/state checkpointing method of achieving “exactly-once” is inspired by the Chandy-Lamport distributed snapshot algorithm.1(實現“完全一次”的分散式快照/狀態檢查點方法是受Chandy-Lamport分散式快照演演算法啟發的。) With this mechanism,all the state for each operator in the streaming application is periodically checkpointed,and in the event of a failure anywhere in the system,all the state of for every operator is rolled back to the most recent globally consistent checkpoint.(流應用程式中每個操作符的所有狀態都是定期檢查點的,當系統中任何地方出現故障時,每個操作符的所有狀態都回滾到最近的全域性一致檢查點) During the rollback,all processing will be paused.(在回滾期間,所有處理將被暫停) Sources are also reset to the correct offset corresponding to the most recent checkpoint.(source也被重置為與最近的檢查點相對應的正確偏移量) The whole streaming application is basically rewound to its most recent consistent state and processing can then restart from that state(整個流處理應用程式基本上被重新恢復到最近的一致狀態,然後處理可以從該狀態重新啟動). Figure 4 below illustrates the basics of this mechanism.

image-20191117183005067

Figure 4. Distributed snapshot

In Figure 4,

  • the streaming application is working normally at T1 and the state is checkpointed.(流應用程式在T1下正常工作,狀態為檢查點。)

  • At time T2,the operator fails to process an incoming datum. At this point,the state value of S = 4 has been saved to durable storage,while the state value S = 12 is held in the operator’s memory.(但是在T2時,Operator不能處理輸入的資料。此時,狀態值S = 4被儲存到持久儲存中,而狀態值S = 12儲存在操作符的記憶體中。)

  • In order to overcome this discrepancy,at time T3 the processing graph rewinds the state to S = 4 and “replays” each successive state in the stream up to the most recent,processing each datum(基準點). (為了克服這種差異,在時間T3處,處理圖將狀態後退到S = 4,並將流中的每個連續狀態“重播”到最新的,處理每個資料的狀態。)

  • The end result is that some data have been processed multiple times,but that’s okay because the resulting state is the same no matter how many rollbacks have been performed.(最終的結果是,有些資料已經處理了多次,但是這沒有關係,因為不管執行了多少回滾,結果狀態都是相同的。)

5.2、At-least-once event delivery plus message deduplication

Another method used to achieve “exactly-once” is through implementing at-least-once event delivery in conjunction with event deduplication on a per-operator basis.(另一種實現精確一次的方法是在每個operation的基礎上實現至少一次事件交付和事件重複資料刪除) SPEs utilizing this approach will replay failed events for further attempts at processing and remove duplicated events for every operator prior to the events entering the user defined logic in the operator.(使用此方法的spe將重播失敗事件,以便進一步嘗試處理,並在事件進入操作符中使用者定義的邏輯之前刪除每個操作符的重複事件。) This mechanism requires that a transaction log be maintained for every operator to track which events it has already processed.(該機制要求為每個操作符維護一個事務日誌,以跟蹤它已經處理的事件) SPEs that utilize a mechanism like such are Google’s MillWheel2 and Apache Kafka Streams. Figure 5 illustrates the gist of this mechanism.

image-20191117183118602

Figure 5. At-least-once delivery plus deduplication

6、Is exactly-once really exactly-once?

Now let’s reexamine what the “exactly-once” processing semantics really guarantees to the end user. The label “exactly-once” is misleading in describing what is done exactly once.(現在讓我們重新審視『精確一次』處理語義真正對終端使用者的保證。『精確一次』這個術語在描述正好處理一次時會讓人產生誤導)

  • Some might think that “exactly-once” describes the guarantee to event processing in which each event in the stream is processed only once.(有些人可能認為『精確一次』描述了事件處理的保證,其中流中的每個事件只被處理一次)

  • In reality,there is no SPE that can guarantee exactly-once processing. To guarantee that the user-defined logic in each operator only executes once per event is impossible in the face of arbitrary failures,because partial execution of user code is an ever-present possibility.(實際上,沒有SPE可以保證精確的一次處理。要保證每個操作符中的使用者定義邏輯只針對每個事件執行一次是不可能的,因為隨時都可能出現部分執行使用者程式碼的情況。)

So what does SPEs guarantee when they claim “exactly-once” processing semantics? If user logic cannot be guaranteed to be executed exactly once then what is executed exactly once? When SPEs claim “exactly-once” processing semantics,what they’re actually saying is that they can guarantee that updates to state managed by the SPE are committed only once to a durable backend store.(那麼,當引擎宣告『精確一次』處理語義時,它們能保證什麼呢?如果不能保證使用者邏輯只執行一次,那麼什麼邏輯只執行一次?當引擎宣告『精確一次』處理語義時,它們實際上是在說,它們可以保證引擎管理的狀態更新只提交一次到持久的後端儲存)

Both mechanisms described above use a durable backend store as a source of truth that can hold the state of every operator and automatically commit updates to it. (上面描述的兩種機制都使用持久的後端儲存作為真實性的來源,可以儲存每個運算元的狀態並自動向其提交更新)

  • For mechanism 1 (distributed snapshot/state checkpointing),this durable backend state is used to hold the globally consistent state checkpoints (checkpointed state for every operator) for the streaming application.(對於機制 1 (分散式快照 / 狀態檢查點),此持久後端狀態用於儲存流應用程式的全域性一致狀態檢查點(每個運算元的檢查點狀態)

  • For mechanism 2 (at-least-once event delivery plus deduplication),the durable backend state is used to store the state of every operator as well as a transaction log for every operator that tracks all the events it has already fully processed.(對於機制 2 (至少一次事件傳遞加上重複資料刪除),持久後端狀態用於儲存每個運算元的狀態以及每個運算元的事務日誌,該日誌跟蹤它已經完全處理的所有事件)

The committing of state or applying updates to the durable backend that is the source of truth can be described as occurring exactly-once.(提交狀態或對作為真實來源的持久後端應用更新可以被描述為恰好發生一次) Computing the state update/change,i.e. processing the event that is executing arbitrary user -defined logic on the event,can happen more than once if failures occur,as mentioned above(如上所述,計算狀態更新/更改,即處理在事件上執行任意使用者定義的邏輯的事件,如果發生故障,則可能會發生多次(如上所述)). In other words,the processing of an event can happen more than once but the effect of that processing is only reflected once in the durable backend state store.(換句話說,事件的處理可以發生多次,但是處理的效果只在持久後端狀態儲存中反映一次) Here at Streamlio,we’ve decided that effectively-once is the best term for describing these processing semantics.

7、Distributed snapshot versus at-least-once event delivery plus deduplication

From a semantic point of view,both the distributed snapshot and at-least-once event delivery plus deduplication mechanisms provide that same guarantee. Due to differences in implementation between the two mechanisms,there are significant performance differences.(從語義的角度來看,分散式快照和至少一次事件交付加上重複資料刪除機制都提供了相同的保證。但是,由於這兩種機制的實現方式不同,效能也有很大差異。)

  • The performance overhead of mechanism 1 (distributed snapshot/state checkpointing) on top of the SPE can be minimal since the SPE is essentially sending a few special events alongside regular events through all the operators in the streaming application,while state checkpointing can be performed asynchronously in the background.(在SPE之上的機制1(分散式快照/狀態檢查點)的效能開銷可以最小化,因為SPE本質上通過流式處理程式中的所有運運算元傳送一些特殊事件以及常規事件,而狀態檢查點在後臺可以非同步執行) For large streaming applications,failures may happen more frequently,causing the SPE to need to pause the application and roll back the state of all operators,which will in turn impact performance(然而,對於大型流應用程式,故障可能發生得更頻繁,導致SPE需要暫停應用程式並回滾所有操作符的狀態,這反過來又會影響效能). The larger the streaming application,the more likely and thus more frequently failures can occur,and in turn,the more significantly the performance of the streaming application will be impacted(流應用程式越大,故障發生的可能性就越大,因此故障發生的頻率也就越高,反過來,流應用程式的效能受到的影響也就越大). However,again,this mechanism is very non-intrusive and demands minimal additional resources impact to run.(但是,這種機制是非侵入性的,並且執行時對額外資源的影響很小。)

  • Mechanism 2 (at-least-once event delivery plus deduplication) may require a lot more resources,especially storage(機制2(至少一次事件交付加上重複資料刪除)可能需要更多的資源,特別是儲存。). With this mechanism,the SPE would need to be able to track every tuple that has been fully processed by every instance of an operator to perform deduplication as well as perform the deduplication itself for every event.(使用這種機制,SPE將需要能夠跟蹤操作符的每個例項已經完全處理過的每個元組,以便執行重複資料刪除,以及為每個事件執行重複資料刪除本身。) This can amount to a huge amount of data to keep track of,especially if the streaming application is large or if there are many applications running. There is also performance overhead associated with every event at every operator to perform the deduplication(每個操作員執行重複資料消除的每個事件都會產生效能開銷). With this mechanism,the performance of the streaming application is less likely to be impacted by the size of the application.(流式應用不太可能受到應用大小的影響)

    • 1、With mechanism 1,a global pause and state rollback needs to occur if any failures occur on any operator; (使用機制1,如果任何操作符發生任何故障,則需要執行全域性暫停和狀態回滾)
    • 2、with mechanism 2,the effects of a failure are much more localized. When a failure occurs in an operator,events that might have not been fully processed are just replayed/retransmitted from an upstream source.(對於機制2,故障的影響更加侷限。當操作員發生故障時,可能只是從上游源重放/重傳了可能尚未完全處理的事件) The performance impact is isolated to where the failure happened in the streaming application and will cause little impact to the performance of other operators in the streaming application.(效能影響被隔離到流應用程式中發生故障的地方,對流應用程式中其他操作符的效能影響很小) The pros and cons of both mechanisms from a performance standpoint are listed in the tables below.(從效能的角度來看,這兩種機制的優缺點列在下面的表中。)

Distributed snapshot/state checkpointing

Pros優點 Cons缺點
Little performance and resource overhead (效能和資源開銷很少) Larger impact to performance when recovering from failures (從故障中恢復對效能的影響更大)
Potential impact to performance increases as topology gets larger (隨著拓撲變大,對效能的潛在影響會增加)

At-least-once delivery plus deduplication

Pros(優點) Cons(缺點)
Performance impact of failures are localized (故障對效能的影響已本地化) Potentially need large amounts of storage and infrastructure to support (潛在需要大量的儲存和基礎架構來支援)
Impact of failures does not necessarily increase with the size of the topology (故障的影響不一定隨拓撲的大小而增加) Performance overhead for every event at every operator (每個操作員的每個事件的效能開銷)

Though there are differences between the distributed snapshot and at-least-once event delivery plus deduplication mechanisms from a theoretical point of view,both can be reduced to at-least-once processing plus idempotency.(儘管從理論上講,分散式快照和至少一次事件交付加上重複資料刪除機制之間存在差異,但兩者都可以簡化為至少一次處理加上冪等性) For both mechanisms,events will be replayed/retransmitted when failures occur (implementing at-least-once),and through state rollback or event deduplication,operators essentially become idempotent when updating internally managed state.(對於這兩種機制,都將在發生故障時(至少一次實現)重播/重傳事件,並且通過狀態回滾或事件重複資料刪除,操作員在更新內部管理狀態時實質上將成為冪等。)

8、Conclusion

In this blog post,I hope to have convinced you that the term “exactly-once” is very misleading. Providing “exactly-once” processing semantics really means that distinct updates to the state of an operator that is managed by the stream processing engine are only reflected once(提供精確的一次處理語義實際上意味著對由流處理引擎管理的操作符狀態的不同更新只反映一次). “Exactly-once” by no means guarantees that processing of an event,i.e. execution of arbitrary user-defined logic,will happen only once.(確切地說,一次並不能保證事件的處理,即任意使用者定義邏輯的執行只發生一次) Here at Streamlio,we prefer the term effectively once for this guarantee because processing is not necessarily guaranteed to occur once but the effect on the SPE-managed state is reflected once(在Streamlio中,對於這種保證,我們更傾向於使用“有效一次”這個術語,因為處理並不一定保證只發生一次,但是對spe管理狀態的影響只反映一次). Two popular mechanisms,distributed snapshot and dessage deduplication,are used to implement exactly/effectively-once processing semantics. (兩種流行的機制,即分散式快照和dessage重複資料刪除,用於實現精確/有效的一次處理語義)Both mechanisms provide the same semantic guarantees to message processing and state updates but there are nonetheless differences in performance(這兩種機制為訊息處理和狀態更新提供了相同的語義保證,但在效能上仍然存在差異). This post is not meant to convince you that either mechanism is superior to the other,as each has its pros and cons.

References

1. Chandy,K. Mani and Leslie Lamport. Distributed snapshots: Determining global states of distributed systems. ACM Transactions on Computer Systems (TOCS) 3.1 (1985): 63-75.

2. Akidau,Tyler,et al. MillWheel: Fault-tolerant stream processing at internet scale. Proceedings of the VLDB Endowment 6.11 (2013): 1033-1044.