Building Real Time AI with AWS Fargate
This post is a contribution from AWS customer, Veritone. It was originally published on the company’s Website.
Here at Veritone, we deal with a lot of data. Our platform uses cognitive computing to analyze and interpret the contents of structured and unstructured data, particularly audio and video, and to turn those results into valuable insights for our customers.
Our platform is designed to ingest audio, video and other types of data via a series of batch processes (called “engines”) that process the media and attach some sort of output to it, such as transcripts or facial recognition data.
Our goal was to design a data pipeline that could process streaming audio, video, or other content in real time, from sources such as IP cameras, mobile devices, and structured data feeds, through an open ecosystem of cognitive engines. This enables customer use cases such as real-time transcription of live broadcast TV and radio, face and object detection for public safety applications, and real-time analysis of social media for harmful content.
Why AWS Fargate?
We use Docker containers as the deployment artifact for both our internal services and cognitive engines. This gives us the flexibility to deploy and execute services in a reliable and portable way. AWS Fargate turned out to be a perfect tool for orchestrating the dynamic nature of our deployments.
Fargate allows us to quickly scale Docker-based engines from zero to any desired number without having to worry about pre-provisioning capacity or bootstrapping and managing EC2 instances. We use Fargate both as a backend for quickly starting engine containers on demand and for the orchestration of services that need to always be running. It enables us to handle sudden bursts of real-time workloads with a consistent launch time. Fargate also allows our developers to get near-immediate feedback on deployments without having to manage any infrastructure or deal with downtime. The integration with Fargate makes this super simple.
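As a concrete illustration of the "always running" half, here is a minimal sketch of creating a long-running Fargate service with the AWS SDK for Go; the cluster, service, task definition, and subnet values are hypothetical placeholders, not Veritone's actual configuration:

```go
package main

import (
	"log"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/ecs"
)

func main() {
	sess := session.Must(session.NewSession())
	svc := ecs.New(sess)

	// Keep an always-running platform service up on Fargate: ECS maintains
	// DesiredCount copies of the task without any EC2 instances to manage.
	_, err := svc.CreateService(&ecs.CreateServiceInput{
		Cluster:        aws.String("platform-cluster"), // hypothetical cluster
		ServiceName:    aws.String("coordinator"),      // hypothetical service
		TaskDefinition: aws.String("coordinator:1"),    // hypothetical task definition
		DesiredCount:   aws.Int64(2),
		LaunchType:     aws.String("FARGATE"),
		NetworkConfiguration: &ecs.NetworkConfiguration{
			AwsvpcConfiguration: &ecs.AwsVpcConfiguration{
				Subnets:        aws.StringSlice([]string{"subnet-0123456789abcdef0"}),
				AssignPublicIp: aws.String("ENABLED"), // lets the task pull public images without a NAT
			},
		},
	})
	if err != nil {
		log.Fatalf("create service: %v", err)
	}
}
```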
Moving to Real Time
We designed a solution (shown below) in which media from a source, such as a mobile app that "pushes" streams into our platform or an IP camera feed that is "pulled", is streamed through a series of containerized engines that process the data as it is ingested. Some engines, which we refer to as Stream Engines, work on raw media streams from start to finish. For all others, streams are decomposed into a series of objects, such as video frames or small audio/video chunks, that can be processed in parallel by what we call Object Engines. An output stream of results from each engine in the pipeline is relayed back to our core platform or customer-facing applications via Veritone's APIs.
Message queues placed between the components facilitate the flow of stream data, objects, and events through the data pipeline. For that, we defined a number of message formats. We decided to use Apache Kafka, a streaming message platform, as the message bus between these components.
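To make that concrete, here is a hypothetical sketch in Go of what one "object" message flowing between components might carry; Veritone's actual message formats are not published in this post, so every field name here is an assumption:

```go
// ObjectMessage is a hypothetical sketch of a single object decomposed from a
// stream (a video frame or a small audio/video chunk) plus routing metadata.
type ObjectMessage struct {
	StreamID    string `json:"streamId"`    // which ingested stream the object belongs to
	ObjectIndex int64  `json:"objectIndex"` // position within the stream, preserving order
	TimestampMs int64  `json:"timestampMs"` // offset of the object within the stream
	MimeType    string `json:"mimeType"`    // e.g. "image/jpeg" for an extracted frame
	Payload     []byte `json:"payload"`     // the frame or chunk bytes themselves
}
```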
Kafka gives us the ability to:
- Guarantee that a consumer receives an entire stream of messages, in sequence.
- Buffer streams and have consumers process streams at their own pace.
- Determine “lag” of engine queues.
- Distribute workload across engine groups by using partitions (as sketched below).
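To show how those properties look from the consumer side, here is a minimal sketch of an Object Engine worker joining a consumer group. The post does not name a Kafka client library, so this assumes github.com/segmentio/kafka-go, and the broker address, topic, and group ID are hypothetical:

```go
package main

import (
	"context"
	"log"

	"github.com/segmentio/kafka-go"
)

func main() {
	// Readers sharing a GroupID split the topic's partitions between them,
	// which is how work is distributed across an engine group.
	r := kafka.NewReader(kafka.ReaderConfig{
		Brokers: []string{"kafka:9092"}, // hypothetical broker address
		GroupID: "object-engine-group",  // hypothetical consumer group
		Topic:   "stream-objects",       // hypothetical topic of decomposed objects
	})
	defer r.Close()

	for {
		// Messages within a partition arrive in order, and the consumer reads
		// at its own pace; unread messages simply stay buffered in Kafka.
		m, err := r.ReadMessage(context.Background())
		if err != nil {
			log.Printf("read: %v", err)
			return
		}
		log.Printf("partition=%d offset=%d key=%s", m.Partition, m.Offset, m.Key)
		// ... process the object (e.g., run the engine on a frame or chunk) ...
	}
}
```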
The flow of stream data and the lifecycle of the engines are managed and coordinated by a number of microservices written in Go, including the Scheduler, Coordinator, and Engine Orchestrators.
Deployment and Orchestration
For processing real-time data, such as streaming video from a mobile device, we required the flexibility to deploy dynamic container configurations and often define new services (engines) on the fly. Stream Engines need to be launched on-demand to handle an incoming stream. Object Engines, on the other hand, are brought up and torn down in response to the amount of pending work in their respective queues.
EC2 instances typically require capacity to be provisioned in anticipation of incoming load and generally take too long to start for this use case. We needed a way to quickly scale Docker containers on demand, and Fargate made this achievable with very little effort.
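As an illustration of that on-demand launch, here is a hedged sketch of starting a single Stream Engine task with RunTask and the FARGATE launch type, using the AWS SDK for Go; as in the earlier service sketch, the cluster, task definition, and subnet are hypothetical:

```go
package main

import (
	"log"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/ecs"
)

func main() {
	sess := session.Must(session.NewSession())
	svc := ecs.New(sess)

	// Launch one container on demand; no instances are provisioned ahead of time.
	out, err := svc.RunTask(&ecs.RunTaskInput{
		Cluster:        aws.String("engine-cluster"),  // hypothetical cluster
		LaunchType:     aws.String("FARGATE"),
		TaskDefinition: aws.String("stream-engine:1"), // hypothetical task definition
		Count:          aws.Int64(1),
		NetworkConfiguration: &ecs.NetworkConfiguration{
			AwsvpcConfiguration: &ecs.AwsVpcConfiguration{
				Subnets: aws.StringSlice([]string{"subnet-0123456789abcdef0"}),
			},
		},
	})
	if err != nil {
		log.Fatalf("run task: %v", err)
	}
	log.Printf("started task %s", aws.StringValue(out.Tasks[0].TaskArn))
}
```

Because the same call succeeds whether zero or hundreds of engines are already running, an orchestrator can launch a task the moment a new stream arrives.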
In Closing
Fargate helped us solve a lot of problems related to real-time processing in this dynamic environment, including reducing operational overhead. We expect it to continue to grow and mature as a service. Some features we would like to see in the near future include GPU support for our GPU-based AI engines and the ability to cache larger container images for quicker "warm" launch times.
About Veritone
Veritone created the world’s first operating system for Artificial Intelligence. Veritone’s aiWARE operating system unlocks the power of cognitive computing to transform and analyze audio, video, and other data sources in an automated manner to generate actionable insights. The Veritone platform provides customers with ease, speed, and accuracy at low cost.
The Veritone authors are Christopher Stobie – [email protected] and Mezzi Sotoodeh – [email protected]