DeepVO: Towards End-to-End Visual Odometry with Deep Recurrent Convolutional Neural Networks

阿新 • Published: 2018-09-05

1、Introduction

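Solving the visual odometry (VO) problem with deep learning (DL): end-to-end VO with a recurrent convolutional neural network (RCNN).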

2、Network structure

a、CNN based Feature Extraction

　　The paper uses the KITTI dataset.

　　The CNN has 9 convolutional layers. Every one except Conv6 is followed by a ReLU layer, giving 17 layers in total.
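
　　The layer count can be checked with a short sketch. The layer names below are assumptions chosen for illustration; only the counts (9 convolutions, no ReLU after Conv6) come from the note above.

```python
# Illustrative layer names (assumed); the note only fixes the counts.
conv_names = ["Conv1", "Conv2", "Conv3", "Conv3_1", "Conv4",
              "Conv4_1", "Conv5", "Conv5_1", "Conv6"]

def build_layer_list(conv_names, no_relu=("Conv6",)):
    """Interleave each convolution with a ReLU, except those in no_relu."""
    layers = []
    for name in conv_names:
        layers.append(name)
        if name not in no_relu:
            layers.append(f"ReLU_{name}")
    return layers

layers = build_layer_list(conv_names)
print(len(conv_names), "conv layers,", len(layers), "layers in total")
```

　　Nine convolutions plus eight ReLUs give the 17 layers mentioned above.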

b、RNN based Sequential Modelling

　　RNN is different from CNN in that it maintains memory of its hidden states over time and has feedback loops among them, which enables its current hidden state to be a function of the previous ones.

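　　Given a convolutional feature x_k at time k, an RNN updates at time step k by

$$
\begin{aligned}
h_k &= H(W_{xh} x_k + W_{hh} h_{k-1} + b_h) \\
y_k &= W_{hy} h_k + b_y
\end{aligned}
$$

　　h_k and y_k are the hidden state and output at time k respectively.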

　　W terms denote corresponding weight matrices.

　　b terms denote bias vectors.

　　H is an element-wise nonlinear activation function.
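
　　A minimal NumPy sketch of this recurrence may help. It assumes the standard form in which the hidden state is computed from the current feature and the previous hidden state, and the output is a linear function of the hidden state. The parameter names (`W_xh`, `W_hh`, `W_hy`), the toy dimensions, the choice of tanh for H, and the 6-dimensional output (3 position + 3 orientation values) are assumptions for illustration.

```python
import numpy as np

def rnn_step(x_k, h_prev, W_xh, W_hh, b_h, W_hy, b_y, H=np.tanh):
    """One RNN update: new hidden state h_k and output y_k."""
    h_k = H(W_xh @ x_k + W_hh @ h_prev + b_h)  # depends on the previous hidden state
    y_k = W_hy @ h_k + b_y                     # linear read-out of the hidden state
    return h_k, y_k

rng = np.random.default_rng(0)
feat_dim, hidden_dim, out_dim = 8, 4, 6  # toy sizes, not the paper's
params = dict(
    W_xh=rng.normal(scale=0.1, size=(hidden_dim, feat_dim)),
    W_hh=rng.normal(scale=0.1, size=(hidden_dim, hidden_dim)),
    b_h=np.zeros(hidden_dim),
    W_hy=rng.normal(scale=0.1, size=(out_dim, hidden_dim)),
    b_y=np.zeros(out_dim),
)

h = np.zeros(hidden_dim)                   # initial hidden state
features = rng.normal(size=(5, feat_dim))  # stand-in for 5 CNN features
for x in features:
    h, y = rnn_step(x, h, **params)        # h carries memory across steps
```

　　Because `h` is fed back into each call, the output at step k depends on all earlier features, which is the memory property described above.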

　　LSTM

Figure: Folded and unfolded LSTMs and the internal structure of an LSTM unit.

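$$
\begin{aligned}
i_k &= \sigma(W_{xi} x_k + W_{hi} h_{k-1} + b_i) \\
f_k &= \sigma(W_{xf} x_k + W_{hf} h_{k-1} + b_f) \\
g_k &= \tanh(W_{xg} x_k + W_{hg} h_{k-1} + b_g) \\
c_k &= f_k \odot c_{k-1} + i_k \odot g_k \\
o_k &= \sigma(W_{xo} x_k + W_{ho} h_{k-1} + b_o) \\
h_k &= o_k \odot \tanh(c_k)
\end{aligned}
$$

　　⊙ is the element-wise product of two vectors.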

　　σ is sigmoid non-linearity.

　　tanh is hyperbolic tangent non-linearity.

　　W terms denote corresponding weight matrices.

　　b terms denote bias vectors.

　　i_k, f_k, g_k, c_k and o_k are the input gate, forget gate, input modulation gate, memory cell and output gate at time k, respectively.

　　Each of the LSTM layers has 1000 hidden states.
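
　　The LSTM unit can be sketched in NumPy as follows. This is a minimal illustration of the standard LSTM update using the gate names above, not the authors' implementation. The dictionary layout of the parameters and the toy hidden size are assumptions; the paper's layers use 1000 hidden states.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_k, h_prev, c_prev, W_x, W_h, b):
    """One LSTM update. W_x, W_h and b are dicts keyed by gate name."""
    def pre(gate):
        # Pre-activation shared by all gates: input term + recurrent term + bias.
        return W_x[gate] @ x_k + W_h[gate] @ h_prev + b[gate]

    i_k = sigmoid(pre("i"))           # input gate
    f_k = sigmoid(pre("f"))           # forget gate
    g_k = np.tanh(pre("g"))           # input modulation gate
    o_k = sigmoid(pre("o"))           # output gate
    c_k = f_k * c_prev + i_k * g_k    # memory cell (element-wise products)
    h_k = o_k * np.tanh(c_k)          # hidden state
    return h_k, c_k

rng = np.random.default_rng(0)
feat_dim, hidden_dim = 8, 4           # toy sizes for illustration
gate_names = ("i", "f", "g", "o")
W_x = {n: rng.normal(scale=0.1, size=(hidden_dim, feat_dim)) for n in gate_names}
W_h = {n: rng.normal(scale=0.1, size=(hidden_dim, hidden_dim)) for n in gate_names}
b = {n: np.zeros(hidden_dim) for n in gate_names}

h = np.zeros(hidden_dim)
c = np.zeros(hidden_dim)
for x in rng.normal(size=(3, feat_dim)):
    h, c = lstm_step(x, h, c, W_x, W_h, b)
```

　　The forget gate scales the previous memory cell and the input gate scales the new candidate, so the cell can keep or discard information over long sequences.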

3、Cost function and optimisation

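　　The network models the conditional probability of the poses Y_t = (y₁, . . . , y_t) given a sequence of monocular RGB images X_t = (x₁, . . . , x_t) up to time t:

$$
p(Y_t \mid X_t) = p(y_1, \ldots, y_t \mid x_1, \ldots, x_t)
$$

　　Optimal parameters:

$$
\theta^* = \arg\max_{\theta} \, p(Y_t \mid X_t; \theta)
$$

　　To learn the hyperparameters θ of the DNNs, the mean squared error of all positions and orientations is minimised:

$$
\theta^* = \arg\min_{\theta} \frac{1}{N} \sum_{i=1}^{N} \sum_{k=1}^{t} \left\| \hat{p}_k - p_k \right\|_2^2 + \kappa \left\| \hat{\varphi}_k - \varphi_k \right\|_2^2
$$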

　　(p_k, φ_k) is the ground truth pose.

　　(p̂_k, φ̂_k) is the estimated pose.

　　κ (100 in the experiments) is a scale factor to balance the weights of positions and orientations.

　　N is the number of samples.

　　The orientation φ is represented by Euler angles rather than a quaternion, since a quaternion is subject to an extra unit constraint, which hinders the optimisation problem of DL.
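
　　The loss can be sketched in NumPy as follows. The sketch assumes the loss is the squared Euclidean position error plus κ times the squared Euclidean orientation error, summed over time steps and averaged over the N samples. The function name `pose_mse_loss` and the array layout `(N, t, 3)` are assumptions for illustration.

```python
import numpy as np

def pose_mse_loss(p_hat, phi_hat, p_gt, phi_gt, kappa=100.0):
    """Pose loss over N sequences of t steps.

    All inputs have shape (N, t, 3): positions p and Euler angles phi.
    kappa balances the orientation term against the position term.
    """
    pos_err = np.sum((p_hat - p_gt) ** 2, axis=-1)      # squared position error per step
    rot_err = np.sum((phi_hat - phi_gt) ** 2, axis=-1)  # squared orientation error per step
    n_samples = p_hat.shape[0]
    return np.sum(pos_err + kappa * rot_err) / n_samples

# Toy data: 2 sequences of 3 steps, constant errors of 0.1 and 0.01 rad.
p_gt = np.zeros((2, 3, 3))
phi_gt = np.zeros((2, 3, 3))
p_hat = p_gt + 0.1
phi_hat = phi_gt + 0.01

loss = pose_mse_loss(p_hat, phi_hat, p_gt, phi_gt, kappa=100.0)
print(loss)
```

　　With κ = 100, an orientation error of 0.01 contributes as much to the loss as a position error of 0.1, which shows how the scale factor keeps the two terms comparable.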
