•holistic vs. local
•text detection is cast as a semantic segmentation problem
•conceptually and functionally different from previous sliding-window or connected-component-based approaches
•holistic, pixel-wise predictions: text region map, character map and linking orientation map
•detections are formed using these three maps
•can simultaneously handle horizontal, multi-oriented and curved text in real-world natural images
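
As a rough illustration of how a pixel-wise map can be turned into detections, the sketch below thresholds a toy text region map and groups the remaining pixels into 4-connected regions. It is a simplification under stated assumptions: it uses only the text region map (the character map and linking orientation map are ignored), and the function name `connected_regions`, the 0.5 threshold, and the toy probabilities are hypothetical, not taken from the original method.

```python
from collections import deque


def connected_regions(prob_map, threshold=0.5):
    """Group pixels with probability >= threshold into 4-connected regions.

    Returns one bounding box (row_min, col_min, row_max, col_max) per region,
    in the order the regions are first met during a row-major scan.
    """
    rows, cols = len(prob_map), len(prob_map[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or prob_map[r][c] < threshold:
                continue
            # Breadth-first search over the neighbours of this seed pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            r0, c0, r1, c1 = r, c, r, c
            while queue:
                y, x = queue.popleft()
                r0, c0 = min(r0, y), min(c0, x)
                r1, c1 = max(r1, y), max(c1, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and not seen[ny][nx] and prob_map[ny][nx] >= threshold:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((r0, c0, r1, c1))
    return boxes


# Toy text region map (hypothetical values): two separate text blobs.
toy_map = [
    [0.9, 0.8, 0.1, 0.0],
    [0.7, 0.2, 0.1, 0.9],
    [0.0, 0.0, 0.1, 0.8],
]
print(connected_regions(toy_map))
```

In a fuller pipeline the other two maps would refine this grouping; here they are left out to keep the example short.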

TextBoxes

Liao et al. TextBoxes: A Fast Text Detector with a Single Deep Neural Network. AAAI, 2017.

•a text detection method inspired by SSD
•both high accuracy and efficiency

Rotation Proposals

Ma et al. Arbitrary-Oriented Scene Text Detection via Rotation Proposals. arXiv, 2017.

•a multi-oriented text detection method based on Faster R-CNN
•proposes several modifications to better detect scene text
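
To make the idea of a rotated proposal concrete, the sketch below converts a box given as center, size and angle into its four corner points. The parameterization `(cx, cy, w, h, theta)` and the function name `rotated_box_corners` are assumptions chosen for illustration; the exact convention used in the paper may differ.

```python
import math


def rotated_box_corners(cx, cy, w, h, theta):
    """Return the four corners of a rectangle centered at (cx, cy).

    The side of length w lies along direction theta (radians); the corner
    offsets of the unrotated box are rotated by theta and shifted to the center.
    """
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    half_w, half_h = w / 2.0, h / 2.0
    corners = []
    for dx, dy in ((-half_w, -half_h), (half_w, -half_h),
                   (half_w, half_h), (-half_w, half_h)):
        x = cx + dx * cos_t - dy * sin_t
        y = cy + dx * sin_t + dy * cos_t
        corners.append((x, y))
    return corners


# A 4x2 box centered at the origin, rotated by 30 degrees.
for corner in rotated_box_corners(0.0, 0.0, 4.0, 2.0, math.radians(30)):
    print(corner)
```

With `theta = 0` this reduces to an ordinary axis-aligned box, which is the case a standard Faster R-CNN proposal already covers.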

Corner Localization and Region Segmentation
(A Megvii work in CVPR 2018)

Lyu et al. Multi-Oriented Scene Text Detection via Corner Localization and Region Segmentation. CVPR, 2018.

•a compound text detection method: corner localization and region segmentation

•corner localization: corner detection with SSD
•region segmentation: position-sensitive segmentation with R-FCN

Simpler Pipelines

EAST (A Megvii work in CVPR 2017)

Zhou et al. EAST: An Efficient and Accurate Scene Text Detector. CVPR, 2017.

•main idea: predict the location, scale and orientation of text with a single model and multiple loss functions (multi-task training)

•advantages:

(a) accuracy: allows end-to-end training and optimization

(b) efficiency: removes redundant stages and processing steps
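
The multi-task training mentioned above can be pictured as a weighted sum of a text/non-text score loss and a geometry loss. The sketch below is one possible per-pixel form, assuming the geometry is given as distances from the pixel to the four box sides plus a rotation angle. The specific loss terms (a negative-log IoU term and a cosine angle term), the weights `lambda_geo` and `lambda_theta`, and all function names are illustrative assumptions rather than the authors' exact formulation.

```python
import math


def aabb_iou_loss(pred, target, eps=1e-6):
    """-log(IoU) of two boxes sharing one pixel.

    Each box is (top, right, bottom, left): distances from the pixel to the sides.
    """
    pt, pr, pb, pl = pred
    tt, tr, tb, tl = target
    area_pred = (pt + pb) * (pr + pl)
    area_target = (tt + tb) * (tr + tl)
    # Both boxes contain the pixel, so the overlap is built from the smaller
    # distance on each side.
    inter = (min(pt, tt) + min(pb, tb)) * (min(pr, tr) + min(pl, tl))
    union = area_pred + area_target - inter
    return -math.log((inter + eps) / (union + eps))


def angle_loss(theta_pred, theta_target):
    """Zero when the angles agree, growing as they diverge."""
    return 1.0 - math.cos(theta_pred - theta_target)


def multitask_loss(score_loss, pred_box, target_box, theta_pred, theta_target,
                   lambda_geo=1.0, lambda_theta=10.0):
    """Combine the score loss with the geometry loss (assumed weights)."""
    geometry = (aabb_iou_loss(pred_box, target_box)
                + lambda_theta * angle_loss(theta_pred, theta_target))
    return score_loss + lambda_geo * geometry


# A prediction half the size of the target, with a small angle error.
print(multitask_loss(0.3, (1, 1, 1, 1), (2, 2, 2, 2), 0.05, 0.0))
```

Because every term is computed from the same network output, a single backward pass optimizes all of them together, which is the sense in which the pipeline is trained end to end.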

Text Detection of Arbitrary Shapes

TextSnake (A Megvii work in ECCV 2018)

Long et al. TextSnake: A Flexible Representation for Detecting Text of Arbitrary Shapes. ECCV, 2018.

•a novel and flexible representation
•able to effectively and precisely describe the geometric properties of curved text (location, scale and bending), whereas other representations (axis-aligned rectangle, rotated rectangle or quadrangle) struggle

•a text instance is described as a sequence of ordered, overlapping disks centered on its symmetric axis, each associated with a potentially variable radius and orientation
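
The disk-based representation can be sketched as a small data structure: each disk stores a center, a radius and a local orientation, and the text region is the union of the disks. The class name `Disk`, its field names, and the helpers `covers` and `rasterize` are hypothetical names introduced for this example, and the toy disks are made-up values.

```python
from dataclasses import dataclass


@dataclass
class Disk:
    cx: float      # center x, a point on the text center line
    cy: float      # center y
    radius: float  # local half-height of the text
    theta: float   # local orientation of the center line, in radians


def covers(disks, x, y):
    """True if the point (x, y) lies inside at least one disk."""
    return any((x - d.cx) ** 2 + (y - d.cy) ** 2 <= d.radius ** 2 for d in disks)


def rasterize(disks, width, height):
    """Binary mask (list of rows) of the union of all disks."""
    return [[1 if covers(disks, x, y) else 0 for x in range(width)]
            for y in range(height)]


# Two overlapping disks along a short horizontal center line.
snake = [Disk(1, 1, 1, 0.0), Disk(2, 1, 1, 0.0)]
for row in rasterize(snake, 4, 3):
    print(row)
```

The orientation field is stored but not needed for the coverage test; it matters when the disks are used to straighten or sample the text along its center line. Letting the radius and orientation vary from disk to disk is what allows the same structure to follow straight, rotated and curved text.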

Mask TextSpotter (A Megvii work in ECCV 2018)

Lyu et al. Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes. ECCV, 2018.

•an end-to-end system for both text detection and recognition
•inspired by Mask R-CNN

•RPN for text proposal generation
•Fast R-CNN for proposal classification and regression
•mask branch for character segmentation and recognition

Text Recognition

CRNN

Shi et al. An End-to-End Trainable Neural Network for Image-Based Sequence Recognition and Its Application to Scene Text Recognition. TPAMI, 2017.

ASTER

Shi et al. ASTER: An Attentional Scene Text Recognizer with Flexible Rectification. TPAMI, 2018.

FAN

Recommended Resources

•Survey

•Scene Text Detection and Recognition: The Deep Learning Era

•arXiv: https://arxiv.org/abs/1811.04256 (draft version)

•GitHub: https://github.com/Jyouhou/SceneTextPapers (compiled papers, datasets & code)

•Laboratories and Papers

•https://github.com/chongyangtao/Awesome-Scene-Text-Recognition

•Datasets and Codes

•https://github.com/seungwooYoo/Curated-scene-text-recognition-analysis

•Projects and Products

•https://github.com/wanghaisheng/awesome-ocr
