Text Detection and Recognition Resources

Continuously updated...

【Survey】

[2016-TIP] Text Detection, Tracking and Recognition in Video: A Comprehensive Survey [paper]

[2015-PAMI] Text Detection and Recognition in Imagery: A Survey [paper]

[2014-FCS] Scene Text Detection and Recognition: Recent Advances and Future Trends [paper]

【Scene Text Detection】

[201703-arXiv] Deep Direct Regression for Multi-Oriented Scene Text Detection [paper]

[201702-arXiv] Improving Text Proposal for Scene Images with Fully Convolutional Networks [paper]

[2017-CVPR] EAST: An Efficient and Accurate Scene Text Detector [paper]

[2017-CVPR] Deep Matching Prior Network: Toward Tighter Multi-oriented Text Detection [paper]

[2017-CVPR] Detecting Oriented Text in Natural Images by Linking Segments [paper]

[2017-AAAI] TextBoxes: A Fast Text Detector with a Single Deep Neural Network [paper][code]

[2016-ECCV] CTPN: Detecting Text in Natural Image with Connectionist Text Proposal Network [paper][code]

[2016-PHD-Thesis] Context Modeling for Semantic Text Matching and Scene Text Detection [paper]

[2016-IJCAI] Scene Text Detection in Video by Learning Locally and Globally [paper]

[201606-arXiv] Scene Text Detection via Holistic, Multi-Channel Prediction [paper]

[2016-CVPR] Accurate Text Localization in Natural Image with Cascaded Convolutional Text Network [paper]

[2016-CVPR] Synthetic Data for Text Localization in Natural Images [paper] [data][code]

[2016-CVPR] Canny Text Detector: Fast and Robust Scene Text Localization Algorithm [paper]

[2016-CVPR] Multi-Oriented Text Detection with Fully Convolutional Networks [paper][code]

[2016-IJCV] Reading Text in the Wild with Convolutional Neural Networks [paper][demo][homepage]

[2016-TIP] Text-Attentional Convolutional Neural Networks for Scene Text Detection [paper]

[2016-IJDAR] TextCatcher: a method to detect curved and challenging text in natural scenes [paper]

[201605-arXiv] DeepText: A Unified Framework for Text Proposal Generation and Text Detection in Natural Images [paper][data]

[201601-arXiv] TextProposals: a Text-specific Selective Search Algorithm for Word Spotting in the Wild [paper][code]

[2015-TPAMI] Real-time Lexicon-free Scene Text Localization and Recognition [paper]

[2015-CVPR] Symmetry-Based Text Line Detector in Natural Scenes [paper][code]

[2015-ICCV] FASText: Efficient Unconstrained Scene Text Detector [paper][code]

[2015-ICDAR] Object Proposal for Text Extraction in the Wild [paper][code]

[2015-PHD-Thesis] Deep Learning for Text Spotting [paper]

[2014-ECCV] Deep Features for Text Spotting [paper][code][Homepage]

[2014-TPAMI] Robust Text Detection in Natural Scene Images [paper]

[2014-ECCV] Robust Text Detection with Convolution Neural Network Induced MSER Trees [paper]

[2013-ICCV] PhotoOCR: Reading Text in Uncontrolled Conditions [paper]

[2012-CVPR] Real-Time Scene Text Localization and Recognition [paper][code]

[2010-CVPR] SWT: Detecting Text in Natural Scenes with Stroke Width Transform [paper] [code][code2]

【Scene Text Recognition】

[2016-NIPS] Generative Shape Models: Joint Text Recognition and Segmentation with Very Little Training Data [paper]

[2016-AAAI] Reading Scene Text in Deep Convolutional Sequences [paper]

[2016-CVPR] Recursive Recurrent Nets with Attention Modeling for OCR in the Wild [paper]

[2016-CVPR] Robust Scene Text Recognition with Automatic Rectification [paper]

[2015-CoRR] An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition [paper][code]

[2015-ICDAR] Automatic Script Identification in the Wild [paper]

[2015-ICLR] Deep structured output learning for unconstrained text recognition [paper]

[2014-NIPS] Synthetic Data and Artificial Neural Networks for Natural Scene Text Recognition [paper] [homepage][model]

[2014-TIP] A Unified Framework for Multi-Oriented Text Detection and Recognition [paper]

[2013-CVPR] Scene Text Recognition using Part-based Tree-structured Character Detection [paper]

[2012-CVPR] Top-Down and Bottom-Up Cues for Scene Text Recognition [paper]

[2012-ICPR] End-to-End Text Recognition with CNN [paper][code]

【Embedded Text Detection and Recognition】

[201704-TPAMI] A Unified Framework for Tracking Based Text Detection and Recognition from Web Videos [paper]

[2017-AAAI] Detection and Recognition of Text Embedded in Online Images via Neural Context Models [paper][code]

【Handwriting Recognition】

[201704-TPAMI] Drawing and Recognizing Chinese Characters with RNN [paper]

[201610-arXiv] Learning Spatial-Semantic Context with Fully Convolutional Recurrent Network for Online Handwritten Chinese Text Recognition [paper]

[201610-arXiv] Stroke Sequence-Dependent Deep Convolutional Neural Network for Online Handwritten Chinese Character Recognition [paper]

[201606-arXiv] Drawing and Recognizing Chinese Characters with RNN [paper]

[201604-arXiv] Scan, Attend and Read: End-to-End Handwritten Paragraph Recognition with MDLSTM Attention [paper][video]

[2015-ICDAR] High Performance Offline Handwritten Chinese Character Recognition Using GoogLeNet and Directional Feature Maps [paper][code][code2]

【Datasets】

I. For scene text detection

1. COCO-Text

63,686 images, 173,589 text instances, 3 fine-grained text attributes.

2. Synth-Text

800 thousand images; 8 million synthetic word instances

3. MSRA-TD500

500 natural images (300 training + 200 testing) with resolutions ranging from 1296x864 to 1920x1280; text in Chinese, English, or a mixture of both

350 high-resolution images (average size 1260 × 860; 100 images for training and 250 images for testing). Only word-level bounding boxes are provided, with case-insensitive labels

3000 images of indoor and outdoor scenes containing text. Languages: Korean, English (Number), and Mixed (Korean + English + Number). Tasks: text localization, segmentation and recognition

6. ICDAR series

-ICDAR 2015 (1000 training images + 500 testing images)

-ICDAR 2013 (229 + 233)

-ICDAR 2011 (229 + 255)

-ICDAR 2005 (1001 + 489)

-ICDAR 2003 (181 + 251)

II. For Scene Text Recognition

5000 images from scene texts and born-digital sources (2k training and 3k testing images). Each image is a cropped word image of scene text with case-insensitive labels

2. Synth-Word

million images covering 90k English words (2014 Oxford; VGG)

3. StanfordSynth

Small single-character images of 62 characters (0-9, a-z, A-Z). (2012 Stanford, AI Group)

SVHN is obtained from house numbers in Google Street View images (over 600,000 digit images).

5. KAIST

6. Chars

Over 74K images from natural images, as well as a set of synthetically generated characters. Small single-character images of 62 characters (0-9, a-z, A-Z).

7. ICDAR series

【References】