A Comprehensive Roundup of Text Recognition Methods
Source | Zhihu    Author | 白裳
Link | https://zhuanlan.zhihu.com/p/65707543


EAST/CTPN/SegLink/PixelLink/TextBoxes/TextBoxes++/TextSnake/MSR/...
CRNN: CNN+RNN+CTC. See the article "Understanding CRNN+CTC Text Recognition in One Read":
https://zhuanlan.zhihu.com/p/43534801
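To make the CTC stage of the CNN+RNN+CTC pipeline concrete, here is a minimal sketch of greedy (best-path) CTC decoding in NumPy. It is an illustration only, not code from the linked article: the function name `ctc_greedy_decode`, the choice of index 0 as the blank label, and the toy alphabet are all assumptions.

```python
import numpy as np

def ctc_greedy_decode(logits, alphabet, blank=0):
    """Best-path CTC decoding (illustrative sketch).

    logits:   array of shape (T, num_classes), one row per time step.
    alphabet: string whose k-th character corresponds to class k + 1
              (class 0 is assumed to be the CTC blank).
    """
    best_path = np.argmax(logits, axis=1)  # most likely class at each step
    decoded = []
    previous = blank
    for index in best_path:
        # Emit a character only if it is not blank and not a repeat
        # of the class chosen at the previous time step.
        if index != blank and index != previous:
            decoded.append(alphabet[index - 1])
        previous = index
    return "".join(decoded)

# Toy per-step predictions: a a - a b b -   (where "-" is the blank)
toy_logits = np.eye(4)[[1, 1, 0, 1, 2, 2, 0]]
print(ctc_greedy_decode(toy_logits, "abc"))
```

The blank between the two runs of "a" is what allows a repeated character to survive the merge step.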
CNN+Seq2Seq+Attention. For an introduction to how Seq2Seq+Attention works, see https://zhuanlan.zhihu.com/p/51383402
The corresponding OCR code is at https://github.com/bai-shang/crnn_seq2seq_ocr_pytorch
Robust Scene Text Recognition with Automatic Rectification. CVPR2016.
arxiv.org/abs/1603.03915


Scene Text Recognition from Two-Dimensional Perspective. AAAI2018.



SqueezedText: A Real-time Scene Text Recognition by Binary Convolutional Encoder-decoder Network. AAAI2018.
https://ren-fengbo.lab.asu.edu/sites/default/files/16354-77074-1-pb.pdf
Handwriting Recognition in Low-resource Scripts using Adversarial Learning. CVPR2019.
arxiv.org/pdf/1811.01396.pdf
ESIR: End-to-end Scene Text Recognition via Iterative Image Rectification. CVPR2019.
http://openaccess.thecvf.com/content_CVPR_2019/papers/Zhan_ESIR_End-To-End_Scene_Text_Recognition_via_Iterative_Image_Rectification_CVPR_2019_paper.pdf

Towards End-to-end Text Spotting with Convolutional Recurrent Neural Networks. ICCV2017.
http://openaccess.thecvf.com/content_ICCV_2017/papers/Li_Towards_End-To-End_Text_ICCV_2017_paper.pdf



Deep TextSpotter: An End-to-End Trainable Scene Text Localization and Recognition Framework. ICCV2017.
http://openaccess.thecvf.com/content_ICCV_2017/papers/Busta_Deep_TextSpotter_An_ICCV_2017_paper.pdf



Attention-based Extraction of Structured Information from Street View Imagery. ICDAR2017.
arxiv.org/abs/1704.03549

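Use a CNN to extract features from images taken from different views, and concatenate those features into one large feature matrix. Compute a spatial attention map over the text in the image; the larger the attention value, the more likely the region is a text region. Use the attention map to extract the text-region features from the feature matrix, and feed them into the subsequent RNN for recognition.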
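The three steps above can be sketched in NumPy as follows. This is a simplified illustration under stated assumptions, not the paper's implementation: the function name `attend_over_views`, concatenating the views along the width axis, and passing in precomputed attention scores (which the real model would derive from the RNN state) are all hypothetical choices.

```python
import numpy as np

def attend_over_views(view_features, scores):
    """Spatial-attention pooling over concatenated per-view CNN features.

    view_features: list of arrays, each (H, W_k, C), one per view.
    scores:        array (H, sum of W_k) of unnormalized attention scores
                   (assumed to be given; in a real model they would depend
                   on the decoder state).
    Returns (alpha, context): the attention map and the pooled feature vector.
    """
    # Step 1: concatenate the per-view features into one large feature map.
    features = np.concatenate(view_features, axis=1)      # (H, W, C)

    # Step 2: a softmax over all spatial positions gives the attention map;
    # larger values mark positions more likely to contain text.
    flat = scores.reshape(-1)
    exp = np.exp(flat - flat.max())                       # numerically stable
    alpha = (exp / exp.sum()).reshape(scores.shape)       # (H, W), sums to 1

    # Step 3: the attention-weighted sum of features is the vector
    # that would be passed on to the RNN.
    context = np.tensordot(alpha, features, axes=([0, 1], [0, 1]))  # (C,)
    return alpha, context

views = [np.random.rand(4, 5, 8) for _ in range(4)]      # four toy views
alpha, context = attend_over_views(views, np.random.rand(4, 20))
print(alpha.shape, context.shape)
```

In an attention decoder this pooling is repeated at every output step with a fresh set of scores, so each predicted character can focus on a different region.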
Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes. ECCV2018.
arxiv.org/abs/1807.02242


Towards End-to-End License Plate Detection and Recognition: A Large Dataset and Baseline. ECCV2018.
http://openaccess.thecvf.com/content_ECCV_2018/papers/Zhenbo_Xu_Towards_End-to-End_License_ECCV_2018_paper.pdf

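Use a box regression layer to predict the license plate location. Once the location has been detected, take the corresponding feature maps at different scales and apply ROI pooling to each. Concatenate the features from the different scales and perform recognition on the result.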
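The pooling-and-concatenation part of this pipeline can be sketched as below. It is a rough NumPy illustration under assumptions, not the paper's code: the names `roi_max_pool` and `multi_scale_roi_features`, the normalized `(x0, y0, x1, y1)` box format, and the 2x4 output grid are all hypothetical, and the box is taken as given rather than predicted by a regression layer.

```python
import numpy as np

def roi_max_pool(feature_map, box, out_size=(2, 4)):
    """Max-pool the region of an (H, W, C) feature map covered by a box.

    box: (x0, y0, x1, y1) as fractions of the image size (assumed format).
    """
    height, width, channels = feature_map.shape
    x0, y0, x1, y1 = box
    # Map the normalized box onto feature-map cells, keeping at least one cell.
    left = min(int(np.floor(x0 * width)), width - 1)
    top = min(int(np.floor(y0 * height)), height - 1)
    right = min(max(int(np.ceil(x1 * width)), left + 1), width)
    bottom = min(max(int(np.ceil(y1 * height)), top + 1), height)
    region = feature_map[top:bottom, left:right]

    out_h, out_w = out_size
    row_edges = np.linspace(0, region.shape[0], out_h + 1)
    col_edges = np.linspace(0, region.shape[1], out_w + 1)
    pooled = np.empty((out_h, out_w, channels))
    for i in range(out_h):
        r0 = int(np.floor(row_edges[i]))
        r1 = max(int(np.ceil(row_edges[i + 1])), r0 + 1)
        for j in range(out_w):
            c0 = int(np.floor(col_edges[j]))
            c1 = max(int(np.ceil(col_edges[j + 1])), c0 + 1)
            # Each output cell is the per-channel maximum over its bin.
            pooled[i, j] = region[r0:r1, c0:c1].max(axis=(0, 1))
    return pooled

def multi_scale_roi_features(feature_maps, box, out_size=(2, 4)):
    """ROI-pool the same box on every scale and concatenate the results."""
    pooled = [roi_max_pool(fm, box, out_size).reshape(-1) for fm in feature_maps]
    return np.concatenate(pooled)

# Toy feature maps at two scales with different channel counts.
maps = [np.random.rand(16, 16, 3), np.random.rand(8, 8, 5)]
plate_features = multi_scale_roi_features(maps, (0.25, 0.5, 0.75, 1.0))
print(plate_features.shape)
```

Because every scale is pooled to the same fixed grid, the concatenated vector has a constant length regardless of the plate's size, which is what lets a recognition head consume it directly.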

A special thank-you here to all the researchers who release open datasets! Data is the number one productive force in CV!
An end-to-end TextSpotter with Explicit Alignment and Attention. CVPR2018.
http://openaccess.thecvf.com/content_cvpr_2018/papers/He_An_End-to-End_TextSpotter_CVPR_2018_paper.pdf



FOTS: Fast Oriented Text Spotting with a Unified Network. CVPR2018. arxiv.org/abs/1801.01671



SEE: Towards Semi-Supervised End-to-End Scene Text Recognition. AAAI2018.
arxiv.org/abs/1712.05404


