SEE: Towards semi-supervised end-to-end scene text recognition

Bartz, Christian; Yang, Haojin; Meinel, Christoph

Detecting and recognizing text in natural scene images is a challenging task that is not yet completely solved. In recent years, several new systems that try to solve at least one of the two sub-tasks (text detection and text recognition) have been proposed. In this paper we present SEE, a step towards semi-supervised neural networks for scene text detection and recognition that can be optimized end-to-end. Most existing works consist of multiple deep neural networks and several pre-processing steps. In contrast, we propose to use a single deep neural network that learns to detect and recognize text from natural images in a semi-supervised way. SEE is a network that integrates and jointly learns a spatial transformer network, which can learn to detect text regions in an image, and a text recognition network that takes the identified text regions and recognizes their textual content. We introduce the idea behind our novel approach and show its feasibility by performing a range of experiments on standard benchmark datasets, where we achieve competitive results.
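
The abstract mentions a spatial transformer network that learns to locate text regions. As a rough illustration of the sampling step such a network relies on, the sketch below applies a 2x3 affine matrix to a regular grid and reads the image at the resulting positions with bilinear interpolation. This is not the authors' implementation: the use of NumPy, the function names `affine_grid` and `bilinear_sample`, and the hand-picked matrix `theta` are all assumptions made for the example. In a trained system, `theta` would be predicted by a localization network rather than fixed by hand.

```python
import numpy as np

def affine_grid(theta, out_h, out_w):
    """Map every output pixel to a source position in normalized [-1, 1] coordinates.

    theta is a 2x3 affine matrix (hypothetical name, chosen for this sketch).
    """
    ys = np.linspace(-1.0, 1.0, out_h)
    xs = np.linspace(-1.0, 1.0, out_w)
    grid_x, grid_y = np.meshgrid(xs, ys)                  # each (out_h, out_w)
    ones = np.ones_like(grid_x)
    coords = np.stack([grid_x, grid_y, ones], axis=-1)    # (out_h, out_w, 3)
    return coords @ theta.T                               # (out_h, out_w, 2) as (x, y)

def bilinear_sample(image, grid):
    """Read a 2D image at the (x, y) positions in grid using bilinear interpolation."""
    h, w = image.shape
    # Convert normalized coordinates to pixel coordinates.
    x = (grid[..., 0] + 1.0) * (w - 1) / 2.0
    y = (grid[..., 1] + 1.0) * (h - 1) / 2.0
    # Top-left neighbour, clamped so that the +1 neighbour stays inside the image.
    x0 = np.clip(np.floor(x).astype(int), 0, w - 2)
    y0 = np.clip(np.floor(y).astype(int), 0, h - 2)
    wx = np.clip(x - x0, 0.0, 1.0)
    wy = np.clip(y - y0, 0.0, 1.0)
    top = image[y0, x0] * (1 - wx) + image[y0, x0 + 1] * wx
    bottom = image[y0 + 1, x0] * (1 - wx) + image[y0 + 1, x0 + 1] * wx
    return top * (1 - wy) + bottom * wy

# A 4x4 ramp image stands in for a scene image.
image = np.arange(16, dtype=float).reshape(4, 4)

# The identity transform reproduces the input.
identity = np.array([[1.0, 0.0, 0.0],
                     [0.0, 1.0, 0.0]])
same = bilinear_sample(image, affine_grid(identity, 4, 4))

# Scaling by 0.5 zooms into the central half of the image,
# which is the kind of crop a predicted text region would produce.
zoom = np.array([[0.5, 0.0, 0.0],
                 [0.0, 0.5, 0.0]])
crop = bilinear_sample(image, affine_grid(zoom, 2, 2))
```

Because every operation in the sampler is differentiable with respect to `theta`, a recognition loss computed on the cropped region can in principle be propagated back to the component that predicts the transform, which is what makes joint end-to-end training of detection and recognition possible.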

Metadata
Author details:	Christian Bartz ORCiD, Haojin Yang GND, Christoph Meinel ORCiD GND
ISBN:	978-1-57735-800-8
Title of parent work (English):	Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, Thirtieth Innovative Applications of Artificial Intelligence Conference, Eighth Symposium on Educational Advances in Artificial Intelligence
Publisher:	Association for the Advancement of Artificial Intelligence
Place of publishing:	Palo Alto
Publication type:	Other
Language:	English
Year of first publication:	2018
Publication year:	2018
Release date:	2022/02/21
Volume:	10
Number of pages:	8
First page:	6674
Last Page:	6681
Organizational units:	Digital Engineering Fakultät / Hasso-Plattner-Institut für Digital Engineering GmbH
DDC classification:	0 Computer science, information and general works / 00 Computer science, knowledge, systems / 000 Computer science, information and general works
Peer review:	Refereed
