ACKNOWLEDGEMENTS
This work was supported by my supervisor; special thanks
to him for his guidance, which helped this work finish
smoothly and successfully. The test dataset was borrowed
from OCR-d.de, and we are grateful to them for providing
it to evaluate our work. We also thank the author who
shared the code of the existing computer-vision-based
method, which helped us greatly in evaluating our method
effectively.
REFERENCES
Afzal, M. Z., Krämer, M., Bukhari, S. S., Yousefi, M. R.,
Shafait, F., and Breuel, T. M. (2013). Robust bina-
rization of stereo and monocular document images us-
ing percentile filter. In International Workshop on
Camera-Based Document Analysis and Recognition,
pages 139–149. Springer.
Bukhari, S. S., Shafait, F., and Breuel, T. M. (2009). De-
warping of document images using coupled-snakes.
In Proceedings of Third International Workshop on
Camera-Based Document Analysis and Recognition,
Barcelona, Spain, pages 34–41. Citeseer.
Finereaderonline.com (2018). Ocr online - text recognition
& pdf conversion service — abbyy finereader online.
https://finereaderonline.com/en-us. (Accessed on: 10-
09-2018).
Isola, P., Zhu, J.-Y., Zhou, T., and Efros, A. A. (2017).
Image-to-image translation with conditional adversar-
ial networks. arXiv preprint.
Kersner, M. (2018). martinkersner/py_img_seg_eval.
https://github.com/martinkersner/py_img_seg_eval.
(Accessed on: 10-08-2018).
Kim, B. S., Koo, H. I., and Cho, N. I. (2015). Document
dewarping via text-line based optimization. Pattern
Recognition, 48(11):3600–3614.
Koo, H. I., Kim, J., and Cho, N. I. (2009). Composition of
a dewarped and enhanced document image from two
view images. IEEE Transactions on Image Process-
ing, 18(7):1551–1562.
Liang, J., Doermann, D., and Li, H. (2005). Camera-based
analysis of text and documents: a survey. Interna-
tional Journal of Document Analysis and Recognition
(IJDAR), 7(2-3):84–104.
Long, J., Shelhamer, E., and Darrell, T. (2015). Fully con-
volutional networks for semantic segmentation. In
Proceedings of the IEEE conference on computer vi-
sion and pattern recognition, pages 3431–3440.
Mori, S., Nishida, H., and Yamada, H. (1999). Optical char-
acter recognition. John Wiley & Sons, Inc.
NVIDIA (2018). Nvidia/pix2pixhd.
https://github.com/NVIDIA/pix2pixHD.
NVlabs (2018). Nvlabs/ocrodeg.
https://github.com/NVlabs/ocrodeg.
Ocr-d.de (2018). Koordinierte förderinitiative zur
weiterentwicklung von verfahren der optical character
recognition (ocr). http://ocr-d.de/.
Reisenhofer, R., Bosse, S., Kutyniok, G., and Wiegand, T.
(2018). A haar wavelet-based perceptual similarity in-
dex for image quality assessment. Signal Processing:
Image Communication, 61:33–43.
Smith, R. (2007). An overview of the tesseract ocr engine.
In Document Analysis and Recognition, 2007. ICDAR
2007. Ninth International Conference on, volume 2,
pages 629–633. IEEE.
Stamatopoulos, N., Gatos, B., Pratikakis, I., and Perantonis,
S. J. (2008). A two-step dewarping of camera docu-
ment images. In Document Analysis Systems, 2008.
DAS’08. The Eighth IAPR International Workshop on,
pages 209–216. IEEE.
Ulges, A., Lampert, C. H., and Breuel, T. M. (2005). Doc-
ument image dewarping using robust estimation of
curled text lines. In Document Analysis and Recog-
nition, 2005. Proceedings. Eighth International Con-
ference on, pages 1001–1005. IEEE.
Wang, T.-C., Liu, M.-Y., Zhu, J.-Y., Tao, A., Kautz, J., and
Catanzaro, B. (2017). High-resolution image synthe-
sis and semantic manipulation with conditional gans.
arXiv preprint arXiv:1711.11585.
Wang, Z., Bovik, A. C., Sheikh, H. R., and Simoncelli, E. P.
(2004). Image quality assessment: from error visi-
bility to structural similarity. IEEE transactions on
image processing, 13(4):600–612.
Wu, C. and Agam, G. (2002). Document image de-warping
for text/graphics recognition. In Joint IAPR Interna-
tional Workshops on Statistical Techniques in Pattern
Recognition (SPR) and Structural and Syntactic Pat-
tern Recognition (SSPR), pages 348–357. Springer.
Document Image Dewarping using Deep Learning