A Survey of Manuscripts Digitization to Restoration using Image Processing

Lalit Prakash Saxena*
Applied Research Section, Combo Consultancy, Obra, Sonebhadra, Uttar Pradesh, India.
Periodicity: September - November 2019
DOI: https://doi.org/10.26634/jpr.6.3.17302

Abstract

The physical condition of manuscripts, most of them from the ancient period, is deteriorating, which complicates both digitization and effective restoration. Systematic digitization and restoration efforts are under way, and they rely on image processing methods (mainly image acquisition) as a preventive measure. This paper presents a concise survey of classical and recent manuscript image processing methods. Because these documents are physically degraded, photographing them is preferred over scanning, which may cause further deterioration. The paper discusses the methodology, contributions, advantages, and disadvantages of the reviewed methods. It also highlights the challenging problems in manuscript image processing and the scope for future enhancements. The objective of this survey is to help researchers working in the domain of image processing, irrespective of their application areas.

Keywords

Manuscripts, Degradation, Image processing, Digitization, Restoration.

How to Cite this Article?

Saxena, L. P. (2019). A Survey of Manuscripts Digitization to Restoration using Image Processing. i-manager’s Journal on Pattern Recognition, 6(3), 27-36. https://doi.org/10.26634/jpr.6.3.17302
