Image binarization is the process of representing each pixel of an image in binary form, i.e., assigning it a value of either 0 or 1. Before conversion to binary, the image is either gray-scale, with pixel values between 0 and 255, or color, with a value between 0 and 255 for each of the red, green, and blue (RGB) channels. The method through which this conversion is applied to an image is called the binarization method. This paper reviews the methodology, contributions, advantages, and disadvantages of existing studies on binarization methods. Further, the paper highlights the open problems in image processing and the scope for future enhancements.
Image processing is a broad concept that encompasses various techniques used in a wide range of applications. Binarization of an image is a direct and frequently used technique in image processing. A generalized binarization method that adjusts gray-scale image contrast irrespective of the document type has been proposed by Feng and Tan (2004). A thresholding technique for obtaining a monochromatic (binarized) image from a given gray-scale image of old documents is proposed in Mello et al. (2008). A broad survey of image processing from the industrial perspective is given in Fujisawa (2008); although optical character recognition (OCR) technology has been available since the 1950s, both recognition hardware and computing power were limiting factors in the first two decades.
In Badekas and Papamarkos (2007), a preliminary binarization is obtained using a high threshold and the Canny edge detection algorithm (Canny, 1986). The initial result is then refined using a binary image obtained with a low threshold. For images with low contrast, noise, and non-uniform illumination, this double-thresholding method is more effective than the classical methods. Moreover, using an entropy-based thresholding method, the image size was reduced to 12% of the original, which in turn sped up further processing.
In Kittler and Illingworth (1986), local thresholding is employed to detect casting and welding defects in machinery images and to separate the foreground and background of text and non-text regions in document images. Algorithms that handle the challenges of text segmentation in images with complex backgrounds are discussed in Cavalcanti et al. (2006), Gatos et al. (2004), and Oh et al. (2005). In Feng and Tan (2004), a local threshold value for binarization of low-resolution text is obtained from windows containing two consecutive letters. The frequency distribution of the gray-scale values of image pixels is plotted as a histogram, and vertical and horizontal projection profiles of the document image are considered during the binarization process. Valleys observed in the histogram give clues for selecting suitable threshold value(s) (Gonzalez & Woods, 2002; Sonka et al., 2007).
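As a rough illustration of how these clues are computed (a generic sketch, not taken from the cited works; the 8-bit gray-scale NumPy array and the function name are assumptions), the histogram and projection profiles can be obtained as follows:

import numpy as np

def histogram_and_profiles(gray):
    """Gray-level histogram and projection profiles of an 8-bit gray-scale image.

    gray is assumed to be a 2-D NumPy array of dtype uint8.
    """
    # Frequency distribution of the gray values 0..255.
    hist = np.bincount(gray.ravel(), minlength=256)
    # Horizontal projection profile: one sum per row of the image.
    horizontal = gray.sum(axis=1, dtype=np.int64)
    # Vertical projection profile: one sum per column of the image.
    vertical = gray.sum(axis=0, dtype=np.int64)
    return hist, horizontal, vertical

Valleys in the returned histogram then suggest candidate thresholds, while the profiles expose the locations of text lines and columns.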
Ideally, the image of a document having black letters on a white background, or vice versa, has a bimodal histogram (two peaks, one for the background and one for the foreground). Unimodal histograms (one peak) are observed when the text and background have similar intensities, as in the case of stone inscriptions, whereas a multimodal histogram (several peaks) is observed for a multi-colored text document. A multimodal histogram therefore generates multiple candidate thresholds, and finding a pixel-wise threshold value that separates the text from the non-text region becomes difficult (Jain, 1989; Russ, 2007). Interestingly, in many cases the threshold is selected based on manual observation of the valley region(s) and then adjusted by trial and error to a value that works well (Russ, 2007).
Suppose g(x, y) is the thresholded image of a test image f(x, y) for a global threshold value t; the pixel value in g(x, y) is 1 if f(x, y) > t, and 0 otherwise (Niblack, 1986; Rosenfeld & Kak, 1982). Figure 1 shows two example images and their histograms: Figure 1(a) shows a Brahmi script image on rock and Figure 1(b) its histogram, while Figure 1(c) shows a Grantha script image on a palm leaf and Figure 1(d) its histogram. In Figure 1(b), the unimodal histogram of the image suggests a single threshold value. The thresholding of an image is given by,
where pixels labeled 1 or 0 correspond to the foreground and background, respectively. Thus, thresholding converts a gray-scale image to a binary image by mapping every pixel to 1 or 0 according to whether its value is above (or equal to) or below the threshold, respectively (Gonzalez & Woods, 2002; Jain, 1989; MATLAB, 2011). For a global threshold, the value of t is usually chosen from a certain range of candidate values and the image details are inspected for the selected value (Gonzalez & Woods, 2002). By the same reasoning, the histogram in Figure 1(d) suggests two threshold values for the image in Figure 1(c), and the thresholding of the image is then given by,
where the bimodal histogram yields two threshold values t1 and t2. If the image pixel value lies between t1 and t2, the output is 1; otherwise it is 0 (Gonzalez & Woods, 2002; Niblack, 1986).
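A minimal sketch of the two rules above, i.e., thresholding with a single global value t and with a pair of values t1 and t2 for the bimodal case (the NumPy formulation and the function names are assumptions):

import numpy as np

def threshold_single(f, t):
    # g(x, y) = 1 if f(x, y) >= t, else 0
    return (f >= t).astype(np.uint8)

def threshold_band(f, t1, t2):
    # g(x, y) = 1 if t1 <= f(x, y) <= t2, else 0
    return ((f >= t1) & (f <= t2)).astype(np.uint8)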
Histogram-based methods fail when there is little difference between the foreground and background intensities: the valley between the two peaks is not deep, so the histogram tends towards a unimodal distribution. This problem is addressed by the local thresholding method proposed by Niblack (1986) and Saxena (2019), where the threshold value is based on the mean and standard deviation of pixel intensities in a local window. The threshold for Niblack's method is given by,
where m(x, y) and s(x, y) are the local mean and standard deviation, and k is a document-dependent, manually selected parameter (Niblack suggested -0.2 for a dark foreground and +0.2 for a dark background). A large value of k adds extra pixels to the foreground area, which makes the text unreadable, while a small value of k reduces the foreground area, resulting in broken and incomplete characters (in a document having text and non-text regions). The main disadvantage of this method is that it produces a large amount of background noise, which inflates the foreground region.
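Niblack's threshold surface is commonly written as T(x, y) = m(x, y) + k * s(x, y). The following sketch assumes that form, with the window size, the SciPy-based local statistics, and the comparison direction being assumptions rather than details taken from the original method:

import numpy as np
from scipy.ndimage import uniform_filter

def niblack_binarize(gray, window=25, k=-0.2):
    """Binarize with the local threshold surface T = m + k * s."""
    img = gray.astype(np.float64)
    mean = uniform_filter(img, size=window)               # local mean m(x, y)
    mean_sq = uniform_filter(img * img, size=window)      # local mean of squares
    std = np.sqrt(np.maximum(mean_sq - mean * mean, 0))   # local std s(x, y)
    threshold = mean + k * std
    # Here 1 marks pixels above the threshold surface; swap the comparison
    # if the foreground (text) is darker than the background.
    return (img > threshold).astype(np.uint8)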
This problem is addressed in Sauvola and Pietikäinen (2000) by a modified thresholding formula, in which R is the dynamic range of the standard deviation s(x, y) and k takes positive values. This formulation adaptively amplifies the contribution of the standard deviation. Satisfactory results were obtained for 8-bit images with R = 128 and k = 0.5. A large value of R (close to 255) produces a dark background, thus reducing the foreground region. Sauvola's method performs better than the Otsu (1979) and Niblack (1986) methods, but some regions of the image still contain pixels with background noise, as shown in Figure 2.
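For comparison, Sauvola's threshold is widely quoted as T(x, y) = m(x, y) * (1 + k * (s(x, y)/R - 1)); a sketch under that assumption, reusing the local statistics from the Niblack sketch above, is:

import numpy as np
from scipy.ndimage import uniform_filter

def sauvola_binarize(gray, window=25, k=0.5, R=128.0):
    """Binarize with the local threshold surface T = m * (1 + k * (s / R - 1))."""
    img = gray.astype(np.float64)
    mean = uniform_filter(img, size=window)
    mean_sq = uniform_filter(img * img, size=window)
    std = np.sqrt(np.maximum(mean_sq - mean * mean, 0))
    threshold = mean * (1.0 + k * (std / R - 1.0))
    return (img > threshold).astype(np.uint8)  # swap the comparison for dark text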
In the two-stage thresholding technique based on a fuzzy approach (Solihin & Leedham, 1999), the image is first separated into three regions: the foreground, the background, and a fuzzy region with pixel intensities falling in between. The final threshold is found in the second stage, after taking into account the distribution of intensities due to the handwriting device; e.g., a pencil produces lighter intensity (lower contrast) and a wider distribution of intensities than a ballpoint pen, whereas a felt-tipped pen produces higher intensity (greater contrast) and a narrower distribution. The Native Integral Ratio (NIR) helps separate the foreground and background from the fuzzy region, while a newer technique (Solihin & Leedham, 1999) called the Quadratic Integral Ratio (QIR) reduces the local fluctuations introduced by NIR.
The method described in Yang and Yan (2000) uses a run-length histogram of gray-scale values in degraded document images. First, the clustering and connection characteristics of character strokes are analyzed from the run-length histogram for selected image regions and various inhomogeneous gray-scale backgrounds. Then, a logical thresholding method is used to extract the binary image adaptively from a document image with a complex and inhomogeneous background. The method adjusts the window size and the thresholding level adaptively according to the local run-length histogram and the local gray-scale inhomogeneity, and can therefore threshold poor-quality gray-scale document images automatically, without prior knowledge of the document image or manual fine-tuning of parameters.
When histogram-based thresholding techniques fail to separate the text satisfactorily (that is, a global threshold does not work for the whole image), optimal threshold values suited to different regions of the image are generated through adaptive binarization (Batenburg & Sijbers, 2009; Gatos, 2006).
Adaptive binarization is employed for segmentation, noise reduction, document analysis (Baird, 2004), optical character recognition (Casey & Lecolinet, 1996; Plamondon & Srihari, 2000; Trier et al., 1996), imaging 2-D planes of 3-D objects (Leedham et al., 2003), and the specific purpose of manuscript digitization, restoration, and enhancement (Kavallieratou & Antonopoulou, 2005; Perantonis et al., 2004; Sparavigna, 2009). Comparisons and evaluations of different adaptive binarization techniques are given in Blayvas et al. (2006), Duda et al. (2001), Trier and Jain (1995), and Trier and Taxt (1995). A short survey by Kefali et al. (2010) compares global and local thresholding methods for the binarization and enhancement of old Arabic document images. Several binarization methods are compared on the basis of relative area error (RAE), misclassification error (ME), and modified Hausdorff distance (MHD) (Sezgin & Sankur, 2004). An adaptive thresholding method is proposed in Sauvola and Pietikäinen (2000) to enhance degraded document images, while the method in Gatos et al. (2004) interpolates neighboring background intensities to estimate the background surface, as shown in Figure 3.
The global threshold method proposed by Otsu (1979) is based on histogram processing, where interpretation of the histogram reveals a separation of the pixels into two classes, foreground and background. The threshold selection minimizes the weighted within-class variance σ_w²(t) given by Equation (5):
Here the class probabilities are calculated by:
The class means are calculated by:
Finally, the individual class variances are calculated by:
where the sums run over the gray levels. The total variance is then obtained by adding the within-class and between-class variances, as represented by Equation (12):
where
Here, σ² is constant irrespective of the threshold value; hence, minimizing the weighted within-class variance σ_w²(t), or equivalently maximizing the between-class variance σ_b²(t), leads to a threshold value that lies between 0 and 255.
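For reference, the textbook form of these quantities (a reconstruction in standard notation; the exact layout of Equations (5)-(12) in the original may differ) is:

\sigma_w^2(t) = q_1(t)\,\sigma_1^2(t) + q_2(t)\,\sigma_2^2(t)

q_1(t) = \sum_{i=1}^{t} P(i), \qquad q_2(t) = \sum_{i=t+1}^{I} P(i)

\mu_1(t) = \sum_{i=1}^{t} \frac{i\,P(i)}{q_1(t)}, \qquad \mu_2(t) = \sum_{i=t+1}^{I} \frac{i\,P(i)}{q_2(t)}

\sigma_1^2(t) = \sum_{i=1}^{t} \bigl(i-\mu_1(t)\bigr)^2 \frac{P(i)}{q_1(t)}, \qquad \sigma_2^2(t) = \sum_{i=t+1}^{I} \bigl(i-\mu_2(t)\bigr)^2 \frac{P(i)}{q_2(t)}

\sigma^2 = \sigma_w^2(t) + \sigma_b^2(t), \qquad \sigma_b^2(t) = q_1(t)\,q_2(t)\,\bigl(\mu_1(t)-\mu_2(t)\bigr)^2

where P(i) is the normalized histogram and the gray levels run from 1 to I.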
The threshold in Kittler and Illingworth (1986) is calculated by minimizing a criterion function,
where h(z), z = 0, 1, ..., Z - 1, is the normalized histogram, c(z, τ) is the cost function, and τ is the threshold based on the hypotheses H0 and H1 such that,
for k = 1, 2,..., N.
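The criterion is often quoted in the following closed form (a reconstruction in standard notation, not a verbatim copy of the original equations), where P_1(\tau) and P_2(\tau) are the class proportions and \mu_i(\tau), \sigma_i(\tau) the class means and standard deviations of the two populations separated by \tau:

J(\tau) = 1 + 2\bigl[P_1(\tau)\ln\sigma_1(\tau) + P_2(\tau)\ln\sigma_2(\tau)\bigr] - 2\bigl[P_1(\tau)\ln P_1(\tau) + P_2(\tau)\ln P_2(\tau)\bigr]

with the threshold chosen as \tau^{*} = \arg\min_{\tau} J(\tau).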
The threshold calculation by solving the Laplace equation in Yanowitz and Bruckstein (1989) is defined by Equation (16):
where P(x, y) is the potential (threshold) surface and the (x, y) are the data points; the image is first smoothed and its gradient magnitude computed, and the gray-level values at the high-gradient data points are then interpolated into the threshold surface.
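Away from the support points, the threshold surface is required to be harmonic, i.e., to satisfy Laplace's equation; one standard way of writing this constraint and a simple relaxation step that solves it (a generic reconstruction, not the exact iteration of the cited work) is:

\nabla^2 P(x, y) = \frac{\partial^2 P}{\partial x^2} + \frac{\partial^2 P}{\partial y^2} = 0

P_{n+1}(x, y) = \tfrac{1}{4}\bigl[P_n(x-1, y) + P_n(x+1, y) + P_n(x, y-1) + P_n(x, y+1)\bigr]

where the update is applied only at non-support points until convergence, with the support points held fixed at their interpolated gray values.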
An input-sensitive thresholding algorithm is defined in Yosef (2005) for obtaining a binary image using Equation (17):
where f is the gray-scale image and Tf is the global threshold, defined by:
where TC is the mean gray-scale value of CB (the set of pixels belonging to the contours of the components in fB). Further, a reference image is obtained by calculating a local threshold:
where i = 1, ..., N, CCi is the i-th connected component, and mp is the mean gray-scale value of the pixels belonging to CCi.
The mean gradient technique proposed by Leedham et al. (2003) is defined by:
where I(x, y) is the intensity image. Since the gradient is sensitive to noise, adding a pre-condition, referred to as the constant R, improves threshold selection, such that:
else
where k = -1.5 and R = 40; M(x, y) and G(x, y) are the local mean and local mean gradient calculated in a window centered at pixel (x, y), and Constant = max|gray value| - min|gray value| within the window.
The local threshold calculation method in Feng and Tan (2004) is defined by:
where α1, γ, k1, and k2 are positive constants, m(x, y) is the local mean, s(x, y) is the local standard deviation, M is the minimum gray value, and Rs is the dynamic range of the gray-value standard deviation. The value of γ is fixed to 2, and the values of the other factors α1, k1, and k2 lie in the ranges 0.1 - 0.2, 0.15 - 0.25, and 0.01 - 0.05, respectively.
The modified waterfall model for threshold calculation in Oh et al. (2005) is given by:
where I'(x, y) represents the water-filled terrain, G(j, k) denotes a 3 x 3 Gaussian mask with a variance of one, and the parameter α controls the amount of water filled into a local valley (α = 2); water falls at (xi, yi) and reaches the lowest position (xL, yL).
An entropy-based threshold calculation method is proposed by Mello et al. (2008), in which the entropy is given by:
where there are n possible symbols s with probability p(s), and the entropy is measured in bits per symbol. The threshold maximizes the function:
where Hb and Hw are the entropies of the black and white pixels bounded by the threshold t, with Hb computed over 0 to t and Hw over t + 1 to 255. The method introduces two multiplicative constants, mw and mb, which are related to the class of document and are given by:
where Z corresponds to 0.25 < H < 0.30, and the threshold th is given by
and each pixel i with gray level graylevel[i] in the scanned image is turned to white if it satisfies the corresponding threshold condition.
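In standard notation, the entropy quantities referred to above (a reconstruction; the constants mw, mb and the final threshold rule of the cited method are not reproduced here) are:

H = -\sum_{s=1}^{n} p(s)\,\log_2 p(s)

H_b(t) = -\sum_{i=0}^{t} p(i)\,\log_2 p(i), \qquad H_w(t) = -\sum_{i=t+1}^{255} p(i)\,\log_2 p(i)

and the threshold t is selected by maximizing a weighted combination of H_b(t) and H_w(t), with the weights mb and mw chosen according to the document class as described above.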
For documents with varying background contrast, a suppression method detailed in Su et al. (2010) uses the local image maximum and minimum, defined as:
where fmax(x, y) and fmin(x, y) refer to the maximum and minimum image intensities within a local neighborhood window of size 3 x 3. The term ε is a small positive number added to handle the case when the local maximum equals 0. The thresholding equation is defined as:
where the input document image, the pixel position (x, y), and the following quantities appear: Emean and Estd are the mean and standard deviation of the image intensity at the detected high-contrast image pixels, Ne is the number of high-contrast pixels in the window, and Nmin is the minimum number of high-contrast pixels required.
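A sketch of the local maximum-minimum contrast image that drives this method, assuming the commonly cited form C(x, y) = (fmax(x, y) - fmin(x, y)) / (fmax(x, y) + fmin(x, y) + ε); the SciPy filters and the value of ε are assumptions:

import numpy as np
from scipy.ndimage import maximum_filter, minimum_filter

def local_maxmin_contrast(gray, window=3, eps=1e-8):
    """Local max-min contrast image used to locate high-contrast (stroke edge) pixels."""
    img = gray.astype(np.float64)
    f_max = maximum_filter(img, size=window)   # fmax(x, y) over the window
    f_min = minimum_filter(img, size=window)   # fmin(x, y) over the window
    # eps keeps the denominator positive when the local maximum is 0.
    return (f_max - f_min) / (f_max + f_min + eps)

High-contrast pixels can then be detected by thresholding this contrast image, and a pixel is labeled foreground when its window contains at least Nmin such pixels and its intensity lies below a level derived from Emean and Estd, as described above.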
The method in Blayvas et al. (2006) computes an adaptive threshold surface by interpolating the image gray levels at points where the image gradient is high, which is given by:
where pi = (xi, yi) denotes the i-th support point and vi = I(xi, yi) its gray value. The unit step source function is defined by:
where W(I) = [0, 1]² denotes the set of all image points. The shifting is given by:
where l = 0, ..., log2(N) is a scale factor and j, k ∈ {0, ..., 2^l - 1} are spatial shifts. Hence, the threshold surface is given by:
The thresholding method in Saxena (2014) is defined by:
where k lies in the range [-0.5, +0.5] and the constant of proportionality, set to 1/4, gives satisfactory results with Grantha and other manuscript images. The exact value of k does not matter much, since 2(k + 1)s(x, y) is always positive. This factor improves the preservation of stroke width and the maintenance of shape and connectivity, which is essential for recognition purposes.
This section discusses the datasets used by the respective methods explored in Section 2. The work in Otsu (1979) used several gray-scale images of size 64 x 64 pixels, including layout structures, characters or symbols, and cell images.
The manuscript restoration work in Yosef (2005) used thirty Hebrew calligraphy manuscripts from the 14th to the 16th century. The size of these manuscripts varies from 2000 x 1000 to 8000 x 6000 pixels, and they were roughly split into parts of 3000 x 2000 pixels; each part contained an average of 200 characters of approximate size 80 x 80 pixels. The method in Saxena (2014) tested manuscript images on different media and in different scripts: palm leaf (Grantha), rock (Brahmi), and paper (Modi, Newari, Persian, and Roman), as well as the DIBCO datasets. The work in Leedham et al. (2003) used a set of ten images of historical handwritten documents, cheques, forms, and newspapers.
The binarization method in Feng and Tan (2004) was tested on eight sets of document images of various sizes with uneven illumination, low contrast, and random noise. A median filter of size 5 x 5 was applied to remove additional noise. The constants γ, α1, k1, and k2 for the test set were fixed to 2, 0.12, 0.25, and 0.04, respectively. To reduce computational complexity, the local thresholds at the centers of the primary local windows were bilinearly interpolated to obtain threshold values for all pixels.
The experimental dataset in Oh et al. (2005) has no ground-truth data; it consists of ten document images scanned at 200 dpi with a size of 256 x 256 pixels and 8-bit gray levels. A threshold computation time of 3.2 seconds per image, using a C++ implementation, is reported in Mello et al. (2008), whose test set comprised five hundred images of at least 200 dpi in JPEG format. The binarization method in Su et al. (2010) was tested on the dataset from the Document Image Binarization Contest (DIBCO) 2009, which is available online.
Without specifying the exact number of images, the method in Kittler and Illingworth (1986) was tested on various images of objects and structures with different gray levels. Artificially constructed gray-scale images with prominent white noise and uneven background illumination were processed in Yanowitz and Bruckstein (1989); the experiments were extended to spoiled versions of these images with added narrow-band noise. Another work on artificial images, Blayvas et al. (2006), generated four black-and-white images by simulating non-uniform illumination of a black-and-white pattern. This work also proposed a quantitative error measure for binarization methods, computed as the normalized L2 distance between the binarized image and the original black-and-white image.
This section discusses the application of binarization methods to document images. The global thresholding method in Otsu (1979) automatically selects a threshold from the gray-level histogram distribution of the pixels. The method computes the between-class variance of the gray-scale values split into foreground and background groups. It is global in nature, as it selects a single value for every pixel of the image, and as a result it is unable to preserve minute details in the image, such as the connectivity of the characters.
Objects in artificially constructed gray-scale images are clearly detectable with the method of Kittler and Illingworth (1986); however, the method works only when the histogram is clearly bimodal (with distinguishable foreground and background). In the evaluation of Yanowitz and Bruckstein (1989), the proposed method performed better than the Chow-Kaneko approach, which detects bimodality in partial histograms, and the Rosenfeld-Nakagawa approach, both of which failed to detect faint objects. In addition, the work claims that the proposed model is comparable with human performance.
The efficiency of correctly segmenting characters from the document image is up to 94% in Yosef (2005), rising to 98% on a substantial subset. As a drawback, the computational complexity is reduced only for brighter images, and the generalization of the approach is left for future work. The strength of the method in Leedham et al. (2003) is that it processes difficult gray-scale document images with varying pixel intensities; however, it fails to threshold document images affected on both sides (bleed-through) or with noisy backgrounds.
The character recognition rate for machine-printed characters with Feng and Tan (2004) is up to 90.8%, and the work claims robustness of the binarization method to uneven illumination. The suppression of noise in non-text regions and the completeness of the binarized text characters are evident from the binarization results; however, there is no evidence of testing on handwritten character images. The method in Oh et al. (2005) demonstrated the effectiveness of the waterfall model through visual observation only, and the proposed waterfall model outperformed the original waterfall model in terms of processing time.
The method in Mello et al. (2008) addresses the problem of documents written on both sides. The work observed that written documents are about 10% ink (text regions) and 90% paper (non-text regions). For processing, the image size was reduced from 500 KB to 40 KB, which cut the processing time by 97%. In Su et al. (2010), different types of document degradation were identified, such as uneven illumination and document smear; for future studies, it notes that errors may arise if the background of a degraded document image contains a certain amount of pixels that are dense and at the same time have fairly high image contrast.
The work in Blayvas et al. (2006) claims lower computational complexity and smoothness for the proposed binarization method, as well as faster production of binarized images and better noise robustness; however, there is no comparison with Sauvola's binarization method. Text readability of up to 66.27%, 92.15%, 97.90%, 56.23%, 78.62%, and 98.91% for palm-leaf and paper manuscripts is reported in Saxena (2014). This work also shows the effect of the window size (3 x 3 to 15 x 15) on processing time: the smaller the window size, the longer the processing takes.
Binarization of an image is a direct and frequently used technique in image processing, and many methods in the literature have been shown to handle its complexity effectively. Thresholding converts a gray-scale image to a binary image by mapping every pixel to 1 or 0 according to whether its value is above (or equal to) or below the threshold, but histogram-based methods fail when there is little difference between the foreground and background intensities. Some algorithms need extensive manual intervention, which makes them less attractive. Adaptive binarization is employed for segmentation, noise reduction, document analysis, optical character recognition, imaging 2-D planes of 3-D objects, and the specific purpose of manuscript digitization, restoration, and enhancement. Most methods are designed for specific applications and therefore lack generalization. This paper has summarized binarization and thresholding methods, their implementation, usability, and challenges.