A new algorithm for machine printed Arabic character segmentation

Pattern Recognition Letters, Vol. 25, No. 15. (November 2004), pp. 1723-1729.

Abstract

The major problem in machine printed Arabic character segmentation is that the shape of a letter depends on its position in the word. This paper presents a new machine printed Arabic character segmentation algorithm based on the vertical histogram and a set of rules. The rules, which are based not only on the structural characteristics between background regions and character components but also on the characteristics of isolated Arabic characters, are used to check whether a sub-word contains only one character. The vertical histogram and some further rules are then used to find the real segmentation points. Finally, the sub-word is split at those segmentation points. The experimental results show that the algorithm achieved about 94% correct segmentation.
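
As a rough illustration of the vertical-histogram step the abstract mentions, the Python sketch below counts foreground pixels per column of a binary sub-word image and proposes cut columns where that count is low. It is not the paper's algorithm: the rules described in the abstract are not reproduced here, and the function names, the `threshold` parameter, and the toy image are all assumptions made for this example.

```python
from typing import List

Image = List[List[int]]  # binary image: 1 = ink, 0 = background (assumed encoding)


def vertical_histogram(image: Image) -> List[int]:
    """Count ink pixels in each column of a binary image."""
    if not image:
        return []
    width = len(image[0])
    return [sum(row[x] for row in image) for x in range(width)]


def candidate_cut_columns(hist: List[int], threshold: int = 1) -> List[int]:
    """Return the middle column of every interior run with count <= threshold.

    Runs touching the left or right border are skipped, because they are
    margins rather than gaps or thin strokes between two characters.
    The threshold is a hypothetical tuning parameter, not a value from the paper.
    """
    cuts = []
    n = len(hist)
    x = 0
    while x < n:
        if hist[x] <= threshold:
            start = x
            while x < n and hist[x] <= threshold:
                x += 1
            end = x - 1
            if start > 0 and end < n - 1:
                cuts.append((start + end) // 2)
        else:
            x += 1
    return cuts


def split_at(image: Image, cuts: List[int]) -> List[Image]:
    """Split the image into vertical slices; each slice starts at a cut column."""
    width = len(image[0]) if image else 0
    bounds = [0] + list(cuts) + [width]
    return [
        [row[left:right] for row in image]
        for left, right in zip(bounds, bounds[1:])
    ]


# Toy sub-word: two blobs joined by a one-pixel-thick baseline stroke.
example = [
    [0, 1, 1, 0, 0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0, 0, 1, 1, 0],
    [0, 1, 1, 1, 1, 1, 1, 1, 0],
]
hist = vertical_histogram(example)
cuts = candidate_cut_columns(hist)
pieces = split_at(example, cuts)
```

In a real system the candidate columns would only be a starting point; according to the abstract, additional rules decide which candidates are real segmentation points and whether a sub-word is a single character that should not be split at all.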
