Horizontally overlapping lines are common in printed newspapers in almost every language, owing to the dense, compressed typesetting used in newspaper printing. For any optical character recognition (OCR) system, the presence of such lines drastically reduces recognition accuracy. In this paper, we propose a solution for segmenting horizontally overlapping lines. The whole document is divided into strips, and the proposed algorithm is applied to segment the overlapping lines and to associate the small strips with their respective lines. The results show that the algorithm achieves close to 99% accuracy when applied to the Gurmukhi script.
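To illustrate why dividing the page into strips helps, the following is a minimal sketch of strip-wise horizontal projection profiles, a generic technique and not necessarily the algorithm proposed in the paper. It assumes a binarized page stored as a list of rows (1 for ink, 0 for background); the function names `split_into_strips`, `row_profile`, `ink_runs`, and `segment_by_strips`, and the `strip_width` parameter, are hypothetical. The step of associating strip segments with their text lines is not shown.

```python
from typing import List, Tuple

Image = List[List[int]]  # assumed format: 1 = ink pixel, 0 = background


def split_into_strips(image: Image, strip_width: int) -> List[Image]:
    """Cut the page into vertical strips of at most strip_width columns."""
    width = len(image[0]) if image else 0
    return [
        [row[x:x + strip_width] for row in image]
        for x in range(0, width, strip_width)
    ]


def row_profile(strip: Image) -> List[int]:
    """Horizontal projection profile: number of ink pixels in each row."""
    return [sum(row) for row in strip]


def ink_runs(profile: List[int]) -> List[Tuple[int, int]]:
    """Return (first_row, last_row) for each maximal run of non-empty rows."""
    runs: List[Tuple[int, int]] = []
    start = None
    for y, count in enumerate(profile):
        if count > 0 and start is None:
            start = y                      # a run of inked rows begins
        elif count == 0 and start is not None:
            runs.append((start, y - 1))    # a blank row closes the run
            start = None
    if start is not None:
        runs.append((start, len(profile) - 1))
    return runs


def segment_by_strips(image: Image, strip_width: int) -> List[List[Tuple[int, int]]]:
    """Compute the inked row runs separately for every vertical strip."""
    return [ink_runs(row_profile(s)) for s in split_into_strips(image, strip_width)]
```

On a page where two lines touch only in a few columns, the full-width profile has no blank row between them, whereas most individual strips still do. Calling `segment_by_strips(image, strip_width)` therefore exposes line boundaries that a single page-wide profile would miss.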