Python OCR with OpenCV – Optical Character Recognition

We continue our series on Python and OpenCV operations. In this article, I will explain how to install and use Tesseract, the library we will use for OCR operations throughout the series, on a Linux distribution.

In the next article, we will import Tesseract into a Python application and perform OCR operations from our own code.

What is OCR?

OCR (Optical Character Recognition) is the process of detecting and recognizing the characters in an image, whether printed or handwritten, and converting them to text.

Tesseract Library

Tesseract is one of the most widely used OCR libraries. It was originally developed at Hewlett-Packard in the 1980s and was released as open source in 2005.
Tesseract supports English by default, and language packs are currently available for more than 100 languages.

Tesseract Installation (Linux Mint)

sudo apt-get install tesseract-ocr

Check the version after the installation:

tesseract -v

Output:

tesseract 3.04.01
leptonica-1.73
libgif 5.1.2 : libjpeg 8d (libjpeg-turbo 1.4.2) : libpng 1.2.54 : libtiff 4.0.6 : zlib 1.2.8 : libwebp 0.4.4 : libopenjp2 2.1.0
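
The same check can be scripted. The following is a minimal sketch, not part of the original article, that calls tesseract -v from Python and extracts the version number from the banner. The function names are hypothetical, and the regular expression assumes the first line looks like the output shown above. Depending on the release, the banner may be printed to stdout or to stderr, so the sketch reads both.

```python
import re
import shutil
import subprocess


def parse_tesseract_version(text):
    """Return the version string from `tesseract -v` output, or None."""
    # Assumes a banner line such as "tesseract 3.04.01" (an optional "v" is tolerated).
    match = re.search(r"^tesseract\s+v?(\S+)", text, flags=re.MULTILINE)
    return match.group(1) if match else None


def installed_tesseract_version():
    """Return the installed version, or None if tesseract is not on the PATH."""
    if shutil.which("tesseract") is None:
        return None
    result = subprocess.run(["tesseract", "-v"], capture_output=True, text=True)
    # Read both streams, since the banner location may differ between releases.
    return parse_tesseract_version(result.stdout + result.stderr)


# Parsing the banner shown in this article:
sample = "tesseract 3.04.01\n leptonica-1.73\n"
print(parse_tesseract_version(sample))  # prints 3.04.01
```

Calling installed_tesseract_version() returns None when Tesseract is not installed, which makes it usable as a quick precondition check before running any OCR.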

Command Line Usage
The tesseract --help command lists the available commands and options.

For example:

tesseract osp.png stdout -l eng

The -l parameter specifies the language pack.
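
Language packs are identified by three-letter codes such as eng or tur, and the packs installed on a machine can be listed with tesseract --list-langs. As a rough sketch (the function names are hypothetical, and the header line format in the sample is an assumption that may vary between versions), the list can be read from Python like this:

```python
import shutil
import subprocess


def parse_language_list(text):
    """Extract language codes from `tesseract --list-langs` output."""
    codes = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the "List of available languages ..." header.
        if not line or line.lower().startswith("list of"):
            continue
        codes.append(line)
    return codes


def installed_languages():
    """Return the installed language codes, or [] if tesseract is missing."""
    if shutil.which("tesseract") is None:
        return []
    result = subprocess.run(
        ["tesseract", "--list-langs"], capture_output=True, text=True
    )
    # Read both streams, since the list location may differ between releases.
    return parse_language_list(result.stdout + result.stderr)


# Assumed sample output, for illustration only:
sample = "List of available languages (2):\neng\ntur\n"
print(parse_language_list(sample))  # prints ['eng', 'tur']
```

Checking the requested code against this list before running OCR gives a clearer error than letting Tesseract fail on a missing language pack.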

With the stdout parameter, the text read from the image file is written to the console.
If we had used the command tesseract aa.png out -l tur instead, Tesseract would have created a file named out.txt in the working directory and written the recognized characters to that text file.
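
The two output modes described above differ only in the second argument: stdout prints the text to the console, while any other name is used as the base name of a .txt file. The sketch below wraps the command line in Python with the standard subprocess module. The helper names are hypothetical, and it assumes the tesseract executable is on the PATH; it is not the Python integration that the next article covers.

```python
import subprocess


def build_tesseract_command(image_path, output_base="stdout", lang="eng"):
    """Build the argument list for one tesseract command-line call."""
    # output_base "stdout" prints the text; any other value writes <output_base>.txt.
    return ["tesseract", image_path, output_base, "-l", lang]


def ocr_to_text(image_path, lang="eng"):
    """Run tesseract on an image and return the recognized text."""
    command = build_tesseract_command(image_path, "stdout", lang)
    # check=True raises CalledProcessError if tesseract exits with an error.
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout


# The two commands discussed in this article:
print(" ".join(build_tesseract_command("osp.png")))
# prints: tesseract osp.png stdout -l eng
print(" ".join(build_tesseract_command("aa.png", "out", "tur")))
# prints: tesseract aa.png out -l tur
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when file names contain spaces.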