Reading Text From Image or Video

by dhruvarora561 in Circuits > Computers


In this blog, we will learn how to use OpenCV together with pytesseract to read text from images and video.


1. This is a Python script that uses computer vision and optical character recognition (OCR) to capture and read text from an image or video stream.

2. The code contains two files `ocr.py` and `ocrUtils.py`.

ocr.py

1. It imports the necessary libraries: OpenCV, pytesseract, numpy, ImageGrab (from Pillow), time, and ocrUtils.

2. It defines a main function that does the following:

a. Initializes a camera object (using either a Raspberry Pi camera or a connected webcam).

b. Captures an image from the camera.

c. Uses ocrUtils to read text from the captured image using pytesseract, an OCR engine that can recognize text from images.

d. Displays the captured image with text overlaid on it.

e. Waits for user input before capturing another image.
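Steps a to e can be sketched roughly as follows. This is a minimal sketch and not the code from the repository: the names `main` and `is_quit_key` are hypothetical, the camera is assumed to be a webcam at index 0, and the sketch calls `pytesseract.image_to_string()` directly instead of going through `ocrUtils`.

```python
def is_quit_key(key_code: int) -> bool:
    """Return True if a cv2.waitKey() result corresponds to the 'q' key."""
    # waitKey() returns -1 on timeout; the mask keeps only the low byte.
    return (key_code & 0xFF) == ord("q")


def main(camera_index: int = 0) -> None:
    # Imported here so the helper above can be used without OpenCV installed.
    import cv2
    import pytesseract

    cap = cv2.VideoCapture(camera_index)  # a. initialize the camera
    if not cap.isOpened():
        raise RuntimeError(f"Could not open camera {camera_index}")
    try:
        while True:
            ok, frame = cap.read()  # b. capture one image
            if not ok:
                break
            # c. OpenCV frames are BGR, so convert to RGB before running OCR.
            rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
            text = pytesseract.image_to_string(rgb).strip()
            # d. overlay the first recognized line on the image and show it.
            first_line = text.splitlines()[0] if text else ""
            cv2.putText(frame, first_line, (10, 30),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 255, 0), 2)
            cv2.imshow("OCR", frame)
            # e. wait for a key press: 'q' quits, any other key continues.
            if is_quit_key(cv2.waitKey(0)):
                break
    finally:
        cap.release()
        cv2.destroyAllWindows()
```

Calling `main()` starts the loop. Passing `0` to `cv2.waitKey()` blocks until a key is pressed, which matches step e; a small positive value such as `1` would turn the loop into a continuous video stream instead.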

ocrUtils.py

1. It contains a Python function that reads characters from an input image using the Tesseract OCR (Optical Character Recognition) engine. The function takes an input image as a cv2.Mat object and an optional draw parameter that, if set to True, draws boxes around the detected characters and labels them with the recognized text.

2. The function first sets the path to the Tesseract executable using `pytesseract.pytesseract.tesseract_cmd = '/usr/bin/tesseract'`.

3. It then gets the height, width, and number of channels of the input image using img.shape. It uses the `pytesseract.image_to_string()` function to extract the text from the input image, and `pytesseract.image_to_boxes()` to get the coordinates of the bounding boxes around each character.

4. It then iterates over the bounding boxes, draws a rectangle around each character using `cv2.rectangle()`, and optionally labels the character with the recognized text using `cv2.putText()`. Finally, the function returns the recognized text and the modified input image (if draw is True).
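The function described above might look something like the sketch below. The names `parse_boxes` and `read_characters` are hypothetical and the actual code in `ocrUtils.py` may differ. One detail worth showing is that `pytesseract.image_to_boxes()` reports coordinates with the origin at the bottom-left of the image, while OpenCV draws with the origin at the top-left, so the y values have to be flipped using the image height.

```python
def parse_boxes(boxes_output: str, img_height: int):
    """Convert image_to_boxes() text into (char, top_left, bottom_right) tuples.

    Each input line looks like "char left bottom right top page", with y
    measured from the bottom of the image. The returned points use OpenCV's
    top-left origin.
    """
    results = []
    for line in boxes_output.splitlines():
        parts = line.split()
        if len(parts) < 5:
            continue  # skip blank or malformed lines
        char = parts[0]
        left, bottom, right, top = (int(v) for v in parts[1:5])
        results.append((char, (left, img_height - top), (right, img_height - bottom)))
    return results


def read_characters(img, draw: bool = False):
    """Return (text, img); when draw is True, boxes and labels are drawn on img."""
    # Imported here so parse_boxes() can be used without OpenCV installed.
    import cv2
    import pytesseract

    # Path taken from the article; adjust it if Tesseract lives elsewhere.
    pytesseract.pytesseract.tesseract_cmd = "/usr/bin/tesseract"

    height = img.shape[0]  # shape is (height, width[, channels])
    text = pytesseract.image_to_string(img)
    boxes = pytesseract.image_to_boxes(img)

    if draw:
        for char, top_left, bottom_right in parse_boxes(boxes, height):
            cv2.rectangle(img, top_left, bottom_right, (50, 50, 255), 1)
            cv2.putText(img, char, (top_left[0], bottom_right[1] + 15),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.5, (50, 255, 50), 1)
    return text, img
```

Keeping the coordinate conversion in its own small function makes it easy to check by hand: for an image 100 pixels tall, the box line `T 10 20 30 40 0` becomes a rectangle from `(10, 60)` to `(30, 80)`.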

Supplies

  1. Webcam
  2. Brainy Pi
  3. Keyboard
  4. Mouse
  5. Internet Connection

Setting Up

1. Before we can run the code, we need to get it and install its dependencies.

2. First, clone the repository and move into the text detection example's directory.

git clone https://github.com/brainypi/brainypi-opencv-examples.git
cd brainypi-opencv-examples/text-detection

3. Let us create a virtual environment.

python -m venv venv  

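4. Now, activate the virtual environment from the project directory, so that `requirements.txt` is still in the current directory for the next step. On Linux, which the `/usr/bin/tesseract` path used in `ocrUtils.py` assumes:

source venv/bin/activate

On Windows, run `venv\Scripts\activate` instead.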

5. Install the dependencies.

pip install -r requirements.txt 
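After installing, a quick sanity check can confirm that the pieces are in place. The sketch below is not part of the repository; the function names are hypothetical, and the module list is an assumption based on the libraries named earlier (`cv2` for OpenCV, `PIL` for ImageGrab). It also looks for the `tesseract` executable, which pytesseract needs in addition to the Python packages.

```python
import importlib.util
import shutil


def missing_modules(names):
    """Return the module names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def find_tesseract():
    """Return the path to the tesseract executable on PATH, or None."""
    return shutil.which("tesseract")


if __name__ == "__main__":
    # Assumed module names; adjust to match requirements.txt.
    missing = missing_modules(["cv2", "pytesseract", "numpy", "PIL"])
    print("Missing Python modules:", missing or "none")
    print("Tesseract executable:", find_tesseract() or "not found on PATH")
```

If the script reports the Tesseract executable at a location other than `/usr/bin/tesseract`, the path set in `ocrUtils.py` would need to be changed to match.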

Running the Code

1. We can now run the script and start reading text from the camera.

python ocr.py

2. Press `q` to exit the program.