How to improve the accuracy of reading text from an image?

Python

Yevgeni, 2018-08-05 17:01:03

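from io import BytesIO

import requests
from PIL import Image
from pytesseract import image_to_string

url = 'https://pbs.twimg.com/media/Dh-ZHUPX4AAQwG-.jpg:large'

# Download the image and stop early on an HTTP error
response = requests.get(url)
response.raise_for_status()

# Decode the downloaded bytes into a PIL image
img = Image.open(BytesIO(response.content))

# Run Tesseract OCR on the unprocessed image
text_from_image = image_to_string(img)

print(text_from_image)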

What image manipulations can be performed to improve the accuracy of reading text from an image?
At the moment, the reading accuracy does not exceed 40%.

2 answers

Dimonchik, 2018-08-05
@dimonchik2013

Increase the sharpness.
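A minimal sketch of what sharpening could look like with Pillow, the library the question already uses. The helper name `sharpen` and the default factor of 2.0 are assumptions chosen for illustration, not values from this answer; the right factor depends on the image.

```python
from PIL import Image, ImageEnhance


def sharpen(img, factor=2.0):
    """Return a copy of img with its sharpness scaled by factor.

    A factor of 1.0 leaves the image unchanged; values above 1.0 sharpen it.
    The default of 2.0 is an illustrative guess, not a recommended setting.
    """
    return ImageEnhance.Sharpness(img).enhance(factor)
```

The result can be passed to `image_to_string` in place of the original `img` from the question.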

Dmitry, 2019-05-12
@DmitryKyd

Crop out everything superfluous (in this case, the picture of the weapon), increase the sharpness, remove the gray background, and convert the result to black and white. After these steps you get black text on a white background, which Tesseract recognizes much better.
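The steps above could be sketched roughly as follows with Pillow. The function name `preprocess_for_ocr`, the `crop_box` argument, and the threshold of 128 are assumptions made for illustration; the sketch also assumes dark text on a lighter background, so the threshold has to sit between the text brightness and the background brightness of the actual image.

```python
from PIL import Image, ImageFilter, ImageOps


def preprocess_for_ocr(img, crop_box=None, threshold=128):
    """Crop, sharpen, and binarize an image before OCR.

    crop_box is a (left, upper, right, lower) tuple in pixels, or None to
    keep the whole image. threshold is an illustrative default: pixels
    brighter than it become white, all others become black.
    """
    # Cut away everything that is not text (for example, a picture)
    if crop_box is not None:
        img = img.crop(crop_box)

    # Drop color information so a single threshold can be applied
    gray = ImageOps.grayscale(img)

    # Apply Pillow's built-in sharpening kernel
    sharp = gray.filter(ImageFilter.SHARPEN)

    # Turn the light background white and the dark text black
    return sharp.point(lambda p: 255 if p > threshold else 0)
```

With the question's code, this would be called as `image_to_string(preprocess_for_ocr(img))`; the crop box and threshold need to be tuned by hand for each kind of image.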