Neural networks
Perpetuum_Immobile, 2020-11-17 18:06:41

Algorithms for brightening documents and image binarization: where to start studying them?

Not long ago, a friend and I were assigned a software research project on ways to brighten documents.

Two main approaches were identified: classical algorithms and neural networks. Tesseract-OCR was chosen as the program to experiment on for the research. So the question arose: where to start? Where can I find examples of such algorithms and neural networks, and where can I learn about their features and implementation? Where should one start studying neural networks and image recognition algorithms?

I am asking these questions here, and I also undertake to add the answers I have already found and verified to this or a separate resource (I will link to it from this question one way or another).

Any material related to the topic is welcome.

2 answers
Andrey Dugin, 2020-11-17
@adugin

In essence, your task is to separate the text from the background. One option is the simplest convolutional autoencoder: with a sufficiently short Z-vector (bottleneck), it will learn to reconstruct the background but not the letters. Subtract the background reconstructed by the autoencoder from the original image, and voilà, only the text is left. You can also look up which algorithm the DjVu format uses.
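A minimal sketch of the "reconstruct the background, then subtract" idea, in Python with NumPy. It is not the convolutional network described above but a linear stand-in: for a bias-free linear autoencoder, the best reconstruction through a bottleneck of size k has a known closed form (truncated SVD), so the effect can be seen without a training loop or a deep learning framework. The synthetic page, the function names, the bottleneck size of 1 and the threshold of 0.3 are all assumptions made for illustration.

```python
import numpy as np


def linear_autoencoder_reconstruct(rows, bottleneck):
    """Closed-form optimum of a bias-free linear autoencoder.

    Each row is one sample. Projecting the rows onto the top `bottleneck`
    right singular vectors gives the best reconstruction such a model can
    reach (truncated SVD), so no gradient training is needed here.
    """
    _, _, vt = np.linalg.svd(rows, full_matrices=False)
    decoder = vt[:bottleneck]      # shape: (bottleneck, n_features)
    codes = rows @ decoder.T       # encode: one short Z-vector per row
    return codes @ decoder         # decode back to full-width rows


def extract_text(page, bottleneck=1, threshold=0.3):
    """Estimate the background, subtract the page, keep what stands out."""
    background = linear_autoencoder_reconstruct(page, bottleneck)
    difference = background - page  # ink is darker than the background
    return background, difference > threshold


def make_synthetic_page(size=64):
    """Hypothetical test page: unevenly lit paper with sparse dark ink."""
    y, x = np.mgrid[0:size, 0:size]
    background = 0.7 + 0.1 * x / (size - 1) + 0.1 * y / (size - 1)
    ink = (7 * y + 3 * x) % 23 == 0  # sparse pixels standing in for text
    page = np.where(ink, 0.1, background)
    return page, background, ink


if __name__ == "__main__":
    page, clean_bg, ink = make_synthetic_page()
    background, text_mask = extract_text(page)
    print("ink pixels found:", int(text_mask.sum()), "of", int(ink.sum()))
```

The narrow bottleneck is what does the work: one number per row is enough to describe smooth lighting, but not the positions of individual ink pixels, so the ink survives the subtraction. This toy version relies on the ink being sparse in every row; a real document would more likely need the convolutional autoencoder trained on image patches.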

Alexander Skusnov, 2020-11-18
@AlexSku

Gonzalez, Woods, "Digital Image Processing"
