PHP
Anatoly, 2019-11-05 11:03:24

How do I parse a balance sheet and an income statement that exist only as photographs?

I want to write a program that takes a photo or PDF of an accounting report as input, recognizes its line items, and saves them to a database for further analysis. For example, a balance sheet has two main sections, assets and liabilities, each with lines such as "Buildings and structures: 123,456", "Deposits: 123,345", "Cash: 123,456". After recognition and parsing, this data goes into an array like this:

$balance = array(
    "actives" => array(
        "Fixed assets" => 123456,
        "Deposits" => 123456,
        "Cash" => 123456,
    ),
    "passives" => array(
        // etc.
    ),
);

This data will then be processed, stored in the database, and so on.
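
The storage step could look roughly like the sketch below. It is only an illustration: the helper names `balanceToRows` and `saveBalance`, the table `balance_items`, and its columns are all assumptions, not part of the question.

```php
<?php
// Hypothetical helper: flattens the nested $balance array into rows
// of [section, item, amount], one row per line item.
function balanceToRows(array $balance): array
{
    $rows = [];
    foreach ($balance as $section => $items) {
        foreach ($items as $item => $amount) {
            $rows[] = [$section, $item, $amount];
        }
    }
    return $rows;
}

// Hypothetical helper: writes the rows through PDO with a prepared statement.
// The table `balance_items` and its three columns are assumed to exist.
function saveBalance(PDO $pdo, array $balance): void
{
    $stmt = $pdo->prepare(
        'INSERT INTO balance_items (section, item, amount) VALUES (?, ?, ?)'
    );
    foreach (balanceToRows($balance) as $row) {
        $stmt->execute($row);
    }
}
```

With a suitable table in place, a call such as `saveBalance(new PDO('sqlite:balance.db'), $balance);` would store every line item. The prepared statement keeps OCR output, which can contain arbitrary characters, out of the SQL text.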
The question is: how do I recognize the data if the balance sheet is an image?
Am I right that this task calls for a neural network? If so, is it possible to write such a neural network in PHP? :)


2 answers
Peter Slobodyanyuk, 2019-11-05
@user1410

There are sites that recognize images for a fee, and some of them offer an API.
For example: antigate.com, captchabot.com, anti-captcha.net, wisetrend.com. (The last two seem to let you test for free.)
You can also use GOCR (jocr.sourceforge.net) and call it from PHP.
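
A minimal sketch of such a call, assuming the `gocr` binary is installed and on the PATH, and that the image is already in a format GOCR can read (it handles the PNM family natively). The helper names `buildGocrCommand` and `recognizeText` are made up for illustration.

```php
<?php
// Hypothetical helper: builds the shell command that runs GOCR on one image.
function buildGocrCommand(string $imagePath): string
{
    // escapeshellarg() protects against spaces and shell metacharacters in the path.
    return 'gocr -i ' . escapeshellarg($imagePath);
}

// Hypothetical helper: runs GOCR and returns the recognized text.
function recognizeText(string $imagePath): string
{
    if (!is_readable($imagePath)) {
        throw new InvalidArgumentException("Cannot read image: $imagePath");
    }
    $outputLines = [];
    // Redirect stderr so that error messages end up in $outputLines too.
    exec(buildGocrCommand($imagePath) . ' 2>&1', $outputLines, $exitCode);
    if ($exitCode !== 0) {
        throw new RuntimeException('gocr failed: ' . implode("\n", $outputLines));
    }
    return implode("\n", $outputLines);
}
```

Calling `recognizeText('balance.pnm')` would then return the raw text. Expect recognition errors on photographs, so the result still needs validation.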
After that, it is purely a matter of programming skill: check the recognized text with regular expressions, break it into a specific structure, and write it to the database.
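
The regular-expression step might be sketched as follows. It assumes the OCR output arrives as one "Label: amount" pair per line, as in the question's examples, and that amounts are whole numbers. The function name `parseLineItems` is hypothetical.

```php
<?php
// Hypothetical parser: turns OCR text lines like "Cash: 123,456"
// into an array such as ['Cash' => 123456]. Lines that do not match are skipped.
function parseLineItems(string $ocrText): array
{
    $items = [];
    foreach (preg_split('/\R/u', $ocrText) as $line) {
        // A label, a colon, then digits with optional space or comma separators.
        if (preg_match('/^(.+?)\s*:\s*(\d[\d ,]*)$/u', trim($line), $m)) {
            // Strip the separators so "123,456" becomes the integer 123456.
            $items[$m[1]] = (int) preg_replace('/\D/', '', $m[2]);
        }
    }
    return $items;
}

$text = "Buildings and structures: 123,456\nDeposits: 123,345\nCash: 123,456";
$balance = ['actives' => parseLineItems($text), 'passives' => []];
```

Skipping unmatched lines silently is a design choice for the sketch. In practice it is probably worth logging them, since a misread digit or a missing colon is exactly the kind of error OCR produces.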

nuclear_kote, 2019-11-05
@nuclear_kote

Well, for starters, OCR.
