How can I parse a balance sheet and income statement that are only available as photographs?
I want to write a program that takes a photo or PDF of an accounting report as input, recognizes its line items, and saves them to a database for further analysis. For example, a balance sheet has two main sections, assets and liabilities, each containing lines such as "Buildings and structures: 123,456", "Deposits: 123,345", "Cash: 123,456". After recognition and parsing, this data would be stored in an array like this:
$balance = array(
    "actives" => array(
        "Fixed assets" => 123456,
        "Deposits" => 123456,
        "Cash" => 123456,
    ),
    "passives" => array(
        // etc.
    ),
);
There are paid image-recognition services, and some of them offer an API.
For example: antigate.com, captchabot.com, anti-captcha.net, wisetrend.com. (The last two seem to allow free testing.)
You can also use GOCR, an open-source OCR program: jocr.sourceforge.net
And call it from PHP:
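A minimal sketch of such a call, assuming the `gocr` binary is installed and on the PATH and the image is already in a format GOCR reads (for example PNM). The function names `buildGocrCommand` and `recognizeImage` are made up for illustration and are not part of GOCR.

```php
/**
 * Build the shell command that runs GOCR on one image.
 * Assumption: the `gocr` binary is on the PATH.
 */
function buildGocrCommand(string $imagePath): string
{
    // escapeshellarg() quotes the path so spaces and special characters are safe.
    return 'gocr -i ' . escapeshellarg($imagePath);
}

/**
 * Run GOCR and return the recognized text, one array element per output line.
 * Throws if the command exits with a non-zero status.
 */
function recognizeImage(string $imagePath): array
{
    $lines = [];
    $status = 0;
    exec(buildGocrCommand($imagePath), $lines, $status);
    if ($status !== 0) {
        throw new RuntimeException("gocr failed with exit code $status");
    }
    return $lines;
}
```

It could then be called as `$lines = recognizeImage('/tmp/balance.pnm');` where the path is only an example. Recognition quality depends heavily on the photo, so the output will usually need cleaning before parsing.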
After that it is purely a matter of programming: check the recognized text with regular expressions, break it into the structure you need, and write it to the database.
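The regular-expression step might look roughly like this. It is only a sketch, assuming every recognized line has the form "name: amount" as in the question's examples; real OCR output is likely to be messier. The function name `parseLineItems` is hypothetical.

```php
/**
 * Parse OCR output lines such as "Cash: 123,456" into name => amount pairs.
 * Lines that do not match the pattern are skipped.
 */
function parseLineItems(array $lines): array
{
    $items = [];
    foreach ($lines as $line) {
        // A name, a colon, then an amount made of digits with optional
        // space or comma thousands separators.
        if (preg_match('/^\s*(.+?)\s*:\s*(\d[\d ,]*)\s*$/u', $line, $m)) {
            // Strip the separators before converting to an integer.
            $items[$m[1]] = (int) preg_replace('/\D/', '', $m[2]);
        }
    }
    return $items;
}

// Example input taken from the question; in practice it would come from the OCR step.
$balance = [
    'actives' => parseLineItems([
        'Buildings and structures: 123,456',
        'Deposits: 123,345',
        'Cash: 123,456',
    ]),
];
```

Skipping lines that do not match keeps headers and OCR noise out of the result, but it also hides recognition errors, so it may be worth logging the skipped lines for manual review.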