How do I convert PDF to HTML using Python?
Hi Habr!
This may be a stupid question, but I have searched the entire Internet and have not found a suitable library for the conversion!
Who can help?
https://github.com/coolwanglu/pdf2htmlEX
The result is, frankly, strange.
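For anyone who wants to try it from Python anyway, here is a minimal sketch that simply calls the pdf2htmlEX command-line tool through `subprocess`. It assumes the `pdf2htmlEX` binary is installed and on PATH. The helper names `build_command` and `convert` are made up for this example, and the output file name (input name with an `.html` extension) is an assumption about the tool's default behaviour.

```python
import shutil
import subprocess
from pathlib import Path


def build_command(pdf_path, dest_dir="."):
    """Build the pdf2htmlEX command line as a list of arguments."""
    # --dest-dir tells pdf2htmlEX which directory to write the HTML into.
    return ["pdf2htmlEX", "--dest-dir", str(dest_dir), str(pdf_path)]


def convert(pdf_path, dest_dir="."):
    """Run pdf2htmlEX and return the path where the HTML is expected."""
    if shutil.which("pdf2htmlEX") is None:
        raise RuntimeError("pdf2htmlEX was not found on PATH")
    # check=True raises CalledProcessError if the tool exits with an error.
    subprocess.run(build_command(pdf_path, dest_dir), check=True)
    # Assumption: the output is named after the input, with .html appended.
    return Path(dest_dir) / (Path(pdf_path).stem + ".html")
```

Something like `convert("report.pdf", "out")` should then leave the page in `out/report.html`, if the assumption about the default name holds.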
So, to put it plainly: converting PDF to HTML will not work in any programming language! The most you can extract from a PDF is text, stripped of any markup. The PDF format was created for prepress, so it lacks information about headings, paragraphs, and styles. All a PDF stores for text is the text itself plus attributes saying where and how to place it, and nothing else.
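To illustrate what "text plus placement attributes" leaves you with, here is a rough sketch that turns positioned text fragments into absolutely positioned HTML spans. The fragment format `(x, y, text)` and the function `fragments_to_html` are hypothetical, invented for this example; a real extractor would have to supply the coordinates. It assumes coordinates in points with the origin at the bottom-left corner of the page.

```python
from html import escape


def fragments_to_html(fragments, page_width, page_height):
    """Render (x, y, text) fragments as absolutely positioned spans.

    Hypothetical input format: x and y are in points, measured from the
    bottom-left corner of the page.
    """
    spans = []
    for x, y, text in fragments:
        # Flip the y axis, because HTML measures "top" downward.
        top = page_height - y
        spans.append(
            f'<span style="position:absolute;left:{x}pt;top:{top}pt">'
            f"{escape(text)}</span>"
        )
    return (
        f'<div style="position:relative;width:{page_width}pt;'
        f'height:{page_height}pt">' + "".join(spans) + "</div>"
    )
```

The result looks like the page, but it is only a pile of positioned spans: no headings, no paragraphs, no semantic markup, which is exactly the limitation described above. The vertical placement is also only approximate, since it ignores font metrics.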