How to clean up exported HTML from Google Docs?
Does anyone know a better way than tidy-html to clean up the very messy HTML that Google Docs exports? I'm interested in programmatic methods (preferably C++), without using Google Scripts.
Tidy does a very poor job: for example, it reduced 200 thousand characters to only 195 thousand.
An alternative method that I used myself:
I downloaded the document in ODT format and, using an installed copy of LibreOffice, converted it to DocBook, a format whose structure resembles HTML but carries none of the “decorations”, that is, no styles.
You can view and edit DocBook in LibreOffice. You can convert this format to others: LaTeX, PDF, HTML, ...
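Since the question asks for something scriptable from C++, the ODT-to-DocBook step could be driven roughly as below. This is only a sketch under assumptions: `soffice` is on the PATH, the commands run through a POSIX shell, and the DocBook export filter is registered under the name "DocBook File" (check this against your own LibreOffice installation). The helper names `shell_quote`, `build_convert_command` and `convert_odt_to_docbook` are made up for illustration.

```cpp
#include <cstdlib>
#include <string>

// Quote one argument for a POSIX shell: wrap it in single quotes and
// rewrite each embedded ' as '\'' so the shell passes it through literally.
std::string shell_quote(const std::string& arg) {
    std::string out = "'";
    for (char c : arg) {
        if (c == '\'') out += "'\\''";
        else out += c;
    }
    out += "'";
    return out;
}

// Build the headless LibreOffice command line for ODT -> DocBook.
// Assumptions (not from the original answer): `soffice` is on PATH and
// the export filter is named "DocBook File" in this installation.
std::string build_convert_command(const std::string& odt_path,
                                  const std::string& out_dir) {
    return "soffice --headless --convert-to " +
           shell_quote("xml:DocBook File") +
           " --outdir " + shell_quote(out_dir) + " " + shell_quote(odt_path);
}

// Run the conversion; returns the raw status reported by std::system.
int convert_odt_to_docbook(const std::string& odt_path,
                           const std::string& out_dir) {
    return std::system(build_convert_command(odt_path, out_dir).c_str());
}
```

Calling `convert_odt_to_docbook("doc.odt", "out")` should leave an XML file in `out` if the filter name matches; the result can then be converted further or post-processed with any XML parser.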