What is the easiest way to represent simple formatted text in Python?
I will have to deal with simple formatted text from multiple sources in different formats, such as Markdown. I don't need features like embedded tables or images, just font styles, monospaced text, and possibly color.
Clearly, the easiest approach is to choose one representation for such text and convert all incoming data into it. The question is which representation to choose. Ideally the data should be textual, to simplify transmission, storage and debugging.
XML is nice because it can be stored in a plain string and is fairly easy to escape. It is supported out of the box (xml.etree.ElementTree in the standard library) or via lxml and equivalents. Its enriched text variant gives pretty much the desired subset of features. But it is quite verbose, and converting the text will be tedious - you have to run through a tree of tags.
Markdown is compact and can also be stored in a plain string. But it's harder to escape, and it doesn't support color (as far as I know). When exporting to Markdown this is not a problem, but when importing I would like to preserve everything possible. I'm not sure whether there is a library that can parse Markdown - there probably is, but I would have to find one and learn it.
RTF is more or less common, but it is rather confusing and resembles TeX. As for working with it, I suspect it combines the verbosity of XML with the non-triviality of Markdown.
What other options are there?
Some of your requirements are strange. Who exactly will find it tedious? Are you planning to convert by hand? Or are you worried about the algorithmic complexity of the conversion? Conversion is an isolated and easily testable task. XML-based markup will give you maximum flexibility and versatility: you can always ignore certain formatting options without losing any nuances in the stored data. You can add your own namespaces and attributes. Markdown cannot guarantee this.
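To illustrate how little code the conversion takes, and how formatting you don't care about can simply be ignored, here is a minimal sketch using the standard library's xml.etree.ElementTree. The tag names (text, b, i, code, span) and the to_markdown function are assumptions made up for this example, not an established schema, and Markdown escaping is deliberately left out.

```python
import xml.etree.ElementTree as ET

# Hypothetical tag vocabulary; the question does not define one.
MARKDOWN_WRAPPERS = {"b": "**", "i": "*", "code": "`"}


def to_markdown(element):
    """Recursively render an element as Markdown, dropping unknown formatting."""
    parts = [element.text or ""]
    for child in element:
        parts.append(to_markdown(child))
        # Text that follows the child's closing tag belongs to the parent.
        parts.append(child.tail or "")
    # Tags without a Markdown equivalent (e.g. span with a color) keep their text only.
    wrapper = MARKDOWN_WRAPPERS.get(element.tag, "")
    return wrapper + "".join(parts) + wrapper


source = (
    '<text>Plain, <b>bold</b>, '
    '<span color="red">red <i>italic</i></span>, <code>mono</code>.</text>'
)
markdown = to_markdown(ET.fromstring(source))
print(markdown)
```

The color is lost only in the Markdown export; the XML source still carries it, so another exporter could use it.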
What does "running through a tree of tags" mean? Are you going to run through it by hand?
Where and why do you need to convert? You somehow left this important part out.
Do you need to convert back as well? Is there a requirement that a double conversion reproduce the original exactly?
Use XML and don't overthink it. The common format should be as strict, documented, unambiguous, and universal as possible. You won't find anything better than XML here. If at some point you encounter special formatting in one of the source formats (say, a wavy underline), it will not be hard to add a new tag or attribute to the XML to preserve that information without breaking backward compatibility, whereas the zoo of Markdown dialects will cause you a lot of pain.
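As a rough sketch of the backward-compatibility point: suppose a wavy underline is added later as a u tag with a style attribute. Both names are hypothetical and chosen only for this example. Code written before the extension existed keeps working, and the new information survives parsing and serializing.

```python
import xml.etree.ElementTree as ET

# <u style="wavy"> is a hypothetical extension introduced after the fact.
extended = '<text>A <u style="wavy">misspelled</u> word and <b>bold</b> text.</text>'
root = ET.fromstring(extended)

# An older consumer that only needs the characters is unaffected by the new tag.
plain = "".join(root.itertext())

# A newer consumer can look for the extension explicitly.
wavy = [el.text for el in root.iter("u") if el.get("style") == "wavy"]

# Parsing and serializing again keeps the extension in place.
round_trip = ET.tostring(root, encoding="unicode")

print(plain)
print(wavy)
print(round_trip)
```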