How to find HTML tags in a txt file?
Good afternoon! Please help with this problem. I have a txt file that contains both plain text and several tables in the form <table> ... </table>. For example:
"
ACCESSION NUMBER: 0000796343-18-000015
CONFORMED SUBMISSION TYPE: 10-K
PUBLIC DOCUMENT COUNT: 109
CONFORMED PERIOD OF REPORT: 20171201
FILED AS OF DATE: 20180122
DATE AS OF CHANGE: 20180122
<table id=1>
<tr>
<td>Some Text</td>
</tr>
</table>
<table id=2>
<tr>
<td>Some Text</td>
</tr>
</table>
"
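To match any tag, capture the tag name and refer back to it in the closing tag. Each match returns two values: the tag name and its contents. Note that the contents begin with the rest of the opening tag, attributes included.
>>> import re
>>> # s holds the full text of the file
>>> tables = re.findall(r'<(\w+)([\s\S]+?)<\/\1>', s)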
>>> tables
[('table', ' id=1>\n <tr>\n <td>Some Text</td>\n </tr>\n'), ('table', ' id=2>\n <tr>\n <td>Some Text</td>\n </tr>\n')]
If you only need the tables, name the tag explicitly:
>>> tables = re.findall(r'<table([\s\S]+?)<\/table>', s)
>>> tables
[' id=1>\n <tr>\n <td>Some Text</td>\n </tr>\n', ' id=2>\n <tr>\n <td>Some Text</td>\n </tr>\n']
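The same regex idea can be packaged as a small function. This is only a sketch, assuming the tables are not nested and the file fits in memory; the name extract_tables and the sample string are made up for illustration. Unlike the patterns above, it separates the opening tag's attributes from the inner markup.

```python
import re

# Hypothetical sample in the shape of the file from the question.
SAMPLE = """ACCESSION NUMBER: 0000796343-18-000015
<table id=1>
<tr>
<td>Some Text</td>
</tr>
</table>
<table id=2>
<tr>
<td>Some Text</td>
</tr>
</table>
"""

# Group 1: attributes of the opening tag; group 2: everything up to </table>.
# DOTALL lets "." cross newlines; IGNORECASE also accepts <TABLE>.
TABLE_RE = re.compile(r'<table\b([^>]*)>(.*?)</table\s*>',
                      re.IGNORECASE | re.DOTALL)


def extract_tables(text):
    """Return one (attributes, inner_markup) tuple per non-nested table."""
    return [(m.group(1).strip(), m.group(2).strip())
            for m in TABLE_RE.finditer(text)]


tables = extract_tables(SAMPLE)
for attrs, inner in tables:
    print(attrs, '->', len(inner), 'characters of markup')
```

For a real file, read it into a string first (for example with open(...).read()) and pass that string to extract_tables. If tables can be nested, a regex like this will cut at the first closing tag, so a parser is the safer choice.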
Wrap all of the text in <body>.
Then see https://stackoverflow.com/questions/3051295/jquery...
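The suggestion above seems to be to hand the whole text to an HTML parser instead of a regex. In Python, a rough equivalent (swapped in here for the jQuery approach, not taken from the linked answer) is the standard library's html.parser, which tolerates the surrounding plain text without any wrapping. The class name TableTextParser and the sample string are hypothetical, and the sketch assumes tables are not nested.

```python
from html.parser import HTMLParser


class TableTextParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by table."""

    def __init__(self):
        super().__init__()
        self.tables = []       # one list of cell strings per <table>
        self._in_cell = False  # True while inside a <td> or <th>

    def handle_starttag(self, tag, attrs):
        if tag == 'table':
            self.tables.append([])
        elif tag in ('td', 'th') and self.tables:
            self._in_cell = True
            self.tables[-1].append('')

    def handle_endtag(self, tag):
        if tag in ('td', 'th'):
            self._in_cell = False

    def handle_data(self, data):
        # Text outside of cells (including the plain-text header) is ignored.
        if self._in_cell:
            self.tables[-1][-1] += data


# Hypothetical sample in the shape of the file from the question.
sample = """FILED AS OF DATE: 20180122
<table id=1>
<tr>
<td>Some Text</td>
</tr>
</table>
<table id=2>
<tr>
<td>Some Text</td>
</tr>
</table>
"""

parser = TableTextParser()
parser.feed(sample)
parser.close()
print(parser.tables)  # [['Some Text'], ['Some Text']]
```

This gives the cell text directly rather than raw markup, which is usually what is wanted when the tables hold data.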