Python
kennnies, 2019-07-03 20:03:11

How do I parse a table with Python regular expressions?

There is the following table:

HTML table
<tr>
          <td>99</td>
          <td>Name</td>
          <td>ЕГЭ</td>
          <td>268</td><td>90</td><td>91</td><td>87</td>
          <td></td>
          <td>Копия</td>
          <td>Нет</td>
        </tr>

I use the following regular expression to parse numbers:
re.findall(r'\d{3,3}\d{1,3}\d{1,3}\d{1,3}
I also need to extract the "Копия" ("Copy") field, but the line breaks between the cells get in the way. I tried matching them with
\s \n \t \r and \s
None of that worked. How can this be done?


2 answers
Vladimir Kuts, 2019-07-03
@fox_12

Regular expressions are far from the best tool for this. An HTML parser such as lxml handles the whitespace and line breaks for you:

>>> import lxml.html
>>> str1 = """
... <tr>
...           <td>99</td>
...           <td>Name</td>
...           <td>ЕГЭ</td>
...           <td>268</td><td>90</td><td>91</td><td>87</td>
...           <td></td>
...           <td>Копия</td>
...           <td>Нет</td>
...         </tr>"""
>>> root = lxml.html.fromstring(str1)
>>> [x.text for x in root.xpath('.//td')]
['99', 'Name', 'ЕГЭ', '268', '90', '91', '87', None, 'Копия', 'Нет']
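
If installing lxml is not an option, roughly the same thing can be sketched with the standard library's html.parser. This is only a minimal sketch: the class name CellCollector is made up for illustration, and it assumes the row always has the same column order as in the question, so that the "Копия" field sits at index 8. Unlike lxml, an empty cell comes back as an empty string rather than None.

```python
from html.parser import HTMLParser


class CellCollector(HTMLParser):
    """Collect the text of every <td> cell (hypothetical helper, not a library class)."""

    def __init__(self):
        super().__init__()
        self.cells = []       # text of each cell seen so far
        self._in_td = False   # True while the parser is inside a <td>

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self._in_td = True
            self.cells.append("")  # start a new, initially empty cell

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_td = False

    def handle_data(self, data):
        # Whitespace and newlines between cells arrive here too,
        # but are ignored because they fall outside any <td>.
        if self._in_td:
            self.cells[-1] += data


row_html = """
<tr>
          <td>99</td>
          <td>Name</td>
          <td>ЕГЭ</td>
          <td>268</td><td>90</td><td>91</td><td>87</td>
          <td></td>
          <td>Копия</td>
          <td>Нет</td>
        </tr>"""

parser = CellCollector()
parser.feed(row_html)
parser.close()

cells = [c.strip() for c in parser.cells]
numbers = [int(c) for c in cells if c.isdigit()]  # purely numeric cells only
copy_field = cells[8]  # assumption: column order is fixed as in the question

print(cells)
print(numbers)
print(copy_field)
```

Because the parser only looks at what is inside each td, the line breaks that tripped up the regular expression never come into play.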

Dimonchik, 2019-07-03
@dimonchik2013

Take a look at pytablereader.
