HTML
Valentin, 2019-08-20 17:16:15

How do I parse HTML whose structure, tags, and class names are mutated on every request?

For example:

<div class="DFsfE5qr">
  <div class="etgF_2">UAH 300</div>
  <div class="etgFsdf">USB фонарик</div>
</div>

Or it may look like this:
<div class="DFghrtqr">
  <div></div>
  <div class="eerg_2">UAH 300</div>
  <div class="etergf">USB фонарик</div>
</div>

Or like this:
<span class="grr">
  <span class="grs56-rg">
    <div class="grs56-rg">
      <div class="eegrdfg"><span class="dsf">UAH</span>300</div>
      <div class="ekdf">USB фонарик</div>
    </div>
  </span>
</span>

2 answers
xmoonlight, 2019-08-20
@xmoonlight

1. Whatever stays constant between requests is your identifier (ID).
2. From the structure around that ID, build an NS.
3. Use the NS to extract the fields.
4. GOTO 3
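
A rough illustration of step 1 only, with no model involved: in the three samples from the question, the tags and classes change but the text itself (a currency code, an amount, then the product name) does not, so that text can serve as the anchor. This is a sketch under assumptions, not the answerer's method. `TextCollector`, `ITEM_RE`, and `extract_item` are made-up names, and the pattern assumes one product per fragment with a three-letter currency code followed by a plain number.

```python
import re
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collects non-empty text nodes in document order, ignoring all tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)


# Assumed stable pattern: currency code, amount, then the rest is the name.
ITEM_RE = re.compile(
    r"(?P<currency>[A-Z]{3})\s*(?P<amount>\d+(?:[.,]\d+)?)\s+(?P<name>.+)"
)


def extract_item(html):
    """Return currency, amount and name from one fragment, or None."""
    collector = TextCollector()
    collector.feed(html)
    collector.close()
    # Flatten the markup so tag and class changes no longer matter.
    flat = " ".join(collector.chunks)
    match = ITEM_RE.search(flat)
    return match.groupdict() if match else None
```

Because the tags are discarded before matching, the extra empty `<div>` and the `<span>` around `UAH` in the later samples make no difference. A page with several products per fragment would need a stricter boundary between items.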

Jan, 2019-08-20
@on1k

Match on the HTML structure rather than the class names, as long as the document structure itself stays the same.
XPath handles this well.
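
A minimal sketch of that idea, using the limited XPath subset in Python's standard `xml.etree.ElementTree`. It relies on the one structural fact that holds in all three samples from the question: the price and the name are the last two `<div>` children of the same parent `<div>`. The helper names `text_of` and `extract_by_structure` are made up for illustration, and the code assumes the fragment is well-formed XML; real-world HTML would likely need an HTML-aware parser with full XPath support.

```python
import xml.etree.ElementTree as ET


def text_of(element):
    """Join all text inside an element, including text of nested tags."""
    parts = (part.strip() for part in element.itertext())
    return " ".join(part for part in parts if part)


def extract_by_structure(fragment):
    """Return price and name located by position only, or None."""
    # Wrap the fragment so that './/' can also match its outermost element.
    root = ET.fromstring(f"<root>{fragment}</root>")
    # No class names: a <div> that has <div> children, then its last two.
    name = root.find(".//div[div]/div[last()]")
    price = root.find(".//div[div]/div[last()-1]")
    if name is None or price is None:
        return None
    return {"price": text_of(price), "name": text_of(name)}
```

The same two paths cover the sample with the extra empty `<div>` and the one wrapped in `<span>` elements, because they count from the end and do not depend on the nesting depth. If the site starts reordering the price and name, position alone stops being enough.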
