How to parse HTML whose markup (structure, tags, classes, etc.) is mutated on every request?

Valentin, 2019-08-20 17:16:15

HTML

For example:

<div class="DFsfE5qr">
  <div class="etgF_2">UAH 300</div>
  <div class="etgFsdf">USB фонарик</div>
</div>

Or like this:

<div class="DFghrtqr">
  <div></div>
  <div class="eerg_2">UAH 300</div>
  <div class="etergf">USB фонарик</div>
</div>

Or like this:

<span class="grr">
  <span class="grs56-rg">
    <div class="grs56-rg">
      <div class="eegrdfg"><span class="dsf">UAH</span>300</div>
      <div class="ekdf">USB фонарик</div>
    </div>
  </span>
</span>

2 answers

xmoonlight, 2019-08-20
@xmoonlight

1. Whatever stays constant is your identifier (ID).
2. Based on the structure of those IDs, build a neural network (NN).
3. Use the NN to extract the fields.
4. GOTO 3
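
The first step above can be sketched without any machine learning. This is only a rough illustration, and it swaps in a plain regular expression for whatever model the answer has in mind: it assumes the currency code "UAH" followed by digits is the part that never changes, and uses it as the anchor. The names `PRICE_RE`, `text_of` and `find_offer` are made up for this example, and the fragments are fed to the standard-library XML parser only because the samples in the question happen to be well-formed.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: the visible price text ("UAH" + digits) is the stable part of the page.
PRICE_RE = re.compile(r"(UAH)\s*(\d+)")

# Two of the variants from the question, written on one line each.
VARIANTS = [
    '<div class="DFghrtqr"><div></div>'
    '<div class="eerg_2">UAH 300</div><div class="etergf">USB фонарик</div></div>',
    '<span class="grr"><span class="grs56-rg"><div class="grs56-rg">'
    '<div class="eegrdfg"><span class="dsf">UAH</span>300</div>'
    '<div class="ekdf">USB фонарик</div></div></span></span>',
]


def text_of(element):
    """Return all text inside an element, including text of nested tags."""
    return "".join(element.itertext()).strip()


def find_offer(fragment):
    """Locate the price by its content, then take the name from a sibling."""
    root = ET.fromstring(fragment)
    # ElementTree has no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    for element in root.iter():
        match = PRICE_RE.fullmatch(text_of(element))
        if match is None or element not in parent_of:
            continue
        # The first non-empty sibling of the price element is taken as the name.
        for sibling in parent_of[element]:
            sibling_text = text_of(sibling)
            if sibling is not element and sibling_text:
                return match.group(1), match.group(2), sibling_text
    return None


for fragment in VARIANTS:
    print(find_offer(fragment))
```

Because the lookup keys on the text rather than on tags or classes, extra wrappers and renamed classes do not affect it; it will break if the price format itself changes.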

Jan, 2019-08-20
@on1k

Match on the HTML structure rather than on the class names (unless, of course, the document structure itself changes).
XPath handles this well.
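
A minimal sketch of what that could look like, using only positional paths and no class names. It relies on the limited XPath subset in Python's standard `xml.etree.ElementTree`, which works here only because the sample fragments are well-formed; for real pages an HTML-aware XPath library such as lxml would be the more likely choice. The path constants and the `extract_offer` helper are hypothetical names, and the paths assume the price and the name are always the last two `<div>` siblings in their container.

```python
import xml.etree.ElementTree as ET

# Two of the variants from the question; the class names are never referenced.
VARIANTS = [
    '<div class="DFghrtqr"><div></div>'
    '<div class="eerg_2">UAH 300</div><div class="etergf">USB фонарик</div></div>',
    '<span class="grr"><span class="grs56-rg"><div class="grs56-rg">'
    '<div class="eegrdfg"><span class="dsf">UAH</span>300</div>'
    '<div class="ekdf">USB фонарик</div></div></span></span>',
]

# A <div> that is second-to-last among its <div> siblings: the price.
PRICE_PATH = ".//div[last()-1]"
# From the price, step up to the parent and take its last <div>: the name.
NAME_PATH = ".//div[last()-1]/../div[last()]"


def text_of(element):
    """Return all text inside an element, including text of nested tags."""
    return "".join(element.itertext()).strip()


def extract_offer(fragment):
    """Return (price_text, name_text) based purely on element positions."""
    # Wrap the fragment in a synthetic root so every element, including the
    # fragment's own top-level tag, has a parent inside the parsed tree.
    root = ET.fromstring(f"<root>{fragment}</root>")
    price = root.find(PRICE_PATH)
    name = root.find(NAME_PATH)
    if price is None or name is None:
        return None
    return text_of(price), text_of(name)


for fragment in VARIANTS:
    print(extract_offer(fragment))
```

The empty leading `<div>` and the extra `<span>` wrappers do not matter because the paths count from the end of the sibling list. In the nested variant the price comes back as "UAH300", since the currency and the amount sit in separate nodes with no whitespace between them.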