PHP
IvanMiroshin, 2017-04-10 08:34:57

How to find a nested DOM element by attribute value in PHP?

There is a view template:

<section name="myname1">
     ...
     <div name="myname2">
          ... 
          <p name="myname3">
               ...
          </p>
          ...
     </div>
     ...
</section>
<div name="myname4">
     ...
     <div name="myname5">
          ... 
     </div>
     ...
</div>

The goal is to find every top-level DOM element that has a "name" attribute, together with all the elements nested inside it. The attribute value may contain Cyrillic text, and so may the nested structures.
I cannot use libraries (the customer requires that there be no dependencies).
My first attempt was:
/<\s*([a-z0-9]*)\b[^>]*\bname\s*=\s*\"([^\"]*)[^ >]*>(?>(?:[^<]|<(?!\s*\/?\1\s*\b))|(<\s*\1[^>]*>(?>(?:[^<]|<(?!\s*\/?\s*\1\s*\b))|(?3))+?<\s*\/\s*\1\s*>))*<\/\1>/is
This works as long as the nesting of the " ... " tag does not grow to more than 700 lines. After that, the regular expression simply finds nothing. But there is a caveat, for example,
Other things I tried:
I tried an implementation using PHPDocument, but ran into encoding problems (I don't know what encoding the finished script will be used with).
I tried to find ".*" first and then, using preg_match_all with the PREG_OFFSET_CAPTURE flag, to find the opening and closing tags of the same name and their positions in the string, and from those work out which closing tag ends the desired element. But here too I stumbled over the notorious Cyrillic.
I tried XPath, but I can't get it to correctly digest markup that is not fully valid. It complains especially loudly about inline SVG. In the end it throws a fatal error:
Uncaught exception 'Exception' with message 'String could not be parsed as XML' in ...:748 Stack trace: #0 ... (748): SimpleXMLElement->__construct('
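
For reference, the tag-counting idea from the list above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code that was actually tried: the helper name extractElement is hypothetical, void and self-closing tags are ignored, and the markup is assumed to be balanced for the tag in question. One detail that often bites with Cyrillic: PREG_OFFSET_CAPTURE reports offsets in bytes, even with the /u modifier, so the slice has to be taken with substr() and strlen() rather than their mb_* counterparts.

```php
<?php
// Hypothetical helper: $start is the byte offset of an opening <$tag ...>.
// Returns the element's outer HTML, or null if the tags never balance.
function extractElement(string $html, string $tag, int $start): ?string
{
    // Match every opening or closing tag with the same name, from $start on.
    $pattern = '~<(/?)' . preg_quote($tag, '~') . '\b[^>]*>~i';
    preg_match_all($pattern, $html, $matches, PREG_SET_ORDER | PREG_OFFSET_CAPTURE, $start);

    $depth = 0;
    foreach ($matches as $match) {
        [$text, $pos] = $match[0];                  // matched tag and its byte offset
        $depth += ($match[1][0] === '/') ? -1 : 1;  // closing tag lowers the depth
        if ($depth === 0) {
            // Offsets are bytes, so use substr()/strlen(), not mb_substr().
            return substr($html, $start, $pos + strlen($text) - $start);
        }
    }
    return null;
}

$html = '<p>привет</p><div name="блок">а<div>б</div>в</div><div>x</div>';

// Find the first tag carrying a name attribute, with its byte offset.
preg_match('~<([a-z0-9]+)\b[^>]*\bname\s*=\s*"([^"]*)"[^>]*>~iu', $html, $m, PREG_OFFSET_CAPTURE);
$element = extractElement($html, $m[1][0], $m[0][1]);
```

Because this walks a flat list of tag matches instead of recursing inside the regular expression, it should not depend on how deep the nesting is.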

3 answer(s)
IvanMiroshin, 2017-04-11
@IvanMiroshin

Problem solved:
The tag name, table, can be replaced with ([a-z0-9]+), and then all tags will be searched. The main thing is then to substitute a reference to that group at the appropriate positions in the regular expression.
The attribute name and value ("name", "table01") can be substituted dynamically (in my case they are set by PHP variables).
All the problems I described with parsing deep nesting have been resolved.
I hope this is useful to someone :)

Vitaly, 2017-04-10
@rim89

If you can do the search on the client side (with jQuery), you can use this selector:
If you need to do it in PHP, there is a parser, PHP Simple HTML DOM Parser, which can be used by analogy with the selector above.

Andrey Nikolaev, 2017-04-10
@gromdron

What about XPath? Have you looked in that direction?
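
A minimal sketch of what that could look like with only the bundled DOM extension, so no third-party dependency is involved. It assumes the markup is UTF-8 (the question says the real encoding is unknown, so treat that as a placeholder), and the helper name findTopLevelNamed is hypothetical. Unlike SimpleXMLElement, DOMDocument::loadHTML() uses a lenient HTML parser, and libxml_use_internal_errors() keeps its warnings about imperfect markup or inline SVG from being printed.

```php
<?php
// Hypothetical helper (not from the thread): maps the "name" value of each
// outermost named element to that element's outer HTML.
function findTopLevelNamed(string $html): array
{
    // Buffer libxml warnings about imperfect markup instead of printing them.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    // Assumption: the input is UTF-8. The meta tag is an encoding hint for
    // the parser; without it, loadHTML() may treat the bytes as ISO-8859-1.
    $doc->loadHTML('<meta http-equiv="Content-Type" content="text/html; charset=utf-8">' . $html);

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Elements with a name attribute and no ancestor that also has one.
    $xpath = new DOMXPath($doc);
    $found = [];
    foreach ($xpath->query('//*[@name][not(ancestor::*[@name])]') as $node) {
        $found[$node->getAttribute('name')] = $doc->saveHTML($node);
    }
    return $found;
}

$html = '<section name="блок1"><div name="inner"><p name="p">текст</p></div></section>'
      . '<div name="блок2"><svg width="10" height="10"><circle r="4"/></svg></div>';

foreach (findTopLevelNamed($html) as $name => $outerHtml) {
    echo $name, ': ', strlen($outerHtml), " bytes\n";
}
```

The XPath predicate not(ancestor::*[@name]) is what restricts the result to top-level named elements; dropping it would return the nested ones as separate entries too.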
