C++ / C#
FaNaT, 2020-09-03 20:08:33

How do I get the text outside a tag using XPath?

Hello. I have this HTML snippet:

<article class="eText">
<p class="">
<b class="">Год:</b> 2019-2020 
<br class="">
<b class="">Жанр:</b> Приключения, фэнтези, мультсериал 
<br class="">
<b class="">Перевод / Озвучивание:</b> Многоголосый дубляж от Wakanim 
<br class="">
<b class="">Время:</b> 22 х ~ 00:24:00 
<br class="">
<b class="">Произведено:</b> Япония, CloverWorks 
<br class="">
<b class="">Режиссер:</b> Тосифуми Акай 
<br class="">
<b class="">Актеры:</b> Нобунага Симадзаки, Риэ Такахаси, Аяко Кавасуми, Кэнъити Судзумура, Маая Сакамото, Томокадзу Сэки, Ю Кобаяси, Такахиро Сакурай, Ю Асакава, Кана Уэда 
</p>
</article>


The query I came up with only reaches the desired <b> node; I don't know how to get the text that sits outside the tag:
//article[@class='eText']/p/b[contains(.,'Жанр:')]

I need the values that come after the closing </b> tags, i.e. "2019-2020", "Многоголосый дубляж от Wakanim", "Тосифуми Акай" and all the rest.

Of course, I could use Substring to extract the part after the colon and the problem would be solved, but I wondered whether it is possible to write a universal XPath query in which only the label passed to contains() changes in order to get each of these text values.


1 answer(s)
Roman Fov, 2020-09-04
@fanat_96

"Is it possible to write a universal XPath query in which the contents of contains change to get these text values?"

Short answer
//article[@class='eText']/p/b[. = 'Жанр:']/following-sibling::text()[1]

xml:
<article class="eText">
  <p class="">
    <b class="">Жанр:</b>вфывафыва
    <b class="">Время:</b> 22 х ~ 00:24:00 
  </p>
</article>

XPath result:
Text='вфывафыва'
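
For anyone who wants to try the "change only the label" idea outside an XPath engine, here is a minimal Python sketch using only the standard library. It is an illustration under assumptions, not the query above: ElementTree's XPath subset has no following-sibling axis, so the sketch reads the `.tail` attribute instead, which holds the text between `</b>` and the next tag. The helper name `value_after_label` and the shortened `SNIPPET` (with `<br/>` self-closed so that an XML parser accepts it) are made up for this example.

```python
import xml.etree.ElementTree as ET

# Shortened, well-formed copy of the snippet from the question
# (assumption: <br> is self-closed so the XML parser accepts it).
SNIPPET = """
<article class="eText">
  <p class="">
    <b class="">Год:</b> 2019-2020
    <br class=""/>
    <b class="">Жанр:</b> Приключения, фэнтези, мультсериал
    <br class=""/>
    <b class="">Режиссер:</b> Тосифуми Акай
  </p>
</article>
"""


def value_after_label(root, label):
    """Return the text that follows <b>label</b>, or None if no such <b> exists.

    Hypothetical helper: `root` is the parsed <article> element.
    """
    for b in root.iterfind("./p/b"):
        if (b.text or "").strip() == label:
            # .tail is the text between </b> and the next tag.
            return (b.tail or "").strip()
    return None


root = ET.fromstring(SNIPPET)
for label in ("Год:", "Жанр:", "Режиссер:"):
    print(label, value_after_label(root, label))
```

Only the `label` argument changes between lookups, which is the same role the string inside the predicate plays in the XPath query.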
--------------------------------------------------
Correct answer
//article[@class='eText']/p/b[text() = 'Жанр:'][generate-id(following-sibling::text()[1]/preceding-sibling::node()[1]) = generate-id(.)]/normalize-space(following-sibling::text()[1])

xml:
<article class="eText">
  <p class="">
    <b class="">Жанр:</b><b class="">Жанр:</b>Многоголосый дубляж от Wakanim 
    <b class="">Время:</b> 22 х ~ 00:24:00 
  </p>
</article>

XPath result:
String='Многоголосый дубляж от Wakanim'
--------------------------------------------------
(That is, it takes into account a possibly empty value after the <b> tag. I'm not sure how clearly I've illustrated that.)
If anything is unclear, just ask.
P.S. A question for the experts: is it possible to simplify the second option without losing functionality?
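
On the P.S.: one candidate simplification, offered as an untested assumption rather than a confirmed equivalent, is to ask for the node immediately after the <b> and keep it only if it is a text node: //article[@class='eText']/p/b[. = 'Жанр:']/following-sibling::node()[1][self::text()]. Unlike the second option it does not call normalize-space(), so the result is a text node that still needs trimming. The Python sketch below (standard library only) shows the difference between "first text node somewhere after the <b>" and "text immediately after the <b>". The helper names and the `XML` sample are hypothetical, and since ElementTree has no following-sibling axis the two behaviours are emulated via `.tail`.

```python
import xml.etree.ElementTree as ET

# Hypothetical edge case: the "Жанр:" label has no value of its own.
XML = "<p><b>Жанр:</b><b>Время:</b> 22 х ~ 00:24:00 </p>"


def first_following_text(parent, label):
    """Roughly mimic b[. = label]/following-sibling::text()[1]."""
    children = list(parent)
    for i, b in enumerate(children):
        if b.tag == "b" and b.text == label:
            # Text siblings after <b> are the tails of <b> and of every later sibling.
            for node in children[i:]:
                if node.tail:
                    return node.tail.strip()
    return None


def immediate_text(parent, label):
    """Roughly mimic b[. = label]/following-sibling::node()[1][self::text()].

    Returns "" when the <b> is followed directly by another element.
    """
    for b in parent.iterfind("b"):
        if b.text == label:
            return (b.tail or "").strip()
    return None


p = ET.fromstring(XML)
print(repr(first_following_text(p, "Жанр:")))  # picks up the value that belongs to "Время:"
print(repr(immediate_text(p, "Жанр:")))        # empty: nothing directly follows the label
print(repr(immediate_text(p, "Время:")))
```

The first call illustrates why the short answer can return a neighbouring field's value when a label is empty, which is the case the second option guards against.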
