PHP
Andrej Kopp, 2022-02-08 19:45:53

How to correctly parse XML by specifying the parent of subcategories?

Hello. I have run into a problem. I need to parse literature genre categories together with their subcategories, and I would like to do it with a simple function, without reinventing the wheel. The XML with the genres is generated at this link, and it needs to be parsed so that each child section gets the id attribute of its parent.

The XML has this structure:

<genres>
        <genre id="5003" title="бизнес-книги" type="root">
                <genre id="5049" title="банковское дело" token="bankovskoe_delo" type="genre"/>
                <genre id="210646" title="бизнес-справочники" token="business-spravochniki" type="genre"/>
                <genre id="5051" title="бухучет / налогообложение / аудит" token="buhuchet_nalogooblozhenie_audit" type="genre"/>
                <genre id="6784" title="государственное и муниципальное управление" token="gosudarstvennoe_i_munitsipalnoe_upravlenie" type="genre"/>
                <genre id="5060" title="делопроизводство" token="deloproizvodstvo" type="genre"/>
                <genre id="5061" title="зарубежная деловая литература" token="zarubezhnaya_delovaya_literatura" type="genre"/>
                <genre id="5062" title="интернет-бизнес" token="internet" type="genre"/>
                <genre id="5047" title="кадровый менеджмент" token="kadrovyj_menedzhment" type="container">
                        <genre id="5334" title="аттестация персонала" token="attestaciya_personala" type="genre"/>
                        <genre id="5330" title="гендерные различия" token="gendernyye_razlichiya" type="genre"/>
                        <genre id="5332" title="конфликты" token="konflikty" type="genre"/>
                        <genre id="5336" title="коучинг" token="kouching" type="genre"/>
                        <genre id="5333" title="мотивация" token="motivaciya" type="genre"/>
                        <genre id="5335" title="поиск и подбор персонала" token="poisk_presonala_hr" type="genre"/>
                        <genre id="5331" title="тимбилдинг" token="timbilding" type="genre"/>
                        <genre id="6583" title="управление персоналом" token="upravlenie_personalom" type="genre"/>
                </genre>
...
</genres>


I wrote this code:

$url = 'https://partnersdnld.litres.ru/genres_list_2/';

$dom = new DOMDocument('1.0', 'utf-8');
$dom->load($url);
$xpath = new DOMXPath($dom);

foreach ($xpath->evaluate('//genre') as $node) {
    var_dump(
        [
            'parent_id' => $xpath->evaluate('string(ancestor::genre[1]/id)', $node),
            'id' => $xpath->evaluate('string(id)', $node),
            'title' => $xpath->evaluate('string(title)', $node),
        ]
    );
}


But I got confused between elements and attributes. Can anyone tell me why it returns empty results, and how to correctly get parent_id and the other data?


1 answer(s)
Andrej Kopp, 2022-02-09
@sequelone

Attributes are on a different XPath axis. id is short for child::id and selects element nodes on the child axis, so it looks for an <id> child element that does not exist here. For the attribute axis, you need to use attribute::id or its abbreviated form @id.

foreach ($xpath->evaluate('//genre') as $node) {
    var_dump(
        [
            'parent_id' => $xpath->evaluate('string(ancestor::genre[1]/@id)', $node),
            'id' => $node->getAttribute('id'),
            'title' => $node->getAttribute('title'),
        ]
    );
}
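
To see the difference between the two axes in isolation, here is a minimal sketch that runs against a small inline document instead of the remote URL. The XML is a shortened copy of the sample from the question, and the variable names are only illustrative.

```php
<?php
// Shortened inline copy of the structure from the question (assumption:
// the real feed is nested the same way).
$xml = <<<'XML'
<genres>
  <genre id="5003" title="бизнес-книги" type="root">
    <genre id="5049" title="банковское дело" token="bankovskoe_delo" type="genre"/>
    <genre id="5047" title="кадровый менеджмент" token="kadrovyj_menedzhment" type="container">
      <genre id="5334" title="аттестация персонала" token="attestaciya_personala" type="genre"/>
    </genre>
  </genre>
</genres>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);
$xpath = new DOMXPath($dom);

// Use the innermost <genre> as the context node.
$node = $xpath->query('//genre[@id="5334"]')->item(0);

// Child axis: looks for an <id> child element, which does not exist.
$viaChildAxis = $xpath->evaluate('string(id)', $node);

// Attribute axis: reads the id="..." attribute of the context node.
$viaAttributeAxis = $xpath->evaluate('string(@id)', $node);

// Nearest <genre> ancestor, then its id attribute.
$parentId = $xpath->evaluate('string(ancestor::genre[1]/@id)', $node);

var_dump($viaChildAxis, $viaAttributeAxis, $parentId);
```

The first value is an empty string, which is the "empty results" symptom from the question; the other two are "5334" and "5047".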

Everything worked as it should.
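
If the goal is a flat list of rows rather than var_dump output, the same loop can be wrapped in a small function. This is only a sketch: flattenGenres is a hypothetical helper name, the inline XML uses placeholder titles, and storing null as parent_id for root genres is an assumption about what the consumer of the data wants.

```php
<?php
/**
 * Hypothetical helper (not part of the original answer): flattens nested
 * <genre> elements into rows with id, parent_id and title.
 */
function flattenGenres(DOMDocument $dom): array
{
    $xpath = new DOMXPath($dom);
    $rows = [];

    foreach ($xpath->query('//genre') as $node) {
        // An empty string means there is no <genre> ancestor (a root genre).
        $parentId = $xpath->evaluate('string(ancestor::genre[1]/@id)', $node);

        $rows[] = [
            'id' => $node->getAttribute('id'),
            'parent_id' => $parentId === '' ? null : $parentId,
            'title' => $node->getAttribute('title'),
        ];
    }

    return $rows;
}

// Tiny inline document with placeholder titles, nested like the real feed.
$dom = new DOMDocument();
$dom->loadXML(
    '<genres>'
    . '<genre id="5003" title="business" type="root">'
    . '<genre id="5047" title="hr" type="container">'
    . '<genre id="5334" title="appraisal" type="genre"/>'
    . '</genre>'
    . '</genre>'
    . '</genres>'
);

$rows = flattenGenres($dom);
print_r($rows);
```

Rows come back in document order, so a parent always appears before its children, which is convenient when inserting into a table with a parent_id reference.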
