How to create a DataFrame from XML?
I need to convert an XML file into a pandas DataFrame and can't figure out how to do it.
The XML file has the following structure:
<?xml version="1.0" encoding="utf-8"?>
<licenses_list>
  <licenses>
    <name>Министерство здравоохранения Астраханской области</name>
    <activity_type>Медицинская деятельность</activity_type>
    <abbreviated_name_licensee>ООО "КЛИНИКА "ЛИНЛАЙФ"</abbreviated_name_licensee>
    <works>
      <work>100. При оказании первичной, в том числе доврачебной, врачебной и специализированной, медико-санитарной помощи организуются и выполняются следующие работы (услуги):</work>
      <work>100.1. при оказании первичной доврачебной медико-санитарной помощи в амбулаторных условиях по:</work>
      <work>100.1.25. сестринскому делу в косметологии</work>
      <work>100.4. при оказании первичной специализированной медико-санитарной помощи в амбулаторных условиях по:</work>
      <work>100.4.7. анестезиологии и реаниматологии</work>
    </works>
  </licenses>
</licenses_list>
import xml.etree.ElementTree as ET
import pandas as pd

tree = ET.parse('Рабочий.xml')
root = tree.getroot()

df_index = ['name', 'activity_type', 'abbreviated_name_licensee', 'works']
df = pd.DataFrame(columns=df_index)

for elem in root:
    for b in range(0, len(elem[3])):
        elements = [elem[0].text, elem[1].text, elem[2].text, elem[3][b].text]
        df = df.append(pd.Series(elements, index=df_index), ignore_index=True)
You don't need to iterate over the 'works' elements in the same loop that adds rows to the DataFrame. Collect them first, combine them into a single comma-separated string, and only then create the pandas row. Or simply replace the inner loop with join():
for elem in root:
    elements = [elem[0].text, elem[1].text, elem[2].text, ','.join(val.text for val in elem[3])]
    df = df.append(pd.Series(elements, index=df_index), ignore_index=True)
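Note that DataFrame.append was removed in pandas 2.0, so the loop above only runs on older versions. A minimal sketch of the same join() idea for current pandas might look like the following. It collects the rows in a plain list and builds the DataFrame once at the end. The helper name licenses_to_frame, the lookup by tag name instead of by position, and the shortened sample XML with placeholder values are assumptions made for illustration, not part of the original question.

```python
import xml.etree.ElementTree as ET

import pandas as pd

# Shortened stand-in for the question's file; the values are placeholders.
XML_TEXT = """
<licenses_list>
  <licenses>
    <name>Ministry A</name>
    <activity_type>Medical activity</activity_type>
    <abbreviated_name_licensee>Clinic LLC</abbreviated_name_licensee>
    <works>
      <work>100.1.25. nursing in cosmetology</work>
      <work>100.4.7. anesthesiology and resuscitation</work>
    </works>
  </licenses>
</licenses_list>
"""

COLUMNS = ['name', 'activity_type', 'abbreviated_name_licensee', 'works']


def licenses_to_frame(root):
    """Build one row per <licenses> element, joining its <work> texts."""
    rows = []
    for lic in root.findall('licenses'):
        # Look up the scalar fields by tag name rather than by position
        row = {col: lic.findtext(col) for col in COLUMNS[:3]}
        # findall returns an empty list when there are no <work> children
        row['works'] = ','.join(w.text or '' for w in lic.findall('works/work'))
        rows.append(row)
    # Construct the DataFrame once instead of appending row by row
    return pd.DataFrame(rows, columns=COLUMNS)


root = ET.fromstring(XML_TEXT)
df = licenses_to_frame(root)
```

To use it with the real file, pass ET.parse('Рабочий.xml').getroot() to the function instead of parsing the sample string. If one row per work item is preferred, as in the question's original loop, one option is to store the works as a list instead of a joined string and then call df.explode('works').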