Python
KantorMaz, 2020-05-21 17:23:41

How to convert XML files to CSV files?

I am trying to convert all of my XML files to CSV files. To do this I run python xml_to_csv.py.
The script xml_to_csv.py looks like this:

import os
import glob
import pandas as pd
import xml.etree.ElementTree as ET


def xml_to_csv(path):
    # Collect one row per <object> element from every XML file in the folder.
    xml_list = []
    for xml_file in glob.glob(path + '/*.xml'):
        tree = ET.parse(xml_file)
        root = tree.getroot()
        for member in root.findall('object'):
            # Children are read by position: size[0]/size[1] are taken as
            # width/height, member[0] as the class name, and member[4] as the
            # bounding box (xmin, ymin, xmax, ymax).
            value = (root.find('filename').text,
                     int(root.find('size')[0].text),
                     int(root.find('size')[1].text),
                     member[0].text,
                     int(member[4][0].text),
                     int(member[4][1].text),
                     int(member[4][2].text),
                     int(member[4][3].text)
                     )
            xml_list.append(value)
    column_name = ['filename', 'width', 'height', 'class', 'xmin', 'ymin', 'xmax', 'ymax']
    xml_df = pd.DataFrame(xml_list, columns=column_name)
    return xml_df


def main():
    for folder in ['train','test']:
        image_path = os.path.join(os.getcwd(), ('images/' + folder))
        xml_df = xml_to_csv(image_path)
        xml_df.to_csv(('images/' + folder + '_labels.csv'), index=None)
        print('Successfully converted xml to csv.')


main()

As a result, the train_labels and test_labels files are created, but the table is filled in strangely: all the data ends up in a single column, whereas I would like each variable in its own cell. How can this be done? Could this be related to the version of Excel?
What happens now:
1) test_labels: 5ec68d5920f74598938921.jpeg
2) train_labels: 5ec68daf45396737519213.jpeg

2 answer(s)
BasiC2k, 2020-05-21
@KantorMaz

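Everything worked correctly: you have created a valid comma-delimited CSV file. Excel shows it in a single column because, with your settings, it does not treat the comma as the column separator. If you want to open the CSV in Excel and have the values split into columns, use a semicolon as the delimiter.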
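To check that the file itself is fine and only the viewer is at fault, the output can be read back with pandas. The following is a minimal sketch, not the asker's script: the sample rows are made up, and an in-memory buffer stands in for the real train_labels.csv file.

```python
import io

import pandas as pd

# Hypothetical sample rows in the same layout the script produces.
rows = [
    ("img1.jpg", 640, 480, "cat", 10, 20, 110, 220),
    ("img2.jpg", 800, 600, "dog", 5, 15, 305, 415),
]
columns = ['filename', 'width', 'height', 'class', 'xmin', 'ymin', 'xmax', 'ymax']
df = pd.DataFrame(rows, columns=columns)

# Write with the default comma separator into an in-memory buffer
# (a stand-in for the real CSV file on disk).
buffer = io.StringIO()
df.to_csv(buffer, index=False)
csv_text = buffer.getvalue()

# Read it back: pandas splits on commas and recovers all eight columns.
round_trip = pd.read_csv(io.StringIO(csv_text))
print(csv_text.splitlines()[0])
print(round_trip.shape)
```

If the shape reports eight columns, the data is separated correctly and the single-column view comes from how the spreadsheet program interprets the delimiter.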

Vladimir, 2020-05-21
@vintello

Replace this line:

xml_df.to_csv(('images/' + folder + '_labels.csv'), index=None)

with a version that sets a semicolon as the separator:

xml_df.to_csv(('images/' + folder + '_labels.csv'), index=None, sep=";")
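The effect of the sep argument can be seen in a small self-contained sketch. The helper name write_labels_csv and the sample row are hypothetical and only serve as illustration; the buffer again stands in for the real output file.

```python
import io

import pandas as pd


def write_labels_csv(df, target, sep=";"):
    """Hypothetical helper: write df without the index, using the given delimiter."""
    df.to_csv(target, index=False, sep=sep)


columns = ['filename', 'width', 'height', 'class', 'xmin', 'ymin', 'xmax', 'ymax']
df = pd.DataFrame([("img1.jpg", 640, 480, "cat", 10, 20, 110, 220)], columns=columns)

# Write into an in-memory buffer so the sketch does not touch the disk.
buffer = io.StringIO()
write_labels_csv(df, buffer)
lines = buffer.getvalue().splitlines()

print(lines[0])  # header row, fields joined by ';'
print(lines[1])  # data row, fields joined by ';'
```

Keep in mind that any tool reading the semicolon-delimited file later has to be told about the separator, for example pd.read_csv(path, sep=";"). Alternatively, the comma-delimited file can be left as it is and loaded through Excel's text import, where the delimiter can be chosen by hand.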
