Parsing with lxml and saving data with pandas
Inspired by an article on Habr, I'm trying to write a parser. The code is below:
import lxml.html as html
from pandas import DataFrame

main_domain = 'http://market.yandex.ru'
# Fetch and parse the brands page
brand_list = html.parse('%s/brands-list.xml' % main_domain)
# All elements with class="body"
e = brand_list.getroot().find_class('body')
for i in e:
    # Take the last child of the block
    t = i.getchildren().pop()
    # iterlinks() yields (element, attribute, link, pos) tuples
    link_table = DataFrame({'EV': j[0].text, 'LINK': j[2]} for j in t.iterlinks())
link_table.to_csv('brands1.csv', sep=';', index=False, encoding="UTF-8")
Answer:
for i in e:
    t = i.getchildren().pop()
    # Encode the link text as UTF-8 before building the DataFrame
    link_table = DataFrame({'EV': j[0].text.encode('utf-8'), 'LINK': j[2]} for j in t.iterlinks())
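For reference, here is a minimal offline sketch of the same loop, assuming Python 3 with lxml and pandas installed. The names SAMPLE_HTML and links_to_frame are hypothetical, and the inline markup is only a stand-in for the real brands page, whose structure isn't shown in this thread. In Python 3 the link text is already a Unicode string, so this sketch leaves out the .encode('utf-8') call and leaves the encoding to to_csv.

```python
import lxml.html as html
from pandas import DataFrame

# Hypothetical stand-in for the brands page, so the sketch runs without network access.
SAMPLE_HTML = """
<html><body>
  <div class="body">
    <ul>
      <li><a href="/brands/1">Бренд А</a></li>
      <li><a href="/brands/2">Brand B</a></li>
    </ul>
  </div>
</body></html>
""".strip()


def links_to_frame(page_source, base_url):
    """Collect link text and absolute URLs from every class="body" block."""
    root = html.fromstring(page_source)
    # Turn relative hrefs such as /brands/1 into absolute URLs.
    root.make_links_absolute(base_url)
    rows = []
    for block in root.find_class('body'):
        # iterlinks() yields (element, attribute, link, pos) tuples.
        for element, attribute, link, pos in block.iterlinks():
            rows.append({'EV': element.text, 'LINK': link})
    # Accumulate rows from all blocks instead of overwriting the table per block.
    return DataFrame(rows, columns=['EV', 'LINK'])


frame = links_to_frame(SAMPLE_HTML, 'http://market.yandex.ru')
```

To write the result to disk, something like frame.to_csv('brands1.csv', sep=';', index=False, encoding='utf-8') should work; passing sep by keyword avoids relying on positional argument order.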