How do I get proper XML output for further processing?
When I load the XML for parsing, what comes back seems to be raw bytes, as far as I understand, and I can't do anything useful with it. At best, only the Latin characters are displayed; the Cyrillic text is missing.
Code below:
import bs4 as bs
import urllib.request
source = urllib.request.urlopen('https://www.fl.ru/rss/all.xml?category=2').read()
soup = bs.BeautifulSoup(source,'lxml')
table = soup.find('channel')
table_rows = table.find_all('item')
print(table_rows[1])
What it prints:
<item>
<title></title>
<link/>https://www.fl.ru/projects/4080170/fullstack---nodejs-razrabotchik.html
<description></description>
<guid>https://www.fl.ru/projects/4080170/fullstack---nodejs-razrabotchik.html</guid>
<category></category><category></category>
<pubdate>Sat, 06 Jul 2019 21:40:45 GMT</pubdate>
</item>
The same item in the source feed:
<item>
<title>
<![CDATA[Fullstack - Node.js разработчик (Бюджет: 130000 руб.)]]>
</title>
<link>https://www.fl.ru/projects/4080170/fullstack---nodejs-razrabotchik.html</link>
<description>
<![CDATA[В тематике криптовалют нужно закончить несколько самостоятельных модулей на отдельных поддоменах, которые собирают информацию через API с главного проекта, важно заложить...]]>
</description>
<guid>https://www.fl.ru/projects/4080170/fullstack---nodejs-razrabotchik.html</guid>
<category>
<![CDATA[Разработка сайтов / Веб-программирование]]>
</category>
<category>
<![CDATA[Программирование / Системное программирование]]>
</category>
<pubDate>Sat, 06 Jul 2019 21:40:45 GMT</pubDate>
</item>
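A likely explanation, offered as a sketch rather than a confirmed diagnosis: 'lxml' in BeautifulSoup selects the HTML parser, which would account for the empty tags (CDATA sections are dropped), the self-closed <link/>, and the lowercased <pubdate>. Parsing the document as XML should keep the CDATA text. The example below uses the standard library's xml.etree.ElementTree instead of BeautifulSoup so it runs without extra packages; with BeautifulSoup the analogous change would be passing 'xml' instead of 'lxml' as the parser. SAMPLE_FEED and parse_items are made-up names, and the utf-8 declaration in the sample is an assumption, since the real feed's header is not shown in the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the bytes returned by urlopen(...).read().
# Assumption: the feed declares its encoding in the XML declaration.
SAMPLE_FEED = '''<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0">
<channel>
<item>
<title><![CDATA[Fullstack - Node.js разработчик (Бюджет: 130000 руб.)]]></title>
<link>https://www.fl.ru/projects/4080170/fullstack---nodejs-razrabotchik.html</link>
<category><![CDATA[Разработка сайтов / Веб-программирование]]></category>
<category><![CDATA[Программирование / Системное программирование]]></category>
<pubDate>Sat, 06 Jul 2019 21:40:45 GMT</pubDate>
</item>
</channel>
</rss>'''.encode('utf-8')


def parse_items(feed_bytes):
    """Return one dict per <item>, with CDATA content read as plain text."""
    # Passing bytes lets the XML parser pick up the declared encoding itself.
    root = ET.fromstring(feed_bytes)
    items = []
    for item in root.iter('item'):
        items.append({
            'title': item.findtext('title', default='').strip(),
            'link': item.findtext('link', default='').strip(),
            'categories': [c.text.strip() for c in item.findall('category') if c.text],
            'pubDate': item.findtext('pubDate', default='').strip(),
        })
    return items


items = parse_items(SAMPLE_FEED)
print(items[0]['title'])
print(items[0]['categories'])
```

With the real feed, the bytes from urllib.request.urlopen(...).read() would be passed to parse_items in place of SAMPLE_FEED.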
Does not help...
Traceback (most recent call last):
  File "C:\feedscrape-master\25.py", line 6, in <module>
    soup = bs.BeautifulSoup(source.read().decode('cp1251'), 'lxml')
AttributeError: 'bytes' object has no attribute 'read'
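The traceback suggests that source already holds bytes, because .read() was called on the response when source was assigned, so calling .read() on it a second time fails. A minimal sketch of the difference follows. io.BytesIO stands in for the object returned by urlopen, and both the sample text and the cp1251 encoding are assumptions for illustration, not facts about the feed.

```python
import io

# Hypothetical stand-in for the response object from urllib.request.urlopen().
response = io.BytesIO('Разработка сайтов'.encode('cp1251'))

source = response.read()  # .read() has been called once: source is now bytes

try:
    source.read()         # second .read() reproduces the error in the traceback
except AttributeError as exc:
    error_message = str(exc)
    print(error_message)

# bytes objects are decoded directly, with no further .read() call
text = source.decode('cp1251')
print(text)
```

So either drop the extra .read() and call source.decode(...), or keep the response object in source and call .read() on it only once.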