What is the correct way to save data with Scrapy?
Good afternoon!
I need to scrape project data from a page.
All of the data is on a single page:
project name
description
team
The team usually has many members, and for each one I need the name, status in the project, and a link to the profile.
This is what I do now:
class ProjectItem(scrapy.Item):
    id = scrapy.Field()
    name = scrapy.Field()
    team = scrapy.Field()

for people in all_team:
    ...
    team.append({
        "id": id,
        "full_name": full_name,
        "current_position": current_position,
        "website": website,
    })
l.add_value('team', json.dumps(team))
1) Since Scrapy 1.0 you no longer have to define a scrapy.Item; a spider can yield a regular dict.
2) Yes, an Item field can hold a value of any type, so a list of dicts is fine.
3) You can define two Item types, but then every pipeline has to check which type it received. In my opinion it is easier to use a single item type and split it up in the pipeline however you like, writing the insert queries for the two NoSQL tables there.
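A rough sketch of point 3, in case it helps. It relies on the fact that a Scrapy item pipeline is a plain class with a process_item(item, spider) method. The class name SplitProjectPipeline, the sample values, and the two in-memory dicts standing in for the NoSQL tables are assumptions made for illustration; real code would call the database client there instead.

```python
class SplitProjectPipeline:
    """Splits one scraped project item into a project record and member records."""

    def __init__(self):
        # Hypothetical in-memory stand-ins for the two NoSQL tables.
        self.projects = {}
        self.members = {}

    def process_item(self, item, spider):
        # Everything except the nested team list goes to the projects table.
        project = {key: value for key, value in item.items() if key != "team"}
        self.projects[project["id"]] = project

        # One record per team member, linked back to the project by its id.
        for member in item.get("team", []):
            record = dict(member, project_id=project["id"])
            self.members[(project["id"], member["id"])] = record

        # Return the item unchanged so later pipelines still see it.
        return item


# The spider can yield a plain dict; "team" stays a list of dicts,
# so json.dumps is not needed. All values below are made up.
item = {
    "id": 1,
    "name": "Example project",
    "team": [
        {"id": 10, "full_name": "Alice Example",
         "current_position": "Founder", "website": "https://example.com/alice"},
        {"id": 11, "full_name": "Bob Example",
         "current_position": "Developer", "website": "https://example.com/bob"},
    ],
}

pipeline = SplitProjectPipeline()
pipeline.process_item(item, spider=None)
```

In a real project the pipeline would be enabled through ITEM_PIPELINES in settings.py, and Scrapy would call process_item itself for every item the spider yields.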