How do I point a Scrapy spider at a file that is in the same folder?
I use Selenium to save the page I need as an HTML file.
The file is written to the folder where the Scrapy project is located.
At the moment I have to specify the full path to the file I want to parse.
Can Scrapy be configured to parse HTML files located in the same directory?
Sample code:
import scrapy
from urllib.parse import urljoin


class Htmlparse(scrapy.Spider):
    name = "htmlparse"
    start_urls = [
        'file:///C:/scrapyproject/alpabetsch23-43_28-09-2019.html',
    ]

    def parse(self, response):
        for post_link in response.xpath('//td').extract():
            url = urljoin(response.url, post_link)
            print(url)
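One possible way to avoid hard-coding the full path is to build the file:// URL at runtime from the folder that contains the file. The sketch below is only an illustration: the helper name local_file_url and its base_dir parameter are hypothetical, not part of Scrapy, and it assumes the saved HTML file sits in the directory passed in (or in the current working directory if none is given).

```python
from pathlib import Path


def local_file_url(filename, base_dir=None):
    """Return a file:// URL for `filename` inside `base_dir`.

    Hypothetical helper, not a Scrapy API. If base_dir is None,
    the current working directory is used.
    """
    base = Path(base_dir) if base_dir is not None else Path.cwd()
    # as_uri() requires an absolute path, so resolve() first.
    return (base / filename).resolve().as_uri()


# Possible use inside the spider module (assumes the HTML file
# is saved next to the spider's .py file):
#
#     class Htmlparse(scrapy.Spider):
#         name = "htmlparse"
#         start_urls = [
#             local_file_url("alpabetsch23-43_28-09-2019.html",
#                            base_dir=Path(__file__).parent),
#         ]
```

Passing Path(__file__).parent ties the lookup to the location of the spider module rather than to the directory from which scrapy crawl happens to be launched, which is probably the more predictable choice.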