Parsing
Bur Ov, 2021-12-15 23:15:50

Why can some sites be scraped only without www., and others only with it?

My task is to parse text from websites. I load each site via curl, and some sites fail to load if the URL is specified like this:

https://test.com/

Others, on the contrary, load only in that form. What should I do? Collect all the sites that failed to load and then retry them like this:
https://www.test.com


1 answer(s)
SagePtr, 2021-12-15
@burov0798

www is a separate subdomain, distinct from the bare domain. Some site owners set it up as an alias and serve the same content in both cases. Some do not create it at all, so the site does not open with www. And some set up a redirect, which can go either way: sometimes from the subdomain to the domain, and sometimes the reverse.
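Since either form may be the only one that works, one option is to try the URL as given and fall back to its counterpart, rather than guessing in advance. Below is a minimal Python sketch of that idea, not a definitive implementation. The function names (`host_variants`, `fetch`, `fetch_either`) are made up for illustration, and the sketch assumes absolute URLs with no username or password in them. If you stay with command-line curl, its `-L` option makes it follow redirects, which covers the redirect case.

```python
import urllib.request
from urllib.parse import urlsplit, urlunsplit


def host_variants(url):
    """Return the URL as given, followed by its www/non-www counterpart."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if host.startswith("www."):
        other = host[4:]
    else:
        other = "www." + host
    # Preserve an explicit port, if there is one.
    netloc = other if parts.port is None else f"{other}:{parts.port}"
    return [url, urlunsplit(parts._replace(netloc=netloc))]


def fetch(url, timeout=10):
    """Download a page; urllib follows HTTP redirects by default."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def fetch_either(url, fetcher=fetch):
    """Try each host variant in turn and return (working_url, body)."""
    last_error = None
    for candidate in host_variants(url):
        try:
            return candidate, fetcher(candidate)
        except OSError as error:  # URLError and timeouts are OSError subclasses
            last_error = error
    raise last_error
```

Calling `fetch_either("https://test.com/")` would first request that URL and, only if it fails, request `https://www.test.com/`. The `fetcher` parameter is there so the download step can be replaced, for example with a wrapper around your existing curl call.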
