PHP
r1der, 2017-05-18 18:12:04

Python urllib2 vs PHP curl. How to deal with ETag?

Hello,
I have run into the following problem: what looks like the same code in two different languages, sending the same headers to the same server through the same proxy, gets different responses from the server. With Python, the server just returns some old page with an ETag, and it is the same page every time.

$xs=new Surfer("www.dell.com");
$xs->random_ua=0;
$xs->only_body=1;
$xs->setProxy("#######:65234");   // SOCKS proxy
$xs->setProxyType("socks5");
$xs->setProxyPass("######");

$xs->setReferer("");
$xs->ssl_always=1;

$buff=$xs->send("/support/Contents/category/eSupport-Order-support?c=us&l=en&~ck=mn",array("follow"=>1));
echo $buff=$xs->send("/support/orders/us/en/19/Order/Details?sbon=205523888",array("follow"=>1,"debug"=>1,"custom"=>array("Connection: close")));

# wrapper class around curl
class Surfer...


This code sends the following headers with the last request:
GET /support/orders/us/en/19/Order/Details?sbon=205523888 HTTP/1.1
User-Agent: Mozilla/5.0 (Windows NT 5.1; U; en)
Host: www.dell.com
Accept: */*
Accept-Encoding: deflate,identity
Referer: https://www.dell.com/support/Contents/category/eSupport-Order-support?c=us&l=en&~ck=mn
Cookie: dais-c=NchVtQmz0UBCDQ77NpbAAoa0OcMfUnzX8tx7bz4aTNEArLVCzLPaR4bDSFexnAAx;eSupId=SID=29568d7f-2cf5-432e-b720-94eb514e823a;lwp=c=us&l=en&cs=19&s=dhs;
Connection: close


Below is the Python code that does the same thing, yet the same request produces a different response, and always the same one:

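import cookielib
import random
import time
import urllib2

import socks  # SocksiPy / PySocks
from sockshandler import SocksiPyHandler  # assumption: the handler module shipped with PySocks

h=urllib2.HTTPSHandler(debuglevel=1)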

current_request="""User-Agent: Mozilla/5.0 (Windows NT 5.1; U; en)
Accept: */*
Accept-Encoding: *"""

current_req=current_request.split("\n")

while True:
    proxy=random.choice(socks_list)

    orderid=random.choice(orders)
    prx=SocksiPyHandler(socks.SOCKS5, proxy[1], 65234,True, "********")

    cj = cookielib.LWPCookieJar()
    opener = urllib2.build_opener(h,urllib2.HTTPCookieProcessor(cj),prx)

    req = urllib2.Request("https://www.dell.com/support/Contents/category/eSupport-Order-support?c=us&l=en&~ck=mn")

    for el in current_req:
        temp=el.split(':')
        req.add_header(temp[0], temp[1].strip(' '))

    response = opener.open(req)
    res = response.read()

    print "--------"
    print "--------"
    time.sleep(2)

    req = urllib2.Request("https://www.dell.com/support/orders/us/en/19/Order/Details?sbon=205523888" )

    for el in current_req:
        temp = el.split(':')
        req.add_header(temp[0], temp[1].strip(' '))
    req.add_header("Referer","https://www.dell.com//support/contents/us/en/19/category/eSupport-Order-support?c=us&l=en&~ck=mn")

    response = opener.open(req)
    res = response.read()


This code generates an absolutely identical request, but the response is always this:

reply: 'HTTP/1.1 200 OK\r\n'
header: Server: Apache
header: ETag: "a50bd009cc90b71337825d9fe7063dd9:1488828720"
header: Last-Modified: Mon, 06 Mar 2017 19:32:00 GMT
header: Accept-Ranges: bytes
header: Content-Length: 130762
header: Content-Type: text/html
header: Date: Thu, 18 May 2017 14:55:15 GMT
header: Connection: close


That is not how it should be: what follows is the content of some stub page. In short, the server should respond with the normal content, but the same code gets a different response.

P.S. The response is the same through every proxy. I checked both the outgoing headers and the headers as received on the server, and they are the same.
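
The two header listings in the question can be compared mechanically. The sketch below is written for Python 3 and uses only values copied from the post; the helper name `diff_headers` is hypothetical. The Cookie header is left out because the Python side's cookie values are not shown, and Host and Connection on the Python side are assumed to be added by urllib2 itself, since the script does not set them.

```python
# Headers of the last PHP request, copied from the listing in the question
# (Cookie omitted: the Python side's cookie values are not shown).
php_headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 5.1; U; en)",
    "Host": "www.dell.com",
    "Accept": "*/*",
    "Accept-Encoding": "deflate,identity",
    "Referer": "https://www.dell.com/support/Contents/category/eSupport-Order-support?c=us&l=en&~ck=mn",
    "Connection": "close",
}

# Headers the Python script sets for its last request. Host and Connection
# are not set in the script; they are assumed here to be added by urllib2.
python_headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 5.1; U; en)",
    "Host": "www.dell.com",
    "Accept": "*/*",
    "Accept-Encoding": "*",
    "Referer": "https://www.dell.com//support/contents/us/en/19/category/eSupport-Order-support?c=us&l=en&~ck=mn",
    "Connection": "close",
}


def diff_headers(a, b):
    """Return {name: (value_in_a, value_in_b)} for headers that differ.

    Header names are compared case-insensitively; a missing header is None.
    """
    a_norm = {k.lower(): v for k, v in a.items()}
    b_norm = {k.lower(): v for k, v in b.items()}
    return {
        name: (a_norm.get(name), b_norm.get(name))
        for name in sorted(set(a_norm) | set(b_norm))
        if a_norm.get(name) != b_norm.get(name)
    }


differences = diff_headers(php_headers, python_headers)
for name, (php_value, py_value) in differences.items():
    print("%s\n  PHP:    %s\n  Python: %s" % (name, php_value, py_value))
```

With the values from the post, this reports Accept-Encoding and Referer as differing, so the two requests as listed are not byte-for-byte identical. Whether either difference explains the stub page is not established here.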


3 answer(s)
Sergey, 2018-08-05
@gangstarcj

php.net/manual/en/function.file-get-contents.php

Dimonchik, 2017-05-18
@r1der

Use pycurl and don't worry.
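
A minimal sketch of what that could look like, assuming pycurl is installed. Since pycurl wraps the same libcurl that PHP's curl extension uses, the options map fairly directly onto the PHP request. The helper names `curl_options` and `fetch` are hypothetical, and the proxy address and credentials are placeholders, because the post masks the real ones.

```python
from io import BytesIO


def curl_options(url, referer, proxy, proxy_userpwd):
    """Return (pycurl constant name, value) pairs mirroring the PHP request."""
    return [
        ("URL", url),
        ("USERAGENT", "Mozilla/5.0 (Windows NT 5.1; U; en)"),
        ("REFERER", referer),
        ("HTTPHEADER", ["Accept: */*", "Connection: close"]),
        ("ACCEPT_ENCODING", "deflate,identity"),
        ("FOLLOWLOCATION", True),
        ("COOKIEFILE", ""),  # empty string turns on libcurl's in-memory cookie engine
        ("PROXY", proxy),  # "host:port", placeholder
        ("PROXYUSERPWD", proxy_userpwd),  # "user:password", placeholder
    ]


def fetch(options):
    """Perform one GET through a SOCKS5 proxy and return (status, body bytes)."""
    import pycurl  # third-party; imported here so the helper above has no dependency

    body = BytesIO()
    curl = pycurl.Curl()
    for name, value in options:
        curl.setopt(getattr(pycurl, name), value)
    curl.setopt(pycurl.PROXYTYPE, pycurl.PROXYTYPE_SOCKS5)
    curl.setopt(pycurl.WRITEDATA, body)
    try:
        curl.perform()
        status = curl.getinfo(pycurl.RESPONSE_CODE)
    finally:
        curl.close()
    return status, body.getvalue()
```

Each call to `fetch` uses a fresh handle, so cookies from the first page are not carried over to the second; to mimic the two-step PHP flow, the same `pycurl.Curl()` handle would have to be reused for both requests.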

Alexey Cheremisin, 2017-05-18
@leahch

Use Python requests and don't worry: docs.python-requests.org
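
A rough sketch of the same two-step flow with requests, assuming the library is installed with SOCKS support (the `requests[socks]` extra). The helper names `build_config` and `fetch_order_page` are hypothetical, and the proxy host, user and password are placeholders, since the post masks them. The header values are the ones the PHP request sends.

```python
from urllib.parse import quote

BASE = "https://www.dell.com"
FIRST_PATH = "/support/Contents/category/eSupport-Order-support?c=us&l=en&~ck=mn"
ORDER_PATH = "/support/orders/us/en/19/Order/Details?sbon=205523888"


def build_config(proxy_host, proxy_port, proxy_user, proxy_password):
    """Return (headers, proxies) matching the headers of the PHP request."""
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 5.1; U; en)",
        "Accept": "*/*",
        "Accept-Encoding": "deflate,identity",
    }
    # Credentials are percent-encoded so special characters survive in the URL.
    proxy_url = "socks5://%s:%s@%s:%d" % (
        quote(proxy_user, safe=""),
        quote(proxy_password, safe=""),
        proxy_host,
        proxy_port,
    )
    return headers, {"http": proxy_url, "https": proxy_url}


def fetch_order_page(headers, proxies):
    """Visit the category page, then the order page, sharing cookies."""
    import requests  # third-party; imported here so build_config has no dependency

    session = requests.Session()  # keeps cookies between the two requests
    session.headers.update(headers)
    session.proxies.update(proxies)
    session.get(BASE + FIRST_PATH)
    response = session.get(
        BASE + ORDER_PATH,
        headers={"Referer": BASE + FIRST_PATH, "Connection": "close"},
    )
    return response.status_code, response.headers.get("ETag"), response.text
```

Returning the ETag alongside the body makes it easy to see whether the server is still serving the static stub page described in the question.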
