Python
artds, 2016-06-17 12:37:17

How do I parse pagination and save images in Python?

Good afternoon. How do I parse a site's pagination?
Here is the pagination markup itself:

<div class="pagination" data-container="pagination" style="display: none;">
  <ul>

    

          <li>Начало</li> <li>Пред.</li>
    
    
              <li>1</li>
                
              <li><a href="/sistemy-opoveshcheniya-i-muzykalnoy-translyatsii/?SECTION_CODE=sistemy-opoveshcheniya-i-muzykalnoy-translyatsii&amp;PAGEN_1=2">2</a></li>
                
              <li><a href="/sistemy-opoveshcheniya-i-muzykalnoy-translyatsii/?SECTION_CODE=sistemy-opoveshcheniya-i-muzykalnoy-translyatsii&amp;PAGEN_1=3">3</a></li>
                
              <li><a href="/sistemy-opoveshcheniya-i-muzykalnoy-translyatsii/?SECTION_CODE=sistemy-opoveshcheniya-i-muzykalnoy-translyatsii&amp;PAGEN_1=4">4</a></li>
                
              <li><a href="/sistemy-opoveshcheniya-i-muzykalnoy-translyatsii/?SECTION_CODE=sistemy-opoveshcheniya-i-muzykalnoy-translyatsii&amp;PAGEN_1=5">5</a></li>
                
          <li><a href="/sistemy-opoveshcheniya-i-muzykalnoy-translyatsii/?SECTION_CODE=sistemy-opoveshcheniya-i-muzykalnoy-translyatsii&amp;PAGEN_1=2">След.</a></li>
      <li><a href="/sistemy-opoveshcheniya-i-muzykalnoy-translyatsii/?SECTION_CODE=sistemy-opoveshcheniya-i-muzykalnoy-translyatsii&amp;PAGEN_1=101">Конец</a></li>
    
  
    
  </ul>
</div>

or like this:
<p><font class="faq"><b>1</b></font> &nbsp;&nbsp;<a class="no_underline" href="/category/ibm/offset20/">2</a> 
<a class="no_underline" href="/category/ibm/offset40/">3</a>
<a class="no_underline" href="/category/ibm/offset60/">4</a>
<a class="no_underline" href="/category/ibm/offset80/">5</a> 
<a class="no_underline" href="/category/ibm/offset100/">6</a>
<a class="no_underline" href="/category/ibm/offset12940/">648</a> &nbsp;&nbsp;<a class="no_underline" href="/category/ibm/offset20/">след &gt;&gt;</a> </p>
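
Both pagination samples above expose the last page in their final links: PAGEN_1=101 in the first and offset12940 in the second. A minimal sketch of reading those values, assuming a regular expression is good enough for this markup (an HTML parser such as lxml would be more robust). The names `last_page` and `last_offset` are hypothetical helpers, not part of any library.

```python
import re


def last_page(html, param="PAGEN_1"):
    """Return the largest page number linked in the markup, or 1 if none is found."""
    numbers = [int(n) for n in re.findall(re.escape(param) + r"=(\d+)", html)]
    return max(numbers, default=1)


def last_offset(html):
    """Return the largest /offsetN/ value linked in the markup, or 0 if none is found."""
    numbers = [int(n) for n in re.findall(r"/offset(\d+)/", html)]
    return max(numbers, default=0)
```

Passing the first sample to `last_page` or the second to `last_offset` gives the upper bound for a loop over pages, so the whole listing can be walked without following the "next" link on every page.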

I handle the pagination like this:
from urllib.request import Request, urlopen
from lxml.etree import XMLSyntaxError
from lxml.html import fromstring

URL =  'https://www.layta.ru/adresnye-sistemy-pozharnoy-signalizatsii/?SECTION_CODE=adresnye-sistemy-pozharnoy-signalizatsii&PAGEN_1=%'
def parse_soud():
    f = urlopen(URL)
    list_html = f.read().decode('utf-8')
    list_doc = fromstring(list_html)

    for url in [list_doc % i for i in range(1)]:
        r = requests.get()
        print(r)

In the first case I get NameError: name 'xrange' is not defined.
In the second case I get urllib.error.HTTPError: HTTP Error 400: Bad Request.
What am I doing wrong?
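
A few things stand out in the pagination snippet. `xrange` exists only in Python 2; in Python 3 it is `range`. The `%` operator is applied to `list_doc` (the parsed document) rather than to the `URL` string, and the template ends in a bare `%` instead of a placeholder, so requesting the URL as written sends a malformed percent-escape, which is a plausible cause of the 400 response. `requests` is never imported, `requests.get()` is called without a URL, and `range(1)` yields only 0 while the pages start at 1. A minimal sketch of one way to build and fetch the page URLs follows; `page_urls` and `fetch_pages` are hypothetical names, and the last page number is assumed to be known already, for example read from the last pagination link.

```python
from urllib.request import urlopen

# Template from the question, with a real placeholder instead of a bare '%'.
URL_TEMPLATE = ('https://www.layta.ru/adresnye-sistemy-pozharnoy-signalizatsii/'
                '?SECTION_CODE=adresnye-sistemy-pozharnoy-signalizatsii'
                '&PAGEN_1={page}')


def page_urls(last_page, template=URL_TEMPLATE):
    """Return the URL of every listing page from 1 to last_page inclusive."""
    return [template.format(page=n) for n in range(1, last_page + 1)]


def fetch_pages(last_page):
    """Yield (url, html) for each listing page; this part needs network access."""
    for url in page_urls(last_page):
        with urlopen(url) as response:
            yield url, response.read().decode('utf-8')
```

Each `html` string yielded by `fetch_pages` can then be passed to `fromstring` exactly as in the original snippet.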
The code below finds the image links, but how do I save the images and add each image's name to a list?
from urllib.request import Request, urlopen
from lxml.etree import XMLSyntaxError
from lxml.html import fromstring
URL =  'https://www.layta.ru/rubezh-ip-101-29-a3r1.html'
IMAGE_PATH = '.images .bigimg'
def parse_soud():
    f = urlopen(URL)
    list_html = f.read().decode('utf-8')
    list_doc = fromstring(list_html)

    for elem in list_doc.cssselect(IMAGE_PATH):
        img = elem.cssselect('img')[0]
        im = img.get('src')
        im = open('/home/artddss/parse_prjekt')
        im.write(im.content)
        im.close
        print(im)

It raises IsADirectoryError: [Errno 21] Is a directory: '/home/artddss/parse_prjekt'.
Am I lacking permissions?
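
The error is most likely not about permissions: `open()` is given the path of a directory, and it needs the path of a file inside that directory, opened in binary write mode (`'wb'`). The snippet also reuses `im` for both the `src` string and the file handle, reads `im.content` (file objects have no such attribute), and writes `im.close` without parentheses, so the file is never closed. A minimal sketch of one way to save an image and record its name, assuming the file name can be taken from the last segment of the image URL; `image_target`, `save_image` and `saved_names` are hypothetical names.

```python
import os
from urllib.parse import urljoin, urlsplit
from urllib.request import urlopen


def image_target(page_url, src, save_dir):
    """Return (absolute image URL, local file path) for an <img src> value."""
    image_url = urljoin(page_url, src)  # resolves relative src against the page URL
    filename = os.path.basename(urlsplit(image_url).path)
    return image_url, os.path.join(save_dir, filename)


def save_image(page_url, src, save_dir, saved_names):
    """Download one image into save_dir and append its file name to saved_names."""
    image_url, path = image_target(page_url, src, save_dir)
    os.makedirs(save_dir, exist_ok=True)
    with urlopen(image_url) as response, open(path, 'wb') as out:
        out.write(response.read())
    saved_names.append(os.path.basename(path))
    return path
```

Inside the loop from the question this would be called as `save_image(URL, img.get('src'), '/home/artddss/parse_prjekt', names)`, with `names = []` created before the loop.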
