Python
Abdulyar, 2014-07-23 11:35:08

How to parse IP addresses from web pages in Python?

The task is to follow the links on
www.zone-h.org/archive,
open pages like www.zone-h.org/mirror/id/22714269,
and copy the IP address field from each one into a single text file.
How would you recommend implementing this in Python? Which libraries or examples would you suggest? Thanks for the advice.

2 answer(s)
Heafy, 2014-07-23
@Abdulyar

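from urllib import request

def getIP(urls):
  # Note: the urls argument is not used yet; the link below is hard-coded.
  link = 'http://www.zone-h.org/mirror/id/22714269'
  requestToLink = request.Request(link)
  answerFromServ = request.urlopen(requestToLink).read()
  result = answerFromServ.decode('utf8')
  # Print a fixed-width slice after the first 'IP' label (offsets fit this page only).
  labelPos = result.find('IP')
  print(result[labelPos + 20 : labelPos + 37])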

Don't judge too harshly: I jumped in out of personal interest, as a beginner.
This gets one specific IP from one specific page; the part that locates the IP address itself will probably need improving.
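One way to improve the IP search is to replace the fixed offsets with a regular expression. A minimal sketch, assuming the address appears somewhere after the first 'IP' label on the page. The sample markup and the helper name find_ip are made up for illustration, and the pattern does not check that each number is at most 255.

```python
import re

# Loose IPv4 pattern: four dot-separated groups of 1-3 digits.
IPV4_RE = re.compile(r'\b(?:\d{1,3}\.){3}\d{1,3}\b')

def find_ip(html):
    """Return the first IPv4-looking string after the label 'IP', or None."""
    label_pos = html.find('IP')
    if label_pos == -1:
        return None
    # Start the regex search at the label instead of guessing offsets.
    match = IPV4_RE.search(html, label_pos)
    return match.group(0) if match else None

# Made-up markup, only to show the call; the real page may look different.
sample = '<li><strong>IP address:</strong> 192.0.2.17 </li>'
print(find_ip(sample))  # prints 192.0.2.17
```

Because the search starts at the label and not at a hard-coded offset, it should keep working if the spacing around the address changes.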
But how do you search for all the links on the www.zone-h.org/archive page?
from urllib import request
def getUrls():
  urls = []
  link = 'http://www.zone-h.org/archive'
  requestToLink = request.Request(link)
  answerFromServ = request.urlopen(requestToLink).read()
  result = answerFromServ.decode('utf8')
  findIt = 'mirror/id'
  
  for findIt in result:
    urls.append(result[result.find('mirror/id') + 10 : result.find('mirror/id') + 18])
    result = result[result.find('mirror/id'):]
    
  return urls

As far as I understand, there is something wrong with the line
But what?)
Thanks for the answers, and again, don't judge the code quality: I only started reading Lutz the other day and installed Python 3.4.
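On the loop question: `for findIt in result:` does not search for the marker. It rebinds findIt to each character of the page in turn, so the body runs once per character. In addition, `result[result.find('mirror/id'):]` keeps the marker at the start of the string, so every later find() returns 0 and the same id is appended again and again. A possible fix, as a sketch: the function name extract_mirror_ids and the sample markup are hypothetical, and it assumes each id is a run of digits right after 'mirror/id/'.

```python
def extract_mirror_ids(html, marker='mirror/id/'):
    """Collect the digits that follow every occurrence of marker in html."""
    ids = []
    pos = html.find(marker)
    while pos != -1:
        start = pos + len(marker)
        end = start
        # Read digits until the id ends, instead of assuming a fixed width.
        while end < len(html) and html[end].isdigit():
            end += 1
        if end > start:
            ids.append(html[start:end])
        # Continue searching after this match so the loop moves forward.
        pos = html.find(marker, end)
    return ids

# Made-up markup, only to show the call.
sample_html = ('<a href="/mirror/id/22714269">mirror</a> '
               '<a href="/mirror/id/22714270">mirror</a>')
print(extract_mirror_ids(sample_html))  # prints ['22714269', '22714270']
```

Passing a start position to find() avoids re-slicing the string on every step, and the loop ends as soon as find() returns -1.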

Vyacheslav, 2014-07-23
@Nirail

Using existing Python tools:
1) Download the page at www.zone-h.org/archive with urllib2 (urllib.request in Python 3).
2) Find all the links you need on the page, for example with a regex search.
3) Go through the links you found, download each page with urllib2, extract the line you need (the IP address), and write it to a file.
4) ...
5) PROFIT
How to download a page with urllib2 is easy to find online.
Finding a specific line in a large text is not a problem either.
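Steps 1 to 3 could be sketched roughly like this in Python 3, where urllib2's functionality lives in urllib.request. This is only an outline under assumptions: the link pattern /mirror/id/<digits> is taken from the question, the first IP-looking string on each mirror page is assumed to be the one wanted, and the names fetch, collect_ips, save_ips and the output file ips.txt are made up for illustration.

```python
import re
from urllib import request

BASE = 'http://www.zone-h.org'
LINK_RE = re.compile(r'/mirror/id/(\d+)')
IP_RE = re.compile(r'\b(?:\d{1,3}\.){3}\d{1,3}\b')  # loose IPv4 pattern

def fetch(url):
    """Download url and return its body as text."""
    with request.urlopen(url) as response:
        return response.read().decode('utf-8', errors='replace')

def collect_ips(fetch_page=fetch):
    """Archive page -> mirror links -> first IP-like string on each page."""
    archive = fetch_page(BASE + '/archive')
    ips = []
    seen = set()
    for mirror_id in LINK_RE.findall(archive):
        if mirror_id in seen:  # skip links that repeat on the archive page
            continue
        seen.add(mirror_id)
        page = fetch_page(BASE + '/mirror/id/' + mirror_id)
        match = IP_RE.search(page)
        if match:
            ips.append(match.group(0))
    return ips

def save_ips(ips, path='ips.txt'):
    """Write one address per line to a single text file."""
    with open(path, 'w') as out:
        out.write('\n'.join(ips) + '\n')
```

Calling save_ips(collect_ips()) would run it against the live site. The fetch_page parameter is there so the parsing can be tried on saved or made-up pages first, without any network access.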
