How do I parse an HTML page and process it?
There is a VK page, vk.com/go_in_zp?z=photo-50824015_344878304%2Falbum... I need to parse its HTML and extract the link cs624016.vk.me/v624016533/a226/owG51bJm59o.jpg . Please suggest code in C++ or Python, if it's not too much trouble.
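$ pip install lxml cssselect
from urllib.request import urlopen  # Python 3 replacement for urllib2
from lxml import html
url = 'page_url'  # placeholder: address of the page to parse
data = urlopen(url).read()
h = html.fromstring(data)
# href of the first link inside the element with class "mv_actions"
print(h.cssselect('.mv_actions a')[0].attrib['href'])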
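If lxml cannot be installed, roughly the same selection as `.mv_actions a` can be sketched with the standard library's `html.parser`. This is a swapped-in technique, not the method from the answer above. The `LinkFinder` class, the `find_links` helper and the sample markup are made up for illustration, and the real VK markup may differ.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not change the nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class LinkFinder(HTMLParser):
    """Collect href values of <a> tags nested inside an element with a given class."""

    def __init__(self, css_class):
        super().__init__()
        self.css_class = css_class
        self.depth = 0    # > 0 while the parser is inside the target element
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if self.depth:
            if tag == "a" and attrs.get("href"):
                self.links.append(attrs["href"])
            if tag not in VOID_TAGS:
                self.depth += 1
        elif self.css_class in (attrs.get("class") or "").split():
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth and tag not in VOID_TAGS:
            self.depth -= 1


def find_links(html_text, css_class):
    """Return all link targets found inside elements carrying css_class."""
    finder = LinkFinder(css_class)
    finder.feed(html_text)
    return finder.links


# Hypothetical markup, only to show the call; not copied from VK.
SAMPLE_HTML = (
    '<div class="mv_actions"><img src="x.png">'
    '<a href="http://example.com/photo.jpg">Download</a></div>'
    '<a href="/other">other</a>'
)
print(find_links(SAMPLE_HTML, "mv_actions"))
```

This version assumes reasonably well-formed markup. For real pages lxml is more forgiving, so treat it as a fallback only.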
Here is some ugly, clumsy, but working Python code:
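from selenium import webdriver
from selenium.webdriver.common.by import By
import time
browser = webdriver.Firefox()
url = 'http://vk.com/go_in_zp?z=photo-50824015_344878304%2Falbum-50824015_00%2Frev'
browser.get(url)
time.sleep(5)  # this is bad: a fixed pause instead of waiting for the element
# find_element(By.XPATH, ...) replaces find_element_by_xpath, which newer Selenium releases removed
img = browser.find_element(By.XPATH, '//a[@id="pv_photo"]/img')
print(img.get_attribute('src'))
browser.quit()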
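The `time.sleep(5)` marked as bad above can be replaced by polling: check for the element repeatedly and stop as soon as it appears or a timeout expires. The helper below is a minimal sketch of that idea in plain Python. The name `wait_until` and its parameters are made up for illustration and are not part of Selenium.

```python
import time


def wait_until(predicate, timeout=10.0, interval=0.25):
    """Call predicate() until it returns a truthy value or `timeout` seconds pass.

    Returns the truthy value, or raises TimeoutError if the deadline is reached.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = predicate()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.1f s" % timeout)
        time.sleep(interval)
```

With the `browser` from the script above, it could be used as `imgs = wait_until(lambda: browser.find_elements("xpath", '//a[@id="pv_photo"]/img'))`, since `find_elements` returns an empty list while nothing matches. Selenium also ships its own `WebDriverWait` class in `selenium.webdriver.support.ui` for this purpose, which is probably the better choice in real code.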
# Variant without a browser, using requests and lxml
import requests
from lxml.html import fromstring
url = 'http://vk.com/go_in_zp?z=photo-50824015_344878304%2Falbum-50824015_00%2Frev'
# photo id between 'photo-' and the first '%2F', here '50824015_344878304'
search_string = url[url.find('photo-') + len('photo-'):url.find('%2F')]
r = requests.get(url)
doc = fromstring(r.text)
xpath = '//a[contains(@onclick, "%s")]/img' % search_string
print(doc.xpath(xpath)[0].attrib['src'])
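The string slicing that builds `search_string` depends on the exact position of `photo-` and `%2F` in the URL. A slightly more robust sketch, assuming the id always sits at the start of the `z` query parameter as in the URL above, uses the standard library's `urllib.parse`. The function name `photo_id_from_url` is made up for illustration.

```python
from urllib.parse import urlsplit, parse_qs


def photo_id_from_url(url):
    """Return the photo id from the 'z' query parameter of a VK photo URL."""
    query = parse_qs(urlsplit(url).query)   # values come back percent-decoded
    z_value = query["z"][0]                 # raises KeyError if 'z' is missing
    first_part = z_value.split("/")[0]      # the part before the first '/'
    prefix = "photo-"
    if not first_part.startswith(prefix):
        raise ValueError("unexpected 'z' parameter: %r" % z_value)
    return first_part[len(prefix):]


url = 'http://vk.com/go_in_zp?z=photo-50824015_344878304%2Falbum-50824015_00%2Frev'
print(photo_id_from_url(url))
```

The result can be used in place of `search_string` when building the XPath expression.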
The easiest way is to use QtWebKit: https://qt-project.org/doc/qt-5/qwebframe.html#fin...
For a beginner, the easiest library to learn for this is Grab ( https://pypi.python.org/pypi/grab/0.4.13 ).
Open the page in Firefox with Firebug, inspect the page source and find the element you need. Copy its XPath from Firebug, and then you can do something like this (Python 3):
from grab import Grab
g = Grab()
sample_url = 'some_url'
xpath_part = 'some_xpath'
g.go(sample_url)  # load the page; the parsed document is then available as g.doc
result = g.doc.select(xpath_part).text()  # text of the first node matching the XPath
print(result)