Parsing
Mark Adams, 2016-06-13 14:01:27

How to scroll to the bottom of an Instagram page?

We open the page of photos for a tag: https://www.instagram.com/explore/tags/{tag_without_#}
All the data can only be viewed by scrolling; there is no pagination.
I wrote this script:

from selenium import webdriver
import random
import time

tag = ''  # put the tag here, without the leading '#'
timer = 30  # seconds left for clicking "Load more" by hand

browser = webdriver.Firefox()
browser.get('https://www.instagram.com/explore/tags/' + tag + '/')
print('CLICK "LOAD MORE"! you have %d sec' % timer)
time.sleep(timer)

count = 0  # total number of scroll cycles so far
while True:
  print('RUN!')
  # 126 scroll cycles per batch (same as the original t = 0..125 loop)
  for _ in range(126):
    # jump to the bottom so the page loads the next batch of photos
    browser.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    time.sleep(random.uniform(1, 1.5))
    # jump back to the top before the next cycle
    browser.execute_script("window.scrollTo(0, 0);")
    time.sleep(random.uniform(1, 1.5))
    count += 1
    # rough estimate: counts 4 photos per scroll cycle
    print('total: ~' + str(count * 4))
  print('SLEEP 1h')
  time.sleep(3600)

Before that I experimented with the timers, but so far I have not been able to reach the end of the page (Instagram slows down the loading). The most I have managed to load is 15k photos.
Any thoughts?

1 answer(s)
Victor Aubin, 2016-08-01
@nebo_oben

Open the developer console (Ctrl + Shift + I in Chrome) and look in the Network tab at the requests the site sends when you click the "Load more" button. It sends

q=ig_hashtag(HASHTAG)+%7B+media.after(J0HV59zAwAAAF0HV59y8AAAAFkYA%2C+11)+%7B%0A++count%2C%0A++nodes+%7B%0A++++caption%2C%0A++++code%2C%0A++++comments+%7B%0A++++++count%0A++++%7D%2C%0A++++date%2C%0A++++dimensions+%7B%0A++++++height%2C%0A++++++width%0A++++%7D%2C%0A++++display_src%2C%0A++++id%2C%0A++++is_video%2C%0A++++likes+%7B%0A++++++count%0A++++%7D%2C%0A++++owner+%7B%0A++++++id%0A++++%7D%2C%0A++++thumbnail_src%2C%0A++++video_views%0A++%7D%2C%0A++page_info%0A%7D%0A+%7D&ref=tags%3A%3Ashow
to the host https://www.instagram.com/query/. The parameter in media.after is, I suspect, the ID of the last photo in the current batch.
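
To make the captured request easier to read, here is a small Python 3 sketch that decodes the form body and rebuilds it with a different cursor. It only handles strings and does not contact Instagram. CAPTURED is a shortened copy of the body above (the field list is trimmed to code and display_src), and build_payload is a hypothetical helper name, not part of any library. Whether passing a new value to media.after really returns the next batch is only the guess stated above.

```python
from urllib.parse import parse_qs, urlencode

# Shortened copy of the captured request body (field list trimmed for brevity).
CAPTURED = (
    "q=ig_hashtag(HASHTAG)+%7B+media.after(J0HV59zAwAAAF0HV59y8AAAAFkYA%2C+11)"
    "+%7B%0A++count%2C%0A++nodes+%7B%0A++++code%2C%0A++++display_src%0A++%7D%2C%0A"
    "++page_info%0A%7D%0A+%7D&ref=tags%3A%3Ashow"
)


def build_payload(tag, cursor, page_size=11):
    """Build a form body shaped like the captured one (hypothetical helper).

    tag       -- hashtag without the leading '#'
    cursor    -- value passed to media.after (assumed to mark the last item seen)
    page_size -- second argument of media.after (11 in the captured request)
    """
    query = (
        "ig_hashtag(%s) { media.after(%s, %d) {\n"
        "  count,\n"
        "  nodes {\n"
        "    code,\n"
        "    display_src\n"
        "  },\n"
        "  page_info\n"
        "}\n"
        " }" % (tag, cursor, page_size)
    )
    # safe="()" leaves the parentheses unescaped, as in the captured body.
    return urlencode({"q": query, "ref": "tags::show"}, safe="()")


if __name__ == "__main__":
    # Human-readable form of the query the browser sent.
    print(parse_qs(CAPTURED)["q"][0])
    # The same request shape with another tag and cursor (both placeholders).
    print(build_payload("sometag", "SOME_CURSOR"))
```

Decoded, the q parameter is a small query that asks for the media after a given cursor, plus page_info, which is presumably where the next cursor comes from.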
