Why can't beautifulsoup4 find establishments when parsing Google Maps, returning None instead?

Python

Alexander, 2020-12-09 21:13:25

When I run the script, it fails with:

TypeError: 'NoneType' object is not iterable

import requests
from bs4 import BeautifulSoup
import csv

CSV = 'result.csv'
HOST = 'https://www.google.com'
URL = 'https://www.google.com/maps/search/Canggu+villa/@-8.6419077,115.139734,14z'
HEADERS = {
  'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
  'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.198 Safari/537.36'
}

def get_html(url, params=None):
  r = requests.get(url, headers=HEADERS, params=params)
  return r

def get_content(html):
  soup = BeautifulSoup(html, 'html.parser')
  items = soup.find('div', class_='section-result-content')
  cards = []

  for item in items:
    cards.append(
        {
        'phone': item.find('span', class_='section-result-info section-result-phone-number').find_next_sibling('span').get_text(strip=True),
        'name': item.find('h3', class_='section-result-title').find('span').get_text(strip=True),
        'type': item.find('span', class_='section-result-details').get_text(strip=True)
        }
      )
  return cards

def save_doc(items, path):
  with open(path, 'w', newline=None,) as file:
    writer = csv.writer(file, delimiter=';')
    writer.writerow(['Phone number', 'Name', 'Type'])

    for item in items:
      writer.writerow([item['phone'], item['name'], item['type']])

def parser():
  PAGINATION = 2
  html = get_html(URL)
  if html.status_code == 200:
    cards = []

    for page in range(1, PAGINATION):
      print(f'page: {page}')
      html = get_html(URL, params={'page': page})
      cards.extend(get_content(html.text))
      save_doc(cards, CSV)
    print('Parsing is over')
  else:
    print('Error')

parser()

2 answer(s)

Sergey Gornostaev, 2020-12-09
@swwso1

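Because the elements you are interested in are generated by JavaScript on the client side, and BeautifulSoup cannot execute JavaScript. The HTML that requests downloads does not contain them, so soup.find() returns None, and iterating over None raises the TypeError.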
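To illustrate the point, here is a minimal sketch. STATIC_HTML is a made-up stand-in for the response that requests receives (the real Google Maps markup is different), and extract_cards is a hypothetical helper, not part of the question's code. It shows that find() returns None when the element is missing, while find_all() returns an empty list that is safe to loop over.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the downloaded page: the result cards are absent
# because a browser would only build them later by running JavaScript.
STATIC_HTML = (
    "<html><body><div id='app'></div>"
    "<script>/* renders the results in the browser */</script></body></html>"
)


def extract_cards(html):
    """Return the text of every result card, or an empty list if there are none."""
    soup = BeautifulSoup(html, "html.parser")
    # find_all() returns an empty list when nothing matches, so the loop is safe.
    items = soup.find_all("div", class_="section-result-content")
    return [item.get_text(strip=True) for item in items]


# find() returns None when nothing matches; iterating over it raises TypeError.
first = BeautifulSoup(STATIC_HTML, "html.parser").find(
    "div", class_="section-result-content"
)
print(first)                       # None
print(extract_cards(STATIC_HTML))  # []
```

Switching to find_all() only removes the TypeError. The list stays empty until the HTML actually contains the rendered cards.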

Dimitry Zub, 2021-04-20
@zdmit

Google Maps is rendered with JavaScript, and BeautifulSoup cannot extract data that JavaScript generates. To get the data, you need to use Selenium or extract it from the inline JavaScript in the page source.
For the second option, match window.APP_INITIALIZATION_STATE with a regular expression. That is where you will find all the necessary data.
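A rough sketch of the regular-expression approach, using only the standard library. SAMPLE_HTML is an invented fragment and extract_initial_state is a hypothetical helper. The assumptions that the assignment is followed by ";window." and that the value is JSON-compatible are mine; the real payload is large and undocumented and may change, so check it against the actual page source.

```python
import json
import re

# Invented fragment for illustration only; the real payload is much larger.
SAMPLE_HTML = (
    '<script>window.APP_INITIALIZATION_STATE=[[1,2],["Canggu villa"]];'
    "window.APP_FLAGS=[];</script>"
)

# Assumption: the array literal ends right before the next ";window." assignment.
PATTERN = re.compile(
    r"window\.APP_INITIALIZATION_STATE\s*=\s*(\[.*?\]);\s*window\.",
    re.DOTALL,
)


def extract_initial_state(html):
    """Return the parsed inline state, or None if it is missing or not valid JSON."""
    match = PATTERN.search(html)
    if match is None:
        return None
    try:
        return json.loads(match.group(1))
    except json.JSONDecodeError:
        # The literal may not be strict JSON; the caller has to handle that case.
        return None


state = extract_initial_state(SAMPLE_HTML)
print(state)
```

Returning None in both failure cases keeps the caller's check simple, at the cost of not distinguishing a missing variable from an unparsable one.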
Alternatively, you can use SerpApi's third-party Google Maps API. It is a paid API with a trial and a limit of 5,000 search queries.
Example using SerpApi and Python + Google Maps Place Results API:

import json
import os

from serpapi import GoogleSearch

params = {
  "engine": "google_maps",
  "type": "place",
  "google_domain": "google.com",
  "q": "Coffee",
  "ll": "@55.7817589,37.3439227,11z",  # latitude, longitude, zoom
  "data": "!3m1!4b1!4m5!3m4!1s0x46b54a6a2d1fa48b:0x48626c54fae83fbd!8m2!3d55.7699605!4d37.6207588",
  "api_key": os.getenv("API_KEY"),  # read the key from the environment
}

search = GoogleSearch(params)
results = search.get_dict()

# ensure_ascii=False keeps non-ASCII place names readable in the output
print(json.dumps(results['place_results'], indent=2, ensure_ascii=False))

JSON output:

...
[
  {
    "title": "Black Star Burger",
    "rating": 4.2,
    "reviews": 5201,
    "price": "$$"
  }
]
...

Disclaimer: I work for SerpApi.