How to fix this Python code?
I have some code from a book on parsing in Python.
It should display the links from a page.
But the output looks like this:
/wiki/IMDb
/wiki/2007_Webby_Awards
/wiki/2017_Webby_Awards
/wiki/Internet_Archive
Here is the code:
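from urllib.request import urlopen
from bs4 import BeautifulSoup
import random
import re

# Seed from the system source. Passing a datetime object, as the book does,
# raises TypeError on Python 3.11 and later.
random.seed()


def getLinks(articleUrl):
    # Download the article and parse it.
    html = urlopen("http://en.wikipedia.org" + articleUrl)
    bsObj = BeautifulSoup(html, "html.parser")
    # Keep only links inside the article body that start with /wiki/
    # and contain no colon (this skips File:, Category:, and similar pages).
    content = bsObj.find("div", {"id": "mw-content-text"})
    return content.find_all("a", href=re.compile("^(/wiki/)((?!:).)*$"))


links = getLinks("/wiki/Kevin_Bacon")
while len(links) > 0:
    # Pick one random link, print it, then continue from that article.
    newArticle = random.choice(links).attrs["href"]
    print(newArticle)
    links = getLinks(newArticle)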
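Note that the loop above prints only one randomly chosen link per page and then moves on to that page, so one link per line is what this code is written to produce. If the goal is to see every article link on a single page, the filtering step can be tried on its own. The following is only a minimal sketch: the helper names is_article_link and article_links are hypothetical (they are not from the book), and the sample hrefs in the comment are made up for illustration.

```python
import re

# Same pattern as in the question: the path must start with /wiki/
# and must not contain a colon anywhere after that.
ARTICLE_LINK = re.compile(r"^(/wiki/)((?!:).)*$")


def is_article_link(href):
    """Return True if href looks like a link to a regular article."""
    return ARTICLE_LINK.match(href) is not None


def article_links(hrefs):
    """Keep article links only, dropping duplicates but preserving order."""
    seen = set()
    result = []
    for href in hrefs:
        if is_article_link(href) and href not in seen:
            seen.add(href)
            result.append(href)
    return result


# Hypothetical sample input, for illustration only:
# article_links(["/wiki/IMDb", "#cite_note-1", "/wiki/File:X.jpg", "/wiki/IMDb"])
```

With the getLinks function from the question, looping over getLinks("/wiki/Kevin_Bacon") and printing link.attrs["href"] for each item would list all links found on that one page instead of following a random chain.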