Recap from class (week 6)

We reviewed the ideas about web scraping covered in the two articles by Canadian journalist Nael Shiab, including:

  • Why do we scrape?
  • What kinds of data do we seek?
  • Examples of what Shiab has scraped, as a journalist
  • Ethics questions about scraping
  • What kinds of sites should we not scrape?

We then installed the BeautifulSoup library in a new Python 3 virtualenv and tested it, using commands from Mitchell’s chapters 1 and 2. Remember to use the updated code from Mitchell’s repo rather than the code printed in her book.
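As a reminder of what that kind of test looks like, here is a minimal sketch. It is not Mitchell’s code: the HTML string and the variable names are made up for illustration, and it parses a string instead of fetching a live page so that it runs without a network connection. It assumes the library was installed in the active virtualenv with `pip install beautifulsoup4`.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for a page you would normally download
SAMPLE_HTML = """
<html>
  <head><title>Sample Page</title></head>
  <body>
    <h1>An Interesting Title</h1>
    <p class="intro">First paragraph.</p>
    <p>Second paragraph.</p>
  </body>
</html>
"""

# Parse the HTML with Python's built-in parser
soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Grab the text of the first <h1> element
heading = soup.h1.get_text()

# Collect the text of every <p> element into a list
paragraphs = [p.get_text() for p in soup.find_all("p")]

# Find one element by tag name and class attribute
intro = soup.find("p", {"class": "intro"}).get_text()

print(heading)
print(paragraphs)
print(intro)
```

If the import fails, the most likely cause is that the virtualenv is not activated or that the library was installed outside it.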

We used the web-scraping section of my python-beginners repo.

We reviewed the basics of writing and running Python 3 functions, covered in Sweigart’s chapter 3. The review was brief, so please refer to the week02 section of my python-beginners repo. In particular, examine the chapter outline there and the slide deck, which is linked below the outline. You will be writing your own functions when you write your own web scraper (Assignment 9).
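To jog your memory, here is a small sketch of defining and calling functions with parameters and return values. The function names `clean_text` and `count_words` are invented for this example and do not come from Sweigart or the assignment; they simply show the kind of helper a scraper might need for tidying text pulled from a page.

```python
def clean_text(raw):
    """Collapse runs of whitespace into single spaces and trim both ends."""
    # split() with no arguments splits on any whitespace and drops empty pieces
    return " ".join(raw.split())


def count_words(text):
    """Return the number of whitespace-separated words in text."""
    # One function can call another and build on its return value
    return len(clean_text(text).split())


messy = "  Hello,\n   world!  "
print(clean_text(messy))
print(count_words(messy))
```

Keeping each function small and giving it a single job makes it easier to test on its own before you combine the pieces into a full scraper.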
