APIs and what you can do with them

Mitchell’s chapter 4 discusses APIs and how to use them to get information from websites (or, more accurately, Web applications) that offer them.

Before you try to scrape a site, a great first step is simply to search for the name of the site plus the word “API.” For example, I put this into Google:

new york times api

And I got this. Who knew?

Sometimes you get something even better than the API: You find that someone has already done essentially what you want to do and shared it with the world. This post is a good example of that:

How to Scrape Data From Facebook Page Posts for Statistical Analysis

Facebook does not give us much help for any kind of true scraping, even though it offers tons of APIs. The linked post won’t help you scrape other people’s Facebook Pages, but if you work for a publisher, you can use this technique to get raw data about your own Page(s), and you can use the raw data to do way more analysis than Facebook makes possible with its tools for publishers. This gives you super powers. Yay, code!

Things to try

Ask Google: what is my ip address

Copy that IP address into this URL, replacing the IP address that’s already there:  http://freegeoip.net/json/

Paste the URL into your browser and view the tidy JSON data. (Remember JSON? You made a CSV of all your latlongs for map locations for the Leaflet assignment in Intro to Web Apps. You used Mr. Data Converter to change your CSV into JSON.)
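You can do the same lookup from Python instead of the browser. Here is a minimal sketch, using only the standard library. The function names, the sample IP address, and the sample JSON are all made up for illustration, and the field names (country_code, latitude, longitude) are my assumption about what the service returns, so check them against the real response. The service may also no longer answer at this address.

```python
import json
from urllib.request import urlopen

# Base URL from the steps above; the IP address gets appended to it.
BASE_URL = "http://freegeoip.net/json/"


def build_lookup_url(ip_address):
    """Append the IP address to the base URL."""
    return BASE_URL + ip_address


def parse_location(json_text):
    """Turn JSON text into a dict and pull out a few fields.

    The field names are assumptions; .get() returns None if one is missing.
    """
    data = json.loads(json_text)
    return {
        "country": data.get("country_code"),
        "latitude": data.get("latitude"),
        "longitude": data.get("longitude"),
    }


def fetch_location(ip_address):
    """Request the URL and parse the response (needs a live service)."""
    with urlopen(build_lookup_url(ip_address), timeout=10) as response:
        return parse_location(response.read().decode("utf-8"))


# Made-up sample response, so the parsing step can be tried offline.
SAMPLE = '{"ip": "203.0.113.5", "country_code": "US", "latitude": 29.65, "longitude": -82.32}'

if __name__ == "__main__":
    print(build_lookup_url("203.0.113.5"))
    print(parse_location(SAMPLE))
```

Once the JSON is parsed, it is just a Python dictionary, so you can loop over it, save it to a CSV, or feed the latitude and longitude to a map.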

Here’s another cool thing Mitchell suggests in chapter 4: Create a free account at The Echo Nest and then explore what you can do in the links on the left side of this page. Then check out the Python libraries for Echo Nest.

Nice people have already written Python “wrappers” for many APIs. However, use caution: Sometimes the publisher changes the API, and code written for the old version might stop working. Sometimes the code needs only a few tweaks, but sometimes it is simply useless.
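One cheap way to notice that an API has changed is to check that the fields you rely on are still in the response before you use them. This is only a sketch of the idea, not something from Mitchell’s book: the function name and all the field names below are hypothetical.

```python
def missing_fields(response_data, expected_fields):
    """Return the expected field names that are absent from the response."""
    return [name for name in expected_fields if name not in response_data]


# Hypothetical field names, for illustration only.
EXPECTED = ["title", "artist", "duration"]

# What the old version of an imaginary API returned.
old_style = {"title": "Song A", "artist": "Band B", "duration": 215}

# The same record after the publisher renamed two fields.
new_style = {"name": "Song A", "artist": "Band B", "length_seconds": 215}

print(missing_fields(old_style, EXPECTED))  # []
print(missing_fields(new_style, EXPECTED))  # ['title', 'duration']
```

If the list comes back non-empty, you know right away that the API (or the wrapper) has moved on, instead of finding out later from a confusing error.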

Don’t despair

Starting to use any API takes some time and effort. The payoff can be great, however, because learning to use the API will usually save you tons of time in the long run, as compared with writing code from scratch.

