Find the data you want to access
If the URL for your data looks like this:
https://data.wprdc.org/datastore/dump/e03a89dd-134a-4ee8-a2bd-62c40aeebc6f
then your resource ID is the last part of that URL:
e03a89dd-134a-4ee8-a2bd-62c40aeebc6f
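If you'd rather not copy the ID by hand, here's a minimal sketch that pulls it out of the URL. The helper name resource_id_from_url is made up for illustration, and it assumes the ID is always the last path segment, as in the example URL above.

```python
from urllib.parse import urlparse

def resource_id_from_url(url):
    """Return the last path segment of a URL (hypothetical helper)."""
    path = urlparse(url).path
    # Drop any trailing slash, then keep whatever follows the final "/".
    return path.rstrip("/").split("/")[-1]

dump_url = "https://data.wprdc.org/datastore/dump/e03a89dd-134a-4ee8-a2bd-62c40aeebc6f"
print(resource_id_from_url(dump_url))
```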
Build your request URL
WPRDC uses DataStore to house its data. You can ask WPRDC to send you data through its API, which means writing your request as a URL in a format the database understands. To make a request, we'll use the datastore_search call. Here's the base of the URL:
https://data.wprdc.org/api/3/action/datastore_search?
Now we need to tell the database which file we're looking for. Add your resource ID:
https://data.wprdc.org/api/3/action/datastore_search?resource_id=e03a89dd-134a-4ee8-a2bd-62c40aeebc6f
Since we're still just getting started, let's limit how much data WPRDC sends us at once. That way we don't accidentally request huge data files, wasting resources for both us and WPRDC. I'll start with a limit of 5 rows of data. Once we know our program is working, we can come back and remove the limit.
https://data.wprdc.org/api/3/action/datastore_search?resource_id=e03a89dd-134a-4ee8-a2bd-62c40aeebc6f&limit=5
There are lots of other helpful tricks you can use when building your URL. For example, you can add filters to reduce the amount of irrelevant data you receive. The datastore_search documentation explains more. For now, just hold on to that URL.
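If you end up building lots of these URLs, you could let Python assemble them for you. This is only a sketch: build_url is a name I made up, and the column name in the filter example is a placeholder, so swap in a real column from your dataset. It assumes filters are passed as a JSON object in the URL, which is how the datastore_search documentation describes them.

```python
import json
from urllib.parse import urlencode

BASE = "https://data.wprdc.org/api/3/action/datastore_search?"

def build_url(resource_id, limit=5, filters=None):
    """Assemble a datastore_search URL (hypothetical helper)."""
    params = {"resource_id": resource_id, "limit": limit}
    if filters:
        # Filters travel as a JSON object mapping column name -> value.
        params["filters"] = json.dumps(filters)
    # urlencode escapes anything that isn't safe to put in a URL.
    return BASE + urlencode(params)

url = build_url("e03a89dd-134a-4ee8-a2bd-62c40aeebc6f")
print(url)

# "some_column" is a placeholder, not a real column in this dataset.
filtered_url = build_url(
    "e03a89dd-134a-4ee8-a2bd-62c40aeebc6f",
    filters={"some_column": "some value"},
)
print(filtered_url)
```

With no filters, this produces the same URL we built by hand above.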
Install Python and a few libraries
pip install requests
pip install pandas
pip install notebook
jupyter notebook
Request your data using Python
import requests
import pandas as pd
url = "https://data.wprdc.org/api/3/action/datastore_search?resource_id=e03a89dd-134a-4ee8-a2bd-62c40aeebc6f&limit=5"
Ask (request) WPRDC to send you that data, then turn it into a readable format (JSON).
resp = requests.get(url).json()
The response contains a bunch of extra metadata that we don't need right now. So let's grab the meaty part of the response ("result") and pull out the actual data ("records"). We'll use pandas to turn that into a nice spreadsheet table (a "DataFrame").
data = pd.DataFrame(resp['result']['records'])
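If something goes wrong (say, a typo in the resource ID), the response won't have a "result" to dig into, and the line above fails with a confusing KeyError. Here's a small sketch of a safer way to pull out the records. The function name extract_records is my own, and it assumes the response carries a "success" flag and, on failure, an "error" entry. The sample response below is made up for the demo.

```python
def extract_records(payload):
    """Return the list of rows from a parsed datastore_search response."""
    # Assumption: a failed call sets "success" to False and adds "error".
    if not payload.get("success"):
        raise RuntimeError(f"API call failed: {payload.get('error')}")
    return payload["result"]["records"]

# A made-up response with the same overall shape, just for demonstration.
sample = {
    "success": True,
    "result": {"records": [{"_id": 1, "name": "a"}, {"_id": 2, "name": "b"}]},
}
rows = extract_records(sample)
print(len(rows), "rows")
```

In the tutorial code, you would pass resp to this function and hand the result to pd.DataFrame.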
data (if you're using a notebook)
print(data.to_string()) (if you're not using a notebook)
A note on API politeness
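One way to stay polite once you want more than a handful of rows is to fetch the data in modest pages with a short pause between requests, instead of removing the limit entirely. The sketch below shows the idea. Here fetch_all and fetch_page are names I made up, and the fake page function stands in for a real request so the example runs without touching WPRDC's servers.

```python
import time

def fetch_all(fetch_page, page_size=100, pause=0.5):
    """Collect rows page by page until a short page comes back."""
    rows = []
    offset = 0
    while True:
        page = fetch_page(limit=page_size, offset=offset)
        rows.extend(page)
        # A page shorter than page_size means we've reached the end.
        if len(page) < page_size:
            return rows
        offset += page_size
        time.sleep(pause)  # give the server a breather between requests

# Stand-in for a real request: serves slices of a made-up 250-row dataset.
fake_rows = list(range(250))

def fake_fetch_page(limit, offset):
    return fake_rows[offset:offset + limit]

all_rows = fetch_all(fake_fetch_page, page_size=100, pause=0)
print(len(all_rows))
```

In a real program, fetch_page would call requests.get with limit and offset added to the datastore_search URL and return the "records" list from the response.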
This tutorial was written in August 2023.