Doorda Knowledge Base

Getting Started on DoordaStats with Python

What is Spine?

With over 100 datasets available in DoordaStats, it may require significant time and effort to understand and create value from them. So, in order to speed up the process and help you merge data easily, we have created `r1_stats_spine`, a base dataset which provides a starting point for you.

`r1_stats_spine` consists of basic details like postcode, outputarea, postcode introduction date, latitude, longitude (to help you to map data) and also total people and households.

 

Importing Spine
Since the `r1_stats_spine` is also available on our hosted platform, acquiring that data is as simple as 4 lines of code.


import pandas as pd projectid = 'doorda-production-207313' query = 'SELECT * FROM DoordaStats.r1_stats_spine' spine = pd.io.gbq.read_gbq(query, projectid)


Mix and Match with the Data You Need
For the purpose of this demonstration, we are comparing the electricity and gas consumption of a region. 

1) `r1_household_gas_consumption`

# Get Electricity Consumption Data
projectid = 'doorda-production-207313'
query = 'SELECT * FROM DoordaStats.r1_household_electricity_consumption'
electricity_consumption = pd.io.gbq.read_gbq(query, projectid)

 

2) `r1_household_electricity_consumption` 

# Get Gas Consumption Data
projectid = 'doorda-production-207313'
query = 'SELECT * FROM DoordaStats.r1_household_gas_consumption'
gas_consumption=pd.io.gbq.read_gbq(query, projectid)

It's important to note that we do not need all the data from `r1_household_gas_consumption` and `r1_household_electricity_consumption` dataset. We only need information on usages: `gas_usage` and `electricity_usage`, in addition to the corresponding postcode which serves as a key for joining the dataset.


# Extracting relevant columns electricity_usage = electricity_consumption[['postcode','electricity_usage']] gas_usage = gas_consumption[['postcode','gas_usage']]


Create a final dataset with all the required metrics by using postcode as the key.

spine = spine.merge(electricity_usage, left_on='postcode', right_on='postcode', how='left')
spine = spine.merge(gas_usage, left_on='postcode', right_on='postcode', how='left')


Calculating Individual Utility Usages

Let's find out utility usage on an individual level.

# Calculate individual electricity usage
individual_electricity_usage = spine['electricity_usage']/spine['total_people']
spine['individual_electricity_usage'] = individual_electricity_usage

# Calculate individual gas usage
individual_gas_usage = spine['gas_usage']/spine['total_people']
spine['individual_gas_usage'] = individual_gas_usage



    Attached Files
    There are no attachments for this article.
    Related Articles
    Listing Accessible Data on BigQuery with Python
    Viewed 134 times since Thu, Aug 23, 2018
    Exporting Data from BigQuery with Python
    Viewed 150 times since Fri, Aug 24, 2018
    Calculating Usage Using Dry Run with Python
    Viewed 124 times since Tue, Aug 28, 2018
    Getting Started on Doorda Host with Python
    Viewed 272 times since Mon, Jun 25, 2018