About the Portal
The new National Grid ESO Data Portal was created to provide a "centralised repository for published ESO data" through a public API backed by a CKAN database. It is currently in Beta, so the layout of the API and its contents may change. During this stage the data streams may also be updated later than their historic counterparts.
About this Wrapper
This module is a Python wrapper around the Data Portal API, providing a more natural way to query data from the National Grid. It has been developed to speed up common requests while still exposing the full capabilities provided through CKAN. If you have any ideas for the module, please feel free to contribute.
The module can be installed using:
pip install NGDataPortal
Getting Started
The module's Wrapper class is the main interface with the API. It can be imported as follows:
from NGDataPortal import Wrapper
N.b. if you haven't already installed the module, you can do so with pip install NGDataPortal
To query a data stream, simply specify its name when the Wrapper class is initialised and then use the .query_API() method. To see which data streams are available, use wrapper.streams, which returns a list of them.
stream = 'embedded-wind-and-solar-forecasts'
wrapper = Wrapper(stream)
df = wrapper.query_API()
df.head()
_id | DATE_GMT | TIME_GMT | SETTLEMENT_DATE | SETTLEMENT_PERIOD | EMBEDDED_WIND_FORECAST | EMBEDDED_WIND_CAPACITY | EMBEDDED_SOLAR_FORECAST | EMBEDDED_SOLAR_CAPACITY |
---|---|---|---|---|---|---|---|---|
1 | 20200120 | 1330 | 2020-01-20T00:00:00 | 27 | 1499 | 6465 | 3635 | 13080 |
2 | 20200120 | 1400 | 2020-01-20T00:00:00 | 28 | 1486 | 6465 | 3243 | 13080 |
3 | 20200120 | 1430 | 2020-01-20T00:00:00 | 29 | 1471 | 6465 | 2594 | 13080 |
4 | 20200120 | 1500 | 2020-01-20T00:00:00 | 30 | 1456 | 6465 | 1787 | 13080 |
5 | 20200120 | 1530 | 2020-01-20T00:00:00 | 31 | 1458 | 6465 | 977 | 13080 |
Filtering for a Date Range
Often you will want to request a specific date range, which can be done in a number of ways. If only start_date is provided, all observations since that date are returned; if only end_date is provided, all observations up to that date are returned. When both are provided, the response covers the period between those dates.
When querying a date range you must also provide dt_col, which tells the API which column to apply the date filter to. Once the API format has stabilised this will be automated within the module.
stream = 'current-balancing-services-use-of-system-bsuos-data'
wrapper = Wrapper(stream=stream)
start_date = '2019-12-20'
end_date = '2019-12-22'
dt_col = 'Settlement Day'
df = wrapper.query_API(start_date=start_date, end_date=end_date, dt_col=dt_col)
df.head()
Settlement Period | Half-hourly Charge | Run Type | Total Daily BSUoS Charge | BSUoS Price (£/MWh Hour) | Settlement Day | _id |
---|---|---|---|---|---|---|
1 | 119,542.669 | II | 5,585,971.58 | 4.89096 | 2019-12-20T00:00:00 | 47667 |
2 | 135,592.386 | II | 5,585,971.58 | 5.40753 | 2019-12-20T00:00:00 | 47668 |
3 | 168,776.958 | II | 5,585,971.58 | 6.79153 | 2019-12-20T00:00:00 | 47669 |
4 | 153,525.796 | II | 5,585,971.58 | 6.21355 | 2019-12-20T00:00:00 | 47670 |
5 | 136,545.346 | II | 5,585,971.58 | 5.63209 | 2019-12-20T00:00:00 | 47671 |
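The three date-range cases described above (start only, end only, both) can be sketched as a small helper that builds a SQL filter clause. This is an illustration only and not the module's actual implementation: `build_date_filter` and `_check_iso_date` are hypothetical names, the use of inclusive comparisons is an assumption, and the `::timestamp` form simply mirrors the SQL string shown in the Fully Extensible Queries section.

```python
from datetime import date


def _check_iso_date(value):
    """Raise ValueError unless value parses as an ISO date such as '2019-12-20'."""
    date.fromisoformat(value)
    return value


def build_date_filter(dt_col=None, start_date=None, end_date=None):
    """Build a WHERE clause for an optional start and/or end date.

    Hypothetical helper for illustration only; not part of NGDataPortal.
    """
    if start_date is None and end_date is None:
        return ""  # no date filtering requested
    if not dt_col:
        raise ValueError("dt_col is required when a date range is given")

    # Double-quote the column name, since names such as "Settlement Day" contain spaces
    col = '"' + dt_col.replace('"', '""') + '"'

    if start_date is not None and end_date is not None:
        start, end = _check_iso_date(start_date), _check_iso_date(end_date)
        return f"WHERE {col} BETWEEN '{start}'::timestamp AND '{end}'::timestamp"
    if start_date is not None:
        # Only a start date: everything from that date onwards (assumed inclusive)
        return f"WHERE {col} >= '{_check_iso_date(start_date)}'::timestamp"
    # Only an end date: everything up to that date (assumed inclusive)
    return f"WHERE {col} <= '{_check_iso_date(end_date)}'::timestamp"


print(build_date_filter("Settlement Day", "2019-12-20", "2019-12-22"))
print(build_date_filter("Settlement Day", start_date="2019-12-20"))
print(build_date_filter("Settlement Day", end_date="2019-12-22"))
```

Validating the dates before interpolating them keeps arbitrary text out of the query string, which matters once such a clause is sent to a public endpoint.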
Fully Extensible Queries
One advantage of the National Grid opting for a CKAN backend is that PostgreSQL queries can be run directly against the API. This is valuable in many applications. For example, when analysing frequency deviation events you can filter for periods when the value goes outside specified limits, significantly reducing the volume of returned data, which would otherwise cover every second.
As an example, we'll explicitly write out the SQL string that is created under the hood when a date range request is carried out.
stream = 'generation-mix-national'
wrapper = Wrapper(stream)
SQL_query = 'SELECT * from "0a168493-5d67-4a26-8344-2fe0a5d4d20b" WHERE "dateTime_from" BETWEEN \'2019-12-30\'::timestamp AND \'2019-12-31\'::timestamp ORDER BY "dateTime_from"'
df = wrapper.query_API(sql=SQL_query)
df.head()
dateTime_from | nuclear_perc | wind_perc | hydro_perc | coal_perc | gas_perc | other_perc | imports_perc | solar_perc | dateTime_to | _id | biomass_perc |
---|---|---|---|---|---|---|---|---|---|---|---|
2019-12-30T00:00:00 | 25.7 | 36.3 | 2.3 | 1.7 | 16.4 | 0.4 | 6.9 | 0 | 2019-12-30T00:30:00 | 95 | 10.3 |
2019-12-30T00:30:00 | 25.9 | 36.8 | 2.3 | 1.4 | 15.8 | 0.5 | 6.9 | 0 | 2019-12-30T01:00:00 | 94 | 10.4 |
2019-12-30T01:00:00 | 26.2 | 36.8 | 2.2 | 1.4 | 15.8 | 0.5 | 6.7 | 0 | 2019-12-30T01:30:00 | 93 | 10.4 |
2019-12-30T01:30:00 | 26.3 | 36.6 | 2.2 | 1.4 | 15.7 | 0.5 | 6.8 | 0 | 2019-12-30T02:00:00 | 92 | 10.5 |
2019-12-30T02:00:00 | 26.1 | 37.2 | 1.9 | 1.4 | 15.6 | 0.5 | 7.1 | 0 | 2019-12-30T02:30:00 | 91 | 10.2 |
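For readers curious about what such a request looks like at the HTTP level, CKAN exposes SQL queries through its `datastore_search_sql` action, which takes the query in a `sql` parameter. The sketch below only builds the request URL and does not send it. Whether the module constructs its requests in exactly this way is not stated in this README, `build_sql_url` is a hypothetical name, and the base URL `https://example.org` is a placeholder for the portal's real address.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_sql_url(base_url, sql):
    """Return the URL for a CKAN datastore_search_sql request carrying the given query.

    Hypothetical helper for illustration only; not part of NGDataPortal.
    """
    endpoint = base_url.rstrip("/") + "/api/3/action/datastore_search_sql"
    # urlencode percent-escapes the quotes, spaces and colons in the SQL string
    return endpoint + "?" + urlencode({"sql": sql})


SQL_query = (
    'SELECT * from "0a168493-5d67-4a26-8344-2fe0a5d4d20b" '
    "WHERE \"dateTime_from\" BETWEEN '2019-12-30'::timestamp AND '2019-12-31'::timestamp "
    'ORDER BY "dateTime_from"'
)

# Placeholder base URL; replace with the Data Portal's address
url = build_sql_url("https://example.org", SQL_query)
print(url)

# Decoding the query string recovers the original SQL unchanged
decoded = parse_qs(urlsplit(url).query)["sql"][0]
print(decoded == SQL_query)
```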