Fetching NSRDB Data#
The fetch_nsrdb_data function accesses the National Solar Radiation Database (NSRDB), hosted by NREL on Amazon Web Services (AWS), through the h5py module. To access large datasets, request an API key from NREL; instructions on how to set up the API key can be found here. Data can be downloaded for any latitude-longitude pair (globally) or for any state-county pair within the US (the state is required because county names repeat across states). While HSDS lets you slice datasets, the function also lets you compute means within ranges. The data is arranged in a dataframe for multiscale analysis, with the temporal indices as tuples, and can be saved as .csv, .txt, .json, or .pkl.
import pandas
from energia.utils.nsrdb import fetch_nsrdb_data
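As a rough illustration of the save formats mentioned above, the sketch below writes a small stand-in dataframe to .csv, .json, and .pkl using standard pandas writers. The dataframe contents and file names are made up for the example; they are assumptions and not output from fetch_nsrdb_data.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Stand-in for a dataframe returned by fetch_nsrdb_data (illustrative values only).
index = pd.date_range("2020-01-01", periods=4, freq="60min", tz="UTC")
weather_data = pd.DataFrame(
    {"ghi": [0.0, 0.0, 12.5, 80.0], "wind_speed": [0.55, 0.25, 0.20, 0.40]},
    index=index,
)

# Write to a temporary directory so the example leaves no files behind in the project.
out_dir = Path(tempfile.mkdtemp())
weather_data.to_csv(out_dir / "weather.csv")
weather_data.to_json(out_dir / "weather.json", orient="split", date_format="iso")
weather_data.to_pickle(out_dir / "weather.pkl")

# Pickle keeps dtypes and the timezone-aware index, so it round-trips exactly.
restored = pd.read_pickle(out_dir / "weather.pkl")
print(restored.equals(weather_data))
```

Pickle is convenient for reloading in Python, while .csv and .json are easier to share with other tools.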
Using coordinates#
Coordinates can be used to download the required data, as shown below. An attrs list can be provided to download specific datasets, such as: air_temperature, clearsky_dhi, clearsky_dni, clearsky_ghi, cloud_type, coordinates, dew_point, dhi, dni, fill_flag, ghi, meta, relative_humidity, solar_zenith_angle, surface_albedo, surface_pressure, time_index, total_precipitable_water, wind_direction, wind_speed.
coordinates, weather_data = fetch_nsrdb_data(
    attrs=['ghi', 'wind_speed'],
    year=2020,
    resolution='hourly',
    lat_lon=(29.56999969482422, -95.05999755859375),
)
weather_data
| | ghi | wind_speed |
|---|---|---|
| 2020-01-01 00:00:00+00:00 | 0.0 | 0.55 |
| 2020-01-01 01:00:00+00:00 | 0.0 | 0.25 |
| 2020-01-01 02:00:00+00:00 | 0.0 | 0.20 |
| 2020-01-01 03:00:00+00:00 | 0.0 | 0.40 |
| 2020-01-01 04:00:00+00:00 | 0.0 | 0.50 |
| ... | ... | ... |
| 2020-12-31 19:00:00+00:00 | 74.5 | 7.35 |
| 2020-12-31 20:00:00+00:00 | 62.0 | 7.35 |
| 2020-12-31 21:00:00+00:00 | 45.0 | 7.55 |
| 2020-12-31 22:00:00+00:00 | 95.5 | 7.05 |
| 2020-12-31 23:00:00+00:00 | 33.5 | 6.80 |
8784 rows × 2 columns
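The row count above is consistent with an hourly series over a leap year (366 days × 24 hours). A minimal sketch of that check follows; expected_rows is a hypothetical helper written for this example and is not part of the library.

```python
import calendar


def expected_rows(year: int, steps_per_day: int = 24) -> int:
    """Number of time steps in a full calendar year at a fixed daily step count."""
    days = 366 if calendar.isleap(year) else 365
    return days * steps_per_day


print(expected_rows(2020))      # 8784 (hourly, leap year)
print(expected_rows(2019))      # 8760 (hourly, non-leap year)
print(expected_rows(2020, 48))  # 17568 (half-hourly, leap year)
```

Comparing len(weather_data) against such a count is a quick way to spot a truncated download.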
In this example, we download hourly weather data for every county in Texas using the fetch_nsrdb_data function. The centroids of each county can be downloaded from here.
county_df = pandas.read_csv('Texas_Counties_Centroid_Map.csv')
county_list = county_df['CNTY_NM']
for county in county_list:
    # Look up the county centroid, fetch its data, and save the dataframe to <county>.csv
    fetch_nsrdb_data(
        attrs=['ghi', 'wind_speed'],
        year=2020,
        resolution='hourly',
        lat_lon=(
            county_df[county_df['CNTY_NM'] == county]['X (Lat)'].values[0],
            county_df[county_df['CNTY_NM'] == county]['Y (Long)'].values[0],
        ),
    )[1].to_csv(f'{county}.csv')
Using Attributes#
fetch_nsrdb_data also lets you search for and fetch data that matches a given specification, e.g. wind data for the collection point at the highest elevation in a county. The full list of specifications is: ‘max-population’, ‘max-elevation’, ‘max-landcover’, ‘min-population’, ‘min-elevation’, and ‘min-landcover’. The state and county need to be specified. Here we download data for the year 2019 for Harris County, Texas, at the collection point with the minimum elevation.
coordinates, weather_data = fetch_nsrdb_data(
    attrs=[
        'dni',
        'dhi',
        'wind_speed',
        'ghi',
        'air_temperature',
        'dew_point',
        'relative_humidity',
        'surface_pressure',
    ],
    year=2019,
    state='Texas',
    county='Harris',
    resolution='hourly',
    get='min-elevation',
)
weather_data
coordinates
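To make the get specifications more concrete, here is a minimal sketch of how a string such as 'min-elevation' could be resolved to a single collection point. The points table, its column names, and the pick_point helper are all hypothetical and only illustrate the idea; the library's actual selection logic is not shown in this document.

```python
import pandas as pd

# Hypothetical metadata for collection points in one county (made-up values).
points = pd.DataFrame(
    {
        "latitude": [29.61, 29.77, 29.93],
        "longitude": [-95.06, -95.38, -95.62],
        "elevation": [3.0, 14.0, 42.0],
        "population": [1200, 54000, 8300],
    }
)


def pick_point(points: pd.DataFrame, get: str) -> pd.Series:
    """Return the row matching a spec such as 'min-elevation' or 'max-population'."""
    mode, column = get.split("-", 1)
    if mode not in ("min", "max"):
        raise ValueError(f"unknown specification: {get!r}")
    # idxmin/idxmax return the index label of the extreme value in the column.
    idx = points[column].idxmin() if mode == "min" else points[column].idxmax()
    return points.loc[idx]


chosen = pick_point(points, "min-elevation")
print((chosen["latitude"], chosen["longitude"]))
```

The selected latitude and longitude could then be used in the same way as the lat_lon argument shown earlier.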
Data can be concatenated to cover longer temporal periods. The example below fetches five consecutive years (2016 to 2020) and drops leap days:
weather_houston = pandas.concat(
    [
        fetch_nsrdb_data(
            attrs=[
                'dni',
                'dhi',
                'wind_speed',
                'ghi',
                'air_temperature',
                'dew_point',
                'relative_humidity',
                'surface_pressure',
            ],
            year=2016 + i,
            state='Texas',
            county='Harris',
            resolution='hourly',
            get='min-elevation',
        )[1]
        for i in range(5)
    ]
)
# Parse the index as UTC timestamps, format it as strings, and drop February 29
weather_houston.index = pandas.to_datetime(weather_houston.index, utc=True)
weather_houston.index = weather_houston.index.strftime('%m/%d/%Y, %r')
weather_houston = weather_houston[~weather_houston.index.str.contains('02/29')]
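As an alternative to matching on formatted strings, leap days can be dropped with a boolean mask on a DatetimeIndex, which keeps the index as timestamps. The sketch below uses a small synthetic dataframe (an assumption for illustration) rather than downloaded data.

```python
import pandas as pd

# Synthetic hourly data spanning Feb 28 to Mar 1 of a leap year (illustrative values).
index = pd.date_range("2020-02-28", "2020-03-01 23:00", freq="60min", tz="UTC")
df = pd.DataFrame({"ghi": range(len(index))}, index=index)

# True for every timestamp that falls on February 29.
is_leap_day = (df.index.month == 2) & (df.index.day == 29)
no_leap = df[~is_leap_day]

print(len(df), len(no_leap))  # 72 48
```

If a string index is still needed afterwards, strftime can be applied to the filtered result.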
Resolutions#
The base resolution is ‘half-hourly’. The ‘hourly’ and ‘daily’ resolutions average the data over their respective time periods.
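The averaging described above can be pictured with pandas resampling. This is a minimal sketch on synthetic half-hourly data, assuming a plain arithmetic mean over each window; how fetch_nsrdb_data aggregates internally is not shown here and may differ.

```python
import pandas as pd

# Two days of synthetic half-hourly values (illustrative only).
index = pd.date_range("2020-01-01", periods=96, freq="30min", tz="UTC")
half_hourly = pd.DataFrame({"ghi": [float(i % 48) for i in range(96)]}, index=index)

# Average each pair of half-hour readings into one hourly value.
hourly = half_hourly.resample("60min").mean()

# Average all 48 half-hour readings of a day into one daily value.
daily = half_hourly.resample("1D").mean()

print(len(hourly), len(daily))  # 48 2
```

Coarser resolutions reduce the number of rows, which can help when the data feeds a model that does not need sub-hourly detail.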