energia.utils.data

Data management utilities

Functions

make_henry_price_df(file_name, year[, stretch])

Makes a DataFrame from data with missing values filled using previous day values.

remove_outliers(data[, sd_cuttoff, mean_range])

Removes outliers beyond a chosen number of standard deviations.

make_henry_price_df(file_name: str, year: int, stretch: bool = False) → DataFrame

Makes a DataFrame from data with missing values filled using previous day values.

Prices are converted from $/MMBtu to $/kg using a factor of 1/22.4. The function requires a full year of daily data (365 days). If stretch is True, the daily values are repeated to expand the timescale from days (365) to hours (8760).

Parameters

file_name : str

Path to the CSV file containing the data.

year : int

Year to import data from.

stretch : bool, optional

If True, stretches the timescale from days to hours. Defaults to False.

Returns

pandas.DataFrame

DataFrame containing varying natural gas prices with missing values filled.
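
The steps described for make_henry_price_df can be illustrated with plain pandas. This is a minimal sketch on synthetic data, not the energia implementation: the column name price_usd_per_kg and the sample prices are hypothetical, and repeating each daily value 24 times is only one plausible reading of how stretch expands 365 values to 8760.

```python
import numpy as np
import pandas as pd

# Hypothetical daily prices in $/MMBtu for a non-leap year (365 days).
days = pd.date_range("2021-01-01", "2021-12-31", freq="D")
prices = pd.Series(np.linspace(2.0, 4.0, len(days)), index=days)
prices.iloc[[10, 11, 200]] = np.nan  # simulate days with no reported price

# Fill each gap with the most recent available (previous day) value.
filled = prices.ffill()

# Convert $/MMBtu to $/kg using the documented factor of 1/22.4.
daily = pd.DataFrame({"price_usd_per_kg": filled.to_numpy() / 22.4})

# Assumed "stretch" behaviour: repeat each daily value 24 times
# so that 365 daily rows become 8760 hourly rows.
hourly = pd.DataFrame(
    {"price_usd_per_kg": np.repeat(daily["price_usd_per_kg"].to_numpy(), 24)}
)

print(len(daily), len(hourly))
```

With a leap year the daily series would have 366 rows, which is presumably why the function requires exactly 365 days of data.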

remove_outliers(data: DataFrame, sd_cuttoff: int = 2, mean_range: int = 1) → DataFrame

Removes outliers beyond a chosen number of standard deviations.

Each outlier is replaced with the mean of the neighboring data points on both sides of it.

Parameters

data : pandas.DataFrame

Input data.

sd_cuttoff : int, optional

Remove data points that are beyond this many standard deviations. Defaults to 2.

mean_range : int, optional

Number of neighboring data points on each side to average over when replacing outliers. Defaults to 1.

Returns

pandas.DataFrame

DataFrame with outliers replaced by local means.
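
The replacement rule can be sketched as follows. The helper replace_outliers_sketch is hypothetical and is not the energia implementation; its parameters loosely mirror the documented ones. It assumes that the cutoff is measured from the column mean using the population standard deviation, and that the neighbor window is simply truncated at the ends of the series.

```python
import numpy as np
import pandas as pd


def replace_outliers_sketch(
    data: pd.DataFrame, sd_cutoff: float = 2, mean_range: int = 1
) -> pd.DataFrame:
    """Hypothetical helper: replace outliers with the mean of their neighbors."""
    result = data.astype(float)  # astype returns a copy, so the input is untouched
    for j in range(data.shape[1]):
        values = data.iloc[:, j].to_numpy(dtype=float)
        mean, std = values.mean(), values.std()
        # Assumption: a point is an outlier if it lies more than
        # sd_cutoff standard deviations away from the column mean.
        is_outlier = np.abs(values - mean) > sd_cutoff * std
        for i in np.flatnonzero(is_outlier):
            lo = max(i - mean_range, 0)
            hi = min(i + mean_range, len(values) - 1)
            # Up to mean_range original points on each side, excluding the point itself.
            neighbors = np.concatenate([values[lo:i], values[i + 1 : hi + 1]])
            result.iloc[i, j] = neighbors.mean()
    return result


data = pd.DataFrame({"price": [1.0, 1.1, 0.9, 1.0, 10.0, 1.0, 1.2, 0.8, 1.0, 1.1]})
cleaned = replace_outliers_sketch(data, sd_cutoff=2, mean_range=1)
print(cleaned["price"].iloc[4])
```

In this sample only the value 10.0 lies beyond two standard deviations, so it is replaced by the mean of its two immediate neighbors. The sketch averages the original neighbor values, so adjacent outliers would still influence each other; a more careful version might exclude them.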