energia.utils.data
Data management utilities
Functions
make_henry_price_df | Makes a DataFrame from data with missing values filled using previous day values.
remove_outliers | Removes outliers up to a chosen number of standard deviations.
- make_henry_price_df(file_name: str, year: int, stretch: bool = False) -> DataFrame
Makes a DataFrame from data with missing values filled using previous day values.
Prices are converted from $/MMBtu to $/kg using a factor of 1/22.4. The function only works if the file holds a full year of daily data (365 days). If stretch is True, the daily values are repeated to expand the timescale from days (365) to hours (8760).
Parameters
- file_name
str Path to the CSV file containing the data.
- year
int Year to import data from.
- stretch
bool, optional If True, stretches the timescale from days to hours. Defaults to False.
Returns
pandas.DataFrame DataFrame containing varying natural gas prices with missing values filled.
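The three steps described above (fill from the previous day, convert units, optionally stretch to hours) can be sketched as follows. This is an illustration only, not the library's implementation: the helper name `fill_convert_stretch` and the `price` column are hypothetical, the input is an in-memory Series rather than a CSV file, and repeating each daily value 24 times in a row is an assumption about how the stretch is ordered.

```python
import numpy as np
import pandas as pd


def fill_convert_stretch(daily: pd.Series, stretch: bool = False) -> pd.DataFrame:
    """Hypothetical helper mirroring the documented steps; not energia code."""
    if len(daily) != 365:
        raise ValueError("expected a full year of daily data (365 values)")
    # Fill each missing value with the previous day's value.
    # A missing value on the very first day has no predecessor and stays NaN.
    filled = daily.ffill()
    # Convert $/MMBtu to $/kg using the documented factor of 1/22.4.
    per_kg = filled.to_numpy() / 22.4
    if stretch:
        # Assumption: each daily value is repeated for 24 consecutive hours,
        # giving 365 * 24 = 8760 hourly values.
        per_kg = np.repeat(per_kg, 24)
    return pd.DataFrame({"price": per_kg})


# Synthetic daily prices in $/MMBtu with two missing days.
prices = pd.Series(np.linspace(2.0, 4.0, 365))
prices.iloc[10] = np.nan
prices.iloc[11] = np.nan

daily_df = fill_convert_stretch(prices)
hourly_df = fill_convert_stretch(prices, stretch=True)
```

Here days 10 and 11 both take the value of day 9, and the stretched frame has 8760 rows.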
- remove_outliers(data: DataFrame, sd_cuttoff: int = 2, mean_range: int = 1) -> DataFrame
Removes outliers up to a chosen number of standard deviations.
Each outlier is replaced with the mean of the neighboring data points on both sides of it.
Parameters
- data
pandas.DataFrame Input data.
- sd_cutoff
int, optional Remove data points that are beyond this many standard deviations. Defaults to 2.
- mean_range
int, optional Number of neighboring data points on each side to average over when replacing outliers. Defaults to 1.
Returns
pandas.DataFrame DataFrame with outliers replaced by local means.
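The replacement rule can be sketched on a single pandas Series. This is a minimal illustration under stated assumptions, not the library's code: the helper name `replace_outliers` is hypothetical, the cutoff is measured from the series mean using the sample standard deviation, and neighbors are taken from the original data (so an adjacent outlier would still enter the average).

```python
import pandas as pd


def replace_outliers(series: pd.Series, sd_cutoff: float = 2, mean_range: int = 1) -> pd.Series:
    """Hypothetical sketch of outlier replacement by a local mean."""
    mean, std = series.mean(), series.std()
    # Assumption: a point is an outlier if it lies more than
    # sd_cutoff standard deviations away from the series mean.
    is_outlier = (series - mean).abs() > sd_cutoff * std
    cleaned = series.astype(float).copy()
    for pos in [i for i, flag in enumerate(is_outlier) if flag]:
        # Take up to mean_range neighbors on each side, clipped at the ends.
        lo = max(pos - mean_range, 0)
        hi = min(pos + mean_range, len(series) - 1)
        neighbors = [series.iloc[j] for j in range(lo, hi + 1) if j != pos]
        cleaned.iloc[pos] = sum(neighbors) / len(neighbors)
    return cleaned


data = pd.Series([1.0, 1.1, 0.9, 1.0, 50.0, 1.2, 1.0, 0.8, 1.1, 1.0])
cleaned = replace_outliers(data)
```

With the default settings, only the value 50.0 at position 4 exceeds the cutoff, and it is replaced by the mean of its two neighbors, 1.0 and 1.2.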