CostDistribution#
- class cost_utils.CostDistribution(df, min_col='min', max_col='max', avg_col='avg', trips_col='trips', weighted_avg_col=None)#
Bases: object
Distribution of cost values between variable bounds.
Alternate constructors are available in the See Also section.
- Parameters:
df (pd.DataFrame) – A DataFrame containing the binned cost distribution data. Must have columns named: min_col, max_col, avg_col, trips_col.
min_col (str) – The name of the column in df that contains the lower bin edge value for each row.
max_col (str) – The name of the column in df that contains the upper bin edge value for each row.
avg_col (str) – The name of the column in df that contains the centre of the bin for each row.
trips_col (str) – The name of the column in df that contains the trip value for each row.
weighted_avg_col (Optional[str]) – The name of the column in df that contains the weighted average value for each row. If available, this differs from avg_col because it takes into account the distribution of values within each bin when calculating averages.
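The expected layout of df can be sketched with pandas. This is a minimal illustration that only builds and checks the table; the values are made up, and the contiguity check (each bin's upper edge meeting the next bin's lower edge) is an assumption rather than a documented requirement.

```python
import pandas as pd

# One row per cost bin, using the default column names from the signature.
df = pd.DataFrame(
    {
        "min": [0.0, 5.0, 10.0],      # lower bin edges (min_col)
        "max": [5.0, 10.0, 20.0],     # upper bin edges (max_col)
        "avg": [2.5, 7.5, 15.0],      # bin centres (avg_col)
        "trips": [120.0, 80.0, 40.0], # trips in each bin (trips_col)
    }
)

# Check that every required column is present before building the object.
required = ["min", "max", "avg", "trips"]
missing = [col for col in required if col not in df.columns]

# Assumed sanity check: bins are contiguous, so each upper edge equals
# the next lower edge.
contiguous = (df["max"].iloc[:-1].to_numpy() == df["min"].iloc[1:].to_numpy()).all()

# With the library installed, the table would then be passed as:
# cost_utils.CostDistribution(df)
```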
See also
from_data, from_data_no_bins, from_file
Attributes Summary
avg_vals – Average values for each of the cost distribution bins.
band_share_vals – Band share values for each of the cost distribution bins.
bin_edges – Bin edges for the cost distribution.
max_vals – Maximum values of the cost distribution bin edges.
min_vals – Minimum values of the cost distribution bin edges.
n_bins – Number of bins in the cost distribution.
trip_vals – Trip values for each of the cost distribution bins.
weighted_avg_vals – Weighted average values for each of the cost distribution bins.
Methods Summary
band_share_convergence(other) – Calculate the convergence between this and other.
band_share_residuals(other) – Calculate the band share residuals between this and other.
calculate_weighted_averages(matrix, ...) – Calculate weighted averages of bins in a cost distribution.
copy() – Create a copy of this instance.
create_similar(trip_vals) – Create a similar cost distribution with different trip values.
from_data(matrix, cost_matrix[, min_bounds, ...]) – Convert values and a cost matrix into a CostDistribution.
from_data_no_bins(matrix, cost_matrix, ...) – Convert values and a cost matrix into a CostDistribution.
from_file(filepath[, min_col, max_col, ...]) – Build an instance from a file on disk.
trip_residuals(other) – Calculate the trip residuals between this and other.
Attributes Documentation
- avg_vals#
Average values for each of the cost distribution bins.
- band_share_vals#
Band share values for each of the cost distribution bins.
- bin_edges#
Bin edges for the cost distribution.
- max_vals#
Maximum values of the cost distribution bin edges.
- min_vals#
Minimum values of the cost distribution bin edges.
- n_bins#
Number of bins in the cost distribution.
- trip_vals#
Trip values for each of the cost distribution bins.
- weighted_avg_vals#
Weighted average values for each of the cost distribution bins.
Methods Documentation
- band_share_convergence(other)#
Calculate the convergence between this and other.
Convergence is calculated as: math_utils.curve_convergence(self.band_share_vals, other.band_share_vals)
- Parameters:
other (CostDistribution) – Another instance of CostDistribution using the same bins.
- Returns:
A float value between 0 and 1. Values closer to 1 indicate a better convergence.
- Return type:
float
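The definition of math_utils.curve_convergence is not given here. As an illustration only, one measure with the described behaviour (a value between 0 and 1, with 1 meaning a perfect match) is an R-squared style score clipped to that range. The function name r_squared_convergence, the choice of which curve acts as the reference, and the handling of a flat reference curve are all assumptions, not the library's implementation.

```python
import numpy as np


def r_squared_convergence(reference, achieved):
    """Hypothetical R-squared style convergence score clipped to [0, 1]."""
    reference = np.asarray(reference, dtype=float)
    achieved = np.asarray(achieved, dtype=float)
    if reference.shape != achieved.shape:
        raise ValueError("both curves must use the same bins")

    # Total variation of the reference curve around its mean.
    spread = np.sum((reference - reference.mean()) ** 2)
    if spread == 0:
        # Flat reference curve: the score is undefined, so report 0 (assumed).
        return 0.0

    # 1 minus the ratio of squared error to total variation.
    score = 1.0 - np.sum((achieved - reference) ** 2) / spread
    return float(np.clip(score, 0.0, 1.0))


perfect = r_squared_convergence([0.5, 0.3, 0.2], [0.5, 0.3, 0.2])
poor = r_squared_convergence([0.5, 0.3, 0.2], [0.2, 0.3, 0.5])
```

Under this sketch, identical band shares score 1.0, and a curve that fits worse than the reference mean is clipped to 0.0.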
- band_share_residuals(other)#
Calculate the band share residuals between this and other.
Residuals are calculated as: self.band_share_vals - other.band_share_vals
- Parameters:
other (CostDistribution) – Another instance of CostDistribution using the same bins.
- Returns:
The residual difference between this and other.
- Return type:
np.ndarray
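The residual formula above can be illustrated with plain NumPy arrays. This sketch assumes that a band share is a bin's trips divided by the total trips across all bins; that definition is not stated in this page, and band_shares is a hypothetical helper.

```python
import numpy as np


def band_shares(trip_vals):
    """Assumed definition: each bin's share of the total trips."""
    trip_vals = np.asarray(trip_vals, dtype=float)
    total = trip_vals.sum()
    if total == 0:
        return np.zeros_like(trip_vals)
    return trip_vals / total


# Two distributions defined over the same three bins.
this_shares = band_shares([50, 30, 20])
other_shares = band_shares([40, 40, 20])

# Mirrors: self.band_share_vals - other.band_share_vals
residuals = this_shares - other_shares
```

Because both sets of shares sum to 1 under this assumption, the residuals sum to 0.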
- static calculate_weighted_averages(matrix, cost_matrix, bin_edges)#
Calculate weighted averages of bins in a cost distribution.
- Parameters:
matrix (np.ndarray) – The matrix to calculate the cost distribution for. Must be the same shape as cost_matrix.
cost_matrix (np.ndarray) – A matrix of costs relating to matrix. Must be the same shape as matrix.
bin_edges (list[float] | np.ndarray) – A monotonically increasing array of bin edges, including the rightmost edge, allowing for non-uniform bin widths. This argument is passed straight into numpy.histogram.
- Returns:
An array to be passed into a dataframe as a column.
- Return type:
np.ndarray
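One way to compute a trip-weighted average cost per bin is with two calls to numpy.histogram. This is a sketch under the assumption that the weighted average of a bin is the sum of cost times trips divided by the sum of trips in that bin; the function name weighted_bin_averages and the NaN result for empty bins are illustrative choices, not the library's behaviour.

```python
import numpy as np


def weighted_bin_averages(matrix, cost_matrix, bin_edges):
    """Trip-weighted mean cost in each bin (NaN where a bin has no trips)."""
    trips = np.asarray(matrix, dtype=float).ravel()
    costs = np.asarray(cost_matrix, dtype=float).ravel()
    if trips.shape != costs.shape:
        raise ValueError("matrix and cost_matrix must be the same shape")

    # Total trips falling in each cost bin.
    trip_totals, _ = np.histogram(costs, bins=bin_edges, weights=trips)
    # Total of cost * trips falling in each cost bin.
    cost_totals, _ = np.histogram(costs, bins=bin_edges, weights=trips * costs)

    # Divide only where a bin contains trips; leave the rest as NaN.
    averages = np.full(trip_totals.shape, np.nan)
    np.divide(cost_totals, trip_totals, out=averages, where=trip_totals > 0)
    return averages


averages = weighted_bin_averages(
    matrix=[[10, 20], [30, 40]],
    cost_matrix=[[1, 3], [6, 8]],
    bin_edges=[0, 5, 10],
)
```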
- copy()#
Create a copy of this instance.
- Return type:
CostDistribution
- create_similar(trip_vals)#
Create a similar cost distribution with different trip values.
- Parameters:
trip_vals (ndarray) – A numpy array of trip values that will replace the current trip values.
- Returns:
A copy of this instance, with different trip values.
- Return type:
CostDistribution
- classmethod from_data(matrix, cost_matrix, min_bounds=None, max_bounds=None, bin_edges=None)#
Convert values and a cost matrix into a CostDistribution.
- Parameters:
matrix (ndarray) – The matrix to calculate the cost distribution for. Must be the same shape as cost_matrix.
cost_matrix (ndarray) – A matrix of costs relating to matrix. Must be the same shape as matrix.
min_bounds (list[float] | ndarray | None) – The lower bound of each distribution band. Corresponds element-wise to max_bounds.
max_bounds (list[float] | ndarray | None) – The upper bound of each distribution band. Corresponds element-wise to min_bounds.
bin_edges (list[float] | ndarray | None) – A monotonically increasing array of bin edges, including the rightmost edge, allowing for non-uniform bin widths. This argument is passed straight into numpy.histogram.
- Returns:
An instance of CostDistribution containing the given data.
- Return type:
CostDistribution
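The binning step can be sketched with numpy.histogram, using the values in matrix as weights. The way min_bounds and max_bounds are combined into a single edge array here (the lower bounds followed by the last upper bound) is an assumption that holds for contiguous bands; how from_data itself reconciles the two forms of input is not described in this page.

```python
import numpy as np

matrix = np.array([[10.0, 20.0], [30.0, 40.0]])      # trips per cell
cost_matrix = np.array([[1.0, 4.0], [7.0, 15.0]])    # cost per cell

# Bands given as paired lower and upper bounds.
min_bounds = np.array([0.0, 5.0, 10.0])
max_bounds = np.array([5.0, 10.0, 20.0])

# Equivalent edge array for contiguous bands (assumed relationship).
bin_edges = np.append(min_bounds, max_bounds[-1])

# Sum the trips whose cost falls inside each band.
trips_per_band, _ = np.histogram(
    cost_matrix.ravel(), bins=bin_edges, weights=matrix.ravel()
)
```

Note that numpy.histogram treats every bin as half-open on the right except the last, which also includes its upper edge.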
- static from_data_no_bins(matrix, cost_matrix, *args, **kwargs)#
Convert values and a cost matrix into a CostDistribution.
Bin edges are generated with create_log_bins.
- Parameters:
matrix (ndarray) – The matrix to calculate the cost distribution for. Must be the same shape as cost_matrix.
cost_matrix (ndarray) – A matrix of costs relating to matrix. Must be the same shape as matrix.
*args – Positional arguments passed through to create_log_bins.
**kwargs – Keyword arguments passed through to create_log_bins.
- Returns:
An instance of CostDistribution containing the given data.
- Return type:
CostDistribution
- static from_file(filepath, min_col='min', max_col='max', avg_col='avg', trips_col='trips', weighted_avg_col=None)#
Build an instance from a file on disk.
- Parameters:
filepath (PathLike) – Path to the file to read in.
min_col (str) – The column of data at filepath that contains the minimum cost value of each band.
max_col (str) – The column of data at filepath that contains the maximum cost value of each band.
avg_col (str) – The column of data at filepath that contains the average cost value of each band.
trips_col (str) – The column of data at filepath that contains the number of trips of each cost band.
weighted_avg_col (str | None) – The column of data at filepath that contains the weighted average cost value of each band. If the data read in does not contain this column, avg_col is used instead.
- Returns:
An instance containing the data at filepath.
- Return type:
CostDistribution
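The fallback described for weighted_avg_col can be sketched as follows. The sketch assumes the file is a CSV readable by pandas.read_csv and uses an in-memory string in place of a file on disk; the file format that from_file actually accepts is not stated here, and the column name "weighted_avg" is hypothetical.

```python
import io

import pandas as pd

# Stand-in for a file on disk; it has no weighted average column.
csv_text = "min,max,avg,trips\n0,5,2.5,120\n5,10,7.5,80\n"
df = pd.read_csv(io.StringIO(csv_text))

# Requested weighted average column (hypothetical name).
weighted_avg_col = "weighted_avg"

# Fall back to the plain average column when the requested one is absent.
if weighted_avg_col not in df.columns:
    weighted_avg_col = "avg"

weighted_avg_vals = df[weighted_avg_col].to_numpy()
```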
- trip_residuals(other)#
Calculate the trip residuals between this and other.
Residuals are calculated as: self.trip_vals - other.trip_vals
- Parameters:
other (CostDistribution) – Another instance of CostDistribution using the same bins.
- Returns:
The residual difference between this and other.
- Return type:
np.ndarray
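The trip residual formula amounts to an element-wise subtraction over matching bins. The arrays below are made-up values standing in for self.trip_vals and other.trip_vals, and the length check is an illustrative way of enforcing the "same bins" requirement, not the library's own validation.

```python
import numpy as np

bin_edges = np.array([0.0, 5.0, 10.0, 20.0])

# Stand-ins for self.trip_vals and other.trip_vals over the same bins.
this_trips = np.array([120.0, 80.0, 40.0])
other_trips = np.array([110.0, 95.0, 35.0])

# Both distributions must have one trip value per bin.
if not (len(this_trips) == len(other_trips) == len(bin_edges) - 1):
    raise ValueError("both distributions must use the same bins")

# Mirrors: self.trip_vals - other.trip_vals
trip_residuals = this_trips - other_trips
```

A positive residual means this distribution has more trips in that bin than other.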