CostDistribution

class cost_utils.CostDistribution(df, min_col='min', max_col='max', avg_col='avg', trips_col='trips', weighted_avg_col=None)

Bases: object

Distribution of cost values between variable bounds.

Alternate constructors are available: from_data, from_data_no_bins, and from_file.

Parameters:
  • df (pd.DataFrame) – A DataFrame containing the binned cost distribution data. Must contain the columns named by min_col, max_col, avg_col, and trips_col.

  • min_col (str) – The name of the column in df that contains the lower bin edge value for each row.

  • max_col (str) – The name of the column in df that contains the upper bin edge value for each row.

  • avg_col (str) – The name of the column in df that contains the centre of the bin for each row.

  • trips_col (str) – The name of the column in df that contains the trip value for each row.

  • weighted_avg_col (Optional[str]) – The name of the column in df that contains the weighted average value for each row. If available, this differs from avg_col in that it takes into account the distribution of values within each bin when calculating averages.
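
As an illustration of the layout described above, the following sketch builds a DataFrame with the default column names. The bin edges and trip counts are made-up values, and using the bin midpoint for the avg column is an assumption of this example rather than a requirement stated by the class.

```python
import pandas as pd

# Made-up bin edges for three cost bands: 0-5, 5-10 and 10-20
edges = [0.0, 5.0, 10.0, 20.0]

df = pd.DataFrame(
    {
        "min": edges[:-1],               # lower edge of each bin
        "max": edges[1:],                # upper edge of each bin
        "trips": [120.0, 60.0, 20.0],    # made-up trips in each bin
    }
)
# Bin centre, used here as the plain (unweighted) average cost
df["avg"] = (df["min"] + df["max"]) / 2

# With the default column names, the constructor call would then be:
#   dist = cost_utils.CostDistribution(df)
```

If the columns in df are named differently, the corresponding *_col arguments would carry those names instead.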

Attributes Summary

avg_vals

Average values for each of the cost distribution bins.

band_share_vals

Band share values for each of the cost distribution bins.

bin_edges

Bin edges for the cost distribution.

max_vals

Maximum values of the cost distribution bin edges.

min_vals

Minimum values of the cost distribution bin edges.

n_bins

Number of bins in the cost distribution.

trip_vals

Trip values for each of the cost distribution bins.

weighted_avg_vals

Weighted average values for each of the cost distribution bins.

Methods Summary

band_share_convergence(other)

Calculate the convergence between this and other.

band_share_residuals(other)

Calculate the band share residuals between this and other.

calculate_weighted_averages(matrix, ...)

Calculate weighted averages of bins in a cost distribution.

copy()

Create a copy of this instance.

create_similar(trip_vals)

Create a similar cost distribution with different trip values.

from_data(matrix, cost_matrix[, min_bounds, ...])

Convert values and a cost matrix into a CostDistribution.

from_data_no_bins(matrix, cost_matrix, ...)

Convert values and a cost matrix into a CostDistribution.

from_file(filepath[, min_col, max_col, ...])

Build an instance from a file on disk.

trip_residuals(other)

Calculate the trip residuals between this and other.

Attributes Documentation

avg_vals

Average values for each of the cost distribution bins.

band_share_vals

Band share values for each of the cost distribution bins.

bin_edges

Bin edges for the cost distribution.

max_vals

Maximum values of the cost distribution bin edges.

min_vals

Minimum values of the cost distribution bin edges.

n_bins

Number of bins in the cost distribution.

trip_vals

Trip values for each of the cost distribution bins.

weighted_avg_vals

Weighted average values for each of the cost distribution bins.

Methods Documentation

band_share_convergence(other)

Calculate the convergence between this and other.

Convergence is calculated as: math_utils.curve_convergence(self.band_share_vals, other.band_share_vals)

Parameters:

other (CostDistribution) – Another instance of CostDistribution using the same bins.

Returns:

A float between 0 and 1. Values closer to 1 indicate better convergence.

Return type:

float
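
This page does not define math_utils.curve_convergence. The sketch below assumes an R-squared style measure clipped at 0, which is consistent with the documented 0 to 1 range but is not confirmed here. The function name curve_convergence_sketch and the sample band shares are hypothetical.

```python
import numpy as np


def curve_convergence_sketch(target: np.ndarray, achieved: np.ndarray) -> float:
    """R-squared style convergence, clipped at 0 (assumed formula)."""
    # Squared error between the two curves
    sq_error = np.sum((achieved - target) ** 2)
    # Total variation of the target about its own mean
    variation = np.sum((target - target.mean()) ** 2)
    return float(max(1 - sq_error / variation, 0.0))


target = np.array([0.5, 0.3, 0.2])      # made-up band shares
close = np.array([0.48, 0.32, 0.20])    # small deviation from target
far = np.array([0.2, 0.3, 0.5])         # reversed distribution

same_score = curve_convergence_sketch(target, target)
close_score = curve_convergence_sketch(target, close)
far_score = curve_convergence_sketch(target, far)
```

Under this assumed formula, identical curves score 1, small deviations score just below 1, and a badly mismatched curve is clipped to 0.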

band_share_residuals(other)

Calculate the band share residuals between this and other.

Residuals are calculated as: self.band_share_vals - other.band_share_vals

Parameters:

other (CostDistribution) – Another instance of CostDistribution using the same bins.

Returns:

The residual difference between this and other.

Return type:

residuals
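
The difference between band share residuals and trip residuals can be sketched with plain NumPy arrays. This assumes that a band share is each bin's trips divided by the total trips, which the page does not state explicitly. The band_shares helper and the trip values are hypothetical.

```python
import numpy as np


def band_shares(trip_vals: np.ndarray) -> np.ndarray:
    """Share of total trips in each bin (assumed definition)."""
    return trip_vals / trip_vals.sum()


observed = np.array([120.0, 60.0, 20.0])
modelled = np.array([100.0, 70.0, 30.0])
doubled = observed * 2  # same shape of distribution, twice the total trips

# Residuals in share space and in trip space
share_residuals = band_shares(observed) - band_shares(modelled)
trip_residuals = observed - modelled

# Scaling the total leaves the shares unchanged but not the trips
share_residuals_scaled = band_shares(observed) - band_shares(doubled)
trip_residuals_scaled = observed - doubled
```

Band share residuals compare the shape of two distributions independently of their totals, whereas trip residuals are also sensitive to the overall number of trips.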

static calculate_weighted_averages(matrix, cost_matrix, bin_edges)

Calculate weighted averages of bins in a cost distribution.

Parameters:
  • matrix (np.ndarray) – The matrix to calculate the cost distribution for. This matrix should be the same shape as cost_matrix.

  • cost_matrix (np.ndarray) – A matrix of costs corresponding to matrix. This matrix should be the same shape as matrix.

  • bin_edges (list[float] | np.ndarray) – Defines a monotonically increasing array of bin edges, including the rightmost edge, allowing for non-uniform bin widths. This argument is passed straight into numpy.histogram.

Returns:

An array of weighted average costs, one per bin, suitable for use as a DataFrame column.

Return type:

np.ndarray
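
One way to compute a trip-weighted average cost per bin with numpy.histogram is sketched below. It is not necessarily how the method is implemented: the function name weighted_bin_averages is hypothetical, and falling back to the bin midpoint for empty bins is a choice made for this sketch.

```python
import numpy as np


def weighted_bin_averages(matrix, cost_matrix, bin_edges):
    """Trip-weighted mean cost per bin; bin midpoint where a bin is empty."""
    matrix = np.asarray(matrix, dtype=float)
    cost_matrix = np.asarray(cost_matrix, dtype=float)
    if matrix.shape != cost_matrix.shape:
        raise ValueError("matrix and cost_matrix must have the same shape")

    # Sum of (trips * cost) falling in each bin
    cost_totals, _ = np.histogram(
        cost_matrix, bins=bin_edges, weights=matrix * cost_matrix
    )
    # Sum of trips falling in each bin
    trip_totals, _ = np.histogram(cost_matrix, bins=bin_edges, weights=matrix)

    # Start from the midpoints so that empty bins keep a sensible value
    edges = np.asarray(bin_edges, dtype=float)
    midpoints = (edges[:-1] + edges[1:]) / 2
    return np.divide(
        cost_totals, trip_totals, out=midpoints.copy(), where=trip_totals > 0
    )


trips = np.array([[10.0, 5.0], [0.0, 20.0]])   # made-up trip matrix
costs = np.array([[2.0, 8.0], [12.0, 4.0]])    # made-up cost matrix
averages = weighted_bin_averages(trips, costs, [0, 5, 10, 20])
```

In the first bin, the 20 trips at cost 4 pull the average above the cost-2 cell, which is the effect that separates a weighted average from the bin centre.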

copy()

Create a copy of this instance.

Return type:

CostDistribution

create_similar(trip_vals)

Create a similar cost distribution with different trip values.

Parameters:

trip_vals (ndarray) – A numpy array of trip values that will replace the current trip values.

Returns:

A copy of this instance, with different trip values.

Return type:

CostDistribution

classmethod from_data(matrix, cost_matrix, min_bounds=None, max_bounds=None, bin_edges=None)

Convert values and a cost matrix into a CostDistribution.

Parameters:
  • matrix (ndarray) – The matrix to calculate the cost distribution for. This matrix should be the same shape as cost_matrix.

  • cost_matrix (ndarray) – A matrix of costs corresponding to matrix. This matrix should be the same shape as matrix.

  • min_bounds (list[float] | ndarray | None) – A list of the lower bounds of each distribution band. Corresponds to max_bounds.

  • max_bounds (list[float] | ndarray | None) – A list of the upper bounds of each distribution band. Corresponds to min_bounds.

  • bin_edges (list[float] | ndarray | None) – Defines a monotonically increasing array of bin edges, including the rightmost edge, allowing for non-uniform bin widths. This argument is passed straight into numpy.histogram.

Returns:

An instance of CostDistribution containing the given data.

Return type:

CostDistribution
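
The relationship between min_bounds, max_bounds and bin_edges, and the binning step itself, can be sketched as follows. The helper resolve_bin_edges is hypothetical, and it assumes that the bands are contiguous, so that the edges are every lower bound plus the final upper bound. How the class itself reconciles these arguments is not documented on this page.

```python
import numpy as np


def resolve_bin_edges(min_bounds=None, max_bounds=None, bin_edges=None):
    """Return bin edges from explicit edges or from paired band bounds."""
    if bin_edges is not None:
        return np.asarray(bin_edges, dtype=float)
    if min_bounds is None or max_bounds is None:
        raise ValueError("Provide bin_edges, or both min_bounds and max_bounds")
    min_bounds = np.asarray(min_bounds, dtype=float)
    max_bounds = np.asarray(max_bounds, dtype=float)
    # Contiguous bands: every lower bound plus the final upper bound
    return np.append(min_bounds, max_bounds[-1])


trips = np.array([[10.0, 5.0], [0.0, 20.0]])   # made-up trip matrix
costs = np.array([[2.0, 8.0], [12.0, 4.0]])    # made-up cost matrix

edges = resolve_bin_edges(min_bounds=[0, 5, 10], max_bounds=[5, 10, 20])
# Trips per cost band: histogram of the costs, weighted by the trip matrix
band_trips, _ = np.histogram(costs, bins=edges, weights=trips)
```

Here the cells with costs 2 and 4 both fall in the first band, so their trips are summed, while the zero-trip cell at cost 12 leaves the last band empty.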

static from_data_no_bins(matrix, cost_matrix, *args, **kwargs)

Convert values and a cost matrix into a CostDistribution.

Bin edges are generated with create_log_bins.

Parameters:
  • matrix (ndarray) – The matrix to calculate the cost distribution for. This matrix should be the same shape as cost_matrix.

  • cost_matrix (ndarray) – A matrix of costs corresponding to matrix. This matrix should be the same shape as matrix.

  • *args – Positional arguments to pass through to create_log_bins.

  • **kwargs – Keyword arguments to pass through to create_log_bins.

Returns:

An instance of CostDistribution containing the given data.

Return type:

CostDistribution
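
The signature and spacing used by create_log_bins are not given on this page. The sketch below only illustrates the general idea of logarithmically spaced bin edges, using a hypothetical helper log_spaced_edges built on numpy.geomspace. The real function may use different parameters and a different spacing rule.

```python
import numpy as np


def log_spaced_edges(max_value: float, n_edges: int = 6) -> np.ndarray:
    """Hypothetical helper: 0, then geometrically spaced edges up to max_value."""
    # geomspace cannot start at 0, so start at cost 1 and prepend 0
    positive = np.geomspace(1.0, max_value, num=n_edges - 1)
    return np.concatenate(([0.0], positive))


edges = log_spaced_edges(max_value=16.0, n_edges=6)
widths = np.diff(edges)  # width of each resulting bin
```

With this kind of spacing the bins are narrow at low costs, where trips tend to concentrate, and progressively wider at high costs.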

static from_file(filepath, min_col='min', max_col='max', avg_col='avg', trips_col='trips', weighted_avg_col=None)

Build an instance from a file on disk.

Parameters:
  • filepath (PathLike) – Path to the file to read in.

  • min_col (str) – The column of data at filepath that contains the minimum cost value of each band.

  • max_col (str) – The column of data at filepath that contains the maximum cost value of each band.

  • avg_col (str) – The column of data at filepath that contains the average cost value of each band.

  • trips_col (str) – The column of data at filepath that contains the number of trips of each cost band.

  • weighted_avg_col (str | None) – The column of data at filepath that contains the weighted average cost value of each band. If the DataFrame read from filepath does not contain this column, avg_col is used instead.

Returns:

An instance containing the data at filepath.

Return type:

CostDistribution
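
The fallback from weighted_avg_col to avg_col can be sketched as follows. The class reads the file into a DataFrame. This example instead uses the standard-library csv module on an in-memory file, purely to stay self-contained. The function read_bands and the sample data are hypothetical.

```python
import csv
import io


def read_bands(handle, min_col="min", max_col="max", avg_col="avg",
               trips_col="trips", weighted_avg_col=None):
    """Read band rows, using avg_col when weighted_avg_col is unavailable."""
    reader = csv.DictReader(handle)
    columns = reader.fieldnames or []
    # Use the weighted average column only if it was named and is present
    has_weighted = weighted_avg_col is not None and weighted_avg_col in columns
    source_col = weighted_avg_col if has_weighted else avg_col

    rows = []
    for row in reader:
        rows.append({
            "min": float(row[min_col]),
            "max": float(row[max_col]),
            "avg": float(row[avg_col]),
            "trips": float(row[trips_col]),
            "weighted_avg": float(row[source_col]),
        })
    return rows


# Made-up file contents with no weighted average column
text = "min,max,avg,trips\n0,5,2.5,120\n5,10,7.5,60\n"
rows = read_bands(io.StringIO(text), weighted_avg_col="weighted_avg")
```

Because the sample data has no weighted_avg column, each row's weighted average falls back to the value in the avg column.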

trip_residuals(other)

Calculate the trip residuals between this and other.

Residuals are calculated as: self.trip_vals - other.trip_vals

Parameters:

other (CostDistribution) – Another instance of CostDistribution using the same bins.

Returns:

The residual difference between this and other.

Return type:

residuals