CostDistribution#

class cost_utils.CostDistribution(df, min_col='min', max_col='max', avg_col='ave', trips_col='trips', weighted_avg_col='weighted_ave', units='km')#

Bases: object

Distribution of cost values between variable bounds.

Parameters:
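  • df (DataFrame) – The DataFrame containing the cost distribution data.

  • min_col (str) – The column of df that contains the minimum cost value of each band.

  • max_col (str) – The column of df that contains the maximum cost value of each band.

  • avg_col (str) – The column of df that contains the average cost value of each band.

  • trips_col (str) – The column of df that contains the number of trips of each cost band.

  • weighted_avg_col (str) – The column of df that contains the weighted average cost value of each band.

  • units (str) – The units of the cost values.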

df#

The raw Pandas DataFrame containing the cost distribution data.

Type:

pandas.core.frame.DataFrame

min_vals#

Minimum values of the cost distribution bin edges.

max_vals#

Maximum values of the cost distribution bin edges.

bin_edges#

Bin edges for the cost distribution.

avg_vals#

Average values for each of the cost distribution bins.

trip_vals#

Trip values for each of the cost distribution bins.

band_share_vals#

Band share values for each of the cost distribution bins.

weighted_avg_vals#

Weighted average values for each of the cost distribution bins.

Attributes Summary

avg_col

avg_vals

Average values for each of the cost distribution bins.

band_share_vals

Band share values for each of the cost distribution bins.

bin_edges

Bin edges for the cost distribution.

max_col

max_vals

Maximum values of the cost distribution bin edges.

min_col

min_vals

Minimum values of the cost distribution bin edges.

n_bins

Number of bins in the cost distribution.

trip_vals

Trip values for each of the cost distribution bins.

trips_col

units

weighted_avg_col

Methods Summary

band_share_convergence(other)

Calculate the convergence between this and other.

band_share_residuals(other)

Calculate the band share residuals between this and other.

calculate_weighted_averages(matrix, ...)

Calculate weighted averages of bins in a cost distribution.

check_df_col_names()

Check the given columns are in the given dataframe.

copy()

Create a copy of this instance.

create_similar(trip_vals)

Create a similar cost distribution with different trip values.

from_data(matrix, cost_matrix[, min_bounds, ...])

Convert values and a cost matrix into a CostDistribution.

from_data_no_bins(matrix, cost_matrix, ...)

Convert values and a cost matrix into a CostDistribution.

from_file(filepath[, min_col, max_col, ...])

Build an instance from a file on disk.

trip_residuals(other)

Calculate the trip residuals between this and other.

Attributes Documentation

avg_col: str = 'ave'#
avg_vals#

Average values for each of the cost distribution bins.

band_share_vals#

Band share values for each of the cost distribution bins.

bin_edges#

Bin edges for the cost distribution.

max_col: str = 'max'#
max_vals#

Maximum values of the cost distribution bin edges.

min_col: str = 'min'#
min_vals#

Minimum values of the cost distribution bin edges.

n_bins#

Number of bins in the cost distribution.

trip_vals#

Trip values for each of the cost distribution bins.

trips_col: str = 'trips'#
units: str = 'km'#
weighted_avg_col: str = 'weighted_ave'#

Methods Documentation

band_share_convergence(other)#

Calculate the convergence between this and other.

Convergence is calculated as: math_utils.curve_convergence(self.band_share_vals, other.band_share_vals)

Parameters:

other (CostDistribution) – Another instance of CostDistribution using the same bins.

Returns:

A float value between 0 and 1. Values closer to 1 indicate a better convergence.

Return type:

convergence
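The exact formula used by math_utils.curve_convergence is not given on this page. As a rough illustration only, the sketch below assumes an R-squared-style measure that is clipped so the result stays between 0 and 1. The function name curve_convergence_sketch and the sample band shares are hypothetical and are not part of the library.

```python
import numpy as np


def curve_convergence_sketch(target: np.ndarray, achieved: np.ndarray) -> float:
    """R-squared-style convergence between two curves (assumed formula)."""
    target = np.asarray(target, dtype=float)
    achieved = np.asarray(achieved, dtype=float)
    if target.shape != achieved.shape:
        raise ValueError("target and achieved must have the same shape")

    # Squared error between the curves, relative to the spread of the target.
    residual_ss = np.sum((achieved - target) ** 2)
    total_ss = np.sum((target - target.mean()) ** 2)
    if total_ss == 0:
        # Flat target curve: only an exact match counts as converged.
        return 1.0 if residual_ss == 0 else 0.0

    # Clip at 0 so the result stays within [0, 1].
    return float(max(1.0 - residual_ss / total_ss, 0.0))


# Made-up band shares for two distributions that use the same bins.
target_shares = np.array([0.5, 0.3, 0.2])
achieved_shares = np.array([0.4, 0.4, 0.2])

perfect = curve_convergence_sketch(target_shares, target_shares)
partial = curve_convergence_sketch(target_shares, achieved_shares)
```

Under this assumed formula, identical curves score 1 and the score falls towards 0 as the band shares diverge, which matches the documented interpretation of the return value.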


band_share_residuals(other)#

Calculate the band share residuals between this and other.

Residuals are calculated as: self.band_share_vals - other.band_share_vals

Parameters:

other (CostDistribution) – Another instance of CostDistribution using the same bins.

Returns:

The residual difference between this and other.

Return type:

residuals
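The subtraction above can be illustrated with plain NumPy arrays. This sketch does not use CostDistribution itself, and it assumes that a band share is each bin's trips as a fraction of the total trips, which this page does not state explicitly. The trip numbers are made up.

```python
import numpy as np

# Trips per bin for two distributions that share the same bins (made-up numbers).
observed_trips = np.array([50.0, 30.0, 20.0])
modelled_trips = np.array([80.0, 80.0, 40.0])

# Assumption: band share is each bin's trips divided by the total trips.
observed_shares = observed_trips / observed_trips.sum()
modelled_shares = modelled_trips / modelled_trips.sum()

# Element-wise difference, giving one residual per bin.
share_residuals = observed_shares - modelled_shares
```

Because both sets of shares sum to 1 under this assumption, the residuals sum to zero even when the total trip counts differ.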

static calculate_weighted_averages(matrix, cost_matrix, bin_edges)#

Calculate weighted averages of bins in a cost distribution.

Parameters:
  • matrix (np.ndarray) – The matrix to calculate the cost distribution for. This matrix should be the same shape as cost_matrix.

  • cost_matrix (np.ndarray) – A matrix of costs corresponding to matrix. This matrix should be the same shape as matrix.

  • bin_edges (list[float] | np.ndarray) – Defines a monotonically increasing array of bin edges, including the rightmost edge, allowing for non-uniform bin widths. This argument is passed straight into numpy.histogram.

Returns:

An array of weighted average costs, one value per bin, suitable for use as a DataFrame column.

Return type:

np.ndarray
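As a minimal sketch of how such a calculation might look, the example below uses numpy.histogram with its weights argument to compute a trip-weighted average cost per bin. The function name weighted_averages_sketch, the handling of empty bins, and the sample matrices are assumptions for illustration; the library's own implementation may differ.

```python
import numpy as np


def weighted_averages_sketch(matrix, cost_matrix, bin_edges):
    """Trip-weighted average cost in each bin (illustrative only)."""
    matrix = np.asarray(matrix, dtype=float)
    cost_matrix = np.asarray(cost_matrix, dtype=float)
    if matrix.shape != cost_matrix.shape:
        raise ValueError("matrix and cost_matrix must have the same shape")

    costs = cost_matrix.ravel()
    trips = matrix.ravel()

    # Total trips falling in each cost bin.
    trips_per_bin, _ = np.histogram(costs, bins=bin_edges, weights=trips)
    # Total of (cost * trips) falling in each cost bin.
    cost_trips_per_bin, _ = np.histogram(costs, bins=bin_edges, weights=costs * trips)

    # Assumption: bins with no trips get an average of 0 rather than NaN.
    return np.divide(
        cost_trips_per_bin,
        trips_per_bin,
        out=np.zeros_like(cost_trips_per_bin),
        where=trips_per_bin > 0,
    )


trips = np.array([[10.0, 20.0], [30.0, 40.0]])
costs = np.array([[1.0, 3.0], [4.0, 8.0]])
weighted_avgs = weighted_averages_sketch(trips, costs, bin_edges=[0, 5, 10])
```

Here the first bin holds the cells with costs 1, 3 and 4, so its weighted average is (10 + 60 + 120) / 60, while the second bin holds only the cell with cost 8.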

check_df_col_names()#

Check the given columns are in the given dataframe.

Return type:

CostDistribution

copy()#

Create a copy of this instance.

Return type:

CostDistribution

create_similar(trip_vals)#

Create a similar cost distribution with different trip values.

Parameters:

trip_vals (ndarray) – A numpy array of trip values that will replace the current trip values.

Returns:

A copy of this instance, with different trip values.

Return type:

cost_distribution

classmethod from_data(matrix, cost_matrix, min_bounds=None, max_bounds=None, bin_edges=None)#

Convert values and a cost matrix into a CostDistribution.

Parameters:
  • matrix (ndarray) – The matrix to calculate the cost distribution for. This matrix should be the same shape as cost_matrix.

  • cost_matrix (ndarray) – A matrix of costs corresponding to matrix. This matrix should be the same shape as matrix.

  • min_bounds (list[float] | ndarray | None) – A list of the minimum bounds of each distribution band. Corresponds to max_bounds.

  • max_bounds (list[float] | ndarray | None) – A list of the maximum bounds of each distribution band. Corresponds to min_bounds.

  • bin_edges (list[float] | ndarray | None) – Defines a monotonically increasing array of bin edges, including the rightmost edge, allowing for non-uniform bin widths. This argument is passed straight into numpy.histogram.

Returns:

An instance of CostDistribution containing the given data.

Return type:

cost_distribution
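The relationship between min_bounds, max_bounds and bin_edges can be sketched with NumPy alone. The example assumes the bands are contiguous, so that each band's maximum equals the next band's minimum; how from_data treats other cases is not described here. All values are made up.

```python
import numpy as np

# Made-up band bounds; each pair (min, max) describes one band.
min_bounds = np.array([0.0, 5.0, 10.0])
max_bounds = np.array([5.0, 10.0, 20.0])

# Assumption: bands are contiguous, so the edges are every minimum
# followed by the final maximum.
if not np.allclose(min_bounds[1:], max_bounds[:-1]):
    raise ValueError("bands are not contiguous")
bin_edges = np.append(min_bounds, max_bounds[-1])

# Trips and their corresponding costs, with matching shapes.
trips = np.array([[10.0, 20.0], [30.0, 40.0]])
costs = np.array([[1.0, 6.0], [12.0, 19.0]])

# Sum the trips that fall into each cost band.
trips_per_bin, _ = np.histogram(costs.ravel(), bins=bin_edges, weights=trips.ravel())
```

Note that numpy.histogram treats every bin as half-open except the last, which also includes its right edge.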


static from_data_no_bins(matrix, cost_matrix, *args, **kwargs)#

Convert values and a cost matrix into a CostDistribution.

create_log_bins will be used to generate some bin edges.

Parameters:
  • matrix (ndarray) – The matrix to calculate the cost distribution for. This matrix should be the same shape as cost_matrix.

  • cost_matrix (ndarray) – A matrix of costs corresponding to matrix. This matrix should be the same shape as matrix.

  • *args – Positional arguments to pass through to create_log_bins.

  • **kwargs – Keyword arguments to pass through to create_log_bins.

Returns:

An instance of CostDistribution containing the given data.

Return type:

cost_distribution
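The signature and behaviour of create_log_bins are not documented on this page. Purely to illustrate what logarithmically spaced bin edges look like, the sketch below uses numpy.geomspace as a stand-in. The helper log_bin_edges_sketch, its parameters, and the leading zero edge are all hypothetical.

```python
import numpy as np


def log_bin_edges_sketch(max_value: float, n_edges: int) -> np.ndarray:
    """Hypothetical stand-in for generating log-spaced bin edges."""
    if n_edges < 3:
        raise ValueError("n_edges must be at least 3")
    # Geometrically spaced edges from 1 up to max_value, with a leading 0
    # so that the first bin starts at zero cost (an assumption).
    return np.concatenate([[0.0], np.geomspace(1.0, max_value, n_edges - 1)])


edges = log_bin_edges_sketch(max_value=100.0, n_edges=4)
```

Edges built this way are narrow at low costs and wide at high costs, and they are monotonically increasing, so they can be passed to numpy.histogram as its bins argument.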


static from_file(filepath, min_col='min', max_col='max', avg_col='ave', trips_col='trips', weighted_avg_col='weighted_ave')#

Build an instance from a file on disk.

Parameters:
  • filepath (PathLike) – Path to the file to read in.

  • min_col (str) – The column of data at filepath that contains the minimum cost value of each band.

  • max_col (str) – The column of data at filepath that contains the maximum cost value of each band.

  • avg_col (str) – The column of data at filepath that contains the average cost value of each band.

  • trips_col (str) – The column of data at filepath that contains the number of trips of each cost band.

  • weighted_avg_col (str) – The column of data at filepath that contains the weighted average cost value of each band. If the DataFrame read in does not contain this column, avg_col is used instead.

Returns:

An instance containing the data at filepath.

Return type:

cost_distribution
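The fallback described for weighted_avg_col can be sketched with pandas alone. The example assumes a CSV input, since the supported file formats are not stated here, and reads from an in-memory string instead of a real path. The variable names and data are illustrative.

```python
import io

import pandas as pd

# Made-up file contents with no weighted average column.
csv_text = "min,max,ave,trips\n0,5,2.5,60\n5,10,7.5,40\n"
df = pd.read_csv(io.StringIO(csv_text))

avg_col = "ave"
weighted_avg_col = "weighted_ave"

# Fall back to the plain average column when the weighted one is missing.
used_col = weighted_avg_col if weighted_avg_col in df.columns else avg_col
weighted_avgs = df[used_col].to_numpy()
```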

trip_residuals(other)#

Calculate the trip residuals between this and other.

Residuals are calculated as: self.trip_vals - other.trip_vals

Parameters:

other (CostDistribution) – Another instance of CostDistribution using the same bins.

Returns:

The residual difference between this and other.

Return type:

residuals