bulwark.checks

Each function in this module should:

  • take a pd.DataFrame as its first argument, with optional additional arguments,

  • make an assertion about the pd.DataFrame, and

  • return the original, unaltered pd.DataFrame
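The three rules above can be illustrated with a minimal sketch. The function has_no_negative_prices and the "price" column are hypothetical and are not part of bulwark; the sketch only assumes that a failed check surfaces as an AssertionError.

```python
import pandas as pd


def has_no_negative_prices(df, column="price"):
    """Hypothetical check (not part of bulwark) that follows the three rules."""
    # Rule 1: the DataFrame is the first argument; `column` is an optional extra.
    negative = df[column] < 0
    # Rule 2: make an assertion about the DataFrame.
    if negative.any():
        raise AssertionError(
            f"{int(negative.sum())} negative value(s) found in {column!r}"
        )
    # Rule 3: return the original object, unaltered.
    return df


df = pd.DataFrame({"price": [1.5, 2.0, 0.0]})
result = has_no_negative_prices(df)
```

Because the same DataFrame comes back, a check written this way can be dropped into a df.pipe(...) chain without changing the data that flows through it.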

Functions

custom_check(df, check_func, *args, **kwargs)

Asserts that check_func(df, *args, **kwargs) is true.
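One way to read this description is sketched below with a stand-in written in plain pandas. The names custom_check_sketch and has_at_least_n_rows are hypothetical, and the truthiness test is an assumption taken from the one-line summary; the real function may instead rely on check_func raising its own AssertionError.

```python
import pandas as pd


def custom_check_sketch(df, check_func, *args, **kwargs):
    """Hypothetical stand-in, not bulwark's implementation."""
    # Assumption: a falsy return value from check_func counts as a failure.
    if not check_func(df, *args, **kwargs):
        raise AssertionError(f"{check_func.__name__} is not true.")
    return df


def has_at_least_n_rows(df, n):
    """Hypothetical user-supplied predicate."""
    return len(df) >= n


df = pd.DataFrame({"a": [1, 2, 3]})
checked = custom_check_sketch(df, has_at_least_n_rows, n=2)
```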

has_columns(df, columns[, exact_cols, …])

Asserts that df has the given columns.

has_dtypes(df, items)

Asserts that df has the given dtypes.

has_no_infs(df[, columns])

Asserts that there are no np.infs in df.

has_no_nans(df[, columns])

Asserts that there are no np.nans in df.

has_no_neg_infs(df[, columns])

Asserts that there are no -np.infs in df.
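Judging by their names, has_no_infs and has_no_neg_infs target positive and negative infinity respectively. The sketch below uses plain pandas and NumPy (not bulwark) to show that the two values are distinct and can be counted separately; the DataFrame is made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1.0, np.inf, 3.0], "b": [-np.inf, 0.0, 2.0]})

# Count each kind of infinity per column by direct comparison.
pos_inf_counts = (df == np.inf).sum().to_dict()
neg_inf_counts = (df == -np.inf).sum().to_dict()
```

Here column "a" holds one positive infinity and column "b" holds one negative infinity, so a check for one kind would not flag the other.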

has_no_nones(df[, columns])

Asserts that there are no Nones in df.

has_no_x(df[, values, columns])

Asserts that there are no user-specified values in df’s columns.

has_set_within_vals(df, items)

Asserts that all given values are found in columns’ values.

has_unique_index(df)

Asserts that df’s index is unique.

is_monotonic(df[, items, increasing, strict])

Asserts that the df is monotonic.
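The increasing and strict options suggest the usual distinction between strict and non-strict monotonicity. A minimal sketch in plain pandas (not bulwark's implementation) of what that distinction means for a single column:

```python
import pandas as pd

s = pd.Series([1, 2, 2, 3])

# Non-strict: each value is greater than or equal to the previous one.
non_strict_increasing = bool(s.is_monotonic_increasing)

# Strict: each value is greater than the previous one, so ties fail.
strict_increasing = bool((s.diff().dropna() > 0).all())
```

The repeated 2 keeps the series non-strictly increasing but breaks strict monotonicity.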

is_same_as(df, df_to_compare, **kwargs)

Asserts that two pd.DataFrames are equal.

is_shape(df, shape)

Asserts that df is of a known row x column shape.

multi_check(df, checks[, warn])

Asserts that all checks pass.

one_to_many(df, unitcol, manycol)

Asserts that a many-to-one relationship is preserved between two columns.
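A sketch of what such a relationship looks like in data, using plain pandas. It assumes the check means that every value in manycol maps to exactly one value in unitcol; the helper find_many_with_multiple_units and the department/employee columns are hypothetical.

```python
import pandas as pd


def find_many_with_multiple_units(df, unitcol, manycol):
    """Hypothetical helper: list manycol values tied to more than one unitcol value."""
    units_per_many = df.groupby(manycol)[unitcol].nunique()
    return units_per_many[units_per_many > 1].index.tolist()


# Each employee belongs to a single department: the relationship holds.
staff = pd.DataFrame(
    {"department": ["toys", "toys", "garden"], "employee": ["ann", "bob", "cy"]}
)
violations_ok = find_many_with_multiple_units(staff, "department", "employee")

# "ann" appears under two departments: the relationship is broken.
staff_bad = pd.DataFrame(
    {"department": ["toys", "garden"], "employee": ["ann", "ann"]}
)
violations_bad = find_many_with_multiple_units(staff_bad, "department", "employee")
```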

unique(df[, columns])

Asserts that columns in df only have unique values.

has_vals_within_n_std(df[, n])

Asserts that every value is within n standard deviations of its column’s mean.
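The condition can be written out in plain pandas as below. This is a sketch rather than bulwark's code: the helper outside_n_std is hypothetical, and it assumes the sample standard deviation that pandas computes by default (ddof=1).

```python
import pandas as pd


def outside_n_std(df, n=3):
    """Hypothetical helper: boolean mask of values more than n std devs from the column mean."""
    deviation = (df - df.mean()).abs()
    return deviation > n * df.std()


df = pd.DataFrame({"x": [10, 11, 9, 10, 12, 10, 50]})
mask_n2 = outside_n_std(df, n=2)
mask_n3 = outside_n_std(df, n=3)
```

In this made-up column the mean is 16 and the standard deviation is about 15, so the value 50 lies outside 2 standard deviations but inside 3.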

has_vals_within_range(df[, items])

Asserts that df’s values fall within a given range.

has_vals_within_set(df[, items])

Asserts that df’s values are a subset of items.