bulwark.checks

Each function in this module should:

  • take a pd.DataFrame as its first argument, with optional additional arguments,
  • make an assertion about the pd.DataFrame, and
  • return the original, unaltered pd.DataFrame
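
The contract above can be illustrated with a minimal sketch. The function has_no_negatives below is hypothetical and is not part of bulwark; it only shows the shape a check is expected to have: DataFrame first, an assertion in the middle, the same DataFrame returned at the end.

```python
import pandas as pd


def has_no_negatives(df, columns=None):
    """Hypothetical check (not part of bulwark): assert no negative values."""
    # Default to every column when none are specified.
    cols = list(df.columns) if columns is None else list(columns)

    # Make an assertion about the DataFrame...
    negatives = (df[cols] < 0).any()
    assert not negatives.any(), (
        f"Negative values found in: {list(negatives[negatives].index)}"
    )

    # ...and return the original, unaltered DataFrame.
    return df


df = pd.DataFrame({"a": [1, 2, 3], "b": [0.5, 1.5, 2.5]})

# Because each check returns its input, checks can be chained with pipe.
result = df.pipe(has_no_negatives).pipe(has_no_negatives, columns=["a"])
```

Returning the input unchanged is what lets a check sit inside a pd.DataFrame.pipe chain without affecting the surrounding steps.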

Functions

custom_check(check_func, df, *args, **kwargs) Asserts that check_func(df, *args, **kwargs) is true.
has_columns(df, columns[, exact_cols, …]) Asserts that df has columns.
has_dtypes(df, items) Asserts that df has dtypes.
has_no_infs(df[, columns]) Asserts that there are no np.infs in df.
has_no_nans(df[, columns]) Asserts that there are no np.nans in df.
has_no_neg_infs(df[, columns]) Asserts that there are no -np.infs in df.
has_no_nones(df[, columns]) Asserts that there are no Nones in df.
has_no_x(df[, values, columns]) Asserts that there are no user-specified values in df’s columns.
has_set_within_vals(df, items) Asserts that all given values are found in columns’ values.
has_unique_index(df) Asserts that df’s index is unique.
is_monotonic(df[, items, increasing, strict]) Asserts that the df is monotonic.
is_same_as(df, df_to_compare, **kwargs) Asserts that two pd.DataFrames are equal.
is_shape(df, shape) Asserts that df is of a known row x column shape.
multi_check(df, checks[, warn]) Asserts that all checks pass.
one_to_many(df, unitcol, manycol) Asserts that a many-to-one relationship is preserved between two columns.
unique(df[, columns]) Asserts that columns in df only have unique values.
has_vals_within_n_std(df[, n]) Asserts that every value is within n standard deviations of its column’s mean.
has_vals_within_range(df[, items]) Asserts that df’s values are within a range.
has_vals_within_set(df[, items]) Asserts that df’s values are a subset of items.