bulwark package
Submodules
bulwark.checks module
Each function in this module should:
- take a pd.DataFrame as its first argument, with optional additional arguments,
- make an assertion about the pd.DataFrame, and
- return the original, unaltered pd.DataFrame
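The contract above can be sketched with plain pandas. The function name `has_positive_prices` and the `price` column are hypothetical, chosen only for illustration; the point is that a check which returns its input unchanged can sit in the middle of a method chain.

```python
import pandas as pd


def has_positive_prices(df, column="price"):
    """Hypothetical check: assert every value in `column` is positive."""
    assert (df[column] > 0).all(), f"non-positive values found in {column!r}"
    return df  # hand back the original, unaltered DataFrame


df = pd.DataFrame({"price": [9.99, 4.50, 12.00]})

# Because the check returns its input, it slots into a method chain.
result = df.pipe(has_positive_prices).assign(total=lambda d: d["price"] * 2)
```

If the assertion fails, the chain stops with an AssertionError before any later step runs.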
- bulwark.checks.custom_check(check_func, df, *args, **kwargs)
Assert that check_func(df, *args, **kwargs) is true.
Parameters: - df (pd.DataFrame) – Any pd.DataFrame.
- check_func (function) – A function taking df, *args, and **kwargs. Should raise AssertionError if the check does not pass.
Returns: Original df.
- bulwark.checks.has_columns(df, columns, exact_cols=False, exact_order=False)
Asserts that df has columns.
Parameters: Returns: Original df.
- bulwark.checks.has_dtypes(df, items)
Asserts that df has dtypes.
Parameters: - df (pd.DataFrame) – Any pd.DataFrame.
- items (dict) – Mapping of columns to dtype.
Returns: Original df.
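To make the shape of the items mapping concrete, here is a minimal pandas-only sketch of a dtype check. `check_dtypes` is a stand-in written for this example, not bulwark's implementation, and the column names are made up.

```python
import pandas as pd

df = pd.DataFrame({
    "id": pd.Series([1, 2, 3], dtype="int64"),
    "score": pd.Series([0.5, 0.7, 0.9], dtype="float64"),
})
items = {"id": "int64", "score": "float64"}  # column -> expected dtype


def check_dtypes(df, items):
    """Stand-in for illustration; not bulwark's implementation."""
    for column, dtype in items.items():
        actual = df[column].dtype
        assert actual == dtype, f"{column!r} has dtype {actual}, expected {dtype}"
    return df


checked = check_dtypes(df, items)
```

With bulwark installed, the documented signature suggests the equivalent call would be has_dtypes(df, items).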
- bulwark.checks.has_no_infs(df, columns=None)
Asserts that there are no np.infs in df.
This is a convenience wrapper for has_no_x.
Parameters: - df (pd.DataFrame) – Any pd.DataFrame.
- columns (list) – A subset of columns to check for np.infs.
Returns: Original df.
- bulwark.checks.has_no_nans(df, columns=None)
Asserts that there are no np.nans in df.
This is a convenience wrapper for has_no_x.
Parameters: - df (pd.DataFrame) – Any pd.DataFrame.
- columns (list) – A subset of columns to check for np.nans.
Returns: Original df.
- bulwark.checks.has_no_neg_infs(df, columns=None)
Asserts that there are no -np.infs in df.
This is a convenience wrapper for has_no_x.
Parameters: - df (pd.DataFrame) – Any pd.DataFrame.
- columns (list) – A subset of columns to check for -np.infs.
Returns: Original df.
- bulwark.checks.has_no_nones(df, columns=None)
Asserts that there are no Nones in df.
This is a convenience wrapper for has_no_x.
Parameters: - df (pd.DataFrame) – Any pd.DataFrame.
- columns (list) – A subset of columns to check for Nones.
Returns: Original df.
- bulwark.checks.has_no_x(df, values=None, columns=None)
Asserts that there are no user-specified values in df’s columns.
Parameters: Returns: Original df.
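As a rough illustration of what "no user-specified values" means, the sketch below rejects a sentinel value with plain pandas. `check_no_values` is a stand-in written for this example and only mirrors the (df, values, columns) signature shown above; the sentinel -999 and the column names are assumptions.

```python
import pandas as pd

df = pd.DataFrame({"age": [34, -999, 51], "height": [180, 165, 172]})


def check_no_values(df, values, columns=None):
    """Stand-in for illustration; not bulwark's implementation."""
    subset = df if columns is None else df[columns]
    found = subset.isin(values)   # True wherever a forbidden value appears
    bad = found.any()             # per column: was there any hit?
    assert not bad.any(), f"forbidden values in {list(bad[bad].index)}"
    return df


# Restricting the check to "height" passes; the sentinel lives in "age".
checked = check_no_values(df, values=[-999], columns=["height"])
```

Calling the same function without `columns` raises an AssertionError naming the `age` column.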
- bulwark.checks.has_set_within_vals(df, items)
Asserts that all given values are found in columns’ values.
In other words, the given values in the items dict should all be a subset of the values found in the associated column in df.
Parameters: - df (pd.DataFrame) – Any pd.DataFrame.
- items (dict) – Mapping of columns to values expected to be found within them.
Returns: Original df.
Examples
The following check will pass, since df['a'] contains each of 1 and 2:
>>> import pandas as pd
>>> import bulwark.checks as ck
>>> df = pd.DataFrame({'a': [1, 2, 3], 'b': ['a', 'b', 'c']})
>>> ck.has_set_within_vals(df, items={"a": [1, 2]})
The following check will fail, since df['b'] doesn't contain each of "a" and "d":
>>> df = pd.DataFrame({'a': [1, 2, 3], 'b': ['a', 'b', 'c']})
>>> ck.has_set_within_vals(df, items={"a": [1, 2], "b": ["a", "d"]})
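The subset rule behind these two examples can be spelled out with Python sets. The helper `missing_values` is hypothetical and exists only to show which expected values are absent; it is not part of bulwark.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": ["a", "b", "c"]})


def missing_values(df, items):
    """Return, per column, the expected values that do not occur in it."""
    return {
        column: sorted(set(values) - set(df[column]))
        for column, values in items.items()
        if not set(values) <= set(df[column])
    }


passing = missing_values(df, {"a": [1, 2]})
failing = missing_values(df, {"a": [1, 2], "b": ["a", "d"]})
```

An empty result corresponds to a passing check; here `failing` reports that "d" is missing from column b.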
- bulwark.checks.has_unique_index(df)
Asserts that df’s index is unique.
Parameters: - df (pd.DataFrame) – Any pd.DataFrame.
Returns: Original df.
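For reference, pandas itself exposes the property such a check can rely on. The snippet below is a pandas-only sketch showing a unique and a non-unique index; the frame names are made up.

```python
import pandas as pd

unique_idx = pd.DataFrame({"x": [1, 2, 3]}, index=["a", "b", "c"])
dup_idx = pd.DataFrame({"x": [1, 2, 3]}, index=["a", "b", "a"])

# Index.is_unique is True only when no label repeats.
# Index.duplicated() marks the second and later occurrences of a label.
dup_labels = dup_idx.index[dup_idx.index.duplicated()].tolist()
```

Listing the duplicated labels is often more useful in an error message than a bare pass/fail.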
- bulwark.checks.is_monotonic(df, items=None, increasing=None, strict=False)
Asserts that the df is monotonic.
Parameters: Returns: Original df.
- bulwark.checks.is_same_as(df, df_to_compare, **kwargs)
Asserts that two pd.DataFrames are equal.
Parameters: - df (pd.DataFrame) – Any pd.DataFrame.
- df_to_compare (pd.DataFrame) – A second pd.DataFrame.
- **kwargs (dict) – Keyword arguments passed through to pandas’ assert_frame_equal.
Returns: Original df.
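Since the keyword arguments are forwarded to pandas' assert_frame_equal, it helps to see what one of them changes. The sketch below calls pandas directly rather than bulwark; `check_dtype` is a real assert_frame_equal parameter, while the frames are made up.

```python
import pandas as pd
from pandas.testing import assert_frame_equal

ints = pd.DataFrame({"n": [1, 2, 3]})
floats = pd.DataFrame({"n": [1.0, 2.0, 3.0]})

# Same values, different dtypes: the default comparison rejects this pair.
try:
    assert_frame_equal(ints, floats)
    strict_equal = True
except AssertionError:
    strict_equal = False

# check_dtype=False relaxes the comparison to values only, so this passes.
assert_frame_equal(ints, floats, check_dtype=False)
```

Given the signature above, passing check_dtype=False to is_same_as would presumably have the same effect.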
- bulwark.checks.is_shape(df, shape)
Asserts that df is of a known row x column shape.
Parameters: - df (pd.DataFrame) – Any pd.DataFrame.
- shape (tuple) – Shape of df as (n_rows, n_columns). Use None or -1 if you don’t care about a specific dimension.
Returns: Original df.
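The wildcard rule for shape (None or -1 meaning "any size") can be sketched as follows. `shape_matches` is a stand-in written for this example, not bulwark's implementation.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})  # 3 rows, 2 columns


def shape_matches(df, shape):
    """Stand-in for illustration: None or -1 matches any size on that axis."""
    return all(
        expected in (None, -1) or expected == actual
        for expected, actual in zip(shape, df.shape)
    )


exact = shape_matches(df, (3, 2))
any_rows = shape_matches(df, (None, 2))
any_cols = shape_matches(df, (3, -1))
wrong = shape_matches(df, (2, 2))
```

Only the last call returns False, because the row count is pinned to 2 while the frame has 3 rows.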
- bulwark.checks.multi_check(df, checks, warn=False)
Asserts that all checks pass.
Parameters: Returns: Original df.
- bulwark.checks.one_to_many(df, unitcol, manycol)
Asserts that a many-to-one relationship is preserved between two columns.
For example, a retail store will have distinct departments, each with several employees. If each employee may only work in a single department, then the relationship of the department to the employees is one to many.
Parameters: Returns: Original df.
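The department and employee example can be checked with a pandas groupby. This sketch assumes the department column plays the role of unitcol and the employee column the role of manycol; the data and variable names are made up, and this is not bulwark's implementation.

```python
import pandas as pd

staff = pd.DataFrame({
    "department": ["toys", "toys", "garden", "garden"],
    "employee": ["ana", "ben", "cho", "dev"],
})

# Count the distinct departments recorded for each employee.
departments_per_employee = staff.groupby("employee")["department"].nunique()
holds = bool((departments_per_employee == 1).all())

# Recording "ana" under a second department breaks the relationship.
extra_row = pd.DataFrame({"department": ["garden"], "employee": ["ana"]})
broken = pd.concat([staff, extra_row], ignore_index=True)
counts = broken.groupby("employee")["department"].nunique()
violators = counts[counts > 1].index.tolist()
```

The relationship holds for `staff` and fails for `broken`, where `violators` names the employee listed under two departments.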
- bulwark.checks.unique(df, columns=None)
Asserts that columns in df only have unique values.
Parameters: - df (pd.DataFrame) – Any pd.DataFrame.
- columns (list) – A subset of columns to check for uniqueness of row values.
Returns: Original df.
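Reading the description as a per-column test (an assumption; the page does not say whether uniqueness is judged per column or across column combinations), the idea can be sketched with pandas' Series.is_unique. The column names are made up.

```python
import pandas as pd

df = pd.DataFrame({
    "order_id": [101, 102, 103],
    "customer": ["ana", "ben", "ana"],
})
columns = ["order_id", "customer"]

# Series.is_unique is True when no value repeats within the column.
repeated = [column for column in columns if not df[column].is_unique]
```

Here only the customer column contains a repeated value, so a per-column uniqueness check on both columns would fail.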
- bulwark.checks.within_n_std(df, n=3)
Asserts that every value is within n standard deviations of its column’s mean.
Parameters: - df (pd.DataFrame) – Any pd.DataFrame.
- n (int) – Number of standard deviations from the mean.
Returns: Original df.
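A small worked example shows how the n-standard-deviation rule singles out an extreme value. This is a pandas-only sketch; it uses the sample standard deviation (the pandas default, ddof=1), and whether bulwark uses the same convention is not stated here.

```python
import pandas as pd

# Nineteen ordinary readings and one extreme value.
df = pd.DataFrame({"x": [10.0] * 19 + [100.0]})
n = 3

means = df.mean()
stds = df.std()  # sample standard deviation (ddof=1), the pandas default
inside = (df - means).abs() <= n * stds
outliers = df[~inside.all(axis=1)]
```

The column mean is 14.5 and the sample standard deviation is about 20.1, so the value 100.0 lies more than three standard deviations away and is the only row in `outliers`.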
bulwark.decorators module
- bulwark.decorators.HasColumns
alias of bulwark.decorators.decorator_factory.<locals>.decorator_name
- bulwark.decorators.HasDtypes
alias of bulwark.decorators.decorator_factory.<locals>.decorator_name
- bulwark.decorators.HasNoInfs
alias of bulwark.decorators.decorator_factory.<locals>.decorator_name
- bulwark.decorators.HasNoNans
alias of bulwark.decorators.decorator_factory.<locals>.decorator_name
- bulwark.decorators.HasNoNegInfs
alias of bulwark.decorators.decorator_factory.<locals>.decorator_name
- bulwark.decorators.HasNoNones
alias of bulwark.decorators.decorator_factory.<locals>.decorator_name
- bulwark.decorators.HasNoX
alias of bulwark.decorators.decorator_factory.<locals>.decorator_name
- bulwark.decorators.HasSetWithinVals
alias of bulwark.decorators.decorator_factory.<locals>.decorator_name
- bulwark.decorators.HasUniqueIndex
alias of bulwark.decorators.decorator_factory.<locals>.decorator_name
- bulwark.decorators.IsMonotonic
alias of bulwark.decorators.decorator_factory.<locals>.decorator_name
- bulwark.decorators.IsSameAs
alias of bulwark.decorators.decorator_factory.<locals>.decorator_name
- bulwark.decorators.IsShape
alias of bulwark.decorators.decorator_factory.<locals>.decorator_name
- bulwark.decorators.MultiCheck
alias of bulwark.decorators.decorator_factory.<locals>.decorator_name
- bulwark.decorators.OneToMany
alias of bulwark.decorators.decorator_factory.<locals>.decorator_name
- bulwark.decorators.Unique
alias of bulwark.decorators.decorator_factory.<locals>.decorator_name
- bulwark.decorators.WithinNStd
alias of bulwark.decorators.decorator_factory.<locals>.decorator_name
- bulwark.decorators.WithinRange
alias of bulwark.decorators.decorator_factory.<locals>.decorator_name
- bulwark.decorators.WithinSet
alias of bulwark.decorators.decorator_factory.<locals>.decorator_name
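The entries above only say that each decorator is produced by a factory. As a rough idea of how a check function can be turned into a decorator, here is a minimal sketch. Everything in it is hypothetical: `decorator_from_check`, the local `is_shape` and the local `IsShape` are stand-ins, and the assumption that the decorator validates the decorated function's return value is not confirmed by this page.

```python
import functools

import pandas as pd


def is_shape(df, shape):
    """Stand-in check with a (df, shape) signature."""
    assert df.shape == shape, f"expected shape {shape}, got {df.shape}"
    return df


def decorator_from_check(check_func):
    """Hypothetical factory: turn a check function into a decorator."""
    def decorator(*check_args, **check_kwargs):
        def wrap(func):
            @functools.wraps(func)
            def wrapper(*args, **kwargs):
                result = func(*args, **kwargs)
                # Validate the DataFrame the decorated function produced.
                check_func(result, *check_args, **check_kwargs)
                return result
            return wrapper
        return wrap
    return decorator


IsShape = decorator_from_check(is_shape)  # local stand-in, not bulwark.decorators.IsShape


@IsShape((2, 2))
def load():
    return pd.DataFrame({"a": [1, 2], "b": [3, 4]})


loaded = load()
```

The design keeps the check logic in one place: the plain function can be called inline, while the decorator form attaches the same assertion to a function that produces a DataFrame.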
bulwark.generic module
Module for useful generic functions.