3.3. mbl.schema.RandomHeisenbergFoldingTSDRGSchema
- class mbl.schema.RandomHeisenbergFoldingTSDRGSchema(*args, **kwargs)
Bases: pandera.model.SchemaModel
Check if all columns in a dataframe have a column in the Schema.
- Parameters
check_obj (pd.DataFrame) – the dataframe to be validated.
head – validate the first n rows. Rows overlapping with tail or sample are de-duplicated.
tail – validate the last n rows. Rows overlapping with head or sample are de-duplicated.
sample – validate a random sample of n rows. Rows overlapping with head or tail are de-duplicated.
random_state – random seed for the sample argument.
lazy – if True, lazily evaluates the dataframe against all validation checks and raises a SchemaErrors. Otherwise, raises SchemaError as soon as one occurs.
inplace – if True, applies coercion to the object of validation, otherwise creates a copy of the data.
- Returns
validated DataFrame
- Raises
SchemaError – when DataFrame violates built-in or custom checks.
- Return type
pandera.typing.common.DataFrameBase[pandera.model.TSchemaModel]
- Example
Calling schema.validate returns the dataframe.
>>> import pandas as pd
>>> import pandera as pa
>>>
>>> df = pd.DataFrame({
...     "probability": [0.1, 0.4, 0.52, 0.23, 0.8, 0.76],
...     "category": ["dog", "dog", "cat", "duck", "dog", "dog"]
... })
>>>
>>> schema_withchecks = pa.DataFrameSchema({
...     "probability": pa.Column(
...         float, pa.Check(lambda s: (s >= 0) & (s <= 1))),
...
...     # check that the "category" column contains a few discrete
...     # values, and the majority of the entries are dogs.
...     "category": pa.Column(
...         str, [
...             pa.Check(lambda s: s.isin(["dog", "cat", "duck"])),
...             pa.Check(lambda s: (s == "dog").mean() > 0.5),
...         ]),
... })
>>>
>>> schema_withchecks.validate(df)[["probability", "category"]]
   probability category
0         0.10      dog
1         0.40      dog
2         0.52      cat
3         0.23     duck
4         0.80      dog
5         0.76      dog
- __init__()
Methods
__init__()
bounded_in(series)
close_to_integer(series)
energies_within(df)
energy_bounds(df)
example(*[, size]) – Create a hypothesis strategy for generating a DataFrame.
monotonically_increasing(series)
offset_within(df)
strategy(*[, size]) – Create a hypothesis strategy for generating a DataFrame.
to_schema() – Create DataFrameSchema from the SchemaModel.
to_yaml([stream]) – Convert Schema to yaml using io.to_yaml.
validate(check_obj[, head, tail, sample, ...]) – Check if all columns in a dataframe have a column in the Schema.
Attributes
- level_id: pandera.typing.pandas.Series[int] = 'level_id'
- en: pandera.typing.pandas.Series[float] = 'en'
- variance: pandera.typing.pandas.Series[float] = 'variance'
- total_sz: pandera.typing.pandas.Series[float] = 'total_sz'
- edge_entropy: pandera.typing.pandas.Series[float] = 'edge_entropy'
- truncation_dim: pandera.typing.pandas.Series[int] = 'truncation_dim'
- system_size: pandera.typing.pandas.Series[int] = 'system_size'
- disorder: pandera.typing.pandas.Series[float] = 'disorder'
- trial_id: pandera.typing.pandas.Series[str] = 'trial_id'
- seed: pandera.typing.pandas.Series[int] = 'seed'
- penalty: pandera.typing.pandas.Series[float] = 'penalty'
- s_target: pandera.typing.pandas.Series[int] = 's_target'
- offset: pandera.typing.pandas.Series[float] = 'offset'
- max_en: pandera.typing.pandas.Series[float] = 'max_en'
- min_en: pandera.typing.pandas.Series[float] = 'min_en'
- relative_offset: pandera.typing.pandas.Series[float] = 'relative_offset'
- method: pandera.typing.pandas.Series[str] = 'method'
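The attribute list describes the column layout a dataframe is expected to have. As a rough illustration, the sketch below restates that layout as a plain dictionary and compares a dataframe against it using pandas only. The helper column_problems and all values in the sample row are hypothetical placeholders, and the sketch does not reproduce the checks that the schema itself applies.

```python
import pandas as pd

# Column names and element types copied from the attribute list above.
EXPECTED_COLUMNS = {
    "level_id": int, "en": float, "variance": float, "total_sz": float,
    "edge_entropy": float, "truncation_dim": int, "system_size": int,
    "disorder": float, "trial_id": str, "seed": int, "penalty": float,
    "s_target": int, "offset": float, "max_en": float, "min_en": float,
    "relative_offset": float, "method": str,
}


def column_problems(df: pd.DataFrame) -> list:
    """Hypothetical helper: list layout problems; an empty list means a match."""
    problems = []
    for name, kind in EXPECTED_COLUMNS.items():
        if name not in df.columns:
            problems.append(f"missing column: {name}")
            continue
        dtype = df[name].dtype
        if kind is int:
            ok = pd.api.types.is_integer_dtype(dtype)
        elif kind is float:
            ok = pd.api.types.is_float_dtype(dtype)
        else:
            # Strings are stored as object or a dedicated string dtype.
            ok = (pd.api.types.is_object_dtype(dtype)
                  or pd.api.types.is_string_dtype(dtype))
        if not ok:
            problems.append(f"{name}: expected {kind.__name__}, got {dtype}")
    return problems


# Placeholder values chosen only to give each column the right type.
row = {
    "level_id": 0, "en": -1.5, "variance": 0.01, "total_sz": 0.0,
    "edge_entropy": 0.5, "truncation_dim": 16, "system_size": 8,
    "disorder": 1.0, "trial_id": "trial-0", "seed": 2000, "penalty": 0.0,
    "s_target": 0, "offset": 0.0, "max_en": 2.0, "min_en": -2.0,
    "relative_offset": 0.125, "method": "placeholder",
}
df = pd.DataFrame([row])

print(column_problems(df))
print(column_problems(df.drop(columns=["en"])))
```

A quick layout comparison like this can be useful before calling validate, since it reports structural mismatches without running any value-level checks.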
- classmethod monotonically_increasing(series)
- Parameters
series (pandera.typing.pandas.Series[float])
- Return type
bool
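The page does not show the body of monotonically_increasing or say which column it is attached to. Going by the name and the bool return type alone, a check of this kind could be sketched as below; the function is an assumption for illustration, not the package's implementation.

```python
import pandas as pd


def monotonically_increasing(series: pd.Series) -> bool:
    """Assumed behaviour: True if no value is smaller than its predecessor."""
    # is_monotonic_increasing is non-strict, so repeated values are allowed.
    return bool(series.is_monotonic_increasing)


sorted_values = pd.Series([-1.5, -0.75, -0.75, 0.25])
unsorted_values = pd.Series([0.0, -1.0, 0.5])

print(monotonically_increasing(sorted_values))
print(monotonically_increasing(unsorted_values))
```

Because the result is a single bool, a check written this way passes or fails the column as a whole rather than flagging individual rows.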
- classmethod close_to_integer(series)
- Parameters
series (pandera.typing.pandas.Series[float])
- Return type
pandera.typing.pandas.Series[bool]
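The body of close_to_integer is likewise not shown here. An element-wise check with this name and return type might compare each value with its nearest integer, as in the sketch below. The tolerance atol and its default are hypothetical, and the actual implementation may differ.

```python
import numpy as np
import pandas as pd


def close_to_integer(series: pd.Series, atol: float = 1e-8) -> pd.Series:
    """Assumed behaviour: flag values within atol of the nearest integer."""
    values = series.to_numpy(dtype=float)
    # rtol=0.0 makes the comparison purely absolute.
    mask = np.isclose(values, np.rint(values), rtol=0.0, atol=atol)
    return pd.Series(mask, index=series.index)


values = pd.Series([0.0, 1.0 - 1e-12, -2.0, 0.5])
print(close_to_integer(values).tolist())
```

Returning a boolean Series rather than a single bool lets a validator report which rows failed.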
- classmethod bounded_in(series)
- Parameters
series (pandera.typing.pandas.Series[float])
- Return type
pandera.typing.pandas.Series[bool]
- classmethod energy_bounds(df)
- Parameters
df (pandas.core.frame.DataFrame)
- Return type
pandera.typing.pandas.Series[bool]
- classmethod offset_within(df)
- Parameters
df (pandas.core.frame.DataFrame)
- Return type
pandera.typing.pandas.Series[bool]
- classmethod energies_within(df)
- Parameters
df (pandas.core.frame.DataFrame)
- Return type
pandera.typing.pandas.Series[bool]
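The dataframe-level checks above take the whole dataframe and return one boolean per row, which lets them compare several columns at once. Their bodies are not shown on this page. The sketch below illustrates the general shape of such a check with a guessed rule, namely that en lies between min_en and max_en; the function name, the rule, and the data are all assumptions.

```python
import pandas as pd


def energies_within_bounds(df: pd.DataFrame) -> pd.Series:
    """Hypothetical row-wise rule: min_en <= en <= max_en."""
    return (df["min_en"] <= df["en"]) & (df["en"] <= df["max_en"])


# Made-up rows; the last one places en above max_en.
df = pd.DataFrame({
    "en": [-1.0, 0.5, 3.0],
    "min_en": [-2.0, -2.0, -2.0],
    "max_en": [2.0, 2.0, 2.0],
})

mask = energies_within_bounds(df)
print(mask.tolist())
```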