pqagent.preprocessor module

class pqagent.preprocessor.Encoder(encoder_type: str)

Bases: object

A class for encoding the categorical columns of a dataset using OneHotEncoder or OrdinalEncoder.

Attributes:

encoder (OneHotEncoder or OrdinalEncoder): Encoder object used to transform categorical data.

encoder_type (str): Type of the encoder (‘onehot’ or ‘ordinal’).

categorical_columns (pd.Index): Categorical columns identified for encoding.

new_columns (list): New column names generated by the OneHotEncoder.

Methods:

initialize_encoder(encoder_type: str): Initializes the encoder based on the provided encoder type.

fit_transform(df: pd.DataFrame) -> pd.DataFrame: Fits the encoder to the categorical columns and transforms them.

transform(df: pd.DataFrame) -> pd.DataFrame: Transforms the provided DataFrame using the pre-fitted encoder.

inverse_transform(df: pd.DataFrame) -> pd.DataFrame: Reverts the transformed columns back to their original form.

fit_transform(df: DataFrame) -> DataFrame

Fits the encoder to the categorical columns and transforms them.

Parameters:

df – DataFrame containing the data to be encoded.

Returns:

Transformed DataFrame with encoded categorical columns.

initialize_encoder(encoder_type: str)

Initializes the encoder based on the provided encoder type.

Parameters:

encoder_type – Type of encoder (‘onehot’ or ‘ordinal’).

Returns:

Initialized encoder (OneHotEncoder or OrdinalEncoder).

Raises:

NameError – If the encoder type is unknown.
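The dispatch described above could look roughly like the following sketch. It is not the module's actual code: `initialize_encoder_sketch`, `OneHotStandIn`, and `OrdinalStandIn` are hypothetical names, and the stand-in classes take the place of scikit-learn's OneHotEncoder and OrdinalEncoder so the example runs without extra dependencies.

```python
# Hypothetical stand-ins; the real method is documented as returning
# a OneHotEncoder or an OrdinalEncoder.
class OneHotStandIn:
    pass


class OrdinalStandIn:
    pass


# Map each documented encoder_type string to the class it selects.
_ENCODERS = {"onehot": OneHotStandIn, "ordinal": OrdinalStandIn}


def initialize_encoder_sketch(encoder_type: str):
    """Return a new encoder for a known type; raise NameError otherwise."""
    try:
        factory = _ENCODERS[encoder_type]
    except KeyError:
        # The documentation states NameError (not ValueError) for unknown types.
        raise NameError(f"Unknown encoder type: {encoder_type!r}") from None
    return factory()
```

Because the documented exception is NameError rather than the more common ValueError, callers that validate user-supplied configuration would need to catch NameError explicitly.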

inverse_transform(df: DataFrame) -> DataFrame

Reverts the transformed columns back to their original form.

Parameters:

df – DataFrame containing the transformed data.

Returns:

DataFrame with the original categorical columns restored.

transform(df: DataFrame) -> DataFrame

Transforms the provided DataFrame using the pre-fitted encoder.

Parameters:

df – DataFrame containing the data to be transformed.

Returns:

Transformed DataFrame with encoded categorical columns.
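To illustrate how fit_transform, transform, and inverse_transform relate, here is a toy ordinal encoder in plain Python. It is only a sketch of the contract: `OrdinalSketch` is a hypothetical class that works on a single list of strings, whereas the documented Encoder operates on the categorical columns of a DataFrame.

```python
class OrdinalSketch:
    """Toy ordinal encoder for one column of string categories."""

    def fit_transform(self, values):
        # Learn the category -> integer code mapping from the training values.
        self.categories = sorted(set(values))
        self.codes = {cat: i for i, cat in enumerate(self.categories)}
        return self.transform(values)

    def transform(self, values):
        # Reuse the mapping learned in fit_transform; nothing is re-fitted.
        return [self.codes[v] for v in values]

    def inverse_transform(self, codes):
        # Map integer codes back to the original category labels.
        return [self.categories[c] for c in codes]


enc = OrdinalSketch()
train = ["red", "green", "red", "blue"]
encoded = enc.fit_transform(train)           # [2, 1, 2, 0]
later = enc.transform(["blue", "red"])       # [0, 2], same mapping as above
restored = enc.inverse_transform(encoded)    # equals train
```

The point of the separation is that transform applies the mapping learned earlier, so new data is encoded consistently with the data used for fitting.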

class pqagent.preprocessor.Preprocessor(scaler_type: str, encoder_type: str = None)

Bases: object

encoder: Encoder

fit_transform(dataset: DataSet, inplace=False) -> DataSet | None

Fit the scaler and encoder to the provided datasets and transform them.

Parameters:
  • dataset – DataSet to preprocess.

  • inplace – If False, create and return a new DataSet. In-place modification is not yet supported.

Returns:

Preprocessed DataSet.

fitted: bool = False

classmethod from_config(config: dict)

Instantiate Preprocessor using a configuration dictionary.

Parameters:

config – A dictionary with keys ‘scaler_type’ and ‘encoder_type’.

Returns:

Preprocessor instance.
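A configuration-based constructor of this kind can be sketched as follows. `PreprocessorSketch` is a hypothetical stand-in that only stores the two settings; treating ‘encoder_type’ as optional is an assumption that mirrors the constructor default of None, not something the documentation confirms.

```python
class PreprocessorSketch:
    """Hypothetical stand-in that records the two documented settings."""

    def __init__(self, scaler_type: str, encoder_type: str = None):
        self.scaler_type = scaler_type
        self.encoder_type = encoder_type

    @classmethod
    def from_config(cls, config: dict):
        # 'scaler_type' is required by the constructor, so index it directly.
        # Assumption: 'encoder_type' may be missing and then defaults to None.
        return cls(
            scaler_type=config["scaler_type"],
            encoder_type=config.get("encoder_type"),
        )


pre = PreprocessorSketch.from_config(
    {"scaler_type": "minmax", "encoder_type": "onehot"}
)
```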

inverse_transform(dataset: DataSet, inplace: bool = False) -> DataSet

Apply inverse transformation to the datasets using the pre-fitted scaler and encoder.

Parameters:
  • dataset – DataSet to inverse-transform.

  • inplace – If False, create and return a new DataSet. In-place modification is not yet supported.

Returns:

Inversely transformed DataSet.

scaler: Scaler

transform(dataset: DataSet, inplace: bool = False) -> DataSet

class pqagent.preprocessor.Scaler(scaler_type: str)

Bases: object

A class for scaling the numerical columns of a dataset using StandardScaler or MinMaxScaler.

Attributes:

scaler (StandardScaler or MinMaxScaler): Scaler object used to scale numerical data.

numerical_columns (pd.Index): Numerical columns identified for scaling.

Methods:

initialize_scaler(scaler_type: str): Initializes the scaler based on the provided scaler type.

fit_transform(df: pd.DataFrame) -> pd.DataFrame: Fits the scaler to the numerical columns and transforms them.

transform(df: pd.DataFrame) -> pd.DataFrame: Transforms the numerical columns using the pre-fitted scaler.

inverse_transform(df: pd.DataFrame) -> pd.DataFrame: Reverts the scaled numerical columns back to their original form.

fit_transform(df: DataFrame) -> DataFrame

Fits the scaler to the numerical columns and transforms them.

Parameters:

df – DataFrame containing the data to be scaled.

Returns:

Transformed DataFrame with scaled numerical columns.

initialize_scaler(scaler_type: str)

Initializes the scaler based on the provided scaler type.

Parameters:

scaler_type – Type of scaler (‘standardization’ or ‘minmax’).

Returns:

Initialized scaler (StandardScaler or MinMaxScaler).

Raises:

NameError – If the scaler type is unknown.

inverse_transform(df: DataFrame) -> DataFrame

Reverts the scaled numerical columns back to their original form.

Parameters:

df – DataFrame containing the scaled data.

Returns:

DataFrame with original numerical values restored.

transform(df: DataFrame) -> DataFrame

Transforms the numerical columns using the pre-fitted scaler.

Parameters:

df – DataFrame containing the data to be transformed.

Returns:

Transformed DataFrame with scaled numerical columns.
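The fit, transform, and inverse-transform steps of a scaler can be shown with min-max scaling on a single column in plain Python. This is a sketch of the arithmetic only: the function names are hypothetical, and the documented Scaler applies the equivalent scikit-learn scaler to all numerical columns of a DataFrame.

```python
def minmax_fit(column):
    """Learn the statistics min-max scaling needs: the column's range."""
    return min(column), max(column)


def minmax_transform(column, lo, hi):
    """Scale values with previously fitted bounds (assumes hi > lo)."""
    span = hi - lo
    return [(x - lo) / span for x in column]


def minmax_inverse(scaled, lo, hi):
    """Undo the scaling and recover values in the original units."""
    return [x * (hi - lo) + lo for x in scaled]


train = [10.0, 20.0, 30.0]
lo, hi = minmax_fit(train)
scaled = minmax_transform(train, lo, hi)      # [0.0, 0.5, 1.0]
unseen = minmax_transform([40.0], lo, hi)     # [1.5], outside the fitted range
restored = minmax_inverse(scaled, lo, hi)     # [10.0, 20.0, 30.0]
```

Values outside the fitted range map outside [0, 1], which is why transform must reuse the fitted bounds instead of re-fitting on new data.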

pqagent.preprocessor.to_list(input: any) -> list

Ensure that the input is a list. If not, convert it to a list.

Parameters:

input – Any value or list.

Returns:

List containing the input value or the input itself if it’s already a list.
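A helper with this behaviour can be sketched in a few lines. `to_list_sketch` is a hypothetical name, and how the real function treats tuples or other iterables is not documented; this sketch wraps anything that is not already a list.

```python
from typing import Any


def to_list_sketch(value: Any) -> list:
    """Return value unchanged if it is a list, else wrap it in a new list."""
    if isinstance(value, list):
        return value
    # Assumption: strings, tuples, and None are wrapped, not expanded.
    return [value]


single = to_list_sketch("age")          # ["age"]
already = to_list_sketch(["a", "b"])    # returned as-is
```

Wrapping a string instead of iterating over it avoids the common pitfall where list("age") would produce ['a', 'g', 'e'].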