Handle Outliers
- class handle_outliers.DBSCANOutlierDetector(eps: float = 0.5, min_samples: int = 5, **kwargs: Any)
Detects outliers using the DBSCAN method.
- fit(X: DataFrame, y: Series | None = None) → DBSCANOutlierDetector
Fits the DBSCAN model.
- Parameters:
X (pd.DataFrame) – Input DataFrame.
y (pd.Series, optional) – Target variable (not used).
- Returns:
Fitted detector.
- Return type:
DBSCANOutlierDetector
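The following is a minimal sketch of DBSCAN-based outlier detection. It calls scikit-learn's `DBSCAN` directly rather than this class, and it assumes that points labelled as noise (`-1`) are the ones treated as outliers, which the documentation above does not confirm. The sample data and the `is_outlier` name are illustrative only.

```python
import pandas as pd
from sklearn.cluster import DBSCAN

# Five tightly grouped points and one point far from the group.
X = pd.DataFrame({
    "a": [1.0, 1.1, 0.9, 1.0, 1.2, 8.0],
    "b": [2.0, 2.1, 1.9, 2.0, 2.2, 9.0],
})

# Same defaults as the class signature above (eps=0.5, min_samples=5).
model = DBSCAN(eps=0.5, min_samples=5).fit(X)

# Assumption: DBSCAN noise points (label -1) are reported as outliers.
is_outlier = pd.Series(model.labels_ == -1, index=X.index)
print(is_outlier)
```

Because `eps` is a distance in the original feature units, results depend heavily on feature scale; scaling the columns first is often worthwhile.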
- class handle_outliers.IQRBasedOutlierDetector(factor: float = 1.5)
Detects outliers using the Interquartile Range (IQR) method.
- fit(X: DataFrame, y: Series | None = None) → IQRBasedOutlierDetector
Calculates IQR for the dataset.
- Parameters:
X (pd.DataFrame) – Input DataFrame.
y (pd.Series, optional) – Target variable (not used).
- Returns:
Fitted detector.
- Return type:
IQRBasedOutlierDetector
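As a rough illustration of the IQR rule, the sketch below computes per-column bounds with plain pandas. It assumes the conventional fences `Q1 - factor * IQR` and `Q3 + factor * IQR`; whether this class uses exactly these bounds is not stated above. The variable names are hypothetical.

```python
import pandas as pd

X = pd.DataFrame({"value": [10.0, 12.0, 11.0, 13.0, 12.0, 100.0]})
factor = 1.5  # default from the class signature above

# Per-column quartiles and interquartile range.
q1 = X.quantile(0.25)
q3 = X.quantile(0.75)
iqr = q3 - q1

# Assumption: conventional fences around the quartiles.
lower = q1 - factor * iqr
upper = q3 + factor * iqr

# Boolean mask with the same shape as X; True marks a value outside the fences.
is_outlier = (X < lower) | (X > upper)
print(is_outlier)
```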
- class handle_outliers.IsolationForestOutlierDetector(contamination: float = 0.1, random_state: int | None = None, **kwargs: Any)
Detects outliers using the Isolation Forest method.
- fit(X: DataFrame, y: Series | None = None) → IsolationForestOutlierDetector
Fits the Isolation Forest model.
- Parameters:
X (pd.DataFrame) – Input DataFrame.
y (pd.Series, optional) – Target variable (not used).
- Returns:
Fitted detector.
- Return type:
IsolationForestOutlierDetector
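A small sketch of Isolation Forest detection follows. It uses scikit-learn's `IsolationForest` directly, not this class, and assumes that a prediction of `-1` is what gets reported as an outlier. The data is made up for illustration.

```python
import pandas as pd
from sklearn.ensemble import IsolationForest

# Nine values near 1.0 and one extreme value.
X = pd.DataFrame(
    {"value": [1.0, 1.2, 0.8, 1.1, 0.9, 1.3, 0.7, 1.05, 0.95, 100.0]}
)

# contamination is the expected fraction of outliers; random_state makes
# the fitted trees reproducible.
model = IsolationForest(contamination=0.1, random_state=0).fit(X)

# predict() returns 1 for inliers and -1 for outliers.
is_outlier = pd.Series(model.predict(X) == -1, index=X.index)
print(is_outlier)
```

Note that `contamination` fixes roughly how many training points are flagged, so it is worth setting it from domain knowledge rather than leaving the default.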
- class handle_outliers.OutlierCapper(method: str = 'iqr', factor: float = 1.5)
Caps outliers by setting values beyond a threshold to a maximum or minimum value.
- fit(X: DataFrame, y: Series | None = None) → OutlierCapper
Calculates the bounds for capping outliers.
- Parameters:
X (pd.DataFrame) – Input DataFrame.
y (pd.Series, optional) – Target variable (not used).
- Returns:
Fitted transformer.
- Return type:
OutlierCapper
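Capping can be sketched with `DataFrame.clip`. The example assumes that `method='iqr'` means the bounds are `Q1 - factor * IQR` and `Q3 + factor * IQR`, computed per column; this is an assumption about the class, not something the documentation above confirms.

```python
import pandas as pd

X = pd.DataFrame({"value": [10.0, 12.0, 11.0, 13.0, 12.0, 100.0]})
factor = 1.5  # default from the class signature above

# Assumed IQR-based bounds, one pair per column.
q1, q3 = X.quantile(0.25), X.quantile(0.75)
lower = q1 - factor * (q3 - q1)
upper = q3 + factor * (q3 - q1)

# Values beyond a bound are replaced by that bound; no rows are dropped.
capped = X.clip(lower=lower, upper=upper, axis=1)
print(capped)
```

Unlike detection followed by row removal, capping keeps the sample size unchanged, which matters when rows carry other useful columns.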
- class handle_outliers.RobustScalerTransformer(with_centering: bool = True, with_scaling: bool = True, quantile_range: Tuple[float, float] = (25.0, 75.0), copy: bool = True, unit_variance: bool = False)
Scales data using the RobustScaler method, which is less sensitive to outliers.
- fit(X: DataFrame, y: Series | None = None) → RobustScalerTransformer
Fits the RobustScaler to the data.
- Parameters:
X (pd.DataFrame) – Input DataFrame.
y (pd.Series, optional) – Target variable (not used).
- Returns:
Fitted transformer.
- Return type:
RobustScalerTransformer
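The sketch below shows what robust scaling computes, using scikit-learn's `RobustScaler`, whose parameters match the signature above. That this class wraps `RobustScaler` is an assumption. With the default settings each column is centered on its median and divided by its interquartile range, which the `manual` line reproduces in pandas.

```python
import pandas as pd
from sklearn.preprocessing import RobustScaler

X = pd.DataFrame({"value": [1.0, 2.0, 3.0, 4.0, 100.0]})

# Center on the median and scale by the 25th-75th percentile range.
scaler = RobustScaler(quantile_range=(25.0, 75.0)).fit(X)
scaled = pd.DataFrame(scaler.transform(X), columns=X.columns, index=X.index)

# The same computation written out by hand for comparison.
manual = (X - X.median()) / (X.quantile(0.75) - X.quantile(0.25))
print(scaled)
```

The extreme value is still large after scaling; robust scaling keeps it from distorting the center and scale of the other values but does not remove or cap it.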
- class handle_outliers.Winsorizer(limits: Tuple[float, float] = (0.05, 0.05))
Applies Winsorization to limit extreme values in the data.
- fit(X: DataFrame, y: Series | None = None) → Winsorizer
Fits the Winsorizer (no action needed).
- Parameters:
X (pd.DataFrame) – Input DataFrame.
y (pd.Series, optional) – Target variable (not used).
- Returns:
Fitted transformer.
- Return type:
Winsorizer
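Winsorization can be approximated in pandas by clipping each column at a lower and an upper quantile. The sketch assumes `limits` gives the fraction of data to limit in the lower and upper tail respectively; the class may instead use an order-statistic method, which can give slightly different cut points.

```python
import pandas as pd

# Values 0.0 through 20.0 in a single column.
X = pd.DataFrame({"value": [float(v) for v in range(21)]})
lower_limit, upper_limit = 0.05, 0.05  # defaults from the class signature

# Assumption: limits are tail fractions, so clip at the 5th and 95th percentiles.
lower = X.quantile(lower_limit)
upper = X.quantile(1.0 - upper_limit)

# Values in either tail are pulled in to the nearest cut point.
winsorized = X.clip(lower=lower, upper=upper, axis=1)
print(winsorized["value"].min(), winsorized["value"].max())
```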
- class handle_outliers.ZScoreOutlierDetector(threshold: float = 3.0)
Detects outliers using the Z-Score method.
- fit(X: DataFrame, y: Series | None = None) → ZScoreOutlierDetector
Calculates Z-scores for the dataset.
- Parameters:
X (pd.DataFrame) – Input DataFrame.
y (pd.Series, optional) – Target variable (not used).
- Returns:
Fitted detector.
- Return type:
ZScoreOutlierDetector
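For illustration, Z-score detection can be written in a few lines of pandas. The sketch assumes a value is an outlier when the absolute Z-score exceeds `threshold`, and it uses the sample standard deviation (pandas' default, `ddof=1`); the class may use a different convention.

```python
import pandas as pd

# Nineteen identical values and one extreme value.
X = pd.DataFrame({"value": [10.0] * 19 + [100.0]})
threshold = 3.0  # default from the class signature above

# Per-column Z-scores: distance from the mean in standard deviations.
z = (X - X.mean()) / X.std()

# Assumption: outliers are values whose absolute Z-score exceeds the threshold.
is_outlier = z.abs() > threshold
print(is_outlier["value"].sum())
```

With very small samples a single extreme value inflates the standard deviation enough that no Z-score can reach 3, so this method suits larger datasets.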