ml4cps.discretization.discretization module

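The module provides classes for discretizing time series: the abstract base class TimeSeriesDiscretizer and its subclasses EqualFrequencyDiscretizer, EqualWidthDiscretizer, KMeansDiscretizer, MultivariateKMeansDiscretizer, and ThresholdDiscretizer.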

Author: Nemanja Hranisavljevic, hranisan@hsu-hh.de

class ml4cps.discretization.discretization.EqualFrequencyDiscretizer

Bases: TimeSeriesDiscretizer

A class that implements the equal-frequency interval (EFI) discretization method. It divides each variable (column) into intervals such that each interval contains approximately the same number of data points.

discretize(df, return_str=False, append_discr=None)

Discretize data into equal-frequency intervals.

:param df: Data to discretize.
:param return_str: Whether to return discretized data as a concatenated string per row.
:return: Discretized data.

train(data, number_of_intervals=10)

Estimate the model parameters, i.e. the thresholds that divide each variable into equal-frequency intervals.

:param data: Data to calculate model parameters from.
:param number_of_intervals: Number of equal-frequency intervals per variable.
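The equal-frequency idea can be illustrated with a minimal, standard-library sketch for a single variable. It is independent of ml4cps: the helper names train_equal_frequency and discretize_values are hypothetical, and the library's own threshold computation may differ (for example in how quantiles are interpolated).

```python
from bisect import bisect_right
from statistics import quantiles


def train_equal_frequency(values, number_of_intervals=10):
    """Hypothetical helper (not the ml4cps API): return the interior thresholds."""
    # quantiles() returns n - 1 cut points that split the data into n groups
    return quantiles(values, n=number_of_intervals, method="inclusive")


def discretize_values(values, thresholds):
    """Map each value to the index of the interval it falls into."""
    # The interval index is the number of thresholds <= the value
    return [bisect_right(thresholds, v) for v in values]


values = list(range(1, 13))  # 12 samples of one variable
thresholds = train_equal_frequency(values, number_of_intervals=4)
labels = discretize_values(values, thresholds)

# Each of the 4 intervals receives the same number of samples
counts = [labels.count(k) for k in range(4)]
print(thresholds, counts)
```

With evenly spread data as above, every interval ends up with three samples; with skewed data the thresholds move so that the counts stay roughly balanced.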

class ml4cps.discretization.discretization.EqualWidthDiscretizer

Bases: TimeSeriesDiscretizer

A class that implements the equal-width interval (EWI) discretization method. It divides the range of each variable (column) of the input data into a predefined number of equal-width intervals.

discretize(df, return_str=False, append_discr=None)

Discretize data into equal-width intervals.

:param df: Data to discretize.
:return: Discretized data.

train(data, number_of_intervals=10)

Estimate the model parameters, i.e. the thresholds that divide each variable into equal-width intervals.

:param data: Data to calculate model parameters from.
:param number_of_intervals: Number of equal-width intervals per variable.
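As a rough illustration of equal-width discretization for a single variable, the following standard-library sketch computes the thresholds from the observed minimum and maximum. The helper names are hypothetical and not part of ml4cps; how the library treats values that lie exactly on a threshold or outside the training range is not stated here, so the boundary handling below is an assumption.

```python
from bisect import bisect_right


def train_equal_width(values, number_of_intervals=10):
    """Hypothetical helper (not the ml4cps API): return the interior thresholds."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / number_of_intervals
    # Interior thresholds only; the outer bounds lo and hi are implicit
    return [lo + k * width for k in range(1, number_of_intervals)]


def discretize_values(values, thresholds):
    """Map each value to its interval index (assumption: intervals are
    closed on the left, and out-of-range values fall into the outermost bins)."""
    return [bisect_right(thresholds, v) for v in values]


values = [0.0, 1.0, 2.5, 4.9, 5.0, 9.9, 10.0]
thresholds = train_equal_width(values, number_of_intervals=4)
labels = discretize_values(values, thresholds)
print(thresholds)  # [2.5, 5.0, 7.5]
print(labels)      # [0, 0, 1, 1, 2, 3, 3]
```

Unlike the equal-frequency variant, the intervals here have the same width but may contain very different numbers of samples.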

class ml4cps.discretization.discretization.KMeansDiscretizer

Bases: TimeSeriesDiscretizer

A class that implements K-means discretization. It clusters the data values of each variable into a predefined number of clusters using the K-means algorithm.

discretize(df, return_str=False, append_discr=None)

Discretize data into clusters determined by K-means.

:param df: DataFrame to discretize.
:param return_str: Whether to return discretized data as concatenated strings.
:return: Discretized data.

train(data, number_of_clusters_per_var=10)

Train the K-means discretizer by fitting K-means models for each variable (column) in the data.

:param data: List of DataFrames to calculate model parameters from.
:param number_of_clusters_per_var: Number of clusters (intervals) per variable.
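To make per-variable K-means discretization concrete, here is a small sketch of Lloyd's algorithm on the scalar values of one column, written with the standard library only. It is not the ml4cps implementation: the function names, the deterministic initialisation, and the iteration limit are assumptions chosen to keep the example reproducible.

```python
def kmeans_1d(values, k, iterations=50):
    """Hypothetical helper: plain Lloyd's algorithm on scalars, sorted centers."""
    ordered = sorted(values)
    # Deterministic initialisation: spread the initial centers over the sorted data
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: abs(v - centers[c]))
            clusters[nearest].append(v)
        # Keep the old center if a cluster ends up empty
        updated = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
        if updated == centers:
            break
        centers = updated
    return sorted(centers)


def discretize_values(values, centers):
    """Label each value with the index of its nearest cluster center."""
    return [min(range(len(centers)), key=lambda c: abs(v - centers[c]))
            for v in values]


values = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8, 9.0, 9.2, 8.8]
centers = kmeans_1d(values, k=3)
labels = discretize_values(values, centers)
print(labels)  # [0, 0, 0, 1, 1, 1, 2, 2, 2]
```

Sorting the centers makes the labels ordinal (a larger label means a larger value), which is a design choice of this sketch rather than a documented property of the library.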

class ml4cps.discretization.discretization.MultivariateKMeansDiscretizer

Bases: TimeSeriesDiscretizer

A class that implements multivariate K-means discretization. It clusters the data based on all variables (columns) together into a predefined number of clusters.

discretize(df, return_str=False, append_discr=None)

Discretize data into clusters determined by K-means.

:param df: DataFrame to discretize.
:param return_str: Whether to return discretized data as concatenated strings.
:return: Discretized data (cluster labels).

train(data, number_of_clusters=10)

Train the K-means discretizer by fitting a single K-means model for all variables.

:param data: List of DataFrames to calculate model parameters from.
:param number_of_clusters: Number of clusters.
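The multivariate case differs from the per-variable one in that each row, taken as a point over all variables, receives a single cluster label. The sketch below shows this with the standard library; the function names and the explicit initial_centers argument are hypothetical and only serve to keep the example deterministic, and the code does not reflect how ml4cps fits its model.

```python
from math import dist


def kmeans(rows, initial_centers, iterations=50):
    """Hypothetical helper: Lloyd's algorithm on rows (tuples of floats)."""
    centers = [tuple(c) for c in initial_centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for row in rows:
            nearest = min(range(len(centers)), key=lambda c: dist(row, centers[c]))
            clusters[nearest].append(row)
        # New center = column-wise mean of the members; keep it if the cluster is empty
        updated = [
            tuple(sum(col) / len(members) for col in zip(*members)) if members else centers[i]
            for i, members in enumerate(clusters)
        ]
        if updated == centers:
            break
        centers = updated
    return centers


def discretize_rows(rows, centers):
    """Return one cluster label per row, based on Euclidean distance."""
    return [min(range(len(centers)), key=lambda c: dist(row, centers[c]))
            for row in rows]


rows = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
        (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centers = kmeans(rows, initial_centers=[rows[0], rows[3]])
labels = discretize_rows(rows, centers)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

Because distances are computed over all columns at once, variables with larger numeric ranges dominate the clustering unless the data is scaled beforehand.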

class ml4cps.discretization.discretization.ThresholdDiscretizer

Bases: TimeSeriesDiscretizer

discretize(states)

class ml4cps.discretization.discretization.TimeSeriesDiscretizer

Bases: object

Abstract class that encapsulates methods used for the discretization of time series.

discretize(data)

train(data, *args)