actableai.clustering.models package¶

Subpackages¶

actableai.clustering.models.tests package

Submodules¶

actableai.clustering.models.affinity_propagation module¶

class actableai.clustering.models.affinity_propagation.AffinityPropagation(input_size: int, num_clusters: int, df_training: pandas.core.frame.DataFrame, parameters: Optional[Dict[str, Any]] = None, process_parameters: bool = True, verbosity: int = 1)¶

Bases: actableai.clustering.models.base.ClusteringModelWrapper

Class to handle Affinity Propagation.

Parameters: same as the base class for all clustering models.

static get_parameters() → actableai.parameters.parameters.Parameters¶

Returns the parameters of the model.

Returns: The parameters.

actableai.clustering.models.agglomerative_clustering module¶

class actableai.clustering.models.agglomerative_clustering.AgglomerativeClustering(input_size: int, num_clusters: int, df_training: pandas.core.frame.DataFrame, parameters: Optional[Dict[str, Any]] = None, process_parameters: bool = True, verbosity: int = 1)¶

Bases: actableai.clustering.models.base.ClusteringModelWrapperNoFit

Class to handle Agglomerative Clustering.

Parameters: same as the base class for all clustering models.

static get_parameters() → actableai.parameters.parameters.Parameters¶

Returns the parameters of the model.

Returns: The parameters.

actableai.clustering.models.base module¶

class actableai.clustering.models.base.BaseClusteringModel(input_size: int, num_clusters: int, df_training: pandas.core.frame.DataFrame, parameters: Optional[Dict[str, Any]] = None, process_parameters: bool = True, verbosity: int = 1)¶

Bases: actableai.models.base.AAIParametersModel[numpy.ndarray, numpy.ndarray], abc.ABC

TODO write documentation

abstract static get_parameters() → actableai.parameters.parameters.Parameters¶

Returns the parameters of the model.

Returns: The parameters.

handle_categorical = False¶

has_fit: bool = True¶

has_predict: bool = True¶

project(data: numpy.ndarray) → numpy.ndarray¶

class actableai.clustering.models.base.ClusteringModelWrapper(input_size: int, num_clusters: int, df_training: pandas.core.frame.DataFrame, parameters: Optional[Dict[str, Any]] = None, process_parameters: bool = True, verbosity: int = 1)¶

Bases: actableai.clustering.models.base.BaseClusteringModel, abc.ABC

initialize_model()¶

class actableai.clustering.models.base.ClusteringModelWrapperNoFit(input_size: int, num_clusters: int, df_training: pandas.core.frame.DataFrame, parameters: Optional[Dict[str, Any]] = None, process_parameters: bool = True, verbosity: int = 1)¶

Bases: actableai.clustering.models.base.ClusteringModelWrapper, abc.ABC

class actableai.clustering.models.base.Model(value)¶

Bases: str, enum.Enum

Enum representing the different models available.

affinity_propagation = 'affinity_propagation'¶

agglomerative_clustering = 'agglomerative_clustering'¶

dbscan = 'dbscan'¶

dec = 'dec'¶

kmeans = 'kmeans'¶

spectral_clustering = 'spectral_clustering'¶
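Because Model derives from both str and enum.Enum, its members can be looked up by value and compare equal to plain strings. The sketch below illustrates that behaviour with a stand-in enum that mirrors the members listed above. The name `ModelSketch` is hypothetical, and the block deliberately does not import actableai.

```python
from enum import Enum


class ModelSketch(str, Enum):
    """Stand-in mirroring the documented members; not the actableai class."""

    affinity_propagation = "affinity_propagation"
    agglomerative_clustering = "agglomerative_clustering"
    dbscan = "dbscan"
    dec = "dec"
    kmeans = "kmeans"
    spectral_clustering = "spectral_clustering"


# Lookup by value returns the matching member.
chosen = ModelSketch("kmeans")

# A str-based enum member compares equal to its plain string value.
is_plain_string_equal = chosen == "kmeans"

# Iteration yields the members in definition order.
member_values = [member.value for member in ModelSketch]
```

This is why a model name read from a configuration file or request payload can usually be passed around as a plain string and converted to the enum only where validation is needed.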

actableai.clustering.models.dbscan module¶

class actableai.clustering.models.dbscan.DBSCAN(input_size: int, num_clusters: int, df_training: pandas.core.frame.DataFrame, parameters: Optional[Dict[str, Any]] = None, process_parameters: bool = True, verbosity: int = 1)¶

Bases: actableai.clustering.models.base.ClusteringModelWrapperNoFit

Class to handle DBSCAN.

Parameters: same as the base class for all clustering models.

static get_parameters() → actableai.parameters.parameters.Parameters¶

Returns the parameters of the model.

Returns: The parameters.

actableai.clustering.models.dec module¶

class actableai.clustering.models.dec.ClusteringLayer(*args, **kwargs)¶

Bases: keras.engine.base_layer.Layer

The clustering layer converts each input sample (feature vector) into a soft label, i.e. a vector giving the probability that the sample belongs to each cluster. The probabilities are computed with Student’s t-distribution.

Example

model.add(ClusteringLayer(n_clusters=10))

Input shape: 2D tensor with shape: (n_samples, n_features).
Output shape: 2D tensor with shape: (n_samples, n_clusters).

build(input_shape)¶

Creates the variables of the layer (optional, for subclass implementers).

This is a method that implementers of subclasses of Layer or Model can override if they need a state-creation step in-between layer instantiation and layer call. It is invoked automatically before the first execution of call().

This is typically used to create the weights of Layer subclasses (at the discretion of the subclass implementer).

Parameters: input_shape – Instance of TensorShape, or list of instances of TensorShape if the layer expects a list of inputs (one instance per input).

call(inputs, **kwargs)¶

Computes the Student’s t-distribution kernel, the same one used in the t-SNE algorithm: q_ij = 1/(1+dist(x_i, u_j)^2), then normalizes it over the clusters.

Parameters: inputs – the variable containing data, shape=(n_samples, n_features)
Returns: q – Student’s t-distribution, i.e. the soft labels for each sample, shape=(n_samples, n_clusters)
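The soft assignment described for call() can be sketched in NumPy as follows. This is an illustration of the formula only, not the layer's Keras implementation: it assumes dist is the Euclidean distance, that the cluster centres u_j are available as an array (named `centroids` here, a hypothetical name), and that normalization is done per sample across clusters.

```python
import numpy as np


def soft_assign(inputs: np.ndarray, centroids: np.ndarray) -> np.ndarray:
    """Student's t soft assignment.

    inputs:    shape (n_samples, n_features)
    centroids: shape (n_clusters, n_features), assumed cluster centres u_j
    returns q: shape (n_samples, n_clusters), each row summing to 1
    """
    # Squared Euclidean distance between every sample and every centroid.
    diff = inputs[:, None, :] - centroids[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=2)
    # Unnormalized kernel: q_ij = 1 / (1 + dist(x_i, u_j)^2).
    q = 1.0 / (1.0 + sq_dist)
    # Normalize over clusters so each row is a probability vector.
    return q / q.sum(axis=1, keepdims=True)


x = np.array([[0.0, 0.0], [4.0, 4.0]])
u = np.array([[0.0, 0.0], [4.0, 4.0], [10.0, 10.0]])
q = soft_assign(x, u)
```

Each row of q is largest for the nearest centroid, and the heavy tails of the kernel keep distant clusters from receiving exactly zero probability.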

compute_output_shape(input_shape)¶

Computes the output shape of the layer.

This method will cause the layer’s state to be built, if that has not happened before. This requires that the layer will later be used with inputs that match the input shape provided here.

Parameters: input_shape – Shape tuple (tuple of integers) or list of shape tuples (one per output tensor of the layer). Shape tuples can include None for free dimensions, instead of an integer.
Returns: An output shape tuple.

classmethod from_config(config)¶

Creates a layer from its config.

This method is the reverse of get_config, capable of instantiating the same layer from the config dictionary. It does not handle layer connectivity (handled by Network), nor weights (handled by set_weights).

Parameters: config – A Python dictionary, typically the output of get_config.
Returns: A layer instance.

get_config()¶

Returns the config of the layer.

A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.

The config of a layer does not include connectivity information, nor the layer class name. These are handled by Network (one layer of abstraction above).

Note that get_config() does not guarantee to return a fresh copy of dict every time it is called. The callers should make a copy of the returned dict if they want to modify it.

Returns: Python dictionary.

class actableai.clustering.models.dec.DEC(input_size: int, num_clusters: int, df_training: pandas.core.frame.DataFrame, parameters: Optional[Dict[str, Any]] = None, process_parameters: bool = True, verbosity: bool = 1)¶

Bases: actableai.clustering.models.base.BaseClusteringModel

TODO write documentation

static get_parameters() → actableai.parameters.parameters.Parameters¶

Returns the parameters of the model.

Returns: The parameters.

handle_categorical = True¶

has_explanations: bool = True¶

actableai.clustering.models.kmeans module¶

class actableai.clustering.models.kmeans.KMeans(input_size: int, num_clusters: int, df_training: pandas.core.frame.DataFrame, parameters: Optional[Dict[str, Any]] = None, process_parameters: bool = True, verbosity: int = 1)¶

Bases: actableai.clustering.models.base.ClusteringModelWrapper

Class to handle K-Means clustering.

Parameters: same as the base class for all clustering models.

classmethod KMeans_pick_k(scaled_data, alpha_k, k_range, *KMeans_args, **KMeans_kwargs) → sklearn.cluster._kmeans.KMeans¶

Find the best k for KMeans using the scaled inertia method. See: https://towardsdatascience.com/an-approach-for-choosing-number-of-clusters-for-k-means-c28e614ecb2c

Parameters

scaled_data – matrix of scaled data. Rows are samples and columns are features for clustering.
alpha_k – manually tuned factor that gives penalty to the number of clusters.
k_range – range of k values to test.

Returns

best_k – the value of the best k.

static KMeans_pick_k_sil(X, k_range, *KMeans_args, **KMeans_kwargs)¶

Find the best k for KMeans using the silhouette score. See: https://newbedev.com/scikit-learn-k-means-elbow-criterion

Parameters

X – matrix of data. rows are samples and columns are features for clustering.
k_range – range of k values to test.

Returns

best_k – the value of the best k.
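Silhouette-based selection of k can be sketched with scikit-learn as below. This is a minimal illustration of the general technique, not the body of KMeans_pick_k_sil: the helper name `pick_k_by_silhouette` and the synthetic data are assumptions made for the example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score


def pick_k_by_silhouette(X, k_range, **kmeans_kwargs):
    """Return the k in k_range with the highest mean silhouette score."""
    best_k, best_score = None, float("-inf")
    for k in k_range:
        # Fit KMeans for this k and score the resulting partition.
        labels = KMeans(n_clusters=k, **kmeans_kwargs).fit_predict(X)
        score = silhouette_score(X, labels)
        if score > best_score:
            best_k, best_score = k, score
    return best_k


# Synthetic data: three tight, well-separated blobs of 30 points each.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
X = np.vstack([c + 0.3 * rng.standard_normal((30, 2)) for c in centers])

best_k = pick_k_by_silhouette(X, range(2, 6), n_init=10, random_state=0)
```

The silhouette score is only defined for at least two clusters, so k_range should start at 2 or higher.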

static KMeans_scaled_inertia(scaled_data: numpy.ndarray, k: int, alpha_k: float, *KMeans_args, **KMeans_kwargs)¶

KMeans with scaled inertia.

Parameters

scaled_data – matrix of scaled data. Rows are samples and columns are features for clustering.
k – current k for applying KMeans.
alpha_k – manually tuned factor that gives penalty to the number of clusters.

Returns

scaled inertia value for current k

Return type

float
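The scaled inertia criterion from the article linked under KMeans_pick_k can be sketched as follows: the inertia at k is divided by the inertia of a single cluster, and a penalty alpha_k * k is added. This follows the article's formulation and is an assumption about what these methods compute, not a copy of their code. The helper names and the synthetic data are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans


def scaled_inertia(scaled_data, k, alpha_k, **kmeans_kwargs):
    """Inertia at k relative to the k=1 baseline, plus the penalty alpha_k * k."""
    # Baseline: sum of squared distances to the global mean (inertia for k = 1).
    inertia_0 = np.square(scaled_data - scaled_data.mean(axis=0)).sum()
    inertia_k = KMeans(n_clusters=k, **kmeans_kwargs).fit(scaled_data).inertia_
    return inertia_k / inertia_0 + alpha_k * k


def pick_k(scaled_data, alpha_k, k_range, **kmeans_kwargs):
    """Return the k in k_range that minimizes the scaled inertia."""
    scores = {
        k: scaled_inertia(scaled_data, k, alpha_k, **kmeans_kwargs)
        for k in k_range
    }
    return min(scores, key=scores.get)


# Synthetic data: three tight blobs, standardized column by column.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
raw = np.vstack([c + 0.3 * rng.standard_normal((30, 2)) for c in centers])
scaled = (raw - raw.mean(axis=0)) / raw.std(axis=0)

best_k = pick_k(scaled, alpha_k=0.01, k_range=range(1, 7), n_init=10, random_state=0)
```

A larger alpha_k penalizes extra clusters more strongly and pushes the selected k down, while a value near zero lets the raw inertia dominate and favours the largest k in the range.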

classmethod find_num_clusters(data: numpy.ndarray, k_select_method: str, auto_num_clusters_min: int, auto_num_clusters_max: int, alpha_k: float = 0.01) → int¶

static get_parameters() → actableai.parameters.parameters.Parameters¶

Returns the parameters of the model.

Returns: The parameters.

actableai.clustering.models.spectral_clustering module¶

class actableai.clustering.models.spectral_clustering.SpectralClustering(input_size: int, num_clusters: int, df_training: pandas.core.frame.DataFrame, parameters: Optional[Dict[str, Any]] = None, process_parameters: bool = True, verbosity: int = 1)¶

Bases: actableai.clustering.models.base.ClusteringModelWrapperNoFit

Class to handle Spectral Clustering.

Parameters: same as the base class for all clustering models.

static get_parameters() → actableai.parameters.parameters.Parameters¶

Returns the parameters of the model.

Returns: The parameters.

Module contents¶

class actableai.clustering.models.ClusteringModel(parameters: Optional[Dict[str, Any]] = None)¶

Bases: actableai.models.base.AAIParametersModel[pandas.core.frame.DataFrame, pandas.core.frame.DataFrame], abc.ABC

get_parameters() → actableai.parameters.parameters.Parameters¶

Returns the parameters of the model.

Returns: The parameters.

has_fit: bool = True¶

has_predict: bool = True¶

project(data: Union[numpy.ndarray, pandas.core.frame.DataFrame]) → numpy.ndarray¶

class actableai.clustering.models.ClusteringModelInference(model: actableai.models.inference.ModelType)¶

Bases: actableai.models.inference.AAIBaseModelInference[actableai.clustering.models.ClusteringModel, actableai.clustering.models.ClusteringModelMetadata]

class actableai.clustering.models.ClusteringModelMetadata(*, features: List[str], feature_parameters: Dict[str, Any], prediction_target: str)¶

Bases: actableai.models.inference.AAIBaseModelMetadata

feature_parameters: Dict[str, Any]¶

features: List[str]¶

prediction_target: str¶
