SOMEstimator

class susi.SOMEstimator(n_rows: int = 10, n_columns: int = 10, init_mode_unsupervised: str = 'random', init_mode_supervised: str = 'random', n_iter_unsupervised: int = 1000, n_iter_supervised: int = 1000, train_mode_unsupervised: str = 'online', train_mode_supervised: str = 'online', neighborhood_mode_unsupervised: str = 'linear', neighborhood_mode_supervised: str = 'linear', learn_mode_unsupervised: str = 'min', learn_mode_supervised: str = 'min', distance_metric: str = 'euclidean', learning_rate_start=0.5, learning_rate_end=0.05, nbh_dist_weight_mode: str = 'pseudo-gaussian', missing_label_placeholder=None, n_jobs=None, random_state=None, verbose=0)[source]

Basic class for supervised self-organizing maps.

Parameters
  • n_rows (int, optional (default=10)) – Number of rows for the SOM grid

  • n_columns (int, optional (default=10)) – Number of columns for the SOM grid

  • init_mode_unsupervised (str, optional (default=“random”)) – Initialization mode of the unsupervised SOM

  • init_mode_supervised (str, optional (default=“random”)) – Initialization mode of the supervised SOM

  • n_iter_unsupervised (int, optional (default=1000)) – Number of iterations for the unsupervised SOM

  • n_iter_supervised (int, optional (default=1000)) – Number of iterations for the supervised SOM

  • train_mode_unsupervised (str, optional (default=“online”)) – Training mode of the unsupervised SOM

  • train_mode_supervised (str, optional (default=“online”)) – Training mode of the supervised SOM

  • neighborhood_mode_unsupervised (str, optional (default=“linear”)) – Neighborhood mode of the unsupervised SOM

  • neighborhood_mode_supervised (str, optional (default=“linear”)) – Neighborhood mode of the supervised SOM

  • learn_mode_unsupervised (str, optional (default=“min”)) – Learning mode of the unsupervised SOM

  • learn_mode_supervised (str, optional (default=“min”)) – Learning mode of the supervised SOM

  • distance_metric (str, optional (default=“euclidean”)) – Distance metric to compare on feature level (not SOM grid). Possible metrics: {“euclidean”, “manhattan”, “mahalanobis”, “tanimoto”}. Note that “tanimoto” tends to be slow.

  • learning_rate_start (float, optional (default=0.5)) – Learning rate start value

  • learning_rate_end (float, optional (default=0.05)) – Learning rate end value (only needed for some learning rate definitions)

  • nbh_dist_weight_mode (str, optional (default=“pseudo-gaussian”)) – Formula of the neighborhood distance weight. Possible formulas are: {“pseudo-gaussian”, “mexican-hat”}.

  • missing_label_placeholder (int or str or None, optional (default=None)) – Label placeholder for datapoints with no label. This is needed for semi-supervised learning.

  • n_jobs (int or None, optional (default=None)) – The number of jobs to run in parallel.

  • random_state (int, RandomState instance or None, optional (default=None)) – If int, random_state is the seed used by the random number generator; If RandomState instance, random_state is the random number generator; If None, the random number generator is the RandomState instance used by np.random.

  • verbose (int, optional (default=0)) – Controls the verbosity.
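As a rough illustration of how learning_rate_start and learning_rate_end can bound a decaying schedule over the training iterations, the sketch below interpolates geometrically between the two values. The function name decayed_learning_rate and the geometric formula are assumptions made for illustration only; the schedule SuSi actually applies depends on learn_mode_unsupervised / learn_mode_supervised and may differ.

```python
import numpy as np


def decayed_learning_rate(curr_it, max_it, start=0.5, end=0.05):
    """Hypothetical schedule: geometric interpolation from start to end."""
    # Fraction of training completed, in [0, 1].
    progress = curr_it / max_it
    # Equals `start` at progress 0 and `end` at progress 1;
    # decreases monotonically in between when end < start.
    return start * np.power(end / start, progress)


# Learning rate at the beginning, middle, and end of 1000 iterations.
rates = [decayed_learning_rate(t, 1000) for t in (0, 500, 1000)]
```

With such a schedule, early iterations move the weights strongly while late iterations only fine-tune them.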

Variables
  • node_list_ (np.array of (int, int) tuples) – List of 2-dimensional coordinates of SOM nodes

  • radius_max_ (float, int) – Maximum radius of the neighborhood function

  • radius_min_ (float, int) – Minimum radius of the neighborhood function

  • unsuper_som_ (np.array) – Weight vectors of the unsupervised SOM, shape = (self.n_rows, self.n_columns, X.shape[1])

  • X_ (np.array) – Input data

  • fitted_ (bool) – States if estimator is fitted to X

  • max_iterations_ (int) – Maximum number of iterations for the current training

  • bmus_ (list of (int, int) tuples) – List of best matching units (BMUs) of the dataset X

  • sample_weights_ (TODO) –

  • n_features_in_ (int) – Number of input features
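To make the shapes of these attributes concrete, here is a minimal NumPy sketch of how a list of grid coordinates and a list of best matching units could be derived from a weight array laid out like unsuper_som_. The names node_list, som, and find_bmus are hypothetical, the weights are random rather than trained, and Euclidean distance is assumed; this is not SuSi's own implementation.

```python
import itertools

import numpy as np

n_rows, n_columns, n_features = 3, 4, 2
rng = np.random.RandomState(0)

# Grid coordinates of every node, analogous to node_list_.
node_list = np.array(list(itertools.product(range(n_rows), range(n_columns))))

# Random weight vectors with the same layout as unsuper_som_.
som = rng.rand(n_rows, n_columns, n_features)
X = rng.rand(5, n_features)


def find_bmus(som, X):
    """Return the (row, column) of the closest node for each row of X."""
    flat = som.reshape(-1, som.shape[-1])  # (n_nodes, n_features)
    # Pairwise Euclidean distances, shape (n_samples, n_nodes).
    dists = np.linalg.norm(X[:, None, :] - flat[None, :, :], axis=2)
    # Convert the flat index of each nearest node back to grid coordinates.
    rows, cols = np.unravel_index(dists.argmin(axis=1), som.shape[:2])
    return [(int(r), int(c)) for r, c in zip(rows, cols)]


bmus = find_bmus(som, X)  # one (row, column) tuple per datapoint
```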

calc_estimation_output(datapoint, mode='bmu')[source]

Get SOM output for fixed SOM.

The given datapoint doesn’t have to belong to the training set of the input SOM.

Parameters
  • datapoint (np.array, shape=(X.shape[1])) – Datapoint = one row of the dataset X

  • mode (str, optional (default=“bmu”)) – Mode of the regression output calculation

Returns

  • estimation_output (object) – Content of SOM node which is linked to the datapoint. Classification: the label. Regression: the target variable.

  • TODO Implement handling of incomplete datapoints

  • TODO implement “neighborhood” mode
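The idea behind the “bmu” mode can be sketched as follows: find the node whose feature weights are closest to the datapoint, then return whatever the supervised map stores at that grid position. The function estimate_from_bmu and the arrays unsuper_som and super_som are hypothetical stand-ins, and Euclidean distance is assumed; the library's internals may differ.

```python
import numpy as np


def estimate_from_bmu(unsuper_som, super_som, datapoint):
    """Hypothetical 'bmu' mode: return the supervised node at the BMU position."""
    # Euclidean distance from the datapoint to every node's weight vector.
    dists = np.linalg.norm(unsuper_som - datapoint, axis=2)
    # Grid position (row, column) of the best matching unit.
    bmu = np.unravel_index(np.argmin(dists), dists.shape)
    return super_som[bmu]


# Toy 2x2 maps: feature weights and one estimated target value per node.
unsuper_som = np.array([[[0.0, 0.0], [0.0, 1.0]],
                        [[1.0, 0.0], [1.0, 1.0]]])
super_som = np.array([[10.0, 20.0],
                      [30.0, 40.0]])

# The datapoint is closest to the node at (1, 0), whose stored value is 30.0.
estimate = estimate_from_bmu(unsuper_som, super_som, np.array([0.9, 0.2]))
```

Because only distances to the fixed weight vectors are needed, the datapoint does not have to come from the training set.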

fit(X, y=None)[source]

Fit supervised SOM to the input data.

Parameters
  • X (array-like matrix of shape = [n_samples, n_features]) – The training input samples.

  • y (array-like matrix of shape = [n_samples, 1]) – The labels (ground truth) of the input samples

Returns

self

Return type

object

Examples

Load the SOM and fit it to your input data X and the labels y with:

>>> import susi
>>> som = susi.SOMRegressor()
>>> som.fit(X, y)

fit_estimator(X, y)[source]

Fit supervised SOM to the (checked) input data.

Parameters
  • X (array-like matrix of shape = [n_samples, n_features]) – The training input samples.

  • y (array-like matrix of shape = [n_samples, 1]) – The labels (ground truth) of the input samples

fit_transform(X, y=None)[source]

Fit to the input data and transform it.

Parameters
  • X (array-like matrix of shape = [n_samples, n_features]) – The training and prediction input samples.

  • y (array-like matrix of shape = [n_samples, 1]) – The labels (ground truth) of the input samples

Returns

Predictions including the BMUs of each datapoint

Return type

np.array of tuples (int, int)

Examples

Load the SOM, fit it to your input data X and transform your input data with:

>>> import susi
>>> som = susi.SOMClassifier()
>>> tuples = som.fit_transform(X, y)

get_estimation_map()[source]

Return SOM grid with the estimated value on each node.

Examples

Fit the SOM on your data X, y and plot the estimation map:

>>> import susi
>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>> som = susi.SOMClassifier()
>>> som.fit(X, y)
>>> estimation_map = som.get_estimation_map()
>>> plt.imshow(np.squeeze(estimation_map), cmap="viridis_r")

get_random_datapoint()[source]

Find and return random datapoint from labeled dataset.
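In a semi-supervised setting, drawing from the labeled dataset means skipping every sample whose label equals missing_label_placeholder. A minimal sketch of that selection is shown below; the helper random_labeled_index and the choice of -1 as placeholder are assumptions for illustration, not the library's code.

```python
import numpy as np


def random_labeled_index(y, placeholder, rng):
    """Hypothetical helper: index of a random sample whose label is present."""
    # Indices of all samples that carry a real label.
    labeled = np.flatnonzero(np.asarray(y) != placeholder)
    if labeled.size == 0:
        raise ValueError("no labeled datapoints available")
    return int(rng.choice(labeled))


y = np.array([-1, 2, -1, 0, 1])  # -1 marks a missing label (assumed placeholder)
rng = np.random.RandomState(42)
idx = random_labeled_index(y, placeholder=-1, rng=rng)
```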

abstract init_super_som()[source]

Initialize map.

modify_weight_matrix_supervised(dist_weight_matrix, true_vector=None, learningrate=None)[source]

Modify weights of the supervised SOM, either online or batch.

Parameters
  • som_array (np.array) – Weight vectors of the SOM, shape = (self.n_rows, self.n_columns, X.shape[1])

  • dist_weight_matrix (np.array of float) – Current distance weight of the SOM for the specific node

  • true_vector (np.array, optional) – True vector(s)

  • learningrate (float, optional) – Current learning rate of the SOM

Returns

modify_weight_matrix – Weight vector of the SOM after the modification

Return type

np.array
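For orientation, a generic online SOM update pulls each node's weight vector toward the true vector, scaled by the learning rate and by the node's neighborhood distance weight. The sketch below assumes a dist_weight_matrix of shape (n_rows, n_columns) and uses the hypothetical function update_weights_online; it illustrates the textbook rule and is not taken from the SuSi source.

```python
import numpy as np


def update_weights_online(som, dist_weight_matrix, true_vector, learningrate):
    """Hypothetical online update: pull every node toward true_vector."""
    # Broadcast the per-node weight, shape (n_rows, n_columns), over the feature axis.
    step = learningrate * dist_weight_matrix[:, :, None]
    return som + step * (true_vector - som)


som = np.zeros((2, 2, 3))
# Full weight at the BMU (0, 0), weaker pull on its neighbours, none at (1, 1).
dist_weight_matrix = np.array([[1.0, 0.5],
                               [0.5, 0.0]])
new_som = update_weights_online(som, dist_weight_matrix, np.ones(3), learningrate=0.5)
```

Nodes with a distance weight of zero are left unchanged, which is how the neighborhood radius limits the reach of each update.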

predict(X, y=None)[source]

Predict output of data X.

Parameters
  • X (array-like matrix of shape = [n_samples, n_features]) – The prediction input samples.

  • y (None, optional) – Ignored.

Returns

y_pred – List of predicted values.

Return type

list of float

Examples

Fit the SOM on your data X, y:

>>> import susi
>>> som = susi.SOMClassifier()
>>> som.fit(X, y)
>>> y_pred = som.predict(X)

train_supervised_som()[source]

Train supervised SOM.