SOMRegressor

class susi.SOMRegressor(n_rows: int = 10, n_columns: int = 10, init_mode_unsupervised: str = 'random', init_mode_supervised: str = 'random', n_iter_unsupervised: int = 1000, n_iter_supervised: int = 1000, train_mode_unsupervised: str = 'online', train_mode_supervised: str = 'online', neighborhood_mode_unsupervised: str = 'linear', neighborhood_mode_supervised: str = 'linear', learn_mode_unsupervised: str = 'min', learn_mode_supervised: str = 'min', distance_metric: str = 'euclidean', learning_rate_start=0.5, learning_rate_end=0.05, nbh_dist_weight_mode: str = 'pseudo-gaussian', missing_label_placeholder=None, n_jobs=None, random_state=None, verbose=0)

Supervised SOM for estimating continuous variables (= regression).

Parameters
  • n_rows (int, optional (default=10)) – Number of rows for the SOM grid

  • n_columns (int, optional (default=10)) – Number of columns for the SOM grid

  • init_mode_unsupervised (str, optional (default=“random”)) – Initialization mode of the unsupervised SOM

  • init_mode_supervised (str, optional (default=“random”)) – Initialization mode of the supervised SOM

  • n_iter_unsupervised (int, optional (default=1000)) – Number of iterations for the unsupervised SOM

  • n_iter_supervised (int, optional (default=1000)) – Number of iterations for the supervised SOM

  • train_mode_unsupervised (str, optional (default=“online”)) – Training mode of the unsupervised SOM

  • train_mode_supervised (str, optional (default=“online”)) – Training mode of the supervised SOM

  • neighborhood_mode_unsupervised (str, optional (default=“linear”)) – Neighborhood mode of the unsupervised SOM

  • neighborhood_mode_supervised (str, optional (default=“linear”)) – Neighborhood mode of the supervised SOM

  • learn_mode_unsupervised (str, optional (default=“min”)) – Learning mode of the unsupervised SOM

  • learn_mode_supervised (str, optional (default=“min”)) – Learning mode of the supervised SOM

  • distance_metric (str, optional (default=“euclidean”)) – Distance metric used to compare datapoints at the feature level (not on the SOM grid). Possible metrics: {“euclidean”, “manhattan”, “mahalanobis”, “tanimoto”}. Note that “tanimoto” tends to be slow.

  • learning_rate_start (float, optional (default=0.5)) – Learning rate start value

  • learning_rate_end (float, optional (default=0.05)) – Learning rate end value (only needed for some learning rate definitions)

  • nbh_dist_weight_mode (str, optional (default=“pseudo-gaussian”)) – Formula of the neighborhood distance weight. Possible formulas are: {“pseudo-gaussian”, “mexican-hat”}.

  • missing_label_placeholder (int or str or None, optional (default=None)) – Label placeholder for datapoints with no label. This is needed for semi-supervised learning.

  • n_jobs (int or None, optional (default=None)) – The number of jobs to run in parallel.

  • random_state (int, RandomState instance or None, optional (default=None)) – If int, random_state is the seed used by the random number generator; if RandomState instance, random_state is the random number generator; if None, the random number generator is the RandomState instance used by np.random.

  • verbose (int, optional (default=0)) – Controls the verbosity.
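
The learning_rate_start and learning_rate_end parameters bound a learning rate that shrinks over the course of training. The following is a minimal sketch of one possible schedule, a geometric decay between the two values; the function name learning_rate and the formula itself are assumptions made for illustration, and this is not claimed to be the definition behind any particular learn_mode value.

```python
def learning_rate(curr_it, max_it, start=0.5, end=0.05):
    """Illustrative schedule (assumption): geometric decay from `start` to `end`.

    Returns `start` at iteration 0 and `end` at iteration `max_it`.
    """
    if max_it <= 0:
        raise ValueError("max_it must be positive")
    return start * (end / start) ** (curr_it / max_it)


# Sample the schedule at the beginning, middle, and end of a 1000-iteration run.
rates = [learning_rate(i, 1000) for i in (0, 500, 1000)]
```

With the default values, the rate starts at 0.5 and falls monotonically to 0.05 by the last iteration.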

Variables
  • node_list_ (np.array of (int, int) tuples) – List of 2-dimensional coordinates of SOM nodes

  • radius_max_ (float or int) – Maximum radius of the neighborhood function

  • radius_min_ (float or int) – Minimum radius of the neighborhood function

  • unsuper_som_ (np.array) – Weight vectors of the unsupervised SOM, with shape (self.n_rows, self.n_columns, X.shape[1])

  • X_ (np.array) – Input data

  • fitted_ (bool) – Whether the estimator has been fitted to X

  • max_iterations_ (int) – Maximum number of iterations for the current training

  • bmus_ (list of (int, int) tuples) – List of best matching units (BMUs) of the dataset X

  • sample_weights_ (TODO) –

  • n_regression_vars_ (int) – Number of regression variables. In most examples, this equals one.

  • n_features_in_ (int) – Number of input features
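
To make the relationship between the weight grid (shape (n_rows, n_columns, n_features)) and the list of best matching units more concrete, here is a minimal pure-Python sketch of a BMU search. The helper names euclidean, manhattan, and find_bmu are hypothetical and the grid values are made up for illustration; this is not the library's implementation.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Manhattan (city-block) distance between two feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def find_bmu(som, datapoint, metric=euclidean):
    """Return the (row, column) of the node whose weights are closest to `datapoint`.

    `som` is a nested list of shape (n_rows, n_columns, n_features).
    The distance is computed in feature space, not on the grid.
    """
    best_pos, best_dist = None, math.inf
    for row, nodes in enumerate(som):
        for col, weights in enumerate(nodes):
            dist = metric(weights, datapoint)
            if dist < best_dist:
                best_pos, best_dist = (row, col), dist
    return best_pos


# Hypothetical 2 x 2 grid with two features per node.
som = [
    [[0.0, 0.0], [0.0, 1.0]],
    [[1.0, 0.0], [1.0, 1.0]],
]
X = [[0.1, 0.9], [0.8, 0.2]]

# One (row, column) tuple per datapoint, analogous in form to bmus_.
bmus = [find_bmu(som, x) for x in X]
```

Passing metric=manhattan swaps the feature-level distance without changing the search itself.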

init_super_som()

Initialize map for regression.