freediscovery.cluster.ClusterLabels

class freediscovery.cluster.ClusterLabels(vect, model, lsi=None, method='centroid-frequency', n_top_words=6)

Calculate the cluster labels.

Parameters:
  • vect (VectorizerMixin object) – a scikit-learn text vectorizer
  • model (ClusterMixin object) – the cluster object
  • lsi (TruncatedSVD object or None, default=None) – the LSA object, if LSA was used for clustering
  • method (str, optional, default='centroid-frequency') – the method used to compute the centroid labels. Only 'centroid-frequency' is supported at the moment.
  • n_top_words (int, default=6) – keep only the n_top_words most relevant words for each cluster
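
The idea behind labelling a cluster by its centroid can be illustrated with scikit-learn alone. The sketch below is not the FreeDiscovery implementation: the helper `top_terms_per_centroid` and the toy documents are hypothetical, and it assumes that "centroid-frequency" means ranking vocabulary terms by their weight in each cluster centroid.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy corpus (hypothetical): two fruit documents, two car documents.
docs = [
    "apple banana apple fruit",
    "banana fruit apple",
    "engine car wheel car",
    "car wheel engine road",
]

vect = TfidfVectorizer()
X = vect.fit_transform(docs)
model = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)


def top_terms_per_centroid(centroids, vocabulary, n_top_words=6):
    """Return, for each centroid, the n_top_words terms with the largest weight.

    Hypothetical helper for illustration; not part of FreeDiscovery.
    """
    centroids = np.asarray(centroids)
    # Sort term indices by descending centroid weight, keep the first n_top_words.
    order = np.argsort(centroids, axis=1)[:, ::-1][:, :n_top_words]
    return [[vocabulary[i] for i in row] for row in order]


# Terms ordered by column index, so vocabulary[i] names column i of X.
vocabulary = sorted(vect.vocabulary_, key=vect.vocabulary_.get)
labels = top_terms_per_centroid(model.cluster_centers_, vocabulary, n_top_words=3)
print(labels)
```

Each inner list holds the highest-weighted terms of one centroid, which is the role n_top_words plays in the parameter list above.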
predict(centroids=None)

Compute the cluster labels.

Parameters: centroids (list, default=None) – if not None, ignore the clustering given by the clustering model and compute labels for the given cluster centroids
Returns: cluster_labels
Return type: array [n_samples]
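
When clustering is done in an LSA space, the centroids live in that reduced space and have to be mapped back to vocabulary terms before they can be labelled. The following is a minimal sketch under that assumption, using only scikit-learn. It does not claim to reproduce what predict does internally; the helper `label_centroids` and the toy corpus are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy corpus (hypothetical): two fruit documents, two car documents.
docs = [
    "apple banana apple fruit",
    "banana fruit apple",
    "engine car wheel car",
    "car wheel engine road",
]

vect = TfidfVectorizer()
X = vect.fit_transform(docs)

# Reduce to a 2-dimensional LSA space and cluster there.
lsi = TruncatedSVD(n_components=2, random_state=0)
X_lsa = lsi.fit_transform(X)
model = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X_lsa)


def label_centroids(centroids, vect, lsi=None, n_top_words=6):
    """Label explicitly supplied centroids with their top-weighted terms.

    Hypothetical helper for illustration; not part of FreeDiscovery.
    """
    centroids = np.asarray(centroids)
    if lsi is not None:
        # Map centroids from the LSA space back to the term space.
        centroids = lsi.inverse_transform(centroids)
    vocabulary = sorted(vect.vocabulary_, key=vect.vocabulary_.get)
    order = np.argsort(centroids, axis=1)[:, ::-1][:, :n_top_words]
    return [[vocabulary[i] for i in row] for row in order]


lsa_labels = label_centroids(model.cluster_centers_, vect, lsi=lsi, n_top_words=3)
print(lsa_labels)
```

Passing the centroids explicitly, rather than reading them from the model inside the helper, mirrors the role of the centroids argument described above: any list of centroid vectors can be labelled, regardless of where it came from.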