aaanalysis.AAclust.comp_centers

static AAclust.comp_centers(X, labels=None)

Computes the center of each cluster based on the given labels.

Parameters:
  • X (array-like, shape (n_samples, n_features)) – Feature matrix. Rows typically correspond to scales and columns to amino acids.

  • labels (array-like, shape (n_samples,)) – Cluster labels for each sample in X.

Returns:

  • centers (array-like, shape (n_clusters, n_features)) – The computed center for each cluster, one row per cluster.

  • labels_centers (array-like, shape (n_clusters,)) – The cluster label for each row of centers, in matching order.
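
To make the shapes of the return values concrete, the following is a minimal NumPy sketch of a per-cluster center computation. It is not the library's implementation: sketch_centers is a hypothetical helper, and it assumes that a center is the arithmetic mean of the samples in a cluster and that labels are reported in order of first appearance.

```python
import numpy as np

def sketch_centers(X, labels):
    """Hypothetical helper: per-cluster mean, not the AAclust implementation."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    # Unique labels in order of first appearance (an assumption about ordering)
    _, first_idx = np.unique(labels, return_index=True)
    labels_centers = labels[np.sort(first_idx)]
    # One row per cluster: the mean over all samples carrying that label
    centers = np.vstack([X[labels == lab].mean(axis=0) for lab in labels_centers])
    return centers, labels_centers

# Toy data: 4 samples, 2 features, 2 clusters
X = np.array([[0.0, 1.0], [2.0, 3.0], [4.0, 5.0], [6.0, 7.0]])
labels = np.array([1, 0, 1, 0])
centers, labels_centers = sketch_centers(X, labels)
print(centers.shape)   # (2, 2), i.e. (n_clusters, n_features)
print(labels_centers)  # [1 0]
```

In this sketch, each row of centers pairs with the label at the same position in labels_centers.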

Examples

The cluster centers can be computed using the AAclust().comp_centers() method:

import aaanalysis as aa
import pandas as pd
# Create example dataset comprising 100 scales
df_scales = aa.load_scales().T.sample(100).T
X = df_scales.T
# Fit AAclust model and obtain clustering centers for 5 clusters
aac = aa.AAclust()
labels = aac.fit(X, n_clusters=5).labels_
centers, labels_centers = aac.comp_centers(X=X, labels=labels)
# Create DataFrame with cluster centers
columns = [f"Cluster {i}" for i in labels_centers]
df_centers = pd.DataFrame(centers.T, columns=columns, index=df_scales.index)
aa.display_df(df_centers)
AA  Cluster 2  Cluster 1  Cluster 3  Cluster 0  Cluster 4
A    0.519000   0.437000   0.420000   0.198000   0.477000
C    0.133000   0.432000   0.248000   0.393000   0.627000
D    0.515000   0.420000   0.840000   0.564000   0.228000
E    0.637000   0.609000   0.782000   0.265000   0.214000
F    0.254000   0.627000   0.133000   0.216000   0.790000
G    0.422000   0.068000   0.616000   0.843000   0.383000
H    0.330000   0.569000   0.560000   0.361000   0.415000
I    0.331000   0.544000   0.121000   0.093000   0.854000
K    0.695000   0.674000   0.795000   0.346000   0.240000
L    0.452000   0.646000   0.211000   0.139000   0.746000
M    0.231000   0.710000   0.175000   0.199000   0.726000
N    0.418000   0.392000   0.745000   0.603000   0.243000
P    0.426000   0.223000   0.771000   0.628000   0.353000
Q    0.439000   0.559000   0.746000   0.356000   0.308000
R    0.596000   0.594000   0.690000   0.381000   0.218000
S    0.504000   0.288000   0.694000   0.546000   0.337000
T    0.444000   0.360000   0.599000   0.393000   0.425000
V    0.403000   0.481000   0.244000   0.089000   0.769000
W    0.125000   0.729000   0.202000   0.415000   0.709000
Y    0.291000   0.520000   0.402000   0.387000   0.651000
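
The cluster columns in the output above are not sorted by label, so a center should be looked up through labels_centers and not by position. A minimal sketch of one way to do this with pandas follows. The arrays are synthetic stand-ins for the returned values, not real output, and df_by_label is a name chosen only for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic stand-ins for the values returned by comp_centers (not real output)
centers = np.array([[0.5, 0.1], [0.4, 0.4], [0.2, 0.9]])  # (n_clusters, n_features)
labels_centers = np.array([2, 0, 1])                       # label of each row in centers

# Index rows by cluster label, then sort so the rows follow ascending label order
df_by_label = pd.DataFrame(centers, index=labels_centers).sort_index()

# Look up the center of a single cluster by its label
center_of_cluster_2 = df_by_label.loc[2].to_numpy()
print(df_by_label)
```

Indexing by label keeps the pairing between centers and labels explicit, whatever order the labels are returned in.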