aaanalysis.AAclustPlot.correlation

AAclustPlot.correlation(df_corr=None, labels=None, labels_ref=None, cluster_x=False, method='average', xtick_label_rotation=90, ytick_label_rotation=0, bar_position='left', bar_colors='tab:gray', bar_width_x=0.1, bar_spacing_x=0.1, bar_width_y=0.1, bar_spacing_y=0.1, vmin=-1.0, vmax=1.0, cmap='viridis', kwargs_heatmap=None)

Heatmap for correlation matrix with colored sidebar to label clusters.

Parameters:
  • df_corr (pd.DataFrame, shape (n_samples, n_clusters)) – DataFrame with correlation matrix. Rows typically correspond to scales and columns to clusters.

  • labels (array-like, shape (n_samples,)) – Cluster labels determining the grouping and coloring of the side colorbar. Its length must equal the number of rows in df_corr (n_samples).

  • labels_ref (array-like, shape (n_clusters,), optional) – Cluster labels comprising the unique values from ‘labels’. Its length must match ‘n_clusters’ in df_corr.

  • cluster_x (bool, default=False) – If True, x-axis (clusters) values are clustered. Disabled for pairwise correlation.

  • method ({'single', 'complete', 'average', 'weighted', 'centroid', 'median', 'ward'}, default='average') – Linkage method from scipy.cluster.hierarchy.linkage() used for clustering.

  • xtick_label_rotation (int, default=90) – Rotation of x-tick labels (names of clusters).

  • ytick_label_rotation (int, default=0) – Rotation of y-tick labels (names of samples).

  • bar_position (str or list of str, default='left') – Position of the colored sidebar (left, right, top, or bottom). If None, no sidebar is added.

  • bar_colors (str or list of str, default='tab:gray') – Either a single color or a list of colors for each unique label in labels.

  • bar_width_x (float, default=0.1) – Width of the x-axis sidebar, must be >= 0.

  • bar_spacing_x (float, default=0.1) – Space between the heatmap and the colored x-axis sidebar, must be >= 0.

  • bar_width_y (float, default=0.1) – Width of the y-axis sidebar, must be >= 0.

  • bar_spacing_y (float, default=0.1) – Space between the heatmap and the colored y-axis sidebar, must be >= 0.

  • vmin (float, default=-1.0) – Minimum value of the color scale in seaborn.heatmap().

  • vmax (float, default=1.0) – Maximum value of the color scale in seaborn.heatmap().

  • cmap (str, default='viridis') – Colormap to be used for the seaborn.heatmap().

  • kwargs_heatmap (dict, optional) – Dictionary with keyword arguments for adjusting heatmap (seaborn.heatmap()).

Returns:

ax – Axes object with the correlation heatmap.

Return type:

plt.Axes

Notes

  • Ensure labels and df_corr are in the same order to avoid mislabeling.

  • bar_tick_labels=True removes the tick labels and redraws them as text for optimal spacing. As a result, they cannot be adjusted or retrieved afterward (e.g., via ax.get_xticklabels()).
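The ordering requirement in the first note can be illustrated with a minimal plain-Python sketch. The helper sort_by_label and the toy row names are hypothetical and not part of aaanalysis. The sketch only shows the idea of reordering row names and labels together so that each label stays attached to its row:

```python
def sort_by_label(row_names, labels):
    """Sort row names and labels jointly so that equal labels become adjacent.

    Hypothetical helper for illustration only (not an aaanalysis function).
    """
    if len(row_names) != len(labels):
        raise ValueError("labels must contain exactly one entry per row")
    # sorted() is stable, so rows sharing a label keep their original order
    pairs = sorted(zip(labels, row_names), key=lambda pair: pair[0])
    sorted_labels = [label for label, _ in pairs]
    sorted_names = [name for _, name in pairs]
    return sorted_names, sorted_labels


# Toy data: four rows assigned to two clusters
row_names = ["scale_a", "scale_b", "scale_c", "scale_d"]
labels = [1, 0, 1, 0]

sorted_names, sorted_labels = sort_by_label(row_names, labels)
print(sorted_names)   # ['scale_b', 'scale_d', 'scale_a', 'scale_c']
print(sorted_labels)  # [0, 0, 1, 1]
```

Sorting only one of the two sequences would silently attach labels to the wrong rows, which is the mislabeling the note warns about.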

Examples

To showcase the AAclustPlot().correlation() method, we create an example dataset and obtain a DataFrame with pairwise correlations (df_corr) using the AAclust().comp_correlation() method:

import matplotlib.pyplot as plt
import aaanalysis as aa
aa.options["verbose"] = False
# Obtain example scale dataset
df_scales = aa.load_scales(unclassified_out=True).T.sample(50).T
df_cat = aa.load_scales(name="scales_cat")
dict_scale_name = dict(zip(df_cat["scale_id"], df_cat["subcategory"]))
names = [dict_scale_name[s] for s in list(df_scales)]
X = df_scales.T
# Fit AAclust model and retrieve labels, cluster names, and df_corr
aac = aa.AAclust()
labels = aac.fit(X, n_clusters=8).labels_
print(labels)
df_corr, labels_sorted = aac.comp_correlation(X=X, labels=labels)
[7 2 3 6 3 3 5 3 6 6 0 7 1 1 7 1 2 3 0 3 3 1 1 1 7 5 3 7 6 3 5 1 3 2 2 6 3
 6 4 6 2 7 6 6 1 6 6 0 7 0]

The pairwise Pearson correlation can now be visualized using the AAclustPlot().correlation() method. Provide the labels sorted as in df_corr (labels_sorted):

aac_plot = aa.AAclustPlot()
aa.plot_settings(font_scale=0.7, weight_bold=False, no_ticks=True)
aac_plot.correlation(df_corr=df_corr, labels=labels_sorted)
plt.show()
../_images/aac_plot_correlation_1_output_3_0.png
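For readers unfamiliar with the quantity shown in each heatmap cell, the following is a minimal plain-Python sketch of the Pearson correlation coefficient. The function pearson and the toy scale values are illustrative assumptions. In practice the values come from AAclust().comp_correlation(), whose internals are not reproduced here:

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of products of deviations (covariance up to a constant factor)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations (variances up to the same constant factor)
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for constant sequences")
    return cov / math.sqrt(ss_x * ss_y)


# Toy values standing in for three scales
scale_1 = [0.1, 0.4, 0.5, 0.9]
scale_2 = [0.2, 0.8, 1.0, 1.8]  # 2 * scale_1, perfectly correlated
scale_3 = [0.9, 0.6, 0.5, 0.1]  # 1 - scale_1, perfectly anti-correlated

r_pos = pearson(scale_1, scale_2)
r_neg = pearson(scale_1, scale_3)
print(round(r_pos, 6), round(r_neg, 6))
```

Because the coefficient always lies between -1 and 1, the defaults vmin=-1.0 and vmax=1.0 cover its full range.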

Gray bars indicate the clusters. To change their position or show multiple bars, use the bar_position parameter, and adjust their width and spacing using bar_width_x, bar_width_y, bar_spacing_x, and bar_spacing_y:

aac_plot.correlation(df_corr=df_corr, labels=labels_sorted, bar_position=["left", "top"],
                     bar_width_x=1, bar_width_y=0.5, bar_spacing_x=1, bar_spacing_y=0.5)
plt.show()
../_images/aac_plot_correlation_2_output_5_0.png

To visualize the correlation between each scale (y-axis) and the cluster medoids (x-axis), first compute the medoids with the AAclust().comp_medoids() method and then pass them to the AAclust().comp_correlation() method:

X_ref, labels_ref = aac.comp_medoids(X, labels=labels)
# Create correlation DataFrame between scales and medoids
df_corr, labels_sorted = aac.comp_correlation(X=X, labels=labels, X_ref=X_ref, labels_ref=labels_ref)
# Plot correlation
aac_plot.correlation(df_corr=df_corr, labels=labels_sorted)
plt.tight_layout()
plt.show()
../_images/aac_plot_correlation_3_output_7_0.png
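As background on what a medoid is, the sketch below uses the textbook definition: the cluster member with the smallest total dissimilarity to all other members. Using 1 - Pearson r as the dissimilarity is an assumption made for illustration. The criterion applied by AAclust().comp_medoids() may differ, and the helpers pearson and medoid_index are hypothetical:

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


def medoid_index(members):
    """Index of the member with the smallest summed (1 - r) to all others.

    Illustrative definition only; not taken from the aaanalysis source.
    """
    best_index, best_cost = None, math.inf
    for i, candidate in enumerate(members):
        cost = sum(1.0 - pearson(candidate, other)
                   for j, other in enumerate(members) if j != i)
        if cost < best_cost:
            best_index, best_cost = i, cost
    return best_index


# Toy cluster with three members (rows); the second one lies "between" the others
cluster = [
    [1.0, 2.0, 3.0, 4.0],
    [1.0, 2.0, 4.0, 3.0],
    [2.0, 1.0, 4.0, 3.0],
]
idx = medoid_index(cluster)
print(idx)  # 1
```

Here the second member correlates with both others at r = 0.8, while the first and third correlate with each other at only r = 0.6, so the second member is the most representative one.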

We can re-cluster the x-axis values by setting cluster_x=True. Internally, scipy.cluster.hierarchy.linkage() is used; its linkage method can be selected with the method parameter (default='average'):

aac_plot.correlation(df_corr=df_corr, labels=labels_sorted, cluster_x=True, method="ward")
plt.tight_layout()
plt.show()
../_images/aac_plot_correlation_4_output_9_0.png

To show the names of the scales (y-axis) and clusters (x-axis), provide them to the AAclust().comp_correlation() method. The cluster labels (labels_ref) must also be given to the AAclustPlot().correlation() method. The xtick_label_rotation parameter can be used to rotate the x-tick labels:

# Create correlation DataFrame between scales and medoids
cluster_names = aac.name_clusters(X, labels=labels, names=names)
dict_cluster = dict(zip(labels, cluster_names))
names_ref = [dict_cluster[i] for i in labels_ref]
df_corr, labels_sorted = aac.comp_correlation(X=X, labels=labels, X_ref=X_ref, labels_ref=labels_ref, names=names, names_ref=names_ref)
# Plot correlation
aac_plot.correlation(df_corr=df_corr, labels=labels_sorted, labels_ref=labels_ref, xtick_label_rotation=45)
plt.tight_layout()
plt.show()
../_images/aac_plot_correlation_5_output_11_0.png
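The name lookup in the example above can be traced with toy values. This sketch assumes, as the zip(labels, cluster_names) call suggests, that cluster_names holds one name per sample. The labels and names below are made up for illustration:

```python
# Toy stand-ins: one label and one cluster name per sample
labels = [2, 0, 1, 0, 2]
cluster_names = ["Hydrophobicity", "Charge", "Volume", "Charge", "Hydrophobicity"]

# Reference labels, one per cluster (column order of df_corr)
labels_ref = [0, 1, 2]

# Map each cluster label to its name; repeated labels map to the same name
dict_cluster = dict(zip(labels, cluster_names))

# Look up the name of each reference cluster in column order
names_ref = [dict_cluster[i] for i in labels_ref]
print(names_ref)  # ['Charge', 'Volume', 'Hydrophobicity']
```

The resulting names_ref follows the order of labels_ref, so each column name stays aligned with its cluster.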

If the columns of df_corr contain the cluster labels, labels_ref does not need to be provided. The clusters can be colored using the bar_colors parameter.

# Plot correlation without cluster names
df_corr, labels_sorted = aac.comp_correlation(X=X, labels=labels, X_ref=X_ref, labels_ref=labels_ref)
n_clusters = len(set(labels_sorted))
colors = aa.plot_get_clist(n_colors=n_clusters)
aac_plot.correlation(df_corr=df_corr, labels=labels_sorted, xtick_label_rotation=0,
                     bar_colors=colors, bar_position=["left", "bottom"], bar_width_x=1, bar_width_y=0.2)
plt.tight_layout()
plt.show()
# Plot correlation with cluster names
df_corr, labels_sorted = aac.comp_correlation(X=X, labels=labels, X_ref=X_ref, labels_ref=labels_ref,
                                              names=names, names_ref=names_ref)
n_clusters = len(set(labels_sorted))
colors = aa.plot_get_clist(n_colors=n_clusters)
aac_plot.correlation(df_corr=df_corr, labels=labels_sorted, xtick_label_rotation=45, labels_ref=labels_ref,
                     bar_colors=colors, bar_position=["left", "bottom"], bar_width_x=1, bar_width_y=0.2)
plt.tight_layout()
plt.show()
../_images/aac_plot_correlation_6_output_13_0.png ../_images/aac_plot_correlation_7_output_13_1.png

While vmin, vmax, and cmap can be adjusted directly, further keyword arguments for the seaborn.heatmap() function can be provided via the kwargs_heatmap argument:

df_corr, labels_sorted = aac.comp_correlation(X=X, labels=labels, X_ref=X_ref, labels_ref=labels_ref, names=names, names_ref=names_ref)
# Plot correlation
aac_plot.correlation(df_corr=df_corr, labels=labels_sorted, labels_ref=labels_ref, xtick_label_rotation=45,
                     vmin=-0.5, vmax=0.5, cmap="cividis", kwargs_heatmap=dict(linecolor="black"))
plt.tight_layout()
plt.show()
../_images/aac_plot_correlation_8_output_15_0.png
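To make the effect of vmin and vmax concrete, the sketch below mimics a linear color normalization in plain Python: values are mapped to a position between 0 and 1 along the colormap, and values outside [vmin, vmax] end up at the colormap ends. The function color_position is a hypothetical, simplified stand-in and is not the code used by seaborn or matplotlib:

```python
def color_position(value, vmin=-1.0, vmax=1.0):
    """Map a value to its relative position in [0, 1] along a colormap.

    Simplified illustration of linear normalization with saturation.
    """
    if vmax <= vmin:
        raise ValueError("vmax must be greater than vmin")
    position = (value - vmin) / (vmax - vmin)
    # Values beyond the limits saturate at the ends of the colormap
    return min(1.0, max(0.0, position))


# Correlations as they would be colored with vmin=-0.5 and vmax=0.5
values = [-0.8, -0.5, 0.0, 0.25, 0.9]
positions = [color_position(v, vmin=-0.5, vmax=0.5) for v in values]
print(positions)  # [0.0, 0.0, 0.5, 0.75, 1.0]
```

Narrowing the range as in the example above increases contrast among weak correlations, at the cost of making all correlations beyond ±0.5 indistinguishable.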