Welcome to the AAanalysis documentation!


AAanalysis Model Overview

AAanalysis (Amino Acid analysis) is a Python framework for interpretable sequence-based protein prediction. It is built on the following algorithms:

  • CPP: Comparative Physicochemical Profiling, a feature engineering algorithm that compares two sets of protein sequences to identify the most distinctive features.

  • dPULearn: a deterministic Positive-Unlabeled (PU) Learning algorithm that enables training on unbalanced and small datasets.

  • AAclust: k-optimized clustering wrapper framework to select redundancy-reduced sets of numerical scales (e.g., amino acid scales).
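To illustrate the general idea of comparing two sets of sequences on a physicochemical scale, here is a minimal, self-contained sketch. It is not the CPP algorithm and does not use the AAanalysis API. The function names and the toy sequences are hypothetical, and the Kyte-Doolittle hydropathy scale is chosen only as a familiar example of an amino acid scale.

```python
from statistics import mean

# Kyte-Doolittle hydropathy scale, used here only as an example amino acid scale
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_scale_value(seq, scale):
    """Average the scale values of all residues in seq that the scale covers."""
    values = [scale[res] for res in seq if res in scale]
    if not values:
        raise ValueError("sequence contains no residues covered by the scale")
    return mean(values)


def group_difference(test_seqs, ref_seqs, scale):
    """Difference between the group means of per-sequence average scale values."""
    test_mean = mean(mean_scale_value(s, scale) for s in test_seqs)
    ref_mean = mean(mean_scale_value(s, scale) for s in ref_seqs)
    return test_mean - ref_mean


# Hypothetical toy data: a hydrophobic test set and a polar reference set
test_group = ["LLIVAL", "AVILLF"]
ref_group = ["DEKRNQ", "KDEESN"]

diff = group_difference(test_group, ref_group, KYTE_DOOLITTLE)
print(f"Mean hydropathy difference (test - reference): {diff:.2f}")
```

A positive difference means the test set is, on average, more hydrophobic than the reference set on this scale. A feature engineering method would evaluate many scales and sequence regions rather than a single whole-sequence average.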

In addition, AAanalysis provides functions for loading various protein benchmark datasets, amino acid scales, and their two-level classification (AAontology). We combined CPP with the explainable AI framework SHAP to explain sample-level predictions with single-residue resolution.

If you want to make publication-ready plots with a few lines of code, see our Plotting Prelude.

You can find the source code of AAanalysis at GitHub.

Install

AAanalysis can be installed from PyPI:

pip install aaanalysis

For extended features, including our explainable AI module, please use the ‘professional’ version:

pip install "aaanalysis[pro]"
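After installing, a quick way to confirm that the package is visible to your Python environment is to look it up without importing it. The following is a minimal sketch using only the standard library. The helper name is hypothetical, and the import name `aaanalysis` is assumed to match the PyPI package name.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


# Assumption: the import name matches the PyPI package name
if is_installed("aaanalysis"):
    print("aaanalysis is available in this environment")
else:
    print("aaanalysis not found; try: pip install aaanalysis")
```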

Citation

If you use AAanalysis in your work, please cite the respective publication as follows:

AAclust:

[Breimann24a] Breimann and Frishman (2024a), AAclust: k-optimized clustering for selecting redundancy-reduced sets of amino acid scales, Bioinformatics Advances.

AAontology:

[Breimann24b] Breimann et al. (2024b), AAontology: An ontology of amino acid scales for interpretable machine learning, Journal of Molecular Biology.

CPP:

[Breimann25a] Breimann and Kamp et al. (2025), Charting γ-secretase substrates by explainable AI, Nature Communications.

dPULearn:

[Breimann25a] Breimann and Kamp et al. (2025), Charting γ-secretase substrates by explainable AI, Nature Communications.