comp_smooth_scores
- comp_smooth_scores(scores, method='triangular', window=2, sigma=None, peak_preserving=True)
Smooth a per-residue score vector with a NaN-aware, peak-preserving kernel.
Off-by-one positional jitter is universal in windowed protease / Post-Translational Modification (PTM) prediction; smoothing the per-residue score makes nearby high scores reinforce a site. The peak-preserving form takes max(smoothed, raw), so a true peak is never attenuated below its original height. Pure NumPy, no SciPy.
Added in version 1.1.0.
- Parameters:
scores (array-like, shape (n_residues,)) – Per-residue score vector. NaN positions are ignored in the weighted average, which is renormalized over the finite neighbours.
method (str, default='triangular') – Smoothing kernel: 'triangular' or 'gaussian'.
window (int, default=2) – Half-width of the kernel (covers +/- window residues).
sigma (float, optional) – Gaussian standard deviation; defaults to window / 2 when None.
peak_preserving (bool, default=True) – If True, return max(smoothed, raw) elementwise.
- Returns:
smoothed – Smoothed score vector, same length as scores.
- Return type:
array-like, shape (n_residues,)
See also
plot_rank() for visualizing per-protein score tracks.
Examples
comp_smooth_scores smooths a per-residue score track with a NaN-aware, peak-preserving kernel: nearby high scores reinforce a site, but max(smoothed, raw) ensures a true peak is never attenuated.

    import numpy as np
    import aaanalysis as aa

    # Single isolated peak at the central residue
    track = np.array([0.0, 0.0, 1.0, 0.0, 0.0])
    smoothed = aa.comp_smooth_scores(scores=track, method="triangular", window=2)
    smoothed

    array([0.16666667, 0.25      , 1.        , 0.25      , 0.16666667])
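The printed values are consistent with triangular weights of the form window + 1 - |offset| (1, 2, 3, 2, 1 for window=2), renormalized over the neighbours that actually exist at the sequence edges. The sketch below reproduces that arithmetic by hand. The weight formula is inferred from the output above rather than taken from the AAanalysis source, and triangular_smooth_sketch is a hypothetical helper, not part of the library.

```python
import numpy as np


def triangular_smooth_sketch(scores, window=2, peak_preserving=True):
    """Hypothetical re-implementation; the weight formula is an assumption
    inferred from the documented output, not the library's code."""
    raw = np.asarray(scores, dtype=float)
    n = raw.size
    smoothed = np.empty(n)
    for i in range(n):
        # Clip the window at the sequence edges
        lo, hi = max(0, i - window), min(n, i + window + 1)
        offsets = np.arange(lo, hi) - i
        # Assumed triangular weights: largest at the centre, 1 at +/- window
        weights = (window + 1 - np.abs(offsets)).astype(float)
        # Renormalize over the weights that fall inside the sequence
        smoothed[i] = np.sum(weights * raw[lo:hi]) / np.sum(weights)
    # Peak preservation: never return less than the raw score
    return np.maximum(smoothed, raw) if peak_preserving else smoothed


track = np.array([0.0, 0.0, 1.0, 0.0, 0.0])
result = triangular_smooth_sketch(track)
unpreserved = triangular_smooth_sketch(track, peak_preserving=False)
```

Under these assumptions the edge residues get 1/6 (weight 1 out of 3 + 2 + 1), their inner neighbours get 2/8, and the centre would drop to 3/9 without peak preservation, which is why max(smoothed, raw) restores it to 1.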
The gaussian method weights neighbours by a Gaussian of width sigma, and peak_preserving (default True) takes max(smoothed, raw) so a true peak is never attenuated. Disabling it lets the smoothing pull the peak down:

    import pandas as pd

    track = np.array([0.0, 0.0, 1.0, 0.0, 0.0])
    # Compare the same Gaussian smoothing with and without peak preservation
    df_smooth = pd.DataFrame({
        "raw": track,
        "gaussian, peak_preserving=True": np.round(
            aa.comp_smooth_scores(scores=track, method="gaussian", sigma=1.0, peak_preserving=True), 3),
        "gaussian, peak_preserving=False": np.round(
            aa.comp_smooth_scores(scores=track, method="gaussian", sigma=1.0, peak_preserving=False), 3)})
    aa.display_df(df_smooth)

       raw       gaussian, peak_preserving=True  gaussian, peak_preserving=False
    1  0.000000  0.078000                        0.078000
    2  0.000000  0.258000                        0.258000
    3  1.000000  1.000000                        0.403000
    4  0.000000  0.258000                        0.258000
    5  0.000000  0.078000                        0.078000
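The Gaussian column can be checked by hand in the same spirit, and the same loop shows what "NaN-aware" renormalization might look like. This is a minimal sketch under stated assumptions: gaussian_smooth_sketch is a hypothetical helper, the weights are taken to be exp(-offset^2 / (2 * sigma^2)) truncated at +/- window, and NaN inputs are left as NaN in the output. The library's behaviour at NaN positions is not stated here, so that last choice is purely illustrative.

```python
import numpy as np


def gaussian_smooth_sketch(scores, window=2, sigma=None, peak_preserving=True):
    """Hypothetical NaN-aware Gaussian smoother (not the AAanalysis source)."""
    raw = np.asarray(scores, dtype=float)
    sigma = window / 2 if sigma is None else sigma  # documented default
    finite = np.isfinite(raw)
    smoothed = np.full(raw.shape, np.nan)  # assumption: NaN inputs stay NaN
    for i in np.flatnonzero(finite):
        # Clip the window at the sequence edges
        lo, hi = max(0, i - window), min(raw.size, i + window + 1)
        offsets = np.arange(lo, hi) - i
        # Truncated Gaussian weights; NaN neighbours get weight zero
        weights = np.exp(-0.5 * (offsets / sigma) ** 2) * finite[lo:hi]
        values = np.where(finite[lo:hi], raw[lo:hi], 0.0)
        # Renormalize over the finite neighbours only
        smoothed[i] = np.sum(weights * values) / np.sum(weights)
    # Peak preservation: never return less than the raw score
    return np.maximum(smoothed, raw) if peak_preserving else smoothed


track = np.array([0.0, 0.0, 1.0, 0.0, 0.0])
plain = gaussian_smooth_sketch(track, sigma=1.0, peak_preserving=False)
kept = gaussian_smooth_sketch(track, sigma=1.0, peak_preserving=True)

# Same track with one missing residue: its weight is dropped and the
# remaining weights are renormalized
gapped = np.array([0.0, np.nan, 1.0, 0.0, 0.0])
gapped_out = gaussian_smooth_sketch(gapped, sigma=1.0, peak_preserving=False)
```

With sigma=1.0 the centre residue keeps 1 / (1 + 2e^-0.5 + 2e^-2), about 0.403 of its height when peak preservation is off, matching the table. In the gapped track, the first residue averages only over itself and the peak two positions away, so its smoothed value rises to e^-2 / (1 + e^-2).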