SeqMut.eval

SeqMut.eval(df_scan=None, th=None)

Evaluate a mutational scan: tag mutations stable/disruptive and summarize per region.

A mutation is disruptive when its |ΔCPP| reaches the threshold; the per-region disruptive rate shows where in the sequence (JMD-N / TMD / JMD-C) substitutions move the CPP profile most.
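The tagging rule can be illustrated with plain pandas. This is a minimal sketch, not the library's implementation: the toy table, the column name `delta_cpp` as input, and the `>=` comparison ("reaches the threshold") are assumptions, and the default threshold is taken here as the 2/3 quantile of the absolute values.

```python
import pandas as pd

# Toy stand-in for a scan table; values and the "delta_cpp" column are assumed
df_toy = pd.DataFrame({"delta_cpp": [0.05, -0.40, 0.10, 0.30, -0.02, 0.25]})

# Work on |ΔCPP| so that strong shifts in either direction count
abs_delta = df_toy["delta_cpp"].abs()

# Assumed default when th is None: upper tertile (2/3 quantile) of |ΔCPP|
th = abs_delta.quantile(2 / 3)

# Assumed comparison: a mutation is disruptive when |ΔCPP| >= th
df_toy["disruptive"] = abs_delta >= th
print(df_toy)
```

With a quantile-based threshold, roughly one third of the mutations end up tagged as disruptive overall, so the per-region rates are best read relative to each other.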

Parameters:
  • df_scan (pd.DataFrame) – Mutational landscape produced by SeqMut.scan().

  • th (float, optional) – |ΔCPP| threshold above which a mutation is disruptive. If None, the upper tertile (2/3 quantile) of the observed delta_cpp distribution is used.

Returns:

df_eval – One row per entry × region, with columns entry, region, n_mut, n_disruptive, frac_disruptive, and mean_delta_cpp.

Return type:

pd.DataFrame, shape (n_entry_region, 6)
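The shape of the returned table can be sketched as a group-by over entry and region. The per-mutation input below is hypothetical (including the `disruptive` column), and whether the library averages signed or absolute ΔCPP values is not stated here, so the sketch simply averages the column as given.

```python
import pandas as pd

# Hypothetical per-mutation table; input column names are assumptions
df_toy = pd.DataFrame({
    "entry": ["A", "A", "A", "B", "B", "B"],
    "region": ["tmd"] * 6,
    "delta_cpp": [0.1, 0.5, 0.3, 0.2, 0.2, 0.8],
    "disruptive": [False, True, False, False, False, True],
})

# One row per entry x region: counts and mean shift
df_eval_toy = (
    df_toy.groupby(["entry", "region"], sort=False)
    .agg(
        n_mut=("delta_cpp", "size"),
        n_disruptive=("disruptive", "sum"),
        mean_delta_cpp=("delta_cpp", "mean"),
    )
    .reset_index()
)

# Disruptive rate = disruptive mutations / all mutations in the group
df_eval_toy["frac_disruptive"] = df_eval_toy["n_disruptive"] / df_eval_toy["n_mut"]

# Match the documented column order (6 columns)
cols = ["entry", "region", "n_mut", "n_disruptive", "frac_disruptive", "mean_delta_cpp"]
df_eval_toy = df_eval_toy[cols]
print(df_eval_toy)
```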

Examples

SeqMut.eval tags each scanned mutation as stable or disruptive using an |ΔCPP| threshold (default: the upper tertile) and summarizes disruptive rates per entry × region.

import aaanalysis as aa
aa.options["verbose"] = False

# Load the example dataset and derive the CPP features used for scoring
df_seq = aa.load_dataset(name="DOM_GSEC", n=10)
labels = df_seq["label"].to_list()
sf = aa.SequenceFeature()
df_parts = sf.get_df_parts(df_seq=df_seq)
split_kws = sf.get_split_kws()
cpp = aa.CPP(df_parts=df_parts, split_kws=split_kws, verbose=False)
df_feat = cpp.run(labels=labels, n_filter=25)

# Scan mutations in the TMD, then evaluate with the default threshold (th=None)
seqmut = aa.SeqMut()
df_scan = seqmut.scan(df_seq=df_seq, df_feat=df_feat, region="tmd")
aa.display_df(seqmut.eval(df_scan=df_scan), n_rows=10, show_shape=True)
DataFrame shape: (20, 6)
    entry   region  n_mut  n_disruptive  frac_disruptive  mean_delta_cpp
1   P16070  tmd     437    150           0.343249         0.251832
2   P09803  tmd     437    152           0.347826         0.195534
3   Q03157  tmd     437    149           0.340961         0.200584
4   P05556  tmd     437    142           0.324943         0.265178
5   Q06481  tmd     437    141           0.322654         0.191197
6   P05067  tmd     437    147           0.336384         0.211840
7   P70180  tmd     437    124           0.283753         0.201154
8   P01135  tmd     437    147           0.336384         0.253857
9   P35070  tmd     437    154           0.352403         0.213124
10  P16234  tmd     437    155           0.354691         0.218947
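One way to read such a table is to rank entries by their disruptive rate. The sketch below rebuilds a few rows from the output shown above (values copied from the display, not recomputed by the library) and recomputes frac_disruptive as n_disruptive / n_mut, which matches the displayed values for these rows.

```python
import pandas as pd

# A few rows copied from the displayed df_eval output
df_eval_head = pd.DataFrame({
    "entry": ["P16070", "P09803", "P70180", "P16234"],
    "region": ["tmd"] * 4,
    "n_mut": [437] * 4,
    "n_disruptive": [150, 152, 124, 155],
})

# Recompute the disruptive rate from the two count columns
df_eval_head["frac_disruptive"] = (
    df_eval_head["n_disruptive"] / df_eval_head["n_mut"]
)

# Highest disruptive rate first
ranked = df_eval_head.sort_values("frac_disruptive", ascending=False)
print(ranked[["entry", "frac_disruptive"]])
```

Because the default threshold is a quantile of the observed values, the rates cluster around one third. To compare against a fixed cutoff instead, pass a value for th, the second parameter in the signature.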