aaanalysis.CPPGrid.eval
- CPPGrid.eval(sort_by='avg_ABS_AUC', ascending=None)
Score the swept configurations and return df_params joined to per-config quality, best-first.

Aggregates each configuration’s feature table (list_df_feat_) into the same discriminative-power columns that CPP.eval() reports, and joins them onto df_params. Here avg_ABS_AUC is the mean of the per-feature abs_auc in that configuration’s df_feat. The result is sorted so the best configuration is df.iloc[0]. Configurations that errored (df_feat is None) get NaN quality and sort last.

- Parameters:
sort_by (str, default='avg_ABS_AUC') – Quality column to rank by. One of the added columns (avg_ABS_AUC, avg_abs_mean_dif, n_features) or any existing df_params column.

ascending (bool, optional) – Sort direction. None (default) picks the sensible direction: descending for the higher-is-better metrics (avg_ABS_AUC, avg_abs_mean_dif), ascending for everything else.
- Returns:
df_eval – df_params with appended quality columns, sorted best-first. One row per configuration; the original product-order index is preserved so self.list_df_feat_[i] still maps to row label i.
- Return type:
pd.DataFrame
Notes
Call run() first; otherwise a RuntimeError is raised.

Redundancy (n_clusters) is not computed here: grid configurations can use different df_parts / df_scales, so the per-set clustering that CPP.eval() performs is not well-defined across the sweep. Use CPP.eval() on a single df_feat if you need it.
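
The aggregation and default sort direction described above can be sketched in plain pandas. This is an illustration under stated assumptions, not the library's implementation: `summarize`, `eval_sketch`, and the toy `df_params` / `list_df_feat` values are hypothetical, `avg_abs_mean_dif` is assumed to be the mean of `abs_mean_dif` by analogy with `avg_ABS_AUC`, and `n_features` is assumed to be NaN for an errored configuration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for df_params and list_df_feat_ (not real CPPGrid output)
df_params = pd.DataFrame({"n_filter": [10, 25, 50]})
list_df_feat = [
    pd.DataFrame({"abs_auc": [0.50, 0.40], "abs_mean_dif": [0.30, 0.20]}),
    None,  # a configuration that errored
    pd.DataFrame({"abs_auc": [0.48, 0.46, 0.44], "abs_mean_dif": [0.25, 0.20, 0.15]}),
]

# Metrics for which larger values are better (sorted descending by default)
HIGHER_IS_BETTER = {"avg_ABS_AUC", "avg_abs_mean_dif"}


def summarize(df_feat):
    """Collapse one feature table into per-configuration quality values."""
    if df_feat is None:
        # Assumption: every quality column is NaN for an errored configuration
        return {"avg_ABS_AUC": np.nan, "avg_abs_mean_dif": np.nan, "n_features": np.nan}
    return {
        "avg_ABS_AUC": df_feat["abs_auc"].mean(),
        "avg_abs_mean_dif": df_feat["abs_mean_dif"].mean(),
        "n_features": len(df_feat),
    }


def eval_sketch(df_params, list_df_feat, sort_by="avg_ABS_AUC", ascending=None):
    """Join quality columns onto df_params and sort best-first, NaN rows last."""
    df_quality = pd.DataFrame(
        [summarize(df_feat) for df_feat in list_df_feat], index=df_params.index
    )
    df_eval = df_params.join(df_quality)
    if ascending is None:
        # Descending for higher-is-better metrics, ascending for everything else
        ascending = sort_by not in HIGHER_IS_BETTER
    return df_eval.sort_values(sort_by, ascending=ascending, na_position="last")


df_eval = eval_sketch(df_params, list_df_feat)
print(df_eval)
```

In this toy input the third configuration ranks first (mean abs_auc 0.46 versus 0.45), the errored one sorts last, and the index labels still refer to the original positions in `list_df_feat`.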
See also
run(): produces the list_df_feat_ / df_params_ this consumes.

CPP.eval(): the per-configuration evaluator whose avg_ABS_AUC this matches.
Examples
After CPPGrid.run, CPPGrid.eval scores every configuration and returns df_params joined to per-configuration quality columns (avg_ABS_AUC is the mean of each feature table’s abs_auc), sorted best-first so df.iloc[0] is the best configuration.

    import aaanalysis as aa
    aa.options["verbose"] = False

    # Load a small example dataset and its class labels
    df_seq = aa.load_dataset(name="DOM_GSEC", n=10)
    labels = df_seq["label"].to_list()

    # Sweep two values of n_filter, then score each configuration
    grid = aa.CPPGrid(df_seq=df_seq, labels=labels, n_jobs=1, random_state=0)
    grid.run(params_cpp={"n_filter": [10, 25]})
    df_eval = grid.eval(sort_by="avg_ABS_AUC")
    df_eval

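|   | n_filter | df_scales | n_warnings | n_errors | avg_ABS_AUC | avg_abs_mean_dif | n_features |
|---|----------|-----------|------------|----------|-------------|------------------|------------|
| 0 | 10       | 0         | 0          | 0        | 0.4680      | 0.23950          | 10         |
| 1 | 25       | 0         | 0          | 0        | 0.4548      | 0.21852          | 25         |

The best configuration’s feature table is then grid.list_df_feat_[df_eval.index[0]].

    # Look up the feature table by row label, not by position
    best_i = df_eval.index[0]
    grid.list_df_feat_[best_i].head()
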
|   | feature | category | subcategory | scale_name | scale_description | abs_auc | abs_mean_dif | mean_dif | std_test | std_ref | p_val_mann_whitney | p_val_fdr_bh | positions |
|---|---------|----------|-------------|------------|-------------------|---------|--------------|----------|----------|---------|--------------------|--------------|-----------|
| 0 | JMD_N_TMD_N-Segment(1,10)-ZIMJ680101 | Polarity | Hydrophobicity | Hydrophobicity | Hydrophobicity (Zimmerman et al., 1968) | 0.500 | 0.361 | 0.361 | 0.156 | 0.150 | 0.000157 | 1.0 | 1,2 |
| 1 | JMD_N_TMD_N-Pattern(C,2,5,8,12)-PALJ810110 | Conformation | β-sheet | β-sheet | Normalized frequency of beta-sheet in all-beta... | 0.470 | 0.233 | -0.233 | 0.092 | 0.095 | 0.000381 | 1.0 | 9,13,16,19 |
| 2 | TMD-Pattern(N,1,5,8,11)-PALJ810110 | Conformation | β-sheet | β-sheet | Normalized frequency of beta-sheet in all-beta... | 0.470 | 0.233 | -0.233 | 0.092 | 0.095 | 0.000381 | 1.0 | 11,15,18,21 |
| 3 | TMD_C_JMD_C-Pattern(N,4,8,12)-TANS770105 | Conformation | β-turn (C-term) | β-turn (3rd residue) | Normalized frequency of chain reversal S (Tana... | 0.470 | 0.230 | -0.230 | 0.061 | 0.111 | 0.000381 | 1.0 | 24,28,32 |
| 4 | TMD_C_JMD_C-Pattern(C,8,12,15)-AURR980102 | Conformation | Linker (6-14 AA) | α-helix (N-terminal, outside) | Normalized positional residue frequency at hel... | 0.465 | 0.189 | 0.189 | 0.054 | 0.099 | 0.000440 | 1.0 | 26,29,33 |
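
Because the sorted result keeps its original product-order index, feature tables should be looked up by row label rather than by sorted position. The following is a minimal sketch of that lookup using hypothetical stand-in data (the values and the string placeholders for feature tables are invented for illustration); it also shows one way to skip configurations whose quality is NaN.

```python
import pandas as pd

# Hypothetical sorted result: index labels are the original product-order positions
df_eval = pd.DataFrame(
    {"n_filter": [25, 10, 50], "avg_ABS_AUC": [0.47, 0.45, float("nan")]},
    index=[1, 0, 2],
)

# Hypothetical stand-in for list_df_feat_, kept in product order (None = errored)
list_df_feat = ["df_feat for n_filter=10", "df_feat for n_filter=25", None]

# The top-ranked row is df_eval.iloc[0]; its label indexes the original list
best_label = df_eval.index[0]
best_feat = list_df_feat[best_label]

# Drop errored configurations, then collect feature tables in ranked order
df_ok = df_eval.dropna(subset=["avg_ABS_AUC"])
ranked_feats = [list_df_feat[label] for label in df_ok.index]

print(best_label, best_feat)
print(ranked_feats)
```

Indexing the list with the sorted position (0) instead of the label (1) would return the feature table of a different configuration, which is why the example above goes through `df_eval.index`.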