aaanalysis.StructurePreprocessor.build_cat

StructurePreprocessor.build_cat(features=None, dim_names_override=None)

Build the df_cat metadata frame for features.

This is a pure registry lookup and requires no corpus. df_cat[category] is always 'Structure' for every StructurePreprocessor feature; the per-key semantics live in df_cat[subcategory] (see registry).

Parameters:
  • features (list of str) – Feature keys from the StructurePreprocessor registry, in the order they appear along the D axis of the encoder outputs.

  • dim_names_override (list of str, optional) – Replacement names for the D columns; length must equal the total dimensionality across features.

Returns:

df_cat – One row per dimension, with columns scale_id, category, subcategory, scale_name, scale_description. category is the top-level color/redundancy bucket; subcategory carries the fine-grained semantic split ('DSSP_SS_3state', 'Flexibility_bfactor', etc.).

Return type:

pd.DataFrame, shape (D_total, 5)
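
Example (illustrative sketch):

The shape and column contract described above can be mimicked with plain pandas. This sketch does not call aaanalysis. MOCK_REGISTRY, its feature keys ('ss3', 'bfactor'), the dimension names, and build_cat_sketch are all hypothetical stand-ins; only the five column names, the constant 'Structure' category, and the two subcategory strings come from this page. Which column dim_names_override fills is also an assumption (here scale_id and scale_name).

```python
import pandas as pd

# Hypothetical stand-in for the real registry: feature key -> (subcategory, dimension names).
# The keys and dimension names are invented for illustration only.
MOCK_REGISTRY = {
    "ss3": ("DSSP_SS_3state", ["H", "E", "C"]),
    "bfactor": ("Flexibility_bfactor", ["bfactor"]),
}

COLUMNS = ["scale_id", "category", "subcategory", "scale_name", "scale_description"]


def build_cat_sketch(features, dim_names_override=None):
    """Assemble a df_cat-like frame with one row per dimension, in feature order."""
    # Expand each feature key into its dimensions, preserving the given order.
    dims = []
    for key in features:
        subcategory, dim_names = MOCK_REGISTRY[key]
        for dim in dim_names:
            dims.append((key, subcategory, dim))

    d_total = len(dims)
    if dim_names_override is not None:
        # The override must supply exactly one name per dimension.
        if len(dim_names_override) != d_total:
            raise ValueError(
                f"dim_names_override has {len(dim_names_override)} names, expected {d_total}"
            )
        names = list(dim_names_override)
    else:
        names = [f"{key}_{dim}" for key, _, dim in dims]

    records = [
        {
            "scale_id": name,  # assumption: the override replaces this name
            "category": "Structure",  # constant for every row, as stated above
            "subcategory": subcategory,
            "scale_name": name,  # assumption: the override replaces this name
            "scale_description": f"{key}: {dim}",
        }
        for name, (key, subcategory, dim) in zip(names, dims)
    ]
    return pd.DataFrame(records, columns=COLUMNS)


df_cat = build_cat_sketch(["ss3", "bfactor"])
print(df_cat.shape)  # (D_total, 5), where D_total is 3 + 1 in this mock registry
```

Because category is constant, any grouping or coloring of dimensions downstream would have to key on subcategory rather than category.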