aaanalysis.SequenceFeature.get_features
- SequenceFeature.get_features(list_parts=None, all_parts=False, split_kws=None, list_scales=None)
Create a list of all feature ids for the given Parts, Splits, and Scales.
- Parameters:
list_parts (list of str, default=["tmd", "jmd_n_tmd_n", "tmd_c_jmd_c"]) – Names of the sequence parts to use (e.g., 'tmd'). Length should be >= 1.
all_parts (bool, default=False) – Whether to create features for all possible sequence parts (if True) or only for the parts given by list_parts.
split_kws (dict, optional) – Dictionary with a parameter dictionary for each chosen split_type. Default from SequenceFeature.get_split_kws().
list_scales (list of str, optional) – Names of scales. Default scales from load_scales() with name='scales'.
- Returns:
features – Ids of all possible features for the given combination of Parts, Splits, and Scales, in the form PART-SPLIT-SCALE.
- Return type:
list of str
Notes
If ext_len in aaanalysis.options is not set to a value > 0, the following parts containing the extended TMD are not considered for all_parts=True: ['tmd_e', 'ext_c', 'ext_n', 'ext_n_tmd_n', 'tmd_c_ext_c'].
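The filtering described in this note can be pictured with a small stand-alone sketch. It does not call aaanalysis: `drop_extended_parts` is a hypothetical helper written only for illustration, and only the five part names are taken from the note itself.

```python
# Part names that contain the extended TMD, as listed in the note.
EXTENDED_PARTS = ("tmd_e", "ext_c", "ext_n", "ext_n_tmd_n", "tmd_c_ext_c")


def drop_extended_parts(parts, ext_len=0):
    """Hypothetical helper: keep extended parts only if ext_len > 0."""
    if ext_len > 0:
        return list(parts)
    return [p for p in parts if p not in EXTENDED_PARTS]


candidates = ["tmd", "tmd_e", "jmd_n_tmd_n", "ext_n_tmd_n"]
print(drop_extended_parts(candidates, ext_len=0))  # ['tmd', 'jmd_n_tmd_n']
print(drop_extended_parts(candidates, ext_len=4))  # all four names are kept
```

How the library itself applies this rule internally is not shown in this page, so the sketch should be read as a mental model rather than as its implementation.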
Examples
By default, the SequenceFeature().get_features() method creates all features for the default Parts, Splits, and Scales:

    import aaanalysis as aa
    sf = aa.SequenceFeature()
    features = sf.get_features()
    print(f"{len(features)} features were created, such as:")
    print(features[0:5])

Output:

    580140 features were created, such as:
    ['TMD-Segment(1,1)-ANDN920101', 'TMD-Segment(1,1)-ARGP820101', 'TMD-Segment(1,1)-ARGP820102', 'TMD-Segment(1,1)-ARGP820103', 'TMD-Segment(1,1)-BEGF750101']
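Because every feature id has the form PART-SPLIT-SCALE, an id can be taken apart again with plain string handling. The following is a minimal sketch, not part of aaanalysis: `parse_feature_id` is a hypothetical helper, and it assumes that part and split names contain no hyphens.

```python
def parse_feature_id(feature_id):
    """Split a PART-SPLIT-SCALE id into its three components."""
    # maxsplit=2 leaves any further hyphens inside the scale name
    # (assumption: part and split names are hyphen-free).
    part, split, scale = feature_id.split("-", 2)
    return part, split, scale


print(parse_feature_id("TMD-Segment(1,1)-ANDN920101"))
# ('TMD', 'Segment(1,1)', 'ANDN920101')
```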
Besides the default parts, the default splits can be retrieved using the SequenceFeature().get_split_kws() method and the scales by using the load_scales() function:

    split_kws = sf.get_split_kws()
    list_scales = list(aa.load_scales())
    list_parts = ["tmd", "jmd_n_tmd_n", "tmd_c_jmd_c"]
    features = sf.get_features(list_parts=list_parts, split_kws=split_kws, list_scales=list_scales)
    n_parts = len(list_parts)
    n_scales = len(list_scales)
    n_splits = int(len(features) / (n_parts * n_scales))
    print(f"{n_parts} parts x {n_splits} splits x {n_scales} scales = {len(features)} features")

Output:

    3 parts x 330 splits x 586 scales = 580140 features
To obtain features for all Parts, set all_parts=True:

    features = sf.get_features(all_parts=True)
    print(f"{len(features)} features were created")

Output:

    1547040 features were created
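The reported totals can be cross-checked with simple arithmetic, since the number of features is the product of the numbers of parts, splits, and scales. The sketch below uses only the figures printed in the example outputs (330 splits, 586 scales, 580140 and 1547040 features). The resulting part count for all_parts=True is derived from those figures, not read from the library.

```python
n_splits, n_scales = 330, 586  # from the "parts x splits x scales" output
n_default = 580140             # features for the three default parts
n_all = 1547040                # features for all_parts=True

features_per_part = n_splits * n_scales
print(features_per_part)               # 193380
print(n_default // features_per_part)  # 3
print(n_all // features_per_part)      # 8
```

Both totals divide evenly by 193380, which is consistent with every part contributing the same number of Split and Scale combinations.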
Parts and Scales can be easily changed by adjusting their respective lists. To change Splits, you can create a new split_kws:

    split_kws = sf.get_split_kws(split_types=["Segment"], n_split_min=5, n_split_max=5)
    features = sf.get_features(list_parts=["tmd"], list_scales=["scale_1"], split_kws=split_kws)
    print(f"{len(features)} features were created: ")
    print(features)

Output:

    5 features were created: 
    ['TMD-Segment(1,5)-scale_1', 'TMD-Segment(2,5)-scale_1', 'TMD-Segment(3,5)-scale_1', 'TMD-Segment(4,5)-scale_1', 'TMD-Segment(5,5)-scale_1']
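The naming pattern in this output can be reproduced without aaanalysis. The sketch below is an illustration only: `segment_feature_ids` is a hypothetical helper that mimics the ids printed here (upper-cased part name, `Segment(i,n_split)`, scale name) and makes no claim about how the library builds them.

```python
def segment_feature_ids(part, scale, n_split_min, n_split_max):
    """Hypothetical helper: build PART-Segment(i,n)-SCALE ids."""
    ids = []
    for n_split in range(n_split_min, n_split_max + 1):
        # Segment(i,n_split) denotes the i-th of n_split segments.
        for i in range(1, n_split + 1):
            ids.append(f"{part.upper()}-Segment({i},{n_split})-{scale}")
    return ids


print(segment_feature_ids("tmd", "scale_1", n_split_min=5, n_split_max=5))
# ['TMD-Segment(1,5)-scale_1', 'TMD-Segment(2,5)-scale_1',
#  'TMD-Segment(3,5)-scale_1', 'TMD-Segment(4,5)-scale_1',
#  'TMD-Segment(5,5)-scale_1']
```

In this sketch, widening the range between n_split_min and n_split_max adds one group of ids per segment count, which is why the number of Segment features grows quickly with n_split_max.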