aaanalysis.SequenceFeature.get_feature_positions
- static SequenceFeature.get_feature_positions(features=None, start=1, tmd_len=20, jmd_n_len=10, jmd_c_len=10, tmd_seq=None, jmd_n_seq=None, jmd_c_seq=None)
Create a list of the corresponding positions or amino acids for the given features.
- Parameters:
features (array-like, shape (n_features,)) – List of feature ids.
start (int, default=1) – Position label of the first residue (starting at the N-terminus).
tmd_len (int, default=20) – Length of TMD (>0).
jmd_n_len (int, default=10) – Length of JMD-N (>=0).
jmd_c_len (int, default=10) – Length of JMD-C (>=0).
tmd_seq (str, optional) – Sequence of TMD. If given, respective amino acid segments/patterns will be returned instead of positions.
jmd_n_seq (str, optional) – Sequence of JMD-N. If given, respective amino acid segments/patterns will be returned instead of positions.
jmd_c_seq (str, optional) – Sequence of JMD-C. If given, respective amino acid segments/patterns will be returned instead of positions.
- Returns:
List of positions or amino acids for each feature.
- Return type:
list_pos or list_aa
Notes
Length parameters (tmd_len, jmd_n_len, jmd_c_len) must match the ids in features.
Sequence lengths (tmd_seq, jmd_n_seq, jmd_c_seq) must match the ids in features.
Examples
To obtain feature positions, we retrieve feature ids using the SequenceFeature().get_features() method:
import aaanalysis as aa
sf = aa.SequenceFeature()
split_kws = sf.get_split_kws(n_split_min=10, n_split_max=10, split_types=["Segment"])
features = sf.get_features(split_kws=split_kws, list_scales=["ARGP820101"])
print(features[0:5])
['TMD-Segment(1,10)-ARGP820101', 'TMD-Segment(2,10)-ARGP820101', 'TMD-Segment(3,10)-ARGP820101', 'TMD-Segment(4,10)-ARGP820101', 'TMD-Segment(5,10)-ARGP820101']
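The ids printed above follow a PART-SPLIT-SCALE pattern. As a minimal sketch of how such an id can be taken apart, the snippet below uses a hypothetical helper, parse_feature_id, which is not part of aaanalysis. It assumes Segment-style ids with integer arguments, as shown on this page, and no hyphens inside the individual components.

```python
import re


def parse_feature_id(feature_id):
    """Hypothetical helper: split a 'PART-SPLIT-SCALE' id into its components.

    Assumes Segment-style splits with integer arguments, e.g. 'Segment(1,10)'.
    """
    part, split, scale = feature_id.split("-")
    match = re.fullmatch(r"(\w+)\((.*)\)", split)
    if match is None:
        raise ValueError(f"Unexpected split format: {split!r}")
    split_type, raw_args = match.groups()
    # For a Segment, the arguments are (i-th segment, number of segments)
    args = [int(arg) for arg in raw_args.split(",")]
    return {"part": part, "split_type": split_type, "args": args, "scale": scale}


print(parse_feature_id("TMD-Segment(1,10)-ARGP820101"))
```

Here the split arguments (1, 10) read as "first of ten segments", which is the information the position lookup needs.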
A list of feature positions can now be created using the SequenceFeature().get_feature_positions() method:
feature_names = sf.get_feature_positions(features=features)
print(feature_names[0:5])
['11,12', '13,14', '15,16', '17,18', '19,20']
The start position and the length of the sequence parts (tmd_len, jmd_n_len, and jmd_c_len) can be adjusted:
# Shift start position from 1 to 20
feature_names = sf.get_feature_positions(features=features, start=20)
print(feature_names[0:5])
['30,31', '32,33', '34,35', '36,37', '38,39']
# Change TMD length from 20 to 40 (shown with get_feature_names, which takes the same length parameters)
feature_names = sf.get_feature_names(features=features, tmd_len=40)
print(feature_names[0:5])
['Hydrophobicity [11-14]', 'Hydrophobicity [15-18]', 'Hydrophobicity [19-22]', 'Hydrophobicity [23-26]', 'Hydrophobicity [27-30]']
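The position ranges in the outputs above are consistent with simple arithmetic: the TMD starts at start + jmd_n_len and is cut into equally long segments. The sketch below reproduces that arithmetic in plain Python. The helpers segment_positions and as_label are hypothetical and not part of aaanalysis, and the sketch assumes tmd_len is divisible by the number of segments. How the library handles uneven splits is not covered here.

```python
def segment_positions(i_th, n_split, tmd_len=20, jmd_n_len=10, start=1):
    """Hypothetical helper: positions covered by TMD-Segment(i_th, n_split).

    Assumes the TMD divides evenly into n_split segments.
    """
    if tmd_len % n_split != 0:
        raise ValueError("This sketch only handles evenly divisible TMD lengths")
    seg_len = tmd_len // n_split
    # The TMD begins right after the JMD-N, which begins at `start`
    first = start + jmd_n_len + (i_th - 1) * seg_len
    return list(range(first, first + seg_len))


def as_label(positions):
    """Join positions into a comma-separated label such as '11,12'."""
    return ",".join(str(pos) for pos in positions)


print([as_label(segment_positions(i, 10)) for i in range(1, 6)])
print([as_label(segment_positions(i, 10, start=20)) for i in range(1, 6)])
print(segment_positions(1, 10, tmd_len=40))
```

With the defaults this gives '11,12' for the first segment, with start=20 it gives '30,31', and with tmd_len=40 the first segment spans positions 11 to 14, in line with the outputs shown above.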
To obtain amino acid segments or patterns instead of positions, provide the sequence parts matching the respective features using the tmd_seq, jmd_n_seq, and jmd_c_seq parameters:
tmd_seq = "ABCDEFGHIJKLMNOPQRST"
feature_names = sf.get_feature_positions(features=features, tmd_seq=tmd_seq)
print(feature_names[0:5])
['AB', 'CD', 'EF', 'GH', 'IJ']
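The residue output above corresponds to slicing the TMD sequence into equally long pieces. A minimal sketch of that mapping follows. The helper segment_residues is hypothetical (not part of aaanalysis) and assumes the sequence length is divisible by the number of segments.

```python
def segment_residues(i_th, n_split, tmd_seq):
    """Hypothetical helper: residues covered by TMD-Segment(i_th, n_split).

    Assumes len(tmd_seq) divides evenly into n_split segments.
    """
    if len(tmd_seq) % n_split != 0:
        raise ValueError("This sketch only handles evenly divisible sequences")
    seg_len = len(tmd_seq) // n_split
    begin = (i_th - 1) * seg_len  # 0-based index of the segment's first residue
    return tmd_seq[begin:begin + seg_len]


tmd_seq = "ABCDEFGHIJKLMNOPQRST"
print([segment_residues(i, 10, tmd_seq) for i in range(1, 6)])
```

For the 20-residue example sequence, each of the ten segments holds two residues, so the first five segments are 'AB', 'CD', 'EF', 'GH', and 'IJ'.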