aaanalysis.SequencePreprocessor.get_aa_window

static SequencePreprocessor.get_aa_window(seq=None, pos_start=0, pos_stop=None, window_size=None, index1=False, gap='-', accept_gap=True)

Extracts a window of amino acids from a sequence.

This window starts from a given start position (pos_start) and stops either at a defined stop position (pos_stop) or after a number of residues defined by window_size.

Parameters:
  • seq (str) – The protein sequence from which to extract the window.

  • pos_start (int, default=0) – The starting position (>=0) of the window.

  • pos_stop (int, optional) – The ending position (>=pos_start) of the window. If None, window_size is used to determine it.

  • window_size (int, optional) – The size of the window (>=1) to extract. Only used if pos_stop is None.

  • index1 (bool, default=False) – Whether position index starts at 1 (if True) or 0 (if False), where the first amino acid is at position 1 or 0, respectively.

  • gap (str, default='-') – The character used to represent gaps.

  • accept_gap (bool, default=True) – Whether to accept gaps in the window. If True, C-terminal padding with the gap character is enabled.

Returns:

window – The extracted window of amino acids.

Return type:

str

Notes

  • A ValueError is raised if both pos_stop and window_size are None or if both are provided.
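
The rule above can be sketched in plain Python. This is only an illustration of the "exactly one of the two" check, not the library's code: the helper name check_window_args and the error messages are hypothetical.

```python
def check_window_args(pos_stop=None, window_size=None):
    """Hypothetical helper (not part of aaanalysis): require exactly one argument."""
    if pos_stop is None and window_size is None:
        raise ValueError("Either 'pos_stop' or 'window_size' must be given.")
    if pos_stop is not None and window_size is not None:
        raise ValueError("Give only one of 'pos_stop' and 'window_size'.")


# Both invalid combinations from the note raise a ValueError
for kwargs in ({}, {"pos_stop": 5, "window_size": 6}):
    try:
        check_window_args(**kwargs)
    except ValueError as err:
        print(f"{kwargs} -> ValueError: {err}")
```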

Examples

You can obtain a defined amino acid window (a subsequence of defined length) from a protein sequence using the SequencePreprocessor().get_aa_window() method. We first create an example sequence and the SequencePreprocessor() object as follows:

import aaanalysis as aa

seq = "ABCDEFGHIJ"
sp = aa.SequencePreprocessor()

Provide the sequence as the seq parameter and specify the (inclusive) stop position using the pos_stop parameter:

# Get amino acid window of size 6
window = sp.get_aa_window(seq=seq, pos_stop=5)
print(window)
ABCDEF

You can change the start position (default=0) using the pos_start parameter:

# Get amino acid window of size 4
window = sp.get_aa_window(seq=seq, pos_start=2, pos_stop=5)
print(window)
CDEF

Instead of defining the stop position, you can set a specific length using the window_size parameter:

# Get amino acid window of size 7
window = sp.get_aa_window(seq=seq, pos_start=2, window_size=7)
print(window)
CDEFGHI
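
Judging from the outputs above, the stop position is inclusive, so a window of size n starting at pos_start ends at pos_start + n - 1. The following sketch reproduces that relationship with plain string slicing. It does not call aaanalysis and only mirrors the documented results for windows that stay inside the sequence.

```python
seq = "ABCDEFGHIJ"
pos_start = 2

# Inclusive stop position, as in the pos_stop=5 example above
pos_stop = 5
window_by_stop = seq[pos_start:pos_stop + 1]

# The window size that selects the same residues
window_size = pos_stop - pos_start + 1
window_by_size = seq[pos_start:pos_start + window_size]

print(window_by_stop, window_by_size, window_size)
# CDEF CDEF 4
```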

If you wish to start counting residue positions from 1 instead of 0, set index1=True:

# Get amino acid window of size 7
window = sp.get_aa_window(seq=seq, pos_start=2, window_size=7, index1=True)
print(window)
BCDEFGH
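
With 1-based counting, position p refers to the residue at 0-based index p - 1. A minimal sketch of that conversion using plain slicing (an illustration of the result above, not the library's implementation):

```python
seq = "ABCDEFGHIJ"
pos_start = 2      # 1-based: the second residue, "B"
window_size = 7

# Shift the 1-based start to a 0-based index before slicing
start0 = pos_start - 1
window = seq[start0:start0 + window_size]

print(window)
# BCDEFGH
```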

Selecting a window that extends beyond the end of the sequence results in gaps (default='-'). This behavior is enabled by default and can be disabled by setting accept_gap=False:

# Get amino acid window of size 10 (two gaps)
window = sp.get_aa_window(seq=seq, pos_start=2, window_size=10, accept_gap=True)
print(window)
CDEFGHIJ--
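
The C-terminal padding shown above can be approximated in plain Python by slicing and then right-padding with the gap character. The helper name pad_window is hypothetical and not part of aaanalysis. The sketch assumes a single-character gap and 0-based positions.

```python
def pad_window(seq, pos_start, window_size, gap="-"):
    """Hypothetical helper (not part of aaanalysis): slice, then pad on the right."""
    # Slicing past the end of a string simply returns the available residues
    window = seq[pos_start:pos_start + window_size]
    # str.ljust appends the gap character until the requested length is reached
    return window.ljust(window_size, gap)


print(pad_window("ABCDEFGHIJ", pos_start=2, window_size=10))
# CDEFGHIJ--
```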