aaanalysis.dPULearnPlot.eval

static dPULearnPlot.eval(df_eval=None, figsize=(6, 4), dict_xlims=None, legend=True, legend_y=-0.175, colors=None)

Plot evaluation output of dPULearn comparing multiple sets of identified negatives.

Evaluation measures can be grouped into ‘Homogeneity’ measures (‘avg STD’ and ‘avg IQR’) assessing the similarity within the sets of identified negatives, and ‘Dissimilarity’ measures (‘avg AUC’, ‘avg KLD’) assessing the dissimilarity between the identified negatives and the other reference groups including positive samples (‘Pos’), unlabeled samples (‘Unl’), and ground-truth negative samples (‘Neg’) if given.

Parameters:
  • df_eval (pd.DataFrame, shape (n_datasets, n_metrics)) –

    DataFrame with evaluation measures for sets of identified negatives. Each row corresponds to a specific dataset including identified negatives. Required columns are:

    • ‘name’: Name of datasets containing identified negatives (typically named by identification approach).

    • ‘avg_STD’: Average standard deviation (STD) assessing homogeneity of identified negatives.

    • ‘avg_IQR’: Average interquartile range (IQR) assessing homogeneity of identified negatives.

    • ‘avg_abs_AUC_DATASET’: Average absolute area under the curve (AUC), which assesses the similarity between the set of identified negatives and other datasets. One column is required with ‘DATASET’ set to ‘pos’ (positive samples) and one with ‘unl’ (unlabeled samples); a column with ‘neg’ (ground-truth negative samples) is optional.

    Optional columns include:

    • ‘avg_KLD_DATASET’: The average Kullback-Leibler Divergence (KLD), which measures the distribution alignment between the set of identified negatives and other datasets (‘pos’, ‘unl’, or ‘neg’).

  • figsize (tuple, default=(6, 4)) – Figure dimensions (width, height) in inches.

  • dict_xlims (dict, optional) – A dictionary containing x-axis limits for subplots. Keys should be the subplot axis number ({0, 1, 2, 3}) and values should be tuples specifying (xmin, xmax). If None, x-axis limits are auto-scaled.

  • legend (bool, default=True) – If True, the legend is placed below the dissimilarity measures.

  • legend_y (float, default=-0.175) – Vertical legend position relative to the plot y-axis. Applied only if legend=True.

  • colors (list of str, optional) – List of colors for identified negatives and the following reference datasets: positive samples (‘Pos’), unlabeled samples (‘Unl’), and ground-truth negative samples (‘Neg’).
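As an illustration of the dict_xlims format described above, the following sketch builds such a dictionary and checks that every value is an (xmin, xmax) pair. The helper name check_xlims is hypothetical and not part of aaanalysis, and the axis numbers and limits are placeholder values chosen only for illustration.

```python
def check_xlims(dict_xlims):
    """Hypothetical helper (not part of aaanalysis): validate x-axis limits."""
    if dict_xlims is None:
        return True  # None means the limits are auto-scaled
    for ax_index, limits in dict_xlims.items():
        # Keys are subplot axis numbers, so they must be non-negative integers
        if not isinstance(ax_index, int) or ax_index < 0:
            raise ValueError(f"Invalid subplot axis number: {ax_index!r}")
        # Values are (xmin, xmax) pairs
        if len(limits) != 2:
            raise ValueError(f"Limits for axis {ax_index} must be (xmin, xmax)")
        xmin, xmax = limits
        if not xmin < xmax:
            raise ValueError(f"xmin must be smaller than xmax for axis {ax_index}")
    return True


# Placeholder limits for the first and third subplot
dict_xlims = {0: (0.0, 1.0), 2: (0.0, 0.5)}
check_xlims(dict_xlims)
```

A dictionary that passes this check could then be supplied as the dict_xlims argument. Subplots without an entry are assumed to keep their auto-scaled limits.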

Returns:

  • fig (plt.Figure) – Figure object for evaluation plot.

  • axes (array of plt.Axes) – Array of Axes objects, each representing a subplot within the figure.

Notes

  • Ground-truth negatives are only shown if provided by df_eval.
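Whether ground-truth negatives and KLD measures would appear can be read off the column names of df_eval. The sketch below works on a plain list of column names, as one might obtain from list(df_eval.columns). The helper summarize_eval_columns is hypothetical and not part of aaanalysis, and the concrete column names are an assumption derived from the ‘avg_abs_AUC_DATASET’ and ‘avg_KLD_DATASET’ patterns described under Parameters.

```python
# Column names assumed from the documented patterns (not confirmed by the library)
REQUIRED_COLUMNS = ["name", "avg_STD", "avg_IQR",
                    "avg_abs_AUC_pos", "avg_abs_AUC_unl"]


def summarize_eval_columns(columns):
    """Hypothetical helper: check required columns and report optional ones."""
    columns = list(columns)
    missing = [col for col in REQUIRED_COLUMNS if col not in columns]
    if missing:
        raise ValueError(f"df_eval is missing required columns: {missing}")
    return {
        # Ground-truth negatives are present only if their AUC column is given
        "has_neg": "avg_abs_AUC_neg" in columns,
        # KLD columns are optional
        "has_kld": any(col.startswith("avg_KLD_") for col in columns),
    }


# Column names only; stands in for list(df_eval.columns)
columns = ["name", "avg_STD", "avg_IQR", "avg_abs_AUC_pos", "avg_abs_AUC_unl"]
info = summarize_eval_columns(columns)
```

With only the required columns, both flags are False, which corresponds to a plot without ground-truth negatives and without KLD values.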

Examples

You can evaluate different sets of identified negative samples using the dPULearn().eval() method. First, load one of our example datasets with its respective features:

import matplotlib.pyplot as plt
import aaanalysis as aa
aa.options["verbose"] = False
# Dataset with positive (γ-secretase substrates)
# and unlabeled data (proteins with unknown substrate status)
df_seq = aa.load_dataset(name="DOM_GSEC_PU")
labels = df_seq["label"].to_numpy()
n_pos = sum([x == 1 for x in labels])
df_feat = aa.load_features(name="DOM_GSEC")
aa.display_df(df_seq)
  entry sequence label tmd_start tmd_stop jmd_n tmd jmd_c
1 P05067 MLPGLALLLLAAWTA...GYENPTYKFFEQMQN 1 701 723 FAEDVGSNKG AIIGLMVGGVVIATVIVITLVML KKKQYTSIHH
2 P14925 MAGRARSGLLLLLLG...EEEYSAPLPKPAPSS 1 868 890 KLSTEPGSGV SVVLITTLLVIPVLVLLAIVMFI RWKKSRAFGD
3 P70180 MRSLLLFTFSACVLL...RELREDSIRSHFSVA 1 477 499 PCKSSGGLEE SAVTGIVVGALLGAGLLMAFYFF RKKYRITIER
4 Q03157 MGPTSPAARGQGRRW...HGYENPTYRFLEERP 1 585 607 APSGTGVSRE ALSGLLIMGAGGGSLIVLSLLLL RKKKPYGTIS
5 Q06481 MAATGTAAAAATGRL...GYENPTYKYLEQMQI 1 694 716 LREDFSLSSS ALIGLLVIAVAIATVIVISLVML RKRQYGTISH
6 P35613 MAAALFVLLGFALLG...HQNDKGKNVRQRNSS 1 323 345 IITLRVRSHL AALWPFLGIVAEVLVLVTIIFIY EKRRKPEDVL
7 P35070 MDRAARCSGASSLPL...DITPINEDIEETNIA 1 119 141 LFYLRGDRGQ ILVICLIAVMVVFIILVIGVCTC CHPLRKRRKR
8 P09803 MGARCRSFSALLLLL...RFKKLADMYGGGEDD 1 711 733 GIVAAGLQVP AILGILGGILALLILILLLLLFL RRRTVVKEPL
9 P19022 MCRIAGALRTLLPLL...PRFKKLADMYGGGDD 1 724 746 RIVGAGLGTG AIIAILLCIIILLILVLMFVVWM KRRDKERQAK
10 P16070 MDKFWWHAAWGLCLV...DETRNLQNVDMKIGV 1 650 672 GPIRTPQIPE WLIILASLLALALILAVCIAVNS RRRCGQKKKL
11 P09603 MTAPGAAGRCPPTTW...GSPLTQDDRQVELPV 1 496 518 EGSFSPQLQE SVFHLLVPSVILVLLAVGGLLFY RWRRRSHQEP
12 O94985 MLRRPAPALAPAARL...TRQQQLEWDDSTLSY 1 860 882 PHPFAVVPST ATVVIVVCVSFLVFMIILGVFRI RAAHRRTMRD
13 Q9H4D0 MLPGRLCWVPLLLAL...ARQAQLEWDDSTLPY 1 831 853 SSIQHSSVVP SIATVVIIISVCMLVFVVAMGVY RVRIAHQHFI
14 P78310 MALLLCFVLLCGVVD...IPVMIPAQSKDGSIV 1 236 258 RLNVVPPSNK AGLIAGAIIGTLLALALIGLIIF CCRKKRREEK
15 D3ZZK3 MAGIFYFILFSFLFG...MRTQMQQMHGRMVPV 1 548 570 RIIGDGANST VLLVSVSGSVVLVVILIAAFVIS RRRSKYSQAK
16 Q14118 MRMSVGLSLLLPLSG...KNMTPYRSPPPYVPP 1 753 775 KSSEDDVYLH TVIPAVVVAAILLIAGIIAMICY RKKRKGKLTL
17 Q63155 MENSLGCVWVPKLAF...EGLMKQLNAITGSAF 1 1099 1121 SVTPQKNSNL LVITVVTVGVLTVLVVVIVAVIC TRRSSAQQRK
18 Q61483 MGRRSALALAVVSAL...VLSAEKDECVIATEV 1 545 567 HMESQGGPFP WVAVCAGVVLVLLLLLGCAAVVV CVRLKLQKHQ
19 Q9ERC8 MWILALSLFQSFANV...HLKGNNPYAKSYTLV 1 1595 1617 EGLTTNEGLK ILVTISCILVGVLLLFVLLLVVR RRRREQRLKR
20 P54763 MAVRRLGAALLLLPL...QVMRAQMNQIQSVEV 1 543 565 YQTSIKEKLP LIVGSSAAGLVFLIAVVVIAIVC NRRGFERADS
21 Q15303 MKPATGLWVWVSLLV...TVLPPPPYRHRNTVV 1 653 675 TLPQHARTPL IAAGVIGGLFILVIVGLTFAVYV RRKSIKKKRA
22 P16882 MDLCQVFLTLALAVT...SCGYVSTDQLNKIMQ 1 274 296 ILEACEEDIQ FPWFLIIIFGIFGVAVMLFVVIF SKQQRIKMLI
23 P04439 MAVMAPRTLLLLLSG...DSAQGSDVSLTACKV 1 308 330 WELSSQPTIP IVGIIAGLVLLGAVITGAVVAAV MWRRKSSDRK
24 P08069 MKSGSGGGSPTSLWG...RKNERALPLPQSSTC 1 936 958 AKTGYENFIH LIIALPVAVLLIVGGLVIMLYVF HRKRNNSRLG
25 P27930 MLRLYVLVMGVSAFT...TVLWPHHQDFQSYPK 1 347 369 LRTTVKEASS TFSWGIVLAPLSLAFLVLGGIWM HRRCKHRTGK
26 Q9Y219 MRAQGRGRLPRRLLL...RAVRSINEARYAGKE 1 1083 1105 VVTGGSSTGL LVPVLCGAFSVLWLACVVLCVWW TRKRRKERER
27 P15382 MILSNTTAVTPFLTK...IEQPNTHLPETKPSP 1 44 66 SPRSSDGKLE ALYVLMVLGFFGFFTLGIMLSYI RSKKLEHSND
28 Q9Y6J6 MSTLSNFTQTLEDVF...TIHENIGAAGFKMSP 1 47 69 LQAKVDAENF YYVILYLMVMIGMFSFIIVAILV STVKSKRREH
29 P11627 MVVMLRYVWPLLLCS...SSGATSPINPAVALE 1 1124 1146 VSTTGSFASE GWFIAFVSAIILLLLILLILCFI KRSKGGKYSV
30 P01130 MGPWGWKLRWTVALL...SYPSRQMVSLEDDVA 1 787 809 GRGNEKKPSS VRALSIVLPIVLLVFLCLGVFLL WKNWRLKNIN
31 P16150 MATLLLLLGVLVVSP...APAPDEPEGGDGAAP 1 255 277 FRNPDENSRG MLPVAVLVALLAVIVLVALLLLW RRRQKRRTGA
32 P0CC10 MAQAHIQGSPCPLLP...LFKSGSKENVQETQI 1 573 595 LKDLDDVMKT TKIIIGCFVAITFMAAVMLVAFY KLRKQHQLHK
33 Q07954 MLTPPLLLLLPLLSA...LLGRGPEDEIGDPLA 1 4421 4443 HVFSQQQPGH IASILIPLLLLLLLVLVAGVVFW YKRRVQGAKG
34 O75581 MGAVLRSLLACSFCV...HHLYPPPPSPCTDSS 1 1371 1393 YPTEEPAPQA TNTVGSVIGVIVTIFVSGTVYFI CQRMLCPRMK
35 Q924X6 MGRPELGALRPLALL...KCKRVALSLEDDGLP 1 859 881 GSQMGSTVTA AVIGVIVPIVVIALLCMSGYLIW RNWKRKNTKS
36 Q12866 MGPAPLPLLLGLFLP...LLFADDSSEGSEVLM 1 502 524 STPAPGNADP VLIIFGCFCGFILIGLILYISLA IRKRVQETKF
37 P08581 MKAPAVLAPGILVLL...DDEVDTRPASFWETS 1 933 955 VIVQPDQNFT GLIAGVVSISTALLLLLGFFLWL KKRKQIKDLG
38 P15941 MTPGTQSPFFLLLLL...LSYTNPAVAATSANL 1 1159 1181 SAQSGAGVPG WGIALLVLVCVLVALAIVYLIAL AVCQCRRKNY
39 Q9JKF6 MARMGLAGAAGRWWG...SQNDGSFISKKEWYV 1 355 377 GRRAGQMPTA IIGGVAGSVLLVLIVVGGIIVAL RRRRHTFKGD
40 Q62765 MALPRCMWPNYVWRA...PHPHPHPHSHSTTRV 1 697 719 VDQRDYSTEL SVTIAVGASLLFLNILAFAALYY KKDKRRHDVH
41 O35516 MPALRPAALRALLWL...THMSEPPHSNMQVYA 1 1680 1702 SELESPRNAQ LLYLLAVAVVIILFFILLGVIMA KRKRKHGFLW
42 Q61982 MGLGARGRRRRRRLM...LGPQPEVTPKRQVMA 1 1644 1666 PLEAPEQSVP LLPLLVAGAVFLLIIFILGVMVA RRKREHSTLW
43 P31695 MQPQLLLLLLLPLNF...VHQEIPLNSVVRNLN 1 1441 1463 QAGTRPSANQ LPWPILCSPVVGVLLLALGALLV LQLIRRRRRE
44 Q8CJ26 MLYNVSKGVVYSDTA...VVQVLSSPAESSSVV 1 52 74 FPPEPPGASS NIIPVYCALLATVILGLLAYVAF KCWRSHKQRQ
45 Q63373 MYQRMLRCGAELGSP...SANKNKKNKDKEYYV 1 392 414 AEVIRESSST TGMVVGIVAAAALCILILLYAMY KYRNRDEGSY
46 P15209 MSPWLKWHGPAMARL...QNLAKASPVYLDILG 1 431 453 VADQSNREHL SVYAVVVIASVVGFCLLVMLLLL KLARHSKFGM
47 Q86YL7 MWKVSALLFVLGSAS...IIVVVMRKMSGRYSP 1 130 152 TVEKDGLSTV TLVGIIVGVLLAIGFIGAIIVVV MRKMSGRYSP
48 Q13308 MGAARGSPARPRRLP...EIASALGDSTVDSKP 1 704 726 GSPPPYKMIQ TIGLSVGAAVAYIIAVLGLMFYC KKRCKAKRLQ
49 P10586 MAPEPAPGRTMVPLV...RAALEYLGSFDHYAT 1 1262 1284 PAQQQEEPEM LWVTGPVLAVILIILIVIAILLF KRKRTHSPSS
50 P28828 MRTLGTCLVTLAGLL...YKFCYEVALEYLNSG 1 743 764 PEKQTDHTVK IAGVIAGILLFVIIFLGVVLVM KKRKLAKKRK
51 Q7M729 MSRAGNRGNTQARWL...GLPGSKAEEKPPTKV 1 161 183 VVDKLEKVDN TVTLIILAVVGGVIGLLVCILLL KKLITFILKK
52 O75056 MKPGPPHRAGAAHGA...SVTYQKPDKQEEFYA 1 387 409 KSILERKEVL VAVIVGGVVGALFAAFLVTLLIY RMKKKDEGSY
53 P78324 MEPAGPAPGRLGPLL...EPSFSEYASVQVPRK 1 372 394 AENTGSNERN IYIVVGVVCTLLVALLMAALYLV RIRQKKAQGS
54 Q92673 MATRSSRRESRLPFL...PMITGFSDDVPMVIA 1 2136 2158 SATQAARSTD VAAVVVPILFLILLSLGVGFAIL YTKHRRLQSS
55 Q99523 MERPWGAADGLSRWP...NKSGYHDDSDEDLLE 1 756 778 SPEKQNSKSN SVPIILAIVGLMLVTVVAGVLIV KKYVCGGRFL
56 Q8BGV3 MARGLDLAPLLLLLL...VELKELGEMRSEPSL 1 269 291 PPQFSMKRLT AGVIAVIAVVSVAVVAGVVVLVV TKRRKSGKYK
57 P35590 MVWRVPPFLLPILFL...ENFTYAGIDATAEEA 1 764 786 EEGLDQQLIL AVVGSVSATCLTILAALLTLVCI RRSCLHRRRT
58 P08138 MGAGATGRAMDGPRL...LVESLCSESTATSPV 1 250 272 QPVVTRGTTD NLIPVYCSILAAVVVGLVAYIAF KRWNSCKQNK
59 Q02223 MLQMAGQCSQNEYFD...AALSATEIEKSISAR 1 54 76 SVTNSVKGTN AILWTCLGLSLIISLAVFVLMFL LRKINSEPLK
60 P19438 MGLSTVPDLLLPLVL...LCGPAALPPAPSLLR 1 212 234 VKGTEDSGTT VLLPLVIFFGLCLLSLLFIGLMY RYQRWKSKLY
61 Q06418 MALRRSMGRPGLPPL...RLLLLQQGLLPHSSC 1 429 451 QQGPPHSRTS WVPVVLGVLTALVTAAALALILL RKRRKETRFG
62 P30530 MAWRCPRMGRVPLAW...ADRGSPAAPGQEDGA 1 450 472 EPSTPAFSWP WWYVLLGAVVAAACVLILALFLV HRRKKETRYG
63 Q6EMK4 MCSRVPLLLPLLLLL...PGPGLQSPLHAKPYI 1 577 599 TQAREGNLPL LIAPALAAVLLAALAAVGAAYCV RRGRAMAAAA
64 P12821 MGAASGRRGPGLLLP...SHGPQFGSEVELRHS 2 1257 1276 GLDLDAQQAR VGQWLLLFLGIALLVATLGL SQRLFSIRHR
65 P36896 MAESAGASSFFPLVV...KKTLSQLSVQEDVKI 2 127 149 EHPSMWGPVE LVGIIAGPVFLLFLIIIIVFLVI NYHQRVYHNR
66 Q8NER5 MTRALCSALRQALLL...KKTISQLCVKEDCKA 2 114 136 PNAPKLGPME LAIIITVPVCLLSIAAMLTVWAC QGRQCSYRKK
67 P37023 MTLGSPRKGLLMLLM...LQKISNSPEKPKVIQ 2 119 141 PSEQPGTDGQ LALILGPVLALLALVALGVLGLW HVRRRQEKQR
68 O43184 MAARPLPVSPARALL...YPHQVPRSTHTAYIK 2 707 729 DSGPIRQADN QGLTIGILVTILCLLAAGFVVYL KRKTLIRLLF
69 Q13444 MRLALLWALGLLGAG...SRPAPPPPTVSSLYL 2 695 717 TTQLKATSSL TTGLLLSLLVLLVLVMLGASYWY RARLHQRLCQ
70 Q9Z0F8 MRRRLLILTTLVPFV...KLQRQSRVDSKETEC 2 672 694 NTFGKFLADN IVGSVLVFSLIFWIPFSILVHCV DKKLDKQYES
71 Q9Y3Q7 MFLLLALLTELGRLQ...NRNSSVVSESDDVGH 2 685 707 FYTEKGYNTH WNNWFILSFCIFLPFFIVFTTVI FKRNEISKSC
72 Q9R157 MPLLFILAELAMLFA...KRNERKIVPQGEHKI 2 684 703 TKRLSKNEDS WVILGFFIFLPFIVTFLVGI MKRNERKIVP
73 O35674 MPGRAGVARFCLLAL...EYRSQRVGAIISSKI 2 704 726 VDSGPLPPKS VGPVIAGVFSALFVLAVLVLLCH CYRQSHKLGK
74 O43506 MAVGEPLVHIRVTLL...VLFKKRTKSKEDEEG 2 692 714 MEGLNVMGKL RYLSLLCLLPLVAFLLFCLHVLF KKRTKSKEDE
75 Q9UKJ8 MAVDGTLVYIRVTLL...RQCSGPKETKAHSSG 2 685 707 PASAKRGVFL PLIVIPSLSVLTFLFTVGLLMYL RQCSGPKETK
76 Q9JI76 MECFIMLGADARTLM...KIPSGPKETKASSPG 2 687 709 SGPTSQKRRV IITVLSITVPVLSILICLLIAGL YRIYCKIPSG
77 O75077 MKPPGSSSRQPPLAG...NVKKRRFDPTQQGPI 2 794 816 GPKGPSATNL IIGSIAGAILVAAIVLGGTGWGF KNVKKRRFDP
78 Q9R160 MVAMSEALVHARITL...PSYETVKPPDEWANP 2 698 720 SKKDAPEKPN VIIWLLPIICVAVVLSVLFCLSG ATKKSREAAA
79 Q9R159 MQTTQRASSFAAAED...ENKEDTNEVMNTETE 2 706 728 TEKKHKKSIG LVILFWILFACFSVLFIVFLFFL RSYVELPMSE
80 Q9UKQ2 MLQGLLPVSLLLSVA...KDNPVSTPKDSNPKA 2 664 686 PDCDDSSVVF HFSIVVGVLFPMAVIFVVVAMVI RHQSSREKQK
81 Q9UKF5 MKMLLLLHCLGVFLS...PQLMPSQSQPPVTPS 2 676 698 PPPKRKKKKK FCYLCILLLIVLFILLCCLYRLC KKSKPIKKQQ
82 Q9UKF2 MRSVQIFLSQCRLLL...ESKRPKAKSVKKQKK 2 686 708 GLLRGAIPSS IWVVSIIMFRLILLILSVVFVFF RQVIGNHLKP
83 Q8TC27 MFRLWLLLAGLCGLL...RSKSQDSTQTQSSSN 2 681 703 IMERASGKTE NTWLLGFLIALPILIVTTAIVLA RKQLKKWFAK
84 Q9BZ11 MGWRPRRARGTPLLL...DPQADQVQMPRSCLW 2 13 35 WRPRRARGTP LLLLLLLLLLWPVPGAGVLQGHI PGQPVTPHWV
85 Q99965 MWRVLFLLSGLGGLR...YSSDEQPESESEPKG 2 689 711 IYHSKPMRWP FFLFIPFFIIFCVLIAIMVKVNF QRKKWRTEDY
86 Q3TTE0 MFLLLLLFLHLKGLQ...GSNTNVTSSGGSTSH 2 695 712 HSNLKKNQLQ LILYISLPLLVMISAVVI KQSKLSRVCD
87 Q9H2U9 MLPGCIFLMILLIPQ...SKDSRGIADPNQSAK 2 668 690 VACEETLHVT NITILVVVLVLVIVGIGVLILLV RYRKCIKLKQ
88 Q13443 MGSGARFPSGTLRVR...PARPAPAPPLYSSLT 2 699 718 NEMNTALRDG LLVFFFLIVPLIVCAIFIFI KRDQLWRSYF
89 Q60813 MSVAAAGRGFASSLS...SESSSSSSWSDSDSQ 2 741 763 KMEDEEVNLK VMVLVVPIFLVVLLCCLMLIAYL WSEVQEVVSP
90 Q8R534 MERLKLGKIPEHWCI...AAAEKKDEDEEEGEE 2 705 727 STEELILNLK LIVLAVILVLMILLIIICIISAY TKSETASEAG
91 A2AJA7 MCLPSHLLSTWVLFM...PESITSNPQSPPDLA 2 1161 1183 EVAAPVSVPV AVGGALLFFMFLVLMGLGGWHWL QKQHCPGQRS
92 Q86WK6 MHPHRDPRGLWLLLP...PESVSSVFSDTPIVV 2 371 393 FTLHGHHDTL NTAYTTLVGCILSVVLVLIYLYL TPCRCWCRGV
93 Q8K592 MLGTLGLWTLLPAAA...SPDPVGDTVQVYVNE 2 146 168 EPQATPGGPV WMALLLLGMFLVLLLSSIILALL QRKACRVQGG
94 Q9BXJ7 MGVLGRVLLWLQLCA...SYFVNPLFAGAEAEA 2 361 383 AHVWGSSAAG LAGGVAAAVLLALLVLLVAPPLL RRAGRLRWRR
95 P58335 MVAERSPARSPGSWL...DEVCIWECIEKELTA 2 319 341 IVTATECSNG IAAIIVILVLLLLLGIGLMWWFW PLCCKVVIKD
96 P51693 MGPASPAARGLSRRP...HGYENPTYRFLEERP 2 581 603 APAGTGVSRE AVSGLLIMGAGGGSLIVLSMLLL RRKKPYGAIS
97 O75882 MVAAAAATEARLRRR...RNRKQQPPAQPGTCI 2 1279 1301 AFSQHSNFMD LVQFFVTFFSCFLSLLLVAAVVW KIKQSCWASR
98 P27037 MGAAAKLAFAVFLIS...TMVTNVDFPPKESSL 2 139 161 PVTPKPPYYN ILLYSLVPLMLIAGIVICAFWVY RHHKMAYPPV
99 P56817 MAQALPWLLLWMGAG...RQQHDDFADDISLLK 2 455 477 YNIPQTDEST LMTIAYVMAAICALFMLPLCLMV CQWRCLRCLR
100 Q13145 MDRHSSYIFIWLQLE...VHWGMYSGHGKLEFV 2 154 176 SSKELWFRAA VIAVPIAGGLILVLLIMLALRML RSENKRLQDQ
101 Q13873 MTSSLQRPWRVPWLP...TATTMVSKDIGMNCL 2 152 174 PHSFNRDETI IIALASVSVLAVLIVALCFGYRM LTGDRKQGLH
102 P36894 MPQLYIYIRLLGAYL...KKTLAKMVESQDVKI 2 153 175 IGPFFDGSIR WLVLLISMAVCIIAMIIFSSCFC YKHYCKSISS
103 O00238 MLLRSAGKLNVGTKK...KKTLAKMSESQDIKL 2 126 148 RDFVDGPIHH RALLISVTVCSLLLVLIILFCYF RYKRQETRPR
104 Q9BWV1 MLRGTMTAWRGMRPE...GPLVRVSFETPPLTI 2 856 878 MVARSSDLPY LIVGVVLGSIVLIIVTFIPFCLW RAWSKQKHTT
105 Q13410 MAVFPSSGLPRCLLT...LHSKLIPTQPSQGAP 2 245 267 IPASSLPRLT PWIVAVAVILMVLGLLTIGSIFF TWRLYNERPR
106 Q8WVV5 MEPAAALHFSLPASL...EEGLKLHRVGTHQSL 2 264 286 ALAVILTASP WMVSMTVILAVFIIFMAVSICCI KKLQREKKIL
107 Q96KV6 MEPAAALHFSRPASL...HRELVVPQLPARKKV 2 246 268 ESFMPSRSPC VVILPVIMIILMIPIAICIYWIN NLQKEKKDSH
108 P78410 MKMASSLAFLLLNFH...LTRGEESSSDTNKSA 2 248 270 ADPFFRSAQP WIAALAGTLPILLLLLAGASYFL WRQQKEITAL
109 Q7Z6A9 MKTLPAMLGTGKLFW...VKEAPTEYASICVRS 2 153 175 PSKDEMASRP WLLYRLLPLGGLPLLITTCFCLF CCLRRHQGKQ
110 Q7TST0 MMKGSPSVPPAGCLL...YFTRNSMGLSATAQP 2 249 271 PEPFFPKTCP WKVALVCSVLILLVLLGGISLGI WKEHQVKRRE
111 Q6UXE8 MAFVLILVLSFYELV...EEKGTPIFICPVSWG 2 237 259 FFQPSPWRLA SILLGLLCGALCGVVMGMIIVFF KSKGKIQAEL
112 Q8BJE2 MADFSVFLGFLKQIP...YEPLDPAWAVNEAVS 2 259 281 LPRMSPWKKA FVGTLVVLPLSLIVLTMLALRYF YKLRSFQEKQ
113 A8MVZ5 MAVTCDPEAFLSICF...RQREKNKASLEEERE 2 253 275 FSRSSQFTAW KAALPLILVAMGLVIAGGICIFW KRQREKNKAS
114 Q86VB7 MSKLRMVLLEDSGSA...KEAILSHTEKENGNL 2 1049 1071 KATTGRSSRQ SSFIAVGILGVVLLAIFVALFFL TKKRRQRQRL
115 Q9NR16 MMLPQNSWHIDFGRC...DTSLLGVLPASEATK 2 1360 1382 LKSLNASSGH LALILSSIFGLLLLVLFILFLTW CRVQKQKHLP
116 P55291 MDAAFLLVLGLLAQS...LSPGALLPRHRGRTA 2 604 626 ALLAGGTGLS LGALVIVLASALLLLVLVLLVAL RARFWKQSRG
117 P33146 MGSALLLALGLLAQS...WGPRFARLADMYGHQ 2 603 625 ALRGGGVGVS LGALVIVLASTVVLLVLILLAAL RTRFRGHSRG
118 O75309 MVPAWLWLLCVSVPQ...DPDQPADSVPLKATV 2 786 808 RMKGMPTKLS AVGILVGTLVAIGIFLILIFTHW TMSRKKDPDQ
119 Q9R100 MVSAQLHFLCLLTLY...KVENPQSPENKPLRS 2 784 806 PAGRQDGIPT VGMAVGILLTTFLVIGIILAVVF IRMRKDKVEN
120 Q9H159 MNCYLLLRFMLGIPL...KRLACMFGSAVQSNN 2 596 618 ELVLSMGFKT EVIIAILICIMIIFGFIFLTLGL KQRRKQILFP
121 Q9H251 MGRHVATSCHVAWLL...LRDVIMETPLEITEL 2 3068 3090 LPDDMSALQM AIIVLAILLFLAAMLFVLMNWYY RTVHKRKLKA
122 Q8IXH8 MAMRSGRHPSLLLLL...ATPFEEIYSESGVPS 2 614 636 VELADAEVGL HVGALFPVCAAFVALAVALLFLL RCYFVLEPKR
123 P59862 MDTRGCAWLLLLLSL...STPSEAMCFTSRVPS 2 592 614 ECEEPSDTWL LWWALSPVGAALMVLSAALLCLL RCSCTFGPKR
124 Q8NFZ8 MGRARRFQWPLLLLW...LNGSDGHKRKEEFFI 2 324 346 VVEAQTSVPY AIVGGILALLVFLIICVLVGMVW CSVRQKGSYL
125 O43570 MPRRSLHAAAVLLLV...VIYKPATKMETEAHA 2 305 327 CTAAGLSLGI ILSLALAGILGICIVVVVSIWLF RRKSIKKGDN
126 Q9ULX7 MLFSALLLEVIWILA...RKSVVFTSAQATTEA 2 290 312 QAGSSYTTGE MLSLGVGILVGCLCLLLAVYFIA RKIRKKRLEN
127 Q16790 MAPLCPSPWLPLLIP...GGVSYRPAEVAETGA 2 411 433 AEPVQLNSCL AAGDILALVFGLLFAVTSVAFLV QMRRQHRRGT
128 P27824 MEGKWLLCMLLVLGT...EILNRSPRNRKPRRE 2 483 505 MIEAAEERPW LWVVYILTVALPVFLVILFCCSG KKQTSGMEYK
129 O75976 MASGRDERPPWRLGR...DETDTEEETLYSSKH 2 1300 1322 DNRIFGLPRE LVVTVSGATMSALILTACIIWCI CSIKSNRHKD
130 Q13740 MESKGASSCRLLFCL...EENKKLEENNHKTEA 2 528 550 NREKVNDQAK LIVGIVVGLLLAALVAGVVYWLY MKKSKTASKH
131 P15391 MPPPRLLFFLLFLTP...PAWGGGGRMGTWSTR 2 295 317 LRTGGWKVSA VTLAYLIFCLCSLVGILHLQRAL VLRRKRKRMT
132 P29017 MLFLQFLLLALLLPG...LVLWFKKHCSYQDIL 2 301 323 DIILYWGHHF SMNWIALVVIVPLVILIVLVLWF KKHCSYQDIL
133 P11609 MRYLPWLLLWAFLQV...VYYIWRRRSAYQDIR 2 304 326 ILYWDARQAP VGLIVFIVLIMLVVVGAVVYYIW RRRSAYQDIR
134 P15813 MGCLLFLLLWALLQA...FTSRFKRQTSYQGVL 2 299 321 QDIVLYWGGS YTSMGLIALAVLACLLFLLIVGF TSRFKRQTSY
135 P15812 MLLLFLLFEGLCCPG...NRVLKKWKTRLNQLW 2 303 325 GGHDLIIHWG GYSIFLILICLTVIVTLVILVVV DSRLKKQSSN
136 Q15762 MDYPTLLLALLHVYR...YVNYPTFSRRPKTRV 2 253 275 GKTDNQYTLF VAGGTVLLLLFVISITTIIVIFL NRRRRRERRD
137 P20273 MHLLGPWLLLLVLEY...RPQAQENVDYVILKH 2 684 706 TLTVYYSPET IGRRVAVGLGSCLAILILAICGL KLQRRWKRTQ
138 Q07763 MLGQAVLFTTFLLLR...ARLSRRELENFDVYS 2 226 248 QSVPSNFRFL PFGVIIVILVTLFLGAIICFCVW TKKRKQLQFS
139 Q9HCU0 MLLRLLLAWAAAGPT...PRGSLTGVQTCRTSV 2 686 708 LAEHSQRDDR WLLVALLVPTCVFLVVLLALGIV YCTRCGPHAP
140 Q5ZPR3 MLRRRGSPGMGVHVG...LKHSDSKEDDGQEIA 2 466 488 TGQPMTFPPE ALWVTVGLSVCLIALLVALAFVC WRKIKQSCEE
141 P26842 MARPHPWWLCVLGTL...PIQEDYRKPEPACSP 2 189 211 PPQRSLCSSD FIRILVIFSGMFLVFTLAGALFL HQRRKYRSNK
142 P10747 MLRLLLALNLFPSIQ...YQPYAPPRDFAAYRS 2 154 176 PLFPGPSKPF WVLVVVGGVLACYSLLVTVAFII FWVRSKRSRL
143 P06729 MSFPCKFVASFLLIF...PPHGAAENSLSPSSN 2 212 234 SCPEKGLDIY LIIGICGGGSLLMVFVALLVFYI TKRKKQRSRR
144 Q8IX05 MLRAALPALLLPLLG...VLVVGEENEYPVQFD 2 171 193 RKYLSDNHIL ISALVIASTVILTVLGAIIWFLY KKHSDSRFTT
145 Q9NPF0 MSGGWMAQVGAWRTG...MKESLLLSEQKTSLP 2 231 253 DQSGSPTAYG VIAAAAVLSASLVTATLLLLSWL RAQERLRPLG
146 P28906 MLVRRGARAGPRMPR...NGHSARQHVVADTEL 2 291 313 ASHQSYSQKT LIALVTSGALLAVLGITGYFLMN RRSWSPTGER
147 P04235 MEHSGILASLILIAV...QYSRLGGNWPRNKKS 2 105 127 NCVELDSGTM AGVIFIDLIATLLLALGVYCFAG HETGRPSGAA
148 P07766 MQSGTHWRVLGLCLL...KGQRDLYSGLNQRRI 2 130 152 ENCMEMDVMS VATIVIVDICITGGLLLLVYYWS KNRKAKAKPV
149 P09693 MEQGKGLAVLILAII...DDQYSHLQGNQLRRN 2 115 137 QNCIELNAAT ISGFLFAEIVSIFVLAVGVYFIA GQDGVRQSRA
150 P20963 MKWKALFTAAILQAQ...KDTYDALHMQALPPR 2 31 53 AQSFGLLDPK LCYLLDGILFIYGVILTALFLRV KFSRSADAPA
151 P01730 MNRGVPFRHLLLVLQ...TCQCPHRFQKTCSPI 2 398 420 PTWSTPVQPM ALIVLGGVAGLLLFIGLGIFFCV RCRHRRRQAE
152 P30203 MWLFFGITGLLTAAL...PDSTDNDDYDDISAA 2 402 424 ENKESRELML LIPSIVLGILLLGSLIFIAFILL RIKGKYALPV
153 P11912 MPGGPGVLQALPATI...DVGSLNIGDVQLEKP 2 143 165 FLDMGEGTKN RIITAEGIILLFCAVVPGTLLLF RKRWQNEKLG
154 P40259 MARLALSPVPSHWMV...TGEVKWSVGEHPGQE 2 161 180 LKQRNTLKDG IIMIQTLLIILFIIVPIFLL LDKDDSKAGM
155 P50283 MTQQAVLALLLTLAG...SYSNRKTPCIPNQYQ 2 149 171 SQEPLQTSFS FPAAIAVGFFFTGLLLGVVCSML RKIQIKKLCA
156 P33681 MGHTRRQGTSPSKCP...RRRNERLRRESVRPV 2 247 269 EHFPDNLLPS WAITLISVNGIFVICCLTYCFAP RCRERRRNER
157 Q00609 MACNCQLMQDTPLLK...TFGPEEALAEQTVFL 2 249 271 EDPPDSKNTL VLFGAGFGAVITVVVIVVIIKCF CKHRSCFRRN
158 Q01151 MSRGLQLLLLSCAYS...NKHLGLVTPHKTELV 2 146 168 ETFKKYRAEI VLLLALVIFYLTLIIFTCKFARL QSIFPDFSKA
159 P42081 MDPQCTMGLSNILFV...KSSKTSSCDKSDTCF 2 246 268 LEDPQPPPDH IPWITAVLPTVIICVMVFCLILW KWKKKKRPRN
160 P42082 MDPRCTMGLAILIFV...KELEPQIASAKPNAE 2 246 263 FPSPQTYWKE ITASVTVALLLVMLLIIV CHKKPNQPSR
161 P01732 MALPVTALLLPLALL...VVKSGDKPSLSARYV 2 184 206 TRGLDFACDI YIWAPLAGTCGVLLLSLVITLYC NHRNRRRVCK
162 Q9BYE9 MAQLWLSCFLLPALV...EGPSYTNAGLDTTDL 2 1155 1177 ESDLSKQLIS VIIGLGVALLLVLVIMTMAFVCV RKSYNRKLQA
163 Q6ZTQ4 MQEAIILLALLGAMS...PAFMNRAYPKPHPGK 2 710 732 LRKNVYSPSA WYVPFVITLGSILLLGLLVYLVV LLAKAIHRHC
164 A6H8M9 MVLLRLLVFLFAPVV...RDYLFNTHTGARRWL 2 686 708 TDTEAFWQPQ PWFVVVLTATGALLLLALGWLLG RLLQGLAQLL
165 Q9HBB8 MGSWALLWPPLLFTG...GGGPYDAPGGDDSYI 2 668 690 SEDKRFSVVD MAALGGVLGALLLLALLGLAVLV HKHYGPRLKC
166 Q9D871 MDFSRPSFSPWRWLT...CSRGKTCHKCPWQTN 2 334 356 TVNRELYIPG PLVIFLILLTSLGGAFVCRVLVY SLFQSCSRGK
167 P13688 MGHLSAPLHRVRVPW...SLTATEIIYSEVKKQ 2 433 455 NGLSPGAIAG IVIGVVALVALIAVALACFLHFG KTGRASDQRD
168 Q925P2 MELASAHLHKGQVPW...SPRATETVYSEVKKK 2 422 444 VIFDSTYDIS DVPIAVIITGAVAGVILIAGLAY RLCSRKSRWG
169 Q6ZU64 MFTLTGCRLVEKTQK...AEVLHPVVPLPTDLP 2 187 209 KMKYRPPKTK FFFTVIPQPIFLSPGITLTLPIV FRPLEAKEYM
170 Q9H9P2 MSRVVSLLLGAALLC...LWISKSTRKESGMEV 2 218 240 VTEAGIIPNL IYVVIPTIPLLLLILVAFGTCCF QMLHKSKGRT
171 Q96F05 MWTALVLIWIFSLSL...QVDYLINGMYADSEM 2 400 422 EPLTQAVVDK TLLLVVLLLGVTLFITVLVLFAL QAYESYKKKD
172 Q6NUJ2 MSARAPKELRLALPP...LDTAGEGLLQTVVLS 2 62 84 QQLFQSFSST LVLIVLVTLIFCLIVLSLSTFHI HKRRMKKRKM
173 Q86T13 MRPAFALCLLWQALW...EGALLAESPLGSSDA 2 399 421 PQAFDSSSAV VFIFVSTAVVVLVILTMTVLGLV KLCFHESPSS
174 Q8BG22 MTHRDSTGPVIGLKL...RKKRPSRKENETKFL 2 904 926 VVSRDDLILK GVLTTVGLIAILCLIMVVAHCIF NRKKRPSRKE
175 O14967 MHFQAFWLCLGLLFI...DGPIKSVRKRRVRKD 2 471 493 QLMAAAEGHP WLWLIYLVTAGVPIALITSFCWP RKVKKKHKDT
176 Q8TDQ1 MPLLTLYLLLFWLSG...RGPEEPTEYSTISRP 2 156 178 LDNRHKLLKL SVLLPLIFTILLLLLVAASLLAW RMMKYQQKAA
177 Q08708 MTARAWASWRSSALL...RSSRSRQNWPKGENQ 2 184 206 HPGSLFSNVR FLLLVLLELPLLLSMLGAVLWVN RPQRSSRSRQ
178 A8K4G0 MWLPPALLLLSLSGC...IYMNFSEPLTKDMAT 2 150 172 AVFIGSHKRN HYMLLVFVKVPILLILVTAILWL KGSQRVPEEP
179 Q9H6B4 MSLLLLLLLVSYYVG...TPSMIPSQSRAFQTV 2 234 256 TVQYVQSIGM VAGAVTGIVAGALLIFLLVWLLI RRKDKERYEE
180 Q9BZ76 MASVAWAVLKVLLLL...LRKENESKVSKKEEC 2 1244 1266 EPLVNADRRD SAVIGGVIAVVIFILLCITAIAI RIYQQRKLRK
181 Q6NT55 MLPITDRLLHLLGLE...ENGLWLKVEPLPPRA 2 21 40 LLGLEKTAFR IYAVSTLLLFLLFFLFRLLL RFLRLCRSFY
182 P17927 MGASSPRSPEPVGPP...PRTLQTNEENSRVLP 2 1974 1996 KCTSRTHDAL IVGTLSGTIFFILLIIFLSWIIL KHRKGNNAHE
183 P20023 MGAAGLLGVFLALVA...EAREVYSVDPYNPAS 2 976 998 AVCRSRSLAP VLCGIAAGLILLTFLIVITLYVI SKHRARNYYT
184 Q9NZV1 MYLVAGDRGLAGCGH...QKQNHLQADNFYQTV 2 939 961 LHPSEDSSLD SIASVVVPIIICLSIIIAFLFIN QKKQWIPLLC
185 Q9HC73 MGRLVLLWGAAVFLL...GGFTFVMNDRSYVAL 2 233 252 TPPKPKLSKF ILISSLAILLMVSLLLLSLW KLWRVKKFLI
186 O95727 MWWRVLSLLAWFPLQ...SKLEEKHIQVPESIV 2 288 310 YLGLARKKSG ILLLTLVSFLIFILFIIVQLFIM KLRKAHVIWK
187 Q8VHS2 MKLKRTAYLLFLYLS...EMWIRMPPPALERLI 2 1346 1368 ADDRLLGIFT AVGSGTLALFFILLLAGVASLIA SNKRATQGTY
188 Q5IJ48 MALARPGTPDPQALA...EMDSVLKVPPEERLI 2 1225 1247 PLPLPFPLLE VAVPAACACLLLLLLGLLSGILA ARKRRQSEGT
189 Q9BUF7 MANPGLGLLLALGLP...PPTPNLKLPPEERLI 2 57 79 SSDGNLRPEA ITAIIVVFSLLAALLLAVGLALL VRKLREKRQT
190 Q8NEA5 MDKVQSGFLILFLFL...NKTKNASHNGKMEDL 2 102 124 IRHRPALVKV ILISSVAFSIALICGMAISYMIY RLAQAEERQQ
191 P07333 MGPGVLLLLLVATAW...DIAQPLLQPNNYQFC 2 515 537 AHTHPPDEFL FTPVVVACMSIMALLLLLLLLLL YKYKQKPKYQ
192 P15509 MLLLVTSLLLCELPH...GKGYREEVLTVKEIT 2 324 346 EFGSDDGNLG SVYIYVLLIVGTLVCGIVLGFLF KRFLRIQRLF
193 Q99062 MARLGNCSLTWAALI...LQGIRVHGMEALGSF 2 626 648 LMTLTPEGSE LHIILGLFGLLLLLTCLCGTAWL CCSPNRKNPL
194 Q96PZ7 MTAWRRFQSLLLLLG...AVRFDTTLNTVCTVV 2 3487 3509 SSHYHGTSSG SVAAAILVPFFALILSGFAFYLY KHRTRPKVQY
195 O95196 MGRAGGGGPGRGPPP...DQADLDVNCLQNNLT 2 421 443 RCESIITDFQ VMCVAVGSAALVLLLLFMMTVFF AKKLYLLKTE
196 P16410 MACLGFQRHKAQLNL...PECEKQFQPYFIPIN 2 162 184 IDPEPCPDSD FLLWILAAVSSGLFFYSFLLTAV SLSKMLKKRS
197 Q86XM0 MLMLMLVAAVTMWLR...PPGRHRTPHGGRSDH 2 719 741 IYVYGAFPVQ LVSAGVVILLIISSILGSVWLAY KTPKLLRTAR
198 E9Q9F6 MLVLMLAAAVATMVR...QNRGKVRVAQKHPET 2 755 777 KQLRSEKGQR LLGFCYQILQLCLGVCFCTWLRG KLRQWLRPRR
199 Q5SY80 MSAREVAVLLLWLSC...IYEPLHKPQRKRKKN 2 905 927 TFGLIPSPSV YLVASFLFVLMLLFFTILVLSYF RYMRIYRRYI
200 Q6ZRH7 MCGPAMFPAGPPWPR...DRAEPKEAVERQLMT 2 1073 1095 PKRALFIIMV SASVFVGLVIFYIAFCLLWPLVV KGCTMIRWKI
201 Q86UP6 MELVRRLMPLTLLIL...VNQRADYKYQKLQNY 2 571 593 TPNQPFNSVH LFSFMVLALNVVTVATITVRHFV NQRADYKYQK
202 Q5JRM2 MNLVICVLLLSIWKN...DRGYNQVTSEVTLND 2 48 70 QTKLNYLRRN LLILVGIIIMVFVFICFCYLHYN CLSDDASKAG
203 Q96J86 MDAPRLPVRPGVLLP...AQRSPPPPYPGNARK 2 63 85 YIGNILSGTA IAGIVFGIVFIMGVIAGIAICIC MCMKNHRATR
204 Q61476 MVSSTWGYDPRAGAG...RRSDFQGKERKDVSK 2 367 389 ESNSGGDRYI YGFVAVIAMIDSLIIVKTLWTIL SPNRRSDFQG
205 Q8N8Z6 MVPGARGGGALARAA...DCLTPLNQTAMTALL 2 458 480 EETSTGINIT TVAIPLVLLVVLVFAGMGIFAAF RKKKKKGSPY
206 Q96PD2 MASRAVVRARRCPQC...GAGRDGECDVFKEIL 2 527 549 TTVTPNVTKD VALAAVLVPVLVMVLTTLILILV CAWHWRNRKK
207 Q16832 MILIPRMLLVLFLLL...SFQEIHLLLLQQGDE 2 399 421 PMLKVDDSNT RILIGCLVAIIFILLAIIVIILW RQFWQKMLEK
208 P80370 MTATEALLRVLLLLL...IDMTTFSKEAGDEEI 2 305 327 KTPLLTEGQA ICFTILGVLTSLVVLGTVGIVFL NKCETWVSNL
209 Q6UY11 MPSGCRCLHLVCLLC...LPRDLPPEPGKTTAL 2 307 329 RQEAGLGEPS LVALVVFGALTAALVLATVLLTL RAWRRGVCPP
210 P28068 MITFLPLLLGLSLGC...TPLPGSNYSEGWHIS 2 219 238 PGLSPMQTLK VSVSAVTLGLGLIIFSLGVI SWRRAGHSSY
211 Q96KC8 MTAPCSQPAQLPGRR...KLLVELVQKKKQAKS 2 153 175 YRRVRKMSNA ELALLLFIILTVGHYAVVWSIYL EKQLDELLSR
212 P20036 MRPEDRMFHIRAVIL...KSLRSGHDPRAQGTL 2 223 245 EPIQMPETTE TVLCALGLVLGLVGIIVGTVLII KSLRSGHDPR
213 P01903 MAISGVPVLGFFIIA...KGLRKSNAAERRGPL 2 217 239 APSPLPETTE NVVCALGLTVGLVGIIIGTIFII KGLRKSNAAE
214 P13762 MVCLKLPGGSCMAAL...NQKGHSGLQPTGLLS 2 228 250 SARSESAQSK MLSGVGGFVLGLLFLGTGLFIYF RNQKGHSGLQ
215 Q08554 MALASAAPGSIFCKQ...LEPKFRTLAKTCIKK 2 692 714 DVRPNVILGR WAILAMVLGSVLLLCILFTCFCV TAKRTVKKCF
216 Q19T08 MGTAGAMQLCWVILG...NMNNGKQSLSAEKVL 2 123 145 SPTSETVLTV AAFGVISFIVILVVVVIILVGVV SLRFKCRKSK
217 Q9UNE0 MAHVGDCTQTPWLPV...WAGVVPPASQPHAAS 2 188 210 LSGQGHLATA LIIAMSTIFIMAIAIVLIIMFYI LKTKPSAPAC
218 P00533 MRPSGTAGAALLALL...EYLRVAPQSSEFIGA 2 646 668 CPTNGPKIPS IATGMVGALLLLLVVALGIGLFM RRRHIVRKRT
219 P01133 MLLTLIILLPVVSKF...QQRALDPPHQMELTQ 2 1033 1055 RHAGHGQQQK VIVVAVCVVVLVMLLLLSLWGAH YYRTQKLLSK
220 Q6UXG2 MAEPGHSHHLSARVR...SVPLKTSSGGLDMDL 2 908 930 RVTICKTIDF WLKVGISAGTCTAILLTVLTCYF WKKNQKLEYK
221 P0C7U0 MAGRGWGALWVCVAA...DILDYWKGVSAQHKS 2 418 440 PVPSPSTATH YIMTILGCLFGMVLVLGAVYYCL RRRRRQEEKH
222 Q6PCB8 MRALPGLLEARARTP...ENNVPRHRKNESLGQ 2 264 281 VLSYLVPLKP FLVIVAEVILLVATILLC EKYTQKKKKH
223 Q5UCC4 MAAASAGATRLLLLL...QGGGGGGGGGGGSGR 2 221 243 QKAKNPQEQK SFFAKYWMYIIPVVLFLMMSGAP DTGGQGGGGG
224 Q9NPA0 MAAALWGFFPVLLLL...SGSSKTGKSGAGKRR 2 160 182 IKRESWGWTD FLMNPMVMMMVLPLLIFVLLPKV VNTSDPDMRR
225 Q902F9 MNPSEMQRKAPPRRR...NVGKSKRDQIVTVSV 2 633 655 NLNTVTWVKT IGSTTIINLILILVCLFCLLLVY RCTQQLRRDS
226 Q9Y6X5 MKLLVILLFSGLITG...SRLQLQEDDDDPLIG 2 406 428 LVDQWCINLP EAIAIVIGSLLVLTMLTCLIIIM QNRLSVPRPF
227 Q6UW88 MALGVPISVYLLFNA...LKSPYNVCSGERRPL 2 111 133 TSYAVDSYEK YIAIGIGVGLLLSGFLVIFYCYI RKRCLKLKSP
228 Q60750 MERRWPLGLALLLLL...GHQKRILCSIQGFKD 2 549 571 PVSRSLTGGE IVAVIFGLLLGIALLIGIYVFRS RRGQRQRQQR
229 Q9UF33 MGGCEVREFLLQFGF...LRLHMMHIQEKGFHV 2 549 571 SDMAAEQGQI LVIATAAVGGFTLLVILTLFFLI TGRCQWYIKA
230 P29322 MAPARGRLPPALWVV...MRAQLTSTQGPRRHL 2 541 563 TGKPRPRYDT RTIVWICLTLITGLVVLLLLLIC KKRHCGYSKA
231 P19235 MDHLGASLWPQVGSL...IPAAEPLPPSYVACS 2 251 273 SLLTPSDLDP LILTLSLILVVILVLLTVLALLS HRRALKQKIW
232 Q9NQ60 MNFILFIFIPGVFSL...GSDNEMHENDESVTR 2 184 206 DLEDLKIKIM LGISLMTLLLFVVLLAFCSATLY KLRHLSYKSC
233 P04626 MELAALCRWGLLLAL...PTAENPEYLGLDVPV 2 653 675 PAEQRASPLT SIISAVVGILLVVVLGVVFGILI KRRQQKIRKY
234 P21860 MRANDALQVLGLLFS...YWHSRLFPKANAQRT 2 644 666 LVLIGKTHLT MALTVIAGLVVIFMMLGGTFLYW RGRRIQNKRA
235 O14944 MTAGRRMEMLCAGRV...EYERVTSGDPELPQV 2 118 140 LTVHQPLSKE YVALTVILIILFLITVVGSTYYF CRWYRNRKSK
236 Q925F2 MILQAGTPETSLLRV...VPVMVPAQSQAGSLV 2 252 274 LDVMTGSKAA VVAGAVVGTFVGLVLIAGLVLLY QRRSKTLEEL
237 P58658 MLLPGRARQPPTPQP...SGLDTSLPRNMGQFY 2 322 344 FAYIRAHPER AALLFVSSVCIGLALTLCALVIR ESCAKDFRDL
238 P22794 MPTDMEHTGHYLHLA...KDEEGTEKLTNKQIG 2 136 158 CAENNNNMAM LICLIIIAVLFLICTFLFLSTVV LANKVSSLRR
239 P34910 MDPKYFILILFCGHL...QDLNESLPPPPAELL 2 203 225 QTPQKNNYNS IAAILIGVLLTSMLVAIIIIVLW KCLRKPVLND
240 Q6P995 MARLCRRVPCTLLLG...NIWKKREERPLIPIN 2 353 375 EDSKDITAYH TVFLTAILGGTIVIVIGFFAVLL CYCRDKCGTP
241 Q8TBP5 MKASQCCCCLSHLLA...EDDDNTLFDANHPRR 2 124 146 NPGDKPMTQR ALTVLMVVSGAVLVYFVVRTVRM RRRNRKTRRY
242 Q3ZCQ3 MRAVPLPAPLLPLLL...DDEDEDSTVFDIKYR 2 93 115 ILLRDLPTLK AAVIVAFAFTTLLIACLLLRVFR SGKRLKKTRK
243 Q9BVV8 MGPRVLQPPLLLLLL...LDSDEETVFESRNLR 2 74 93 GASGSALTRS FYVILGFCGLTALYFLIRAF RLKKPQRRRY
244 Q9D3R5 MSLAHTTVLLWAWGS...CQSRCCPNFSAQTLL 2 377 399 PASLSDPETR TAIELTLMGYLLITIFFITIHLC RCCCQSRCCP
245 Q17R55 MPPMLWLLLHFAAPA...HPSPGRRSTQVLVVK 2 329 351 ARKALRGRAD SVLKGLKLVLLVVTVLALLGALL KCIHPSPGRR
246 Q15884 MILLVNLFVLLSVVC...ERPHSLIGVIRETVL 2 86 108 CSAVHLLLKK VLFALCALNALTTTVCLVAAALR YLQIFATRRS
247 Q5JX71 MWTLKSSLVLLLCLT...YHVTICEIWGEESSS 2 51 73 HFRIRQNLPE HTQGWLGSKWLWLLFVVVPFVIL QCQRDSEKNK
248 Q14517 MGRHLALLLLLLLLF...EVTIPPLDSQQHTEV 2 4179 4201 QYVSTPWNIG LAEGIGIVVFVAGIFLLVVVFVL CRKMISRKKK
249 Q9NYQ8 MTIALLGFAIFLLHC...DMVESDYGSCEEVMF 2 4049 4071 IQRGDWGQQE LLIITVAVAFIIISTVGLLFYCR RCKSHKPVAM
250 Q8TDW7 MDIIMGHCVGTRPPA...SLHIPFVETQHQTQV 2 4156 4175 GHSYVGKEEL IGIAVVLFVIFILVVLFIVF RKKVFRKNYS
251 Q8WWV6 MPLFLILCLLQGSSF...PAGASLTAPERNPGP 2 451 470 TFPEDESSSR TLAPVSTMLALFMLMALVLL QRKLWRRRTS
252 P12319 MAPAMESPTLLCVAL...FRLLNPHPKPNPKNN 2 203 225 TVIKAPREKY WLQFFIPLLVVILFAVDTGLFIS TQQQVTFLLK
253 P12318 MTMETQMSQNVCPRN...IYLTLPPNDHVNSNN 2 218 240 PSMGSSSPMG IIVAVVIATAVAAIVAAVVALIY CRKKRISANS
254 P08637 MWQLLLPTALLLLVS...WKDHKFKWRKDPQDK 2 207 229 STISSFFPPG YQVSFCLVMVLLFAVDTGLYFSV KTNIRSSTRD
255 P12314 MWFLTTLLLWVPVDG...QLQEGVHRKEPQGAT 2 289 311 QVLGLQLPTP VWFHVLFYLAVGIMFLVNTVLWV TIRKELKRKK
256 P55899 MGVPRPQPWALGLLL...QDADLKDVNVIPATA 2 298 320 VELESPAKSS VLVVGIVIGVLLLTAAAVGGALL WRRMRSGLPA
257 Q96LA6 MLPRLLLLICAPLCE...LRKANITDVDYEDAM 2 308 330 TGARSNHLTS GVIEGLLSTLGPATVALLFCYGL KRKIGRRSAR
258 Q96LA5 MLLWSLLVIFDAVTE...ENKDSQVIYSSVKKS 2 400 422 DGYRRDLMTA GVLWGLFGVLGFTGVALLLYALF HKISGESSAT
259 Q96P31 MLLWLLLLILTPGRE...ENYENVPRVLLASDH 2 572 594 VTGTSRNRTG LTAAGITGLVLSILVLAAAAALL HYARARRKPG
260 Q68SN8 MSGSFSPCVVFTQMW...ESESPRSRCQMAEKK 2 495 517 FDMTKNRSVP MAAGITVGLLIMAVGVFLFYCWF SRKAGGKPTS
261 Q6DN72 MLLWTAVLLFVPCVG...RTLQEPLSDCEEVLC 2 307 329 SQVLFTPASN WLVPWLPASLLGLMVIAAALLVY VRSWRKAGPL
262 P11362 MWSWKCLLFWAVLVT...PRHPAQLANGGLKRR 2 375 397 RPAVMTSPLY LEIIIYCTGAFLISCMVGSVIVY KMKSGTKKSD
263 Q8N441 MTPSPLLLLLLPPLL...SHVEGKVHQHIHYQC 2 377 399 ASSSSATSLP WPVVIGIPAGAVFILGTLLLWLC QAQKKPCTPA
264 P36888 MPALARDGGQLPLLV...MDLGLLSPQAQVEDS 2 542 564 PGPFPFIQDN ISFYATIGVCLLFIVVLTLLICH KYKKQFRYES
265 F2Z333 MRAPPLLLLLAACAP...LMRPALARPGLRRHP 2 183 205 FTAEPAGMQD IVVAMTAVGGSICVMLVVICLLV AYITENLMRP
266 Q8NAU1 MHPGSPSAWPPRARA...STPEHQGGGLLRSKI 2 150 172 TMKEMGRNQQ LRTGEVLIIVVVLFMWAGVIALF CRQYDIIKDN
267 Q9P2B2 MGRLASRPLLLALLS...ETRRERRRLMSMEMD 2 831 853 VKMDVLNAFK YPLLIGVGLSTVIGLLSCLIGYC SSHWCCKKEV
268 Q5SZK8 MHSAGTPGLSSRRTG...PMVPPQSHHNDSSEV 2 3111 3133 ELNSPSSAVS LVTVVGGTTVGLLTICLTVIAVL MCRGKESFRG
269 P09958 MELRPWLLWVVAATG...GRGERTAFIKDQSAL 2 716 738 AGLLPSHLPE VVAGLSCAFIVLVFVTVFLVLQL RSGFSFRGVK
270 P23188 MELRSWLLWVVAAAG...GRGERTAFIKDQSAL 2 713 735 RLQAGLASHL PEVLAGLSCLIIVLIFGIVFLFL HRCSGFSFRG
271 O95866 MAVFLQLLPLLLSRA...TADPADASTIYAVVV 2 143 165 GPTHGSVYPQ LLIPLLGAGLVLGLGALGLVWWL HRRLPPQPIR
272 D7PDD4 MALVLPLLPLLLSKV...TVVSGDASTVYAVVV 2 141 160 GSTHGYEYPK VLIPLLGVGLVLGLGVAGVV WRRRRLSPPP
273 Q9NU53 MEGAPPGSLALRLLL...GPEKRAENLEDKTCI 2 262 284 LCRFWSNVFP VFFQFLNIMVVGITGAAVVITIL KVFFPVSEYK
274 Q8WWB7 MRGSVECTWGWGHCA...LLLHHKKYSEYQSIN 2 372 394 PVDGLSPLVL GIMAVALGAPGLMLLGGGLVLLL HHKKYSEYQS
275 P02724 MYGKIIFVLLLSEIV...PLSSVEIENPETSDQ 2 92 114 QLAHHFSEPE ITLIIFGVMAGVIGTILLISYGI RRLIKKSPSD
276 Q86XS8 MSCAGRAGPARLAAL...IRATASLNANEVEWF 2 195 217 MPPKNFSRGS LVFVSISFIVLMIISSAWLIFYF IQKIRYTNAR
277 P07359 MPLLLLLLLLPSPLH...DLLSTVSIRYSGHSL 2 533 555 PDFCCLLPLG FYVLGLFWLLFASVVLILLLSWV GHVKPQALDS
278 Q99795 MVGKMWPVLWTLCAV...EQRSTGRESPDHLDQ 2 235 257 TVAVRSPSMN VALYVGIAVGVVAALIIIGIIIY CCCCRGKDDN
279 Q14956 MECLYYFLGFLLLAA...EKDPLLKNQEFKGVS 2 497 519 DRDPASPLRM ANSALISVGCLAIFVTVISLLVY KKHKEYNPIE
280 P40197 MLRGTLLCAVLGLLR...IGQLFRKLIRERALG 2 522 544 TGKGQDHSPF WGFYFLLLAVQAMITVIIVFAMI KIGQLFRKLI
281 P25092 MKTLLLDLALWSLLF...EYLQLNTTDKESTYF 2 432 454 NDITGRGPQI LMIAVFTLTGAVVLLLLVALLML RKYRKDYELR
282 Q02846 MTACARRAGGLPDPG...ERRRKLEKARPGQFS 2 465 487 NICGGGLEPG LVFLGFLLVVGMGLAGAFLAHYV RHRLLHMQMV
283 A0A0U1RPR8 MAGLQQGCHFEGQNW...GLAEPRKSGEAGPGP 2 480 502 CIRGVQPLGS LLTLTIACVLALVGGFLAYFIRL GLQQLRLLRG
284 P51841 MFLGLGRFSRLVLWF...FQRRKAERQLVRNKP 2 468 490 KICHGGIDPA FAMMVCLTLLIALLSINGFAYFI RRRINKIQLI
285 B1B212 MAKSSLSLNWSLLVL...KKKKGALCCSSSSTT 2 211 233 QHNSDTQGLS FTWIVIICIGGIVSFMAFMVFAW CMLKKKKGAL
286 Q8TDQ0 MFSHLPFDCVLLLLL...RQQPSQPLGCRFAMP 2 202 224 LRDSGATIRI GIYIGAGICAGLALALIFGALIF KWYSHSKEKI
287 Q99075 MKLLPSVVLKLFLAA...VENEEKVKLGMTNSH 2 162 184 NRLYTYDHTT ILAVVAVVLSSVCLLVIVGLLMF RYHRRGGYDV
288 Q9QUJ0 MDPPGYLLFLLLLPV...AQEDGRVYINMPGRG 2 35 57 CSGCGTLSLP LLAGLVAADAVMSLLIVGVVFVC MRPHGRPAQE
289 A8MVW5 MGQDAFMEPFGDTLG...EVIQHIPAQQQDHPE 2 350 372 EKLAQKGKSL SPLASITGISLFLIISMCLLFLW KKYQPYKVIK
290 Q14CZ8 MKRERGALSRASRAL...IIREQDEAGPVEISA 2 241 263 VKITVYRRSS LYIILSTGGIFLLVTLVTVCACW KPSKRKQKKL
291 E9Q7X6 MATPRAPRWPPPSLL...NPSFISDESRRRDYF 2 1204 1226 GGLNCGNPYQ LITVVIAAAGGGLLLILGVALIV TCCRKSKNDI
292 Q9BQS7 MESGHLLWALLFMQS...RSILDDSFKLLSFKQ 2 1110 1132 IPIKNVEMLA SVLVAISVTLLLVVLALGGVVWY QHRQRKLRRN
293 Q9UM44 MKAQTALSFFLILIT...APDNGEENVPLSGKV 2 346 365 PSQETASHNK GLWILVPSAILAAFLLIWSV KCCRAQLEAR
294 Q75VT8 MPWTILLFASGSLAI...SSPEPPEFSTFRACQ 2 123 145 QVSFPVPTWI LALSLSLAGAVLFSGLVAITVLV RKAKAKNLQK
295 Q8HWB0 MMLLLPLLAVFLVKR...VMYQPTQVNEGSSPS 2 297 319 APRESGDILR VSTISGTTILIIALAGVGVLIWR RSQELKEVMY
296 Q6MZM0 MPRKQPAGCIFLLTF...DYQQVQSCALPTDAL 2 1115 1137 KNLGPTGAKA ALVILFIIGLLLLITTVILSLRL CSAMKQTDYQ
297 Q08334 MAWSLGSWLGGCLLV...DSCSLGTPPGQGPQS 2 224 246 HDETVPSWMV AVILMASVFMVCLALLGCFALLW CVYKKTKYAF
298 Q64385 MSSSCSGLTRVLVAV...LPGIPNLQRTPENFS 2 369 391 LDHRDPLEQV AVLASLGIFSCLGLAVGALALGL WLRLRRSGKD
299 P42701 MEPLVTWVVPLLFLF...TELSLEDGDRCKAKM 2 546 568 RFSIEVQVSD WLIFFASLGSFLSILLVGVLGYL GLNRAARHLC
300 Q99665 MAHTFRGCSLAFMFI...LTLDQLKMRCDSLML 2 623 645 REFCLQGKAN WMAFVAPSICIAIIMVGIFSTHY FQQKVFVLLA
301 Q14627 MAFVCLAIGCLYTFL...PNTYPKMIPEFFCDT 2 344 363 EDLSKKTLLR FWLPFGFILILVIFVTGLLL RKPNTYPKMI
302 O88786 MAFVHIRCLCFILLC...VDLNKEVCAYEDTLC 2 335 357 WEGYTGPDSK IIFIVPVCLFFIFLLLLLCLIVE KEEPEPTLSL
303 Q60819 MASPQLRGYGVQAIP...MTVRASSKEDEDTGA 2 206 228 ISPHSSKMTK VAISTSVLLVGAGVVMAFLAWYI KSRQPSQPCR
304 Q96F46 MGAARSPPSAVPGPL...SGWDTMGSESEGPSA 2 319 341 TPEPIPDYMP LWVYWFITGISILLVGSVILLIV CMTWRLAGPG
305 Q8NAC3 MPVPWFLLSLALGRS...RGVGPGAGPGAGDGT 2 540 559 PMDKYIHKRW ALVWLACLLFAAALSLILLL KKDHAKGWLR
306 Q8BH06 MGSPRLAALLLSLPL...DYQGSTNSPCGFSCL 2 415 437 VLCPDVSHRH LGLLILALLALTALVGVVLVLLG RRLLPGSGRT
307 O95256 MLCLGWIFLWLVAGE...SRTETTGRSSQPKEW 2 360 382 VQLKEKRGVV LLYILLGTIGTLVAVLAASALLY RHWIEIVLLY
308 Q6PHB0 MHTPGTPAPGHPDPP...TRFMEEWGLHVQMES 2 255 277 VQTSAWKAKV IFWYVFLTSVIVFLFSAIGYLVY RYIHVGKEKH
309 Q6UXL0 MQTFTMVLEEIWTSL...TAVMSPEELLRAWIS 2 233 255 CVEVQGEAIP LVLALFAFVGFMLILVVVPLFVW KMGRLLQYSC
310 Q8N6P7 MRTLLTILTVGSLAA...DSLFRGLALTVQWES 2 227 249 CRVKTLPDRT WTYSFSGAFLFSMGFLVAVLCYL SYRYVTKPPA
311 Q6UWB1 MRGGRGAPFWLWPLP...EELGLLGPPRPQVLA 2 517 539 HLPDNTLRWK VLPGILFLWGLFLLGCGLSLATS GRCYHLRHKV
312 P05362 MAPSSPRPALPALLV...QKGTPMKPNTQATPP 2 481 503 TVNVLSPRYE IVIITVVAAAVIMGTAGLSTYLY NRQRKIKKYR
313 P35330 MSSFACWSLSLLILF...AAWRRLPRAFRARPV 2 224 246 VYEPMQDNQM VIIIVVVSILLFLFVTSVLLCFI FGQHWHRRRT
314 O75144 MRLGSPGLLFLLFSS...GAWAVSPETELTGHV 2 254 276 ENPVSTGEKN AATWSILAVLCLLVVVAVAIGWV CRDRCLQHSY
315 Q9JHJ8 MQLKCPCFVSLGTRQ...YTGPKTVQLELTDHA 2 280 299 PQETHNNELK VLVPVLAVLAAAAFVSFIIY RRTRPHRSYT
316 Q9Y6W8 MKSGLWYFFLFCLRI...AVNTAKKSRLTDVTL 2 142 164 ESQLCCQLKF WLPIGCAAFVVVCILGCILICWL TKKKYSSSVH
317 P98153 MVPKADSGAFLLLFL...PGGGRHSRSSLNTVV 2 346 368 DGNSLFDSMA SGMRLVVSCISSFLILSLLLFMV HRLRQRRRER
318 Q8IVU1 MAVQRAASPRRPPAP...RPAAARVTQPAHSEQ 2 639 661 KEEAANQTST TGIVIGIHIGVTCIIFCVLFLLF GQRGRVLLCK
319 Q9H665 MGPGRCLLTALLLLA...LRVLSKLGSSGVCWA 2 165 187 QQAWPNFLPL VVLVLLLTLAVIAILLFILLWHL CWPKEKADPY
320 A8E0Y8 MACILCVASLFLSLT...KTSLQKEAGEESGHY 2 971 993 VSSLICSSGP LLHFLIVCPFVMLLLLATSFLCL YRKARKLSQL
321 O75054 MKCFFPVLSCLAVLG...CLEPPVLSIHPGAID 2 1125 1147 LQSIICSNDA LFYFVFFYPFPIFGILIITILLV RFKSRNSSKN
322 Q7TSN7 MEGSWRDVLAVLVIL...FDIASPQKVRNVTLV 2 239 261 GEEGPALPTW AIILLAVAFSLLLILIIVLIIIF CCCCASRREK
323 O95976 MGTASRSNIARHLQT...NTYENRRVLSNYERP 2 154 176 IKLLSKELRS FLTALVSLLSVYVTGVCVAFILL SKSKSNPLRN
324 Q61098 MHHEELILTLCILIV...PWREESEARSVLSAP 2 326 348 IPDIPGHVFT GGVTVLVLASVAAVCIVILCVIY KVDLVLFYRR
325 Q9NPH3 MTLLWCVVSLYFYGI...SSDEQGLSYSSLKNV 2 360 382 KVPAPRYTVE LACGFGATVLLVVILIVVYHVYW LEMVLFYRAH
326 Q9HBE5 MPRGWAAPLLLLLLQ...VVIPPPLSSPGPQAS 2 233 255 FQTQSEELKE GWNPHLLLLLLLVIVFIPAFWSL KTHPLWRLWK
327 Q5VWK5 MNQVTIQWDAVIALY...NILESHFNRISLLEK 2 354 376 GHLTSDNRGD IGLLLGMIVFAVMLSILSLIGIF NRSFRTGIKR
328 P14784 MAAPALSWRLPLLIL...LSLQELQGQDPTHLV 2 243 265 PAALGKDTIP WLGHLLVGLSGAFGFIILVYLLI NCRNTGPWLK
329 P31785 MLKPSLPFTSLLFLQ...SPYWAPPCYTLKPET 2 262 284 KENPFLFALE AVVISVGSMGLIISLLCVYFWLE RTMPRIPTLK
330 Q8NI17 MMWTWALWMLPSLCK...FLVSEKLPEHTKGEV 2 521 543 KTLSFSVFEI ILITSLIGGGLLILIILTVAYGL KKPNKLTHLC
331 P26952 MAANLWLILGLLASH...EPALEDCEVTPVTDA 2 333 355 CPPEVMPVKT ALVTSVATVLGAGLVAAGLLLWW RKSLLYRLCP
332 P32927 MVLAQGLLSMALLAL...LSLPPWEVNKPGEVC 2 443 465 SWDTESVLPM WVLALIVIFLTIAVLLALRFCGI YGYRLRRKWE
333 Q01344 MIIVAHVLLILLGAT...YIEKPGVETLEDSVF 2 342 361 GNDEHKPLRE WFVIVIMATICFILLILSLI CKICHLWIKL
334 P40189 MLTLQTWLVQALFIF...SYLPQTVRQGGYMPQ 2 620 642 TPKFAQGEIE AIVVPVCLAFLLTTLLGVLFCFN KRDLIKKHIW
335 P16871 MTILGTTFGMVFSLL...QEEAYVTMSSFYQNQ 2 241 263 INNSSGEMDP ILLTISILSFFSVALLVILACVL WKKRIKPIVW
336 Q01114 MALGRCIAEGWTLER...LTLAQPVALPVSSRA 2 269 291 RQGLLVPRWQ WSASILVVVPIFLLLTGFVHLLF KLSPRLKRIF
337 Q86SU0 MAWPKLPAPWLLLCT...RSEKDSSHSGRSVVI 2 163 185 TSGDPDKEVK LIVLHWLTVIFIILGALLLLLLI GVCWCQCCPQ
338 Q01638 MGFWILAILTILMYS...PRKASSLTPLAAQKQ 2 328 350 LSRKNPIDHH SIYCIIAVCSVFLMLINVLVIIL KMFWIEATLL
339 Q9BZV3 MIMFPLFGKISLGIL...PEFAAFVREQQVEEV 2 1102 1124 CEEFVSEPVI IGITIASVVGLLVIFSAIIYFFI RTLQAHHDRS
340 P17181 MMVVLLGATTLVLVA...ESESKTSEELQQDFV 2 437 459 EKTKPGNTSK IWLIVGICIALFALPFVIYAAKV FLRCINYVFF
341 P33896 MLAVVGAAALVLVAG...YLQSPALRTEPALLC 2 427 449 KLCEKTRPGS FSTIWIITGLGVVFFSVMVLYAL RSVWKYLCHV
342 P15260 MALLFLLPLVMQGVS...SLIGYRPTEDSKEFS 2 248 270 IFNSSIKGSL WIPVVAALLLFLVLSLVFICFYI KKINPLKEKS
343 P38484 MRPTLLWSLLLLLGV...ISFPEKEQEDVLQTL 2 248 270 MADASTELQQ VILISVGTFSLLSVLAGACFFLV LKYRGLIKYW
344 Q8IU57 MAGPERWGPLLLCLL...RTEDRGRTLGHYMAR 2 227 249 CFLLEVPEAN WAFLVLPSLLILLLVIAAGGVIW KTLMGNPWFQ
345 Q9WTL4 MAVPALWPWGVHLLM...NGASDYSAPNGGPGH 2 922 944 LEEEDTGGMR IFLTVTPVGFMLLVTLAALGFFY SRKRNSTLYT
346 Q3MIP1 MSVHYTLNLRVFWPL...PPRSQRTQGFLEGEP 2 46 64 ARAEPADGVD GGFPLLKVAVLLLLSYVLL RCRHAVRQRF
347 Q9NZN1 MKAPIPHLILLYATF...LPLLPRETSISSVIW 2 356 378 SVLLHKRELM YTVELAGGLGAILLLLVCLVTIY KCYKIEIMLF
348 P26006 MGPGPSRAPRAPRLM...MKSQPSETERLTDDY 2 992 1014 LVEELPAEIE LWLVLVAVGAGLLLLGLIILLLW KCGFFKRART
349 P13612 MAWEARREPGPRRAA...RRDSWSYINSKSNDD 2 978 1000 HHQRPKRYFT IVIISSSLLLGLIVLLLISYVMW KAGFFKRQYK
350 P20701 MKDSCITVMAMALLS...KPLHEKDSESGGGKD 2 1090 1112 KVDVVYEKQM LYLYVLSGIGGLLLLLLIFIVLY KVGFFKRNLK
351 Q3UV74 MLGQCTLLPVLAGLL...HHVEPVWNQERQGTQ 2 672 694 LVCAEISNTT ILLGVIVGVLLAVIFLLVYCMVY LKGTQKAAKL
352 A2A863 MAGPCCSPWVKLLLL...SGSLSTHMDQQFFQT 2 711 733 LVHKKKDCPP GSFWWLIPLLIFLLLLLALLLLL CWKYCACCKA
353 P26010 MVALPMVLVLLLVLS...TTINPRFQEADSPTL 2 724 746 VRPQEKGADH TQAIVLGCVGGIVAVGLGLVLAY RLSVEIYDRR
354 P26012 MCGSALAFFTAAFVC...DISKLNAHETFRCNF 2 682 704 QTSECFSSPS YLRIFFIIFIVTFLIGLLKVLII RQVILQWNSN
355 Q8IYV9 MGPHFTLLCAALAGC...QTQVPKEKATDSRQQ 2 291 313 QPLQPEKMLA SRLLGLLICGSLALITGLTFAIF RRRKVIDFIK
356 Q9D9J7 MGPHFTLLLAALANC...FNSDYSGDKSEATEN 2 320 342 QNPEKKMKTR LLILLTLGFVVLVASIIISVLHF RKVSAKLKNA
357 Q6UXV1 MPLALTLLLLSGLGA...VSACTYRQNRKLLLQ 2 187 209 QMDSKYPRNQ ALLGILISVSLAVFVFVVIVVSA CTYRQNRKLL
358 Q5VZ72 MGDLWLFLLLPLSAF...GKIDEKEEKDFRLRK 2 178 200 RKAENREIAL FLILLATAVILGSAVLLFHFCIF HRRKMKAIRR
359 P57087 MARRSRHRLLLLLLR...TMSENDFKHTKSFII 2 239 261 RMQVDDLNIS GIIAAVVVVALVISVCGLGVCYA QRKGYFSKET
360 Q80UL9 MLCLLKLIVIPVILA...SPKASSLVRSSVRSK 2 280 302 KGQQGILNGN QLVIIVGIVCATFLLLPVLILIV KKAKWNKSSV
361 O76095 MLAGAGRPGLPQGRH...DRKALEKVRKQIESI 2 109 126 ALMEQRLFWK FEGAVVCVALIFACLVII RQRQLDRKAL
362 Q5VV43 MAPPTGVLSSLLLLV...SIRNGASFSYCSKDR 2 956 978 WDGESNCEWS IFYVTVLAFTLIVLTGGFTWLCI CCCKRQKRTK
363 Q8IYS2 MWLQQRLKGLPGLLS...PGAKPLFRSKEDPSV 2 592 614 HMAQQDPGLP FLFWFSVASLITLFHLFLFKLIY NEYCGPGAKP
364 Q9Y6H6 METTNGTETWYESLH...SDPYHVYIKNRVSMI 2 57 79 RASLPGRDDN SYMYILFVMFLFAVTVGSLILGY TRSRKVDKRS
365 Q9QZ26 MNCSESQRLQTLLNR...AAGALPALAQGAERV 2 59 81 REATSAKGND AYLYILLIMIFYACLAGGLILAY TRSRKLVEAK
366 Q99706 MSMSPTVIILACLGF...SQTQLASSNVPAAGI 2 243 265 FKTGIARHLH AVIRYSVAIILFTILPFFLLHRW CSKKKDAAVM
367 P83555 MLLWFLSLVCSGFFL...SEFSADTIVYMEIMK 2 337 359 DTKTNNYKNL HILTGLLVTMVLVVIIIFYSCYF SKQNKSQKQA
368 Q96J84 MLSLLVWILTLSDTF...SDYGQRFQQRMQTHV 2 497 519 LEEREVLPVG IIAGATIGASILLIFFFIALVFF LYRRRKGSRK
369 Q6UWL6 MLRMRVPALLVLLFC...AAFPTPSHPRLQTHV 2 511 533 GRRDLLPTVR IVAGVAAATTTLLMVITGVALCC WRHSKASASF
370 Q8IZU9 MKPFQLDLLFVCFFL...SDPSRPLQRRMQTHV 2 536 558 GLEAESVPMA VIIGVAVGAGVAFLVLMATIVAF CCARSQRNLK
371 P10721 MRGARGAWDFLCVLL...STASSSQPLLVHDDV 2 521 543 NNKEQIHPHT LFTPLLIGFVIVAGMMCIIVMIL TYKYLQKPMY
372 P05532 MRGARGAWDLLCVLL...SSASSTQPLLVHEDA 2 524 546 NNKEQIQAHT LFTPLLIGFVVAAGAMGIIVMVL TYKYLQKPMY
373 Q96MU8 MAPPAARLALLSAAA...KGQSQQDDRNPLVSD 2 391 413 MGAGSHRVEG WTVYGLATLLILTVTAIVAKILL HVTFKSHRVP
374 A6NMS7 MSSAQCPALVCVMSR...DSEAPTEEEESEALP 2 1582 1604 EVPGYGYTDK LILALIVTGILTILIILFCLIVI CCHRRSLQED
375 Q8BG84 MSLHPVILLVLVLCL...TDMAESSTYAAIIRH 2 143 165 SDTSWLKTYS IYIFTVVSVIFLLCLSALLFCFL RHRQKKQGLP
376 P13473 MVCFRLFPVPGSGLV...YFIGLKHHHAGYEQF 2 378 400 CSADDDNFLV PIAVGAALAGVLILVLLAYFIGL KHHHAGYEQF
377 Q9UQV4 MPRQLSAAAALFASL...YKIRLRCQSSGYQRI 2 380 402 FGNVDECSSD YTIVLPVIGAIVVGLCLMGMGVY KIRLRCQSSG
378 Q9UJQ1 MDLQGRGVPSIDRLR...QVQIPRDRSQYKHMG 2 236 258 PVDEREQLEE TLPLILGLILGLVIMVTLAIYHV HHKMTANQVQ
379 Q6UX15 MRPGTALQAVLLAVL...RSKESGWVENEIYGY 2 236 258 SREAALNLAY ILIPSIPLLLLLVVTTVVCWVWI CRKRKREQPD
380 Q86UK5 MDPSGSRGRPTWVLA...NFLNAKKAMRALGMD 2 299 321 VTVLPHHGLH AAGFFIAFLLSLVLTWAALFLMV RYQCLKGNML
381 P48357 MICQKFCVVLLHWEF...QTHKIMENKMCDLTV 2 840 862 QDDIEKHQSD AGLYVIVPVIISSSILLLGTLLI SHQRMKKLFW
382 P19256 MVAGSDAGRALGVLS...GILKCDRKPDRTNSN 2 216 238 IPSSGHSRHR YALIPIPLAVITTCIVLYMNGIL KCDRKPDRTN
383 P42702 MMDIYVCLKRPSWMV...GGWSFTNFFQNKPND 2 835 857 MYVVTKENSV GLIIAILIPVAVAVIVGVVTSIL CYRKREWIKE
384 Q96FE5 MQVSKRMLAGGVRSM...ISSADAPRKFNMKMI 2 560 582 FPFDIKTLII ATTMGFISFLGVVLFCLVLLFLW SRGKGNTKHN
385 Q6UY18 MDAATAPKQAWPPWP...GDKNSGGNRVTAKLF 2 535 557 FFLDSRGVAM VLAVGFLPFLTSVTLCFGLIALW SKGKGRVKHH
386 O75022 MTPALTALLCLGLSL...EPPAEPSIYATLAIH 2 442 464 PPSTPGLGRY LEVLIGVSVAFVLLLFLLLFLLL RRQRHSKHRT
387 Q8VCD3 MLEIRGLSPSLCLLS...IGVLRRQPISPSMQA 2 439 461 SGWLLGSSTC LHTSIFLFFLLLQTVGFFCYVNF SRQELDKRLQ
388 Q9H0V9 MAATLGPLGSWQQWR...ILYNKWQEQSRKRFY 2 314 336 APLPPLSGLA LFLIVFFSLVFSVFAIVIGIILY NKWQEQSRKR
389 Q12907 MAAEGWIWRWGWGRR...AVVFQKRQERNKRFY 2 323 345 FRSGPLTGWR VFLLLLCALLGIVVCAVVGAVVF QKRQERNKRF
390 A0A1B0GTW7 MLLLLLLLLLLPPLV...ELHSTRVPVRGIREV 2 734 756 TSDHNPSMTH LRLSMGLCLMLLILVGVMGTTAY QKRATLPVRP
391 Q86YD5 MWLLGPLCLLLSSAA...AEPRDSEPSQGTEEV 2 172 194 SENQLVYYPS ITYAIIGSSVIFVLVVALLALVL HHQRKRNNLM
392 Q8TF66 MPLKHYLLLLVGCQA...RSQAVLMQMKAPNEC 2 539 561 VWGMTQAQSG LAIAAIVIGIVALACSLAACVGC CCCKKRSQAV
393 Q9H756 MKVTGITILFWPLSM...IEDKYIDIHELCEEN 2 269 291 RNSEHEPLGK SWAFLVGVVVTVLTTSLLIFIAI KCPIWYNILL
394 Q8N386 MGGTLAWTLLLPLLL...GQAPMDEEEYVIPGH 2 166 188 SCAPGLASAT IGAVVVSGCLLLGLAIAGPVLAW RLWRCRVARS
395 Q2I0M4 MRGPSWSRPRPLLLL...ASPADPGSPAAAAQA 2 265 287 QPLALRDLAV VYTLGPASFLVSLASCLALGSGL TACRARRRRL
396 Q14392 MRPQILLLLALLTLG...CCCVRRQKFNQQYKA 2 629 651 EKGGLKNINL IIILTFILVSAILLTTLAACCCV RRQKFNQQYK
397 Q86YC3 MELLPLWLCLGFHFL...LLQVIKSRCHWSSVY 2 652 674 KWERLDLGLL YLVLILPSCLTLLVACTVIVLTF KKPLLQVIKS
398 Q5VT99 MRPRAPACAAAALGL...APNKDAEDEDEDKDD 2 251 273 FSLSLTDLCI IIFSGVAVSIAAIISSFFLATVV QCLQRCAPNK
399 Q9BTN0 MAILPLLLCLLPLAP...LDCEPWGPGHEPVGP 2 537 559 CGAPHAPFLG GTMIIALGGVIVASVLVFIFVLL MRYKVHGGQP
400 Q96JA1 MARPVRGGLGAPRRS...LPGKQRVPLLLAPKS 2 793 815 AAGCRKDGTT VGIFTIAVVSSIVLTSLVWVCII YQTRKKSEEY
401 Q9P2V4 MRVALGMLWLLALAW...AFGVKGGRRINEYFC 2 530 552 DAENTQQLIN VVVISVAIVIALPLTLLVCCSAL QKRCRKCFNK
402 A6NDA9 MASVFHYFLLVLVFL...DTEGDKEKGGTEDNS 2 463 485 DAGGLEAREH LLHVTVVLCVVLLAVPVGAYAWA AQGPCSCSKW
403 Q3SXY7 MHLFACLCIVLSFLE...QVTFKSEGSRPEYYC 2 581 603 TERVEGDDSQ WSLLLVVTSTACVVILPLICFLL YKVCKLQCKS
404 Q8ND94 MLGSPCLLWLLAVTF...WGCPRRAAARAAGAL 2 198 220 VPPNPRTLVH AAVGVGTALALLSCAALVWHFCL RDRWGCPRRA
405 Q86VZ4 MASVAQESAGSQRRL...ITSEESDYLINGMYL 2 451 473 GGEHPAPETG AVLPLALGLAITALLLLMVACRL RLVKQKLKKA
406 Q9Y561 MACRWSTKESPRWRS...TLKNETSDDEALLLC 2 13 32 CRWSTKESPR WRSALLLLFLAGVYGNGALA EHSENVHISG
407 O75096 MRRQWGALLLGALLC...TGWKHERKLSSESQV 2 1724 1746 VPAAPGEGLH ISYAIGGLLSILLILVVIAALML YRHKKSKFTD
408 Q8WUT4 MRQTLPLLLLTVLRP...NPAFDDYPLGLQTVS 2 683 705 AFTTKPSFAL LLSGLCAASGLLLASTVVLSACL CRRGQTLGLQ
409 Q86UE6 MDFLLLGLCLYWLLR...GSCTCHQQPARECEV 2 428 450 HAENAVQIHK VVTGTMALIFSFLIVVLVLYVSW KCFPASLRQL
410 Q9HBL6 MKGELLLFSSVIVLL...PGKVEEKERFDSSPA 2 286 308 KPRPANLRHA IATVIITGVVCGIVCLMMLAAAI YGCTYAAITA
411 Q5SQ64 MAVLFLLLFLCGTPQ...NIHLARLGPPAHKPR 2 235 257 CAPSTGWDMP WILMLLLTMGQGVVILALSIVLW RQRVRGAPGR
412 O60449 MRTGWATPRRPAGLL...QGVNEDEIMLPSFHD 2 1669 1691 CKVPLGPDYT AIAIIVATLSILVLMGGLIWFLF QRHRLHLAGF
413 Q9HBG7 MVAPKSHTDDWAPGP...NDLEIPESPTYENFT 2 455 476 ICSGPERNTK LWIGLFLMVCLLCVGIFSWCIW KRKGRCSVPA
414 P14151 MIFPWKCQSTQRDLW...LKKGKKSKRSMNDPY 2 333 355 FSMIKEGDYN PLFIPVAVMVTAFSGLAFIIWLA RRLKKGKKSK
415 P16581 MIASQFLSALTLVLL...SLESDGSYQKPSYIL 2 556 578 TCEAPTESNI PLVAGLSAAGLSLLTLAPFLLWL RKCLRKAKKF
416 P16109 MANCQIAILYQRFQR...GTYGVFTNAAFDPSP 2 773 795 AGPLTIQEAL TYFGGAVASTIGLIMGGTLLALL RKRFRQKDDG
417 Q9Y5Y7 MARCFSLVLLLTSIW...KSPSKTTVRCLEAEV 2 236 258 FKNEAAGFGG VPTALLVLALLFFGAAAGLGFCY VKRYVKAFPF
418 P20916 MIFLTALPLFWIMIS...TLTEELAEYAEIRVK 2 511 533 LPFQGAHRLM WAKIGPVGAVVAFAILIAIVCYI TQTRRKKNVT
419 Q5VYJ5 MLFFLDRMLAFPMNE...GTTSGSLETLSHHLK 2 2075 2097 TDFTYAQNNT WTLLGIGLAFLMTHITVAVLCFL ANRKVPIRKT
420 Q9H8J5 MFFGGEGSLTYTLVI...YSRLDYLINGIYVDI 2 386 408 QYGLPFEKWL LIGSLLFGVLFLVIGLVLLGRIL SESLRRKRYS
421 A6NHS7 MHVAEVAVNVILLLS...SLQIKNRNHMKENSS 2 286 308 DEVSVTSKTW LVSVALCTSVIFLGCCIVILASG CCGKQQGQYK
422 Q3UU94 MRAVELLLLLGLASM...RSASGCRRNTLKENS 2 284 306 EPWDGAPASA GVWLACVTLGAAVISLCCRVVLG TSRCCGKRQG
423 Q14703 MKLVNIWLLLLVVLL...PQLMQQVHPPKTPSV 2 999 1021 IMPGRYNQEV GQTIPVFAFLGAMVVLAFFVVQI NKAKSRPKRR
424 P15529 MEPPGRRECPFPSWR...YLTDETHREVKFTSL 2 344 366 PEEGILDSLD VWVIAVIVIAIVVGVAVICVVPY RYLQRRKKKG
425 Q96KG7 MVISLNSCLSFICLL...EDSGGSSSNSSSSSE 2 856 878 STALPADSYQ IGAIAGIIILVLVVLFLLALFII YRHKQKGKES
426 A6BM72 MVLSLTGLIAFSFLQ...VRQSPANGPSQDKQS 2 849 871 SPALGAERHS VGAVTGIMLLLFLIVVLLGLFAW HRRRQKEKGR
427 Q7Z7M0 MALGKVLAMALVLAL...RKGLLSQDNLTSMSL 2 2648 2670 FFRQDQAHID LFVFFSVFFSCFFLFLSLCVLLW KAKQALDQRQ
428 Q9H1U4 MNGGAERAMRSLPSL...NGQLTLTTPIHNYKA 2 515 537 LADVSWTQFN IIILTVIIIVVVLLMGFVGAVYM YREYQNRKLN
429 Q16820 MDLWNLSWFLFLDAL...SSNRPNLTPQNQHAF 2 654 676 EKRGSTRDTI VIAVSSTVAVFALMLIITLVSVY CTRKKYRERM
430 Q9H9K5 MGSLSNYALLQLTLT...AMKGLTTHQYDTSLL 2 490 512 FAKVGDWFRS WGYVLLIVLFCLFIFVLIYVRVF RKSRRSLNSQ
431 O75121 MDRLKSHLTVCFLPS...VTHDKNTCIIYESHV 2 150 172 LRVIFTSGDM GVYYMVVCLVAFTIVMVLNITRL CMMSSHLKKT
432 Q29983 MGLGPVFLLLAGIFP...PLMSDLGSTGSTEGA 2 306 328 PSGKVLVLQS HWQTFHVSAVAAAAIFVIIIFYV RCCKKKTSAA
433 P51512 MILLTFSTGRRLDFV...PRHILYCKRSMQEWV 2 565 587 LDNTASTVKA IAIVIPCILALCLLVLVYTVFQF KRKGTPRHIL
434 Q8TD46 MLCPWRTANLGLLLI...SEALQSEVDTDLHTL 2 244 266 VPGAKKSAKL YIPYIILTIIILTIVGFIWLLKV NGCRKYKLNK
435 Q6Q8B3 MSAPRLLISIIIMVS...GFVFFQRINHVRKVL 2 239 261 RTSGSPALSL LIILYVKLSLFVVILVTTGFVFF QRINHVRKVL
436 Q2M385 MNNFRATILFWAAAA...ATGDTTYQEQGQSPA 2 656 678 HGDGGGLSGG AAAGVTVGVTTILAVVITLAIYG TRKFKKKAYQ
437 P20645 MFPFYSCWRTGLLLL...LGEESEERDDHLLPM 2 188 210 ACSPEISHLS VGSILLVTFASLVAVYVVGGFLY QRLVVGAKGM
438 P11717 MGAAAGRSPHLGPAP...LVSFHDDSDEDLLHI 2 2305 2327 MHKGLSERSQ AVGAVLSLLLVALTCCLLALLLY KKERRETVIS
439 Q3TEW6 MAEAVGAVALIAAPA...INKSESVVYADIRKD 2 162 191 LHVVEIDNLL VFLVWVVVGTVTAVVLGLTLLISLVLVVLY RRKHSKRDYT
440 O60487 MYGKSSTRAVLLLLG...LNQEKKVSVYLEDTD 2 153 175 VHTVRFSEIH FLALAIGSACALMIIIVIVVVLF QHYRKKRWAE
441 Q6UWV2 MQQRGAAGSRGCALF...VRCAECLDSDYEETY 2 159 181 ERGFGTMLSS VALLSILVFVPSAVVVALLLVRM GRKAAGLKKR
442 Q61830 MRLLLLLAFISVIPV...KDLMGNIEQNEHAII 2 1388 1410 MDPQPKGSSK AAGVVTVVLLIVIGAGVAAYFFY KKRHALHIPQ
443 Q13505 MLLGGPPRSPRSGTS...PGTRTLGMAEEDEEE 2 421 443 EEEPYRRRNQ ILSVLAGLAAMVGYALLSGIVSI QRATPARAPG
444 Q9UKN1 MLVIWILTLALRLCA...TELHIQRPEMVASTV 2 5381 5403 EFNIAKSLVY GIVGAVMAVLLLALIILIILFSL SQRKRHREQY
445 Q9H3R2 MKAIIHLTLLALLSV...QNPYSRHSSMPRPDY 2 421 443 SGLDCKDKFQ LILTIVGTIAGIVILSMIIALIV TARSNNKTKH
446 Q8C6Z1 MLTLAKIALISSLFI...DGIPMDAIPPLRPSI 2 235 257 DTPKENKNTG IVFGAILGAILGASLLSLVGYLL CGQRKTDSFS
447 Q8WXI7 MLKPSGLPGSSSPTR...CPGYYQSHLDLEDLQ 2 14453 14475 PLTGNSDLPF WAVILIGLAGLLGVITCLICGVL VTTRRRKKEG
448 Q5SSG8 MKMQKGNVLLMFGLL...VSSIAMEMSGRNSGP 2 480 502 KPGGSLVPWE IFLITLVSVVAAVGLFAGLFFCV RNSLSLRNTF
449 Q04900 MSRLSRSLLWAATCL...LYKFCKSKERNYHTL 2 164 186 QPVRKSTFDA ASFIGGIVLVLGVQAVIFFLYKF CKSKERNYHT
450 Q9ULC0 MELLQVTILFLLPSI...SHESGEHSAQGKTKN 2 191 213 TSATSRSYSS IILPVVIALIVITLSVFVLVGLY RMCWKADPGT
451 Q3MIW9 MAQPVHSLCSAFGLQ...MEQQNLGMGQIPSPR 2 447 469 QMGENDSFPA WAIVIVVLVAVILLLVFLGLIFL VSYMMRTRRT
452 Q9BRK3 MALPSRILLWKLVLL...KYIDLDKGFRKENCK 2 341 363 VPESRAHFFQ QLGYVLATLLLFILLLVTVLLAA RRRRGGYEYS
453 P25189 MAPGAPSSSPSPILA...EKKAKGLGESRKDKK 2 157 179 FEKVPTRYGV VLGAVIGGVLGVVLLLLLLFYVV RYCWLRRQAA
454 Q9UK23 MATSTGRWLLLRLAL...AEKEQPGGAHNPFKD 2 450 472 GELSFFTRTA WLALTLALAFLLLISTAANLSLL LSRAERNRRL
455 P13591 MLQTKDLIWTLFFLG...VPNDATQTKENESKA 2 724 746 SPTSGLSTGA IVGILIVIFVLLLVVVDITCYFL NKCGLFMCIA
456 O35136 MSLLLSFYLLGLLVR...VSNDIIQSKEDDIKA 2 696 718 PKPNIIKDTL FNGLGLGAIIGLGVAALLLILVV TDVSCFFIRQ
457 Q5T1S8 MTTATPLGDTTFFSL...ATVTFSPVDVQVETR 2 29 51 TRGEDFLYKS SGAIVAAVVVVVIIIFTVVLILL KMYNRKMRTR
458 O76036 MSSTLPALLCVGLCL...ASTWEGRRRLNTQTL 2 256 274 HALWDHTAQN LLRMGLAFLVLVALVWFLV EDWLSRKRTR
459 O95944 MAWRALHPLLLLLLL...VARTKISDDDDEHTL 2 193 215 LRPGPAAPIA LVPVFCGLLVAKSLVLSALLVWW GDIWWKTMME
460 Q96NY8 MPLSLGAEMWGPEAW...PTGNGIYINGRGHLV 2 350 372 GKQVDLVSAS VVVVGVIAALLFCLLVVVVVLMS RYHRRKAQQM
461 Q8TDF5 MIHGRSVLHIVASLI...GSLSKHESEYNTTRV 2 343 365 SLLDQLTNTS GTVIGVTSCIVIILIIISVIVQI KQPRKKYVQR
462 Q8NET5 MENQPVRWRALPGLP...RFEDDGELNLVYENL 2 164 186 YREPPQSPQK LLLFGFTGLLSVLSVVGTALLLW NKKRMRGPGK
463 Q92542 MATAGGGSGADPGSR...DVLFIAPREPGAVSY 2 670 692 IFLIASKELE LITLTVGFGILIFSLIVTYCINA KADVLFIAPR
464 O60500 MALGTTLRASLLLLG...LEPDSLPFELRGHLV 2 1064 1086 PSGPSGLPLL PVLFALGGLLLLSNASCVGGVLW QRRLRRLAEG
465 Q68D85 MTWRAAASTCAALLI...PVLSSQPPTLLLPLQ 2 262 284 LSETEKTDNF SIHWWPISFIGVGLVLLIVLIPW KKICNKSSSA
466 O35181 MSEGAAGASPPGAAS...FVLRNEIQRDSVLTK 2 363 385 MESEDVYQRQ VLSISCIIFGIVIVGMFCAAFYF KSKKQAKQIQ
467 Q8WWG1 MPTDHEEPCGPSHKS...VETSSTSAHHSHEQH 2 61 83 PGSSIQTKSN LFEAFVALAVLVTLIIGAFYFLC RKGHFQRASS
468 O14786 MERGLPLLCAVLALV...LKKDKLNTQSTYSEA 2 857 879 PGNVLKTLDP ILITIIAMSALGVLLGAVCGVVL YCACWHNGMS
469 Q96PE5 MSFSLNFTLPANTTS...RRRGLWWLVPRLSLE 2 31 53 GKETDCGPSL GLAAGIPLLVATALLVALLFTLI HRRRSSIEAM
470 Q99650 MALFAVFQTTFFLTL...SLSSITLLDPGEHYC 2 739 761 TKVTTPDEHS SMLIHILLPMVFCVLLIMVMCYL KSQWIKETCY
471 Q86WC4 MEPGPTAAQRRCSLP...LKSSTSFANIQENSN 2 283 305 FNCSVPCSDT VPVIAVSVFILFLPVVFYLSSFL HSEQKKRKLI
472 Q96FE7 MLLAWVQAFLVSNML...EGTTPLMGQAGTPGA 2 169 191 NSKEKKDLGT LGYVLGITMMVIIIAIGAGIILG YSYKRGKDLK
473 Q8NBR0 MAPPPPSPQLLLLAA...LPPTPDSGPEGESSE 2 310 329 ARGPTPRTEE AAWAAMALTFLLVLLTLATL CTRLHRNFRR
474 Q6UWI2 MVYKTLFALCILTAG...YGSWGNYNNPLYDDS 2 258 280 QEVEHALSSG SIAAITVTVIAVVLLVFGVAAYL KIRHSSYGRL
475 Q923D3 MVCKVLIALCIFTAG...YGSWGNYNNPLYDDS 2 244 266 QEVENALSSG SIAAITVTVIAVVLLVFGGAAYL KIRHSSYGRL
476 Q9P2E7 MIVLLLFALLWMVEG...RAPYKPPYLTRKRIC 2 716 738 GGGETSLDLT LILIIALGSVSFIFLLAMIVLAV RCQKEKKLNI
477 Q96QU1 MFRQFYLWTCLASGI...VEGTEKQSHSQSTSL 2 1375 1397 KRGESLGYTE GALLALAFIIILCCIPAILVVLV SYRQFKVRQA
478 Q9HCL0 MHQMNAKMHFRFVFA...LVAEINKLLQDVRQS 2 698 720 MTSVSQASLD VSMIIIISLGAICAVLLVIMVLF ATRCNREKKD
479 Q8TAB3 MESLLLPVLLLLAIL...NKESPGVKRLKDIVL 2 679 701 QESMGSVNLS LIFIIALGSIAGILFVTMIFVAI KCKRDNKEIR
480 Q8N6Y1 MRGRGNARSSQALGV...DLHMRERKPMDISNI 2 889 911 RKVESVSCMP TLVALSVISLGSITLVTGMGIYI CLRKGEKHPR
481 Q9Y5F3 MAGTRRKSLQNRQVG...SQRLEGHDQVSDDYM 2 690 712 HSRKVNPSTK YLVISLVILSFLFLLSVIVIFII HVYQKIKYRE
482 Q9H158 MVGCGVAVLCLWVSC...EKKEKGNSTTDNSDQ 2 682 704 EPGGQLSAQN LYLVIALACISFLFLGCLLFFVC TKLHQSPGCC
483 Q9Y5F7 MLRKVRSWTEIWRWA...GGNGNKKKSGKKEKK 2 690 712 SAPREGESRL TLYLAVSLVAICFVSFGSFVALL SKCLRGAACG
484 O60245 MLRMRTAGWARGWCL...YSKQMRLHPYITVFG 2 878 900 DPSYEISKQR LSIVIGVVAGIMTVILIILIVVM ARYCRSKNKN
485 Q7TSK3 MSPAKRWGSPCLFPL...GSRYVSPKKGINENV 2 748 770 SGPSLQWDTP LIVIIVLAGSCTLLLAAIIAIAT TCNRRKKEVR
486 Q9HC56 MDLRDFYLLAALIAC...KQAGGATESPKEHQL 2 814 836 SQPYQNEDYL TIMIAIIAGAMVVIVVIFVTVLV RCRHASRFKA
487 Q92824 MGWGSRCCCPGRLDL...IDELEYDDESYSYYQ 2 1745 1764 RPATEHFKTA LFITSSMMLVLLLGAAVVVW KKSRGRVQPA
488 Q16549 MPKGRQKVPHLDAPL...HQHLDVPHGKEEQIC 2 13 35 KGRQKVPHLD APLGLPTCLWLELAGLFLLVPWV MGLAGTGGPD
489 Q9EP73 MRIFAGIIFTACCHL...DTSSKNRNDTQFEET 2 238 260 LPATHPPQNR THWVLLGSILLFLIVVSTVLLFL RKQVRMLDVE
490 Q9BQ51 MIFLLLMLSLELQLH...KRPVTTTKREVNSAI 2 221 243 SQMEPRTHPT WLLHIFIPFCIIAFIFIATVIAL RKQLCQKLYS
491 Q15116 MQIPQAPWPVVWAVL...AQPLRPEDGHCSWPL 2 168 190 PSPRPAGQFQ TLVVGVVGGLLGSLVLLVWVLAV ICSRAARGTI
492 Q9NZ53 MGRLLRAARLPPLLS...RDPEDSDVFEEDTHL 2 500 522 RASQVRSDYG TLFVVLVVIGAICIIIIALGLLY NCWQRRLPKL
493 P16284 MQPRWAQGATMWLGV...VESRYSRTEGSLDGT 2 603 625 RVILAPWKKG LIAVVIIGVIIALLIIAAKCYFL RKAKAKQMPV
494 P07202 MRALAVLSVTLVMAC...AGMEGRDTHRLPRAL 2 849 871 VDSGRLPRVT WISMSLAALLIGGFAGLTSTVIC RWTRTGTKST
495 Q8IYJ0 MESRMWPALLLSHLL...GAPAFQLNRIPLVNL 2 181 203 GRGEGVDPQL YVTITISIIIVLVATGIIFKFCW DRSQKRRRPS
496 P01833 MLLFVLTCLLAVFPA...SSTVAAEAQDGPQEA 2 639 661 SSEEQGGSSR ALVSTLVPLGLVLAVGAVAVGVA RARHRKNVDR
497 Q969N2 MAAAMPLALLVLLLL...RLANLIRRARGVPPL 2 522 544 PLLVNLPTPD FSMPYNVICLTCTVVAVCYGSFY NLLTRTFHIE
498 Q9UKJ1 MGRPLLLPLLPLLLP...LKSPQNETLYSVLKA 2 196 218 RSDSWHISLE TAVGVAVAVTVLGIMILGLICLL RWRRRKGQQR
499 Q13018 MLLSPSLLLLLLLGA...LEENILISDLEKSDQ 2 1398 1420 ALPEKGPSHS IIPLAVVLTLIVIVAICTLSFCI YKHNGGFFRR
500 Q3TTY0 MELYPGVSPVGLLLL...VTQDAVSEKRLKAGN 2 1420 1442 LPDKAEEPSN ALYWAVPVAAIGGLAVGILGVML WRTVKPVQQE
501 Q9Z239 MASPGHILALCVCLL...GTFRSSIRRLSSRRR 2 35 57 EPDPFTYDYH TLRIGGLTIAGILFILGILIILS KRCRCKFNQQ
502 O75051 MEQRRPWPRALEVDS...AYKVEQLINAMSIES 2 1238 1260 VISDSLLTLP AIVSIAAGGSLLLIIVIIVLIAY KRKSRENDLT
503 Q9QY40 MLTDFLQAPVMAPWS...LQQVAALVEYKVTDL 2 1244 1266 ESMMSTFPVE AQLGLGMGAAVLIAAVLLLTLMY RHKSKKALRD
504 Q9QZC2 MEVSRRKTPPRPPYP...HVKVLFDEKKKCKWM 2 950 972 LYVEQESVPS TWYFLIALPILLAIVIVVAVVVT RYKSKELSRK
505 Q8TEM1 MAARGRGLLLLTLSV...ASPPSGLWSPAYASH 2 1809 1831 LFQHFLDSYQ VMFFTLFALLAGTAVMIIAYHTV CTPRDLAVPA
506 O00592 MRCALALSALLLLLS...NLTKDDLDEEEDTHL 2 461 483 PEEAEDRFSM PLIITIVCMASFLLLVAALYGCC HQRLSQRKDQ
507 Q8N131 MGLGARGAWAALLLG...RGIRYRTIDEHDAII 2 169 191 KKGSKFDTGS FVGGIVLTLGVLSILYIGCKMYY SRRGIRYRTI
508 P16471 MKENVASATVFTLLL...GLDYLDPACFTHSFH 2 236 258 IPSDFTMNDT TVWISVAVLSAVICLIIVWAVAL KGYSMVTCIF
509 P0DTF9 MCWLRAWGQILLPVF...ETVPIHDRSATVYDE 2 98 120 SIYWLNCKVD MFGIMMLLLIAVLITGFVWYCCA YHFYLQDLNR
510 P18433 MDSWFILVLLGSGLI...VQEYIDAFSDYANFK 2 152 174 DSKDRRDETP IIAVMVALSSLLVIVFIIIVLYM LRFKKYKQAG
511 B2RU80 MLRHGALTALWITLS...NVNPEYHRDAIYSRH 2 1620 1642 ITTESEPLFG VIEGVSAGLFLIGMLVALVAFFI CRQKASHSRE
512 P08575 MTMYLWLKLLAFGFA...HSVNGPASPALNQGS 2 580 602 HSTSYNSKAL IAFLAFLIIVTSIALLVVLYKIY DLHKKRSCNL
513 P23469 MEPLCPLLLVGFSLP...VQDFIDIFSDYANFK 2 47 69 GPPDPGASQP LLAWLLLPLLLLLLVLLLAAYFF RFRKQRKAVV
514 P23470 MRRLLEPCWWILFLK...IADESDPAESMESLV 2 737 759 ISRPAPGRME WIIPLIVVSALTFVCLILLIAVL VYWRGCNKIK
515 Q12913 MKPAAREARLPPRSP...LAPVTTFGKTNGYIA 2 974 996 VSLPQDPGVI CGAVFGCIFGALVIVTVGGFIFW RKKRKDAKNN
516 Q16849 MRRPRRPGGLGGSGG...AVAEEVNAILKALPQ 2 577 599 TAHSTSPMRS VLLTLVALAGVAGLLVALAVALC VRQHARQQDK
517 E9Q612 MGHLPRGTLGGRRLL...QFCISDVIYENVSKS 2 831 853 TMVTEVNPNV VVISVLAILSTLLIGLLLVTLVI LRKKHLQMAR
518 Q9UMZ3 MKKVPIKPEQPEKLR...AMEGDVELEWEETTM 2 1948 1970 GEGLSERTVE IILSVTLCILSIILLGTAIFAFA RIRQKQKEGG
519 Q15256 MRRAVCFPALCLLLN...ALCLYESRLSAETVQ 2 227 249 HEADKIWSKE GFYAVVIFLSIFVIIVTCLMILY RLKERFQLSL
520 Q99M80 MGSLGGLALCLLRLL...YKFVYEVALEYLSSF 2 772 791 KQVDNTVKMA GVIAGLLMFIIILLGVMLTI KRRKLAKKQK
521 Q92729 MARAQALVLALTFQL...CYDVALEYLEGLESR 2 747 769 EVSQRSEEMG LILGICAGGLAVLILLLGAIIVI IRKGRDHYAY
522 P70289 MRPLILLAALLWLQD...CLNSALRNRLPRARK 2 1078 1100 QASISLVAMP LTVMMGTVVGCIIIVCAVLCLLC RRGLKGPRSE
523 P15151 MARAMAAAWPLLLVA...RENSSSQDPQTEGTR 2 345 367 SEHSGISRNA IIFLVLGILVFLILLGIGIYFYW SKCSREVLWH
524 Q9NXS2 MRSGGRGRPRLRLGE...LCRILAVFLAEYLGL 2 33 55 LPPKRRLLPR VRLLPLLLALAVGSAFYTIWSGW HRRTEELPLG
525 Q8TD07 MRRISLTSSPVRLLL...QNGEWQAGLWPLRTS 2 226 248 IHWSSSSLPD RWIILGAFILLVLMGIVLICVWW QNGEWQAGLW
526 O75787 MAVFVVLLALVAGVL...DSIIYRMTNQKIRMD 2 309 331 YNFEYSVVFN MVLWIMIALALAVIITSYNIWNM DPGYDSIIYR
527 P07949 MAKATSGAAGLRLLL...MLSPSAAKLMDTFDS 2 13 32 KATSGAAGLR LLLLLLLPLLGKVALGLYFS RDAYWEKLYV
528 Q68DV7 MSGGHQLQLAALWPW...PGSEEELEELCEQAV 2 199 218 KEPPAWPDYD VWILMTVVGTIFVIILASVL RIRCRPRHSR
529 Q04912 MELLPPLPQSFLLLL...NVRRPRPLSEPPRPT 2 960 982 PDGVPQSTLL GILLPLLLLVAALATALVFSYWW RRKQLVLPPN
530 Q01974 MARGSALPRRPLLCI...CDTLQVDEAQVQLEA 2 403 425 SCSPRDSSKM GILYILVPSIAIPLVIACLFFLV CMCRNKQKAS
531 P08922 MKNIYCLIPKLVNFA...NYACLTHSGYGDGSD 2 1860 1882 LVGDDFWIPE TSFILTIIVGIFLVVTIPLTFVW HRRLKNQKSA
532 P04843 MEAPAAGLFLLLLLG...RQELVTKIDHILDAL 2 440 459 FNKVLMLQEP LLVVAAFYILFFTVIIYVRL DFSITKDPAA
533 Q9HBV2 MSPRGTGCSAGLLMT...TEMPGEDDALSEWNE 2 217 239 MRRSSLPATD AALIFVLTIGVIICVFIIFLLIF IIINWAAVKA
534 Q96BY9 MAAACGPGAAGYCLL...TKTRTASGYGGTRRR 2 172 194 YYKWSSADSC NMSGLITIVVLLGIAFVVYKLFL SDGQYSPPPY
535 P21583 MKKTQTWILTCIYLQ...EISMLQEKEREFQEV 2 215 237 KNPPGDSSLH WAAMALPALFSLIIGFAFGALYW KKRQPSLTRA
536 Q9JL59 MLAYSVTSSGLFPRM...LIHGSPGIPYLTLPP 2 160 182 PDKPPTAVRT EVIIIIAIATTIIITGIGVFVWY KQFPVAPQIQ
537 Q8WVN6 MQTCPLAFPGHVSQA...ELLSPQPLFPYAADP 2 146 168 AEPQSAPDTG FWPVPAVVTAVFILLVALVMFAW YRCRCSQQRR
538 Q7Z5N4 MARGARPSAAGGGGG...GPGARTPLTGFSSFV 2 2008 2030 AQVEAPFYEE WWFLLVMALSSLIVILLVVFALV LHGQNKKYKN
539 Q9UBV2 MRVRIGLTLLLCAVL...APPQQEGPPEQQPPQ 2 739 761 MFTQLDMDQL LGPEWDLYLMTIIALLLGTVIAY RQRQHQDMPA
540 Q14242 MPLQLLLLLILLGPG...EDREGDDLTLHSFLP 2 321 343 APDHISVKQC LLAILILALVATIFFVCTVVLAV RLSRKGHMYP
541 Q9H3S1 MALPALGLDPWSLLG...SDVDADNNCLGTEVA 2 681 703 GAALAAQQSY WPHFVTVTVLFALVLSGALIILV ASPLRALRAR
542 Q92854 MRMCTPIRGLLMALA...VKCELKFADSDADGD 2 734 756 KTMYLKSSDN RLLMSLFLFFFVLFLCLFFYNCY KGYLPRQCLK
543 Q9Z123 MLARAERPRPGPRPP...RLTGAPLATCDETSI 2 665 687 QRGPANRAHT VVGAGLVGFFLGVLAASLTLLLI GRRQQRRRQR
544 Q9NTN9 MWGRLWPLLLSILTA...RKHTQLVEQLDESSV 2 679 701 LAPDVRLLYV LAIAALGGLCLILASSLLYVACL REGRRGRRRK
545 Q13591 MKGTCVIAWLFSSLG...YSNAYFTDLNNYDEY 2 971 993 KRCGEFNMFH MIAVGLSSSILGCLLTLLVYTYC QRYQQQSHDA
546 Q9H2E6 MRSEALLLYFTLLHF...FAPLSTSMKPNDACT 2 648 670 YLKGHDQLVP VTLLAIAVILAFVMGAVFSGITV YCVCDHRRKD
547 Q9H3T3 MQTPRASPPRPALLL...LLPYGGADRTAPPVP 2 602 624 GLVSVNLLVT SSVAAFVVGAVVSGFSVGWFVGL RERRELARRK
548 Q9WTM3 MPRAPHSMPLLLLLL...ASPPQPAPHGGHFNF 2 604 626 SPASASRSIP IPLLLACVAAAFALGASVSGLLV SCACRRANRR
549 Q8NFY4 MRVFLLCAYILLLMV...VPQTPSVRPLNKYTY 2 664 686 ESNQMVHMNV LITCVFAAFVLGAFIAGVAVYCY RDMFVRKNRK
550 Q53EL9 MRPVALLLLPSLLAL...TYETGSLSFAGDERI 2 926 948 AASSTLDAAH IAAAIFLPLVAMVLLVGGVYFYF SRLQGKSSLQ
551 Q16586 MAETLFWTPLLVVLL...PRVDSAQVPLILDQH 2 290 312 EAPDRDFLVD ALVTLLVPLLVALLLTLLLAYVM CCRREGRLKR
552 Q6UWI4 MWGARRSSVSSSWNA...PHTNSEQKMYPAVTV 2 114 136 KDGPDGSAVP IYVPFLIVGSVFVAFIILGSLVA ACCCRCLRPK
553 Q96DD7 MPPAGLRRAAPLTAI...APPPYMPPQPSYPGA 2 86 108 KHCLAFSPKT IAGIASAVILFVAVVATTICCFL CSCCYLYRRR
554 Q6ZSJ9 MALRRLLLLLLLSLE...GHHTCYTASKTEVTV 2 176 195 KYDPEKDKTN FTVYITCGVIAFVIVAGVFA KVSYDKAHRP
555 B8ZZ34 MARAGARGLLGGRRP...GSRYLRTNSKTEVTV 2 141 163 RDPGRERSHT AVYAVCGVAALLVLAGIGARLGL ERAHSPRARR
556 Q3SXP7 MTSCGQQSLNVLAVL...DAHSPPLMTFQSSSA 2 99 121 EGYMHNNYTA LLGVWIYGFFVLMLLVLDLLYYS AMNYDICKVY
557 Q96LC7 MLLPLLLSSLLGGSQ...MPKGTQADYAEVKFQ 2 548 570 PDKKGLISTA FSNGAFLGIGITALLFLCLALII MKILPKRRTQ
558 Q08ET2 MLPLLLLPLLWGGSL...TRCGGPQQSRAERPG 2 359 381 KQQGSWPLVL TLIRGALMGAGFLLTYGLTWIYY TRCGGPQQSR
559 O43699 MQGAQEASASEMLPL...PKVTDTEYSEIKIHK 2 346 368 VHWKPEGRAG GVLGAVWGASITTLVFLCVCFIF RVKTRRKKAA
560 Q9NYZ4 MLLLLLLLPLLWGTK...ACLRNHNPSSKEVRG 2 362 384 TSRPVSQVTL AAVGGAGATALAFLSFCIIFIIV RSCRKKSARP
561 Q5JXA9 MCSTMSAPTCLAHLP...PAGAMNTLAWSKGQE 2 289 311 PATEMSPTGL LVVFAPVVLGLKAITLAALLLAL ATSRRSPGQE
562 Q9Y3P8 MNQADPRLRAVCLWT...SFPDQAYANSQPAAS 2 41 63 LALGIPSITQ AWGLWVLLGAVTLLFLISLAAHL SQWTRGRSRS
563 Q13291 MDPKGLLSLTFVLFL...TNSITVYASVTLPES 2 238 260 CRTDPSETKP WAVYAGLLGGVIMILIMVVILQL RRRGKTNHYQ
564 Q9QUM4 MDPKGSLSWRILLFL...PNPTTVYASVTLPES 2 243 265 KQESSSESSP WMQYTLVPLGVVIIFILVFTAII MMKRQGKSNH
565 Q9UIB8 MAQHHLWILLLCLQT...QDSKPPGTSSYEIVI 2 224 246 DIAMGFRTHH TGLLSVLAMFFLLVLILSSVFLF RLFKRRQGRI
566 Q96DU3 MLWLFQSLLFVFCFG...SKPTFSRATALDNVV 2 226 248 DVKIQYTDTK MILFMVSGICIVFGFIILLLLVL RKRRDSLSLS
567 Q9ET39 MAVSRAPAPDSACQR...QPITLKVNTLINYNS 2 240 262 KGVLTNPPWN AVWFMTTISIISAVILIFVCWSI HVWKRRGSLP
568 Q8BHK6 MARFSTYIIFTSVLC...AKPLVPRSLSFENVI 2 224 246 DAATDLTSLR GILYILCFSAVLILFAVLLTIFH TTWIKKGKGC
569 Q9D3G2 MWSLWSLLLFEALLP...ACTDGVLPETENALV 2 234 256 SGKASYKDVL LVVVPITLFLILAGLFGAWHHGL CSGKKKDACT
570 Q96A28 MCAFPWLLLLLLLQE...RMKLRKEAKPGSSPA 2 240 259 PSTAFCLLAK GLLIFLLLVILAMGLWVIRV QKRHKMPRMK
571 Q96PX8 MLLWILLLETSLCFA...GAHRVYDCGSHSLSD 2 621 643 DTSRVSISVL VPGLLLVFVTSAFTVVGMLVFIL RNRKRSKRRD
572 Q9H156 MLSGVWFLSVLTVAG...DYLEVLEKQTAISQL 2 622 644 LHTEVPLSVL ILGLLVVFILSVCFGAGLFVFVL KRRKGVPSVP
573 Q3V0X1 MKPLKLFCIGLLLCP...SLEMQNMNLIKLFGG 2 83 105 FKNHLSDFFK SSIPPAAIFALFVTTAIMRAAIV NKRLEEPHRQ
574 Q62230 MCVLFSLLLLASVFS...ATKKNTIQEEVVAAL 2 1641 1663 LHQLQLFQRL LWVLGFLAGFLCLLLGLVAYHTW RKKSSTKLNE
575 Q96PQ0 MAHRGPSRASKGPGP...VLSINSREMHSYLVS 2 1078 1100 GTGAEQLGGG GGYWAVVVLFVIGLFAAGAFILY KFKRKRPGRT
576 Q9WU03 MAQLCELRRGRALLA...TADDKEQLVKNTCVL 2 198 220 MHPFLTPGLK AVILVGLFLMVLILLLGTSMVCL IRVVRRKQER
577 P43307 MRLLPRLLLLLLLVF...LPRKRAQKRSVGSDE 2 207 229 IEREDGLDGE TIFMYMFLAGLGLLVIVGLHQLL ESRKRKRPIQ
578 P43308 MRLLSFVVLALFAVT...YSSKRKYDTPKTKKN 2 147 169 REFDRRFSPH FLDWAAFGVMTLPSIGIPLLLWY SSKRKYDTPK
579 Q9NY15 MAGPRGLLPLCLLAF...LEEDFPDTQRILTVK 2 2477 2499 AVLAPEAPPV AAGVGAVLAAGALLGLVAGALYL RARGKPMGFG
580 Q8WWQ8 MMLQHLVIFCLGLVV...SEERQLEGNDPLRTL 2 2461 2483 APVTLTHTGL GAGIFFAIILVTGAVALAAYSYF RINRRTIGFQ
581 Q13586 MDVCVRLALWLLWGL...RKKFPLKIFKKPLKK 2 214 232 LLTRHNHLKD FMLVVSIVIGVGGCWFAYI QNRYSKEHMK
582 Q9UGT4 MKPALLPWALLLLAT...LRRRKGNTHVWGAQP 2 786 808 PKCQPGRSYA VLLGIIFGGLAVVAAVALVYVLL RRRKGNTHVW
583 Q5VX71 MYHGMNPSNGDGFLE...GIDIADEIPLMEEDP 2 317 339 QTWPSTHETL LTTWKIVAFTATSVLLVLLLVIL ARMFQTKFKA
584 O60279 MTAEGPSPPARWHRR...RQARHYHQQIEMEKV 2 576 598 GCPGLSRGPV IATIVTVLCLLLLLAGVGMVWGY RKCQHKSSVY
585 Q9UQF0 MALPYHIFLFTVLLP...SAAQPLLRPNSAGSS 2 455 477 MPWILPFLGP LAAIILLLLFGPCIFNLLVNFVS SRIEAVKLQM
586 Q24JP5 MCARMAGRTTAAPRG...PEELRNYMERIRGSS 2 851 873 QHVTELELGM YALLGVFCVAIFIFLVNGVVFVL RYQRKEPPDS
587 P40200 MEKKWKYCAVYYIIQ...EPNESDLPYHEMETL 2 518 540 TGIVVNKPKD GMSWPVIVAALLFCCMILFGLGV RKWCQYQKEI
588 B6A8C7 MIPKLLSLLCFRLCV...SRNVSPGESEAFKPE 2 234 256 TTSSNYSLGN FVRLGLAAVIVVIMGAFLVEAWY SRNVSPGESE
589 A0A1B0GTY4 MSNQRLPLIFSLLFI...CKLKKKSKEEGARRY 2 76 98 FANMDIFQGC LYLIYNLLQAVFFVLFVLSVHYL WKKWKKHQKK
590 P13726 METPAWPRVPRPETA...GVGQSWKENSPLNVS 2 252 274 MGQEKGEFRE IFYIIGAVVFVVIILVIILAISL HKCRKAGVGQ
591 P37173 MGRGLLRGLWPLHIV...SEEKIPEDGSLNTTK 2 167 189 NPDLLLVIFQ VTGISLLPPLGVAISVIIIFYCY RVNRQQKLSS
592 O43493 MRFVVALVLLNVAAA...RRPKASDYQRLDQKS 2 385 402 SGNGSAESSH FFAYLVTAAILVAVLYIA HHNKRKIIAF
593 Q9UPZ6 MGLQARRWASGSRGA...RLKPLTLAYDGDADM 2 1607 1629 PFGPDGRLKT WVYGVAAGAFVLLIFIVSMIYLA CKKPKKPQRR
594 Q9NS62 MKPMLKDFSNLLLVV...EDETTSTLSVEKLVI 2 414 436 QPQGPVKSNN IVTVTGISLCLFIIIATVLITLW RRFGRPAKCS
595 Q02763 MDSLASLVLCGVSLL...KFTYAGIDCSAEEAA 2 748 770 PADLGGGKML LIAILGSAGMTCLTVLLAFLIIL QLKRANVQRR
596 Q495A1 MRWCLLLIWAQGLRQ...SYRSLGNCSFFTETG 2 142 164 AEHGARFQIP LLGAMAATLVVICTAVIVVVALT RKKKALRIHS
597 Q96H15 MSKEPLILWLMIEFW...DVQHGREDEDGLFTL 2 314 336 MSMKNEMPIS QLLMIIAPSLGFVLFALFVAFLL RGKLMETYCS
598 Q8TB96 MAAAGRLPSSWALFS...REKRQEAHRFHFDAM 2 566 588 SAKLYLTPSN IVLLTAIALIGVCVFILAIIGIL HWQEKKADDR
599 Q6R5P0 MPRMERHQFCSVLLI...EQLKRRLSKAGQERD 2 719 741 FSFLATNCPH GTEFWGFLTSFILLLLLIILPLI SCPKWSWLHH
600 Q15399 MTSIFHFAIIFMLIL...LRAAINIKLTEQAKK 2 582 604 MSELSCNITL LIVTIVATMLVLAVTVTSLCSYL DLPWYLRMVC
601 Q9QUN7 MLRALWLFWILVAIT...QQEVFWVNLRTAIKS 2 588 610 ARPSVLECHQ AALVSGVCCALLLLILLVGALCH HFHGLWYLRM
602 O15455 MRQTLPCIYFWGGLL...RHKLQVALGSKNSVH 2 703 725 TSSCKDSAPF ELFFMINTSILLIFIFIVLLIHF EGWRISFYWN
603 O00206 MMSASRLAGTLIPAM...GTVGTGCNWQEATSI 2 634 656 NITCQMNKTI IGVSVLSVLVVSVVAVLVYKFYF HLMLLAGCIK
604 O60602 MGDHLDLLLGVVLMA...KKDNNIPLQTVATIS 2 644 666 VLKSLKFSLF IVCTVTLTLFLMTILTVTKFRGF CFICYKTAQR
605 Q9NYK1 MVFPMWTLKRQILIL...TDNHVAYSQVFKETV 2 843 865 CELDLTNLIL FSLSISVSLFLMVMMTASHLYFW DVWYIYHFCK
606 Q9NR97 MENMFLQSSMLTCIF...DSRYNNMYVDSIKQY 2 826 848 SLELTTCVSD VTAVILFFFTFFITTMVMLAALA HHLFYWDVWF
607 Q4V9L6 MVSAAAPSLLILLLL...PPESPCACSSVHPSV 2 95 117 DGIVDFFRQY VMLIAVVGSLAFLLMFIVCAAVI TRQKQKASAY
608 Q8N3G9 MAQAVWSRLGRILWL...GLLPPLYKSVKTYTV 2 340 362 IQVWPSRIQP AVFAFPCATLITVMLAFIMYMTL RNATQQKDMV
609 Q6P9G4 MQAPRAALVFALVIA...KEEKESNHNPSDSES 2 76 98 NFAPDENQLE FILMVLIPLILLVLLLLSVVFLA TYYKRKRTKQ
610 Q8WZ59 MLGCGIPALGLLLLL...GTEEGEETEGEEEED 2 82 104 PDENVRRKHM WALVWTCSGLLLLSCSICLFWWA KRRDVLHMPG
611 Q6UWW9 MSRSRLFSVTSAIST...PLGSPPPYEEIVKTT 2 49 71 CVNYNDQHPN GWYIWILLLLVLVAALLCGAVVL CLQCWLRRPR
612 A6NLX4 MAPGPWPVSCLRGGP...ASSEEPPPPPPLPPE 2 44 66 CECSLGLSRE ALIALLVVLAGISASCFCALVIV AIGVLRAKGE
613 A2RRL7 MQRLPAATRATLILS...KLMKLTPDEPKDLQA 2 70 89 ARCCRTGVDE YGWIAAAVGWSLWFLTLILL CVDKLMKLTP
614 Q5T292 MNLGVSMLRILFLLD...RNHNFSKRDAQVIEL 2 37 59 TPGAEIDFKY ALIGTAVGVAISAGFLALKICMI RRHLFDDDSS
615 Q4KMG9 MGVRVHVVAASALLY...PPTEKESTRIVDSWN 2 42 64 EHCLTTDWVH LWYIWLLVVIGALLLLCGLTSLC FRCCCLSRQQ
616 Q9D2R4 MQIQTILLCFSFSFS...QMKHLKDFFIAKKLV 2 183 202 RITSEDTNRN VLWWAFAQILIFISVGIFQM KHLKDFFIAK
617 P0DPE3 MAARTLASALVLTLW...KMGAQSWGSGALDGL 2 220 242 RSPMGWAGPL ALGLLTGFVGALGTGALVVLLTL WITGGDGDRA
618 Q13445 MMAAGAALALALWLL...CTLKRFFQDKRPVPT 2 193 215 RNLQEGNLER VNFWSAVNVAVLLLVAVLQVCTL KRFFQDKRPV
619 Q15363 MVTLAELLVLLAALL...QIYYLKRFFEVRRVV 2 169 191 RAINDNTNSR VVLWSFFEALVLVAMTLGQIYYL KRFFEVRRVV
620 Q9Y3Q3 MGSTVPRSASVLLLL...SFFTEKRPISRAVHS 2 179 201 RARAEDLNSR VSYWSVGETIALFVVSFSQVLLL KSFFTEKRPI
621 Q8WW62 MSPLLFGAGLVVLNL...LFNVPTTTDTKKPRC 2 201 223 FFLIQSNYNY VNWWSTAQSLVIILSGILQLYFL KRLFNVPTTT
622 P49755 MSGLSGPPARRGPFP...VFYLRRFFKAKKLIE 2 186 208 RDTNESTNTR VLYFSIFSMFCLIGLATWQVFYL RRFFKAKKLI
623 Q9P0T7 MKLLSLVAVVGCLLV...QEQRKTVFDRHKMLS 2 90 112 YEERSTTTIK VIIVIYLSVVGALLLYMAFLMLV DPLIRKPDAY
624 O14668 MGRVFLTGEKANSIL...SASAIPMVPVVTTIK 2 84 106 RGSDWFQFYL TFPLIFGLFIILLVIFLIWRCFL RNKTRRQTVT
625 O14669 MRGHPSLLLLYMALT...HDAPPPPYTSLRRPH 2 110 132 GGRGRVDVAS LAVGLTGGILLIVLAGLGAFWYL RWRQHRGQQP
626 Q9BZD7 MAVFLEAKDAHSVLK...PKYEEIVAANPGADK 2 79 101 SVRDPSQSSD AMYVVVPLLGVALLIVIALFIIW RCQLQKATRH
627 Q9BZD6 MFTLLVLLSQLPTVT...KGFRVFKKSMSLPSH 2 118 140 NREKIDVMGL LTGLIAAGVFLVIFGLLGYYLCI TKCNRLQHPC
628 Q8NEW7 MAGWPGAGPLCVLGG...EEDEKNEAKKKKGEK 2 56 78 KETVVFWDMR LWHVVGIFSLFVLSIIITLCCVF NCRVPRTRKE
629 Q9D7L8 MVWKITGPLQACQLL...KLCGKKNDPNSETAL 2 217 239 FHLLVKDKVF VMPAEPIIAACVVVVLTMAFALF SRRKRIMKLC
630 Q96BF3 MGSPGMVLGLLVQIW...TQQPRPKGFPKVGEE 2 150 172 QNRNRIASFP GFLFVLLGVGSMGVAAIVWGAWF WGRRSCQQRD
631 G3X8R9 MEFLLLLSLALFSDA...ASQGPSMVSITLARI 2 151 173 KAVQKAEGSR MSILIICILITSLGIIFIISHLS RGRRSQRNRE
632 Q9DCF1 MELPLSQATLRHTLL...YIYRVSSVSSDEIWL 2 238 260 ATRIEVPLLG IVVAGGLALGTLVGFSTLVACLV CRKEKKTKGP
633 Q9BXS4 MAAPKGSLWVRTQLG...AGPLPTKVNLAHSEI 2 240 262 FLRCLSLNSG WILTTTLVLSVMVLLWICCATVA TAVEQYVPSE
634 Q9D5K1 MKTGAIVFILRSLLS...KALGGTDGSGGRTRL 2 221 243 KPHLPVWQRK VTSALGIGIVAGVVGGVLVSVAV FKALGGTDGS
635 A2RUT3 MLHVLASLPLLLLLV...TQRQIQIKGTSTQSG 2 64 86 CPGYWLGPGA SRIYPVAAVMITTTMLMICRKIL QGRRRSQATK
636 Q6UXU6 MSQAWVPGLAPTLLF...EEYTGDQRGIDNPAF 2 54 76 SCCQENELFP GPVRIFVIIFLVILSVFCICGLA KCFCRNCREP
637 B7ZWI3 MLDTWVWGTLTLTFG...EGPAGQMRGRAYATL 2 64 86 VWDPANDRFR FLVILACIIFPILFICALVSLFC PNCTELQHDV
638 Q3KNT9 MWRLALGGVFLAAAQ...SLLVESHHLQAKSGL 2 146 165 PGSQDLWEAK ILLLSIFGAFLLLGVLSLLV ESHHLQAKSG
639 Q9H3N1 MAPSGSLAVPLAVLV...IRQRSLGPSLATDKS 2 181 203 EDLGLPVWGS YTVFALATLFSGLLLGLCMIFVA DCLCPSKRRR
640 Q9Y320 MAVLAPLIALVYSVP...STPTTVSDGENKKDK 2 103 125 IFMFSKVANT ILFFRLDIRMGLLYITLCIVFLM TCKPPLYMGP
641 O35305 MAPRARRRRQLPAPL...AQTSLHTQGSGQCAE 2 212 234 RRPPKEAQAY LPSLIVLLLFISVVVVAAIIFGV YYRKGGKALT
642 Q9CR75 MASAWPRSLPQILVL...EETGGEGCPGVALIQ 2 79 101 AAPPAHFRLL WPILGGALSLVLVLALVSSFLVW RRCRRREKFT
643 Q92956 MEPPGDWGPPPWRST...VEETIPSFTGRSPNH 2 201 223 KAGAGTSSSH WVWWFLSGSLVIVIVCSTVGLII CVKRRKPRGD
644 Q80WM9 MEPLPGWGSAPWSQA...TEVGFAETEEETASN 2 208 230 STDTTCSSQV VYYVVSILLPLVIVGAGIAGFLI CTRRHLHTSS
645 Q9Y5U5 MAQHGAMGAFRALCG...ERSAEEKGRLGDLWV 2 165 187 PGSPPAEPLG WLTVVLLAVAACVLLLTSAQLGL HIWQLRSQCM
646 P20333 MAPVAVWAALAVGLE...KPLPLGVPDAGMKPS 2 258 280 SPPAEGSTGD FALPVGLIVGVTALGLLIIGVVN CVIMTQVKKK
647 Q93038 MEQRPRGCAAVAAAL...DGCVEDLRSRLQRGP 2 200 222 RCAAVCGWRQ MFWVQVLLAGLVVPLLLGATLTY TYRHCWPHKP
648 P83626 MTRLRLLLLLGLLLR...YCKRGENIQLSSTML 2 163 185 SGSQCFCFSK PLGIVVIIAAFIIIIGAVIILIL KIICYCKRGE
649 P36941 MLLPWATSAPGLAWG...TPSNRGPRNQFITHD 2 226 248 PLPPEMSGTM LMLAVLLPLAFFLLLATVFSCIW KSHPSLCRKL
650 P25942 MVRLPLQCVLWGCLL...QEDGKESRISVQERQ 2 193 215 DVVCGPQDRL RALVVIPIIFGILFAILLVLVFI KKVAKKPTNK
651 P25446 MLWIWAVLPLVLAGS...STPDTGNENEGQCLE 2 170 187 NCRKQSPRNR LWLLTILVLLIPLVFIYR KYRKRKCWKR
652 P28908 MRVLLAALGLLFLGA...EEGKEDPLPTAASGK 2 386 408 STGKPVLDAG PVLFWVILVLVVVVGSSAFLLCH RRACRKRIRQ
653 Q07011 MGNSCYNIVATLLLV...CSCRFPEEEEGGCEL 2 191 213 PGHSPQIISF FLALTSTALLFLLFFLTLRFSVV KRGRKKLLYI
654 Q13641 MPGGCSRGPAAGDGR...NADPRLTNLSSNSDV 2 354 376 CDPILPPSLQ TSYVFLGIVLALIGAIFLLVLYL NRKGIKKWMH
655 P40238 MPSWALFMVTSCLLL...IANHSYLPLSYWQQP 2 491 513 TRVETATETA WISLVTALHLVLGLSAVLGLLLL RWQFPAHYRR
656 Q9BX59 MGTQEGWCLLLCLAL...SHLHEDRTARVSQPS 2 407 426 TQVVPPERRT ALGVIFASSLFLLALMFLGL QRRQAPTGLG
657 O15533 MKSLSLLLAVALGLA...AVYLSTCKDSKKKAE 2 414 436 GLSGPSLEDS VGLFLSAFLLLGLFKALGWAAVY LSTCKDSKKK
658 O00220 MAPPPARVHLGAFLA...FIYLEDGTGSAVSLE 2 240 262 VHKESGNGHN IWVILVVTLVVPLLLVAVLIVCC CIGSGCGGDP
659 O14763 MEQRGQNAPAASGAR...GKFMYLEGNADSAMS 2 209 231 TSSPGTPASP CSLSGIIIGVTVAAVVLIVAVFV CKSLLWKKVL
660 Q9UBN6 MGLWGQSVPTASSAR...LFYEEDEAGSATSCL 2 210 232 TTILGMLASP YHYLIIIVVLVIILAVVVVGFSC RKKFISYLKG
661 Q9JKE1 MSPLLLWLGLMLCVS...QPSKTSKVQGVSEKQ 2 138 160 LAWCQGKPVM VIVLTCGFILNKGLVFSVLFVFL CKAGPKVLQP
662 Q8K558 MDCYLLLLLLLLGLA...EPVQDPPNSQTPPSK 2 174 196 PHEFRRRENS IPLIWGAVLLLALVVVAVVIFAV MARKKGNRLV
663 Q5T2D2 MAPAFLLLLLLWPQG...DPPGRPEPYVEVYLI 2 267 289 SMPSIRHQDV YSTVLGVVLTLLVLMLIMVYGFW KKRHMASYSM
664 Q3LRV9 MAWRYSQLLLVPVQL...TVSHISGYEKKANWY 2 200 222 PGWTSPGLLV SVQYGLLLLKALMLSVFCVLLCW RSGQGREYMA
665 Q5BVD1 MDLAQPSQPVDELEL...CESLGLDPTSLLLYE 2 66 88 CNKNVVGRCK LWMIITSIFLGVITVIIIGLCLA AVTYVDEDEN
666 Q9P2J2 MVWCLGLAVLSLVIS...AYRQPVPHPEQATLL 2 738 760 PGLLPQPVLA GVVGGVCFLGVAVLVSILAGCLL NRRRAARRRR
667 Q9UPX0 MIWYVATFIASVIGT...PAPATSPPERALSKL 2 727 749 DGLARPVLAG IVATICFLAAAILFSTLAACFVN KQRKRKLKRK
668 Q96J42 MVPAAGRRPPRVMRL...SIRWLIPGQEQEHVE 2 324 342 LPSTLIKSVD WLLVFSLFFLISFIMYATI RTESIRWLIP
669 P0DTE4 MLNNLLLFSLQISLI...SCQKFGKIGKKKKRE 2 491 513 LTWFQYHSLD VIGFLLVCVTTAIFLVIQCCLFS CQKFGKIGKK
670 Q8IZJ1 MGARSGARGALLLAL...GKSEMLVAVATDGDC 2 376 398 HLLEASGDAA LYAGLVVAIFVVVAILMAVGVVV YRRNCRDFDT
671 Q80YF6 MVRTRWQPPLRALLL...VGCEPGLDPLPSLSP 2 196 218 IDTWPGRRSG CMIVITSILSALAGLLLLAFLAA STTRFSSLWW
672 B0FP48 MDNSWRLGPAIGLSA...YTTHLAFSTPAEGAS 2 203 225 AVPGPQSPGT VVIIAILSILLAVLLTVLLAVLI YTCFNSCRST
673 Q5DID0 MLRTSGLALLALVSA...FKIQSNNFSYQVFYE 2 1271 1293 PPHAEAGLGA GYVVLIVVAIFVLVAGTATLLIV RYQRMNGRYN
674 O75445 MNCPVLSLGSGFLFQ...SSVTKERTTFTDTHL 2 5041 5063 RSKSTEFYSE LWFIVLMAMLGLILLAIFLSLIL QRKIHKEPYI
675 P29533 MPVKMVAVLGASTVL...MKGSYSLVEAQKSKV 2 699 721 EHNKDYFSPE LLALYCASSLVIPAIGMIVYFAR KANMKGSYSL
676 Q9H7M9 MGVPTALEAGSWRWG...PSLDPVPDSPNFEVI 2 194 216 SSQDSENITA AALATGACIVGILCLPLILLLVY KQRQAASNRR
677 Q96AW1 MRRQPAKVAALLLGL...CNTPPPPYEQVVKAK 2 59 81 RCCVRALSIQ RLWYFWFLLMMGVLFCCGAGFFI RRRMYPPPLI
678 Q8N0Z9 MAAGGSAPEPRVLVC...SEEQSDIVQEEDRPV 2 412 434 WLSVKEPLNI GGIVGTIVSLLLLGLAIISGLLL HYSPVFCWKV
679 Q86XK7 MVFAFWKVFLILSCL...VVEPLSEDEKGVVKA 2 234 256 IDLTSSHPEV GIIVGALIGSLVGAAIIISVVCF ARNKAKAKAK
680 Q96IQ7 MAELPGPFLCGALLG...ASTVTTTKSKLPMVV 2 242 264 LSVTEPSQGR VAGALIGVLLGVLLLSVAAFCLV RFQKERGKKP
681 Q9Z109 MAWPLVGAFLCGHLL...ASTMTTTKSKLSMVV 2 243 265 LSVTDSSEGR VAGTLIGVLLGVLLLSVAAFCLI RFQKERKKEP
682 Q9Y279 MGILLGLLLLGHLTV...PLDYEFLATEGKSVC 2 284 306 TSAGPGKSLP VFAIILIISLCCMVVFTMAYIML CRKTSQQEHV
683 P0DPA2 MRVGGAFHLLLVCLS...DCAEGPVQCKNGLLV 2 265 287 KVSDSRRIGV IIGIVLGSLLALGCLAVGIWGLV CCCCGGSGAG
684 Q8IW00 MRLLALAAAALLARA...TSTVYAQILFEENKL 2 179 201 ETWAFFEDLY VYAVLVCCVGILSILLFMLVIVW QSVFNKRKSR
685 A8MXK1 MRPLPSGRRKTRGIS...KESTTEEIELEDVEC 2 148 170 VSEILYEDLH FVAVILAFLAAVAAVLISLMWVC NKCAYKFQRK
686 P55808 MESWWGLPCLAFLCF...NNRRNCFRTHEPENV 2 143 165 GNPEGNMVAK IVSPIVSVVVVTLLGAAASYFKL NNRRNCFRTH
687 Q9Y493 MVPPVWTLLLLVGAA...LARLVDTDTVLDCAC 2 2756 2778 PRDAPPPRKP ASNLVGVLLGLLVPVVVVLLAVT RECIYRTRRK
688 Q9ULT6 MRPRSGGRPGATGRR...AGPRSHSADSSSPGA 2 216 238 QHRPPRQPTE YFDMGIFLAFFVVVSLVCLILLV KIKLKQRRSQ
689 Q8WWF5 MPLCRPEHLMPRASR...EYTTVSSAPPEAPGQ 2 255 277 DLGCHPVLTV SWVLGCTLALVVSAFFVLNHLWL WAQACCSHRR
690 P60852 MAGGSATTWGYPVAL...LSQTWAQKLWESNRQ 2 602 624 DSNGNSSLRP LLWAVLLLPAVALVLGFGVFVGL SQTWAQKLWE
691 P20239 MARWQRKASVSSPCG...FICYLYKKRTIRFNH 2 684 703 IIAKDIASKT LGAVAALVGSAVILGFICYL YKKRTIRFNH
692 P21754 MELSYRLFICLLLWG...TRRCRTASHPVSASE 2 387 409 EQWALPSDTS VVLLGVGLAVVVSLTLTAVILVL TRRCRTASHP
693 Q12836 MWLLRCVLLCVSLSL...LAVKKQKSCPDQMCQ 2 506 528 EKLRVPVDSK VLWVAGLSGTLILGALLVSYLAV KKQKSCPDQM
694 Q8TCW7 MEQIWLLLLLTIRVL...PTSLVLNGIRNPVFD 2 374 396 PFQLNAITSA LISGMVILGVTSFSLLLCSLALL HRKGPTSLVL

Then create their feature matrix using the SequenceFeature class:

sf = aa.SequenceFeature()
df_parts = sf.get_df_parts(df_seq=df_seq)
X = sf.feature_matrix(features=df_feat["feature"], df_parts=df_parts)

Using a list comprehension, labels for all three distance-based identification approaches can be retrieved by calling the dPULearn().fit() method with different metric parameters:

dpul = aa.dPULearn()
# List with valid distance measures
list_metrics = ["manhattan", "euclidean", "cosine"]
list_labels = [dpul.fit(X=X, labels=labels, metric=metric, n_unl_to_neg=n_pos).labels_ for metric in list_metrics]

For the PCA-based identification, use the n_components parameter:

# Fractions of total variance to be explained (between 0 and 1)
list_pca_var = [0.6, 0.7, 0.8, 0.9, 0.95]
list_labels.extend([dpul.fit(X=X, labels=labels, n_components=i, n_unl_to_neg=n_pos).labels_ for i in list_pca_var])
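
Because each entry of list_labels is one label array, the sets of identified negatives can be compared directly before plotting. The following is a minimal sketch on toy label lists and is not part of aaanalysis. It assumes that identified negatives are encoded as 0 (with 1 for positives and 2 for unlabeled samples); the helper name jaccard_negatives is hypothetical.

```python
def jaccard_negatives(labels_a, labels_b, neg_label=0):
    """Jaccard overlap of the samples labeled as negatives in two label lists."""
    if len(labels_a) != len(labels_b):
        raise ValueError("Label lists must have the same length")
    # Indices of samples labeled as identified negatives in each list
    neg_a = {i for i, label in enumerate(labels_a) if label == neg_label}
    neg_b = {i for i, label in enumerate(labels_b) if label == neg_label}
    union = neg_a | neg_b
    # Two empty sets are treated as identical
    return len(neg_a & neg_b) / len(union) if union else 1.0


# Toy labels (assumed encoding: 1 = positive, 0 = identified negative, 2 = unlabeled)
toy_manhattan = [1, 1, 0, 0, 2, 2]
toy_euclidean = [1, 1, 0, 2, 0, 2]
overlap = jaccard_negatives(toy_manhattan, toy_euclidean)
```

With real results, the same helper could be applied to pairs such as list_labels[0] and list_labels[1] to see how strongly two identification approaches agree.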

Now, the dPULearn().eval() and dPULearnPlot().eval() methods can be used:

names = list_metrics + [f"pca (var {int(x*100)}%)" for x in list_pca_var]
df_eval = dpul.eval(X=X, list_labels=list_labels, names_datasets=names)
dpul_plot = aa.dPULearnPlot()
dpul_plot.eval(df_eval=df_eval)
plt.tight_layout()
plt.show()
../_images/dpul_plot_eval_1_output_9_0.png
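
The dataset names above are built with int(x*100), which truncates rather than rounds. This works for the five fractions used here, but a value such as 0.57 would be labeled 56% because 0.57*100 evaluates to slightly less than 57 in floating point. A minimal sketch of a rounding variant follows; the helper name pca_names is hypothetical.

```python
def pca_names(list_pca_var):
    """Build display names for PCA-based runs, rounding the percentage."""
    return [f"pca (var {round(x * 100)}%)" for x in list_pca_var]


list_metrics = ["manhattan", "euclidean", "cosine"]
list_pca_var = [0.6, 0.7, 0.8, 0.9, 0.95]
# One name per entry of list_labels: three distance-based runs, five PCA-based runs
names = list_metrics + pca_names(list_pca_var)
```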

Extend the analysis by using a dataset of ground-truth negatives:

_df_seq = aa.load_dataset(name="DOM_GSEC")
# First 14 entries are ground-truth non-substrates
df_neg = _df_seq[_df_seq["label"] == 0].head(14)
df_parts_neg = sf.get_df_parts(df_seq=df_neg)
X_neg = sf.feature_matrix(features=df_feat["feature"], df_parts=df_parts_neg)
# Perform evaluation and visualization
df_eval = dpul.eval(X=X, list_labels=list_labels, names_datasets=names, X_neg=X_neg)
dpul_plot.eval(df_eval=df_eval)
plt.show()
../_images/dpul_plot_eval_2_output_11_0.png
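
Before plotting, it can help to confirm that the evaluation table has the columns the plot expects. The sketch below is a hypothetical helper, not an aaanalysis function. The column names are inferred from the 'avg_abs_AUC_DATASET' and 'avg_KLD_DATASET' patterns in the parameter description, so the exact spellings are an assumption.

```python
# Column names inferred from the parameter description (assumed spellings)
REQUIRED_COLUMNS = ["name", "avg_STD", "avg_IQR", "avg_abs_AUC_pos", "avg_abs_AUC_unl"]
OPTIONAL_COLUMNS = ["avg_abs_AUC_neg", "avg_KLD_pos", "avg_KLD_unl", "avg_KLD_neg"]


def check_eval_columns(columns):
    """Raise if a required column is missing; return the optional columns found."""
    columns = list(columns)
    missing = [col for col in REQUIRED_COLUMNS if col not in columns]
    if missing:
        raise ValueError(f"Missing required columns: {missing}")
    return [col for col in OPTIONAL_COLUMNS if col in columns]


# Toy column list, as it might look when ground-truth negatives were provided
toy_columns = ["name", "avg_STD", "avg_IQR",
               "avg_abs_AUC_pos", "avg_abs_AUC_unl", "avg_abs_AUC_neg"]
found_optional = check_eval_columns(toy_columns)
```

With a real DataFrame, the call would be check_eval_columns(df_eval.columns).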

The Kullback-Leibler Divergence (KLD) can be used as a complementary measure alongside the adjusted area under the curve (AUC) to evaluate the dissimilarity between sets of identified negatives and reference datasets. These reference datasets include positive samples (‘Pos’), unlabeled samples (‘Unl’), and, when available, ground-truth negative samples (‘Neg’). Set comp_kld=True in dPULearn().eval() to compute it:

df_eval = dpul.eval(X=X, list_labels=list_labels, names_datasets=names, X_neg=X_neg, comp_kld=True)
dpul_plot.eval(df_eval=df_eval)
plt.show()
../_images/dpul_plot_eval_3_output_13_0.png
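
To give an intuition for what KLD measures, the sketch below computes the textbook discrete form, D_KL(P || Q) = sum_i p_i * log(p_i / q_i), for two toy distributions. It illustrates the general definition only. How aaanalysis estimates and averages KLD internally is not shown here and may differ; the function name and the eps safeguard are assumptions.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """Discrete Kullback-Leibler divergence D_KL(P || Q) in nats."""
    if len(p) != len(q):
        raise ValueError("Distributions must have the same number of bins")
    # Normalize so that both inputs sum to 1
    sum_p, sum_q = sum(p), sum(q)
    p = [x / sum_p for x in p]
    q = [x / sum_q for x in q]
    # Bins with p_i = 0 contribute nothing; eps guards against q_i = 0
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)


same = kl_divergence([0.5, 0.5], [0.5, 0.5])      # identical distributions
shifted = kl_divergence([0.9, 0.1], [0.5, 0.5])   # P concentrated on the first bin
reverse = kl_divergence([0.5, 0.5], [0.9, 0.1])   # KLD is not symmetric
```

Identical distributions give a divergence of zero, and larger values indicate that the two distributions are less aligned. Since the measure is asymmetric, the order of the two arguments matters.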

The legend can be turned off with legend=False or shifted along the y-axis using legend_y, which is handy if figsize is changed:

dpul_plot.eval(df_eval=df_eval, figsize=(8, 5), legend_y=-0.1)
plt.show()
../_images/dpul_plot_eval_4_output_15_0.png

You can customize the list of colors used:

dpul_plot.eval(df_eval=df_eval, colors=["r", "g", "b", "y"])
plt.show()
../_images/dpul_plot_eval_5_output_17_0.png

Customize the x-limits of each subplot using the dict_xlims parameter:

dict_xlims = {0: (0, 0.2), 3: (0, 0.5)}  # Adjust first and fourth subplots
dpul_plot.eval(df_eval=df_eval, dict_xlims=dict_xlims)
plt.show()
../_images/dpul_plot_eval_6_output_19_0.png
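
A dict_xlims argument can be sanity-checked before it is passed to the plot. The following sketch is a hypothetical helper, not part of aaanalysis. It assumes four subplots indexed 0 to 3, as suggested by the example above, and that each value is an (xmin, xmax) tuple with xmin smaller than xmax.

```python
def check_dict_xlims(dict_xlims, n_subplots=4):
    """Validate subplot indices and (xmin, xmax) tuples; return the input unchanged."""
    if dict_xlims is None:
        return None  # None means the x-axis limits are auto-scaled
    for key, lims in dict_xlims.items():
        if not isinstance(key, int) or not 0 <= key < n_subplots:
            raise ValueError(f"Key {key!r} is not a subplot index in 0..{n_subplots - 1}")
        if len(lims) != 2:
            raise ValueError(f"Limits for subplot {key} must be (xmin, xmax)")
        xmin, xmax = lims
        if not xmin < xmax:
            raise ValueError(f"xmin must be smaller than xmax for subplot {key}")
    return dict_xlims


checked = check_dict_xlims({0: (0, 0.2), 3: (0, 0.5)})
```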