aaanalysis.dPULearnPlot.eval
- static dPULearnPlot.eval(df_eval=None, figsize=(6, 4), dict_xlims=None, legend=True, legend_y=-0.175, colors=None)
Plot evaluation output of dPULearn comparing multiple sets of identified negatives.
Evaluation measures can be grouped into ‘Homogeneity’ measures (‘avg STD’ and ‘avg IQR’) assessing the similarity within the sets of identified negatives, and ‘Dissimilarity’ measures (‘avg AUC’, ‘avg KLD’) assessing the dissimilarity between the identified negatives and the other reference groups including positive samples (‘Pos’), unlabeled samples (‘Unl’), and ground-truth negative samples (‘Neg’) if given.
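To make the homogeneity measures concrete, here is a minimal sketch that computes an average STD and an average IQR for two small, made-up sets of samples. It assumes that both measures are computed per feature and then averaged over features. The helper `avg_std_and_iqr` and the toy values are hypothetical and are not taken from aaanalysis, whose exact computation may differ.

```python
import statistics


def avg_std_and_iqr(samples):
    """Return (avg STD, avg IQR) over the feature columns of equal-length rows.

    Hypothetical helper for illustration; not part of aaanalysis.
    """
    columns = list(zip(*samples))  # transpose: one tuple per feature
    stds = [statistics.stdev(col) for col in columns]
    iqrs = []
    for col in columns:
        q1, _, q3 = statistics.quantiles(col, n=4, method="inclusive")
        iqrs.append(q3 - q1)
    return statistics.mean(stds), statistics.mean(iqrs)


# Made-up feature values: a homogeneous set and a heterogeneous set
tight = [[0.50, 0.20], [0.52, 0.21], [0.49, 0.19], [0.51, 0.20]]
spread = [[0.10, 0.90], [0.80, 0.05], [0.45, 0.50], [0.95, 0.30]]

tight_std, tight_iqr = avg_std_and_iqr(tight)
spread_std, spread_iqr = avg_std_and_iqr(spread)
print(f"tight:  avg STD={tight_std:.3f}, avg IQR={tight_iqr:.3f}")
print(f"spread: avg STD={spread_std:.3f}, avg IQR={spread_iqr:.3f}")
```

Under this assumption, lower values indicate a more homogeneous set, so the `tight` set scores lower on both measures than the `spread` set.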
- Parameters:
df_eval (pd.DataFrame, shape (n_datasets, n_metrics)) –
DataFrame with evaluation measures for sets of identified negatives. Each row corresponds to one dataset of identified negatives. Required columns are:
‘name’: Name of datasets containing identified negatives (typically named by identification approach).
‘avg_STD’: Average standard deviation (STD) assessing homogeneity of identified negatives.
‘avg_IQR’: Average interquartile range (IQR) assessing homogeneity of identified negatives.
‘avg_abs_AUC_DATASET’: Average absolute area under the curve (AUC), which assesses the dissimilarity between the set of identified negatives and other datasets. Columns for ‘DATASET’ = ‘pos’ (positive samples) and ‘unl’ (unlabeled samples) are required; ‘neg’ (ground-truth negative samples) is optional.
Optional columns include:
‘avg_KLD_DATASET’: The average Kullback-Leibler Divergence (KLD), which measures the distribution alignment between the set of identified negatives and other datasets (‘pos’, ‘unl’, or ‘neg’).
figsize (tuple, default=(6, 4)) – Figure dimensions (width, height) in inches.
dict_xlims (dict, optional) – A dictionary containing x-axis limits for subplots. Keys should be the subplot axis number ({0, 1, 2, 4}) and values should be tuples specifying (xmin, xmax). If None, x-axis limits are auto-scaled.
legend (bool, default=True) – If True, legend is set under dissimilarity measures.
legend_y (float, default=-0.175) – Legend position regarding the plot y-axis applied if legend=True.
colors (list of str, optional) – List of colors for identified negatives and the following reference datasets: positive samples (‘Pos’), unlabeled samples (‘Unl’), and ground-truth negative samples (‘Neg’).
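The column requirements for df_eval can be checked before plotting. The sketch below uses only the column names given in the parameter description. The helpers `missing_required_columns` and `has_ground_truth_negatives` are hypothetical and not part of aaanalysis, and the assumption that the ‘neg’ reference group is detected through the ‘avg_abs_AUC_neg’ column is ours.

```python
# Column names taken from the df_eval parameter description
REQUIRED_COLUMNS = ["name", "avg_STD", "avg_IQR", "avg_abs_AUC_pos", "avg_abs_AUC_unl"]
OPTIONAL_COLUMNS = ["avg_abs_AUC_neg", "avg_KLD_pos", "avg_KLD_unl", "avg_KLD_neg"]


def missing_required_columns(columns):
    """Hypothetical helper: list the required columns absent from `columns`."""
    present = set(columns)
    return [col for col in REQUIRED_COLUMNS if col not in present]


def has_ground_truth_negatives(columns):
    """Hypothetical helper: assume 'neg' is provided via 'avg_abs_AUC_neg'."""
    return "avg_abs_AUC_neg" in set(columns)


# Example column list, e.g. list(df_eval) for a pandas DataFrame
columns = ["name", "avg_STD", "avg_IQR", "avg_abs_AUC_pos",
           "avg_abs_AUC_unl", "avg_KLD_pos", "avg_KLD_unl"]
missing = missing_required_columns(columns)
shows_neg = has_ground_truth_negatives(columns)

# Keyword arguments matching the documented signature; the x-limits are made up
plot_kwargs = dict(figsize=(6, 4), dict_xlims={0: (0.0, 0.5)},
                   legend=True, legend_y=-0.175)
print(missing, shows_neg, plot_kwargs)
```

If `missing` is empty, the same column set should be acceptable as df_eval, and `plot_kwargs` could then be passed along with it to the plotting method.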
- Returns:
fig (plt.Figure) – Figure object for evaluation plot.
axes (array of plt.Axes) – Array of Axes objects, each representing a subplot within the figure.
Notes
Ground-truth negatives are only shown if provided by df_eval.
See also
dPULearn.eval(): the respective computation method.
Examples
You can evaluate different sets of identified negative samples using the dPULearn().eval() method. First load one of our example datasets with its respective features:
import matplotlib.pyplot as plt
import aaanalysis as aa
aa.options["verbose"] = False
# Dataset with positive (γ-secretase substrates)
# and unlabeled data (proteins with unknown substrate status)
df_seq = aa.load_dataset(name="DOM_GSEC_PU")
labels = df_seq["label"].to_numpy()
n_pos = sum([x == 1 for x in labels])
df_feat = aa.load_features(name="DOM_GSEC")
aa.display_df(df_seq)
entry sequence label tmd_start tmd_stop jmd_n tmd jmd_c 1 P05067 MLPGLALLLLAAWTA...GYENPTYKFFEQMQN 1 701 723 FAEDVGSNKG AIIGLMVGGVVIATVIVITLVML KKKQYTSIHH 2 P14925 MAGRARSGLLLLLLG...EEEYSAPLPKPAPSS 1 868 890 KLSTEPGSGV SVVLITTLLVIPVLVLLAIVMFI RWKKSRAFGD 3 P70180 MRSLLLFTFSACVLL...RELREDSIRSHFSVA 1 477 499 PCKSSGGLEE SAVTGIVVGALLGAGLLMAFYFF RKKYRITIER 4 Q03157 MGPTSPAARGQGRRW...HGYENPTYRFLEERP 1 585 607 APSGTGVSRE ALSGLLIMGAGGGSLIVLSLLLL RKKKPYGTIS 5 Q06481 MAATGTAAAAATGRL...GYENPTYKYLEQMQI 1 694 716 LREDFSLSSS ALIGLLVIAVAIATVIVISLVML RKRQYGTISH 6 P35613 MAAALFVLLGFALLG...HQNDKGKNVRQRNSS 1 323 345 IITLRVRSHL AALWPFLGIVAEVLVLVTIIFIY EKRRKPEDVL 7 P35070 MDRAARCSGASSLPL...DITPINEDIEETNIA 1 119 141 LFYLRGDRGQ ILVICLIAVMVVFIILVIGVCTC CHPLRKRRKR 8 P09803 MGARCRSFSALLLLL...RFKKLADMYGGGEDD 1 711 733 GIVAAGLQVP AILGILGGILALLILILLLLLFL RRRTVVKEPL 9 P19022 MCRIAGALRTLLPLL...PRFKKLADMYGGGDD 1 724 746 RIVGAGLGTG AIIAILLCIIILLILVLMFVVWM KRRDKERQAK 10 P16070 MDKFWWHAAWGLCLV...DETRNLQNVDMKIGV 1 650 672 GPIRTPQIPE WLIILASLLALALILAVCIAVNS RRRCGQKKKL 11 P09603 MTAPGAAGRCPPTTW...GSPLTQDDRQVELPV 1 496 518 EGSFSPQLQE SVFHLLVPSVILVLLAVGGLLFY RWRRRSHQEP 12 O94985 MLRRPAPALAPAARL...TRQQQLEWDDSTLSY 1 860 882 PHPFAVVPST ATVVIVVCVSFLVFMIILGVFRI RAAHRRTMRD 13 Q9H4D0 MLPGRLCWVPLLLAL...ARQAQLEWDDSTLPY 1 831 853 SSIQHSSVVP SIATVVIIISVCMLVFVVAMGVY RVRIAHQHFI 14 P78310 MALLLCFVLLCGVVD...IPVMIPAQSKDGSIV 1 236 258 RLNVVPPSNK AGLIAGAIIGTLLALALIGLIIF CCRKKRREEK 15 D3ZZK3 MAGIFYFILFSFLFG...MRTQMQQMHGRMVPV 1 548 570 RIIGDGANST VLLVSVSGSVVLVVILIAAFVIS RRRSKYSQAK 16 Q14118 MRMSVGLSLLLPLSG...KNMTPYRSPPPYVPP 1 753 775 KSSEDDVYLH TVIPAVVVAAILLIAGIIAMICY RKKRKGKLTL 17 Q63155 MENSLGCVWVPKLAF...EGLMKQLNAITGSAF 1 1099 1121 SVTPQKNSNL LVITVVTVGVLTVLVVVIVAVIC TRRSSAQQRK 18 Q61483 MGRRSALALAVVSAL...VLSAEKDECVIATEV 1 545 567 HMESQGGPFP WVAVCAGVVLVLLLLLGCAAVVV CVRLKLQKHQ 19 Q9ERC8 MWILALSLFQSFANV...HLKGNNPYAKSYTLV 1 1595 1617 EGLTTNEGLK ILVTISCILVGVLLLFVLLLVVR RRRREQRLKR 20 P54763 MAVRRLGAALLLLPL...QVMRAQMNQIQSVEV 1 
543 565 YQTSIKEKLP LIVGSSAAGLVFLIAVVVIAIVC NRRGFERADS 21 Q15303 MKPATGLWVWVSLLV...TVLPPPPYRHRNTVV 1 653 675 TLPQHARTPL IAAGVIGGLFILVIVGLTFAVYV RRKSIKKKRA 22 P16882 MDLCQVFLTLALAVT...SCGYVSTDQLNKIMQ 1 274 296 ILEACEEDIQ FPWFLIIIFGIFGVAVMLFVVIF SKQQRIKMLI 23 P04439 MAVMAPRTLLLLLSG...DSAQGSDVSLTACKV 1 308 330 WELSSQPTIP IVGIIAGLVLLGAVITGAVVAAV MWRRKSSDRK 24 P08069 MKSGSGGGSPTSLWG...RKNERALPLPQSSTC 1 936 958 AKTGYENFIH LIIALPVAVLLIVGGLVIMLYVF HRKRNNSRLG 25 P27930 MLRLYVLVMGVSAFT...TVLWPHHQDFQSYPK 1 347 369 LRTTVKEASS TFSWGIVLAPLSLAFLVLGGIWM HRRCKHRTGK 26 Q9Y219 MRAQGRGRLPRRLLL...RAVRSINEARYAGKE 1 1083 1105 VVTGGSSTGL LVPVLCGAFSVLWLACVVLCVWW TRKRRKERER 27 P15382 MILSNTTAVTPFLTK...IEQPNTHLPETKPSP 1 44 66 SPRSSDGKLE ALYVLMVLGFFGFFTLGIMLSYI RSKKLEHSND 28 Q9Y6J6 MSTLSNFTQTLEDVF...TIHENIGAAGFKMSP 1 47 69 LQAKVDAENF YYVILYLMVMIGMFSFIIVAILV STVKSKRREH 29 P11627 MVVMLRYVWPLLLCS...SSGATSPINPAVALE 1 1124 1146 VSTTGSFASE GWFIAFVSAIILLLLILLILCFI KRSKGGKYSV 30 P01130 MGPWGWKLRWTVALL...SYPSRQMVSLEDDVA 1 787 809 GRGNEKKPSS VRALSIVLPIVLLVFLCLGVFLL WKNWRLKNIN 31 P16150 MATLLLLLGVLVVSP...APAPDEPEGGDGAAP 1 255 277 FRNPDENSRG MLPVAVLVALLAVIVLVALLLLW RRRQKRRTGA 32 P0CC10 MAQAHIQGSPCPLLP...LFKSGSKENVQETQI 1 573 595 LKDLDDVMKT TKIIIGCFVAITFMAAVMLVAFY KLRKQHQLHK 33 Q07954 MLTPPLLLLLPLLSA...LLGRGPEDEIGDPLA 1 4421 4443 HVFSQQQPGH IASILIPLLLLLLLVLVAGVVFW YKRRVQGAKG 34 O75581 MGAVLRSLLACSFCV...HHLYPPPPSPCTDSS 1 1371 1393 YPTEEPAPQA TNTVGSVIGVIVTIFVSGTVYFI CQRMLCPRMK 35 Q924X6 MGRPELGALRPLALL...KCKRVALSLEDDGLP 1 859 881 GSQMGSTVTA AVIGVIVPIVVIALLCMSGYLIW RNWKRKNTKS 36 Q12866 MGPAPLPLLLGLFLP...LLFADDSSEGSEVLM 1 502 524 STPAPGNADP VLIIFGCFCGFILIGLILYISLA IRKRVQETKF 37 P08581 MKAPAVLAPGILVLL...DDEVDTRPASFWETS 1 933 955 VIVQPDQNFT GLIAGVVSISTALLLLLGFFLWL KKRKQIKDLG 38 P15941 MTPGTQSPFFLLLLL...LSYTNPAVAATSANL 1 1159 1181 SAQSGAGVPG WGIALLVLVCVLVALAIVYLIAL AVCQCRRKNY 39 Q9JKF6 MARMGLAGAAGRWWG...SQNDGSFISKKEWYV 1 355 377 GRRAGQMPTA IIGGVAGSVLLVLIVVGGIIVAL RRRRHTFKGD 40 Q62765 
MALPRCMWPNYVWRA...PHPHPHPHSHSTTRV 1 697 719 VDQRDYSTEL SVTIAVGASLLFLNILAFAALYY KKDKRRHDVH 41 O35516 MPALRPAALRALLWL...THMSEPPHSNMQVYA 1 1680 1702 SELESPRNAQ LLYLLAVAVVIILFFILLGVIMA KRKRKHGFLW 42 Q61982 MGLGARGRRRRRRLM...LGPQPEVTPKRQVMA 1 1644 1666 PLEAPEQSVP LLPLLVAGAVFLLIIFILGVMVA RRKREHSTLW 43 P31695 MQPQLLLLLLLPLNF...VHQEIPLNSVVRNLN 1 1441 1463 QAGTRPSANQ LPWPILCSPVVGVLLLALGALLV LQLIRRRRRE 44 Q8CJ26 MLYNVSKGVVYSDTA...VVQVLSSPAESSSVV 1 52 74 FPPEPPGASS NIIPVYCALLATVILGLLAYVAF KCWRSHKQRQ 45 Q63373 MYQRMLRCGAELGSP...SANKNKKNKDKEYYV 1 392 414 AEVIRESSST TGMVVGIVAAAALCILILLYAMY KYRNRDEGSY 46 P15209 MSPWLKWHGPAMARL...QNLAKASPVYLDILG 1 431 453 VADQSNREHL SVYAVVVIASVVGFCLLVMLLLL KLARHSKFGM 47 Q86YL7 MWKVSALLFVLGSAS...IIVVVMRKMSGRYSP 1 130 152 TVEKDGLSTV TLVGIIVGVLLAIGFIGAIIVVV MRKMSGRYSP 48 Q13308 MGAARGSPARPRRLP...EIASALGDSTVDSKP 1 704 726 GSPPPYKMIQ TIGLSVGAAVAYIIAVLGLMFYC KKRCKAKRLQ 49 P10586 MAPEPAPGRTMVPLV...RAALEYLGSFDHYAT 1 1262 1284 PAQQQEEPEM LWVTGPVLAVILIILIVIAILLF KRKRTHSPSS 50 P28828 MRTLGTCLVTLAGLL...YKFCYEVALEYLNSG 1 743 764 PEKQTDHTVK IAGVIAGILLFVIIFLGVVLVM KKRKLAKKRK 51 Q7M729 MSRAGNRGNTQARWL...GLPGSKAEEKPPTKV 1 161 183 VVDKLEKVDN TVTLIILAVVGGVIGLLVCILLL KKLITFILKK 52 O75056 MKPGPPHRAGAAHGA...SVTYQKPDKQEEFYA 1 387 409 KSILERKEVL VAVIVGGVVGALFAAFLVTLLIY RMKKKDEGSY 53 P78324 MEPAGPAPGRLGPLL...EPSFSEYASVQVPRK 1 372 394 AENTGSNERN IYIVVGVVCTLLVALLMAALYLV RIRQKKAQGS 54 Q92673 MATRSSRRESRLPFL...PMITGFSDDVPMVIA 1 2136 2158 SATQAARSTD VAAVVVPILFLILLSLGVGFAIL YTKHRRLQSS 55 Q99523 MERPWGAADGLSRWP...NKSGYHDDSDEDLLE 1 756 778 SPEKQNSKSN SVPIILAIVGLMLVTVVAGVLIV KKYVCGGRFL 56 Q8BGV3 MARGLDLAPLLLLLL...VELKELGEMRSEPSL 1 269 291 PPQFSMKRLT AGVIAVIAVVSVAVVAGVVVLVV TKRRKSGKYK 57 P35590 MVWRVPPFLLPILFL...ENFTYAGIDATAEEA 1 764 786 EEGLDQQLIL AVVGSVSATCLTILAALLTLVCI RRSCLHRRRT 58 P08138 MGAGATGRAMDGPRL...LVESLCSESTATSPV 1 250 272 QPVVTRGTTD NLIPVYCSILAAVVVGLVAYIAF KRWNSCKQNK 59 Q02223 MLQMAGQCSQNEYFD...AALSATEIEKSISAR 1 54 76 SVTNSVKGTN AILWTCLGLSLIISLAVFVLMFL LRKINSEPLK 60 
P19438 MGLSTVPDLLLPLVL...LCGPAALPPAPSLLR 1 212 234 VKGTEDSGTT VLLPLVIFFGLCLLSLLFIGLMY RYQRWKSKLY 61 Q06418 MALRRSMGRPGLPPL...RLLLLQQGLLPHSSC 1 429 451 QQGPPHSRTS WVPVVLGVLTALVTAAALALILL RKRRKETRFG 62 P30530 MAWRCPRMGRVPLAW...ADRGSPAAPGQEDGA 1 450 472 EPSTPAFSWP WWYVLLGAVVAAACVLILALFLV HRRKKETRYG 63 Q6EMK4 MCSRVPLLLPLLLLL...PGPGLQSPLHAKPYI 1 577 599 TQAREGNLPL LIAPALAAVLLAALAAVGAAYCV RRGRAMAAAA 64 P12821 MGAASGRRGPGLLLP...SHGPQFGSEVELRHS 2 1257 1276 GLDLDAQQAR VGQWLLLFLGIALLVATLGL SQRLFSIRHR 65 P36896 MAESAGASSFFPLVV...KKTLSQLSVQEDVKI 2 127 149 EHPSMWGPVE LVGIIAGPVFLLFLIIIIVFLVI NYHQRVYHNR 66 Q8NER5 MTRALCSALRQALLL...KKTISQLCVKEDCKA 2 114 136 PNAPKLGPME LAIIITVPVCLLSIAAMLTVWAC QGRQCSYRKK 67 P37023 MTLGSPRKGLLMLLM...LQKISNSPEKPKVIQ 2 119 141 PSEQPGTDGQ LALILGPVLALLALVALGVLGLW HVRRRQEKQR 68 O43184 MAARPLPVSPARALL...YPHQVPRSTHTAYIK 2 707 729 DSGPIRQADN QGLTIGILVTILCLLAAGFVVYL KRKTLIRLLF 69 Q13444 MRLALLWALGLLGAG...SRPAPPPPTVSSLYL 2 695 717 TTQLKATSSL TTGLLLSLLVLLVLVMLGASYWY RARLHQRLCQ 70 Q9Z0F8 MRRRLLILTTLVPFV...KLQRQSRVDSKETEC 2 672 694 NTFGKFLADN IVGSVLVFSLIFWIPFSILVHCV DKKLDKQYES 71 Q9Y3Q7 MFLLLALLTELGRLQ...NRNSSVVSESDDVGH 2 685 707 FYTEKGYNTH WNNWFILSFCIFLPFFIVFTTVI FKRNEISKSC 72 Q9R157 MPLLFILAELAMLFA...KRNERKIVPQGEHKI 2 684 703 TKRLSKNEDS WVILGFFIFLPFIVTFLVGI MKRNERKIVP 73 O35674 MPGRAGVARFCLLAL...EYRSQRVGAIISSKI 2 704 726 VDSGPLPPKS VGPVIAGVFSALFVLAVLVLLCH CYRQSHKLGK 74 O43506 MAVGEPLVHIRVTLL...VLFKKRTKSKEDEEG 2 692 714 MEGLNVMGKL RYLSLLCLLPLVAFLLFCLHVLF KKRTKSKEDE 75 Q9UKJ8 MAVDGTLVYIRVTLL...RQCSGPKETKAHSSG 2 685 707 PASAKRGVFL PLIVIPSLSVLTFLFTVGLLMYL RQCSGPKETK 76 Q9JI76 MECFIMLGADARTLM...KIPSGPKETKASSPG 2 687 709 SGPTSQKRRV IITVLSITVPVLSILICLLIAGL YRIYCKIPSG 77 O75077 MKPPGSSSRQPPLAG...NVKKRRFDPTQQGPI 2 794 816 GPKGPSATNL IIGSIAGAILVAAIVLGGTGWGF KNVKKRRFDP 78 Q9R160 MVAMSEALVHARITL...PSYETVKPPDEWANP 2 698 720 SKKDAPEKPN VIIWLLPIICVAVVLSVLFCLSG ATKKSREAAA 79 Q9R159 MQTTQRASSFAAAED...ENKEDTNEVMNTETE 2 706 728 TEKKHKKSIG LVILFWILFACFSVLFIVFLFFL RSYVELPMSE 80 
Q9UKQ2 MLQGLLPVSLLLSVA...KDNPVSTPKDSNPKA 2 664 686 PDCDDSSVVF HFSIVVGVLFPMAVIFVVVAMVI RHQSSREKQK 81 Q9UKF5 MKMLLLLHCLGVFLS...PQLMPSQSQPPVTPS 2 676 698 PPPKRKKKKK FCYLCILLLIVLFILLCCLYRLC KKSKPIKKQQ 82 Q9UKF2 MRSVQIFLSQCRLLL...ESKRPKAKSVKKQKK 2 686 708 GLLRGAIPSS IWVVSIIMFRLILLILSVVFVFF RQVIGNHLKP 83 Q8TC27 MFRLWLLLAGLCGLL...RSKSQDSTQTQSSSN 2 681 703 IMERASGKTE NTWLLGFLIALPILIVTTAIVLA RKQLKKWFAK 84 Q9BZ11 MGWRPRRARGTPLLL...DPQADQVQMPRSCLW 2 13 35 WRPRRARGTP LLLLLLLLLLWPVPGAGVLQGHI PGQPVTPHWV 85 Q99965 MWRVLFLLSGLGGLR...YSSDEQPESESEPKG 2 689 711 IYHSKPMRWP FFLFIPFFIIFCVLIAIMVKVNF QRKKWRTEDY 86 Q3TTE0 MFLLLLLFLHLKGLQ...GSNTNVTSSGGSTSH 2 695 712 HSNLKKNQLQ LILYISLPLLVMISAVVI KQSKLSRVCD 87 Q9H2U9 MLPGCIFLMILLIPQ...SKDSRGIADPNQSAK 2 668 690 VACEETLHVT NITILVVVLVLVIVGIGVLILLV RYRKCIKLKQ 88 Q13443 MGSGARFPSGTLRVR...PARPAPAPPLYSSLT 2 699 718 NEMNTALRDG LLVFFFLIVPLIVCAIFIFI KRDQLWRSYF 89 Q60813 MSVAAAGRGFASSLS...SESSSSSSWSDSDSQ 2 741 763 KMEDEEVNLK VMVLVVPIFLVVLLCCLMLIAYL WSEVQEVVSP 90 Q8R534 MERLKLGKIPEHWCI...AAAEKKDEDEEEGEE 2 705 727 STEELILNLK LIVLAVILVLMILLIIICIISAY TKSETASEAG 91 A2AJA7 MCLPSHLLSTWVLFM...PESITSNPQSPPDLA 2 1161 1183 EVAAPVSVPV AVGGALLFFMFLVLMGLGGWHWL QKQHCPGQRS 92 Q86WK6 MHPHRDPRGLWLLLP...PESVSSVFSDTPIVV 2 371 393 FTLHGHHDTL NTAYTTLVGCILSVVLVLIYLYL TPCRCWCRGV 93 Q8K592 MLGTLGLWTLLPAAA...SPDPVGDTVQVYVNE 2 146 168 EPQATPGGPV WMALLLLGMFLVLLLSSIILALL QRKACRVQGG 94 Q9BXJ7 MGVLGRVLLWLQLCA...SYFVNPLFAGAEAEA 2 361 383 AHVWGSSAAG LAGGVAAAVLLALLVLLVAPPLL RRAGRLRWRR 95 P58335 MVAERSPARSPGSWL...DEVCIWECIEKELTA 2 319 341 IVTATECSNG IAAIIVILVLLLLLGIGLMWWFW PLCCKVVIKD 96 P51693 MGPASPAARGLSRRP...HGYENPTYRFLEERP 2 581 603 APAGTGVSRE AVSGLLIMGAGGGSLIVLSMLLL RRKKPYGAIS 97 O75882 MVAAAAATEARLRRR...RNRKQQPPAQPGTCI 2 1279 1301 AFSQHSNFMD LVQFFVTFFSCFLSLLLVAAVVW KIKQSCWASR 98 P27037 MGAAAKLAFAVFLIS...TMVTNVDFPPKESSL 2 139 161 PVTPKPPYYN ILLYSLVPLMLIAGIVICAFWVY RHHKMAYPPV 99 P56817 MAQALPWLLLWMGAG...RQQHDDFADDISLLK 2 455 477 YNIPQTDEST LMTIAYVMAAICALFMLPLCLMV CQWRCLRCLR 100 
Q13145 MDRHSSYIFIWLQLE...VHWGMYSGHGKLEFV 2 154 176 SSKELWFRAA VIAVPIAGGLILVLLIMLALRML RSENKRLQDQ 101 Q13873 MTSSLQRPWRVPWLP...TATTMVSKDIGMNCL 2 152 174 PHSFNRDETI IIALASVSVLAVLIVALCFGYRM LTGDRKQGLH 102 P36894 MPQLYIYIRLLGAYL...KKTLAKMVESQDVKI 2 153 175 IGPFFDGSIR WLVLLISMAVCIIAMIIFSSCFC YKHYCKSISS 103 O00238 MLLRSAGKLNVGTKK...KKTLAKMSESQDIKL 2 126 148 RDFVDGPIHH RALLISVTVCSLLLVLIILFCYF RYKRQETRPR 104 Q9BWV1 MLRGTMTAWRGMRPE...GPLVRVSFETPPLTI 2 856 878 MVARSSDLPY LIVGVVLGSIVLIIVTFIPFCLW RAWSKQKHTT 105 Q13410 MAVFPSSGLPRCLLT...LHSKLIPTQPSQGAP 2 245 267 IPASSLPRLT PWIVAVAVILMVLGLLTIGSIFF TWRLYNERPR 106 Q8WVV5 MEPAAALHFSLPASL...EEGLKLHRVGTHQSL 2 264 286 ALAVILTASP WMVSMTVILAVFIIFMAVSICCI KKLQREKKIL 107 Q96KV6 MEPAAALHFSRPASL...HRELVVPQLPARKKV 2 246 268 ESFMPSRSPC VVILPVIMIILMIPIAICIYWIN NLQKEKKDSH 108 P78410 MKMASSLAFLLLNFH...LTRGEESSSDTNKSA 2 248 270 ADPFFRSAQP WIAALAGTLPILLLLLAGASYFL WRQQKEITAL 109 Q7Z6A9 MKTLPAMLGTGKLFW...VKEAPTEYASICVRS 2 153 175 PSKDEMASRP WLLYRLLPLGGLPLLITTCFCLF CCLRRHQGKQ 110 Q7TST0 MMKGSPSVPPAGCLL...YFTRNSMGLSATAQP 2 249 271 PEPFFPKTCP WKVALVCSVLILLVLLGGISLGI WKEHQVKRRE 111 Q6UXE8 MAFVLILVLSFYELV...EEKGTPIFICPVSWG 2 237 259 FFQPSPWRLA SILLGLLCGALCGVVMGMIIVFF KSKGKIQAEL 112 Q8BJE2 MADFSVFLGFLKQIP...YEPLDPAWAVNEAVS 2 259 281 LPRMSPWKKA FVGTLVVLPLSLIVLTMLALRYF YKLRSFQEKQ 113 A8MVZ5 MAVTCDPEAFLSICF...RQREKNKASLEEERE 2 253 275 FSRSSQFTAW KAALPLILVAMGLVIAGGICIFW KRQREKNKAS 114 Q86VB7 MSKLRMVLLEDSGSA...KEAILSHTEKENGNL 2 1049 1071 KATTGRSSRQ SSFIAVGILGVVLLAIFVALFFL TKKRRQRQRL 115 Q9NR16 MMLPQNSWHIDFGRC...DTSLLGVLPASEATK 2 1360 1382 LKSLNASSGH LALILSSIFGLLLLVLFILFLTW CRVQKQKHLP 116 P55291 MDAAFLLVLGLLAQS...LSPGALLPRHRGRTA 2 604 626 ALLAGGTGLS LGALVIVLASALLLLVLVLLVAL RARFWKQSRG 117 P33146 MGSALLLALGLLAQS...WGPRFARLADMYGHQ 2 603 625 ALRGGGVGVS LGALVIVLASTVVLLVLILLAAL RTRFRGHSRG 118 O75309 MVPAWLWLLCVSVPQ...DPDQPADSVPLKATV 2 786 808 RMKGMPTKLS AVGILVGTLVAIGIFLILIFTHW TMSRKKDPDQ 119 Q9R100 MVSAQLHFLCLLTLY...KVENPQSPENKPLRS 2 784 806 PAGRQDGIPT 
VGMAVGILLTTFLVIGIILAVVF IRMRKDKVEN 120 Q9H159 MNCYLLLRFMLGIPL...KRLACMFGSAVQSNN 2 596 618 ELVLSMGFKT EVIIAILICIMIIFGFIFLTLGL KQRRKQILFP 121 Q9H251 MGRHVATSCHVAWLL...LRDVIMETPLEITEL 2 3068 3090 LPDDMSALQM AIIVLAILLFLAAMLFVLMNWYY RTVHKRKLKA 122 Q8IXH8 MAMRSGRHPSLLLLL...ATPFEEIYSESGVPS 2 614 636 VELADAEVGL HVGALFPVCAAFVALAVALLFLL RCYFVLEPKR 123 P59862 MDTRGCAWLLLLLSL...STPSEAMCFTSRVPS 2 592 614 ECEEPSDTWL LWWALSPVGAALMVLSAALLCLL RCSCTFGPKR 124 Q8NFZ8 MGRARRFQWPLLLLW...LNGSDGHKRKEEFFI 2 324 346 VVEAQTSVPY AIVGGILALLVFLIICVLVGMVW CSVRQKGSYL 125 O43570 MPRRSLHAAAVLLLV...VIYKPATKMETEAHA 2 305 327 CTAAGLSLGI ILSLALAGILGICIVVVVSIWLF RRKSIKKGDN 126 Q9ULX7 MLFSALLLEVIWILA...RKSVVFTSAQATTEA 2 290 312 QAGSSYTTGE MLSLGVGILVGCLCLLLAVYFIA RKIRKKRLEN 127 Q16790 MAPLCPSPWLPLLIP...GGVSYRPAEVAETGA 2 411 433 AEPVQLNSCL AAGDILALVFGLLFAVTSVAFLV QMRRQHRRGT 128 P27824 MEGKWLLCMLLVLGT...EILNRSPRNRKPRRE 2 483 505 MIEAAEERPW LWVVYILTVALPVFLVILFCCSG KKQTSGMEYK 129 O75976 MASGRDERPPWRLGR...DETDTEEETLYSSKH 2 1300 1322 DNRIFGLPRE LVVTVSGATMSALILTACIIWCI CSIKSNRHKD 130 Q13740 MESKGASSCRLLFCL...EENKKLEENNHKTEA 2 528 550 NREKVNDQAK LIVGIVVGLLLAALVAGVVYWLY MKKSKTASKH 131 P15391 MPPPRLLFFLLFLTP...PAWGGGGRMGTWSTR 2 295 317 LRTGGWKVSA VTLAYLIFCLCSLVGILHLQRAL VLRRKRKRMT 132 P29017 MLFLQFLLLALLLPG...LVLWFKKHCSYQDIL 2 301 323 DIILYWGHHF SMNWIALVVIVPLVILIVLVLWF KKHCSYQDIL 133 P11609 MRYLPWLLLWAFLQV...VYYIWRRRSAYQDIR 2 304 326 ILYWDARQAP VGLIVFIVLIMLVVVGAVVYYIW RRRSAYQDIR 134 P15813 MGCLLFLLLWALLQA...FTSRFKRQTSYQGVL 2 299 321 QDIVLYWGGS YTSMGLIALAVLACLLFLLIVGF TSRFKRQTSY 135 P15812 MLLLFLLFEGLCCPG...NRVLKKWKTRLNQLW 2 303 325 GGHDLIIHWG GYSIFLILICLTVIVTLVILVVV DSRLKKQSSN 136 Q15762 MDYPTLLLALLHVYR...YVNYPTFSRRPKTRV 2 253 275 GKTDNQYTLF VAGGTVLLLLFVISITTIIVIFL NRRRRRERRD 137 P20273 MHLLGPWLLLLVLEY...RPQAQENVDYVILKH 2 684 706 TLTVYYSPET IGRRVAVGLGSCLAILILAICGL KLQRRWKRTQ 138 Q07763 MLGQAVLFTTFLLLR...ARLSRRELENFDVYS 2 226 248 QSVPSNFRFL PFGVIIVILVTLFLGAIICFCVW TKKRKQLQFS 139 Q9HCU0 
MLLRLLLAWAAAGPT...PRGSLTGVQTCRTSV 2 686 708 LAEHSQRDDR WLLVALLVPTCVFLVVLLALGIV YCTRCGPHAP 140 Q5ZPR3 MLRRRGSPGMGVHVG...LKHSDSKEDDGQEIA 2 466 488 TGQPMTFPPE ALWVTVGLSVCLIALLVALAFVC WRKIKQSCEE 141 P26842 MARPHPWWLCVLGTL...PIQEDYRKPEPACSP 2 189 211 PPQRSLCSSD FIRILVIFSGMFLVFTLAGALFL HQRRKYRSNK 142 P10747 MLRLLLALNLFPSIQ...YQPYAPPRDFAAYRS 2 154 176 PLFPGPSKPF WVLVVVGGVLACYSLLVTVAFII FWVRSKRSRL 143 P06729 MSFPCKFVASFLLIF...PPHGAAENSLSPSSN 2 212 234 SCPEKGLDIY LIIGICGGGSLLMVFVALLVFYI TKRKKQRSRR 144 Q8IX05 MLRAALPALLLPLLG...VLVVGEENEYPVQFD 2 171 193 RKYLSDNHIL ISALVIASTVILTVLGAIIWFLY KKHSDSRFTT 145 Q9NPF0 MSGGWMAQVGAWRTG...MKESLLLSEQKTSLP 2 231 253 DQSGSPTAYG VIAAAAVLSASLVTATLLLLSWL RAQERLRPLG 146 P28906 MLVRRGARAGPRMPR...NGHSARQHVVADTEL 2 291 313 ASHQSYSQKT LIALVTSGALLAVLGITGYFLMN RRSWSPTGER 147 P04235 MEHSGILASLILIAV...QYSRLGGNWPRNKKS 2 105 127 NCVELDSGTM AGVIFIDLIATLLLALGVYCFAG HETGRPSGAA 148 P07766 MQSGTHWRVLGLCLL...KGQRDLYSGLNQRRI 2 130 152 ENCMEMDVMS VATIVIVDICITGGLLLLVYYWS KNRKAKAKPV 149 P09693 MEQGKGLAVLILAII...DDQYSHLQGNQLRRN 2 115 137 QNCIELNAAT ISGFLFAEIVSIFVLAVGVYFIA GQDGVRQSRA 150 P20963 MKWKALFTAAILQAQ...KDTYDALHMQALPPR 2 31 53 AQSFGLLDPK LCYLLDGILFIYGVILTALFLRV KFSRSADAPA 151 P01730 MNRGVPFRHLLLVLQ...TCQCPHRFQKTCSPI 2 398 420 PTWSTPVQPM ALIVLGGVAGLLLFIGLGIFFCV RCRHRRRQAE 152 P30203 MWLFFGITGLLTAAL...PDSTDNDDYDDISAA 2 402 424 ENKESRELML LIPSIVLGILLLGSLIFIAFILL RIKGKYALPV 153 P11912 MPGGPGVLQALPATI...DVGSLNIGDVQLEKP 2 143 165 FLDMGEGTKN RIITAEGIILLFCAVVPGTLLLF RKRWQNEKLG 154 P40259 MARLALSPVPSHWMV...TGEVKWSVGEHPGQE 2 161 180 LKQRNTLKDG IIMIQTLLIILFIIVPIFLL LDKDDSKAGM 155 P50283 MTQQAVLALLLTLAG...SYSNRKTPCIPNQYQ 2 149 171 SQEPLQTSFS FPAAIAVGFFFTGLLLGVVCSML RKIQIKKLCA 156 P33681 MGHTRRQGTSPSKCP...RRRNERLRRESVRPV 2 247 269 EHFPDNLLPS WAITLISVNGIFVICCLTYCFAP RCRERRRNER 157 Q00609 MACNCQLMQDTPLLK...TFGPEEALAEQTVFL 2 249 271 EDPPDSKNTL VLFGAGFGAVITVVVIVVIIKCF CKHRSCFRRN 158 Q01151 MSRGLQLLLLSCAYS...NKHLGLVTPHKTELV 2 146 168 ETFKKYRAEI VLLLALVIFYLTLIIFTCKFARL 
QSIFPDFSKA 159 P42081 MDPQCTMGLSNILFV...KSSKTSSCDKSDTCF 2 246 268 LEDPQPPPDH IPWITAVLPTVIICVMVFCLILW KWKKKKRPRN 160 P42082 MDPRCTMGLAILIFV...KELEPQIASAKPNAE 2 246 263 FPSPQTYWKE ITASVTVALLLVMLLIIV CHKKPNQPSR 161 P01732 MALPVTALLLPLALL...VVKSGDKPSLSARYV 2 184 206 TRGLDFACDI YIWAPLAGTCGVLLLSLVITLYC NHRNRRRVCK 162 Q9BYE9 MAQLWLSCFLLPALV...EGPSYTNAGLDTTDL 2 1155 1177 ESDLSKQLIS VIIGLGVALLLVLVIMTMAFVCV RKSYNRKLQA 163 Q6ZTQ4 MQEAIILLALLGAMS...PAFMNRAYPKPHPGK 2 710 732 LRKNVYSPSA WYVPFVITLGSILLLGLLVYLVV LLAKAIHRHC 164 A6H8M9 MVLLRLLVFLFAPVV...RDYLFNTHTGARRWL 2 686 708 TDTEAFWQPQ PWFVVVLTATGALLLLALGWLLG RLLQGLAQLL 165 Q9HBB8 MGSWALLWPPLLFTG...GGGPYDAPGGDDSYI 2 668 690 SEDKRFSVVD MAALGGVLGALLLLALLGLAVLV HKHYGPRLKC 166 Q9D871 MDFSRPSFSPWRWLT...CSRGKTCHKCPWQTN 2 334 356 TVNRELYIPG PLVIFLILLTSLGGAFVCRVLVY SLFQSCSRGK 167 P13688 MGHLSAPLHRVRVPW...SLTATEIIYSEVKKQ 2 433 455 NGLSPGAIAG IVIGVVALVALIAVALACFLHFG KTGRASDQRD 168 Q925P2 MELASAHLHKGQVPW...SPRATETVYSEVKKK 2 422 444 VIFDSTYDIS DVPIAVIITGAVAGVILIAGLAY RLCSRKSRWG 169 Q6ZU64 MFTLTGCRLVEKTQK...AEVLHPVVPLPTDLP 2 187 209 KMKYRPPKTK FFFTVIPQPIFLSPGITLTLPIV FRPLEAKEYM 170 Q9H9P2 MSRVVSLLLGAALLC...LWISKSTRKESGMEV 2 218 240 VTEAGIIPNL IYVVIPTIPLLLLILVAFGTCCF QMLHKSKGRT 171 Q96F05 MWTALVLIWIFSLSL...QVDYLINGMYADSEM 2 400 422 EPLTQAVVDK TLLLVVLLLGVTLFITVLVLFAL QAYESYKKKD 172 Q6NUJ2 MSARAPKELRLALPP...LDTAGEGLLQTVVLS 2 62 84 QQLFQSFSST LVLIVLVTLIFCLIVLSLSTFHI HKRRMKKRKM 173 Q86T13 MRPAFALCLLWQALW...EGALLAESPLGSSDA 2 399 421 PQAFDSSSAV VFIFVSTAVVVLVILTMTVLGLV KLCFHESPSS 174 Q8BG22 MTHRDSTGPVIGLKL...RKKRPSRKENETKFL 2 904 926 VVSRDDLILK GVLTTVGLIAILCLIMVVAHCIF NRKKRPSRKE 175 O14967 MHFQAFWLCLGLLFI...DGPIKSVRKRRVRKD 2 471 493 QLMAAAEGHP WLWLIYLVTAGVPIALITSFCWP RKVKKKHKDT 176 Q8TDQ1 MPLLTLYLLLFWLSG...RGPEEPTEYSTISRP 2 156 178 LDNRHKLLKL SVLLPLIFTILLLLLVAASLLAW RMMKYQQKAA 177 Q08708 MTARAWASWRSSALL...RSSRSRQNWPKGENQ 2 184 206 HPGSLFSNVR FLLLVLLELPLLLSMLGAVLWVN RPQRSSRSRQ 178 A8K4G0 MWLPPALLLLSLSGC...IYMNFSEPLTKDMAT 2 150 172 AVFIGSHKRN 
HYMLLVFVKVPILLILVTAILWL KGSQRVPEEP 179 Q9H6B4 MSLLLLLLLVSYYVG...TPSMIPSQSRAFQTV 2 234 256 TVQYVQSIGM VAGAVTGIVAGALLIFLLVWLLI RRKDKERYEE 180 Q9BZ76 MASVAWAVLKVLLLL...LRKENESKVSKKEEC 2 1244 1266 EPLVNADRRD SAVIGGVIAVVIFILLCITAIAI RIYQQRKLRK 181 Q6NT55 MLPITDRLLHLLGLE...ENGLWLKVEPLPPRA 2 21 40 LLGLEKTAFR IYAVSTLLLFLLFFLFRLLL RFLRLCRSFY 182 P17927 MGASSPRSPEPVGPP...PRTLQTNEENSRVLP 2 1974 1996 KCTSRTHDAL IVGTLSGTIFFILLIIFLSWIIL KHRKGNNAHE 183 P20023 MGAAGLLGVFLALVA...EAREVYSVDPYNPAS 2 976 998 AVCRSRSLAP VLCGIAAGLILLTFLIVITLYVI SKHRARNYYT 184 Q9NZV1 MYLVAGDRGLAGCGH...QKQNHLQADNFYQTV 2 939 961 LHPSEDSSLD SIASVVVPIIICLSIIIAFLFIN QKKQWIPLLC 185 Q9HC73 MGRLVLLWGAAVFLL...GGFTFVMNDRSYVAL 2 233 252 TPPKPKLSKF ILISSLAILLMVSLLLLSLW KLWRVKKFLI 186 O95727 MWWRVLSLLAWFPLQ...SKLEEKHIQVPESIV 2 288 310 YLGLARKKSG ILLLTLVSFLIFILFIIVQLFIM KLRKAHVIWK 187 Q8VHS2 MKLKRTAYLLFLYLS...EMWIRMPPPALERLI 2 1346 1368 ADDRLLGIFT AVGSGTLALFFILLLAGVASLIA SNKRATQGTY 188 Q5IJ48 MALARPGTPDPQALA...EMDSVLKVPPEERLI 2 1225 1247 PLPLPFPLLE VAVPAACACLLLLLLGLLSGILA ARKRRQSEGT 189 Q9BUF7 MANPGLGLLLALGLP...PPTPNLKLPPEERLI 2 57 79 SSDGNLRPEA ITAIIVVFSLLAALLLAVGLALL VRKLREKRQT 190 Q8NEA5 MDKVQSGFLILFLFL...NKTKNASHNGKMEDL 2 102 124 IRHRPALVKV ILISSVAFSIALICGMAISYMIY RLAQAEERQQ 191 P07333 MGPGVLLLLLVATAW...DIAQPLLQPNNYQFC 2 515 537 AHTHPPDEFL FTPVVVACMSIMALLLLLLLLLL YKYKQKPKYQ 192 P15509 MLLLVTSLLLCELPH...GKGYREEVLTVKEIT 2 324 346 EFGSDDGNLG SVYIYVLLIVGTLVCGIVLGFLF KRFLRIQRLF 193 Q99062 MARLGNCSLTWAALI...LQGIRVHGMEALGSF 2 626 648 LMTLTPEGSE LHIILGLFGLLLLLTCLCGTAWL CCSPNRKNPL 194 Q96PZ7 MTAWRRFQSLLLLLG...AVRFDTTLNTVCTVV 2 3487 3509 SSHYHGTSSG SVAAAILVPFFALILSGFAFYLY KHRTRPKVQY 195 O95196 MGRAGGGGPGRGPPP...DQADLDVNCLQNNLT 2 421 443 RCESIITDFQ VMCVAVGSAALVLLLLFMMTVFF AKKLYLLKTE 196 P16410 MACLGFQRHKAQLNL...PECEKQFQPYFIPIN 2 162 184 IDPEPCPDSD FLLWILAAVSSGLFFYSFLLTAV SLSKMLKKRS 197 Q86XM0 MLMLMLVAAVTMWLR...PPGRHRTPHGGRSDH 2 719 741 IYVYGAFPVQ LVSAGVVILLIISSILGSVWLAY KTPKLLRTAR 198 E9Q9F6 MLVLMLAAAVATMVR...QNRGKVRVAQKHPET 
2 755 777 KQLRSEKGQR LLGFCYQILQLCLGVCFCTWLRG KLRQWLRPRR 199 Q5SY80 MSAREVAVLLLWLSC...IYEPLHKPQRKRKKN 2 905 927 TFGLIPSPSV YLVASFLFVLMLLFFTILVLSYF RYMRIYRRYI 200 Q6ZRH7 MCGPAMFPAGPPWPR...DRAEPKEAVERQLMT 2 1073 1095 PKRALFIIMV SASVFVGLVIFYIAFCLLWPLVV KGCTMIRWKI 201 Q86UP6 MELVRRLMPLTLLIL...VNQRADYKYQKLQNY 2 571 593 TPNQPFNSVH LFSFMVLALNVVTVATITVRHFV NQRADYKYQK 202 Q5JRM2 MNLVICVLLLSIWKN...DRGYNQVTSEVTLND 2 48 70 QTKLNYLRRN LLILVGIIIMVFVFICFCYLHYN CLSDDASKAG 203 Q96J86 MDAPRLPVRPGVLLP...AQRSPPPPYPGNARK 2 63 85 YIGNILSGTA IAGIVFGIVFIMGVIAGIAICIC MCMKNHRATR 204 Q61476 MVSSTWGYDPRAGAG...RRSDFQGKERKDVSK 2 367 389 ESNSGGDRYI YGFVAVIAMIDSLIIVKTLWTIL SPNRRSDFQG 205 Q8N8Z6 MVPGARGGGALARAA...DCLTPLNQTAMTALL 2 458 480 EETSTGINIT TVAIPLVLLVVLVFAGMGIFAAF RKKKKKGSPY 206 Q96PD2 MASRAVVRARRCPQC...GAGRDGECDVFKEIL 2 527 549 TTVTPNVTKD VALAAVLVPVLVMVLTTLILILV CAWHWRNRKK 207 Q16832 MILIPRMLLVLFLLL...SFQEIHLLLLQQGDE 2 399 421 PMLKVDDSNT RILIGCLVAIIFILLAIIVIILW RQFWQKMLEK 208 P80370 MTATEALLRVLLLLL...IDMTTFSKEAGDEEI 2 305 327 KTPLLTEGQA ICFTILGVLTSLVVLGTVGIVFL NKCETWVSNL 209 Q6UY11 MPSGCRCLHLVCLLC...LPRDLPPEPGKTTAL 2 307 329 RQEAGLGEPS LVALVVFGALTAALVLATVLLTL RAWRRGVCPP 210 P28068 MITFLPLLLGLSLGC...TPLPGSNYSEGWHIS 2 219 238 PGLSPMQTLK VSVSAVTLGLGLIIFSLGVI SWRRAGHSSY 211 Q96KC8 MTAPCSQPAQLPGRR...KLLVELVQKKKQAKS 2 153 175 YRRVRKMSNA ELALLLFIILTVGHYAVVWSIYL EKQLDELLSR 212 P20036 MRPEDRMFHIRAVIL...KSLRSGHDPRAQGTL 2 223 245 EPIQMPETTE TVLCALGLVLGLVGIIVGTVLII KSLRSGHDPR 213 P01903 MAISGVPVLGFFIIA...KGLRKSNAAERRGPL 2 217 239 APSPLPETTE NVVCALGLTVGLVGIIIGTIFII KGLRKSNAAE 214 P13762 MVCLKLPGGSCMAAL...NQKGHSGLQPTGLLS 2 228 250 SARSESAQSK MLSGVGGFVLGLLFLGTGLFIYF RNQKGHSGLQ 215 Q08554 MALASAAPGSIFCKQ...LEPKFRTLAKTCIKK 2 692 714 DVRPNVILGR WAILAMVLGSVLLLCILFTCFCV TAKRTVKKCF 216 Q19T08 MGTAGAMQLCWVILG...NMNNGKQSLSAEKVL 2 123 145 SPTSETVLTV AAFGVISFIVILVVVVIILVGVV SLRFKCRKSK 217 Q9UNE0 MAHVGDCTQTPWLPV...WAGVVPPASQPHAAS 2 188 210 LSGQGHLATA LIIAMSTIFIMAIAIVLIIMFYI LKTKPSAPAC 218 P00533 
MRPSGTAGAALLALL...EYLRVAPQSSEFIGA 2 646 668 CPTNGPKIPS IATGMVGALLLLLVVALGIGLFM RRRHIVRKRT 219 P01133 MLLTLIILLPVVSKF...QQRALDPPHQMELTQ 2 1033 1055 RHAGHGQQQK VIVVAVCVVVLVMLLLLSLWGAH YYRTQKLLSK 220 Q6UXG2 MAEPGHSHHLSARVR...SVPLKTSSGGLDMDL 2 908 930 RVTICKTIDF WLKVGISAGTCTAILLTVLTCYF WKKNQKLEYK 221 P0C7U0 MAGRGWGALWVCVAA...DILDYWKGVSAQHKS 2 418 440 PVPSPSTATH YIMTILGCLFGMVLVLGAVYYCL RRRRRQEEKH 222 Q6PCB8 MRALPGLLEARARTP...ENNVPRHRKNESLGQ 2 264 281 VLSYLVPLKP FLVIVAEVILLVATILLC EKYTQKKKKH 223 Q5UCC4 MAAASAGATRLLLLL...QGGGGGGGGGGGSGR 2 221 243 QKAKNPQEQK SFFAKYWMYIIPVVLFLMMSGAP DTGGQGGGGG 224 Q9NPA0 MAAALWGFFPVLLLL...SGSSKTGKSGAGKRR 2 160 182 IKRESWGWTD FLMNPMVMMMVLPLLIFVLLPKV VNTSDPDMRR 225 Q902F9 MNPSEMQRKAPPRRR...NVGKSKRDQIVTVSV 2 633 655 NLNTVTWVKT IGSTTIINLILILVCLFCLLLVY RCTQQLRRDS 226 Q9Y6X5 MKLLVILLFSGLITG...SRLQLQEDDDDPLIG 2 406 428 LVDQWCINLP EAIAIVIGSLLVLTMLTCLIIIM QNRLSVPRPF 227 Q6UW88 MALGVPISVYLLFNA...LKSPYNVCSGERRPL 2 111 133 TSYAVDSYEK YIAIGIGVGLLLSGFLVIFYCYI RKRCLKLKSP 228 Q60750 MERRWPLGLALLLLL...GHQKRILCSIQGFKD 2 549 571 PVSRSLTGGE IVAVIFGLLLGIALLIGIYVFRS RRGQRQRQQR 229 Q9UF33 MGGCEVREFLLQFGF...LRLHMMHIQEKGFHV 2 549 571 SDMAAEQGQI LVIATAAVGGFTLLVILTLFFLI TGRCQWYIKA 230 P29322 MAPARGRLPPALWVV...MRAQLTSTQGPRRHL 2 541 563 TGKPRPRYDT RTIVWICLTLITGLVVLLLLLIC KKRHCGYSKA 231 P19235 MDHLGASLWPQVGSL...IPAAEPLPPSYVACS 2 251 273 SLLTPSDLDP LILTLSLILVVILVLLTVLALLS HRRALKQKIW 232 Q9NQ60 MNFILFIFIPGVFSL...GSDNEMHENDESVTR 2 184 206 DLEDLKIKIM LGISLMTLLLFVVLLAFCSATLY KLRHLSYKSC 233 P04626 MELAALCRWGLLLAL...PTAENPEYLGLDVPV 2 653 675 PAEQRASPLT SIISAVVGILLVVVLGVVFGILI KRRQQKIRKY 234 P21860 MRANDALQVLGLLFS...YWHSRLFPKANAQRT 2 644 666 LVLIGKTHLT MALTVIAGLVVIFMMLGGTFLYW RGRRIQNKRA 235 O14944 MTAGRRMEMLCAGRV...EYERVTSGDPELPQV 2 118 140 LTVHQPLSKE YVALTVILIILFLITVVGSTYYF CRWYRNRKSK 236 Q925F2 MILQAGTPETSLLRV...VPVMVPAQSQAGSLV 2 252 274 LDVMTGSKAA VVAGAVVGTFVGLVLIAGLVLLY QRRSKTLEEL 237 P58658 MLLPGRARQPPTPQP...SGLDTSLPRNMGQFY 2 322 344 FAYIRAHPER AALLFVSSVCIGLALTLCALVIR 
ESCAKDFRDL 238 P22794 MPTDMEHTGHYLHLA...KDEEGTEKLTNKQIG 2 136 158 CAENNNNMAM LICLIIIAVLFLICTFLFLSTVV LANKVSSLRR 239 P34910 MDPKYFILILFCGHL...QDLNESLPPPPAELL 2 203 225 QTPQKNNYNS IAAILIGVLLTSMLVAIIIIVLW KCLRKPVLND 240 Q6P995 MARLCRRVPCTLLLG...NIWKKREERPLIPIN 2 353 375 EDSKDITAYH TVFLTAILGGTIVIVIGFFAVLL CYCRDKCGTP 241 Q8TBP5 MKASQCCCCLSHLLA...EDDDNTLFDANHPRR 2 124 146 NPGDKPMTQR ALTVLMVVSGAVLVYFVVRTVRM RRRNRKTRRY 242 Q3ZCQ3 MRAVPLPAPLLPLLL...DDEDEDSTVFDIKYR 2 93 115 ILLRDLPTLK AAVIVAFAFTTLLIACLLLRVFR SGKRLKKTRK 243 Q9BVV8 MGPRVLQPPLLLLLL...LDSDEETVFESRNLR 2 74 93 GASGSALTRS FYVILGFCGLTALYFLIRAF RLKKPQRRRY 244 Q9D3R5 MSLAHTTVLLWAWGS...CQSRCCPNFSAQTLL 2 377 399 PASLSDPETR TAIELTLMGYLLITIFFITIHLC RCCCQSRCCP 245 Q17R55 MPPMLWLLLHFAAPA...HPSPGRRSTQVLVVK 2 329 351 ARKALRGRAD SVLKGLKLVLLVVTVLALLGALL KCIHPSPGRR 246 Q15884 MILLVNLFVLLSVVC...ERPHSLIGVIRETVL 2 86 108 CSAVHLLLKK VLFALCALNALTTTVCLVAAALR YLQIFATRRS 247 Q5JX71 MWTLKSSLVLLLCLT...YHVTICEIWGEESSS 2 51 73 HFRIRQNLPE HTQGWLGSKWLWLLFVVVPFVIL QCQRDSEKNK 248 Q14517 MGRHLALLLLLLLLF...EVTIPPLDSQQHTEV 2 4179 4201 QYVSTPWNIG LAEGIGIVVFVAGIFLLVVVFVL CRKMISRKKK 249 Q9NYQ8 MTIALLGFAIFLLHC...DMVESDYGSCEEVMF 2 4049 4071 IQRGDWGQQE LLIITVAVAFIIISTVGLLFYCR RCKSHKPVAM 250 Q8TDW7 MDIIMGHCVGTRPPA...SLHIPFVETQHQTQV 2 4156 4175 GHSYVGKEEL IGIAVVLFVIFILVVLFIVF RKKVFRKNYS 251 Q8WWV6 MPLFLILCLLQGSSF...PAGASLTAPERNPGP 2 451 470 TFPEDESSSR TLAPVSTMLALFMLMALVLL QRKLWRRRTS 252 P12319 MAPAMESPTLLCVAL...FRLLNPHPKPNPKNN 2 203 225 TVIKAPREKY WLQFFIPLLVVILFAVDTGLFIS TQQQVTFLLK 253 P12318 MTMETQMSQNVCPRN...IYLTLPPNDHVNSNN 2 218 240 PSMGSSSPMG IIVAVVIATAVAAIVAAVVALIY CRKKRISANS 254 P08637 MWQLLLPTALLLLVS...WKDHKFKWRKDPQDK 2 207 229 STISSFFPPG YQVSFCLVMVLLFAVDTGLYFSV KTNIRSSTRD 255 P12314 MWFLTTLLLWVPVDG...QLQEGVHRKEPQGAT 2 289 311 QVLGLQLPTP VWFHVLFYLAVGIMFLVNTVLWV TIRKELKRKK 256 P55899 MGVPRPQPWALGLLL...QDADLKDVNVIPATA 2 298 320 VELESPAKSS VLVVGIVIGVLLLTAAAVGGALL WRRMRSGLPA 257 Q96LA6 MLPRLLLLICAPLCE...LRKANITDVDYEDAM 2 308 330 TGARSNHLTS 
GVIEGLLSTLGPATVALLFCYGL KRKIGRRSAR 258 Q96LA5 MLLWSLLVIFDAVTE...ENKDSQVIYSSVKKS 2 400 422 DGYRRDLMTA GVLWGLFGVLGFTGVALLLYALF HKISGESSAT 259 Q96P31 MLLWLLLLILTPGRE...ENYENVPRVLLASDH 2 572 594 VTGTSRNRTG LTAAGITGLVLSILVLAAAAALL HYARARRKPG 260 Q68SN8 MSGSFSPCVVFTQMW...ESESPRSRCQMAEKK 2 495 517 FDMTKNRSVP MAAGITVGLLIMAVGVFLFYCWF SRKAGGKPTS 261 Q6DN72 MLLWTAVLLFVPCVG...RTLQEPLSDCEEVLC 2 307 329 SQVLFTPASN WLVPWLPASLLGLMVIAAALLVY VRSWRKAGPL 262 P11362 MWSWKCLLFWAVLVT...PRHPAQLANGGLKRR 2 375 397 RPAVMTSPLY LEIIIYCTGAFLISCMVGSVIVY KMKSGTKKSD 263 Q8N441 MTPSPLLLLLLPPLL...SHVEGKVHQHIHYQC 2 377 399 ASSSSATSLP WPVVIGIPAGAVFILGTLLLWLC QAQKKPCTPA 264 P36888 MPALARDGGQLPLLV...MDLGLLSPQAQVEDS 2 542 564 PGPFPFIQDN ISFYATIGVCLLFIVVLTLLICH KYKKQFRYES 265 F2Z333 MRAPPLLLLLAACAP...LMRPALARPGLRRHP 2 183 205 FTAEPAGMQD IVVAMTAVGGSICVMLVVICLLV AYITENLMRP 266 Q8NAU1 MHPGSPSAWPPRARA...STPEHQGGGLLRSKI 2 150 172 TMKEMGRNQQ LRTGEVLIIVVVLFMWAGVIALF CRQYDIIKDN 267 Q9P2B2 MGRLASRPLLLALLS...ETRRERRRLMSMEMD 2 831 853 VKMDVLNAFK YPLLIGVGLSTVIGLLSCLIGYC SSHWCCKKEV 268 Q5SZK8 MHSAGTPGLSSRRTG...PMVPPQSHHNDSSEV 2 3111 3133 ELNSPSSAVS LVTVVGGTTVGLLTICLTVIAVL MCRGKESFRG 269 P09958 MELRPWLLWVVAATG...GRGERTAFIKDQSAL 2 716 738 AGLLPSHLPE VVAGLSCAFIVLVFVTVFLVLQL RSGFSFRGVK 270 P23188 MELRSWLLWVVAAAG...GRGERTAFIKDQSAL 2 713 735 RLQAGLASHL PEVLAGLSCLIIVLIFGIVFLFL HRCSGFSFRG 271 O95866 MAVFLQLLPLLLSRA...TADPADASTIYAVVV 2 143 165 GPTHGSVYPQ LLIPLLGAGLVLGLGALGLVWWL HRRLPPQPIR 272 D7PDD4 MALVLPLLPLLLSKV...TVVSGDASTVYAVVV 2 141 160 GSTHGYEYPK VLIPLLGVGLVLGLGVAGVV WRRRRLSPPP 273 Q9NU53 MEGAPPGSLALRLLL...GPEKRAENLEDKTCI 2 262 284 LCRFWSNVFP VFFQFLNIMVVGITGAAVVITIL KVFFPVSEYK 274 Q8WWB7 MRGSVECTWGWGHCA...LLLHHKKYSEYQSIN 2 372 394 PVDGLSPLVL GIMAVALGAPGLMLLGGGLVLLL HHKKYSEYQS 275 P02724 MYGKIIFVLLLSEIV...PLSSVEIENPETSDQ 2 92 114 QLAHHFSEPE ITLIIFGVMAGVIGTILLISYGI RRLIKKSPSD 276 Q86XS8 MSCAGRAGPARLAAL...IRATASLNANEVEWF 2 195 217 MPPKNFSRGS LVFVSISFIVLMIISSAWLIFYF IQKIRYTNAR 277 P07359 MPLLLLLLLLPSPLH...DLLSTVSIRYSGHSL 2 
533 555 PDFCCLLPLG FYVLGLFWLLFASVVLILLLSWV GHVKPQALDS 278 Q99795 MVGKMWPVLWTLCAV...EQRSTGRESPDHLDQ 2 235 257 TVAVRSPSMN VALYVGIAVGVVAALIIIGIIIY CCCCRGKDDN 279 Q14956 MECLYYFLGFLLLAA...EKDPLLKNQEFKGVS 2 497 519 DRDPASPLRM ANSALISVGCLAIFVTVISLLVY KKHKEYNPIE 280 P40197 MLRGTLLCAVLGLLR...IGQLFRKLIRERALG 2 522 544 TGKGQDHSPF WGFYFLLLAVQAMITVIIVFAMI KIGQLFRKLI 281 P25092 MKTLLLDLALWSLLF...EYLQLNTTDKESTYF 2 432 454 NDITGRGPQI LMIAVFTLTGAVVLLLLVALLML RKYRKDYELR 282 Q02846 MTACARRAGGLPDPG...ERRRKLEKARPGQFS 2 465 487 NICGGGLEPG LVFLGFLLVVGMGLAGAFLAHYV RHRLLHMQMV 283 A0A0U1RPR8 MAGLQQGCHFEGQNW...GLAEPRKSGEAGPGP 2 480 502 CIRGVQPLGS LLTLTIACVLALVGGFLAYFIRL GLQQLRLLRG 284 P51841 MFLGLGRFSRLVLWF...FQRRKAERQLVRNKP 2 468 490 KICHGGIDPA FAMMVCLTLLIALLSINGFAYFI RRRINKIQLI 285 B1B212 MAKSSLSLNWSLLVL...KKKKGALCCSSSSTT 2 211 233 QHNSDTQGLS FTWIVIICIGGIVSFMAFMVFAW CMLKKKKGAL 286 Q8TDQ0 MFSHLPFDCVLLLLL...RQQPSQPLGCRFAMP 2 202 224 LRDSGATIRI GIYIGAGICAGLALALIFGALIF KWYSHSKEKI 287 Q99075 MKLLPSVVLKLFLAA...VENEEKVKLGMTNSH 2 162 184 NRLYTYDHTT ILAVVAVVLSSVCLLVIVGLLMF RYHRRGGYDV 288 Q9QUJ0 MDPPGYLLFLLLLPV...AQEDGRVYINMPGRG 2 35 57 CSGCGTLSLP LLAGLVAADAVMSLLIVGVVFVC MRPHGRPAQE 289 A8MVW5 MGQDAFMEPFGDTLG...EVIQHIPAQQQDHPE 2 350 372 EKLAQKGKSL SPLASITGISLFLIISMCLLFLW KKYQPYKVIK 290 Q14CZ8 MKRERGALSRASRAL...IIREQDEAGPVEISA 2 241 263 VKITVYRRSS LYIILSTGGIFLLVTLVTVCACW KPSKRKQKKL 291 E9Q7X6 MATPRAPRWPPPSLL...NPSFISDESRRRDYF 2 1204 1226 GGLNCGNPYQ LITVVIAAAGGGLLLILGVALIV TCCRKSKNDI 292 Q9BQS7 MESGHLLWALLFMQS...RSILDDSFKLLSFKQ 2 1110 1132 IPIKNVEMLA SVLVAISVTLLLVVLALGGVVWY QHRQRKLRRN 293 Q9UM44 MKAQTALSFFLILIT...APDNGEENVPLSGKV 2 346 365 PSQETASHNK GLWILVPSAILAAFLLIWSV KCCRAQLEAR 294 Q75VT8 MPWTILLFASGSLAI...SSPEPPEFSTFRACQ 2 123 145 QVSFPVPTWI LALSLSLAGAVLFSGLVAITVLV RKAKAKNLQK 295 Q8HWB0 MMLLLPLLAVFLVKR...VMYQPTQVNEGSSPS 2 297 319 APRESGDILR VSTISGTTILIIALAGVGVLIWR RSQELKEVMY 296 Q6MZM0 MPRKQPAGCIFLLTF...DYQQVQSCALPTDAL 2 1115 1137 KNLGPTGAKA ALVILFIIGLLLLITTVILSLRL CSAMKQTDYQ 297 Q08334 
MAWSLGSWLGGCLLV...DSCSLGTPPGQGPQS 2 224 246 HDETVPSWMV AVILMASVFMVCLALLGCFALLW CVYKKTKYAF 298 Q64385 MSSSCSGLTRVLVAV...LPGIPNLQRTPENFS 2 369 391 LDHRDPLEQV AVLASLGIFSCLGLAVGALALGL WLRLRRSGKD 299 P42701 MEPLVTWVVPLLFLF...TELSLEDGDRCKAKM 2 546 568 RFSIEVQVSD WLIFFASLGSFLSILLVGVLGYL GLNRAARHLC 300 Q99665 MAHTFRGCSLAFMFI...LTLDQLKMRCDSLML 2 623 645 REFCLQGKAN WMAFVAPSICIAIIMVGIFSTHY FQQKVFVLLA 301 Q14627 MAFVCLAIGCLYTFL...PNTYPKMIPEFFCDT 2 344 363 EDLSKKTLLR FWLPFGFILILVIFVTGLLL RKPNTYPKMI 302 O88786 MAFVHIRCLCFILLC...VDLNKEVCAYEDTLC 2 335 357 WEGYTGPDSK IIFIVPVCLFFIFLLLLLCLIVE KEEPEPTLSL 303 Q60819 MASPQLRGYGVQAIP...MTVRASSKEDEDTGA 2 206 228 ISPHSSKMTK VAISTSVLLVGAGVVMAFLAWYI KSRQPSQPCR 304 Q96F46 MGAARSPPSAVPGPL...SGWDTMGSESEGPSA 2 319 341 TPEPIPDYMP LWVYWFITGISILLVGSVILLIV CMTWRLAGPG 305 Q8NAC3 MPVPWFLLSLALGRS...RGVGPGAGPGAGDGT 2 540 559 PMDKYIHKRW ALVWLACLLFAAALSLILLL KKDHAKGWLR 306 Q8BH06 MGSPRLAALLLSLPL...DYQGSTNSPCGFSCL 2 415 437 VLCPDVSHRH LGLLILALLALTALVGVVLVLLG RRLLPGSGRT 307 O95256 MLCLGWIFLWLVAGE...SRTETTGRSSQPKEW 2 360 382 VQLKEKRGVV LLYILLGTIGTLVAVLAASALLY RHWIEIVLLY 308 Q6PHB0 MHTPGTPAPGHPDPP...TRFMEEWGLHVQMES 2 255 277 VQTSAWKAKV IFWYVFLTSVIVFLFSAIGYLVY RYIHVGKEKH 309 Q6UXL0 MQTFTMVLEEIWTSL...TAVMSPEELLRAWIS 2 233 255 CVEVQGEAIP LVLALFAFVGFMLILVVVPLFVW KMGRLLQYSC 310 Q8N6P7 MRTLLTILTVGSLAA...DSLFRGLALTVQWES 2 227 249 CRVKTLPDRT WTYSFSGAFLFSMGFLVAVLCYL SYRYVTKPPA 311 Q6UWB1 MRGGRGAPFWLWPLP...EELGLLGPPRPQVLA 2 517 539 HLPDNTLRWK VLPGILFLWGLFLLGCGLSLATS GRCYHLRHKV 312 P05362 MAPSSPRPALPALLV...QKGTPMKPNTQATPP 2 481 503 TVNVLSPRYE IVIITVVAAAVIMGTAGLSTYLY NRQRKIKKYR 313 P35330 MSSFACWSLSLLILF...AAWRRLPRAFRARPV 2 224 246 VYEPMQDNQM VIIIVVVSILLFLFVTSVLLCFI FGQHWHRRRT 314 O75144 MRLGSPGLLFLLFSS...GAWAVSPETELTGHV 2 254 276 ENPVSTGEKN AATWSILAVLCLLVVVAVAIGWV CRDRCLQHSY 315 Q9JHJ8 MQLKCPCFVSLGTRQ...YTGPKTVQLELTDHA 2 280 299 PQETHNNELK VLVPVLAVLAAAAFVSFIIY RRTRPHRSYT 316 Q9Y6W8 MKSGLWYFFLFCLRI...AVNTAKKSRLTDVTL 2 142 164 ESQLCCQLKF WLPIGCAAFVVVCILGCILICWL TKKKYSSSVH 
317 P98153 MVPKADSGAFLLLFL...PGGGRHSRSSLNTVV 2 346 368 DGNSLFDSMA SGMRLVVSCISSFLILSLLLFMV HRLRQRRRER 318 Q8IVU1 MAVQRAASPRRPPAP...RPAAARVTQPAHSEQ 2 639 661 KEEAANQTST TGIVIGIHIGVTCIIFCVLFLLF GQRGRVLLCK 319 Q9H665 MGPGRCLLTALLLLA...LRVLSKLGSSGVCWA 2 165 187 QQAWPNFLPL VVLVLLLTLAVIAILLFILLWHL CWPKEKADPY 320 A8E0Y8 MACILCVASLFLSLT...KTSLQKEAGEESGHY 2 971 993 VSSLICSSGP LLHFLIVCPFVMLLLLATSFLCL YRKARKLSQL 321 O75054 MKCFFPVLSCLAVLG...CLEPPVLSIHPGAID 2 1125 1147 LQSIICSNDA LFYFVFFYPFPIFGILIITILLV RFKSRNSSKN 322 Q7TSN7 MEGSWRDVLAVLVIL...FDIASPQKVRNVTLV 2 239 261 GEEGPALPTW AIILLAVAFSLLLILIIVLIIIF CCCCASRREK 323 O95976 MGTASRSNIARHLQT...NTYENRRVLSNYERP 2 154 176 IKLLSKELRS FLTALVSLLSVYVTGVCVAFILL SKSKSNPLRN 324 Q61098 MHHEELILTLCILIV...PWREESEARSVLSAP 2 326 348 IPDIPGHVFT GGVTVLVLASVAAVCIVILCVIY KVDLVLFYRR 325 Q9NPH3 MTLLWCVVSLYFYGI...SSDEQGLSYSSLKNV 2 360 382 KVPAPRYTVE LACGFGATVLLVVILIVVYHVYW LEMVLFYRAH 326 Q9HBE5 MPRGWAAPLLLLLLQ...VVIPPPLSSPGPQAS 2 233 255 FQTQSEELKE GWNPHLLLLLLLVIVFIPAFWSL KTHPLWRLWK 327 Q5VWK5 MNQVTIQWDAVIALY...NILESHFNRISLLEK 2 354 376 GHLTSDNRGD IGLLLGMIVFAVMLSILSLIGIF NRSFRTGIKR 328 P14784 MAAPALSWRLPLLIL...LSLQELQGQDPTHLV 2 243 265 PAALGKDTIP WLGHLLVGLSGAFGFIILVYLLI NCRNTGPWLK 329 P31785 MLKPSLPFTSLLFLQ...SPYWAPPCYTLKPET 2 262 284 KENPFLFALE AVVISVGSMGLIISLLCVYFWLE RTMPRIPTLK 330 Q8NI17 MMWTWALWMLPSLCK...FLVSEKLPEHTKGEV 2 521 543 KTLSFSVFEI ILITSLIGGGLLILIILTVAYGL KKPNKLTHLC 331 P26952 MAANLWLILGLLASH...EPALEDCEVTPVTDA 2 333 355 CPPEVMPVKT ALVTSVATVLGAGLVAAGLLLWW RKSLLYRLCP 332 P32927 MVLAQGLLSMALLAL...LSLPPWEVNKPGEVC 2 443 465 SWDTESVLPM WVLALIVIFLTIAVLLALRFCGI YGYRLRRKWE 333 Q01344 MIIVAHVLLILLGAT...YIEKPGVETLEDSVF 2 342 361 GNDEHKPLRE WFVIVIMATICFILLILSLI CKICHLWIKL 334 P40189 MLTLQTWLVQALFIF...SYLPQTVRQGGYMPQ 2 620 642 TPKFAQGEIE AIVVPVCLAFLLTTLLGVLFCFN KRDLIKKHIW 335 P16871 MTILGTTFGMVFSLL...QEEAYVTMSSFYQNQ 2 241 263 INNSSGEMDP ILLTISILSFFSVALLVILACVL WKKRIKPIVW 336 Q01114 MALGRCIAEGWTLER...LTLAQPVALPVSSRA 2 269 291 RQGLLVPRWQ 
WSASILVVVPIFLLLTGFVHLLF KLSPRLKRIF 337 Q86SU0 MAWPKLPAPWLLLCT...RSEKDSSHSGRSVVI 2 163 185 TSGDPDKEVK LIVLHWLTVIFIILGALLLLLLI GVCWCQCCPQ 338 Q01638 MGFWILAILTILMYS...PRKASSLTPLAAQKQ 2 328 350 LSRKNPIDHH SIYCIIAVCSVFLMLINVLVIIL KMFWIEATLL 339 Q9BZV3 MIMFPLFGKISLGIL...PEFAAFVREQQVEEV 2 1102 1124 CEEFVSEPVI IGITIASVVGLLVIFSAIIYFFI RTLQAHHDRS 340 P17181 MMVVLLGATTLVLVA...ESESKTSEELQQDFV 2 437 459 EKTKPGNTSK IWLIVGICIALFALPFVIYAAKV FLRCINYVFF 341 P33896 MLAVVGAAALVLVAG...YLQSPALRTEPALLC 2 427 449 KLCEKTRPGS FSTIWIITGLGVVFFSVMVLYAL RSVWKYLCHV 342 P15260 MALLFLLPLVMQGVS...SLIGYRPTEDSKEFS 2 248 270 IFNSSIKGSL WIPVVAALLLFLVLSLVFICFYI KKINPLKEKS 343 P38484 MRPTLLWSLLLLLGV...ISFPEKEQEDVLQTL 2 248 270 MADASTELQQ VILISVGTFSLLSVLAGACFFLV LKYRGLIKYW 344 Q8IU57 MAGPERWGPLLLCLL...RTEDRGRTLGHYMAR 2 227 249 CFLLEVPEAN WAFLVLPSLLILLLVIAAGGVIW KTLMGNPWFQ 345 Q9WTL4 MAVPALWPWGVHLLM...NGASDYSAPNGGPGH 2 922 944 LEEEDTGGMR IFLTVTPVGFMLLVTLAALGFFY SRKRNSTLYT 346 Q3MIP1 MSVHYTLNLRVFWPL...PPRSQRTQGFLEGEP 2 46 64 ARAEPADGVD GGFPLLKVAVLLLLSYVLL RCRHAVRQRF 347 Q9NZN1 MKAPIPHLILLYATF...LPLLPRETSISSVIW 2 356 378 SVLLHKRELM YTVELAGGLGAILLLLVCLVTIY KCYKIEIMLF 348 P26006 MGPGPSRAPRAPRLM...MKSQPSETERLTDDY 2 992 1014 LVEELPAEIE LWLVLVAVGAGLLLLGLIILLLW KCGFFKRART 349 P13612 MAWEARREPGPRRAA...RRDSWSYINSKSNDD 2 978 1000 HHQRPKRYFT IVIISSSLLLGLIVLLLISYVMW KAGFFKRQYK 350 P20701 MKDSCITVMAMALLS...KPLHEKDSESGGGKD 2 1090 1112 KVDVVYEKQM LYLYVLSGIGGLLLLLLIFIVLY KVGFFKRNLK 351 Q3UV74 MLGQCTLLPVLAGLL...HHVEPVWNQERQGTQ 2 672 694 LVCAEISNTT ILLGVIVGVLLAVIFLLVYCMVY LKGTQKAAKL 352 A2A863 MAGPCCSPWVKLLLL...SGSLSTHMDQQFFQT 2 711 733 LVHKKKDCPP GSFWWLIPLLIFLLLLLALLLLL CWKYCACCKA 353 P26010 MVALPMVLVLLLVLS...TTINPRFQEADSPTL 2 724 746 VRPQEKGADH TQAIVLGCVGGIVAVGLGLVLAY RLSVEIYDRR 354 P26012 MCGSALAFFTAAFVC...DISKLNAHETFRCNF 2 682 704 QTSECFSSPS YLRIFFIIFIVTFLIGLLKVLII RQVILQWNSN 355 Q8IYV9 MGPHFTLLCAALAGC...QTQVPKEKATDSRQQ 2 291 313 QPLQPEKMLA SRLLGLLICGSLALITGLTFAIF RRRKVIDFIK 356 Q9D9J7 MGPHFTLLLAALANC...FNSDYSGDKSEATEN 
2 320 342 QNPEKKMKTR LLILLTLGFVVLVASIIISVLHF RKVSAKLKNA 357 Q6UXV1 MPLALTLLLLSGLGA...VSACTYRQNRKLLLQ 2 187 209 QMDSKYPRNQ ALLGILISVSLAVFVFVVIVVSA CTYRQNRKLL 358 Q5VZ72 MGDLWLFLLLPLSAF...GKIDEKEEKDFRLRK 2 178 200 RKAENREIAL FLILLATAVILGSAVLLFHFCIF HRRKMKAIRR 359 P57087 MARRSRHRLLLLLLR...TMSENDFKHTKSFII 2 239 261 RMQVDDLNIS GIIAAVVVVALVISVCGLGVCYA QRKGYFSKET 360 Q80UL9 MLCLLKLIVIPVILA...SPKASSLVRSSVRSK 2 280 302 KGQQGILNGN QLVIIVGIVCATFLLLPVLILIV KKAKWNKSSV 361 O76095 MLAGAGRPGLPQGRH...DRKALEKVRKQIESI 2 109 126 ALMEQRLFWK FEGAVVCVALIFACLVII RQRQLDRKAL 362 Q5VV43 MAPPTGVLSSLLLLV...SIRNGASFSYCSKDR 2 956 978 WDGESNCEWS IFYVTVLAFTLIVLTGGFTWLCI CCCKRQKRTK 363 Q8IYS2 MWLQQRLKGLPGLLS...PGAKPLFRSKEDPSV 2 592 614 HMAQQDPGLP FLFWFSVASLITLFHLFLFKLIY NEYCGPGAKP 364 Q9Y6H6 METTNGTETWYESLH...SDPYHVYIKNRVSMI 2 57 79 RASLPGRDDN SYMYILFVMFLFAVTVGSLILGY TRSRKVDKRS 365 Q9QZ26 MNCSESQRLQTLLNR...AAGALPALAQGAERV 2 59 81 REATSAKGND AYLYILLIMIFYACLAGGLILAY TRSRKLVEAK 366 Q99706 MSMSPTVIILACLGF...SQTQLASSNVPAAGI 2 243 265 FKTGIARHLH AVIRYSVAIILFTILPFFLLHRW CSKKKDAAVM 367 P83555 MLLWFLSLVCSGFFL...SEFSADTIVYMEIMK 2 337 359 DTKTNNYKNL HILTGLLVTMVLVVIIIFYSCYF SKQNKSQKQA 368 Q96J84 MLSLLVWILTLSDTF...SDYGQRFQQRMQTHV 2 497 519 LEEREVLPVG IIAGATIGASILLIFFFIALVFF LYRRRKGSRK 369 Q6UWL6 MLRMRVPALLVLLFC...AAFPTPSHPRLQTHV 2 511 533 GRRDLLPTVR IVAGVAAATTTLLMVITGVALCC WRHSKASASF 370 Q8IZU9 MKPFQLDLLFVCFFL...SDPSRPLQRRMQTHV 2 536 558 GLEAESVPMA VIIGVAVGAGVAFLVLMATIVAF CCARSQRNLK 371 P10721 MRGARGAWDFLCVLL...STASSSQPLLVHDDV 2 521 543 NNKEQIHPHT LFTPLLIGFVIVAGMMCIIVMIL TYKYLQKPMY 372 P05532 MRGARGAWDLLCVLL...SSASSTQPLLVHEDA 2 524 546 NNKEQIQAHT LFTPLLIGFVVAAGAMGIIVMVL TYKYLQKPMY 373 Q96MU8 MAPPAARLALLSAAA...KGQSQQDDRNPLVSD 2 391 413 MGAGSHRVEG WTVYGLATLLILTVTAIVAKILL HVTFKSHRVP 374 A6NMS7 MSSAQCPALVCVMSR...DSEAPTEEEESEALP 2 1582 1604 EVPGYGYTDK LILALIVTGILTILIILFCLIVI CCHRRSLQED 375 Q8BG84 MSLHPVILLVLVLCL...TDMAESSTYAAIIRH 2 143 165 SDTSWLKTYS IYIFTVVSVIFLLCLSALLFCFL RHRQKKQGLP 376 P13473 
MVCFRLFPVPGSGLV...YFIGLKHHHAGYEQF 2 378 400 CSADDDNFLV PIAVGAALAGVLILVLLAYFIGL KHHHAGYEQF 377 Q9UQV4 MPRQLSAAAALFASL...YKIRLRCQSSGYQRI 2 380 402 FGNVDECSSD YTIVLPVIGAIVVGLCLMGMGVY KIRLRCQSSG 378 Q9UJQ1 MDLQGRGVPSIDRLR...QVQIPRDRSQYKHMG 2 236 258 PVDEREQLEE TLPLILGLILGLVIMVTLAIYHV HHKMTANQVQ 379 Q6UX15 MRPGTALQAVLLAVL...RSKESGWVENEIYGY 2 236 258 SREAALNLAY ILIPSIPLLLLLVVTTVVCWVWI CRKRKREQPD 380 Q86UK5 MDPSGSRGRPTWVLA...NFLNAKKAMRALGMD 2 299 321 VTVLPHHGLH AAGFFIAFLLSLVLTWAALFLMV RYQCLKGNML 381 P48357 MICQKFCVVLLHWEF...QTHKIMENKMCDLTV 2 840 862 QDDIEKHQSD AGLYVIVPVIISSSILLLGTLLI SHQRMKKLFW 382 P19256 MVAGSDAGRALGVLS...GILKCDRKPDRTNSN 2 216 238 IPSSGHSRHR YALIPIPLAVITTCIVLYMNGIL KCDRKPDRTN 383 P42702 MMDIYVCLKRPSWMV...GGWSFTNFFQNKPND 2 835 857 MYVVTKENSV GLIIAILIPVAVAVIVGVVTSIL CYRKREWIKE 384 Q96FE5 MQVSKRMLAGGVRSM...ISSADAPRKFNMKMI 2 560 582 FPFDIKTLII ATTMGFISFLGVVLFCLVLLFLW SRGKGNTKHN 385 Q6UY18 MDAATAPKQAWPPWP...GDKNSGGNRVTAKLF 2 535 557 FFLDSRGVAM VLAVGFLPFLTSVTLCFGLIALW SKGKGRVKHH 386 O75022 MTPALTALLCLGLSL...EPPAEPSIYATLAIH 2 442 464 PPSTPGLGRY LEVLIGVSVAFVLLLFLLLFLLL RRQRHSKHRT 387 Q8VCD3 MLEIRGLSPSLCLLS...IGVLRRQPISPSMQA 2 439 461 SGWLLGSSTC LHTSIFLFFLLLQTVGFFCYVNF SRQELDKRLQ 388 Q9H0V9 MAATLGPLGSWQQWR...ILYNKWQEQSRKRFY 2 314 336 APLPPLSGLA LFLIVFFSLVFSVFAIVIGIILY NKWQEQSRKR 389 Q12907 MAAEGWIWRWGWGRR...AVVFQKRQERNKRFY 2 323 345 FRSGPLTGWR VFLLLLCALLGIVVCAVVGAVVF QKRQERNKRF 390 A0A1B0GTW7 MLLLLLLLLLLPPLV...ELHSTRVPVRGIREV 2 734 756 TSDHNPSMTH LRLSMGLCLMLLILVGVMGTTAY QKRATLPVRP 391 Q86YD5 MWLLGPLCLLLSSAA...AEPRDSEPSQGTEEV 2 172 194 SENQLVYYPS ITYAIIGSSVIFVLVVALLALVL HHQRKRNNLM 392 Q8TF66 MPLKHYLLLLVGCQA...RSQAVLMQMKAPNEC 2 539 561 VWGMTQAQSG LAIAAIVIGIVALACSLAACVGC CCCKKRSQAV 393 Q9H756 MKVTGITILFWPLSM...IEDKYIDIHELCEEN 2 269 291 RNSEHEPLGK SWAFLVGVVVTVLTTSLLIFIAI KCPIWYNILL 394 Q8N386 MGGTLAWTLLLPLLL...GQAPMDEEEYVIPGH 2 166 188 SCAPGLASAT IGAVVVSGCLLLGLAIAGPVLAW RLWRCRVARS 395 Q2I0M4 MRGPSWSRPRPLLLL...ASPADPGSPAAAAQA 2 265 287 QPLALRDLAV 
VYTLGPASFLVSLASCLALGSGL TACRARRRRL 396 Q14392 MRPQILLLLALLTLG...CCCVRRQKFNQQYKA 2 629 651 EKGGLKNINL IIILTFILVSAILLTTLAACCCV RRQKFNQQYK 397 Q86YC3 MELLPLWLCLGFHFL...LLQVIKSRCHWSSVY 2 652 674 KWERLDLGLL YLVLILPSCLTLLVACTVIVLTF KKPLLQVIKS 398 Q5VT99 MRPRAPACAAAALGL...APNKDAEDEDEDKDD 2 251 273 FSLSLTDLCI IIFSGVAVSIAAIISSFFLATVV QCLQRCAPNK 399 Q9BTN0 MAILPLLLCLLPLAP...LDCEPWGPGHEPVGP 2 537 559 CGAPHAPFLG GTMIIALGGVIVASVLVFIFVLL MRYKVHGGQP 400 Q96JA1 MARPVRGGLGAPRRS...LPGKQRVPLLLAPKS 2 793 815 AAGCRKDGTT VGIFTIAVVSSIVLTSLVWVCII YQTRKKSEEY 401 Q9P2V4 MRVALGMLWLLALAW...AFGVKGGRRINEYFC 2 530 552 DAENTQQLIN VVVISVAIVIALPLTLLVCCSAL QKRCRKCFNK 402 A6NDA9 MASVFHYFLLVLVFL...DTEGDKEKGGTEDNS 2 463 485 DAGGLEAREH LLHVTVVLCVVLLAVPVGAYAWA AQGPCSCSKW 403 Q3SXY7 MHLFACLCIVLSFLE...QVTFKSEGSRPEYYC 2 581 603 TERVEGDDSQ WSLLLVVTSTACVVILPLICFLL YKVCKLQCKS 404 Q8ND94 MLGSPCLLWLLAVTF...WGCPRRAAARAAGAL 2 198 220 VPPNPRTLVH AAVGVGTALALLSCAALVWHFCL RDRWGCPRRA 405 Q86VZ4 MASVAQESAGSQRRL...ITSEESDYLINGMYL 2 451 473 GGEHPAPETG AVLPLALGLAITALLLLMVACRL RLVKQKLKKA 406 Q9Y561 MACRWSTKESPRWRS...TLKNETSDDEALLLC 2 13 32 CRWSTKESPR WRSALLLLFLAGVYGNGALA EHSENVHISG 407 O75096 MRRQWGALLLGALLC...TGWKHERKLSSESQV 2 1724 1746 VPAAPGEGLH ISYAIGGLLSILLILVVIAALML YRHKKSKFTD 408 Q8WUT4 MRQTLPLLLLTVLRP...NPAFDDYPLGLQTVS 2 683 705 AFTTKPSFAL LLSGLCAASGLLLASTVVLSACL CRRGQTLGLQ 409 Q86UE6 MDFLLLGLCLYWLLR...GSCTCHQQPARECEV 2 428 450 HAENAVQIHK VVTGTMALIFSFLIVVLVLYVSW KCFPASLRQL 410 Q9HBL6 MKGELLLFSSVIVLL...PGKVEEKERFDSSPA 2 286 308 KPRPANLRHA IATVIITGVVCGIVCLMMLAAAI YGCTYAAITA 411 Q5SQ64 MAVLFLLLFLCGTPQ...NIHLARLGPPAHKPR 2 235 257 CAPSTGWDMP WILMLLLTMGQGVVILALSIVLW RQRVRGAPGR 412 O60449 MRTGWATPRRPAGLL...QGVNEDEIMLPSFHD 2 1669 1691 CKVPLGPDYT AIAIIVATLSILVLMGGLIWFLF QRHRLHLAGF 413 Q9HBG7 MVAPKSHTDDWAPGP...NDLEIPESPTYENFT 2 455 476 ICSGPERNTK LWIGLFLMVCLLCVGIFSWCIW KRKGRCSVPA 414 P14151 MIFPWKCQSTQRDLW...LKKGKKSKRSMNDPY 2 333 355 FSMIKEGDYN PLFIPVAVMVTAFSGLAFIIWLA RRLKKGKKSK 415 P16581 MIASQFLSALTLVLL...SLESDGSYQKPSYIL 2 
556 578 TCEAPTESNI PLVAGLSAAGLSLLTLAPFLLWL RKCLRKAKKF 416 P16109 MANCQIAILYQRFQR...GTYGVFTNAAFDPSP 2 773 795 AGPLTIQEAL TYFGGAVASTIGLIMGGTLLALL RKRFRQKDDG 417 Q9Y5Y7 MARCFSLVLLLTSIW...KSPSKTTVRCLEAEV 2 236 258 FKNEAAGFGG VPTALLVLALLFFGAAAGLGFCY VKRYVKAFPF 418 P20916 MIFLTALPLFWIMIS...TLTEELAEYAEIRVK 2 511 533 LPFQGAHRLM WAKIGPVGAVVAFAILIAIVCYI TQTRRKKNVT 419 Q5VYJ5 MLFFLDRMLAFPMNE...GTTSGSLETLSHHLK 2 2075 2097 TDFTYAQNNT WTLLGIGLAFLMTHITVAVLCFL ANRKVPIRKT 420 Q9H8J5 MFFGGEGSLTYTLVI...YSRLDYLINGIYVDI 2 386 408 QYGLPFEKWL LIGSLLFGVLFLVIGLVLLGRIL SESLRRKRYS 421 A6NHS7 MHVAEVAVNVILLLS...SLQIKNRNHMKENSS 2 286 308 DEVSVTSKTW LVSVALCTSVIFLGCCIVILASG CCGKQQGQYK 422 Q3UU94 MRAVELLLLLGLASM...RSASGCRRNTLKENS 2 284 306 EPWDGAPASA GVWLACVTLGAAVISLCCRVVLG TSRCCGKRQG 423 Q14703 MKLVNIWLLLLVVLL...PQLMQQVHPPKTPSV 2 999 1021 IMPGRYNQEV GQTIPVFAFLGAMVVLAFFVVQI NKAKSRPKRR 424 P15529 MEPPGRRECPFPSWR...YLTDETHREVKFTSL 2 344 366 PEEGILDSLD VWVIAVIVIAIVVGVAVICVVPY RYLQRRKKKG 425 Q96KG7 MVISLNSCLSFICLL...EDSGGSSSNSSSSSE 2 856 878 STALPADSYQ IGAIAGIIILVLVVLFLLALFII YRHKQKGKES 426 A6BM72 MVLSLTGLIAFSFLQ...VRQSPANGPSQDKQS 2 849 871 SPALGAERHS VGAVTGIMLLLFLIVVLLGLFAW HRRRQKEKGR 427 Q7Z7M0 MALGKVLAMALVLAL...RKGLLSQDNLTSMSL 2 2648 2670 FFRQDQAHID LFVFFSVFFSCFFLFLSLCVLLW KAKQALDQRQ 428 Q9H1U4 MNGGAERAMRSLPSL...NGQLTLTTPIHNYKA 2 515 537 LADVSWTQFN IIILTVIIIVVVLLMGFVGAVYM YREYQNRKLN 429 Q16820 MDLWNLSWFLFLDAL...SSNRPNLTPQNQHAF 2 654 676 EKRGSTRDTI VIAVSSTVAVFALMLIITLVSVY CTRKKYRERM 430 Q9H9K5 MGSLSNYALLQLTLT...AMKGLTTHQYDTSLL 2 490 512 FAKVGDWFRS WGYVLLIVLFCLFIFVLIYVRVF RKSRRSLNSQ 431 O75121 MDRLKSHLTVCFLPS...VTHDKNTCIIYESHV 2 150 172 LRVIFTSGDM GVYYMVVCLVAFTIVMVLNITRL CMMSSHLKKT 432 Q29983 MGLGPVFLLLAGIFP...PLMSDLGSTGSTEGA 2 306 328 PSGKVLVLQS HWQTFHVSAVAAAAIFVIIIFYV RCCKKKTSAA 433 P51512 MILLTFSTGRRLDFV...PRHILYCKRSMQEWV 2 565 587 LDNTASTVKA IAIVIPCILALCLLVLVYTVFQF KRKGTPRHIL 434 Q8TD46 MLCPWRTANLGLLLI...SEALQSEVDTDLHTL 2 244 266 VPGAKKSAKL YIPYIILTIIILTIVGFIWLLKV NGCRKYKLNK 435 Q6Q8B3 
MSAPRLLISIIIMVS...GFVFFQRINHVRKVL 2 239 261 RTSGSPALSL LIILYVKLSLFVVILVTTGFVFF QRINHVRKVL 436 Q2M385 MNNFRATILFWAAAA...ATGDTTYQEQGQSPA 2 656 678 HGDGGGLSGG AAAGVTVGVTTILAVVITLAIYG TRKFKKKAYQ 437 P20645 MFPFYSCWRTGLLLL...LGEESEERDDHLLPM 2 188 210 ACSPEISHLS VGSILLVTFASLVAVYVVGGFLY QRLVVGAKGM 438 P11717 MGAAAGRSPHLGPAP...LVSFHDDSDEDLLHI 2 2305 2327 MHKGLSERSQ AVGAVLSLLLVALTCCLLALLLY KKERRETVIS 439 Q3TEW6 MAEAVGAVALIAAPA...INKSESVVYADIRKD 2 162 191 LHVVEIDNLL VFLVWVVVGTVTAVVLGLTLLISLVLVVLY RRKHSKRDYT 440 O60487 MYGKSSTRAVLLLLG...LNQEKKVSVYLEDTD 2 153 175 VHTVRFSEIH FLALAIGSACALMIIIVIVVVLF QHYRKKRWAE 441 Q6UWV2 MQQRGAAGSRGCALF...VRCAECLDSDYEETY 2 159 181 ERGFGTMLSS VALLSILVFVPSAVVVALLLVRM GRKAAGLKKR 442 Q61830 MRLLLLLAFISVIPV...KDLMGNIEQNEHAII 2 1388 1410 MDPQPKGSSK AAGVVTVVLLIVIGAGVAAYFFY KKRHALHIPQ 443 Q13505 MLLGGPPRSPRSGTS...PGTRTLGMAEEDEEE 2 421 443 EEEPYRRRNQ ILSVLAGLAAMVGYALLSGIVSI QRATPARAPG 444 Q9UKN1 MLVIWILTLALRLCA...TELHIQRPEMVASTV 2 5381 5403 EFNIAKSLVY GIVGAVMAVLLLALIILIILFSL SQRKRHREQY 445 Q9H3R2 MKAIIHLTLLALLSV...QNPYSRHSSMPRPDY 2 421 443 SGLDCKDKFQ LILTIVGTIAGIVILSMIIALIV TARSNNKTKH 446 Q8C6Z1 MLTLAKIALISSLFI...DGIPMDAIPPLRPSI 2 235 257 DTPKENKNTG IVFGAILGAILGASLLSLVGYLL CGQRKTDSFS 447 Q8WXI7 MLKPSGLPGSSSPTR...CPGYYQSHLDLEDLQ 2 14453 14475 PLTGNSDLPF WAVILIGLAGLLGVITCLICGVL VTTRRRKKEG 448 Q5SSG8 MKMQKGNVLLMFGLL...VSSIAMEMSGRNSGP 2 480 502 KPGGSLVPWE IFLITLVSVVAAVGLFAGLFFCV RNSLSLRNTF 449 Q04900 MSRLSRSLLWAATCL...LYKFCKSKERNYHTL 2 164 186 QPVRKSTFDA ASFIGGIVLVLGVQAVIFFLYKF CKSKERNYHT 450 Q9ULC0 MELLQVTILFLLPSI...SHESGEHSAQGKTKN 2 191 213 TSATSRSYSS IILPVVIALIVITLSVFVLVGLY RMCWKADPGT 451 Q3MIW9 MAQPVHSLCSAFGLQ...MEQQNLGMGQIPSPR 2 447 469 QMGENDSFPA WAIVIVVLVAVILLLVFLGLIFL VSYMMRTRRT 452 Q9BRK3 MALPSRILLWKLVLL...KYIDLDKGFRKENCK 2 341 363 VPESRAHFFQ QLGYVLATLLLFILLLVTVLLAA RRRRGGYEYS 453 P25189 MAPGAPSSSPSPILA...EKKAKGLGESRKDKK 2 157 179 FEKVPTRYGV VLGAVIGGVLGVVLLLLLLFYVV RYCWLRRQAA 454 Q9UK23 MATSTGRWLLLRLAL...AEKEQPGGAHNPFKD 2 450 472 GELSFFTRTA 
WLALTLALAFLLLISTAANLSLL LSRAERNRRL 455 P13591 MLQTKDLIWTLFFLG...VPNDATQTKENESKA 2 724 746 SPTSGLSTGA IVGILIVIFVLLLVVVDITCYFL NKCGLFMCIA 456 O35136 MSLLLSFYLLGLLVR...VSNDIIQSKEDDIKA 2 696 718 PKPNIIKDTL FNGLGLGAIIGLGVAALLLILVV TDVSCFFIRQ 457 Q5T1S8 MTTATPLGDTTFFSL...ATVTFSPVDVQVETR 2 29 51 TRGEDFLYKS SGAIVAAVVVVVIIIFTVVLILL KMYNRKMRTR 458 O76036 MSSTLPALLCVGLCL...ASTWEGRRRLNTQTL 2 256 274 HALWDHTAQN LLRMGLAFLVLVALVWFLV EDWLSRKRTR 459 O95944 MAWRALHPLLLLLLL...VARTKISDDDDEHTL 2 193 215 LRPGPAAPIA LVPVFCGLLVAKSLVLSALLVWW GDIWWKTMME 460 Q96NY8 MPLSLGAEMWGPEAW...PTGNGIYINGRGHLV 2 350 372 GKQVDLVSAS VVVVGVIAALLFCLLVVVVVLMS RYHRRKAQQM 461 Q8TDF5 MIHGRSVLHIVASLI...GSLSKHESEYNTTRV 2 343 365 SLLDQLTNTS GTVIGVTSCIVIILIIISVIVQI KQPRKKYVQR 462 Q8NET5 MENQPVRWRALPGLP...RFEDDGELNLVYENL 2 164 186 YREPPQSPQK LLLFGFTGLLSVLSVVGTALLLW NKKRMRGPGK 463 Q92542 MATAGGGSGADPGSR...DVLFIAPREPGAVSY 2 670 692 IFLIASKELE LITLTVGFGILIFSLIVTYCINA KADVLFIAPR 464 O60500 MALGTTLRASLLLLG...LEPDSLPFELRGHLV 2 1064 1086 PSGPSGLPLL PVLFALGGLLLLSNASCVGGVLW QRRLRRLAEG 465 Q68D85 MTWRAAASTCAALLI...PVLSSQPPTLLLPLQ 2 262 284 LSETEKTDNF SIHWWPISFIGVGLVLLIVLIPW KKICNKSSSA 466 O35181 MSEGAAGASPPGAAS...FVLRNEIQRDSVLTK 2 363 385 MESEDVYQRQ VLSISCIIFGIVIVGMFCAAFYF KSKKQAKQIQ 467 Q8WWG1 MPTDHEEPCGPSHKS...VETSSTSAHHSHEQH 2 61 83 PGSSIQTKSN LFEAFVALAVLVTLIIGAFYFLC RKGHFQRASS 468 O14786 MERGLPLLCAVLALV...LKKDKLNTQSTYSEA 2 857 879 PGNVLKTLDP ILITIIAMSALGVLLGAVCGVVL YCACWHNGMS 469 Q96PE5 MSFSLNFTLPANTTS...RRRGLWWLVPRLSLE 2 31 53 GKETDCGPSL GLAAGIPLLVATALLVALLFTLI HRRRSSIEAM 470 Q99650 MALFAVFQTTFFLTL...SLSSITLLDPGEHYC 2 739 761 TKVTTPDEHS SMLIHILLPMVFCVLLIMVMCYL KSQWIKETCY 471 Q86WC4 MEPGPTAAQRRCSLP...LKSSTSFANIQENSN 2 283 305 FNCSVPCSDT VPVIAVSVFILFLPVVFYLSSFL HSEQKKRKLI 472 Q96FE7 MLLAWVQAFLVSNML...EGTTPLMGQAGTPGA 2 169 191 NSKEKKDLGT LGYVLGITMMVIIIAIGAGIILG YSYKRGKDLK 473 Q8NBR0 MAPPPPSPQLLLLAA...LPPTPDSGPEGESSE 2 310 329 ARGPTPRTEE AAWAAMALTFLLVLLTLATL CTRLHRNFRR 474 Q6UWI2 MVYKTLFALCILTAG...YGSWGNYNNPLYDDS 2 258 280 
QEVEHALSSG SIAAITVTVIAVVLLVFGVAAYL KIRHSSYGRL 475 Q923D3 MVCKVLIALCIFTAG...YGSWGNYNNPLYDDS 2 244 266 QEVENALSSG SIAAITVTVIAVVLLVFGGAAYL KIRHSSYGRL 476 Q9P2E7 MIVLLLFALLWMVEG...RAPYKPPYLTRKRIC 2 716 738 GGGETSLDLT LILIIALGSVSFIFLLAMIVLAV RCQKEKKLNI 477 Q96QU1 MFRQFYLWTCLASGI...VEGTEKQSHSQSTSL 2 1375 1397 KRGESLGYTE GALLALAFIIILCCIPAILVVLV SYRQFKVRQA 478 Q9HCL0 MHQMNAKMHFRFVFA...LVAEINKLLQDVRQS 2 698 720 MTSVSQASLD VSMIIIISLGAICAVLLVIMVLF ATRCNREKKD 479 Q8TAB3 MESLLLPVLLLLAIL...NKESPGVKRLKDIVL 2 679 701 QESMGSVNLS LIFIIALGSIAGILFVTMIFVAI KCKRDNKEIR 480 Q8N6Y1 MRGRGNARSSQALGV...DLHMRERKPMDISNI 2 889 911 RKVESVSCMP TLVALSVISLGSITLVTGMGIYI CLRKGEKHPR 481 Q9Y5F3 MAGTRRKSLQNRQVG...SQRLEGHDQVSDDYM 2 690 712 HSRKVNPSTK YLVISLVILSFLFLLSVIVIFII HVYQKIKYRE 482 Q9H158 MVGCGVAVLCLWVSC...EKKEKGNSTTDNSDQ 2 682 704 EPGGQLSAQN LYLVIALACISFLFLGCLLFFVC TKLHQSPGCC 483 Q9Y5F7 MLRKVRSWTEIWRWA...GGNGNKKKSGKKEKK 2 690 712 SAPREGESRL TLYLAVSLVAICFVSFGSFVALL SKCLRGAACG 484 O60245 MLRMRTAGWARGWCL...YSKQMRLHPYITVFG 2 878 900 DPSYEISKQR LSIVIGVVAGIMTVILIILIVVM ARYCRSKNKN 485 Q7TSK3 MSPAKRWGSPCLFPL...GSRYVSPKKGINENV 2 748 770 SGPSLQWDTP LIVIIVLAGSCTLLLAAIIAIAT TCNRRKKEVR 486 Q9HC56 MDLRDFYLLAALIAC...KQAGGATESPKEHQL 2 814 836 SQPYQNEDYL TIMIAIIAGAMVVIVVIFVTVLV RCRHASRFKA 487 Q92824 MGWGSRCCCPGRLDL...IDELEYDDESYSYYQ 2 1745 1764 RPATEHFKTA LFITSSMMLVLLLGAAVVVW KKSRGRVQPA 488 Q16549 MPKGRQKVPHLDAPL...HQHLDVPHGKEEQIC 2 13 35 KGRQKVPHLD APLGLPTCLWLELAGLFLLVPWV MGLAGTGGPD 489 Q9EP73 MRIFAGIIFTACCHL...DTSSKNRNDTQFEET 2 238 260 LPATHPPQNR THWVLLGSILLFLIVVSTVLLFL RKQVRMLDVE 490 Q9BQ51 MIFLLLMLSLELQLH...KRPVTTTKREVNSAI 2 221 243 SQMEPRTHPT WLLHIFIPFCIIAFIFIATVIAL RKQLCQKLYS 491 Q15116 MQIPQAPWPVVWAVL...AQPLRPEDGHCSWPL 2 168 190 PSPRPAGQFQ TLVVGVVGGLLGSLVLLVWVLAV ICSRAARGTI 492 Q9NZ53 MGRLLRAARLPPLLS...RDPEDSDVFEEDTHL 2 500 522 RASQVRSDYG TLFVVLVVIGAICIIIIALGLLY NCWQRRLPKL 493 P16284 MQPRWAQGATMWLGV...VESRYSRTEGSLDGT 2 603 625 RVILAPWKKG LIAVVIIGVIIALLIIAAKCYFL RKAKAKQMPV 494 P07202 
MRALAVLSVTLVMAC...AGMEGRDTHRLPRAL 2 849 871 VDSGRLPRVT WISMSLAALLIGGFAGLTSTVIC RWTRTGTKST 495 Q8IYJ0 MESRMWPALLLSHLL...GAPAFQLNRIPLVNL 2 181 203 GRGEGVDPQL YVTITISIIIVLVATGIIFKFCW DRSQKRRRPS 496 P01833 MLLFVLTCLLAVFPA...SSTVAAEAQDGPQEA 2 639 661 SSEEQGGSSR ALVSTLVPLGLVLAVGAVAVGVA RARHRKNVDR 497 Q969N2 MAAAMPLALLVLLLL...RLANLIRRARGVPPL 2 522 544 PLLVNLPTPD FSMPYNVICLTCTVVAVCYGSFY NLLTRTFHIE 498 Q9UKJ1 MGRPLLLPLLPLLLP...LKSPQNETLYSVLKA 2 196 218 RSDSWHISLE TAVGVAVAVTVLGIMILGLICLL RWRRRKGQQR 499 Q13018 MLLSPSLLLLLLLGA...LEENILISDLEKSDQ 2 1398 1420 ALPEKGPSHS IIPLAVVLTLIVIVAICTLSFCI YKHNGGFFRR 500 Q3TTY0 MELYPGVSPVGLLLL...VTQDAVSEKRLKAGN 2 1420 1442 LPDKAEEPSN ALYWAVPVAAIGGLAVGILGVML WRTVKPVQQE 501 Q9Z239 MASPGHILALCVCLL...GTFRSSIRRLSSRRR 2 35 57 EPDPFTYDYH TLRIGGLTIAGILFILGILIILS KRCRCKFNQQ 502 O75051 MEQRRPWPRALEVDS...AYKVEQLINAMSIES 2 1238 1260 VISDSLLTLP AIVSIAAGGSLLLIIVIIVLIAY KRKSRENDLT 503 Q9QY40 MLTDFLQAPVMAPWS...LQQVAALVEYKVTDL 2 1244 1266 ESMMSTFPVE AQLGLGMGAAVLIAAVLLLTLMY RHKSKKALRD 504 Q9QZC2 MEVSRRKTPPRPPYP...HVKVLFDEKKKCKWM 2 950 972 LYVEQESVPS TWYFLIALPILLAIVIVVAVVVT RYKSKELSRK 505 Q8TEM1 MAARGRGLLLLTLSV...ASPPSGLWSPAYASH 2 1809 1831 LFQHFLDSYQ VMFFTLFALLAGTAVMIIAYHTV CTPRDLAVPA 506 O00592 MRCALALSALLLLLS...NLTKDDLDEEEDTHL 2 461 483 PEEAEDRFSM PLIITIVCMASFLLLVAALYGCC HQRLSQRKDQ 507 Q8N131 MGLGARGAWAALLLG...RGIRYRTIDEHDAII 2 169 191 KKGSKFDTGS FVGGIVLTLGVLSILYIGCKMYY SRRGIRYRTI 508 P16471 MKENVASATVFTLLL...GLDYLDPACFTHSFH 2 236 258 IPSDFTMNDT TVWISVAVLSAVICLIIVWAVAL KGYSMVTCIF 509 P0DTF9 MCWLRAWGQILLPVF...ETVPIHDRSATVYDE 2 98 120 SIYWLNCKVD MFGIMMLLLIAVLITGFVWYCCA YHFYLQDLNR 510 P18433 MDSWFILVLLGSGLI...VQEYIDAFSDYANFK 2 152 174 DSKDRRDETP IIAVMVALSSLLVIVFIIIVLYM LRFKKYKQAG 511 B2RU80 MLRHGALTALWITLS...NVNPEYHRDAIYSRH 2 1620 1642 ITTESEPLFG VIEGVSAGLFLIGMLVALVAFFI CRQKASHSRE 512 P08575 MTMYLWLKLLAFGFA...HSVNGPASPALNQGS 2 580 602 HSTSYNSKAL IAFLAFLIIVTSIALLVVLYKIY DLHKKRSCNL 513 P23469 MEPLCPLLLVGFSLP...VQDFIDIFSDYANFK 2 47 69 GPPDPGASQP 
LLAWLLLPLLLLLLVLLLAAYFF RFRKQRKAVV 514 P23470 MRRLLEPCWWILFLK...IADESDPAESMESLV 2 737 759 ISRPAPGRME WIIPLIVVSALTFVCLILLIAVL VYWRGCNKIK 515 Q12913 MKPAAREARLPPRSP...LAPVTTFGKTNGYIA 2 974 996 VSLPQDPGVI CGAVFGCIFGALVIVTVGGFIFW RKKRKDAKNN 516 Q16849 MRRPRRPGGLGGSGG...AVAEEVNAILKALPQ 2 577 599 TAHSTSPMRS VLLTLVALAGVAGLLVALAVALC VRQHARQQDK 517 E9Q612 MGHLPRGTLGGRRLL...QFCISDVIYENVSKS 2 831 853 TMVTEVNPNV VVISVLAILSTLLIGLLLVTLVI LRKKHLQMAR 518 Q9UMZ3 MKKVPIKPEQPEKLR...AMEGDVELEWEETTM 2 1948 1970 GEGLSERTVE IILSVTLCILSIILLGTAIFAFA RIRQKQKEGG 519 Q15256 MRRAVCFPALCLLLN...ALCLYESRLSAETVQ 2 227 249 HEADKIWSKE GFYAVVIFLSIFVIIVTCLMILY RLKERFQLSL 520 Q99M80 MGSLGGLALCLLRLL...YKFVYEVALEYLSSF 2 772 791 KQVDNTVKMA GVIAGLLMFIIILLGVMLTI KRRKLAKKQK 521 Q92729 MARAQALVLALTFQL...CYDVALEYLEGLESR 2 747 769 EVSQRSEEMG LILGICAGGLAVLILLLGAIIVI IRKGRDHYAY 522 P70289 MRPLILLAALLWLQD...CLNSALRNRLPRARK 2 1078 1100 QASISLVAMP LTVMMGTVVGCIIIVCAVLCLLC RRGLKGPRSE 523 P15151 MARAMAAAWPLLLVA...RENSSSQDPQTEGTR 2 345 367 SEHSGISRNA IIFLVLGILVFLILLGIGIYFYW SKCSREVLWH 524 Q9NXS2 MRSGGRGRPRLRLGE...LCRILAVFLAEYLGL 2 33 55 LPPKRRLLPR VRLLPLLLALAVGSAFYTIWSGW HRRTEELPLG 525 Q8TD07 MRRISLTSSPVRLLL...QNGEWQAGLWPLRTS 2 226 248 IHWSSSSLPD RWIILGAFILLVLMGIVLICVWW QNGEWQAGLW 526 O75787 MAVFVVLLALVAGVL...DSIIYRMTNQKIRMD 2 309 331 YNFEYSVVFN MVLWIMIALALAVIITSYNIWNM DPGYDSIIYR 527 P07949 MAKATSGAAGLRLLL...MLSPSAAKLMDTFDS 2 13 32 KATSGAAGLR LLLLLLLPLLGKVALGLYFS RDAYWEKLYV 528 Q68DV7 MSGGHQLQLAALWPW...PGSEEELEELCEQAV 2 199 218 KEPPAWPDYD VWILMTVVGTIFVIILASVL RIRCRPRHSR 529 Q04912 MELLPPLPQSFLLLL...NVRRPRPLSEPPRPT 2 960 982 PDGVPQSTLL GILLPLLLLVAALATALVFSYWW RRKQLVLPPN 530 Q01974 MARGSALPRRPLLCI...CDTLQVDEAQVQLEA 2 403 425 SCSPRDSSKM GILYILVPSIAIPLVIACLFFLV CMCRNKQKAS 531 P08922 MKNIYCLIPKLVNFA...NYACLTHSGYGDGSD 2 1860 1882 LVGDDFWIPE TSFILTIIVGIFLVVTIPLTFVW HRRLKNQKSA 532 P04843 MEAPAAGLFLLLLLG...RQELVTKIDHILDAL 2 440 459 FNKVLMLQEP LLVVAAFYILFFTVIIYVRL DFSITKDPAA 533 Q9HBV2 MSPRGTGCSAGLLMT...TEMPGEDDALSEWNE 2 217 239 
MRRSSLPATD AALIFVLTIGVIICVFIIFLLIF IIINWAAVKA 534 Q96BY9 MAAACGPGAAGYCLL...TKTRTASGYGGTRRR 2 172 194 YYKWSSADSC NMSGLITIVVLLGIAFVVYKLFL SDGQYSPPPY 535 P21583 MKKTQTWILTCIYLQ...EISMLQEKEREFQEV 2 215 237 KNPPGDSSLH WAAMALPALFSLIIGFAFGALYW KKRQPSLTRA 536 Q9JL59 MLAYSVTSSGLFPRM...LIHGSPGIPYLTLPP 2 160 182 PDKPPTAVRT EVIIIIAIATTIIITGIGVFVWY KQFPVAPQIQ 537 Q8WVN6 MQTCPLAFPGHVSQA...ELLSPQPLFPYAADP 2 146 168 AEPQSAPDTG FWPVPAVVTAVFILLVALVMFAW YRCRCSQQRR 538 Q7Z5N4 MARGARPSAAGGGGG...GPGARTPLTGFSSFV 2 2008 2030 AQVEAPFYEE WWFLLVMALSSLIVILLVVFALV LHGQNKKYKN 539 Q9UBV2 MRVRIGLTLLLCAVL...APPQQEGPPEQQPPQ 2 739 761 MFTQLDMDQL LGPEWDLYLMTIIALLLGTVIAY RQRQHQDMPA 540 Q14242 MPLQLLLLLILLGPG...EDREGDDLTLHSFLP 2 321 343 APDHISVKQC LLAILILALVATIFFVCTVVLAV RLSRKGHMYP 541 Q9H3S1 MALPALGLDPWSLLG...SDVDADNNCLGTEVA 2 681 703 GAALAAQQSY WPHFVTVTVLFALVLSGALIILV ASPLRALRAR 542 Q92854 MRMCTPIRGLLMALA...VKCELKFADSDADGD 2 734 756 KTMYLKSSDN RLLMSLFLFFFVLFLCLFFYNCY KGYLPRQCLK 543 Q9Z123 MLARAERPRPGPRPP...RLTGAPLATCDETSI 2 665 687 QRGPANRAHT VVGAGLVGFFLGVLAASLTLLLI GRRQQRRRQR 544 Q9NTN9 MWGRLWPLLLSILTA...RKHTQLVEQLDESSV 2 679 701 LAPDVRLLYV LAIAALGGLCLILASSLLYVACL REGRRGRRRK 545 Q13591 MKGTCVIAWLFSSLG...YSNAYFTDLNNYDEY 2 971 993 KRCGEFNMFH MIAVGLSSSILGCLLTLLVYTYC QRYQQQSHDA 546 Q9H2E6 MRSEALLLYFTLLHF...FAPLSTSMKPNDACT 2 648 670 YLKGHDQLVP VTLLAIAVILAFVMGAVFSGITV YCVCDHRRKD 547 Q9H3T3 MQTPRASPPRPALLL...LLPYGGADRTAPPVP 2 602 624 GLVSVNLLVT SSVAAFVVGAVVSGFSVGWFVGL RERRELARRK 548 Q9WTM3 MPRAPHSMPLLLLLL...ASPPQPAPHGGHFNF 2 604 626 SPASASRSIP IPLLLACVAAAFALGASVSGLLV SCACRRANRR 549 Q8NFY4 MRVFLLCAYILLLMV...VPQTPSVRPLNKYTY 2 664 686 ESNQMVHMNV LITCVFAAFVLGAFIAGVAVYCY RDMFVRKNRK 550 Q53EL9 MRPVALLLLPSLLAL...TYETGSLSFAGDERI 2 926 948 AASSTLDAAH IAAAIFLPLVAMVLLVGGVYFYF SRLQGKSSLQ 551 Q16586 MAETLFWTPLLVVLL...PRVDSAQVPLILDQH 2 290 312 EAPDRDFLVD ALVTLLVPLLVALLLTLLLAYVM CCRREGRLKR 552 Q6UWI4 MWGARRSSVSSSWNA...PHTNSEQKMYPAVTV 2 114 136 KDGPDGSAVP IYVPFLIVGSVFVAFIILGSLVA ACCCRCLRPK 553 Q96DD7 
MPPAGLRRAAPLTAI...APPPYMPPQPSYPGA 2 86 108 KHCLAFSPKT IAGIASAVILFVAVVATTICCFL CSCCYLYRRR 554 Q6ZSJ9 MALRRLLLLLLLSLE...GHHTCYTASKTEVTV 2 176 195 KYDPEKDKTN FTVYITCGVIAFVIVAGVFA KVSYDKAHRP 555 B8ZZ34 MARAGARGLLGGRRP...GSRYLRTNSKTEVTV 2 141 163 RDPGRERSHT AVYAVCGVAALLVLAGIGARLGL ERAHSPRARR 556 Q3SXP7 MTSCGQQSLNVLAVL...DAHSPPLMTFQSSSA 2 99 121 EGYMHNNYTA LLGVWIYGFFVLMLLVLDLLYYS AMNYDICKVY 557 Q96LC7 MLLPLLLSSLLGGSQ...MPKGTQADYAEVKFQ 2 548 570 PDKKGLISTA FSNGAFLGIGITALLFLCLALII MKILPKRRTQ 558 Q08ET2 MLPLLLLPLLWGGSL...TRCGGPQQSRAERPG 2 359 381 KQQGSWPLVL TLIRGALMGAGFLLTYGLTWIYY TRCGGPQQSR 559 O43699 MQGAQEASASEMLPL...PKVTDTEYSEIKIHK 2 346 368 VHWKPEGRAG GVLGAVWGASITTLVFLCVCFIF RVKTRRKKAA 560 Q9NYZ4 MLLLLLLLPLLWGTK...ACLRNHNPSSKEVRG 2 362 384 TSRPVSQVTL AAVGGAGATALAFLSFCIIFIIV RSCRKKSARP 561 Q5JXA9 MCSTMSAPTCLAHLP...PAGAMNTLAWSKGQE 2 289 311 PATEMSPTGL LVVFAPVVLGLKAITLAALLLAL ATSRRSPGQE 562 Q9Y3P8 MNQADPRLRAVCLWT...SFPDQAYANSQPAAS 2 41 63 LALGIPSITQ AWGLWVLLGAVTLLFLISLAAHL SQWTRGRSRS 563 Q13291 MDPKGLLSLTFVLFL...TNSITVYASVTLPES 2 238 260 CRTDPSETKP WAVYAGLLGGVIMILIMVVILQL RRRGKTNHYQ 564 Q9QUM4 MDPKGSLSWRILLFL...PNPTTVYASVTLPES 2 243 265 KQESSSESSP WMQYTLVPLGVVIIFILVFTAII MMKRQGKSNH 565 Q9UIB8 MAQHHLWILLLCLQT...QDSKPPGTSSYEIVI 2 224 246 DIAMGFRTHH TGLLSVLAMFFLLVLILSSVFLF RLFKRRQGRI 566 Q96DU3 MLWLFQSLLFVFCFG...SKPTFSRATALDNVV 2 226 248 DVKIQYTDTK MILFMVSGICIVFGFIILLLLVL RKRRDSLSLS 567 Q9ET39 MAVSRAPAPDSACQR...QPITLKVNTLINYNS 2 240 262 KGVLTNPPWN AVWFMTTISIISAVILIFVCWSI HVWKRRGSLP 568 Q8BHK6 MARFSTYIIFTSVLC...AKPLVPRSLSFENVI 2 224 246 DAATDLTSLR GILYILCFSAVLILFAVLLTIFH TTWIKKGKGC 569 Q9D3G2 MWSLWSLLLFEALLP...ACTDGVLPETENALV 2 234 256 SGKASYKDVL LVVVPITLFLILAGLFGAWHHGL CSGKKKDACT 570 Q96A28 MCAFPWLLLLLLLQE...RMKLRKEAKPGSSPA 2 240 259 PSTAFCLLAK GLLIFLLLVILAMGLWVIRV QKRHKMPRMK 571 Q96PX8 MLLWILLLETSLCFA...GAHRVYDCGSHSLSD 2 621 643 DTSRVSISVL VPGLLLVFVTSAFTVVGMLVFIL RNRKRSKRRD 572 Q9H156 MLSGVWFLSVLTVAG...DYLEVLEKQTAISQL 2 622 644 LHTEVPLSVL ILGLLVVFILSVCFGAGLFVFVL KRRKGVPSVP 
573 Q3V0X1 MKPLKLFCIGLLLCP...SLEMQNMNLIKLFGG 2 83 105 FKNHLSDFFK SSIPPAAIFALFVTTAIMRAAIV NKRLEEPHRQ 574 Q62230 MCVLFSLLLLASVFS...ATKKNTIQEEVVAAL 2 1641 1663 LHQLQLFQRL LWVLGFLAGFLCLLLGLVAYHTW RKKSSTKLNE 575 Q96PQ0 MAHRGPSRASKGPGP...VLSINSREMHSYLVS 2 1078 1100 GTGAEQLGGG GGYWAVVVLFVIGLFAAGAFILY KFKRKRPGRT 576 Q9WU03 MAQLCELRRGRALLA...TADDKEQLVKNTCVL 2 198 220 MHPFLTPGLK AVILVGLFLMVLILLLGTSMVCL IRVVRRKQER 577 P43307 MRLLPRLLLLLLLVF...LPRKRAQKRSVGSDE 2 207 229 IEREDGLDGE TIFMYMFLAGLGLLVIVGLHQLL ESRKRKRPIQ 578 P43308 MRLLSFVVLALFAVT...YSSKRKYDTPKTKKN 2 147 169 REFDRRFSPH FLDWAAFGVMTLPSIGIPLLLWY SSKRKYDTPK 579 Q9NY15 MAGPRGLLPLCLLAF...LEEDFPDTQRILTVK 2 2477 2499 AVLAPEAPPV AAGVGAVLAAGALLGLVAGALYL RARGKPMGFG 580 Q8WWQ8 MMLQHLVIFCLGLVV...SEERQLEGNDPLRTL 2 2461 2483 APVTLTHTGL GAGIFFAIILVTGAVALAAYSYF RINRRTIGFQ 581 Q13586 MDVCVRLALWLLWGL...RKKFPLKIFKKPLKK 2 214 232 LLTRHNHLKD FMLVVSIVIGVGGCWFAYI QNRYSKEHMK 582 Q9UGT4 MKPALLPWALLLLAT...LRRRKGNTHVWGAQP 2 786 808 PKCQPGRSYA VLLGIIFGGLAVVAAVALVYVLL RRRKGNTHVW 583 Q5VX71 MYHGMNPSNGDGFLE...GIDIADEIPLMEEDP 2 317 339 QTWPSTHETL LTTWKIVAFTATSVLLVLLLVIL ARMFQTKFKA 584 O60279 MTAEGPSPPARWHRR...RQARHYHQQIEMEKV 2 576 598 GCPGLSRGPV IATIVTVLCLLLLLAGVGMVWGY RKCQHKSSVY 585 Q9UQF0 MALPYHIFLFTVLLP...SAAQPLLRPNSAGSS 2 455 477 MPWILPFLGP LAAIILLLLFGPCIFNLLVNFVS SRIEAVKLQM 586 Q24JP5 MCARMAGRTTAAPRG...PEELRNYMERIRGSS 2 851 873 QHVTELELGM YALLGVFCVAIFIFLVNGVVFVL RYQRKEPPDS 587 P40200 MEKKWKYCAVYYIIQ...EPNESDLPYHEMETL 2 518 540 TGIVVNKPKD GMSWPVIVAALLFCCMILFGLGV RKWCQYQKEI 588 B6A8C7 MIPKLLSLLCFRLCV...SRNVSPGESEAFKPE 2 234 256 TTSSNYSLGN FVRLGLAAVIVVIMGAFLVEAWY SRNVSPGESE 589 A0A1B0GTY4 MSNQRLPLIFSLLFI...CKLKKKSKEEGARRY 2 76 98 FANMDIFQGC LYLIYNLLQAVFFVLFVLSVHYL WKKWKKHQKK 590 P13726 METPAWPRVPRPETA...GVGQSWKENSPLNVS 2 252 274 MGQEKGEFRE IFYIIGAVVFVVIILVIILAISL HKCRKAGVGQ 591 P37173 MGRGLLRGLWPLHIV...SEEKIPEDGSLNTTK 2 167 189 NPDLLLVIFQ VTGISLLPPLGVAISVIIIFYCY RVNRQQKLSS 592 O43493 MRFVVALVLLNVAAA...RRPKASDYQRLDQKS 2 385 402 SGNGSAESSH 
FFAYLVTAAILVAVLYIA HHNKRKIIAF 593 Q9UPZ6 MGLQARRWASGSRGA...RLKPLTLAYDGDADM 2 1607 1629 PFGPDGRLKT WVYGVAAGAFVLLIFIVSMIYLA CKKPKKPQRR 594 Q9NS62 MKPMLKDFSNLLLVV...EDETTSTLSVEKLVI 2 414 436 QPQGPVKSNN IVTVTGISLCLFIIIATVLITLW RRFGRPAKCS 595 Q02763 MDSLASLVLCGVSLL...KFTYAGIDCSAEEAA 2 748 770 PADLGGGKML LIAILGSAGMTCLTVLLAFLIIL QLKRANVQRR 596 Q495A1 MRWCLLLIWAQGLRQ...SYRSLGNCSFFTETG 2 142 164 AEHGARFQIP LLGAMAATLVVICTAVIVVVALT RKKKALRIHS 597 Q96H15 MSKEPLILWLMIEFW...DVQHGREDEDGLFTL 2 314 336 MSMKNEMPIS QLLMIIAPSLGFVLFALFVAFLL RGKLMETYCS 598 Q8TB96 MAAAGRLPSSWALFS...REKRQEAHRFHFDAM 2 566 588 SAKLYLTPSN IVLLTAIALIGVCVFILAIIGIL HWQEKKADDR 599 Q6R5P0 MPRMERHQFCSVLLI...EQLKRRLSKAGQERD 2 719 741 FSFLATNCPH GTEFWGFLTSFILLLLLIILPLI SCPKWSWLHH 600 Q15399 MTSIFHFAIIFMLIL...LRAAINIKLTEQAKK 2 582 604 MSELSCNITL LIVTIVATMLVLAVTVTSLCSYL DLPWYLRMVC 601 Q9QUN7 MLRALWLFWILVAIT...QQEVFWVNLRTAIKS 2 588 610 ARPSVLECHQ AALVSGVCCALLLLILLVGALCH HFHGLWYLRM 602 O15455 MRQTLPCIYFWGGLL...RHKLQVALGSKNSVH 2 703 725 TSSCKDSAPF ELFFMINTSILLIFIFIVLLIHF EGWRISFYWN 603 O00206 MMSASRLAGTLIPAM...GTVGTGCNWQEATSI 2 634 656 NITCQMNKTI IGVSVLSVLVVSVVAVLVYKFYF HLMLLAGCIK 604 O60602 MGDHLDLLLGVVLMA...KKDNNIPLQTVATIS 2 644 666 VLKSLKFSLF IVCTVTLTLFLMTILTVTKFRGF CFICYKTAQR 605 Q9NYK1 MVFPMWTLKRQILIL...TDNHVAYSQVFKETV 2 843 865 CELDLTNLIL FSLSISVSLFLMVMMTASHLYFW DVWYIYHFCK 606 Q9NR97 MENMFLQSSMLTCIF...DSRYNNMYVDSIKQY 2 826 848 SLELTTCVSD VTAVILFFFTFFITTMVMLAALA HHLFYWDVWF 607 Q4V9L6 MVSAAAPSLLILLLL...PPESPCACSSVHPSV 2 95 117 DGIVDFFRQY VMLIAVVGSLAFLLMFIVCAAVI TRQKQKASAY 608 Q8N3G9 MAQAVWSRLGRILWL...GLLPPLYKSVKTYTV 2 340 362 IQVWPSRIQP AVFAFPCATLITVMLAFIMYMTL RNATQQKDMV 609 Q6P9G4 MQAPRAALVFALVIA...KEEKESNHNPSDSES 2 76 98 NFAPDENQLE FILMVLIPLILLVLLLLSVVFLA TYYKRKRTKQ 610 Q8WZ59 MLGCGIPALGLLLLL...GTEEGEETEGEEEED 2 82 104 PDENVRRKHM WALVWTCSGLLLLSCSICLFWWA KRRDVLHMPG 611 Q6UWW9 MSRSRLFSVTSAIST...PLGSPPPYEEIVKTT 2 49 71 CVNYNDQHPN GWYIWILLLLVLVAALLCGAVVL CLQCWLRRPR 612 A6NLX4 MAPGPWPVSCLRGGP...ASSEEPPPPPPLPPE 2 44 66 
CECSLGLSRE ALIALLVVLAGISASCFCALVIV AIGVLRAKGE
613 A2RRL7 MQRLPAATRATLILS...KLMKLTPDEPKDLQA 2 70 89 ARCCRTGVDE YGWIAAAVGWSLWFLTLILL CVDKLMKLTP
614 Q5T292 MNLGVSMLRILFLLD...RNHNFSKRDAQVIEL 2 37 59 TPGAEIDFKY ALIGTAVGVAISAGFLALKICMI RRHLFDDDSS
615 Q4KMG9 MGVRVHVVAASALLY...PPTEKESTRIVDSWN 2 42 64 EHCLTTDWVH LWYIWLLVVIGALLLLCGLTSLC FRCCCLSRQQ
616 Q9D2R4 MQIQTILLCFSFSFS...QMKHLKDFFIAKKLV 2 183 202 RITSEDTNRN VLWWAFAQILIFISVGIFQM KHLKDFFIAK
617 P0DPE3 MAARTLASALVLTLW...KMGAQSWGSGALDGL 2 220 242 RSPMGWAGPL ALGLLTGFVGALGTGALVVLLTL WITGGDGDRA
618 Q13445 MMAAGAALALALWLL...CTLKRFFQDKRPVPT 2 193 215 RNLQEGNLER VNFWSAVNVAVLLLVAVLQVCTL KRFFQDKRPV
619 Q15363 MVTLAELLVLLAALL...QIYYLKRFFEVRRVV 2 169 191 RAINDNTNSR VVLWSFFEALVLVAMTLGQIYYL KRFFEVRRVV
620 Q9Y3Q3 MGSTVPRSASVLLLL...SFFTEKRPISRAVHS 2 179 201 RARAEDLNSR VSYWSVGETIALFVVSFSQVLLL KSFFTEKRPI
621 Q8WW62 MSPLLFGAGLVVLNL...LFNVPTTTDTKKPRC 2 201 223 FFLIQSNYNY VNWWSTAQSLVIILSGILQLYFL KRLFNVPTTT
622 P49755 MSGLSGPPARRGPFP...VFYLRRFFKAKKLIE 2 186 208 RDTNESTNTR VLYFSIFSMFCLIGLATWQVFYL RRFFKAKKLI
623 Q9P0T7 MKLLSLVAVVGCLLV...QEQRKTVFDRHKMLS 2 90 112 YEERSTTTIK VIIVIYLSVVGALLLYMAFLMLV DPLIRKPDAY
624 O14668 MGRVFLTGEKANSIL...SASAIPMVPVVTTIK 2 84 106 RGSDWFQFYL TFPLIFGLFIILLVIFLIWRCFL RNKTRRQTVT
625 O14669 MRGHPSLLLLYMALT...HDAPPPPYTSLRRPH 2 110 132 GGRGRVDVAS LAVGLTGGILLIVLAGLGAFWYL RWRQHRGQQP
626 Q9BZD7 MAVFLEAKDAHSVLK...PKYEEIVAANPGADK 2 79 101 SVRDPSQSSD AMYVVVPLLGVALLIVIALFIIW RCQLQKATRH
627 Q9BZD6 MFTLLVLLSQLPTVT...KGFRVFKKSMSLPSH 2 118 140 NREKIDVMGL LTGLIAAGVFLVIFGLLGYYLCI TKCNRLQHPC
628 Q8NEW7 MAGWPGAGPLCVLGG...EEDEKNEAKKKKGEK 2 56 78 KETVVFWDMR LWHVVGIFSLFVLSIIITLCCVF NCRVPRTRKE
629 Q9D7L8 MVWKITGPLQACQLL...KLCGKKNDPNSETAL 2 217 239 FHLLVKDKVF VMPAEPIIAACVVVVLTMAFALF SRRKRIMKLC
630 Q96BF3 MGSPGMVLGLLVQIW...TQQPRPKGFPKVGEE 2 150 172 QNRNRIASFP GFLFVLLGVGSMGVAAIVWGAWF WGRRSCQQRD
631 G3X8R9 MEFLLLLSLALFSDA...ASQGPSMVSITLARI 2 151 173 KAVQKAEGSR MSILIICILITSLGIIFIISHLS RGRRSQRNRE
632 Q9DCF1 MELPLSQATLRHTLL...YIYRVSSVSSDEIWL 2 238 260 ATRIEVPLLG IVVAGGLALGTLVGFSTLVACLV CRKEKKTKGP
633 Q9BXS4 MAAPKGSLWVRTQLG...AGPLPTKVNLAHSEI 2 240 262 FLRCLSLNSG WILTTTLVLSVMVLLWICCATVA TAVEQYVPSE
634 Q9D5K1 MKTGAIVFILRSLLS...KALGGTDGSGGRTRL 2 221 243 KPHLPVWQRK VTSALGIGIVAGVVGGVLVSVAV FKALGGTDGS
635 A2RUT3 MLHVLASLPLLLLLV...TQRQIQIKGTSTQSG 2 64 86 CPGYWLGPGA SRIYPVAAVMITTTMLMICRKIL QGRRRSQATK
636 Q6UXU6 MSQAWVPGLAPTLLF...EEYTGDQRGIDNPAF 2 54 76 SCCQENELFP GPVRIFVIIFLVILSVFCICGLA KCFCRNCREP
637 B7ZWI3 MLDTWVWGTLTLTFG...EGPAGQMRGRAYATL 2 64 86 VWDPANDRFR FLVILACIIFPILFICALVSLFC PNCTELQHDV
638 Q3KNT9 MWRLALGGVFLAAAQ...SLLVESHHLQAKSGL 2 146 165 PGSQDLWEAK ILLLSIFGAFLLLGVLSLLV ESHHLQAKSG
639 Q9H3N1 MAPSGSLAVPLAVLV...IRQRSLGPSLATDKS 2 181 203 EDLGLPVWGS YTVFALATLFSGLLLGLCMIFVA DCLCPSKRRR
640 Q9Y320 MAVLAPLIALVYSVP...STPTTVSDGENKKDK 2 103 125 IFMFSKVANT ILFFRLDIRMGLLYITLCIVFLM TCKPPLYMGP
641 O35305 MAPRARRRRQLPAPL...AQTSLHTQGSGQCAE 2 212 234 RRPPKEAQAY LPSLIVLLLFISVVVVAAIIFGV YYRKGGKALT
642 Q9CR75 MASAWPRSLPQILVL...EETGGEGCPGVALIQ 2 79 101 AAPPAHFRLL WPILGGALSLVLVLALVSSFLVW RRCRRREKFT
643 Q92956 MEPPGDWGPPPWRST...VEETIPSFTGRSPNH 2 201 223 KAGAGTSSSH WVWWFLSGSLVIVIVCSTVGLII CVKRRKPRGD
644 Q80WM9 MEPLPGWGSAPWSQA...TEVGFAETEEETASN 2 208 230 STDTTCSSQV VYYVVSILLPLVIVGAGIAGFLI CTRRHLHTSS
645 Q9Y5U5 MAQHGAMGAFRALCG...ERSAEEKGRLGDLWV 2 165 187 PGSPPAEPLG WLTVVLLAVAACVLLLTSAQLGL HIWQLRSQCM
646 P20333 MAPVAVWAALAVGLE...KPLPLGVPDAGMKPS 2 258 280 SPPAEGSTGD FALPVGLIVGVTALGLLIIGVVN CVIMTQVKKK
647 Q93038 MEQRPRGCAAVAAAL...DGCVEDLRSRLQRGP 2 200 222 RCAAVCGWRQ MFWVQVLLAGLVVPLLLGATLTY TYRHCWPHKP
648 P83626 MTRLRLLLLLGLLLR...YCKRGENIQLSSTML 2 163 185 SGSQCFCFSK PLGIVVIIAAFIIIIGAVIILIL KIICYCKRGE
649 P36941 MLLPWATSAPGLAWG...TPSNRGPRNQFITHD 2 226 248 PLPPEMSGTM LMLAVLLPLAFFLLLATVFSCIW KSHPSLCRKL
650 P25942 MVRLPLQCVLWGCLL...QEDGKESRISVQERQ 2 193 215 DVVCGPQDRL RALVVIPIIFGILFAILLVLVFI KKVAKKPTNK
651 P25446 MLWIWAVLPLVLAGS...STPDTGNENEGQCLE 2 170 187 NCRKQSPRNR LWLLTILVLLIPLVFIYR KYRKRKCWKR
652 P28908 MRVLLAALGLLFLGA...EEGKEDPLPTAASGK 2 386 408 STGKPVLDAG PVLFWVILVLVVVVGSSAFLLCH RRACRKRIRQ
653 Q07011 MGNSCYNIVATLLLV...CSCRFPEEEEGGCEL 2 191 213 PGHSPQIISF FLALTSTALLFLLFFLTLRFSVV KRGRKKLLYI
654 Q13641 MPGGCSRGPAAGDGR...NADPRLTNLSSNSDV 2 354 376 CDPILPPSLQ TSYVFLGIVLALIGAIFLLVLYL NRKGIKKWMH
655 P40238 MPSWALFMVTSCLLL...IANHSYLPLSYWQQP 2 491 513 TRVETATETA WISLVTALHLVLGLSAVLGLLLL RWQFPAHYRR
656 Q9BX59 MGTQEGWCLLLCLAL...SHLHEDRTARVSQPS 2 407 426 TQVVPPERRT ALGVIFASSLFLLALMFLGL QRRQAPTGLG
657 O15533 MKSLSLLLAVALGLA...AVYLSTCKDSKKKAE 2 414 436 GLSGPSLEDS VGLFLSAFLLLGLFKALGWAAVY LSTCKDSKKK
658 O00220 MAPPPARVHLGAFLA...FIYLEDGTGSAVSLE 2 240 262 VHKESGNGHN IWVILVVTLVVPLLLVAVLIVCC CIGSGCGGDP
659 O14763 MEQRGQNAPAASGAR...GKFMYLEGNADSAMS 2 209 231 TSSPGTPASP CSLSGIIIGVTVAAVVLIVAVFV CKSLLWKKVL
660 Q9UBN6 MGLWGQSVPTASSAR...LFYEEDEAGSATSCL 2 210 232 TTILGMLASP YHYLIIIVVLVIILAVVVVGFSC RKKFISYLKG
661 Q9JKE1 MSPLLLWLGLMLCVS...QPSKTSKVQGVSEKQ 2 138 160 LAWCQGKPVM VIVLTCGFILNKGLVFSVLFVFL CKAGPKVLQP
662 Q8K558 MDCYLLLLLLLLGLA...EPVQDPPNSQTPPSK 2 174 196 PHEFRRRENS IPLIWGAVLLLALVVVAVVIFAV MARKKGNRLV
663 Q5T2D2 MAPAFLLLLLLWPQG...DPPGRPEPYVEVYLI 2 267 289 SMPSIRHQDV YSTVLGVVLTLLVLMLIMVYGFW KKRHMASYSM
664 Q3LRV9 MAWRYSQLLLVPVQL...TVSHISGYEKKANWY 2 200 222 PGWTSPGLLV SVQYGLLLLKALMLSVFCVLLCW RSGQGREYMA
665 Q5BVD1 MDLAQPSQPVDELEL...CESLGLDPTSLLLYE 2 66 88 CNKNVVGRCK LWMIITSIFLGVITVIIIGLCLA AVTYVDEDEN
666 Q9P2J2 MVWCLGLAVLSLVIS...AYRQPVPHPEQATLL 2 738 760 PGLLPQPVLA GVVGGVCFLGVAVLVSILAGCLL NRRRAARRRR
667 Q9UPX0 MIWYVATFIASVIGT...PAPATSPPERALSKL 2 727 749 DGLARPVLAG IVATICFLAAAILFSTLAACFVN KQRKRKLKRK
668 Q96J42 MVPAAGRRPPRVMRL...SIRWLIPGQEQEHVE 2 324 342 LPSTLIKSVD WLLVFSLFFLISFIMYATI RTESIRWLIP
669 P0DTE4 MLNNLLLFSLQISLI...SCQKFGKIGKKKKRE 2 491 513 LTWFQYHSLD VIGFLLVCVTTAIFLVIQCCLFS CQKFGKIGKK
670 Q8IZJ1 MGARSGARGALLLAL...GKSEMLVAVATDGDC 2 376 398 HLLEASGDAA LYAGLVVAIFVVVAILMAVGVVV YRRNCRDFDT
671 Q80YF6 MVRTRWQPPLRALLL...VGCEPGLDPLPSLSP 2 196 218 IDTWPGRRSG CMIVITSILSALAGLLLLAFLAA STTRFSSLWW
672 B0FP48 MDNSWRLGPAIGLSA...YTTHLAFSTPAEGAS 2 203 225 AVPGPQSPGT VVIIAILSILLAVLLTVLLAVLI YTCFNSCRST
673 Q5DID0 MLRTSGLALLALVSA...FKIQSNNFSYQVFYE 2 1271 1293 PPHAEAGLGA GYVVLIVVAIFVLVAGTATLLIV RYQRMNGRYN
674 O75445 MNCPVLSLGSGFLFQ...SSVTKERTTFTDTHL 2 5041 5063 RSKSTEFYSE LWFIVLMAMLGLILLAIFLSLIL QRKIHKEPYI
675 P29533 MPVKMVAVLGASTVL...MKGSYSLVEAQKSKV 2 699 721 EHNKDYFSPE LLALYCASSLVIPAIGMIVYFAR KANMKGSYSL
676 Q9H7M9 MGVPTALEAGSWRWG...PSLDPVPDSPNFEVI 2 194 216 SSQDSENITA AALATGACIVGILCLPLILLLVY KQRQAASNRR
677 Q96AW1 MRRQPAKVAALLLGL...CNTPPPPYEQVVKAK 2 59 81 RCCVRALSIQ RLWYFWFLLMMGVLFCCGAGFFI RRRMYPPPLI
678 Q8N0Z9 MAAGGSAPEPRVLVC...SEEQSDIVQEEDRPV 2 412 434 WLSVKEPLNI GGIVGTIVSLLLLGLAIISGLLL HYSPVFCWKV
679 Q86XK7 MVFAFWKVFLILSCL...VVEPLSEDEKGVVKA 2 234 256 IDLTSSHPEV GIIVGALIGSLVGAAIIISVVCF ARNKAKAKAK
680 Q96IQ7 MAELPGPFLCGALLG...ASTVTTTKSKLPMVV 2 242 264 LSVTEPSQGR VAGALIGVLLGVLLLSVAAFCLV RFQKERGKKP
681 Q9Z109 MAWPLVGAFLCGHLL...ASTMTTTKSKLSMVV 2 243 265 LSVTDSSEGR VAGTLIGVLLGVLLLSVAAFCLI RFQKERKKEP
682 Q9Y279 MGILLGLLLLGHLTV...PLDYEFLATEGKSVC 2 284 306 TSAGPGKSLP VFAIILIISLCCMVVFTMAYIML CRKTSQQEHV
683 P0DPA2 MRVGGAFHLLLVCLS...DCAEGPVQCKNGLLV 2 265 287 KVSDSRRIGV IIGIVLGSLLALGCLAVGIWGLV CCCCGGSGAG
684 Q8IW00 MRLLALAAAALLARA...TSTVYAQILFEENKL 2 179 201 ETWAFFEDLY VYAVLVCCVGILSILLFMLVIVW QSVFNKRKSR
685 A8MXK1 MRPLPSGRRKTRGIS...KESTTEEIELEDVEC 2 148 170 VSEILYEDLH FVAVILAFLAAVAAVLISLMWVC NKCAYKFQRK
686 P55808 MESWWGLPCLAFLCF...NNRRNCFRTHEPENV 2 143 165 GNPEGNMVAK IVSPIVSVVVVTLLGAAASYFKL NNRRNCFRTH
687 Q9Y493 MVPPVWTLLLLVGAA...LARLVDTDTVLDCAC 2 2756 2778 PRDAPPPRKP ASNLVGVLLGLLVPVVVVLLAVT RECIYRTRRK
688 Q9ULT6 MRPRSGGRPGATGRR...AGPRSHSADSSSPGA 2 216 238 QHRPPRQPTE YFDMGIFLAFFVVVSLVCLILLV KIKLKQRRSQ
689 Q8WWF5 MPLCRPEHLMPRASR...EYTTVSSAPPEAPGQ 2 255 277 DLGCHPVLTV SWVLGCTLALVVSAFFVLNHLWL WAQACCSHRR
690 P60852 MAGGSATTWGYPVAL...LSQTWAQKLWESNRQ 2 602 624 DSNGNSSLRP LLWAVLLLPAVALVLGFGVFVGL SQTWAQKLWE
691 P20239 MARWQRKASVSSPCG...FICYLYKKRTIRFNH 2 684 703 IIAKDIASKT LGAVAALVGSAVILGFICYL YKKRTIRFNH
692 P21754 MELSYRLFICLLLWG...TRRCRTASHPVSASE 2 387 409 EQWALPSDTS VVLLGVGLAVVVSLTLTAVILVL TRRCRTASHP
693 Q12836 MWLLRCVLLCVSLSL...LAVKKQKSCPDQMCQ 2 506 528 EKLRVPVDSK VLWVAGLSGTLILGALLVSYLAV KKQKSCPDQM
694 Q8TCW7 MEQIWLLLLLTIRVL...PTSLVLNGIRNPVFD 2 374 396 PFQLNAITSA LISGMVILGVTSFSLLLCSLALL HRKGPTSLVL
And create their feature matrix using the
SequenceFeature class:
    sf = aa.SequenceFeature()
    df_parts = sf.get_df_parts(df_seq=df_seq)
    X = sf.feature_matrix(features=df_feat["feature"], df_parts=df_parts)
Using a list comprehension, labels for all three distance-based identification approaches can be retrieved using the dPULearn().fit() method with different metric parameters:
    dpul = aa.dPULearn()
    # List with valid distance measures
    list_metrics = ["manhattan", "euclidean", "cosine"]
    list_labels = [dpul.fit(X=X, labels=labels, metric=metric, n_unl_to_neg=n_pos).labels_ for metric in list_metrics]
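Why the three metrics can yield different label sets is easier to see on toy data. The following is a minimal, library-free sketch and not dPULearn's implementation: it assumes that distance-based identification ranks unlabeled samples by their mean distance to the positive samples. The helper farthest_unlabeled, the toy matrix, and the label coding (1 for positives, 2 for unlabeled) are illustrative assumptions.

```python
import math

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine(a, b):
    # Cosine distance = 1 - cosine similarity (depends on direction, not length)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

METRICS = {"manhattan": manhattan, "euclidean": euclidean, "cosine": cosine}

def farthest_unlabeled(X, labels, metric, n):
    """Hypothetical helper: indices of the n unlabeled rows with the largest
    mean distance to the positive rows (assumed ranking rule)."""
    dist = METRICS[metric]
    pos = [row for row, lab in zip(X, labels) if lab == 1]
    unl_idx = [i for i, lab in enumerate(labels) if lab != 1]
    mean_dist = {i: sum(dist(X[i], p) for p in pos) / len(pos) for i in unl_idx}
    return sorted(unl_idx, key=lambda i: mean_dist[i], reverse=True)[:n]

# Toy data: rows 0-1 are positives (1), rows 2-4 are unlabeled (2)
X_toy = [[1.0, 1.0], [1.2, 0.9], [1.1, 1.0], [5.0, 5.0], [0.2, 4.0]]
labels_toy = [1, 1, 2, 2, 2]

picked = {m: farthest_unlabeled(X_toy, labels_toy, m, n=1) for m in METRICS}
print(picked)
```

In this toy case the Manhattan and Euclidean distances select row 3, which is far away in magnitude, whereas the cosine distance selects row 4, which points in a different direction. Differences of this kind are what the evaluation measures are meant to compare.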
For the PCA-based identification, use the n_components parameter:
    # List with percentage of total variance to be explained [0-1]
    list_pca_var = [0.6, 0.7, 0.8, 0.9, 0.95]
    list_labels.extend([dpul.fit(X=X, labels=labels, n_components=i, n_unl_to_neg=n_pos).labels_ for i in list_pca_var])
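A fractional n_components is described above as the share of total variance to be explained. The sketch below shows how such a threshold could map to a number of principal components, assuming the common convention of keeping the fewest leading components whose cumulative explained-variance ratio exceeds the threshold. The ratios are made up for illustration, the helper n_components_for is hypothetical, and whether dPULearn follows exactly this convention is an assumption.

```python
from itertools import accumulate

def n_components_for(explained_ratios, threshold):
    """Smallest number of leading components whose cumulative
    explained-variance ratio exceeds the threshold (assumed convention)."""
    for k, cumulative in enumerate(accumulate(explained_ratios), start=1):
        if cumulative > threshold:
            return k
    return len(explained_ratios)

# Made-up explained-variance ratios, sorted in decreasing order (sum to 1)
toy_ratios = [0.42, 0.21, 0.14, 0.09, 0.07, 0.04, 0.03]
list_pca_var = [0.6, 0.7, 0.8, 0.9, 0.95]

n_per_threshold = {t: n_components_for(toy_ratios, t) for t in list_pca_var}
print(n_per_threshold)  # {0.6: 2, 0.7: 3, 0.8: 4, 0.9: 5, 0.95: 6}
```

Higher thresholds retain more components, so each entry of list_pca_var can lead to a different set of identified negatives.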
Now, the dPULearn().eval() and dPULearnPlot().eval() methods can be used:
    names = list_metrics + [f"pca (var {int(x*100)}%)" for x in list_pca_var]
    df_eval = dpul.eval(X=X, list_labels=list_labels, names_datasets=names)
    dpul_plot = aa.dPULearnPlot()
    dpul_plot.eval(df_eval=df_eval)
    plt.tight_layout()
    plt.show()
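The names passed as names_datasets presumably pair with list_labels by position, so both lists should be built in the same order (distance metrics first, then PCA thresholds). A small standalone check of that pairing, in which placeholder strings stand in for the label arrays returned by dPULearn().fit():

```python
list_metrics = ["manhattan", "euclidean", "cosine"]
list_pca_var = [0.6, 0.7, 0.8, 0.9, 0.95]

# Placeholders (assumption): stand-ins for the label arrays from dPULearn().fit()
list_labels = [f"labels({m})" for m in list_metrics]
list_labels.extend(f"labels(pca {v})" for v in list_pca_var)

# Same construction as in the example above
names = list_metrics + [f"pca (var {int(x*100)}%)" for x in list_pca_var]

# One name per label set, in matching order
assert len(names) == len(list_labels)
for name, lab in zip(names, list_labels):
    print(f"{name:>14} <- {lab}")
```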
Extend the analysis by using a dataset of ground-truth negatives:
    _df_seq = aa.load_dataset(name="DOM_GSEC")
    # First 14 entries are ground-truth non-substrates
    df_neg = _df_seq[_df_seq["label"] == 0].head(14)
    df_parts_neg = sf.get_df_parts(df_seq=df_neg)
    X_neg = sf.feature_matrix(features=df_feat["feature"], df_parts=df_parts_neg)
    # Perform evaluation and visualization
    df_eval = dpul.eval(X=X, list_labels=list_labels, names_datasets=names, X_neg=X_neg)
    dpul_plot.eval(df_eval=df_eval)
    plt.show()
The Kullback-Leibler Divergence (KLD) can serve as a complementary measure alongside the adjusted area under the curve (AUC) for evaluating the dissimilarity between sets of identified negatives and reference datasets: positive samples (‘Pos’), unlabeled samples (‘Unl’), and, when available, ground-truth negative samples (‘Neg’). Set comp_kld=True to include it:
    df_eval = dpul.eval(X=X, list_labels=list_labels, names_datasets=names, X_neg=X_neg, comp_kld=True)
    dpul_plot.eval(df_eval=df_eval)
    plt.show()
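For readers unfamiliar with the KLD, the textbook definition for two discrete distributions is KL(P || Q) = sum_i p_i * log(p_i / q_i). The sketch below only illustrates this definition on made-up two-bin histograms; how dPULearn estimates the distributions from continuous feature values is not described here, so the histogram representation and the helper kl_divergence are assumptions.

```python
import math

def kl_divergence(p, q):
    """KL(P || Q) in nats for two discrete distributions over the same bins.
    Requires q_i > 0 wherever p_i > 0; bins with p_i == 0 contribute nothing."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Made-up histograms of one feature for three sample groups
p_neg = [0.5, 0.5]    # identified negatives
q_same = [0.5, 0.5]   # a reference group with the same distribution
q_pos = [0.9, 0.1]    # a reference group with a shifted distribution

print(kl_divergence(p_neg, q_same))  # 0.0 for identical distributions
print(kl_divergence(p_neg, q_pos))   # positive: the distributions differ
print(kl_divergence(q_pos, p_neg))   # differs from the line above: KLD is asymmetric
```

A larger KLD indicates that the two distributions are less aligned. Because the measure is asymmetric, the direction of the comparison matters.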
The legend can be turned off with legend=False or shifted along the y-axis using legend_y, which is handy if figsize is changed:
    dpul_plot.eval(df_eval=df_eval, figsize=(8, 5), legend_y=-0.1)
    plt.show()
You can customize the list of colors used:
    dpul_plot.eval(df_eval=df_eval, colors=["r", "g", "b", "y"])
    plt.show()
Customize the x-limits of each subplot using the dict_xlims parameter:
    dict_xlims = {0: (0, 0.2), 3: (0, 0.5)}  # Adjust first and fourth subplot
    dpul_plot.eval(df_eval=df_eval, dict_xlims=dict_xlims)
    plt.show()
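Since dict_xlims is keyed by subplot position, a quick check before plotting can catch typos. The sketch below uses a hypothetical helper check_dict_xlims, which is not part of aaanalysis. It assumes four subplots indexed 0 to 3, consistent with the example above, and that each value is an (xmin, xmax) tuple with xmin < xmax.

```python
def check_dict_xlims(dict_xlims, n_subplots=4):
    """Hypothetical validator: return dict_xlims unchanged if every key is a
    subplot index in 0..n_subplots-1 and every value is (xmin, xmax), xmin < xmax."""
    if dict_xlims is None:
        return None  # None means auto-scaled x-axis limits
    for key, lims in dict_xlims.items():
        if not isinstance(key, int) or not 0 <= key < n_subplots:
            raise ValueError(f"key {key!r} is not a subplot index in 0..{n_subplots - 1}")
        if len(lims) != 2 or not lims[0] < lims[1]:
            raise ValueError(f"limits for subplot {key} should be (xmin, xmax), got {lims!r}")
    return dict_xlims

# The mapping from the example above passes the check
dict_xlims = check_dict_xlims({0: (0, 0.2), 3: (0, 0.5)})
print(dict_xlims)
```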