UniProtKB/Swiss-Prot entry P33478


Entry information
Entry name POLG_DEN1S
Primary accession number P33478
Secondary accession numbers None
Integrated into Swiss-Prot on February 1, 1994
Sequence was last modified on February 1, 1994 (Sequence version 1)
Annotations were last modified on November 25, 2008 (Entry version 84)
Name and origin of the protein
Protein name Genome polyprotein
Synonyms None
Contains Protein C
     (Core protein)
     (Capsid protein)
prM
Peptide pr
Small envelope protein M
     (Matrix protein)
Envelope protein E
Non-structural protein 1
     (NS1)
Non-structural protein 2A
     (NS2A)
Non-structural protein 2A-alpha
     (NS2A-alpha)
Serine protease subunit NS2B
     (Non-structural protein 2B)
Serine protease subunit NS3
     (EC 3.4.21.91)
     (Non-structural protein 3)
Non-structural protein 4A
     (NS4A)
Peptide 2k
Non-structural protein 4B
     (NS4B)
RNA-directed RNA polymerase NS5
     (EC 2.7.7.48)
     (EC 2.1.1.56)
     (Non-structural protein 5)
Gene name None
From
Dengue virus type 1 (strain Singapore/S275/1990) (DENV-1) [TaxID: 33741] 
Taxonomy Viruses; ssRNA positive-strand viruses, no DNA stage; Flaviviridae; Flavivirus; Dengue virus group.
Virus hosts Aedes aegypti (Yellowfever mosquito) [TaxID: 7159]
Homo sapiens (Human) [TaxID: 9606]
Protein existence 3: Inferred from homology;
References
[1]
NUCLEOTIDE SEQUENCE [GENOMIC RNA].
DOI=10.1016/0042-6822(92)90560-C; PubMed=1585663
Fu J., Tan B.H., Yap E.H., Chan Y.C., Tan Y.H.;
"Full-length cDNA sequence of dengue type 1 virus (Singapore strain S275/90).";
Virology 188:953-958(1992).
Comments
  • FUNCTION: Protein C packages viral RNA to form a viral nucleocapsid, and promotes virion budding (By similarity).
  • FUNCTION: prM acts as a chaperone for envelope protein E during intracellular virion assembly by masking and inactivating envelope protein E fusion peptide. prM is matured in the last step of virion assembly, presumably to avoid catastrophic activation of the viral fusion peptide induced by the acidic pH of the trans-Golgi network. After cleavage by host furin, the pr peptide is released in the extracellular medium and small envelope protein M and envelope protein E homodimers are dissociated (By similarity).
  • FUNCTION: Envelope protein E binds cell surface receptor and is involved in membrane fusion between virion and target cell. Synthesized as a heterodimer with prM, which acts as a chaperone for envelope protein E. After cleavage of prM, envelope protein E dissociates from small envelope protein M and homodimerizes (By similarity).
  • FUNCTION: Non-structural protein 1 is slowly secreted from mammalian cells, but not from mosquito cells. Secreted form elicits protective immune response and plays an essential role in RNA replication. Soluble and membrane-associated NS1 may activate human complement and induce host vascular leakage. This effect might explain the clinical manifestations of dengue hemorrhagic fever and dengue shock syndrome (By similarity).
  • FUNCTION: Non-structural protein 2B is a required cofactor for the serine protease function of NS3 (By similarity).
  • FUNCTION: Serine protease NS3 displays three enzymatic activities: serine protease, NTPase and RNA helicase. NS3 serine protease, in association with NS2B, cleaves the polyprotein at dibasic sites in the cytoplasm: C-prM, NS2A-NS2B, NS2B-NS3, NS3-NS4A, NS4A-2K and NS4B-NS5. NS3 RNA helicase binds RNA and unwinds dsRNA in the 3' to 5' direction (By similarity).
  • FUNCTION: Non-structural protein 4A plays a role in RNA replication. Enhances inhibition of cell antiviral response by non-structural protein 4B (By similarity).
  • FUNCTION: Non-structural protein 4B prevents the establishment of cellular antiviral state by blocking the interferon-alpha/beta (IFN-alpha/beta) and IFN-gamma signaling pathways (By similarity).
  • FUNCTION: RNA-directed RNA polymerase NS5 replicates the viral (+) and (-) genome, and ensures the capping of genomes in the cytoplasm. May be involved in methylation of the 5' RNA cap structure (By similarity).
  • CATALYTIC ACTIVITY: Selective hydrolysis of -Xaa-Xaa-|-Yaa- bonds in which each of the Xaa can be either Arg or Lys and Yaa can be either Ser or Ala.
  • CATALYTIC ACTIVITY: Nucleoside triphosphate + RNA(n) = diphosphate + RNA(n+1).
  • CATALYTIC ACTIVITY: S-adenosyl-L-methionine + G(5')pppR-RNA = S-adenosyl-L-homocysteine + m7G(5')pppR-RNA.
  • SUBUNIT: prM and envelope protein E form heterodimers in the endoplasmic reticulum and Golgi. Envelope protein E forms homodimers. NS1 forms homodimers as well as homohexamers when secreted. NS1 may interact with NS4A. NS3 and NS2B form a heterodimer. NS3 interacts with unphosphorylated NS5 (By similarity).
  • SUBCELLULAR LOCATION: Protein C: Virion (By similarity).
  • SUBCELLULAR LOCATION: Peptide pr: Secreted (By similarity).
  • SUBCELLULAR LOCATION: Small envelope protein M: Virion membrane; Single-pass type I membrane protein (By similarity).
  • SUBCELLULAR LOCATION: Envelope protein E: Virion membrane; Single-pass type I membrane protein (By similarity).
  • SUBCELLULAR LOCATION: Non-structural protein 1: Secreted. Endoplasmic reticulum membrane; Peripheral membrane protein; Lumenal side (By similarity).
  • SUBCELLULAR LOCATION: Non-structural protein 2A-alpha: Endoplasmic reticulum membrane (By similarity).
  • SUBCELLULAR LOCATION: Non-structural protein 2A: Endoplasmic reticulum membrane (By similarity).
  • SUBCELLULAR LOCATION: Serine protease subunit NS2B: Endoplasmic reticulum membrane; Peripheral membrane protein; Cytoplasmic side (By similarity).
  • SUBCELLULAR LOCATION: Serine protease subunit NS3: Endoplasmic reticulum membrane; Peripheral membrane protein; Cytoplasmic side (By similarity).
  • SUBCELLULAR LOCATION: Non-structural protein 4A: Endoplasmic reticulum membrane; Peripheral membrane protein; Cytoplasmic side (By similarity).
  • SUBCELLULAR LOCATION: Non-structural protein 4B: Endoplasmic reticulum membrane; Multi-pass membrane protein (By similarity). Note=The C-terminal transmembrane domain of non-structural protein 4B is presumably reoriented after cleavage on the lumenal side (By similarity).
  • SUBCELLULAR LOCATION: RNA-directed RNA polymerase NS5: Endoplasmic reticulum membrane; Peripheral membrane protein; Cytoplasmic side. Nucleus (By similarity).
  • DOMAIN: Transmembrane domains of the small envelope protein M and envelope protein E contain endoplasmic reticulum retention signals (By similarity).
  • PTM: Specific enzymatic cleavages in vivo yield mature proteins. The nascent protein C contains a C-terminal hydrophobic domain that acts as a signal sequence for translocation of prM into the lumen of the ER. Mature protein C is cleaved at a site upstream of this hydrophobic domain by NS3. prM is cleaved in post-Golgi vesicles by a host furin, releasing the mature small envelope protein M and peptide pr. Non-structural protein 2A-alpha, a C-terminally truncated form of non-structural protein 2A, results from partial cleavage by NS3 (By similarity).
  • PTM: RNA-directed RNA polymerase NS5 is phosphorylated on serine residues. This phosphorylation may trigger NS5 nuclear localization (By similarity).
  • PTM: Envelope protein E and non-structural protein 1 are N-glycosylated (By similarity).
  • MISCELLANEOUS: The virion is assembled in the endoplasmic reticulum lumen, transported by vesicles to the Golgi, then transported again to the cell membrane where it is released outside the cell.
  • SIMILARITY: Contains 1 helicase ATP-binding domain.
  • SIMILARITY: Contains 1 helicase C-terminal domain.
  • SIMILARITY: Contains 1 peptidase S7 domain.
  • SIMILARITY: Contains 1 RdRp catalytic domain.
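
The cleavage specificity listed under CATALYTIC ACTIVITY above (two consecutive Arg or Lys residues followed by Ser or Ala) can be written as a simple pattern search. The following is a minimal sketch, not part of the entry: the function name candidate_dibasic_sites and its start parameter are hypothetical, and the two fragments are residues 91-110 and 191-210 copied from this entry's sequence. A motif match only marks a candidate; it does not show that a bond is actually cleaved, or by which protease.

```python
import re

# Two basic residues (Arg/Lys) followed by Ser or Ala. The lookahead makes
# the match zero-width so that overlapping candidates are all reported.
DIBASIC = re.compile(r"(?=[RK][RK][SA])")


def candidate_dibasic_sites(fragment, start=1):
    """Return polyprotein positions of the residue just before each
    candidate scissile bond (-Xaa-Xaa-|-Yaa-).

    `start` is the polyprotein position of fragment[0] (hypothetical
    parameter, used here so that fragments keep entry numbering).
    """
    # m.start() is the 0-based index of the first basic residue; the bond
    # follows the second basic residue, one position further on.
    return [start + m.start() + 1 for m in DIBASIC.finditer(fragment)]


# Residues 91-110 of this entry: ...NRRKR|SVTM...
print(candidate_dibasic_sites("MLNIMNRRKRSVTMLLMLLP", start=91))   # [100]

# Residues 191-210 of this entry: ...RRDKR|SVAL...
print(candidate_dibasic_sites("YGTCSQTGEHRRDKRSVALA", start=191))  # [205]
```

The first hit corresponds to the 100/101 site that the feature table assigns to serine protease NS3. The second hit is the 205/206 site, which the feature table assigns to host furin, so the pattern alone cannot distinguish the two proteases.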
Copyright
Copyrighted by the UniProt Consortium, see http://www.uniprot.org/terms. Distributed under the Creative Commons Attribution-NoDerivs License.
Cross-references
Sequence databases
EMBL M87512; -; NOT_ANNOTATED_CDS; Genomic_RNA.
PIR A42551; A42551.
3D structure databases
HSSP Q88653; 1L9K.
SMR P33478; 21-100, 281-673, 1651-2093, 2499-2759.
ModBase P33478.
Protein family/group databases
MEROPS S07.001; -.
Ontologies
GO
GO:0005789; Cellular component: endoplasmic reticulum membrane (inferred from electronic annotation from UniProtKB-SubCell).
GO:0005576; Cellular component: extracellular region (inferred from electronic annotation from UniProtKB-KW).
GO:0016021; Cellular component: integral to membrane (inferred from electronic annotation from InterPro).
GO:0005634; Cellular component: nucleus (inferred from electronic annotation from UniProtKB-KW).
GO:0030529; Cellular component: ribonucleoprotein complex (inferred from electronic annotation from UniProtKB-KW).
GO:0019028; Cellular component: viral capsid (inferred from electronic annotation from InterPro).
GO:0019031; Cellular component: viral envelope (inferred from electronic annotation from InterPro).
GO:0005524; Molecular function: ATP binding (inferred from electronic annotation from InterPro).
GO:0008026; Molecular function: ATP-dependent helicase activity (inferred from electronic annotation from InterPro).
GO:0003725; Molecular function: double-stranded RNA binding (inferred from electronic annotation from InterPro).
GO:0046872; Molecular function: metal ion binding (inferred from electronic annotation from UniProtKB-KW).
GO:0004482; Molecular function: mRNA (guanine-N7-)-methyltransferase activity (inferred from electronic annotation from EC).
GO:0003724; Molecular function: RNA helicase activity (inferred from electronic annotation from InterPro).
GO:0003968; Molecular function: RNA-directed RNA polymerase activity (inferred from electronic annotation from InterPro).
GO:0004252; Molecular function: serine-type endopeptidase activity (inferred from electronic annotation from InterPro).
GO:0005198; Molecular function: structural molecule activity (inferred from electronic annotation from InterPro).
GO:0006355; Biological process: regulation of transcription, DNA-dependent (inferred from electronic annotation from UniProtKB-KW).
GO:0006410; Biological process: transcription, RNA-dependent (inferred from electronic annotation from UniProtKB-KW).
GO:0019079; Biological process: viral genome replication (inferred from electronic annotation from InterPro).
Family and domain databases
InterPro IPR014001; DEAD-like_N.
IPR011492; DEAD_Flavivir.
IPR001650; DNA/RNA_helicase_C.
IPR002464; DNA/RNA_helicase_DEAH_CS.
IPR011999; Flav_glyE_cen_dm.
IPR013754; Flav_glyE_dim.
IPR001122; Flavi_capsidC.
IPR000069; Flavi_M.
IPR001157; Flavi_NS1.
IPR000752; Flavi_NS2A.
IPR000487; Flavi_NS2B.
IPR000404; Flavi_NS4A.
IPR001528; Flavi_NS4B.
IPR002535; Flavi_propep.
IPR000336; Flv_glyE_Ig-like.
IPR014412; Gen_Poly_FLV.
IPR014021; Helicase_SF1/SF2_ATP-bd.
IPR001850; Peptidase_S7.
IPR000208; RNA_pol_flaviviral.
IPR007094; RNA_pol_PSvir.
IPR002877; RrmJFtsJ_MeTrfase.
Gene3D G3DSA:2.60.98.10; Flav_glyE_dim; 1.
G3DSA:2.60.40.350; Flv_glyE_Ig-like; 1.
Pfam PF01003; Flavi_capsid; 1.
PF07652; Flavi_DEAD; 1.
PF02832; Flavi_glycop_C; 1.
PF00869; Flavi_glycoprot; 1.
PF01004; Flavi_M; 1.
PF00948; Flavi_NS1; 1.
PF01005; Flavi_NS2A; 1.
PF01002; Flavi_NS2B; 1.
PF01350; Flavi_NS4A; 1.
PF01349; Flavi_NS4B; 1.
PF00972; Flavi_NS5; 1.
PF01570; Flavi_propep; 1.
PF01728; FtsJ; 1.
PF00271; Helicase_C; 1.
PF00949; Peptidase_S7; 1.
PIRSF PIRSF003817; Gen_Poly_FLV; 1.
ProDom PD001496; Flavi_NS1; 1.
SMART SM00487; DEXDc; 1.
SM00490; HELICc; 1.
PROSITE PS00690; DEAH_ATP_HELICASE; FALSE_NEG.
PS51192; HELICASE_ATP_BIND_1; 1.
PS51194; HELICASE_CTER; 1.
PS50507; RDRP_SSRNA_POS; 1.
BLOCKS P33478.
ProtoNet P33478.
Keywords
ATP-binding; Capsid protein; Cleavage on pair of basic residues; Complete proteome; Endoplasmic reticulum; Envelope protein; Glycoprotein; Helicase; Hydrolase; Membrane; Metal-binding; Multifunctional enzyme; Nucleotide-binding; Nucleotidyltransferase; Nucleus; Phosphoprotein; Protease; Ribonucleoprotein; RNA replication; RNA-binding; RNA-directed RNA polymerase; Secreted; Serine protease; Transcription; Transcription regulation; Transferase; Transmembrane; Viral nucleoprotein; Virion.
Features
Key   From   To   Length   Description   FTId
CHAIN   1    100  100     Protein C. PRO_0000037894
PROPEP   101    114  14     ER anchor for the protein C, removed in mature form by serine protease NS3. PRO_0000037895
CHAIN   115    280  166     prM. PRO_0000264654
CHAIN   115    205  91     Peptide pr. PRO_0000264655
CHAIN   206    280  75     Small envelope protein M. PRO_0000037896
CHAIN   281    774  494     Envelope protein E. PRO_0000037897
CHAIN   775   1126  352     Non-structural protein 1. PRO_0000037898
CHAIN   1127   1344  218     Non-structural protein 2A. PRO_0000037899
CHAIN   1127   1315  189     Non-structural protein 2A-alpha. PRO_0000264656
CHAIN   1345   1474  130     Serine protease subunit NS2B. PRO_0000037900
CHAIN   1475   2093  619     Serine protease subunit NS3. PRO_0000037901
CHAIN   2094   2220  127     Non-structural protein 4A. PRO_0000037902
PEPTIDE   2221   2243  23     Peptide 2k. PRO_0000264657
CHAIN   2244   2492  249     Non-structural protein 4B. PRO_0000037903
CHAIN   2493   3396  904     RNA-directed RNA polymerase NS5. PRO_0000037904
TOPO_DOM   1    101  101     Cytoplasmic (Potential). 
TRANSMEM   102    122  21     Potential. 
TOPO_DOM   123    238  116     Extracellular (Potential). 
TRANSMEM   239    259  21     Potential. 
TOPO_DOM   260    265  6     Cytoplasmic (Potential). 
TRANSMEM   266    286  21     Potential. 
TOPO_DOM   287    724  438     Extracellular (Potential). 
TRANSMEM   725    745  21     Potential. 
TOPO_DOM   746    751  6     Cytoplasmic (Potential). 
TRANSMEM   752    772  21     Potential. 
TOPO_DOM   773   1155  383     Extracellular (Potential). 
TRANSMEM   1156   1176  21     Potential. 
TOPO_DOM   1177   1446  270     Cytoplasmic (Potential). 
TRANSMEM   1447   1467  21     Potential. 
TOPO_DOM   1468   2192  725     Lumenal (Potential). 
TRANSMEM   2193   2213  21     Potential. 
TOPO_DOM   2214   2220  7     Cytoplasmic (Potential). 
TRANSMEM   2221   2240  20     Potential. 
TOPO_DOM   2241   2348  108     Lumenal (Potential). 
TRANSMEM   2349   2369  21     Potential. 
TOPO_DOM   2370   2414  45     Cytoplasmic (Potential). 
TRANSMEM   2415   2435  21     Potential. 
TOPO_DOM   2436   2460  25     Lumenal (Potential). 
TRANSMEM   2461   2481  21     Potential. 
TOPO_DOM   2482   3391  910     Cytoplasmic (Potential). 
DOMAIN   1655   1811  157     Helicase ATP-binding. 
DOMAIN   1821   1988  168     Helicase C-terminal. 
DOMAIN   3019   3168  150     RdRp catalytic. 
NP_BIND   1668   1675  8     ATP (Potential). 
MOTIF   1759   1762  4     DEAH box (By similarity). 
ACT_SITE   1525   1525        Charge relay system; for serine protease NS3 activity (By similarity). 
ACT_SITE   1549   1549        Charge relay system; for serine protease NS3 activity (By similarity). 
ACT_SITE   1609   1609        Charge relay system; for serine protease NS3 activity (By similarity). 
SITE   100    101  2     Cleavage; by serine protease NS3 (By similarity). 
SITE   114    115  2     Cleavage; by host signal peptidase (By similarity). 
SITE   205    206  2     Cleavage; by host furin (By similarity). 
SITE   280    281  2     Cleavage; by host signal peptidase (By similarity). 
SITE   774    775  2     Cleavage; by host signal peptidase (By similarity). 
SITE   1126   1127  2     Cleavage; by host (By similarity). 
SITE   1314   1315  2     Cleavage; by serine protease NS3 (By similarity). 
SITE   1474   1475  2     Cleavage; by serine protease NS3 (By similarity). 
SITE   2220   2221  2     Cleavage; by host signal peptidase (By similarity). 
SITE   2243   2244  2     Cleavage; by serine protease NS3 (By similarity). 
SITE   2492   2493  2     Cleavage; by serine protease NS3 (By similarity). 
CARBOHYD   183    183        N-linked (GlcNAc...) (Potential). 
CARBOHYD   347    347        N-linked (GlcNAc...) (Potential). 
CARBOHYD   433    433        N-linked (GlcNAc...) (Potential). 
CARBOHYD   981    981        N-linked (GlcNAc...) (Potential). 
CARBOHYD   2302   2302        N-linked (GlcNAc...) (Potential). 
CARBOHYD   2306   2306        N-linked (GlcNAc...) (Potential). 
CARBOHYD   2458   2458        N-linked (GlcNAc...) (Potential). 
DISULFID   283    310        By similarity. 
DISULFID   340    401        By similarity. 
DISULFID   354    385        By similarity. 
DISULFID   372    396        By similarity. 
DISULFID   465    565        By similarity. 
DISULFID   582    613        By similarity. 
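
The From, To and Length columns of the feature table are related by Length = To - From + 1, because both endpoints are inclusive and numbered from 1. The sketch below illustrates that arithmetic and shows how a feature's residues could be sliced out of the precursor sequence. It is an illustration only: the Feature container and the extract helper are hypothetical names, not part of any UniProt tool, and only the first 120 residues of this entry are used.

```python
from typing import NamedTuple


class Feature(NamedTuple):
    """One row of the feature table (hypothetical container)."""
    key: str
    start: int        # "From" column, 1-based
    end: int          # "To" column, 1-based and inclusive
    description: str

    @property
    def length(self) -> int:
        # Both endpoints are inclusive, hence the +1.
        return self.end - self.start + 1


def extract(seq: str, feature: Feature, seq_start: int = 1) -> str:
    """Return the residues of `seq` covered by `feature`.

    `seq_start` is the polyprotein position of seq[0].
    """
    lo = feature.start - seq_start
    hi = feature.end - seq_start + 1
    if lo < 0 or hi > len(seq):
        raise ValueError("feature lies outside the supplied sequence")
    return seq[lo:hi]


# Residues 1-120 as printed in this entry, with grouping spaces removed.
first_120 = "".join("""
MNNQRKKTAR PSFNMLKRAR NRVSTGSQLA KRFSKGLLSG QGPMKLVMAF IAFLRFLAIP
PTAGILARWG SFKKNGAIKV LRGFKKEISN MLNIMNRRKR SVTMLLMLLP TALAFHLTTR
""".split())

protein_c = Feature("CHAIN", 1, 100, "Protein C")
er_anchor = Feature("PROPEP", 101, 114, "ER anchor for the protein C")

print(protein_c.length, er_anchor.length)  # 100 14
print(extract(first_120, er_anchor))       # SVTMLLMLLPTALA
```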
Sequence information
Length: 3396 AA (unprocessed precursor)
Molecular weight: 379564 Da (unprocessed precursor)
CRC64: C53E75F3E424367D (checksum on the sequence)
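
The sequence that follows is printed in rows of six 10-residue groups, each row preceded by a ruler line of position numbers. To work with it programmatically, the rulers and grouping spaces have to be stripped first. The sketch below shows one way to do that, under the assumption that ruler lines contain only digits and whitespace; the function name parse_sequence_block is hypothetical, and only the first two rows are used as sample input.

```python
def parse_sequence_block(text):
    """Join a ruler-annotated sequence display into one residue string."""
    groups = []
    for line in text.splitlines():
        tokens = line.split()
        # Skip blank lines and ruler lines (position numbers only).
        if not tokens or all(t.isdigit() for t in tokens):
            continue
        groups.extend(tokens)
    return "".join(groups)


# First two rows of this entry's sequence display.
block = """
        10         20         30         40         50         60
MNNQRKKTAR PSFNMLKRAR NRVSTGSQLA KRFSKGLLSG QGPMKLVMAF IAFLRFLAIP

        70         80         90        100        110        120
PTAGILARWG SFKKNGAIKV LRGFKKEISN MLNIMNRRKR SVTMLLMLLP TALAFHLTTR
"""

seq = parse_sequence_block(block)
print(len(seq))     # 120
print(seq[99:101])  # RS  (residues 100 and 101, 1-based)
```

Applied to the complete display, the resulting string should have the length stated above (3396); a shorter result indicates a truncated copy.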
        10         20         30         40         50         60 
MNNQRKKTAR PSFNMLKRAR NRVSTGSQLA KRFSKGLLSG QGPMKLVMAF IAFLRFLAIP 

        70         80         90        100        110        120 
PTAGILARWG SFKKNGAIKV LRGFKKEISN MLNIMNRRKR SVTMLLMLLP TALAFHLTTR 

       130        140        150        160        170        180 
GGEPHMIVSK QEREKSLLFK TSVGVNMCTL IAMDLGELCE DTMTYKCPRI TEAEPDDVDC 

       190        200        210        220        230        240 
WCNATDTWVT YGTCSQTGEH RRDKRSVALA PHVGLGLETR TETWMSSEGA WKQIQRVETW 

       250        260        270        280        290        300 
ALRHPGFTVI ALFLAHAIGT SITQKGIIFI LLMLVTPSMA MRCVGIGSRD FVEGLSGATW 

       310        320        330        340        350        360 
VDVVLEHGSC VTTMAKDKPT LDIELLKTEV TNPAVLRKLC IEAKISNTTT DSRCPTQGEA 

       370        380        390        400        410        420 
TLVEEQDANF VCRRTFVDRG WGNGCGLFGK GSLLTCAKFK CVTKLEGKIV QYENLKYSVI 

       430        440        450        460        470        480 
VTVHTGDQHQ VGNETTEHGT IATITPQAPT SEIQLTDYGA LTLDCSPRTG LDFNEMVLLT 

       490        500        510        520        530        540 
MKEKSWLVHK QWFLDLPLPW TSGASTSQET WNRQDLLVTF KTAHAKKQEV VVLGSQEGAM 

       550        560        570        580        590        600 
HTALTGATEI QTSGTTTIFA GHLKCRLKMD KLTLKGMSYV MCTGSFKLEK EVAETQHGTV 

       610        620        630        640        650        660 
LVQVKYEGTD APCKIPFSTQ DEKGVTQNRL ITANPIVTDK EKPVNIETEP PFGESYIVVG 

       670        680        690        700        710        720 
AGEKALKQCW FKKGSSIGKM FEATARGARR MAILGDTAWD FGSIGGVFTS VGKLVHQVFG 

       730        740        750        760        770        780 
TAYGVLFSGV SWTMKIGIGI LLTWLGLNSR STSLSMTCIA VGMVTLYLGV MVQADSGCVI 

       790        800        810        820        830        840 
NWKGRELKCG SGIFVTNEVH TWTEQYKFQA DSPKRLSAAI GKAWEEGVCG IRSATRLENI 

       850        860        870        880        890        900 
MWKQISNELN HILLENDMKF TVVVGDVVGI LAQGKKMIRP QPMEHKYSWK SWGKAKIIGA 

       910        920        930        940        950        960 
DIQNTTFIID GPDTPECPDD QRAWNIWEVE DYGFGIFTTN IWLKLRDSYT QMCDHRLMSA 

       970        980        990       1000       1010       1020 
AIKDSKAVHA DMGYWIESEK NETWKLARAS FIEVKTCVWP KSHTLWSNGV LESEMIIPKI 

      1030       1040       1050       1060       1070       1080 
YGGPISQHNY RPGYFTQTAG PWHLGKLELD FDLCEGTTVV VDEHCGNRGP SLRTTTVTGK 

      1090       1100       1110       1120       1130       1140 
IIHEWCCRSC TLPPLRFKGE DGCWYGMEIR PVKEKEENLV KSMVSAGSGE VDSFSLGLLC 

      1150       1160       1170       1180       1190       1200 
ISIMIEEVMR SRWSRKMLMT GTLAVFLLLI MGQLTWNDLI RLCIMVGANA SDRMGMGTTY 

      1210       1220       1230       1240       1250       1260 
LALMATFKMR PMFAVGLLFR RLTSREVLLL TIGLSLVASV ELPNSLEELG DGLAMGIMIL 

      1270       1280       1290       1300       1310