Annotation

Target annotation for nCoVDock2 (COVID-19 Docking Server 2.0)

1. Nonstructural proteins

    Nonstructural protein 1 (nsp1): Nsp1 has been shown to promote cellular mRNA degradation, block host cell translation, and inhibit the innate immune response to viral infection. The crystal structure of the nsp1 globular domain (residues 13-121) was downloaded from the PDB database with the code 7K7P and prepared for peptide/antibody docking.
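
    Each entry in this annotation identifies its structure by a four-character PDB code. As a minimal sketch of how such a code could be turned into a download, the snippet below builds a URL for the RCSB file download service. The helper names pdb_download_url and fetch_structure are hypothetical and are not part of the server; very large entries may only be distributed in mmCIF format, hence the fmt option.

```python
import re
from urllib.request import urlopen

# RCSB file download service; entries are served as <CODE>.pdb or <CODE>.cif.
RCSB_DOWNLOAD = "https://files.rcsb.org/download"


def pdb_download_url(pdb_code: str, fmt: str = "pdb") -> str:
    """Build the download URL for a four-character PDB code (hypothetical helper)."""
    code = pdb_code.strip().upper()
    # Classic PDB codes are a digit followed by three alphanumeric characters.
    if not re.fullmatch(r"[0-9][A-Z0-9]{3}", code):
        raise ValueError(f"not a four-character PDB code: {pdb_code!r}")
    if fmt not in ("pdb", "cif"):
        raise ValueError("fmt must be 'pdb' or 'cif'")
    return f"{RCSB_DOWNLOAD}/{code}.{fmt}"


def fetch_structure(pdb_code: str, fmt: str = "pdb") -> str:
    """Download the structure file and return its text (requires network access)."""
    with urlopen(pdb_download_url(pdb_code, fmt), timeout=30) as response:
        return response.read().decode("utf-8")
```

    For example, pdb_download_url("7k7p") returns "https://files.rcsb.org/download/7K7P.pdb".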

    Nonstructural protein 2 (nsp2): Nsp2 contains three zinc fingers, indicating an RNA binding site. The full-length structure suggests a role in linking viral transcription within the replication-transcription complexes (RTC) to the translation initiation of the viral message. The full-length structure from 7MSW (residues 1-638) and the N-terminal zinc-ion-binding domain from 7EXM (residues 1-277) are provided for peptide or antibody docking.
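
    Several targets are defined by a residue range (for example residues 1-277 of nsp2). The sketch below shows one way to cut such a range out of a PDB-format file using the fixed-column ATOM record layout. It is an illustration only, not the server's preparation pipeline; the function name select_residues and the chain ID passed to it are assumptions.

```python
def select_residues(pdb_lines, chain_id, first, last):
    """Keep ATOM records of one chain whose residue number lies in [first, last].

    Relies on the fixed-column PDB layout: chain ID in column 22 and
    residue sequence number in columns 23-26 (1-based columns).
    HETATM records (ligands, ions, waters) are dropped.
    """
    kept = []
    for line in pdb_lines:
        if not line.startswith("ATOM  "):
            continue
        if line[21] != chain_id:
            continue
        residue_number = int(line[22:26])
        if first <= residue_number <= last:
            kept.append(line)
    return kept
```

    With the lines of a downloaded file in pdb_lines, select_residues(pdb_lines, "A", 1, 277) would keep only the N-terminal domain of chain A, assuming that is the chain of interest.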

    Nonstructural protein 3 (nsp3): Nsp3 is a large multi-domain protein. Its ADP-ribose phosphatase domain (ADRP; also known as the macrodomain, MacroD) is involved in the host immune response. The crystal structure of the nsp3 MacroD in complex with adenosine monophosphate (AMP) and 2-(N-morpholino)-ethanesulfonic acid (MES) was downloaded from the PDB database with the code 6W6Y (1). As the two ligands bind in different regions of the protein, two binding sites are defined for small molecule docking, AMP_site and MES_site, both based on the structure of 6W6Y. A macrodomain complex structure with a lower resolution value (PDB code: 5RSF) is also provided for small molecule docking to the AMP_site. The protein structures of the macrodomain (5RSF), ubiquitin-like domain 1 (UBL1, from 7KAG), nucleic acid binding domain (NAB, from 7LGO), and Y3 domain (from 7RQG) are provided for peptide or antibody docking on the server.
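
    When a binding site is named after a co-crystallized ligand, as with AMP_site and MES_site, a common convention is to center the docking search box on that ligand. The sketch below derives a box center and size from the ligand's HETATM coordinates. This is an assumption about how such a site could be defined, not a description of how the server builds its sites; the function name ligand_box and the 4.0 Å padding are illustrative choices.

```python
def ligand_box(pdb_lines, ligand_resname, padding=4.0, chain_id=None):
    """Return (center, size) of an axis-aligned box enclosing one ligand.

    Uses HETATM records whose residue name (columns 18-20) matches
    ligand_resname, optionally restricted to one chain (column 22).
    Coordinates are read from columns 31-54; padding is added on every side.
    """
    coords = []
    for line in pdb_lines:
        if not line.startswith("HETATM"):
            continue
        if line[17:20].strip() != ligand_resname:
            continue
        if chain_id is not None and line[21] != chain_id:
            continue
        coords.append((float(line[30:38]), float(line[38:46]), float(line[46:54])))
    if not coords:
        raise ValueError(f"no HETATM records found for ligand {ligand_resname!r}")
    mins = [min(c[i] for c in coords) for i in range(3)]
    maxs = [max(c[i] for c in coords) for i in range(3)]
    center = tuple((lo + hi) / 2 for lo, hi in zip(mins, maxs))
    size = tuple((hi - lo) + 2 * padding for lo, hi in zip(mins, maxs))
    return center, size
```

    Calling ligand_box(pdb_lines, "AMP") and ligand_box(pdb_lines, "MES") on the same file would give two separate boxes, one per site. If the file holds several copies of a ligand, passing chain_id avoids a box that spans all of them.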

    Papain-like protease (PLpro): PLpro cleaves the nsp1/2, nsp2/3 and nsp3/4 boundaries. It works with Mpro to cleave the polyproteins into nsps. The crystal structures of the wild type and the C111S mutant of PLpro, each in complex with a compound, were downloaded from the PDB database with the codes 7RZC and 7SQE. Both structures are prepared as described previously for small molecule docking. For peptide/antibody docking, both the wild type and the C111S mutant are provided on the server.

    Main protease (Mpro, nsp5): Also named chymotrypsin-like protease (3CLpro), Mpro cleaves most of the sites in the polyproteins; the products are the nonstructural proteins (nsps) that assemble into the replicase-transcriptase complex (RTC). The room-temperature X-ray structure of Mpro in complex with the approved drug PF-07321332 (PDB code: 7SI9) is prepared and provided for small molecule docking. The protein structure is extracted and also provided for peptide and antibody docking.

    Nonstructural protein 6 (nsp6): No experimental structure is currently available. A computationally predicted structure was downloaded from the Zhang lab at the University of Michigan and prepared for peptide or antibody docking (2).

    Nonstructural protein 12/7/8 (nsp12/7/8, RNA-dependent RNA polymerase, RdRp): Nsp12 is the polymerase, which binds to its essential cofactors, nsp7 and nsp8. It is important in replication and transcription of the viral genome. The structure of RdRp in complex with RNA and the triphosphate form of Remdesivir (RTP) was downloaded from the PDB database with the code 7BV2 (3). We defined one site in this structure for small molecule docking: the RTP binding site (RTP site). Recently, the old drug suramin was identified as an RdRp inhibitor that is 20-fold more potent than Remdesivir. The structure of RdRp in complex with suramin (PDB code: 7D4F) reveals that the compound binds in the RNA binding site of the protein. Thus 7D4F is also prepared as described previously for small molecule docking to the RNA binding site (RNA site). For peptide or antibody docking, four structures are provided: the complex form of nsp12/7/8, and the single chains of nsp12, nsp7, and nsp8.
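
    Providing both a complex and its single chains amounts to splitting one structure file by chain ID. The sketch below groups the ATOM records of a PDB-format file by chain; it is a minimal illustration, and the helper name split_chains is hypothetical. Which chain ID corresponds to nsp12, nsp7, or nsp8 must be read from the entry itself and is not assumed here.

```python
from collections import defaultdict


def split_chains(pdb_lines):
    """Group protein ATOM records by chain ID (column 22 of each record).

    Returns a dict mapping chain ID to the list of ATOM lines of that chain.
    HETATM records and nucleic acid handling are left out of this sketch.
    """
    chains = defaultdict(list)
    for line in pdb_lines:
        if line.startswith("ATOM  "):
            chains[line[21]].append(line)
    return dict(chains)


def merge_chains(chains, chain_ids):
    """Concatenate the selected chains, in the given order, into one line list."""
    merged = []
    for chain_id in chain_ids:
        merged.extend(chains[chain_id])
    return merged
```

    A single-chain receptor is then chains["A"] for whichever ID applies, while merge_chains(chains, ["A", "B", "C"]) rebuilds a multi-chain complex from the chosen subunits.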

    Nonstructural protein 9 (nsp9): It may act as an ssRNA-binding protein in viral replication. Littler et al. solved the crystal structures of nsp9 in dimer and monomer form, with PDB codes 6WXD and 6W9Q, respectively (4). The N-terminal residues adopt an anti-parallel β-sheet conformation in the dimer, whereas they form an extended loop in the monomer. Thus two states of nsp9 are provided for peptide or antibody docking. Small molecules, including a natural product, have also been found to bind at the oligomerization interface (PDB code: 7KRI) or at the conserved site near the C-terminal GxxxG helix (PDB code: 7N3K), indicating the possibility of developing anti-viral compounds that inhibit the function of nsp9. Two sets of small molecule docking files with different binding sites are prepared based on 7KRI and 7N3K.

    Nonstructural protein 10 (nsp10): Nsp10 acts as a stimulator of the 3'-to-5' exoribonuclease and 2'-O-methyltransferase activities of nsp14 and nsp16, by forming the nsp10/14 or nsp10/16 complex. Fragment-based screening identified compounds bound to the nsp10/14 and nsp10/16 interfaces (PDB codes: 7ORR, 7ORU) (5). Small molecule docking files are prepared for two sites, the nsp10/14 interface and the nsp10/16 interface, based on the structures of 7ORR and 7ORU. The unbound form of nsp10 (PDB code: 6ZPE) is also provided for peptide or antibody docking (6).

    Nonstructural protein 10/14 (nsp10/14): Nsp14 is a bifunctional enzyme composed of two major domains: the N-terminal domain acts as a 3'-to-5' exoribonuclease (ExoN) and the C-terminal domain acts as an mRNA cap guanine-N7 methyltransferase (N7-MTase). The binding of nsp10 stabilizes the conformation of the ExoN active site and stimulates the enzyme activity. Small molecule docking files for the ExoN site and the N7-MTase site are prepared based on the cryo-EM structure of nsp10/14 in complex with RNA (PDB code: 7N0D). The CHAPSO binding site in 7N0D is prepared for selection as well. The complex structure of nsp10/14 and the protein structure of nsp14 extracted from 7N0D are provided for peptide and antibody docking.

    Nonstructural protein 16/10 (nsp16/10, 2'-O-methyltransferase): Nsp16 is an S-adenosylmethionine (SAM) dependent nucleoside-2'-O-methyltransferase. It is only active when bound to nsp10. The structure of nsp16/10 in complex with 7-methyl-GpppA (GTA), S-adenosylmethionine (SAM), and 7-methyl-guanosine-5'-triphosphate (MGP) was downloaded from the PDB database with the code 6WVN (7). A structure of nsp16/10 in complex with SAM with better resolution (PDB code: 6W4H) was also downloaded. SAM_site is prepared for small molecule docking based on 6W4H. GTA_site and MGP_site are prepared based on 6WVN. The complex form of nsp16/10 and the single chains of nsp16 and nsp10 are provided for peptide or antibody docking.

    Nonstructural protein 13 (nsp13, helicase): The helicase catalyzes the unwinding of duplex oligonucleotides into single strands in an NTP-dependent manner. It is also an ideal target for anti-viral drug development due to its sequence conservation across all CoV species. The crystal structure of nsp13 in complex with an ATP analog (ANP) is used to prepare the small molecule docking files for the ANP binding site (PDB code: 7NN0). Fragment-based screening identified another site, distinct from the ANP site. We also prepared files for small molecule docking against this site (fragment binding site, PDB code: 5RML). The apo form of nsp13 (PDB code: 7NIO) and the bound form extracted from 7NN0 are both provided for peptide and antibody docking (8).

    Nonstructural protein 15 (nsp15, uridylate-specific endoribonuclease): Nsp15 is a uridylate-specific endoribonuclease and is considered to interfere with the innate immune response. Recently, an FDA-approved drug, Tipiracil, was found to bind in the active site of nsp15 and block SARS-CoV-2 infection in cell-based assays. The complex structure of nsp15 with Tipiracil was downloaded from the PDB database with the code 7K1L (9). The target is prepared as described previously for small molecule docking. The monomer form (PDB code: 6VWW) (10) and hexamer form (PDB code: 7N06) (11) of nsp15 are also provided for peptide and antibody docking.

    Open Reading Frame (ORF) 3A: The dimer form of accessory protein ORF3A was downloaded from the PDB database with the code of 7KJR and provided for peptide and antibody docking (12).

    Open Reading Frame (ORF) 7A: The crystal structure of accessory protein ORF7A was downloaded from the PDB database with the code of 7CI3 (13). The target is provided for peptide or antibody docking on the server.

    Open Reading Frame (ORF) 8: The crystal structure of accessory protein ORF8 was downloaded from the PDB database with the code of 7JX6 and provided for peptide and antibody docking.

    Open Reading Frame (ORF) 9B: The crystal structure of accessory protein ORF9b was downloaded from the PDB database with the code of 6Z4U and provided for peptide and antibody docking.

2. Structural proteins

    Spike protein (S protein): The surface spike glycoprotein consists of three S1-S2 heterodimers. The receptor binding domain (RBD) is located on the head of S1 and binds the cellular receptor angiotensin-converting enzyme 2 (ACE2), initiating membrane fusion between the virus and the host cell. The structure of the SARS-CoV-2 spike RBD in complex with human ACE2 was released by Wang and Zhang’s group at Tsinghua University with the PDB code 6M0J (14). The full-length structure of the spike protein was also determined by electron microscopy. The spike protein forms a trimer and adopts two different conformations: an open state and a closed state (15). Thus for the spike protein, we provide the RBD from 6M0J, the open state of the trimer from 6VYB, and the closed state of the trimer from 6ZGE for peptide or antibody docking.
    The structural changes in the S protein mutants are pivotal to understanding the molecular basis for the immune evasion of SARS-CoV-2 variants. The S protein structures of different variants, including Alpha, Beta, Gamma, Delta, Epsilon and Omicron, are provided for peptide or antibody docking.

    S2 of S protein: This target is the post-fusion state of the S2 segment of the spike protein, which acts as the viral fusion protein mediating membrane fusion between virus and cell. A typical HR1/HR2 six-helix complex is formed in the post-fusion state of SARS-CoV-2, similar to the fusion step of HIV-1. It is a potential target for entry inhibitor development. The six-helix post-fusion conformation of S2 was downloaded from the PDB database with the code 7COT. A trimer structure was prepared by deleting the HR2 peptides from the six-helix structure and is used as the receptor for peptide or antibody docking.

    N-terminal domain (NTD) of S protein: Recent research suggests that the NTD is a dominant epitope for antibody binding. The structures of the wild type NTD (PDB code: 7B62) and the Kappa variant NTD (PDB code: 7SOD) are provided for peptide or antibody docking (16).

    Envelope small membrane protein (E protein): It forms a pentamer and functions as an ion channel, also named the E channel. The modeled structure of the E protein was downloaded from the Zhang lab and prepared for peptide or antibody docking (2).

    Membrane protein (M protein): The M protein is involved in most of the protein-protein interactions required for the assembly of coronaviruses, and it has also been identified as a protective antigen in humoral responses (17). The structure of the M protein (PDB code: 8CTK) is prepared for peptide or antibody docking (18).

    Nucleocapsid protein (N protein): The N protein plays multiple roles in the virus replication cycle and forms a ribonucleoprotein complex with the viral RNA through its N-terminal domain (N-NTD). It buds the viral genomes into the membrane of the endoplasmic reticulum-Golgi intermediate compartment (ERGIC), which contains the viral structural proteins, to finally form the mature virions (17). Recently, structures of the N-terminal RNA-binding domain, the C-terminal dimerization domain, and the full-length N protein have been released in the PDB database. 6YI3 (N-terminal domain), 6YUN (C-terminal domain), and 8FD5 (full length) are prepared and provided for peptide or antibody docking on the server (18,19). The ribonucleotide-binding site (NCB site) of the N protein was built based on the complex structure of the N protein from human coronavirus OC43 (PDB code: 4KXJ), which shares a sequence identity of 47.0% and a similarity of 62.0%, and was prepared as described previously for small molecule docking (20).
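
    Identity and similarity percentages such as those quoted for the OC43 template are computed from a pairwise alignment. The sketch below shows one way to obtain both numbers from two pre-aligned sequences. It is illustrative only: the text does not state which alignment tool, similarity definition, or denominator was used. The amino acid groups in SIMILAR_GROUPS and the choice to divide by the full alignment length (gaps included) are assumptions, and different conventions give different values.

```python
# Illustrative grouping of amino acids treated as "similar" (an assumption;
# substitution matrices such as BLOSUM62 are a common alternative).
SIMILAR_GROUPS = ["GAVLI", "FYW", "CM", "ST", "KRH", "DENQ", "P"]
GROUP_OF = {aa: index for index, group in enumerate(SIMILAR_GROUPS) for aa in group}


def identity_and_similarity(aligned_a, aligned_b, gap="-"):
    """Return (identity %, similarity %) for two aligned, equal-length sequences.

    Both percentages are taken over the full alignment length, gaps included.
    Identical residues also count toward similarity.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = len(aligned_a)
    if columns == 0:
        raise ValueError("alignment is empty")
    identical = 0
    similar = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # gap columns count in the denominator only
        if a == b:
            identical += 1
            similar += 1
        elif a in GROUP_OF and GROUP_OF[a] == GROUP_OF.get(b):
            similar += 1
    return 100.0 * identical / columns, 100.0 * similar / columns
```

    For the toy alignment "ACDE-K" against "ACEEGR", three of the six columns are identical and two more are similar under these groups, so the function returns 50.0% identity and roughly 83.3% similarity.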

    Angiotensin-converting enzyme 2 (ACE2): The structure of human ACE2 was extracted from the complex structure of the SARS-CoV-2 spike RBD and human ACE2, released by Wang and Zhang’s group at Tsinghua University with the PDB code 6M0J, and prepared for peptide or antibody docking (14).