What virus is responsible for the COVID-19 pandemic?
COVID-19 is caused by a virus called SARS-CoV-2. It belongs to a family of viruses known as coronaviruses.
The SARS-CoV-2 virus is a tiny sphere about 100 nm in diameter with protrusions on its surface called spike proteins.
The coronavirus is essentially a protective shell that contains a long strand of RNA. The RNA is made up of about 30,000 bases that carry the genetic code.
The coronavirus needs a human cell to produce more viruses. First, the virus RNA must enter the cell, and then it commandeers the cell’s own machinery to make more viruses.
The model below shows how the coronavirus RNA (green) enters the human cell (pink) through a channel produced by its spike proteins.
The virus has about 80 spike proteins on its surface, each about 20 nm long.
Scientists are very interested in the spike proteins because changes in the spike protein molecules can affect how easily humans are infected by the virus, the severity of the illness, whether a person will catch COVID-19 again, the reliability of diagnostic tests and, importantly, the long-term effectiveness of vaccines.
Figure 2: Model of a coronavirus on the surface of a human cell
How has coronavirus become more infectious over time?
When a virus replicates, it doesn’t always produce an exact copy of itself. A change in the genetic sequence is called a mutation. Mutations are common in viruses: scientists estimate that a coronavirus acquires around two single-letter mutations in its RNA each month. Viruses with new mutations are called variants.
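To get a feel for these numbers, here is a minimal Python sketch of the arithmetic. It assumes the rough figures quoted in this article (about two mutations per month and a genome of about 30,000 bases) and treats the rate as constant, which is a simplification. The function names are made up for this illustration.

```python
# Rough figures taken from the text; both are approximations.
GENOME_LENGTH = 30_000      # approximate number of bases in the RNA
MUTATIONS_PER_MONTH = 2     # approximate single-letter mutations per month


def expected_mutations(months):
    """Estimate how many single-letter mutations build up over a number of months."""
    return MUTATIONS_PER_MONTH * months


def fraction_changed(months):
    """Estimate the fraction of the genome changed, assuming each mutation hits a different base."""
    return expected_mutations(months) / GENOME_LENGTH


print(expected_mutations(12))           # 24 mutations in a year
print(f"{fraction_changed(12):.2%}")    # 0.08% of the genome
```

Even after a year, only a tiny fraction of the genome has changed under these assumptions, yet a few of those changes can be enough to alter how the virus behaves.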
Occasionally, these random changes give the virus some advantage that makes it easier to spread between people. Over time those variants that are transmitted more easily begin to dominate. In May 2021 the World Health Organization (WHO) assigned Greek letters to these key variants of concern.
In September 2020, a new variant of the coronavirus called the UK or Alpha variant was detected. It appears to be twice as infectious as the original virus. This resulted in a second wave of infection in the UK.
Subsequently, the Delta variant, first detected in India, spread across the globe. It is about twice as infectious as the Alpha variant and caused the third wave of infection in the UK.
An increase in the transmission rate of the virus has been linked to changes (mutations) in the amino acids in the spike protein sequences.
The locations of the mutations in the Alpha and Delta variants are highlighted on a coronavirus spike protein in the model below.
Figure 3: Model of the spike protein
Exercise 1: View the mutations in the spike protein sequence
A team of researchers at the University of Dundee has developed software called Jalview, which scientists can use to study proteins, RNA and DNA. It reads files directly from public biological databases and displays sequences, 3D structures and evolutionary trees.
JalviewJS may take a few seconds to open and load the sequence in the adjacent web browser window.
In the Jalview viewer, the coronavirus spike protein sequence originating from China can be compared to the Alpha (UK) and Delta (India) sequences. Each letter represents one amino acid in the protein chain.
Feel free to play! For example, place the mouse cursor on the sequence and move the mouse across the different amino acids. Move the scroll bars at the side of the window to view its full length.
Look for differences in the amino acid letters in the sequence columns. Q. Can you spot the differences between the sequences?
The ‘Conservation’ row below the sequences may help.
Figure 4: The original SARS-CoV-2 spike protein sequence aligned with the Alpha and Delta variants
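The comparison in Exercise 1 can also be sketched in code. The Python example below checks two aligned sequences column by column and reports where the letters differ. The short sequences are invented toy fragments, not real spike protein data, and the function name `find_differences` is hypothetical. A dash stands for a gap in the alignment.

```python
def find_differences(reference, variant):
    """Return (position, reference_letter, variant_letter) for every column that differs.

    Positions start at 1. Both sequences must already be aligned,
    so they must have the same length.
    """
    if len(reference) != len(variant):
        raise ValueError("aligned sequences must have the same length")
    return [
        (position, ref_letter, var_letter)
        for position, (ref_letter, var_letter) in enumerate(zip(reference, variant), start=1)
        if ref_letter != var_letter
    ]


# Invented toy fragments for illustration only (not real spike sequences).
original = "MFVNLTDRQY"
variant_a = "MFVNLTGRQY"
variant_b = "MFVKLTGR-Y"   # "-" marks a gap, i.e. a deleted amino acid

print(find_differences(original, variant_a))  # [(7, 'D', 'G')]
print(find_differences(original, variant_b))  # [(4, 'N', 'K'), (7, 'D', 'G'), (9, 'Q', '-')]
```

Each result reads like the labels scientists give to mutations: the original letter, its position, and the new letter.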
Exercise 2: View the coronavirus spike protein sequence alongside its 3D structure
JalviewJS may take a few seconds to open and load the sequence in the adjacent web browser window.
Feel free to play! For example, rotate the 3D structure by placing the mouse cursor on the structure, then move the mouse. Or place the mouse cursor on the sequence and move the mouse over the different letters. Look at the status bar in the lower left-hand side of the alignment window as it contains extra information.
Exercise 3: View the entire coronavirus genome
The RNA genome is made up of four different bases, represented by the letters A (adenine), C (cytosine), G (guanine) and U (uracil), coloured green, yellow, red and blue respectively.
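As a small illustration of how a genome can be treated as a string of letters, the Python sketch below counts each base in a short RNA sequence and reports its total length. The sequence is a made-up toy example, not part of the coronavirus genome, and the function name `count_bases` is hypothetical.

```python
from collections import Counter

# The four RNA bases and their full names, as listed in the text.
BASE_NAMES = {"A": "adenine", "C": "cytosine", "G": "guanine", "U": "uracil"}


def count_bases(sequence):
    """Count how many times each RNA base appears in the sequence."""
    sequence = sequence.upper()
    unknown = set(sequence) - set(BASE_NAMES)
    if unknown:
        raise ValueError(f"unexpected letters in sequence: {sorted(unknown)}")
    counts = Counter(sequence)
    return {base: counts.get(base, 0) for base in BASE_NAMES}


toy_rna = "AUGGCUACGUUAGC"  # made-up fragment for illustration only

for base, count in count_bases(toy_rna).items():
    print(f"{base} ({BASE_NAMES[base]}): {count}")
print("Total bases:", len(toy_rna))  # Total bases: 14
```

The same idea scales to a whole genome: the total length is simply the number of letters in the sequence.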
In the Jalview viewer, place the mouse cursor on a letter and look at the status bar (lower left-hand corner of the alignment window). It will tell you the name of the nucleotide base and the ID number.
Scroll to the end of the genome using the mouse wheel or the scroll bar on the right-hand side.
Q. What is the last letter?
Click on the last letter C in the sequence and look at the status bar (in the lower left-hand corner of the window). Q. How many bases are in the coronavirus genome?