How to Design Plasmids from Scratch

Engineering molecular transformers to turn jellyfish into underwater Roombas and more!

15 min read · Apr 3, 2021

Table of Contents
Intro
Plasmid Anatomy
Designing in Benchling
My project — ocean-cleaning jellyfish

So you’re a yeast cell, floating around, making bread *supple*. All of a sudden, you’re attacked by millions of little circles like a real-life zerg rush.

Pre-plasmid yeast vs. post-plasmid yeast

Those circles have a name — we call them plasmids. Plasmids are circular pieces of DNA separate from the cell’s chromosomal DNA.

You may have heard that DNA is the instruction manual for all cellular activities and it’s just that — the instructions. The cell reads them and carries the instructions out through activities like synthesizing proteins, cell division and of course, producing energy in the mitochondria.

So what happens when we sneak some extra instructions in with plasmids?

The cell follows them!

With plasmids, we can now turn cells into mini robots and get them to do virtually anything we want them to do!

yeast + plasmids coding GFP protein = green bread, medusoid + plasmids coding ODP enzyme = underwater organophosphate hunter! (will be explained in more detail later)

Absolutely flabbergasting. I could use some more arm muscles…

So you’re getting ideas and now want to build some plasmids for yourself. But before I get into that, let’s go over the basic anatomy of a plasmid so we can understand what we’re actually working with later.

Every plasmid has 8 main elements: the origin of replication, restriction sites, a promoter, terminator, inserted gene, antibiotic resistance gene (prokaryotic) or auxotrophic selection marker (eukaryotic) and start & stop sequences for translation. Let’s break them all down.

For clarification, everything above is all just DNA (A’s, C’s, G’s, and T’s). In real life, they’re not colour-coded or labelled, and cells can still identify everything perfectly.

Origin of Replication

Plasmids need to reproduce — make copies of themselves — to survive and live on. The origin of replication or ori is a specific region within the plasmid’s DNA sequence that tells the host cell’s replication machinery to “bind here!” and replicate the entire plasmid.

The DNA is unzipped by helicases (a protein) and primase (another protein) attaches a short RNA-chain called a primer. The primer is like a pull tab for a zipper, enabling DNA Polymerase to bind and synthesize the rest of the nucleotides by zipping down the DNA. Primase takes the replicated plasmid from 0 → 1, DNA Polymerase takes it from 1 → 100(00…). See it in action below.

Watch this video if you want to learn more about DNA replication.

Restriction Sites and Restriction Enzymes

You can think of restriction enzymes as DNA ninjas with one sole mission. Wielding their molecular katana swords, they search the DNA for a specific 4–8 bp sequence (usually palindromic, meaning it reads the same on both strands) called a restriction site. Once found, they go *hiyah* and make a double-stranded break at that site. Each restriction enzyme recognizes its own restriction site.

Now that there’s a gap in the DNA, researchers can insert a gene into that gap and form a recombinant plasmid.

Plasmids are actually naturally found in bacteria and enhance their survival. However, in nature, it’s a dog-eat-dog world out there. Chances of successful plasmid mutation and infiltration are at an all-time low. Everything is left to chance and rarely goes right. But could you imagine if we brought the wilderness into the lab and just hoped for everything to go as intended?

To prevent lab work from being even more expensive than it already is, scientists modify plasmid backbones to optimize for success. In plasmids engineered for commercial use (for researchers, companies, etc.), restriction sites are often clumped together in multiple cloning sites (MCS).

Multiple cloning sites are short segments of DNA that contain up to 20 restriction sites. They enable scientists to insert a piece of DNA without disrupting the rest of the plasmid. Having a variety of restriction sites also makes it more convenient for scientists to choose a perfect one (based on cut sites, availability of enzymes, cost of enzymes, etc.)

However, as a plasmid designer, the most important property you have to keep in mind is the number of cut sites. Cut sites are the places along the plasmid that your chosen restriction enzyme can bind to and cut at. Make sure that your enzyme has only 1 cut site (in the multiple cloning site) or else your plasmid will be sliced like a loaf of bread and fall apart.

In fact, I have a bracelet that perfectly illustrates this.
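To make the "only 1 cut site" rule concrete, here is a minimal Python sketch that counts how many times an enzyme's recognition site shows up on a circular plasmid. The toy plasmid string and the function names are made up for illustration; the recognition sites for SacI (GAGCTC) and EcoRI (GAATTC) are the standard ones. Tools like Benchling do this for you, so treat this as a way to see the idea rather than a replacement.

```python
def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string (A<->T, C<->G)."""
    pairs = {"A": "T", "T": "A", "C": "G", "G": "C"}
    return "".join(pairs[base] for base in reversed(dna.upper()))


def count_cut_sites(plasmid: str, site: str) -> int:
    """Count occurrences of a recognition site on a circular plasmid."""
    plasmid, site = plasmid.upper(), site.upper()
    # Wrap the start onto the end so a site spanning the "seam" is found.
    circular = plasmid + plasmid[: len(site) - 1]
    # A palindromic site equals its reverse complement, so the set has one entry.
    patterns = {site, reverse_complement(site)}
    return sum(
        circular.startswith(pattern, i)
        for i in range(len(plasmid))
        for pattern in patterns
    )


# Hypothetical toy plasmid, not a real backbone.
toy_plasmid = "TTGAGCTCAAGAATTCGGCCGAATTCAA"
print(count_cut_sites(toy_plasmid, "GAGCTC"))  # SacI: 1 cut site, safe to use
print(count_cut_sites(toy_plasmid, "GAATTC"))  # EcoRI: 2 cut sites, avoid
```

An enzyme that returns exactly 1 opens the circle once; anything higher chops the plasmid into pieces.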

Fun fact: Artificial insulin is also made using plasmids and MCS. After genetically modifying a plasmid with the gene for human insulin, it is put into a bacterial host to divide and multiply.

Now to meet the rest of the party, we need to understand how proteins are made from DNA (aka the central dogma). Here’s a quick explanation:

Central Dogma of Biology

Like I mentioned before, DNA needs to be transformed into a protein to be functional. The transformation occurs by converting DNA to mRNA and then to a protein.

Step 1: Transcription: DNA → mRNA

Let’s imagine DNA as an assignment on a whiteboard. To complete the assignment, students need to first copy down the instructions into their notebooks and then go home to work on it.

In the real world, instead of copying the instructions down as words on a page, it’s copied down (transcribed) as mRNA or messenger RNA. Instead of students, they are RNA polymerases.

mRNA and DNA are different forms of the same language. Both are built using nucleotides (adenine, cytosine and guanine are shared; thymine in DNA is replaced by uracil in RNA). DNA is a double-stranded helix but RNA is single-stranded.

RNA polymerase binds to the promoter region (a site along the DNA that marks the starting point for transcription) and unwinds the DNA. It then travels down the DNA strand, making an RNA copy of the DNA. Unlike DNA polymerase, RNA polymerase doesn’t need a primer. Once RNA polymerase reaches a region called the terminator sequence, it will detach, release the mRNA copy and allow the DNA double helix to reform.
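If it helps to see transcription as plain text manipulation, here is a small Python sketch. It assumes you already know where the transcribed region starts and ends (the promoter and terminator logic is left out), and the function names are my own. The mRNA has the same letters as the coding strand with T swapped for U, which is the same thing as base-pairing against the template strand.

```python
def transcribe(coding_strand: str) -> str:
    """mRNA read off the coding strand: same sequence, T replaced by U."""
    return coding_strand.upper().replace("T", "U")


def transcribe_from_template(template_strand: str) -> str:
    """Build mRNA by base-pairing against the template strand (given 5'->3')."""
    pairs = {"A": "U", "T": "A", "C": "G", "G": "C"}
    # The polymerase reads the template 3'->5', so walk it in reverse.
    return "".join(pairs[base] for base in reversed(template_strand.upper()))


coding = "ATGGCC"    # toy coding strand, 5'->3'
template = "GGCCAT"  # its partner strand, also written 5'->3'
print(transcribe(coding))                  # AUGGCC
print(transcribe_from_template(template))  # AUGGCC
```

Both routes give the same mRNA, which is why sequence editors usually just show you the coding strand.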

Now if you’re working with prokaryotes (bacteria), transcription would be done once the mRNA is created. Bacteria are simpletons. They’re the type to go to Starbucks and order a basic vanilla latte.

Eukaryotes, on the other hand, would be the type to go and order a venti, half-soy milk, half-almond milk, 1 shot espresso, 2 shots matcha, no whipped cream frappuccino with cane sugar, 1 pump vanilla syrup, 3 short sprinkles of cinnamon and a pinch of pink Himalayan sea salt (phew).

After eukaryotes make an RNA copy (they call it pre-mRNA), it has to go through some modifications before becoming mature mRNA.

Post-transcriptional modifications (only in eukaryotes):

1. 5' Cap and Poly-A Tail
Pre-mRNA has to go through a dangerous journey from the nucleus to the cytoplasm. To protect the genetic information from degradation, enzymes in the cell add a 5' cap (a modified guanine on the 5' end) and a poly-A tail (50–250 adenines on the 3' end).

2. Alternative mRNA Splicing
When you’re in art class and using construction paper, there will be the part you cut out that you need and there will also be scraps of leftover construction paper. Likewise, for making proteins in eukaryotes, we don’t need all the pre-mRNA.

The pre-mRNA is divided into alternating sections. There are introns that don’t code for anything and are spliced out by a spliceosome. That leaves the exons, which are joined together and move on to be translated. Because exons can be joined in different combinations (alternative splicing), a single gene can encode multiple different proteins.

from https://www.ncbi.nlm.nih.gov/ Molecular Biology Review
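Splicing is easy to picture as cutting and pasting a string. The sketch below is a toy, with a made-up pre-mRNA where introns are lowercase and exons are uppercase, and with exon positions handed in by hand (a real spliceosome finds them from sequence signals). Joining a different subset of exons gives a different mature mRNA from the same gene.

```python
def splice(pre_mrna: str, exons: list[tuple[int, int]]) -> str:
    """Join the chosen exon slices (start, end with end exclusive) in order."""
    return "".join(pre_mrna[start:end] for start, end in exons)


# Hypothetical pre-mRNA: EXON1 intron EXON2 intron EXON3
pre_mrna = "AUGGCC" + "guaaguuuag" + "CUUCUC" + "guccccag" + "GAAUAA"
exon1, exon2, exon3 = (0, 6), (16, 22), (30, 36)

print(splice(pre_mrna, [exon1, exon2, exon3]))  # AUGGCCCUUCUCGAAUAA
print(splice(pre_mrna, [exon1, exon3]))         # AUGGCCGAAUAA (exon 2 skipped)
```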

You likely won’t have to worry about post-transcriptional modifications in your plasmid-designing as the cell’s machinery takes care of it, but it’s still good to know :)

Step 2: Translation: mRNA → Protein

Going back to our school assignment analogy, once the students finish copying the assignment down, they go home, read and complete it. That’s what happens during translation but instead of reading words, they read codons; instead of going home, the mRNA goes to a ribosome; and instead of doing schoolwork, proteins are made!

At the most basic level, proteins are made using amino acids. Each amino acid is coded by a corresponding codon (a triplet of nucleotides on the mRNA). There are 64 possible codons as shown in the table below.

from https://openoregon.pressbooks.pub/ yellow → amino acid, purple → codon

Let’s do a quick exercise. If a strand of mRNA had the code
5'-AUGGCCCUUCUCGAACGCAGAUAA-3', what amino acid chain would it code for?

Answer: met-ala-leu-leu-glu-arg-arg-[release factor]
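You can check the exercise yourself with a few lines of Python. This sketch builds the standard codon table (all 4 × 4 × 4 = 64 codons) from the usual one-letter amino acid string, then reads the mRNA three letters at a time from the first AUG until it hits a stop codon. The function name is my own, and it ignores everything a real ribosome needs besides the codons.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate from the first AUG to the first stop codon (one-letter codes)."""
    mrna = mrna.upper()
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, nothing to translate
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i : i + 3]]
        if amino_acid == "*":
            break  # stop codon: the chain is released
        protein.append(amino_acid)
    return "".join(protein)


print(len(CODON_TABLE))                       # 64
print(translate("AUGGCCCUUCUCGAACGCAGAUAA"))  # MALLERR
```

MALLERR is the one-letter spelling of met-ala-leu-leu-glu-arg-arg.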

The conversion of codon → amino acid is facilitated by tRNA or transfer RNA. tRNA are pieces of RNA shaped into a cloverleaf structure floating throughout the cell. On one hand, it contains an anticodon — a complementary sequence to a specific codon. If the anticodon on the tRNA was AAG, it would bond to a UUC codon on the mRNA. On tRNA’s other hand, there’s an amino acid attachment site where the tRNA would grab and carry a corresponding amino acid hanging out in the cell.

And now to tie it all together, enter the ribosome! mRNA goes through the ribosome where each codon is matched with a tRNA and its amino acid is released from the tRNA and added to the growing polypeptide chain (the protein). This process is repeated up to 33 000 times.

The brown t-shaped objects are the tRNA containing an anticodon at the bottom and a corresponding amino acid at the top. The ribosome (yellow oval) joins the amino acids together to form the polypeptide chain.

The flow of translation is like a waltz in 3/4 time.

It’s also important to note a few special codons: start & stop codons. These codons tell ribosomes when to start and stop translating. In the table above, AUG is the start codon (which also codes for methionine) and UAA, UGA, UAG are the stop codons. This standard code is shared by nearly all organisms, prokaryotes included, although a few organisms and organelles (such as mitochondria) use slight variants, so check the one you’re working with.

These codons set the correct reading frame to start and stop translation. Without them, a message such as “bob the bug ate the dog” may be read as “obt heb uga tet hed og”. It would make the cell very confused. This is also why a single mutation that adds or removes a nucleotide (a frameshift) can be so deadly.
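The “bob the bug” example is simple to reproduce. This little sketch (the function name is made up) chops a message into three-letter words starting from a chosen offset, which is all a reading frame really is.

```python
def read_in_frame(message: str, offset: int = 0, size: int = 3) -> str:
    """Split a message into fixed-size words, starting at the given offset."""
    shifted = message[offset:]
    return " ".join(shifted[i : i + size] for i in range(0, len(shifted), size))


message = "bobthebugatethedog"
print(read_in_frame(message))            # bob the bug ate the dog
print(read_in_frame(message, offset=1))  # obt heb uga tet hed og
```

Starting one letter late scrambles every word after it, which is what happens to every codon downstream of a frameshift.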

Ribosomal binding sites

For the ribosomes (which are freely floating around in the cell) to find the start codon and initiate translation, we need another sequence. Keep in mind that ribosomes don’t have eyes and can’t look around for AUGs. In prokaryotes, this sequence is the Shine-Dalgarno (SD) sequence, AGGAGG. It is located about 8 bp upstream of the start codon, in the ribosomal binding site. The SD sequence contains nucleotides complementary to RNA in the ribosome (ACCUCCUUA, at the 3' end of the 16S rRNA; C matches with G, U matches with A, etc.), allowing the ribosome to *click* on and start translating.
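Here is a short Python sketch of that *click*. It checks that the pairing partner of AGGAGG sits inside the ribosome's ACCUCCUUA stretch, and measures the gap between the Shine-Dalgarno sequence and the start codon in a toy mRNA. The toy mRNA and the function names are hypothetical, and the gap is counted from the end of AGGAGG to the A of AUG, which is one of several ways people measure that spacing.

```python
def rna_reverse_complement(rna: str) -> str:
    """Return the strand that would base-pair with this RNA (A<->U, C<->G)."""
    pairs = {"A": "U", "U": "A", "C": "G", "G": "C"}
    return "".join(pairs[base] for base in reversed(rna.upper()))


def spacer_length(mrna: str, sd: str = "AGGAGG"):
    """Bases between the end of the SD sequence and the next AUG, or None."""
    sd_start = mrna.find(sd)
    if sd_start == -1:
        return None
    sd_end = sd_start + len(sd)
    start_codon = mrna.find("AUG", sd_end)
    return None if start_codon == -1 else start_codon - sd_end


SHINE_DALGARNO = "AGGAGG"
RIBOSOME_STRETCH = "ACCUCCUUA"  # sequence quoted above, written 5'->3'

partner = rna_reverse_complement(SHINE_DALGARNO)
print(partner)                      # CCUCCU
print(partner in RIBOSOME_STRETCH)  # True

toy_mrna = "GGAUCC" + "AGGAGG" + "UAAAUUCC" + "AUGGCCUAA"  # hypothetical
print(spacer_length(toy_mrna))      # 8
```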

In eukaryotes, the same logic applies but the sequence is called the Kozak sequence, GCCGCCACCAUGG. It already contains the start codon (AUG), so when designing, you can paste it at the front of your gene insert; just make sure the start codon isn’t duplicated and your gene stays in frame with it.

This awesome video shows what DNA transcription and translation look like in real life

Now that we’re back from our not-so-short central dogma tangent, let’s continue our plasmid’s anatomy episode.

Inserted Gene

This is every plasmid’s raison d’être. Without this unique gene, there is no point in designing a specialized plasmid. The inserted gene is like the supreme commander of the plasmid, everything exists to serve the wishes of this gene.

This gene can be anything, from GFP that colours cells neon green to the heme protein used to make Impossible burgers. This gene is the DNA segment that is replicated, transcribed, translated into a functional protein and harvested.

from addgene.org | FYI restriction enzymes are not actually scissors shaped

After your chosen restriction enzyme cleaves at its restriction site, the gene insert slots into the gap and DNA ligase seals it in place.

Antibiotic Resistance Gene

Cold email response rates are around 10%… which is not a comfortably high number. But did you know that only 1 cell in about 10,000 or more cells becomes competent to take up the foreign DNA? That’s only 0.01%. How can we get rid of the 99.99% of untransformed cells?

We kill them :D

And scientists thought of a genius way to do so.

When working with bacteria, we include an antibiotic resistance gene in the plasmid so that only the transformed cells (cells that have taken up the plasmid) will be able to survive in an environment with antibiotics. The antibiotic resistance gene acts as an antidote to the poison (the antibiotic). A common example is ampicillin. The ampicillin resistance gene (AmpR) codes for an enzyme (β-lactamase). This enzyme catalyzes the hydrolysis of ampicillin (aka destroys the antibiotic in the surrounding area).

When the host cell divides, the plasmid it contains will also replicate and identical copies are distributed to each daughter cell. That way, colonies of bacteria with antibiotic resistance will grow and all the bacteria we don’t need will die!

Auxotrophic Selection Marker

An auxotrophic selection marker has the same job as an antibiotic resistance gene but works for eukaryotic host cells instead. Scientists first modify the host cell so it can’t produce an organic compound required for its growth (ex. an essential amino acid). Then we give organic compound-producing abilities back to the cells that have taken up the plasmid and boom, molecular hunger games where only the transformed cells survive.

A common example is using LEU2, a selection marker for the amino acid leucine. LEU2 encodes beta-isopropylmalate dehydrogenase, the enzyme that catalyzes the third step in leucine biosynthesis (i.e. enabling the cell to manufacture leucine and survive!)

Productizing the Plasmid

Now that we have the technical knowledge, we can design our very own plasmids. I’ll be going over what I did for my plasmid project as it’s quite universal and uses everything you just learned.

My plasmid was an expression plasmid with the end goal to create the ODP enzyme. However, plasmids can also be cloning plasmids to use the cell’s machinery to make copies of DNA fragments.

Step 1: Make a Benchling account.

Benchling. Google docs for plasmid design.

To start painting, an artist needs a blank canvas. As DNA artists, we also need a DNA blank canvas to start displaying our artistry.

Step 2: Go to www.addgene.org/collections/empty-backbones/ and find a plasmid backbone. The 2 most common backbones are for E. coli and yeast as they are the cheapest. However, it varies based on your project. You’ll see that the empty backbone is actually not empty at all! In fact, it looks like there’s a whole lot going on. But that’s okay because there are only a few key elements that we need to worry about and we already learned about them above.

I chose this plasmid for yeast but as long as your backbone has the necessary elements, it should be fine to use

For my project, I used the pXP420 yeast backbone to create a plasmid that would code for the ODP enzyme. This enzyme would be able to break down organophosphates, a common toxic chemical in pesticides. By inserting these plasmids into rat heart cells situated upon jellyfish scaffolds (medusoids) and running an electric current through everything, hypothetically, it would start contracting and relaxing like a real jellyfish in water! And be able to clean the water at the same time.

Learn more about my project below:

Step 3: Import the plasmid sequence into Benchling by clicking + > DNA Sequence > New DNA Sequence > Search External Databases > [paste the addgene link]

You should get something that looks like this

To decode all the random letters,

ori = E. coli origin of replication, 2u ori = yeast origin of replication

There are two origins because plasmids like this are often copied in E. coli first and then moved into yeast, and each host’s replication machinery recognizes a different origin of replication.

HIS3 = auxotrophic selection marker; it encodes the enzyme that catalyzes the sixth step in histidine biosynthesis

AmpR = ampicillin resistance

For my project, as I intend to insert this plasmid into yeast, 2u ori and HIS3 would be used, but the cell will take care of that by itself.

TEF1 promoter = promoter

CYC1 terminator = terminator

Step 4: Pick a suitable restriction site.

The letters along the circumference are all restriction sites. We want to make a cut at a multiple cloning site so look for groups of multiple restriction sites together (ex. the ones that say +9, +6, etc.).

The specific restriction site you pick would be based on the price, availability of restriction enzymes, etc. but as we’re modelling this online, picking an enzyme with exactly 1 cut site is all you need. I chose SacI because it had only 1 cut site.

To find the number of cut sites, click the scissors on the right sidebar and search for your enzyme.

Now it’s time to get cooking 🧑‍🍳

First ingredient — the promoter.

The promoter and other elements would be assembled, or inserted, at a lab/company using a procedure called Gibson Assembly.

Gibson Assembly is a procedure to join multiple strands of DNA. See below.

Gibson Assembly eats away DNA at the restriction site and so to simulate the insertion, we can…

Step 5: Copy the TEF1 promoter, delete the SacI site and paste the TEF1 promoter in its place.

Now, this might be a bit confusing but essentially what would happen is that we would send the empty backbone (pXP420) to the lab, also send the finished product (our designed plasmid), tell them that we used SacI to cut the DNA and they will know what to do.

Step 6: As we’re working with eukaryotic cells, we need to insert the Kozak sequence (GCCGCCACCAUGG, typed with T in place of U in a DNA sequence) right after the promoter and annotate it.

Step 7: Now we’re ready for the 🥁🥁🥁

GENE INSERT!

This is where it’s super personalized. I copied in the ODP gene as that was the gene proven to code for an enzyme that would break down organophosphates. SnapGene has a bunch of gene sequences. If not, google will yield results just as nicely. And don’t forget to annotate it!

🚨 Important: make sure your gene insert does not contain the restriction site you chose or else ✂️✂️. SnapGene viewer can help you with that.
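If you’d rather double-check by hand, the same test is a few lines of Python. This sketch looks for the enzyme’s recognition site on both strands of the insert. The two toy inserts are made up; GAGCTC is the SacI recognition site.

```python
def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string (A<->T, C<->G)."""
    pairs = {"A": "T", "T": "A", "C": "G", "G": "C"}
    return "".join(pairs[base] for base in reversed(dna.upper()))


def contains_site(insert: str, site: str) -> bool:
    """True if the recognition site appears on either strand of the insert."""
    insert, site = insert.upper(), site.upper()
    return site in insert or reverse_complement(site) in insert


SACI_SITE = "GAGCTC"
risky_insert = "ATGGCTGAGCTCTAA"  # hypothetical insert that carries a SacI site
safe_insert = "ATGGCTGAACTGTAA"   # hypothetical insert without one

print(contains_site(risky_insert, SACI_SITE))  # True: pick another enzyme
print(contains_site(safe_insert, SACI_SITE))   # False: good to go
```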

After pasting in your gene insert, the important work is done. We now need to tie up the loose ends by stopping transcription & translation.

Step 8: Right after the gene insert, paste in & annotate the stop codon (TGA), which will recruit a release factor and stop the ribosome from translating the mRNA.

Step 9: Copy the CYC1 terminator, paste it after the stop codon, annotate it, and delete the previous annotation. This will halt transcription.

Step 10: Create a blank entry and write down what we did — “We cut this plasmid using SacI and we inserted ODP”
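To see how the pasted pieces line up, here is a rough Python sketch that strings the parts together in order and records where each annotation starts and ends. Every sequence in it is a short placeholder, not the real TEF1 promoter, ODP gene or CYC1 terminator. It also makes one assumption that differs from a straight paste: the toy gene already begins with ATG, so only the part of the Kozak sequence upstream of the start codon (GCCGCCACC) is added, which keeps the start codon from being duplicated.

```python
def build_cassette(parts):
    """Concatenate (name, sequence) parts; return the sequence and annotations."""
    sequence = ""
    annotations = []
    for name, part in parts:
        start = len(sequence)
        sequence += part.upper()
        # Annotation as (name, start, end) with end exclusive, 0-based.
        annotations.append((name, start, len(sequence)))
    return sequence, annotations


# Placeholder sequences only, labeled with the roles used in the steps.
parts = [
    ("promoter", "CCCTTTCCC"),
    ("Kozak (upstream part)", "GCCGCCACC"),
    ("gene insert", "ATGGCCCTT"),
    ("stop codon", "TGA"),
    ("terminator", "AAATTTAAA"),
]

cassette, annotations = build_cassette(parts)
print(cassette)
for name, start, end in annotations:
    print(f"{name}: {start}-{end}")
```

Keeping the parts as a list makes it easy to swap in a different gene or promoter and rebuild the whole thing.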

And that’s it!

View my finished project here.

Of course, the real world is more complicated, so be sure to check with a professional before getting your plasmid manufactured and actually experimenting with it. But those are the basic steps you will need.

Learn more about plasmids here: Plasmids 101.

Here’s a really nice video going into more detail on designing plasmids in Benchling.

Secrets of the Lab

After designing your picture-perfect plasmid and placing an order at a lab to get it produced, you now have a vial of custom plasmids. Now what?

We need to help them infiltrate some cells. Learn how to here and here.

You can also watch my video from when I performed the procedure and transformed my kitchen into a mini-lab:

Plasmids’ Career Path Choices

Plasmids have an endless number of uses. Here are just a few.

To transfer genes into the cells of bacteria, plants, animals, or other living organisms to improve their growth rates or other traits.
To farm proteins in large amounts (ex. insulin).
To introduce therapeutic genes into target cells to regulate gene expression.

TLDR

Everything is just floating around in the cell, it’s not a factory where molecules wait in line
Plasmids = giving the cell an extra set of instructions to follow
Every plasmid has 8 main elements: the origin of replication, restriction sites, a promoter, terminator, inserted gene, antibiotic resistance gene (prokaryotic) or auxotrophic selection marker (eukaryotic) and start & stop sequences for translation.
Resources to use when designing plasmids: Benchling, Addgene, SnapGene, this video
Plasmids are introduced to cells through cell transformation
Plasmids are used to improve cells, farm proteins & deliver gene therapy

Thanks for making it to the end of this article! 💞

If you have any questions, feel free to reach out to me on LinkedIn or Twitter. If there is anything in my article that needs to be fixed or added upon, please let me know as well, I’m still learning!
