= How to use an Oxford Nanopore MinION to extract DNA from river water and determine which bacteria live in it
{c}
{numbered}
{scope}

This article gives an idea of what this kind of biological experiment feels like to a software engineer who has never done any biology.

= Experiment background
{parent=oxford nanopore river bacteria}

https://www.puntseq.co.uk/[PuntSeq] is a side project led by a few PhDs that aims to determine which bacteria are present in the https://en.wikipedia.org/wiki/River_Cam[River Cam].

In July 2019, the PuntSeq team got together with the awesome https://biomake.space[Cambridge Biomakespace], a biology makerspace open to all, to create a two-day science outreach activity showing their procedures.

The data collected in this experiment, together with other collection sessions done by the organizers, actually led to a publication in eLife: https://elifesciences.org/articles/61504 "Freshwater monitoring by nanopore sequencing" by Lara Urban et al. (2021), so it is awesome to see that we were actually part of "real science".

Ciro knows nothing about biology, but since he is very curious about it, he jumped at this opportunity, and decided to document things as well as his limited knowledge would allow.

All participants chipped in some money to help cover the experiment's costs. Ciro suspects that this activity was done partially to help crowdfund the experiment, but it was a worthy investment!

The impressions you get from the experiment as a software engineer will be:
* OMG, this is so labour intensive, why haven't they automated this
* OMG, this is frightening, all the 8 hours of work I've just done are present in that tiny plastic tube
* Amazing! Look at that apparatus! And the bio people are like: I've used this a million times, it's cheap and every lab has one, just work faster and don't break, you piece of junk!

= Overview of the experiment
{parent=oxford nanopore river bacteria}

For those that know biology and just want to do the thing, see: {full}.
The PuntSeq team uses a MinION DNA sequencer made by Oxford Nanopore to sequence the <16S ribosomal RNA>[16S] region of bacterial DNA, which is about 1500 nucleotides long.

This kind of "decode everything from the sample to see what species are present" approach is called "metagenomics".

This is what the MinION looks like: {full}.

\Image[https://upload.wikimedia.org/wikipedia/commons/thumb/5/57/Oxford_Nanopore_MinION_top_cropped.jpg/392px-Oxford_Nanopore_MinION_top_cropped.jpg]
{title=Oxford Nanopore MinION top}

\Image[https://upload.wikimedia.org/wikipedia/commons/thumb/6/6e/Oxford_Nanopore_MinION_side_cropped.jpg/191px-Oxford_Nanopore_MinION_side_cropped.jpg]
{title=Oxford Nanopore MinION side}

\Image[https://upload.wikimedia.org/wikipedia/commons/thumb/0/0a/Oxford_Nanopore_MinION_top_open_cropped.jpg/110px-Oxford_Nanopore_MinION_top_open_cropped.jpg]
{height=500}
{title=Oxford Nanopore MinION top open}

\Image[https://upload.wikimedia.org/wikipedia/commons/thumb/0/0f/Oxford_Nanopore_MinION_side_USB_cropped.jpg/597px-Oxford_Nanopore_MinION_side_USB_cropped.jpg]
{title=Oxford Nanopore MinION side USB}

The 16S region codes for one of the pieces that make up the https://en.wikipedia.org/w/index.php?title=Ribosome&oldid=912600990#Bacterial_ribosomes[bacterial ribosome].

Before sequencing the DNA, we will do a PCR with primers that fit just before and just after the 16S DNA, in well conserved regions expected to be present in all bacteria.

The PCR replicates only the DNA region between our two selected primers a gazillion times, so that in practice only those regions get picked up by the sequencing step.

Eukaryotes also have an analogous ribosome part, the 18S region, but the PCR primers are selected for targets around the 16S region which are only present in prokaryotes.

This way, we amplify only the 16S region of bacteria, excluding other parts of the bacterial genome, and excluding eukaryotes entirely.
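The primer selection described above can be illustrated with a minimal in-silico sketch. Everything in it is an assumption made for illustration: the toy sequences and the two 6-base primers are invented and are not the real 16S primers used by PuntSeq, and the matching is exact, whereas real primers tolerate some mismatches. The only point is to show how a region flanked by both primer sites gets selected while DNA missing a primer site yields nothing.

```python
from typing import Optional


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))


def in_silico_pcr(template: str, forward: str, reverse: str) -> Optional[str]:
    """Return the region a primer pair would amplify, or None.

    The forward primer is searched on the given strand. The reverse
    primer anneals to that strand, so it is its reverse complement
    that shows up downstream on the given strand. Exact matches only.
    """
    start = template.find(forward)
    if start == -1:
        return None  # forward primer site absent: nothing is amplified
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None  # reverse primer site absent: nothing is amplified
    return template[start:end + len(reverse_site)]


# Hypothetical toy primers and genomes, invented for this example.
FORWARD_PRIMER = "ACGTAC"
REVERSE_PRIMER = "TTGCAG"
toy_bacterium = "GGGACGTACTTTAAACCCCTGCAAGGG"  # has both primer sites
toy_eukaryote = "GGGTTTTTTAAACCCCTGCAAGGG"     # lacks the forward site

print(in_silico_pcr(toy_bacterium, FORWARD_PRIMER, REVERSE_PRIMER))
# prints ACGTACTTTAAACCCCTGCAA
print(in_silico_pcr(toy_eukaryote, FORWARD_PRIMER, REVERSE_PRIMER))
# prints None
```

The toy bacterium yields only the stretch between the two primer sites, while the toy eukaryote yields nothing because one site is missing, which mirrors how the real primers exclude both the rest of the bacterial genome and eukaryotic DNA.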
Despite coding such a fundamental piece of RNA, there is still surprising variability in the 16S region across different bacteria, and it is those differences that will allow us to identify which bacteria are present in the river.

The variability exists because certain base pairs are not fundamental for the function of the 16S region.

This variability happens mostly on https://en.wikipedia.org/wiki/Stem-loop[RNA loops as opposed to stems], i.e. parts of the RNA that don't base pair with other RNA in the https://en.wikipedia.org/wiki/Nucleic_acid_secondary_structure[RNA secondary structure] as shown at: {full}.

``
                A-U
               /   \
A-U-C-G-A-U-C-G     C
| | | | | | | |     |
U-A-G-C-U-A-G-C     G
               \   /
                U-A

|             ||    |
+-------------++----+
     stem       loop
``
{title=RNA stem-loop structure}

This is what the 16S RNA secondary structure looks like in its full glory: {full}.

\Image[https://upload.wikimedia.org/wikipedia/commons/a/a6/16S.svg]
[height=800]
{height=500}
{title=16S RNA secondary structure}

Since loops don't base pair, they are less crucial in the determination of the secondary structure of the RNA.

The variability is such that it is possible to tell individual species apart if full sequences are known with certainty.

Given the limitations of this experiment, however, we would only be able to obtain https://en.wikipedia.org/wiki/Family_(biology)[family] or https://en.wikipedia.org/wiki/Genus[genus] level breakdowns.

= Why Oxford Nanopore was used instead of Illumina for the sequencing
{parent=Overview of the experiment}

At the time of the experiment, Illumina equipment was cheaper per base pair and dominated the sequencing market, but it required a much higher initial investment for the equipment (TODO how much).
The reusable Nanopore device costs just https://web.archive.org/web/20190717141155/https://store.nanoporetech.com/starter-packs/[about 500 dollars], plus https://web.archive.org/web/20190911092809/https://store.nanoporetech.com/flowcells.html[about 500 dollars (50 unit volume)] for the single-use flow cell, which can decode up to 30 billion base pairs. Since a human genome is about 3 billion base pairs long, that is about 10 human genomes at 1x coverage!

Note that 1x is basically useless for one of the most important of all applications of sequencing: detection of , since the error rate would be too high to base clinical decisions on.

Compare that to Illumina, which currently delivers a human genome at 30x for about 1000 dollars, with somewhat fewer errors per base pair (TODO how much).

Other advantages of the MinION over Illumina which didn't really matter to this particular experiment are:
* portability, e.g. to do analysis in the field near infection outbreaks. Compare that to the smallest Illumina sequencer available in 2019, the iSeq 100: {full}.

  \Image[https://web.archive.org/web/20190922113448if_/https://www.illumina.com/content/dam/illumina-marketing/images/systems/v2/web-graphic/iseq-100-demo-video-thumbnail-web-graphic.jpg]
  {title=Illumina iSeq 100 DNA sequencer}
  {source=https://www.illumina.com/systems/sequencing-platforms/iseq.html}
* long reads, which can be necessary for long repetitive regions, see also: {full}

= Sample collection
{parent=oxford nanopore river bacteria}

As you would expect, there is no big secret here: we just dumped a 1 liter glass bottle with a rope attached around the neck in a few different locations of the river, and pulled it out with the rope.

And, in the name of science, we even wore gloves to not contaminate the samples!
\Image[https://upload.wikimedia.org/wikipedia/commons/thumb/3/33/River_water_sample_collection_swans.jpg/800px-River_water_sample_collection_swans.jpg]
{title=Swans swimming in the river during sample collection}
{description=Swan poo bacteria?}

\Image[https://upload.wikimedia.org/wikipedia/commons/thumb/a/a9/River_water_sample_collection_tie_rope_to_bottle.jpg/360px-River_water_sample_collection_tie_rope_to_bottle.jpg]
{height=400}
{title=Tying rope to bottle for river water sample collection}

\Image[https://upload.wikimedia.org/wikipedia/commons/thumb/9/9b/River_water_sample_collection_get_sample.jpg/360px-River_water_sample_collection_get_sample.jpg]
{height=400}
{title=Dumping the bottle into the river to collect the water sample}

\Image[https://upload.wikimedia.org/wikipedia/commons/thumb/7/75/River_water_sample_collection_measure_temperature.jpg/360px-River_water_sample_collection_measure_temperature.jpg]
{height=400}
{title=Measuring the river water sample temperature with a thermometer}

\Image[https://upload.wikimedia.org/wikipedia/commons/thumb/4/4f/River_water_sample_collection_read_PH_strip.jpg/360px-River_water_sample_collection_read_PH_strip.jpg]
{height=400}
{title=Measuring the river water sample pH with a pH strip}
{description=The strip is compared with the color of a that gives the pH for a given strip color.}

\Image[https://upload.wikimedia.org/wikipedia/commons/thumb/0/0a/River_water_sample_collection_identify_bottle.jpg/360px-River_water_sample_collection_identify_bottle.jpg]
{height=400}
{title=Noting sample collection location on the water bottle}

\Video[https://upload.wikimedia.org/wikipedia/commons/transcoded/b/bb/River_water_sample_collection_with_a_bottle_and_string.ogv/River_water_sample_collection_with_a_bottle_and_string.ogv.480p.vp9.webm]
{height=400}
{title=Dumping the bottle into the river to collect the water sample}
{description=That was fun.}

= DNA extraction
{parent=oxford nanopore river bacteria}

The first thing we had to do with the sample was
to extract the DNA present in the water in a pure form for the PCR.

We did that with a .

As you would expect, this consists of a purification procedure with several steps.

In each step we take a physical or chemical action on the sample, which splits it into two parts: the one with the DNA and the one without.

We then keep the part with the DNA, and throw away the part without it.

The first steps are coarser, and the splits get finer and finer as we move forward.

= Filtration with vacuum pump
{parent=DNA extraction}

The first thing we did was to filter the water samples with a membrane filter that is so fine that not even bacteria can pass through, but water can.

Therefore, after filtration, all particles such as bacteria and larger pieces of dirt are left on the filter.

From the 1 liter in each bottle, we only used 400 ml, because previous experiments showed that filtering the remaining 600 ml is very time consuming, since the membrane filter gets clogged up.

The filtration step therefore allows us to reduce those 400 ml volumes to more manageable volumes: {full}. Reagents are expensive, and lab bench centrifuges are small!

\Image[https://upload.wikimedia.org/wikipedia/commons/thumb/3/3f/Microcentrifuge_tube_in_hand.jpg/640px-Microcentrifuge_tube_in_hand.jpg]
{title=An Eppendorf tube}
{description=They are small, convenient and disposable.}

\Image[https://upload.wikimedia.org/wikipedia/commons/thumb/3/35/Labelled_Eppendorf_microcentrifuge_tubes_on_rack.jpg/640px-Labelled_Eppendorf_microcentrifuge_tubes_on_rack.jpg]
{title=Labelled Eppendorf tubes on a rack}

Since the filter is so fine, filtering by gravity alone would take forever, and so we used a vacuum pump to speed things up!
For that we used:
*
*

\Image[https://upload.wikimedia.org/wikipedia/commons/6/6e/Vacuum_pump_filter_peel_filter.png]
{height=400}
{title=Peeling the vacuum pump filter protection peel before usage}

\Image[https://upload.wikimedia.org/wikipedia/commons/7/78/Vacuum_pump_filter_place_filter.png]
{title=Placing the vacuum pump filter}

\Video[https://upload.wikimedia.org/wikipedia/commons/transcoded/3/3f/Vacuum_pump_filter_pour_sample_and_turn_on.webm/Vacuum_pump_filter_pour_sample_and_turn_on.webm.480p.vp9.webm]
{height=400}
{title=Pouring the water sample into the vacuum tube and turning on the vacuum pump}

= Post filtration purification
{parent=DNA extraction}

After filtration, all DNA should be present in the filter, so we cut the paper up with scissors and put the pieces into an Eppendorf: