- Level: Foundation
- Duration: 12 hours
- Course by Johns Hopkins University
About
We will learn computational methods -- algorithms and data structures -- for analyzing DNA sequencing data. We will learn a little about DNA, genomics, and how DNA sequencing is used. We will use Python to implement key algorithms and data structures and to analyze real genomes and DNA sequencing datasets.
Modules
Welcome
6
Readings
- Welcome to Algorithms for DNA Sequencing
- Pre Course Survey
- Syllabus
- Setting up Python (and Jupyter)
- Getting slides and notebooks
- Using data files with Python programs
Module 1: DNA sequencing, strings and matching
19
Videos
- Module 1 Introduction
- Lecture: Why study this?
- Lecture: DNA sequencing past and present
- Lecture: Genomes as strings, reads as substrings
- Lecture: String definitions and Python examples
- Practical: String basics
- Practical: Manipulating DNA strings
- Practical: Downloading and parsing a genome
- Lecture: How DNA gets copied
- Optional lecture: How second-generation sequencers work
- Optional lecture: Sequencing errors and base qualities
- Lecture: Sequencing reads in FASTQ format
- Practical: Working with sequencing reads
- Practical: Analyzing reads by position
- Lecture: Sequencers give pieces to genomic puzzles
- Lecture: Read alignment and why it's hard
- Lecture: Naive exact matching
- Practical: Matching artificial reads
- Practical: Matching real reads
Quiz
1
Assignment
- Module 1
Programming Homework
1
Assignment
- Programming Homework 1
1
Readings
- Programming Homework 1 Instructions (Read First)
Module 2: Preprocessing, indexing and approximate matching
15
Videos
- Week 2 Introduction
- Lecture: Boyer-Moore basics
- Lecture: Boyer-Moore: putting it all together
- Lecture: Diversion: Repetitive elements
- Practical: Implementing Boyer-Moore
- Lecture: Preprocessing
- Lecture: Indexing and the k-mer index
- Lecture: Ordered structures for indexing
- Lecture: Hash tables for indexing
- Practical: Implementing a k-mer index
- Lecture: Variations on k-mer indexes
- Lecture: Genome indexes used in research
- Lecture: Approximate matching, Hamming and edit distance
- Lecture: Pigeonhole principle
- Practical: Implementing the pigeonhole principle
Quiz
1
Assignment
- Module 2
Programming Homework
1
Assignment
- Programming Homework 2
1
Readings
- Programming Homework 2 Instructions (Read First)
Module 3: Edit distance, assembly, overlaps
13
Videos
- Module 3 Introduction
- Lecture: Solving the edit distance problem
- Lecture: Using dynamic programming for edit distance
- Practical: Implementing dynamic programming for edit distance
- Lecture: A new solution to approximate matching
- Lecture: Meet the family: global and local alignment
- Practical: Implementing global alignment
- Lecture: Read alignment in the field
- Lecture: Assembly: working from scratch
- Lecture: First and second laws of assembly
- Lecture: Overlap graphs
- Practical: Overlaps between pairs of reads
- Practical: Finding and representing all overlaps
Quiz
1
Assignment
- Module 3
Programming Homework
1
Assignment
- Programming Homework 3
1
Readings
- Programming Homework 3 Instructions (Read First)
Module 4: Algorithms for assembly
13
Videos
- Module 4 introduction
- Lecture: The shortest common superstring problem
- Practical: Implementing shortest common superstring
- Lecture: Greedy shortest common superstring
- Practical: Implementing greedy shortest common superstring
- Lecture: Third law of assembly: repeats are bad
- Lecture: De Bruijn graphs and Eulerian walks
- Practical: Building a De Bruijn graph
- Lecture: When Eulerian walks go wrong
- Lecture: Assemblers in practice
- Lecture: The future is long?
- Lecture: Computer science and life science
- Lecture: Thank yous
Homework
1
Assignment
- Programming Homework 4
Quiz
1
Assignment
- Module 4
Post Course Survey
1
Readings
- Post Course Survey
Auto Summary
"Algorithms for DNA Sequencing" is a foundation-level course in Data Science & AI from Johns Hopkins University, offered on Coursera. It covers computational methods for analyzing DNA sequencing data, with a focus on key algorithms and data structures. Learners get a short introduction to DNA and genomics and use Python to work with real genomes and sequencing datasets. The course runs 720 minutes (12 hours) and is available with a Starter subscription.

Ben Langmead, PhD

Jacob Pritt