Capture rate analysis

Capture rate
Analyzing capture rate with Nanolyzer
Controlled counting

The examples in this tutorial were done using Nanolyzer™, Northern Nanopore's data analysis software.

1. Capture rate

A common claim in the nanopore literature is that the rate at which molecules are detected by a nanopore is proportional to the concentration of the molecule. While that's true if you average enough, at the level of an individual pore, capture rate is a very noisy metric, varying significantly between electrically identical pores for a given sample and even over time on the same pore. It's also one of the most difficult metrics to extract from a nanopore, with many pitfalls that can bias and distort the value in non-intuitive ways. In this section we'll discuss some of these challenges and how to mitigate them. If you want more detail on practical capture rate analysis on nanopore systems, a fuller report is available in one of our recent publications.

1.a. Capture rate statistics

Unless you are operating at very high molecular concentrations or in a geometry where molecules interact with one another prior to translocating the nanopore, molecular capture by a nanopore is a Poisson process, so the time between successive captures is an exponentially distributed random variable. This means that molecular translocations are uncorrelated in time: one translocation has no effect on when the next one occurs. In systems where the concentration is very high near the pore (i.e., many molecules per capture volume on average) this picture breaks down, so bear that in mind when working with complex sample mixtures that have high concentrations of molecules, or in systems where translocation time is comparable to inter-event time. Here, we focus only on the case where concentration is sufficiently low to consider translocations to be uncorrelated in time, and there is never more than one molecule in or near the pore.
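One quick way to check whether a dataset is consistent with this picture is to look at the gaps between consecutive events: for exponentially distributed gaps the standard deviation equals the mean, and neighbouring gaps are uncorrelated. The sketch below is only an illustration on simulated data. The function name interevent_diagnostics, the rate of 5 events per second, and the event count are assumptions made for the example, and nothing here reflects how Nanolyzer works internally.

```python
import math
import random


def interevent_diagnostics(event_times):
    """Return (cv, lag1) for the gaps between consecutive event times.

    For uncorrelated capture the gaps are exponentially distributed, so
    cv (standard deviation / mean) should be close to 1 and the lag-1
    autocorrelation of the gaps should be close to 0.
    """
    gaps = [b - a for a, b in zip(event_times, event_times[1:])]
    n = len(gaps)
    mean = sum(gaps) / n
    var = sum((g - mean) ** 2 for g in gaps) / n
    cv = math.sqrt(var) / mean
    # Covariance between each gap and the gap that follows it.
    cov = sum((gaps[i] - mean) * (gaps[i + 1] - mean) for i in range(n - 1)) / (n - 1)
    lag1 = cov / var
    return cv, lag1


# Simulated event times (assumed rate, for illustration only).
rng = random.Random(0)
true_rate = 5.0  # events per second
t = 0.0
event_times = []
for _ in range(5000):
    t += rng.expovariate(true_rate)
    event_times.append(t)

cv, lag1 = interevent_diagnostics(event_times)
print(f"CV = {cv:.3f}, lag-1 autocorrelation = {lag1:.3f}")
```

A coefficient of variation well away from 1, or a clearly non-zero lag-1 autocorrelation, would suggest that the low-concentration assumption deserves a closer look for that dataset.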

1.b. Capture rate by time-averaging

The simplest possible way to extract capture rate is to count the number of events that occur in a given time interval, and divide by the length of the interval. Simple enough, but prone to a host of errors that make it difficult to use in practice. For example, what if the pore was partially clogged at some point during your time interval? Do you still count events that happen during the partial clog, or are they fundamentally different? Do you subtract the duration of the clog from your total time?
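As a minimal sketch of time-averaging, the function below counts events in a window and divides by the usable duration. It adopts one possible answer to the questions above: events inside a clog are dropped and the clog duration is subtracted from the total time. That convention, the function name time_averaged_rate, and the manually supplied clog_intervals are all assumptions for illustration.

```python
def time_averaged_rate(event_times, t_start, t_end, clog_intervals=()):
    """Events per unit time in [t_start, t_end], excluding clogged periods.

    clog_intervals is a sequence of (clog_start, clog_end) pairs that are
    assumed not to overlap one another. Events that fall inside a clog
    are not counted, and the clogged time is removed from the duration.
    """
    usable = t_end - t_start
    for c0, c1 in clog_intervals:
        lo, hi = max(c0, t_start), min(c1, t_end)
        if hi > lo:
            usable -= hi - lo
    if usable <= 0:
        raise ValueError("no usable time left in the interval")

    def in_clog(t):
        return any(c0 <= t <= c1 for c0, c1 in clog_intervals)

    n_events = sum(
        1 for t in event_times if t_start <= t <= t_end and not in_clog(t)
    )
    return n_events / usable


# Hypothetical event times in seconds, with a partial clog from 3 s to 7 s.
events = [0.5, 1.5, 2.5, 4.0, 7.5, 8.5]
naive = time_averaged_rate(events, 0.0, 10.0)
corrected = time_averaged_rate(events, 0.0, 10.0, clog_intervals=[(3.0, 7.0)])
print(naive, corrected)
```

In this toy example the naive estimate counts 6 events over 10 s, while the corrected one counts 5 events over 6 s of usable time. The two differ substantially, which is exactly why the clog bookkeeping matters.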

In a well-behaved pore this method works very well, but you need to be very careful to check that it really was well-behaved for the entire duration. Automatic clog detection is a feature you can expect in a future version of Nanolyzer, but for the time being, capture rate by time-averaging is not implemented.

1.c. Capture rate by inter-event time distribution

Since we know the exponential distribution that governs translocation statistics, we can fit this distribution using the experimentally measured probability distribution of the inter-event times. Simply build up a histogram of the inter-event times and fit to a single-parameter exponential decay. This is the method implemented in Nanolyzer, for a few reasons.

First, it is robust against outliers. For example, consider a clogged pore that does not allow molecular passage for some time. There will be at least one inter-event time that is very long, at least as long as the clog. But this will contribute only a single, distant outlier to the histogram of inter-event times and will have little effect on the fit.

Second, it allows for multiple mixed subsets of molecules to be reliably fitted. As long as we are confident that a subset contains all of the events that belong to it, and only those events, we can consider the inter-event times within that subset to be independent, and fit them separately from any other events in the sample.
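The histogram fit described above can be sketched as follows. This is not Nanolyzer's implementation. It is a standard-library illustration that bins the inter-event times into a normalized histogram and finds the single rate parameter that minimizes the squared error against rate * exp(-rate * t). The bin width, number of bins, search range, and the simulated 60 s clog are all assumptions chosen for the example.

```python
import math
import random


def histogram_density(gaps, bin_width, n_bins):
    """Normalized histogram of gaps; gaps beyond the last bin are ignored."""
    counts = [0] * n_bins
    for g in gaps:
        i = int(g / bin_width)
        if 0 <= i < n_bins:
            counts[i] += 1
    norm = len(gaps) * bin_width
    centers = [(i + 0.5) * bin_width for i in range(n_bins)]
    return centers, [c / norm for c in counts]


def fit_exponential_rate(gaps, bin_width, n_bins, lo=1e-3, hi=1e3):
    """Least-squares fit of density(t) = rate * exp(-rate * t)."""
    centers, density = histogram_density(gaps, bin_width, n_bins)

    def sse(log_rate):
        rate = math.exp(log_rate)
        return sum(
            (d - rate * math.exp(-rate * t)) ** 2 for t, d in zip(centers, density)
        )

    # Coarse search on a logarithmic grid, then golden-section refinement.
    steps = 200
    grid = [math.log(lo) + k * (math.log(hi) - math.log(lo)) / steps
            for k in range(steps + 1)]
    k_best = min(range(steps + 1), key=lambda k: sse(grid[k]))
    a, b = grid[max(k_best - 1, 0)], grid[min(k_best + 1, steps)]
    phi = (math.sqrt(5) - 1) / 2
    for _ in range(60):
        c, d = b - phi * (b - a), a + phi * (b - a)
        if sse(c) < sse(d):
            b = d
        else:
            a = c
    return math.exp((a + b) / 2)


# Simulated inter-event times at an assumed rate of 5 per second.
rng = random.Random(1)
gaps = [rng.expovariate(5.0) for _ in range(5000)]
gaps_with_clog = gaps + [60.0]  # one very long gap caused by a clog

fit_clean = fit_exponential_rate(gaps, bin_width=0.02, n_bins=100)
fit_clog = fit_exponential_rate(gaps_with_clog, bin_width=0.02, n_bins=100)
mean_clean = len(gaps) / sum(gaps)
mean_clog = len(gaps_with_clog) / sum(gaps_with_clog)
print(fit_clean, fit_clog, mean_clean, mean_clog)
```

The single clog gap should barely move the histogram fit, while the simple count-over-time estimate (mean_clog) drops noticeably, which illustrates the robustness argument. Note that a least-squares histogram fit depends on the choice of bins, so the bin width should be small compared with the typical inter-event time.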

1.d. Capture rate traps

However, things get tricky when events are rejected from analysis, either by Nanolyzer or through subsetting. For example, imagine that Nanolyzer has rejected an event for some reason. Are the inter-event times that need to be considered for the capture rate the times between the rejected event and the events immediately preceding and following it? Or should we pretend that the rejected event was never there and just consider the time between the two "good" events? The former route will give us a larger measured capture rate, since the inter-event times we consider will be shorter. The figure below illustrates the situation. If we have rejected the event marked in red, do we include t3 and t4 in the capture rate analysis, or do we consider their sum, t34?

When I was first presented with this problem, my immediate thought was to ignore all inter-event times that contain a rejected event, and thereby bypass the ambiguity entirely. But this again falls prey to a subtle (or maybe not so subtle) trap: because events are uncorrelated in time, so are events that are rejected. The longer an inter-event time happens to be, the higher the probability that a rejected event will have occurred within it. So when we reject inter-event times that contain a rejected event, we are preferentially rejecting longer inter-event times, and therefore measuring a higher capture rate! In fact, you can show that under the assumption that inter-event times are exponentially distributed and rejected events are randomly distributed through your experiment, rejecting inter-event times that contain a rejected event increases the measured capture rate to exactly the same number that you would have measured had you just considered all the inter-event times without thinking about it at all. Needless to say, realizing that I would have been better off not thinking about the problem in the first place was not exactly a satisfying conclusion.
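The equivalence can be checked with a short simulation. The reasoning, under the stated assumptions, goes like this: if events arrive at rate r and a fraction p are rejected at random, the gaps between accepted events are exponential with rate r(1 - p), and the probability that a gap of length t contains no rejected event is exp(-r p t). Multiplying the two gives a distribution proportional to exp(-r t), which is the full rate again. The symbols r and p, the values used below, and the simple estimator (number of gaps divided by their sum) are assumptions for illustration, not what Nanolyzer does.

```python
import random


def rate_from_gaps(gaps):
    """Simple rate estimate: number of gaps divided by their total length."""
    return len(gaps) / sum(gaps)


rng = random.Random(2)
true_rate = 10.0  # assumed total event rate, per second
p_reject = 0.3    # assumed fraction of events rejected at random

# Build a list of (time, accepted) pairs for a simulated experiment.
t = 0.0
events = []
for _ in range(20000):
    t += rng.expovariate(true_rate)
    events.append((t, rng.random() >= p_reject))

# (a) Use every gap between consecutive detected events, rejected or not.
all_gaps = [b[0] - a[0] for a, b in zip(events, events[1:])]

# (b) Pretend rejected events never happened: gaps between accepted events.
good_times = [time for time, accepted in events if accepted]
merged_gaps = [b - a for a, b in zip(good_times, good_times[1:])]

# (c) Discard any gap that touches a rejected event: keep only gaps between
#     two accepted events with no rejected event in between.
clean_gaps = [b[0] - a[0] for a, b in zip(events, events[1:]) if a[1] and b[1]]

rate_all = rate_from_gaps(all_gaps)
rate_merged = rate_from_gaps(merged_gaps)
rate_clean = rate_from_gaps(clean_gaps)
print(rate_all, rate_merged, rate_clean)
```

Options (a) and (c) should both come out close to the full rate, while option (b) should come out close to true_rate * (1 - p_reject), the rate of events that pass the filter.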

There is no general answer to this problem, unfortunately. It is simply something that you must be acutely aware of when performing capture rate analysis, particularly in datasets that have a high proportion of rejected events. Nanolyzer will include in its capture rate fits the inter-event times of all events that are in the subset in question. Practically, this means that the number Nanolyzer reports to you is not the "capture rate" for your experimental system, but rather the rate at which molecules both pass through the pore and pass your quality control filters. Whether those are the same thing is a judgement call that can only be made by the person performing the analysis, though there are some tricks discussed below that can help you make it.

2. Analyzing capture rate with Nanolyzer

Capture rate analysis with Nanolyzer is in active development. For the time being, the interface is very simple: once you have defined your subsets, simply navigate to the Capture Rate tab and press "Fit Capture Rate" and it will report the capture rates for the chosen subset, subject to the caveats above. It also gives a visual representation of the fits.

Future versions of Nanolyzer will include uncertainty of capture rate fits based on uncertainty in clustering events into subsets, as well as more detailed controls for the fitting process.

3. Controlled counting

Given the difficulties in capture rate analysis, it can often be difficult to decide whether the variability that you will observe in your capture rates is due to analysis challenges or to physical causes. Luckily, the physical causes are easier to address. In a recent publication, we showed that when physical issues are the driver of capture rate variability, using an internal control molecule and measuring two capture rates simultaneously can reduce the variability in the capture rate ratio as compared to the relative variability of the capture rate of either species alone, since any influence of the nanopore and experimental system on the capture rate will equally affect both (with the caveat, of course, that those two molecules have the same capture physics). [1]
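The cancellation argument can be illustrated with a toy simulation. This is not the analysis from [1]. It simply assumes that each pore scales the capture rate of both the analyte and the control by the same random factor, and compares the pore-to-pore spread of the raw analyte rate with the spread of the analyte-to-control ratio. The rates, the number of pores, the number of events, and the log-normal pore factor are all invented for the example.

```python
import math
import random


def coefficient_of_variation(values):
    """Standard deviation divided by mean."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return math.sqrt(var) / mean


def estimate_rate(rng, rate, n_events):
    """Simulate n_events inter-event times and return events per unit time."""
    return n_events / sum(rng.expovariate(rate) for _ in range(n_events))


rng = random.Random(3)
analyte_rate = 4.0  # assumed nominal rate of the analyte, per second
control_rate = 2.0  # assumed nominal rate of the internal control

analyte_estimates = []
ratio_estimates = []
for _ in range(20):  # 20 hypothetical pores
    # Pore-specific factor that affects both molecules equally (assumption).
    pore_factor = rng.lognormvariate(0.0, 0.5)
    r_analyte = estimate_rate(rng, pore_factor * analyte_rate, 500)
    r_control = estimate_rate(rng, pore_factor * control_rate, 500)
    analyte_estimates.append(r_analyte)
    ratio_estimates.append(r_analyte / r_control)

cv_raw = coefficient_of_variation(analyte_estimates)
cv_ratio = coefficient_of_variation(ratio_estimates)
mean_ratio = sum(ratio_estimates) / len(ratio_estimates)
print(cv_raw, cv_ratio, mean_ratio)
```

In this idealized model the pore factor cancels exactly in the ratio, so the remaining spread comes only from counting statistics. Real molecules need to share the same capture physics for this to hold, as noted above.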

When performing measurements that are intended to report capture rate, we strongly recommend using controlled counting where possible as opposed to raw capture rates. If you do this, and you still see significant pore-to-pore variability or variability over time in the ratio of capture rates, it points to analysis artefacts as the cause. And if you find yourself there, and you need a second opinion, we're always here to help.

4. References

[1] M. Charron, K. Briggs, S. King, M. Waugh, and V. Tabard-Cossa, “Precise DNA Concentration Measurements with Nanopores by Controlled Counting,” Anal. Chem., vol. 91, no. 19, pp. 12228–12237, 2019, doi: https://www.doi.org/10.1021/acs.analchem.9b01900.

Last edited: 2021-08-02