The basis of feature spaces in deep networks

In a new article on Distill, Olah et al. write up a very readable and useful summary of methods for looking into the black box of deep networks by feature visualization. I had already spent some time with this topic before (link), but this review pointed me to a couple of interesting aspects that I had not noticed before. In the following, I will write about one aspect of the article in more depth: whether a deep network encodes features on a neuronal basis or on a distributed, network-wide basis.

‘Feature visualization’ as discussed here means optimizing the input pattern (the image that is fed into the network) such that it maximizes the activity of a selected neuron somewhere in the network. The article discusses strategies to prevent this maximization process from generating non-naturalistic images (“regularization” techniques). As a side note, however, the authors also ask what happens when one optimizes the input image not for a single neuron’s activity, but for the joint activity of two or more neurons.

Joint optimization of the activity of two neurons. From Olah et al., Distill (2017) / CC BY 4.0.

Supported by some examples, and pointing to other examples collected earlier by Szegedy et al., they write:

Individual neurons are the basis directions of activation space, and it is not clear that these should be any more special than any other direction.

It is a somewhat natural thought that individual neurons are the basis of coding/activation space, and that any linear combination could serve for coding just as well as a single-neuron-based representation/activation. In linear algebra, it is obvious that a rotation of the basis that spans the coding space does not change anything about the processes and transformations that are taking place in this space.

However, this picture breaks down when switching from linear algebra to non-linear transformations, and deep networks are by construction highly non-linear. My intuition would be that the non-linear transformation of inputs (especially by rectifying units) sparsens activity patterns with increasing depth, thereby localizing the activations to fewer and fewer neurons, even without any sparseness constraint during weight learning. This does not necessarily mean that the preferred input images of random directions in activation space would be meaningless; but it would predict that the activation patterns of to-be-classified inputs do not point into random directions of activation space, but rather prefer the ‘physical’, neuronal basis.

I think that this can be tested more or less directly by analyzing the distributions of activation patterns across layers. If activation patterns were distributed, i.e., pointing into random directions, the distribution would be rather flat across the activation units of each layer. If, on the other hand, activation directions were aligned with the neuronal basis, the distribution would be rather skewed and sparse.

Probably this needs more thorough testing than I’m able to do by myself, but for starters I used the Inception network, trained on the ImageNet dataset, and I used this Python script on the Tensorflow Github page as a starting point. To probe the network’s activations, I automatically downloaded the first ~200 image hits on Google for 100×100 JPGs of “animal picture”, fed them into the network and observed the activation pattern statistics across layers. I uploaded a Jupyter Notebook with all the code and some sample pictures on Github.
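The notebook itself builds on the TensorFlow Inception script mentioned above; just to give an idea, a rough sketch of the same kind of measurement with the Keras InceptionV3 (not the code I actually used; the image filename is a placeholder) could look like this:

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras.applications.inception_v3 import InceptionV3, preprocess_input
from tensorflow.keras.preprocessing import image

# Keras InceptionV3 trained on ImageNet; probe all 'mixed' layers (the inception modules)
model = InceptionV3(weights='imagenet')
layer_names = [l.name for l in model.layers if l.name.startswith('mixed')]
probe = tf.keras.Model(inputs=model.input,
                       outputs=[model.get_layer(n).output for n in layer_names])

def fraction_nonzero(img_path):
    """Fraction of non-zero activations in each 'mixed' layer for one input image."""
    img = image.load_img(img_path, target_size=(299, 299))
    x = preprocess_input(np.expand_dims(image.img_to_array(img), axis=0))
    activations = probe.predict(x)
    return {name: float(np.mean(act > 0)) for name, act in zip(layer_names, activations)}

# 'animal_001.jpg' stands in for one of the downloaded animal pictures
print(fraction_nonzero('animal_001.jpg'))
```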

The result is that activation patterns are sparse and tend to become sparser with increasing layer depth. The distributions are dominated by zero activations, indicating a net input at or below zero. I have excluded the zeros from the histograms and instead give the percentage of non-zero activations as text in the respective histogram. The y-axis of each histogram is in log scale.

[Figure: histograms of non-zero activation levels for each layer; percentages give the fraction of non-zero activations, y-axis in log scale]

It is also interesting that the fraction of active units decreases with depth, but reaches a bottleneck at a certain level (here from ‘mixed_7’ until ‘mixed_9’ – the mixed layers are inception modules), and the activity becomes less sparse again when approaching the (small) output layer.

A simple analysis (correlation between activation patterns stemming from different input images) shows that de-correlation (red), that is, a decrease of the correlation between the activations evoked by different input images, is accompanied by a sparsening of the activation levels (blue):

[Figure: sparseness of activations (blue) and correlation between activation patterns from different images (red), across layers]

It is a bit strange that network layers 2, 4 and 6 generate sparser activation patterns than the respective previous layers (1, 3 and 5), accompanied by less decorrelated activity. It would be interesting to analyze the correlational structure in more depth. For example, I’d be curious to look at the activation patterns of inputs that lead to the same categorization in the output layer, and to see from which layer onwards they start to exhibit correlated activations.
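As a sketch of how such an analysis could start (my own illustration, not code from the notebook), assuming the activations of one layer have been stacked into a matrix of shape (n_images, n_units):

```python
import numpy as np

def mean_pairwise_correlation(acts):
    """acts: (n_images, n_units) activations of one layer.
    Mean off-diagonal correlation between the image-evoked activation patterns."""
    corr = np.corrcoef(acts)                         # image-by-image correlation matrix
    off_diag = corr[~np.eye(len(corr), dtype=bool)]
    return float(np.mean(off_diag))

def within_vs_across_correlation(acts, predicted_classes):
    """Compare correlations between inputs with the same vs. different predicted category."""
    labels = np.asarray(predicted_classes)
    same = labels[:, None] == labels[None, :]
    corr = np.corrcoef(acts)
    not_self = ~np.eye(len(corr), dtype=bool)
    return (float(np.mean(corr[same & not_self])),   # within-category correlation
            float(np.mean(corr[~same & not_self])))  # across-category correlation
```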

Of course there is a great body of literature in neuroscience, especially theoretical neuroscience, that discusses local, sparse and distributed codes and the advantages and disadvantages that come with them. For example, according to theoretical work by Kanerva, the sparseness of memory systems helps to prevent different memories from interfering too much with each other, although it is still unclear whether something similar is implemented in biological systems (there are many experimental papers with evidence both for and against it, often for the same brain area). If you would like to read more about sparse and dense codes, Scholarpedia is a good starting point.

Posted in machine learning, Network analysis, Neuronal activity

All-optical entirely passive laser scanning with MHz rates

Is it possible to let a laser beam scan over an angle without moving any mechanical parts to deflect the beam? It is. One strategy is to use a very short-pulsed laser beam: a short pulse width means a finite spectral width of the laser (→ Heisenberg). A dispersive element like a grating can then be used to automatically diffract the beam into smaller beamlets, which in turn can somehow be used to scan or de-scan an object. This technique is called dispersive Fourier transformation, although there seem to be different names for only slightly different methods. (I have no experience in this field and am not aware of the current state of the art, but I found this short introductory review useful as a primer.)

Recently, I stumbled over an article that describes a similar scanning technique, but without dispersing the beam spectrally: Multi-MHz laser-scanning single-cell fluorescence microscopy by spatiotemporally encoded virtual source array. At first I didn’t believe this could be possible, but apparently it is. In simple words, the authors of the study have designed a device that uses a single laser pulse as an input and outputs several laser pulses, separated in time and with different propagation directions – which is scanning.

Wu et al. from the University of Hong Kong describe their technique in more detail in an earlier paper in Light Science & Applications, and in even more detail in its supplementary information, which I found especially interesting. At first, it looked like a Fabry-Pérot interferometer to me, but it is actually completely different and is not even based on wave optics.

The idea is to shoot an optically converging pulsed beam (e.g. coming from an ultra-fast Ti:Sa laser) into an area that is bounded by two mirrors that are almost parallel, but slightly misaligned by an angle α < 1°. The authors call these two misaligned mirrors a ‘FACED device’. Due to the misalignment, the beam will be reflected multiple times, but comes back once it hits the mirror surface orthogonally (see e.g. the black light path below). Therefore, the continuous spectrum of incidence angles will be automatically translated into a discrete set of mini-pulses coming out of this device, because a given part of the beam gets reflected either 14 times or 15 times – obviously, there is no such thing as 14.5 reflections, at least in ray optics. This difference of one reflection makes the 15-reflection beam spend more time in the device, longer by Δt ≈ 2S/c, with S being the separation of the two mirrors and c the speed of light.

It took me some time to understand how this works and what these pulselets coming out of the FACED device look like, but I have to admit that I find it really cool. The schematic drawings in the supplementary information, especially figures S1 and S5, are very helpful for understanding what is going on.

Schematic drawing (adapted) from Wu et al., LS&A (2016) / CC BY 4.0.

As the authors note (without showing any experiments), this approach could be used for multi-photon imaging as well. It is probably true that there are some hidden difficulties and finite-size effects that make an implementation of this scanning technique challenging in practice, but let’s imagine for one minute what this could look like.

Ideally, we want laser pulses that are spaced at a temporal distance of roughly the fluorescence lifetime (ca. 3 ns) in order to prevent temporal crosstalk during detection. This would require the two FACED mirrors to be separated by roughly S ≈ 50 cm, according to the formula mentioned above. Next, we want to resolve, say, 250 points along this fast-scanning axis, which means that the FACED device would need to split the original pulse into 250 delayed pulselets. The input pulsed beam therefore would need to have a pulse repetition rate of ca. 1.3 MHz (which is then also the line scanning frequency), and each of those pulses would need enough power for a whole line scan.
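To put rough numbers on this back-of-the-envelope scenario (these are my own assumptions, not values from the paper):

```python
# Back-of-the-envelope numbers for this imagined two-photon FACED scanner
c = 3.0e8          # speed of light [m/s]
dt = 3e-9          # desired spacing between pulselets [s], ~ fluorescence lifetime
n_points = 250     # desired resolvable points along the fast axis

S = c * dt / 2                 # mirror separation, from dt ≈ 2*S/c
line_period = n_points * dt    # duration of one line scan
line_rate = 1.0 / line_period  # required repetition rate of the input laser

print(f"mirror separation S ≈ {S * 100:.0f} cm")   # ≈ 45 cm, i.e. roughly the 50 cm quoted above
print(f"line rate ≈ {line_rate / 1e6:.2f} MHz")    # ≈ 1.33 MHz
```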

How long would the FACED mirrors need to be? This is difficult to answer, since it depends on the divergence angle of the input pulsed beam that hits the FACED device, but I would guess that they would need to be a couple of meters long, given the spacing of the mirrors (50 cm) and the high number of pulselets desired (250). (In a more modest scenario, one could envision splitting each pulse of an 80 MHz laser into only 4 pulselets, thereby achieving multiplexing on top of regular scanning, similar to approaches described before.)

However, I would also ask myself whether the created beamlets are not too dispersed in time, which would preclude the two-photon effect. And I also wonder how all of this behaves when transitioning from geometric rays to wave optics; complex things might happen in this regime. Certainly a lot of work is required to transition this from an optical table to a biologist’s microscope, but I hope that somebody accepts this challenge and maybe, maybe replaces the kHz scanners of typical multi-photon microscopes with a device that achieves MHz scanning in a couple of years.

Posted in Calcium Imaging, Imaging, Microscopy

The most interesting machine learning AMAs on Reddit

It is very clear that Reddit is part of the rather wild zone of the internet. But especially for practical questions, Reddit can be very useful, and even more so for anything connected to the internet or computer technology, like machine learning.

In the machine learning subreddit, there is a series of very nice AMAs (Ask Me Anything) with several of the most prominent machine learning experts (with a bias towards deep learning). To me, as somebody who is not working directly in the field but is nevertheless curious about what is going on, it is interesting to read these experts talking about machine learning in a less formal environment, sometimes also ranting about misconceptions or misplaced research attention.

Here are my top picks, starting with the ones I found most interesting to read:

  • Yann LeCun, director of Facebook AI research, is not a fan of ‘cute math’.
  • Jürgen Schmidhuber, AI researcher in Munich and Lugano, finds it obvious that ‘art and science and music are driven by the same basic principle’ (which is ‘compression’).
  • Michael Jordan, machine learning researcher at Berkeley, takes an opportunity ‘to exhibit [his] personal incoherence’ and describes his interest in Natural Language Processing (NLP).
  • Geoffrey Hinton, machine learning researcher at Google and Toronto, thinks that the ‘pooling operation used in convolutional neural networks is a big mistake’.
  • Yoshua Bengio, researcher at Montreal, suggests that the ‘subject most relevant to machine learning’ is ‘understanding how learning proceeds in brains’.

And if you want more of that, you can go on with Andrew Ng and Adam Coates from Baidu AI, or Nando de Freitas, a scientist at Deepmind and Oxford. Or just discover the machine learning subreddit yourself.

Enjoy!

P.S. If you think that there might be similarly interesting AMAs with top neuroscientists: No, there aren’t.

Posted in Data analysis, machine learning

How deconvolution of calcium data degrades with noise

How does the noisiness of the recorded calcium data affect the performance of spike-inferring deconvolution algorithms? I cannot offer a rigorous treatment of this question, only some intuitive examples. The short answer: if a calcium transient is not visible at all in the calcium data, the deconvolution will miss the transient as well. It seems that once the signal-to-noise ratio drops below 0.5-0.7, the deconvolution quickly degrades.

To make this a bit more quantitative, I used an algorithm based on convolutional networks (developed by Stephan Gerhard and myself; you can find it on Github, and it’s described here) and a small part of the Allen Brain Observatory dataset.

I took the standard deviation of the raw calcium trace as the ‘signal’ (a reasonable approximation) and the standard deviation of the Gaussian noise that I added on top as the ‘noise’. Then I deconvolved both the noisified and the unchanged calcium traces and computed the correlation between the resulting spiking traces (calcium+noise vs. calcium alone). If this correlation (y-axis) is high, the performance of the algorithm is not much affected by the noise. The curve drops steeply at an SNR of 0.5-0.7.
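In code, the procedure was roughly the following (the `deconvolve` function here is a stand-in for whichever spike-inference algorithm is used, in my case the convolutional network mentioned above):

```python
import numpy as np

def degradation_curve(dff, deconvolve, snr_levels=(4, 2, 1, 0.7, 0.5, 0.3)):
    """dff: 1D raw calcium trace. deconvolve: function mapping a trace to spiking probabilities.
    Returns (SNR, correlation) pairs, comparing deconvolution of the noisified trace
    with deconvolution of the unchanged trace."""
    signal = np.std(dff)                  # 'signal' = standard deviation of the raw trace
    reference = deconvolve(dff)           # deconvolution of the unchanged trace
    results = []
    for snr in snr_levels:
        noise = np.random.randn(len(dff)) * (signal / snr)   # Gaussian noise scaled to the target SNR
        noisy_inference = deconvolve(dff + noise)
        r = np.corrcoef(reference, noisy_inference)[0, 1]
        results.append((snr, r))
    return results
```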

[Figure: correlation between deconvolved traces with and without added noise, plotted against SNR]

To get some intuition, here are a few examples: on the left, the calcium trace plus Gaussian noise; on the right, the deconvolved spiking probabilities (the numbers to the left indicate the SNR and the correlation to the noise-free deconvolution, respectively):

[Figure: first example neuron – calcium trace plus noise (left) and deconvolved spiking probabilities (right)]

The next example was perturbed with the same absolute amount of noise, but due to the larger signal, the spike inference remained largely unaffected for all but the highest noise levels.

[Figure: second example neuron with larger signal – calcium trace plus noise (left) and deconvolved spiking probabilities (right)]

The obvious thing to note is the following: when transients are no longer visible in the calcium trace, they disappear in the deconvolved traces as well. I’d also like to note that both calcium time series from the examples above are from the same mouse, the same recording, and even the same plane, but their SNRs are very different. Therefore, lumping together neurons from the same recording but of different recording quality mixes different levels of detected detail. An alternative would be to set an SNR threshold for the neurons to be included, depending on the precision required by the respective analysis.

Posted in Calcium Imaging, Data analysis, electrophysiology, Imaging, machine learning, Neuronal activity

A convolutional network to deconvolve calcium traces, living in an embedding space of statistical properties

As mentioned before (here and here), the spikefinder competition was set up earlier this year to compare algorithms that infer spiking probabilities from calcium imaging data. Together with Stephan Gerhard, a PostDoc in our lab, I submitted an algorithm based on convolutional networks. Looking back at the few days at the end of April when we wrote this code, it was a lot of fun to work together with Stephan, who brought in his more advanced knowledge of how to optimize and refactor Python code and how to explore hyper-parameter spaces very efficiently. In addition, our algorithm performed quite well and ranked among the top submissions. Other high-scoring algorithms were submitted by Ben Bolte, Nikolay Chenkov/Thomas McColgan, Thomas Deneux, Johannes Friedrich, Tim Machado, Patrick Mineault, Marius Pachitariu, Dario Ringach, Artur Speiser and their labs.

The detailed results of the competition will be covered and discussed very soon in a review paper (now on bioRxiv), and I do not want to scoop any of this. The algorithm, which is described in the paper in more detail, goes a bit beyond a simple convolutional network. In simple words, it creates a space of models and then chooses a location in this space for the task at hand, based on statistical properties of the calcium imaging data to be analyzed. The idea behind this step is to allow the model to generalize to datasets that it has not seen before.
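To be clear, the following is not our actual implementation, just a toy illustration of the general idea: describe a dataset by a handful of statistics and pick the model whose location in this statistics space is closest to the new data.

```python
import numpy as np

def trace_statistics(dff, rate_hz):
    """A handful of simple descriptors of a calcium trace (toy feature vector)."""
    dff = np.asarray(dff, dtype=float)
    noise = np.median(np.abs(np.diff(dff))) / np.sqrt(rate_hz)  # crude noise level estimate
    autocorr = np.corrcoef(dff[:-1], dff[1:])[0, 1]             # lag-1 autocorrelation
    z = (dff - dff.mean()) / dff.std()
    skew = float(np.mean(z ** 3))                               # transient-rich traces are skewed
    return np.array([noise, autocorr, skew])

def choose_model(dff, rate_hz, model_embedding):
    """model_embedding: dict mapping a tuple of statistics to a trained model.
    Returns the model whose location in this statistics space is closest to the new data."""
    stats = trace_statistics(dff, rate_hz)
    nearest = min(model_embedding, key=lambda loc: np.linalg.norm(np.array(loc) - stats))
    return model_embedding[nearest]
```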

The algorithm itself, which we wrote in Python 3.6/Keras, should be rather straightforward to test with the Jupyter notebook that is provided, or with the plain Python file. We do not intend to publish the algorithm in a dedicated paper, since everything will be described in the review paper and the algorithm is already published on Github. It should be pretty self-explanatory and easy to set up (if not, let me know!).

So if you have some calcium imaging data that you would like to deconvolve, and if you want to get some hands-on experience with small-scale deep learning methods, this is your best chance …

I was also curious what some random calcium imaging traces of mine would look like after deconvolution with this network. Sure, there is no spiking ground truth for these recordings, but one can still look at the results and immediately see whether it is a complete mess or something that looks more or less realistic. Here is one example from a very nice calcium recording that I did in 2016 in the dorsal telencephalon of an adult zebrafish using this 2P microscope. The spiking probabilities (blue) seem to be realistic and very reliable, but the recording quality was also extremely good.

[Figure: calcium trace and deconvolved spiking probabilities (blue) from a recording in the dorsal telencephalon of an adult zebrafish]

I was also curious about the performance of the algorithm with somebody else’s data as input. Probably the most standardized calcium imaging dataset for mice can be retrieved from the Allen Brain Observatory. Fluorescence traces can be accessed via the Allen SDK (yet another Jupyter notebook to start with). I deconvolved 20 traces, each 60 min of recording at a 30 Hz frame rate, which took ca. 20 min in total (on a normal CPU, no GPU!). Let me show you some examples of calcium traces (blue) and the corresponding deconvolved spiking probability estimates (orange) for a couple of neurons; the x-axis is time in seconds, the y-axis is scaled arbitrarily:
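(As a side note, pulling such ΔF/F traces with the AllenSDK takes only a few lines; a minimal sketch, with the manifest path as a placeholder and an arbitrary choice of experiment:)

```python
# Sketch of pulling dF/F traces from the Allen Brain Observatory with the AllenSDK;
# the manifest path is a placeholder, and the first returned experiment is picked arbitrarily.
from allensdk.core.brain_observatory_cache import BrainObservatoryCache

boc = BrainObservatoryCache(manifest_file='boc/manifest.json')
experiments = boc.get_ophys_experiments()                       # list of available experiments
data_set = boc.get_ophys_experiment_data(experiments[0]['id'])  # downloads/loads the NWB file
timestamps, dff = data_set.get_dff_traces()                     # dff: (n_neurons, n_timepoints), ~30 Hz

# each row of dff can then be fed into the deconvolution network
```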

Overall, the deconvolved data clearly seem to be less noisy than most of the predictions from the Spikefinder competition, probably due to the better SNR of the calcium signal. False positives from the baseline are not very frequent. There are still some small, most likely unwanted bumps, depending on the noisiness of the respective recorded neuron. For very strong calcium responses, the network sometimes tends to overdo the deconvolution, leading to a slight negative overshoot of the spiking probability, or, put differently, ringing of the deconvolution filter. This could be fixed by forcing the network to return only positive values, but the results also look pretty fine without this fix.

Of course, if you want to try out your own calcium imaging data with this algorithm, I’d be happy to see the results! And if you are absolutely not into Python yet and don’t want to install anything before seeing some first results, you can also send me some of your calcium imaging traces for a quick test run.

Posted in Calcium Imaging, Data analysis, electrophysiology, Imaging, machine learning, Neuronal activity

A short report from a Cold Spring Harbor lab course

One of the best things about being a PhD student is that one is supposed to learn new things. As part of this mission, I attended a two-week laboratory course at Cold Spring Harbor Laboratory on ‘Advanced Techniques in Molecular Neuroscience’ (ATMN), a field of neuroscience to which I had previously been exposed only passively.

View of the harbor

Overview of neuroscience methods courses

Before writing about the course in more detail, here’s a brief overview of alternatives for high-quality, hands-on methods courses that could be relevant for neuroscience PhD students and PostDocs.

  • Cold Spring Harbor, close to New York, hosts a variety of different courses, most of them very dense and practical and typically 2-3 weeks long.
  • The Marine Biological Laboratory in Woods Hole, Massachusetts, also offers a variety of specialized courses. In addition, there are some more general and longer (up to two months!) ‘discovery courses’ that might be ideal, e.g., for computer scientists, physicists or biochemists transitioning to neuroscience without previous exposure.
  • More recently (starting in 2017), FENS has set up a program named CAJAL that consists of a couple of practical courses. The courses take place at the Champalimaud Centre in Portugal or in Bordeaux, France. I do not know how good these courses are, but the schedules look very promising and very similar to the two alternatives above.
  • A very interactive and hands-on course on constructing hardware for experimental neurophysiology, especially for imaging, is TENSS in Transylvania, Romania.
  • There are other courses that seem to be interesting and hands-on, but I do not have any first- or reliable second-hand experience: a neuroscience course in Paris, France; and a course on imaging at the Max Planck Institute in Florida.

If you have any comments or if I forgot something, let me know! – I did not include computational neuroscience courses on purpose, because there are many – probably they are easier to organize since they do not require reagents and hardware apart from computers. I guess that any computational neuroscience course or summer school would be announced via the Comp Neuro mailing list.

But back to a short review of Advanced Techniques for Molecular Neuroscience:

ATMN review: The application

Nothing difficult here. I do not know anything about the acceptance rate, but the organizers tend to put an emphasis on diversity (different backgrounds, different countries of origin). Recommendations from two PIs are required. Together with the course application, an application for a scholarship that covers part of the fees can be filed (and it does not require a lot of effort to do so). There are dedicated scholarships for people coming from developing countries.

ATMN review: The location

Cold Spring Harbor is beautifully located on Long Island, an hour’s drive from New York City. Coming from central Europe, I especially enjoyed the abundance of wildlife and the diversity of species. On the campus, which is basically a village sitting in the middle of nowhere, squirrels and chipmunks are omnipresent. Horseshoe crabs sit on the sand shores, and at night, fireflies blink everywhere once you leave the streets. I was housed together with another participant of my course in a small but nice room in a wooden cabin (see picture below), with showers shared among six persons. You can go to the campus gym, go running, swimming or kayaking, use the tennis court or the beach volleyball fields – if there is time left over. The food that is provided is good, also for people who do not enjoy typical American fare.

Sandburg

ATMN review: The labwork

The main focus of the course is on bench work. The day starts at 9 a.m. with a lecture or an introduction to the next experiment. Then the experiments start, interrupted only by lunch, dinner and possibly further short lectures/instructions, until 7, 8, 9 p.m. or even later. There is basically no free time, except before 9 a.m. and during some afternoons.

In total, there were 16 students (half of them female; half of them non-US; roughly half of them PostDocs or beyond). Eight pairs were formed, each working together at a single bench for the whole duration of the course. The equipment is great: high-end confocal and brightfield microscopes, PCR and qPCR machines, tape stations, nanodrops, centrifuges, dissection stations with large demonstration screens, etc.

As is typical for molecular biology, there is a lot of waiting, pipetting, washing, shaking and centrifuging involved, but the organizers interleaved different modules in order to minimize the idle time. Sometimes it was challenging to follow several modules running in parallel, because the modules used an overlapping set of techniques.

After each module, the results (these can be images, gel pictures or qPCR results) are presented by the teams to the whole group (plus instructors) using the whiteboard or PowerPoint. All of this is very loosely organized and improvised, since the duration of each experimental step cannot be predicted easily.

The instructors (who can be PIs, PostDocs, PhD students or TAs of the lab that organizes the respective module) were not from Cold Spring Harbor research groups, but from universities all over the US. All of them were very kind, helpful and extremely patient, sometimes explaining a procedure five times in a row to make sure that everybody understood it well. They could also always answer technical questions like ‘What does the addition of enzyme X do, and why do we increase the temperature to 43°C?’

ATMN review: The content

The 11 modules of the course each take 2-5 days:

  • CRISPR
  • BACs
  • Brainbow and multispectral imaging
  • In utero-electroporation
  • Translating ribosome affinity purification (TRAP)
  • Cross-linking immunoprecipitation (CLIP)
  • Slice and primary cortical cultures
  • FISH and IHC
  • Lentivirus and stereotactic injection
  • Clearing techniques
  • Single-cell electroporation

For example, in the CRISPR module, instructed by Le Cong from the Broad Institute, a couple of lectures covered the principles of gRNA and vector design. The lab work started with the generation of a small DNA fragment that was then cloned into a CRISPR backbone vector and transformed into E. coli cultures. Then, a set of different CRISPR activation systems and controls were transfected into mammalian cells. The efficiency of the gene insertions was checked using the T7E1 surveyor assay, a reporter gene (imaging) and qPCR of mRNA from the transfected mammalian cells. Overall, this took 4 days, running in parallel with other modules. For me, all of this was new, and I was glad to learn all these steps by doing them.

To give another example, the clearing module was instructed by Jennifer Treweek and Ryan Cho from the lab of Viviana Gradinaru. Whereas the CRISPR module had been a protocol that had to be followed precisely by each of the 8 groups, here we could go ahead and choose between a variety of protocols. My lab partner and I tried out PACT and ePACT (described here), both passive clearing techniques, with (ePACT) and without (PACT) expansion of the tissue, on Thy1-GFP mouse brain slices. We used slices instead of whole brains due to the limited time available during the course. Other groups additionally combined the clearing methods with in situ labeling, using a so-called hybridization chain reaction for RNA labeling.

ATMN review: The participants

The course was attended by people from a wide variety of backgrounds. Only two of the 16 (including me) were mostly interested in systems or circuit neuroscience. Some were more into epigenetics, genomics or other fields that rely more strongly on molecular rather than physiological methods. I guess that the networking component might have been more important for other participants who are going to work precisely in the field of some of their ATMN instructors. But if, for example, I were to set up a clearing protocol at my home institute, I would not hesitate a single second to write back to the course instructors in case I encountered technical problems.

ATMN review: Summary

My motivation to take this course was, coming from a physics background, to learn some basic (and advanced) molecular biology, and the course clearly exceeded my expectations in terms of what I could get out of it. I can therefore only recommend the course (or other courses at Cold Spring Harbor) to anyone! Two weeks is not a lot of time, and the amount of new knowledge that I (and others) took away from this course is huge.

I can easily recommend this course (or similar courses) to any neuroscience PhD student. Often, it appears as if there is no time at hand to go somewhere and learn new things unrelated to one’s PhD project. But if I’m honest, compared to the many days, weeks or months that I have spent with failed experiments or following up on ideas that turned out to be wrong, two or three weeks of time is not such a big deal!

Posted in Neuronal activity

Whole-cell patch clamp, part 3: Limitations of quantitative whole-cell voltage clamp

Before I first dived into experimental neuroscience, I imagined whole-cell voltage clamp recordings to be the holy grail of precision. Directly listening to the currents that flow inside a living neuron! How beautiful and precise, compared to poor-resolution techniques like fMRI or even calcium imaging! I somehow thought that activation curves and kinetics, as recorded by Hodgkin and Huxley, could be easily measured using voltage clamp, without introducing any errors.

Coming partially from a modeling background, I was especially attracted by the prospect of measuring both inhibitory and excitatory inputs (IPSCs and EPSCs) that would allow me to afterwards combine them in a conductance-based model of a single neuron (or even a network of such neurons). Here I will write about the reasons why I changed my mind about the usefulness of such modeling efforts, with a focus on whole-cell recordings of the small (5-8 μm diameter) zebrafish neurons that I have been recording from during the last year.

Let’s have a look at the typical components (=variables) of the conductance-based neuron model.

C_m dV(t)/dt = g_0 (V_0 - V(t)) + g_I(t) (V_I - V(t)) + g_E(t) (V_E - V(t))

Measured quantities: membrane capacitance C_m, resting conductance g_0 (the inverse of the resting membrane resistance R_m), and the reversal potentials for inhibitory and excitatory conductances, V_I and V_E. From the currents measured with the voltage clamped to V_I and V_E, respectively, the time-varying conductances g_E(t) and g_I(t) can be inferred. Altogether, this yields the trajectory of V(t), the time course of the membrane potential. The spiking threshold V_thresh then indicates, from the simulated membrane potential, when action potentials occur.
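For concreteness, here is a minimal sketch of how such a conductance-based model could be integrated numerically; all parameter values are rough placeholders for a small neuron, not measured values:

```python
import numpy as np

def simulate_membrane(g_E, g_I, dt=1e-4, C_m=3e-12, g_0=0.4e-9,
                      V_0=-70e-3, V_E=0.0, V_I=-70e-3, V_thresh=-45e-3):
    """Forward-Euler integration of
    C_m dV/dt = g_0 (V_0 - V) + g_I(t) (V_I - V) + g_E(t) (V_E - V).
    g_E, g_I: arrays of excitatory/inhibitory conductances [S], one value per time step.
    All parameter values are rough placeholders for a small neuron, not measured values."""
    V = np.empty(len(g_E))
    V[0] = V_0
    spike_times = []
    for t in range(1, len(g_E)):
        dVdt = (g_0 * (V_0 - V[t - 1])
                + g_I[t - 1] * (V_I - V[t - 1])
                + g_E[t - 1] * (V_E - V[t - 1])) / C_m
        V[t] = V[t - 1] + dt * dVdt
        if V[t] >= V_thresh and V[t - 1] < V_thresh:   # crude threshold read-out, no spike reset
            spike_times.append(t * dt)
    return V, spike_times
```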

[Figure: schematic of the conductance-based model]

Unfortunately, a simple order-of-magnitude estimate of the parameters is not good enough to make an informative model. Therefore, in the following I will try to understand how precise the measurements of these variables are, and why.

First of all, it took me a long time to understand that there is a big difference between the famous voltage-clamp experiments performed by Hodgkin and Huxley and those done in the whole-cell configuration. H&H inserted the wire of the recording electrode into the giant axon (picture to the left, taken from Hodgkin, Huxley and Katz, 1952). In this configuration, there is basically no resistance between electrode and cytoplasm, because the electrode is inside the cell.

[Figure: recording configuration of Hodgkin and Huxley, with the electrode wire inserted into the giant axon; from Hodgkin, Huxley and Katz, 1952]

In whole-cell recordings, however, the electrode wire is somewhere inside of the glass pipette (picture on the right side). The glass pipette is connected to the cell at a specific location via a tiny opening that allows voltages to more or less equilibrate between cytoplasm and pipette/electrode. This is the first problem:

1. Series resistance. Series resistance R_s is the electrical resistance between cell and pipette, caused by the small pipette neck and additional dirt that clogs the opening (like cell organelles…). The best and easiest-to-understand summary of series resistance effects that I found has been written by Bill Connelly (thanks to Labrigger for highlighting this blog).

Series resistance makes voltage clamp recordings problematic for two reasons: first, the signals are low-pass filtered with a time constant given by R_s*C_m. Second, the resistance prevents the voltage in the cell from following the voltage applied at the micropipette. Depending on the ratio R_s/R_m, the clamped voltage is more or less systematically wrong.
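To get a feeling for the numbers, here is a quick calculation with example values (the 20 MΩ series resistance and the cell parameters are assumptions, not measurements):

```python
# Quick numbers for the two series-resistance effects (example values, not measurements)
R_s = 20e6    # series resistance, 20 MOhm
R_m = 2e9     # membrane resistance, 2 GOhm
C_m = 3e-12   # membrane capacitance, 3 pF

tau_filter = R_s * C_m             # low-pass filter time constant
voltage_error = R_s / (R_s + R_m)  # fraction of the command voltage lost across R_s (steady state, passive cell)

print(f"filtering time constant ~ {tau_filter * 1e6:.0f} us")        # 60 us here
print(f"steady-state voltage error ~ {voltage_error * 100:.1f} %")   # ~1 % here, grows with R_s/R_m
```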

2. Space clamp. There is only one location that is properly voltage-clamped in the whole-cell mode, and this is the pipette itself. The cell body voltage is different from the pipette potential because of the series resistance (Ohm’s law). The voltage clamp in the dendrites is impaired even further by the electrical resistance between the dendrites and the soma. Therefore, voltage clamp at a membrane potential close to the ‘resting potential’ (-70 mV … -50 mV) is more or less reliable, whereas voltage clamp for recording inhibitory currents (0 mV … +20 mV) is less reliable for the dendritic parts, especially if the dendrites are thin. To make things worse, the resistance between soma and dendrites is not necessarily constant over time. Imagine a case where inhibitory inputs open channels at the proximal dendrite. In this case, the electrical connection between the soma and the distal end of the dendrite will be impaired, and the voltage clamp will be worsened as well. Notably, this worsening of the voltage clamp would be tightly correlated with the strength of the input currents.

In neurons that are large enough to record with patch pipettes both from soma and dendrites, one can test the space clamp error experimentally. And it is not small.

If there are active conductances involved, the complexity of the situation increases even further. In a 1992 paper on series resistance and space clamp in whole-cell recordings, Armstrong and Gilly conclude with the following matter-of-fact paragraph:

“We would like to end with a message of hope regarding the significance of current measurements in extended structures, but we cannot. Interpretation of voltage clamp data where adequate ‘space clamping’ is impossible is extremely difficult unless the membrane is passive and uniform in its properties and the geometry is simple. (…)

3. Seal resistance. The neuron’s membrane potential deviates from the pipette’s potential by a factor that depends on the series resistance. Both can be measured and used to calculate the true membrane potential. But there is a second confounding resistance, the seal resistance. Usually, it is neglected, because relative measurements are fine as long as it remains constant over the duration of the experiment. But if one needs absolute rather than only relative measurements of membrane resistance, firing threshold etc., the seal resistance needs to be considered, especially in very small cells. In a good patch, the seal resistance is around 10 GΩ or more. But sometimes it is a little bit less, maybe 5-8 GΩ. For small neurons, the membrane resistance can be of the same order of magnitude, for example 2-3 GΩ (and yes, I’m dealing with that kind of neuron myself). The seal resistance then acts as a parallel path to the membrane resistance, leading to voltage errors and wrong measurements of the membrane resistance (see also this study). With R_m = 2 GΩ and R_seal = 8 GΩ, this would lead to a false measurement of R_m = 1.6 GΩ. This error is not random, but systematic. Again, it could be corrected for by measuring the seal resistance before break-in and calculating the true membrane resistance afterwards. But it is – in my opinion – unlikely that the seal resistance remains unchanged during such a brutal act as a break-in. The local membrane tension around the seal becomes inhomogeneous after the break-in, and it seems likely to me that this process creates small leaks that had been sealed by dirt in the attached configuration. This is an uncertainty which cannot be quantified (to the best of my knowledge) and which makes quantitative measurements of R_m and the true membrane potential in small cells very difficult.
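The numbers in this example are simply the parallel combination of the two resistances:

```python
# Apparent membrane resistance = parallel combination of R_m and R_seal
# (reproducing the example numbers from the text)
R_m = 2e9      # true membrane resistance, 2 GOhm
R_seal = 8e9   # seal resistance, 8 GOhm

R_apparent = (R_m * R_seal) / (R_m + R_seal)
print(f"apparent R_m ~ {R_apparent / 1e9:.1f} GOhm")   # 1.6 GOhm, a systematic underestimate
```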

4. The true membrane potential. The recorded membrane potential (say, the ‘resting membrane potential’ or the spiking threshold) is not necessarily the true membrane potential. First, series resistance introduces a systematic error – this is ok, it can be understood and accounted for. It is more difficult to correct for the errors induced by seal resistance, as mentioned. One way to avoid leaks due to break-in is perforated-patch recording, which is however rather difficult, and probably impossible to combine with the small pipette tips that are required for the small neurons I’m interested in.

In addition, I have always asked myself how well my internal solution matches the cytoplasm of my neurons in terms of the relevant properties. Of course there are differences. But do these differences affect the membrane potential? I don’t see how this could be found out.

5. Correlation of inhibitory and excitatory currents. During voltage clamp, one can measure either inhibitory or excitatory currents, but not both at the same time. If everything were 100% reproducible, repeating the measurements in separate trials would be totally ok, but this is not the case. Instead, fluctuations of inhibitory and excitatory currents are typically correlated, although it is unclear to what degree. One way to navigate around this problem is to ignore it and simply work with averages over trials (as I did in the simulation at the beginning of my post). Another solution is to use highly reproducible and easy-to-time stimuli (electrical or optogenetic stimuli, acoustic stimuli) that lead to highly reproducible event-evoked currents. However, this also cannot help in understanding the trial-to-trial co-variability of excitation and inhibition and similar aspects that take place on fast timescales. There are studies that patch two neighbouring neurons at the same time, measuring excitatory currents from one and inhibitory currents from the other, but this is not exactly what one wants to have.

There is actually a lack of techniques that would allow one to observe inhibitory and excitatory currents in the same neuron during the same trial, and this gap creates a lot of uncertainty about how neurons and neuronal networks operate in reality.

All in all, this does not sound like good news for whole cell voltage clamp experiments. One problem I’m particularly frustrated about is that for many of these systematic errors there is no ground truth available, and it is totally unclear how large the errors actually are or how this could be found out.

However, for many problems, whole-cell patch clamp is still the only available technique, despite all its traps and uncertainties. I’d like to end with the final note of the previously cited paper by Armstrong and Gilly:

“[A]nd the burden of proof that results are interpretable must be on the investigator.”

At this point, a big thank you goes to Katharina Behr, from whom I learned quite a bit of what I wrote down in this blog post. And of course any suggestions on what I am missing here are welcome!

Posted in Data analysis, electrophysiology, Neuronal activity, zebrafish