Some timing tests of the Gatan K3

Introduction

So while we were testing our new Krios Beta, we found ourselves a little perplexed about what the expected throughput would be with the K3 within EPU.

Krios Beta is a G3i with AFIS as well as the Fringe Free Imaging (FFI) upgrade, both of which should improve throughput (for more info refer to this TFS M&M 2019 poster: here), plus the K3 detector with its faster counting rate and frame-to-disk speed.

However, there are many options that can be expected to affect the final throughput. Some of these are straightforward to pin down with a little work, such as determining how many images to take per stage movement for a particular magnification and grid type. When it came to saving movies in TIFF versus MRC, or gain-normalized versus unprocessed, things did not initially seem so straightforward.

So I wasted a weekend of my time trying to get a better grasp on how these options affect acquisition speed. You can access the timing data, as well as a little applet I made to visualize it more manageably, here: https://github.com/DustinMorado/K3_Timing_Viz

Test setup

Unfortunately, EPU really is a giant black box and I had no way to test timings without running actual sessions, which does not isolate the variables well. If anyone has any idea how to debug acquisitions from EPU I would be very interested to hear it :sweat_smile:.

Therefore I did all of these timings with SerialEM (which is installed on our K3 PC), with debug settings that produce timing output in both SerialEM and DigitalMicrograph (GMS). I expect the GMS timings to be consistent between SerialEM and EPU. In contrast, since SerialEM retrieves the final data over a local socket (at around 500-750 MB/s) while EPU receives the same data over a 1 Gbps connection (at around 100 MB/s), acquisition times in EPU should be slower than the numbers here.

Test 1: Fixed exposure / Vary number of frames

The first test was to see how the following affect the acquisition time for a fixed exposure:

  • Filetype
    • MRC
    • TIFF (with LZW compression)
  • Binning
    • Super Resolution
    • Counted (Hardware Bin 2)
  • Processing
    • Gain Normalized
    • Unprocessed (more accurately: still largely processed, just not gain-normalized)
  • Number of frames written to disk

I decided on an exposure time of 1.6 seconds, which at 75 fps to the K3 PC corresponds to a maximum of 120 frames. I chose 120 for its large number of divisors (60, 40, 30, 24, 20, 15, 12, 10, 8, 6, 5, 4, 3, 2). I tested the 8 combinations of the categories above and decreased the number of output frames until the acquisition time no longer decreased. In each test I took 30 images, which was a bit overkill :man_facepalming:t5:.
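
To make the available frame counts concrete, here is a quick sketch (Python, purely illustrative) of why the divisors matter: the frames written to disk are sums of equal-sized groups of the acquired hardware frames, so the saved count must divide the acquired count exactly.

```python
# Valid saved-frame counts for a 1.6 s exposure streamed at 75 fps.
exposure_s, fps = 1.6, 75
acquired = round(exposure_s * fps)  # 120 hardware frames
valid = [n for n in range(1, acquired + 1) if acquired % n == 0]
print(valid)  # [1, 2, 3, 4, 5, 6, 8, 10, 12, 15, 20, 24, 30, 40, 60, 120]
```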

Test 1: Results




So I could see that for each choice of filetype, binning, and processing, the acquisition time scaled roughly linearly with the number of frames written to disk, at around 0.075 - 0.2 seconds per frame. The first frame is always returned a little faster and the last frame a little slower.
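
To pull that per-frame cost out of timings like these, a simple linear fit is enough. A minimal sketch below, with made-up numbers standing in for the real measurements in the repository linked above:

```python
import numpy as np

# Hypothetical (frames saved, acquisition seconds) pairs -- substitute
# the actual timings from the K3_Timing_Viz repository.
n_frames = np.array([5, 10, 20, 40, 60, 120])
acq_s = np.array([2.1, 2.6, 3.7, 5.9, 8.1, 14.6])

# Slope = seconds per frame written; intercept = fixed per-shot overhead.
per_frame_s, overhead_s = np.polyfit(n_frames, acq_s, 1)
print(f"~{per_frame_s:.3f} s/frame on top of ~{overhead_s:.1f} s overhead")
```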

For MRC, the choice between gain-normalized and unprocessed made hardly any difference at small numbers of frames (~20), especially for counted data, but at large numbers of frames the processing time eventually creates a noticeable difference. For TIFF, saving unprocessed gives a much larger decrease in acquisition time. This makes sense, since compression is harder on the more value-varied normalized frames.
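
That compressibility gap is easy to demonstrate. Here is a rough illustration using zlib's DEFLATE in place of TIFF's LZW (a different codec, but the effect is the same): integer electron counts compress far better than the same counts multiplied by a float gain map.

```python
import zlib
import numpy as np

rng = np.random.default_rng(0)

# Simulated unprocessed counted frame: small integer electron counts.
counts = rng.poisson(lam=0.8, size=(2048, 2048)).astype(np.uint8)

# Simulated gain-normalized frame: the same counts times a per-pixel
# float gain map, giving many more distinct byte patterns to encode.
gain = rng.normal(1.0, 0.05, size=counts.shape).astype(np.float32)
normalized = counts.astype(np.float32) * gain

for name, arr in (("unprocessed", counts), ("gain-normalized", normalized)):
    ratio = arr.nbytes / len(zlib.compress(arr.tobytes(), level=1))
    print(f"{name}: {ratio:.1f}x compression")
```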

Interestingly, it was faster to save TIFF than MRC at super-resolution, but faster to save MRC than TIFF at the physical pixel size. So the time spent compressing is less than the I/O cost of writing 11Kx8K frames, yet more than the cost of writing 6Kx4K frames.

Test 2: Fixed framerate / Vary exposure

The second test was to see how the same factors from the first test affect the acquisition time when the frame rate to disk was fixed at the maximum of 75 fps and the exposure time was varied instead.

I decided on testing from a single frame (0.0133 s exposure) up to 40 frames (0.533 s exposure), as 40 frames is a commonly used movie size, and I suspected that the behaviour at very small numbers of frames might be quite different. In this case I took 10 exposures of each condition. Here I was able to use a SerialEM macro, although to keep the logs organized it was still very much a manual process :sob:.

Test 2: Results


The results are not so exciting, and similar to the first test: the rate is largely linear in frames per second, independent of the number of frames saved. Again the first frame is returned faster than the others and the last frame is returned slower.

Returning to EPU

When I tried to apply these results to EPU, at first they did not match up at all. Counted mode, which in SerialEM was considerably faster than super-resolution, was hardly any faster in EPU :thinking:

It turns out that the nomenclature in EPU for binning is a bit misleading (I would call it bass-ackwards):

I had assumed that ‘Counted Bin 1’ in EPU would correspond to hardware bin 2 performed by GMS (as it was in the previous GMS Camera palette), while ‘Counted Super Resolution Bin 2’ and ‘Counted Bin 2’ would correspond to binning done by EPU itself. I don’t understand why EPU does the binning itself at all, given that both antialias-aware binning and Fourier cropping are available from GMS. Yet even so, this configuration is not what I had expected.

Selecting ‘Counted Super Resolution Bin 2’ (corresponding to hardware bin 2) means that a 6Kx4K movie is transferred over the 1 Gbps connection to EPU instead of an 11Kx8K movie, and this has been the difference between ~250 movies/hr (‘Counted Bin 1’) and 500+ movies/hr (‘Counted Super Resolution Bin 2’) within EPU.
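
The size of that transfer is easy to put a number on (a rough sketch assuming the nominal K3 frame geometry; actual on-the-wire sizes depend on compression):

```python
# Pixels per frame for the two EPU options above (nominal K3 geometry).
superres_px = 11520 * 8184  # what 'Counted Bin 1' sends over the link
hw_bin2_px = 5760 * 4092    # what 'Counted Super Resolution Bin 2' sends
print(superres_px / hw_bin2_px)  # 4.0x the data per movie
```

That the observed throughput difference is closer to 2x than 4x suggests the transfer overlaps with the other per-exposure overheads rather than being the sole bottleneck.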

DQE Effects of hardware binning

There is an obvious cost to the increase in speed gained by simply binning the super-resolution frames, though: a considerable decrease in DQE, as shown in the slide below (from the lecture here):

Methods such as Fourier cropping or real-space downsampling with an antialiasing filter reduce this DQE loss:

But we see that this comes with a considerable time penalty.
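
For reference, Fourier cropping itself is simple to sketch. Below is a minimal numpy illustration of the idea (not what GMS or EPU actually runs): keeping only the central half of the spectrum discards the detail beyond the new Nyquist cleanly, instead of aliasing it back into the image the way a plain 2x2 sum does.

```python
import numpy as np

def fourier_crop_2x(frame):
    """Downsample a 2D frame by 2x via Fourier cropping: keep only the
    central half of the (shifted) spectrum, so frequencies beyond the
    new Nyquist are discarded rather than aliased into the output."""
    ny, nx = frame.shape
    assert ny % 4 == 0 and nx % 4 == 0, "dimensions must be divisible by 4"
    spec = np.fft.fftshift(np.fft.fft2(frame))
    cy, cx = ny // 2, nx // 2
    cropped = spec[cy - ny // 4:cy + ny // 4, cx - nx // 4:cx + nx // 4]
    # The output mean is 4x the input mean, matching a 2x2 pixel sum.
    return np.fft.ifft2(np.fft.ifftshift(cropped)).real
```

Antialias-aware binning reaches the same goal with a real-space filter applied before decimation; either way, the extra processing is where the time penalty comes from.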

Discussion

In the end, data collection parameters need to be thoughtfully chosen by the end user, taking into account their sample, microscope parameters, and reasonable resolution targets. I can understand the decisions made in EPU to favor DQE over speed, yet it is also important to remember that many, many structures were solved using the Counting mode on the Gatan K2 camera, equivalent to the ‘Counted Super Resolution Bin 2’ option for the K3. For heterogeneous samples, higher throughput and thus more particles should be of chief concern.

I hope this information has been helpful, and that others may find the raw data useful. Then at least it was not an entire waste of a weekend :upside_down_face:

Hi Dustin,

I just took time to read this entirely, thank you for writing it up.

If I read it correctly, the parameter that has the largest influence on collection throughput (number of movies recorded per hour) is whether we use bin=1 or bin=2, correct? Especially if we consider recording 40 frames (the time difference starts making a big difference past 40 frames).

What is unclear to me is whether the benefit from (roughly) twice as many particles is worth the degraded DQE at all spatial frequencies. As I understand it, collecting at higher magnification would compensate for the degradation of DQE caused by using bin=2, but it would also somewhat cancel the increase in throughput (or should I call this “output”?), since higher mag means fewer particles per field of view.

What would be even more helpful is if we knew a few typical combinations of settings and the compromises they involve. Something like “at the usual 40 frames per movie, in compressed TIFF with no gain correction, using bin=X will give you ~Y movies per hour”, for two or three typical sets of (X, Y) values that you guys normally use. This way, we could just pick one knowing that it’s been working well in your hands (or we could come up with something else, but knowing which kind of compromise we’re making).

Thank you again.

Hi Guillaume,

Thank you for your reply!

Yes, although to be specific (which is important): it is selecting “Counted Super Resolution” mode in EPU and then deciding between Binning 1 (super-resolution) or Binning 2 (physical pixel).

This is because the camera timings alone cannot answer this question for you. It is going to depend on how homogeneous your sample is and the eventual B-factor of your structure, which are of course extremely sample-specific. However, let’s do some back-of-the-envelope approximations to try and answer your question.

First, though, I want to comment on the initial post. The main comment I saw was regarding the DQE curves that I posted from a lecture given by Chris Booth at Gatan. Nothing in the figure says it is an actual comparison between physical-pixel and super-resolution DQE curves for the K3, although it is easy to interpret it as such. Gatan has been extremely mum about actual DQE curves for any mode or accelerating voltage, for reasons discussed in that lecture (charging, edge creep, etc.). However, a comparison was done for the K2 detector, and Gatan has not issued any statements that make me think the actual sensor technology has changed between the two generations (although I could be wrong here and am open to corrections, of course). With this in mind, let’s focus on the following information from this paper by Rachel Ruskin at the Grigorieff lab.

Here are the DQE values at DC, half Nyquist and Nyquist as provided by Gatan for the K2:

| Accelerating Voltage (keV) | Operating Mode | DQE (0.0) | DQE (0.5) | DQE (1.0) |
| --- | --- | --- | --- | --- |
| 200 | Physical Pixel | 0.82 | 0.56 | 0.13 |
| 200 | Super-Resolution | 0.84 | 0.54 | 0.20 |
| 300 | Physical Pixel | 0.94 | 0.55 | 0.18 |
| 300 | Super-Resolution | 0.76 | 0.55 | 0.23 |

And here are the same measurements as determined by FindDQE for the K2:

| Accelerating Voltage (keV) | Operating Mode | DQE (0.0) | DQE (0.5) | DQE (1.0) |
| --- | --- | --- | --- | --- |
| 200 | Physical Pixel | 0.75 | 0.47 | 0.15 |
| 200 | Super-Resolution | 0.79 | 0.46 | 0.22 |
| 300 | Physical Pixel | 0.81 | 0.54 | 0.18 |
| 300 | Super-Resolution | 0.76 | 0.55 | 0.31 |

Note: it is not explicit what the dose rate is for these measurements.

So here the main takeaway is that DQE is likely not degraded across all spatial frequencies; rather, super-resolution mainly falls off slightly more slowly between half Nyquist and Nyquist.

Back to the rough calculations on whether the benefit lies in collecting much faster in physical-pixel mode or in collecting slower but better images in super-resolution. I made a small spreadsheet that calculates, given a pixel size, particle size, B-factor, exposure rate, and operating mode, how long it would take to reach a target resolution. I then added three boxes that, when the particle size, B-factor, and target resolution are identical, compare the timings to the first box, taking the change in DQE into account.

The spreadsheet can be found here, and since it is view-only I think you can make a copy of it that you can then modify for your own use.

The assumptions I am making for the approximations are as follows:

  • The particles completely pack and fill the field of view as spheres of the given diameter
  • The number of particles required to reach a given resolution is given as: n_{ptcls} = e^{\frac{-B}{2d^2}}
    • This follows from ResLog plots; however, there is often a y-intercept involved here that I am ignoring
  • DQE is linear but truncated to DQE(0)
  • The number of particles needed in comparison changes as a factor of the increase of calculated DQE
    • i.e. if the DQE goes from 0.25 in box 1 to 0.5 in a comparison box the number of needed particles halves
    • This is from the form of DQE = \frac{SNR_{out}}{SNR_{in}}

So feel free to play around with the spreadsheet and see how the various changes would, very roughly, affect the collection times needed. With physical-pixel acquisition running about twice as fast, it seems that it is always better to acquire in this mode, especially when the particle is small and many can fit at a high magnification, where the target resolution is a small fraction of Nyquist.
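
To make the model concrete, here is a rough Python sketch of the same calculation under the assumptions above (I read “DQE is linear but truncated to DQE(0)” as piecewise-linear interpolation capped at DQE(0); the frame geometry, DQE points, rates, and example values are all illustrative; absolute times are meaningless without the ignored ResLog intercept, so only the ratios between settings matter):

```python
import math

K3_PHYS = (5760, 4092)    # physical-pixel frame (nominal K3 geometry)
K3_SUPER = (11520, 8184)  # super-resolution frame

def dqe_at(frac_nyquist, dqe0, dqe_half, dqe_nyq):
    """Piecewise-linear DQE through the three tabulated points,
    truncated so it never exceeds DQE(0)."""
    if frac_nyquist <= 0.5:
        val = dqe0 + (dqe_half - dqe0) * frac_nyquist / 0.5
    else:
        val = dqe_half + (dqe_nyq - dqe_half) * (frac_nyquist - 0.5) / 0.5
    return min(val, dqe0)

def hours_to_target(d, B, px, shape, diam, movies_per_hr, dqe_pts):
    """Hours to reach resolution d (Angstrom), given B-factor B (negative,
    sharpening convention, so particle counts grow as d shrinks), pixel
    size px (A/px), frame shape, particle diameter diam (A), and the
    (DQE(0), DQE(0.5), DQE(1.0)) points for the operating mode."""
    n_ptcls = math.exp(-B / (2 * d * d))     # ResLog form, no intercept
    n_ptcls /= dqe_at(2 * px / d, *dqe_pts)  # DQE doubles -> counts halve
    per_image = (math.floor(shape[0] * px / diam)
                 * math.floor(shape[1] * px / diam))  # packed spheres
    return n_ptcls / per_image / movies_per_hr

# Illustrative comparison: 150 A particle, B = -100 A^2, 8 A target,
# K2 300 keV FindDQE points from the table above, and the ~2x speed
# difference between the two modes observed in EPU.
phys = hours_to_target(8.0, -100.0, 1.0, K3_PHYS, 150.0, 500,
                       (0.81, 0.54, 0.18))
sup = hours_to_target(8.0, -100.0, 0.5, K3_SUPER, 150.0, 250,
                      (0.76, 0.55, 0.31))
print(f"super-resolution takes {sup / phys:.1f}x as long as physical pixel")
```

Under these particular numbers the DQE advantage of super-resolution at a quarter of Nyquist is small, so the ~2x speed advantage of physical pixel dominates; push the target resolution toward Nyquist and the balance shifts.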

This will take me quite some more time to work out properly; however, the screenshot, where physical pixel is around 500-700 movies/hr and super-resolution around half of that (200-400), roughly matches my experience at the microscope so far, though of course it depends on many things such as grid type, shots per hole, AFIS, etc.

Hope this helps answer your questions and that you, and maybe others, find it useful. I will work at some point on collecting estimates of collection speeds for various conditions.


Very helpful, thank you so much. It confirms what I was thinking: that there is no straightforward compromise.
I have no more questions at the moment; it will take me some time to digest all this.

@dustin.morado Very insightful discussion. Thank you, both.

I have a somewhat naive practical question. SerialEM provides the option, through SEMCCD, to perform either antialias reduction or binning. The former is documented to benefit from an appreciable DQE gain at the cost of 0.1 s/frame of overhead.

The question is: is EPU’s “hardware bin 2” behaviour akin to the former or the latter? I was told that it uses DM’s antialias-aware filtering, but I get different impressions reading different references.