Some timing tests of the Gatan K3

Introduction

So while we were testing our new Krios Beta, we found ourselves a little perplexed about what throughput to expect from the K3 within EPU.

Krios Beta is a G3i with AFIS as well as the Fringe Free Imaging (FFI) upgrade, both of which should improve throughput (for more info refer to this TFS M&M 2019 poster: here). It also has the K3 detector, with its faster counting rate and frame-to-disk speed.

However, there are many options that can be expected to affect the end throughput. Some of these are straightforward to settle with a little work, such as determining how many images to take per stage movement for a particular magnification and grid type. Others, such as saving movies in TIFF versus MRC, or gain-normalized versus unprocessed, did not initially seem so straightforward.

So I wasted a weekend of my time to try and get a better grasp on how these options affect acquisition speed. You can access this timing data as well as a little applet I made to visualize the data a bit more manageably: https://github.com/DustinMorado/K3_Timing_Viz

Test setup

Unfortunately, EPU really is a giant black-box and I had no ability to test timings without running actual sessions, which is not a good isolation of the variables. If anyone has any idea how to debug the acquisitions from EPU I would be very much interested in how :sweat_smile:.

Therefore I did all of these timings with SerialEM (which is installed on our K3 PC), with debug output set to give timing information in both SerialEM and Digital Micrograph (GMS). I expect that the GMS timings would be consistent across both SerialEM and EPU. However, SerialEM retrieves the final information over a local socket (at a rate of around 500-750MBps), while EPU receives the same information over a 1Gbps connection (at a rate of around 100MBps), so the acquisition time should be slower in EPU compared to the numbers here.
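To get a feel for how much the link alone matters, here is a minimal sketch comparing transfer times at the rates quoted above. The 1 GB payload is a hypothetical size chosen only for illustration (not a measured movie size), and the function name is made up for this example.

```python
def transfer_seconds(payload_bytes, rate_mb_per_s):
    """Time to move payload_bytes at a sustained rate in MB/s (1 MB = 1e6 bytes)."""
    return payload_bytes / (rate_mb_per_s * 1e6)


# Hypothetical payload, for illustration only (not a measured movie size).
payload = 1_000_000_000  # 1 GB

# Sustained rates quoted in the post, in MB/s.
rates = {
    "local socket (low)": 500,
    "local socket (high)": 750,
    "1Gbps link to EPU": 100,
}

for label, rate in rates.items():
    print(f"{label:>20}: {transfer_seconds(payload, rate):5.2f} s")

# For the same payload, the slowdown depends only on the ratio of the rates.
slowdown_low = 500 / 100
slowdown_high = 750 / 100
print(f"EPU link is roughly {slowdown_low:.1f}x to {slowdown_high:.1f}x slower")
```

Whatever the real movie size is, the same payload should take roughly 5 to 7.5 times longer to arrive over the 1Gbps connection than over the local socket, if the quoted rates hold.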

Test 1: Fixed exposure / Vary number of frames

The first test was to see how the following affect the acquisition time for a fixed exposure:

  • Filetype
    • MRC
    • TIFF (with LZW compression)
  • Binning
    • Super Resolution
    • Counted (Hardware Bin 2)
  • Processing
    • Gain Normalized
    • Unprocessed (more accurately: still largely processed just not normalized)
  • Number of frames written to disk

I decided on an exposure time of 1.6 seconds, which at 75 fps to the K3 PC corresponds to a maximum of 120 frames. I chose this because 120 has a large number of divisors (60, 40, 30, 24, 20, 15, 12, 10, 8, 6, 5, 4, 3, 2). For each of the 8 combinations of filetype, binning, and processing, I decreased the number of output frames until the acquisition time no longer decreased. In each test I took 30 images, which was a bit overkill :man_facepalming:t5:.
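The test grid described above can be enumerated with a short sketch. The labels are just strings chosen for this example and do not correspond to any SerialEM or GMS setting names.

```python
from itertools import product

FPS = 75          # frame rate to the K3 PC, from the post
EXPOSURE_S = 1.6  # exposure time in seconds, from the post

# Maximum number of frames available in one exposure.
max_frames = round(FPS * EXPOSURE_S)

# Every output frame count that splits the exposure into equal-sized frames.
frame_counts = [n for n in range(1, max_frames + 1) if max_frames % n == 0]

# The 8 combinations of filetype, binning, and processing.
conditions = list(product(
    ["MRC", "TIFF (LZW)"],
    ["Super Resolution", "Counted"],
    ["Gain Normalized", "Unprocessed"],
))

print(max_frames)
print(frame_counts)
print(len(conditions))
```

Including 1 and 120 themselves, 120 has 16 divisors, which gives plenty of frame counts to step through per condition.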

Test 1: Results




So I could see that for each choice of filetype, binning, and processing, the acquisition time was generally linear in the number of frames written to disk, at around 0.075 - 0.2 seconds per frame. The first frame is always returned a little bit faster and the last frame is always returned a little bit slower.
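A per-frame cost like this can be estimated with an ordinary least-squares line fit of acquisition time against frame count. The sketch below uses synthetic placeholder numbers generated from an assumed model (2.0 s fixed overhead plus 0.1 s per frame), not the measured timings; swap in real values from the repository linked above to reproduce the actual slopes.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = intercept + slope * xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic placeholder data, NOT measurements: generated from an assumed
# 2.0 s fixed overhead plus 0.1 s per frame written to disk.
frames = [10, 20, 30, 40, 60, 120]
times = [2.0 + 0.1 * n for n in frames]

slope, intercept = fit_line(frames, times)
print(f"{slope:.3f} s per frame, {intercept:.2f} s fixed overhead")
```

The slope is the per-frame write cost and the intercept absorbs everything that does not scale with frame count, such as the exposure itself.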

When saving MRC, the choice of gain-normalized or unprocessed made hardly any difference for a small number of frames (~20), especially for counted data, but at large numbers of frames the processing time eventually begins to make a noticeable difference. For TIFF data, there is a much larger decrease in acquisition time from saving unprocessed. This makes sense, since the normalized frames contain a wider variety of values and are therefore harder to compress.

Interestingly, it was faster to save TIFF data than MRC at super-resolution, but it was faster to save MRC data than TIFF at the physical pixel size. This suggests that compressing is faster than writing uncompressed 11Kx8K frames, yet slower than writing uncompressed 6Kx4K frames.

Test 2: Fixed framerate / Vary exposure

The second test was to see how the same factors in the first test affect the acquisition time when the framerate to disk was fixed at the maximum of 75 fps and exposure time was varied instead.

I decided on testing from a single frame (0.0133s exposure) up to 40 frames (0.533s exposure), as 40 frames is a commonly used movie size, and I thought the speed at very small numbers of frames might be quite different. In this case I took 10 exposures of each condition. Here I was able to use a SerialEM macro, although keeping the logs organized was still very much manual :sob:.
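The exposure times for this test follow directly from the frame rate. A small sketch (the function name is made up for this example):

```python
FPS = 75  # maximum frame rate to disk, from the post


def exposure_for_frames(n_frames, fps=FPS):
    """Exposure time in seconds needed to record n_frames at fps."""
    return n_frames / fps


for n in (1, 10, 20, 40):
    print(f"{n:2d} frames -> {exposure_for_frames(n):.4f} s")
```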

Test 2: Results


The results are not so exciting, and similar to the first test. The rate in frames per second is largely constant, independent of the number of frames saved. Again the first frame is returned faster than the others and the last frame is returned slower than the others.

Returning to EPU

When I tried to apply these results to EPU, at first my results did not match up at all. Counted, which in SerialEM was considerably faster than Super-Resolution, was hardly any faster than Super-Resolution :thinking:

It turns out that the nomenclature in EPU for binning is a bit misleading (I would call it bass-ackwards):

I had assumed that ‘Counted Bin 1’ in EPU would correspond to hardware bin 2 performed by GMS (as it was in the previous GMS Camera palette), while ‘Counted Super Resolution Bin 2’ and ‘Counted Bin 2’ would correspond to binning that could be done by EPU. I don’t understand why EPU does the binning itself at all, given that both antialias-aware binning and Fourier cropping are available from GMS. Yet even so, this configuration is not what I had expected.

Selecting ‘Counted Super Resolution Bin 2’ (corresponding to hardware bin 2) means that a 6Kx4K movie is transferred over the 1Gbps connection to EPU as opposed to the 11Kx8K movie, and this has made the difference between ~250 movies/hr (‘Counted Bin 1’) and 500+ movies/hr (‘Counted Super Resolution Bin 2’) within EPU.
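To put those throughput numbers in per-movie terms, here is a quick sketch. The exact frame dimensions used are the nominal K3 sizes, which is an assumption on top of the rounded 6Kx4K and 11Kx8K figures above.

```python
def seconds_per_movie(movies_per_hour):
    """Average wall-clock time spent on each movie at a given throughput."""
    return 3600 / movies_per_hour


def pixels(width, height):
    """Number of pixels in one frame."""
    return width * height


# Nominal K3 frame sizes (assumption; the post rounds these to 6Kx4K and 11Kx8K).
counted = pixels(5760, 4092)
super_res = pixels(11520, 8184)

print(f"250 movies/hr -> {seconds_per_movie(250):.1f} s per movie")
print(f"500 movies/hr -> {seconds_per_movie(500):.1f} s per movie")
print(f"super-resolution frames carry {super_res / counted:.0f}x the pixels")
```

Halving the time per movie while transferring a quarter of the pixels is consistent with the transfer being a large part, but not all, of the per-movie cost.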

DQE Effects of hardware binning

There is an obvious cost for the increase in speed gained by simply binning the super-resolution frames though, and that cost is a considerable decrease in DQE as shown in the slide below (from the lecture here):

Methods such as Fourier cropping or real-space downsampling with an antialiasing filter reduce this lost DQE:

But we see that this comes with a considerable time penalty.
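For reference, the simple binning discussed at the start of this section can be sketched as summing non-overlapping 2x2 blocks. This is only an illustration on a tiny made-up frame; whether the hardware sums or averages, and how it does so internally, is an assumption here.

```python
def bin2_sum(frame):
    """Sum non-overlapping 2x2 blocks of a 2D list with even dimensions."""
    rows, cols = len(frame), len(frame[0])
    return [
        [
            frame[r][c] + frame[r][c + 1] + frame[r + 1][c] + frame[r + 1][c + 1]
            for c in range(0, cols, 2)
        ]
        for r in range(0, rows, 2)
    ]


# Tiny made-up 4x4 "super-resolution" frame of electron counts.
frame = [
    [0, 1, 0, 0],
    [1, 0, 0, 2],
    [0, 0, 1, 1],
    [0, 3, 0, 0],
]

binned = bin2_sum(frame)
print(binned)
```

Each output pixel only knows the total of its block, so the sub-pixel position information in the super-resolution frame is discarded without any additional antialiasing step.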

Discussion

In the end, data collection parameters need to be thoughtfully chosen by the end user, taking into account their sample, microscope parameters, and reasonable resolution targets. I can understand the decisions made in EPU to favor DQE over speed, yet it is also important to remember that many, many structures were solved using the Counting mode on the Gatan K2 camera, which is equivalent to the ‘Counted Super Resolution Bin 2’ option for the K3. For heterogeneous samples, higher throughput and thus more particles should be of chief concern.

I hope this information has been helpful, and that others may find the raw data useful. Then at least it was not an entire waste of a weekend :upside_down_face: