Creating Optics Groups from EPU AFIS data and more

dustin.morado · 20 June 2020 20:32

So you recently used “Faster Acquisition” in EPU for your session, and you also want to try the newest features in RELION which allow you to correct for higher-order aberrations and anisotropic magnification. However, you quickly see this requires you to sort the data into “Optics Groups” before you begin processing. From the publication:

[…] in RELION-3.1. To facilitate this, we have introduced the concept of optics groups : partitions of the particle set that share the same optical properties, such as the voltage or pixel size (or the aberrations and the magnification matrix).

So the question becomes “How do I sort my AFIS data into Optics Groups?”

The short answer is to use this: GitHub - DustinMorado/EPU_group_AFIS: Makes RELION 3.1 Optics Groups from EPU AFIS data


However, if you want to know a bit more about how we got here, you can keep reading.

Purpose

Traditionally SPA data is collected using the stage to move from hole to hole. This is because the alternative would be to use what is called beam-image shift, which for high-resolution TEM work was non-ideal.

The reasons for this are largely two-fold:

Off-axis use of the objective lens
Coma and astigmatism due to beam-tilt

Off-axis collection

The ray diagram of a hole exposed using beam-image shift looks something like this:

What you will notice is the beam does not go through the center of the objective lens, but “off-axis” to it. There is a common phrase that EM-lenses are equivalent to using the bottom of a Coca-Cola bottle for your microscope. This is not really the case anymore, but what is true is that EM-lenses are plagued by strong aberrations and we must take care to minimize these.

Spherical aberration (C_s) is a particularly strong defect in EM-lenses and it reduces the spatial coherence of our imaging system. We do all sorts of things to reduce the effects of spherical aberration, ranging from the simple use of a small condenser aperture to the exorbitantly expensive half-meter tall C_s-correctors installed on several Krioses around the world. Only the central tens of microns of an EM-lens are truly usable for focusing electrons, and the performance degrades extremely quickly as we move away from the optical axis of the objective lens, which is the most important and the strongest lens in the microscope.

From the ray diagram we see the displacement away from the optical axis entering the upper-objective. We can also see that the half-angle of collection exiting the lower-objective is much greater. This greatly decreases our resolution, as shown in the figure below from source:

We see that the finite size of our source is governed by the cube of the angle, not good at all! In addition above we can see again how the effects of the spherical aberration increase as we move out from the lens center.
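To put rough numbers on that cube law, here is a minimal sketch. It assumes the standard expression r = C_s * alpha^3 for the blur radius referred back to the specimen, and a C_s of 2.7 mm, which is only an illustrative value and not something taken from the figure:

```python
def spherical_aberration_radius_nm(cs_mm: float, alpha_mrad: float) -> float:
    """Blur radius in nm from spherical aberration, r = C_s * alpha**3.

    cs_mm      : spherical aberration coefficient in mm (assumed value below)
    alpha_mrad : half-angle of collection in mrad
    """
    cs_nm = cs_mm * 1e6            # mm -> nm
    alpha_rad = alpha_mrad * 1e-3  # mrad -> rad
    return cs_nm * alpha_rad ** 3


CS_MM = 2.7  # assumption: illustrative C_s, not a measured value

for alpha_mrad in (5.0, 10.0, 20.0):
    radius = spherical_aberration_radius_nm(CS_MM, alpha_mrad)
    print(f"half-angle {alpha_mrad:4.1f} mrad -> blur radius {radius:.2f} nm")
```

Each doubling of the half-angle increases the blur eight-fold, which is why even a modest increase in collection angle is so costly.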

All of this together shows why for so long we have always collected our micrographs “on-axis”, and limited exposures collected with beam-image shift to low-dose focusing and tracking images. Beam-image shift, unfortunately in my case, is often used in tomography data where we need to collect off-axis to align the offset between tilt-axis and the center of the detector.

Beam-tilt induced coma and astigmatism

The stronger error-inducing artifact of beam-image shift is that it introduces both coma and astigmatism into our imaging system. Astigmatism is no longer the issue it once was: improvements in detectors have increased the signal-to-noise ratio of diffractograms (the 2-D power spectra of micrographs that we use for CTF-fitting), allowing for more accurate measurements. Coma, however, is a particularly nasty error and breaks a lot of the assumptions we use in our forward-model processing algorithms for refinement (before developments like RELION 3.1, that is).

Coma, and its effects on phase-contrast data, is a rather complicated topic deserving of its own forum post. For now I will just put a link here to a very good review by Anchi Cheng and Bob Glaeser.

Now this coma as I mentioned is caused by beam-tilt, but in our beam-image shift ray diagram there is no tilt shown, both imaging beams pass through our sample as parallel beams. However this is an ideal case, and it requires that the deflection coils function perfectly and that the pivot points are well aligned. The pivot points ensure that beam-shift introduces minimal beam-tilt and vice versa. The image below helps put some visual context to pivot points:

Pivot points need to be well aligned, but that is our job! Adjusting the pivot points regularly decreases the stability of the microscope for the next user. DON’T TOUCH THEM!

We can see from above that if we change the height of the specimen (say away from the mechanical Eucentric height due to a bent grid) the pivot point is no longer perfect. If we change the strength of the objective lens (defocus) then the back-focal plane changes height (very slightly but it does change) and again the pivot point is no longer perfect. So we can only do our best to set the alignments at Eucentric focus and minimize the effects. Additionally, the deflector coils are not exact to begin with, nor are they situated perfectly orthogonal to each other which we need to handle the 2-D situation we are actually considering.

Again, with all of this considered, it is no surprise that we have always avoided beam-image shift exposures for high-resolution data collection. But a number of things have changed recently that have made beam-image shift collection more attractive:

Refinement algorithms have improved and become more sophisticated
- This is also largely tied to the detector advancements, which provide the increased SNR needed for these software developments
We are tackling more heterogeneous samples, which above all demand huge numbers of particles
Microscope access is limited and so high throughput is essential

This has of course led Thermo-Fisher to develop

Aberration Free Image Shift

source

AFIS attempts to further minimize the effects of coma and astigmatism induced by beam-image shift. The idea is fairly simple and is set up (AFIS alignment (service)) starting from a well-aligned microscope with coma and astigmatism already corrected for the on-axis condition:

1. Apply beam-image shift with the X-axis deflector coils by a decent amount (4um)
2. Calculate a Zemlin tableau to measure and correct the induced coma (as is done in Sherpa or the EPU Coma-free autofunction)
3. Calculate the diffractogram to measure and correct the induced astigmatism
4. Note the adjustments to the objective stigmator and beam deflection coils
5. Repeat 1 - 4 in 4um steps up to 20um and then in -4um steps to -20um
6. Repeat 1 - 5 but now using the Y-axis deflector coils
7. Finally, a combination of X and Y deflector coils is used to account for perpendicular correction

With this data the system can now compensate for the coma and astigmatism induced by the imperfect beam-image shift system. The correction is not perfect though, and so you will find that even though calibration is done out to 20um, EPU restricts beam-image shift to 6um. Collecting data with AFIS has been shown to substantially reduce the amount of accumulated beam-tilt (in the published case found to be 0.19 mrad/um shift) and improve the final achieved resolution (source).

However it is important to note AFIS does not reduce the aberration caused by:

Beam-tilt induced by changes in sample height / defocus
Any beam-tilt existing in the on-axis condition
Spherical aberration
Off-axial imaging

Furthermore since we can estimate aberrations due to beam-tilt and beyond in software post-collection we should continue to do so. This requires the data to be grouped into sets of similar aberration conditions, which requires dividing the AFIS data based on the amount of beam-image shift applied.

Unfortunately with the development of EPU and RELION being anything but synced, this division is not particularly straightforward. However I hope the code and Python notebook linked to at the start of the post will prove to be useful in getting the data split appropriately. I truly hate writing Python, but I hope it makes the process a little more transparent compared to early iterations written in BASH.
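As a rough illustration of the grouping idea (this is not the code from the repository), here is a minimal K-means sketch in plain Python. The beam-image shift coordinates are synthetic: a hypothetical 3 x 3 pattern of holes 3um apart with 20 shots each. How the real values are read from the EPU metadata is deliberately left out, and the function name `kmeans` is only illustrative.

```python
import math
import random


def kmeans(points, k, n_iter=50):
    """Plain Lloyd's K-means with a deterministic farthest-point start."""
    centers = [points[0]]
    while len(centers) < k:
        # Next start: the point farthest from every center chosen so far.
        centers.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        )
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assign every point to its nearest center.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points
        ]
        # Move every center to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(
                    (
                        sum(x for x, _ in members) / len(members),
                        sum(y for _, y in members) / len(members),
                    )
                )
            else:
                new_centers.append(centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return labels


# Synthetic beam-image shifts in um (assumption: 9 holes, 20 shots per hole).
rng = random.Random(0)
hole_centres = [(3.0 * ix, 3.0 * iy) for ix in (-1, 0, 1) for iy in (-1, 0, 1)]
shifts = [
    (cx + rng.uniform(-0.2, 0.2), cy + rng.uniform(-0.2, 0.2))
    for cx, cy in hole_centres
    for _ in range(20)
]

labels = kmeans(shifts, k=9)
optics_groups = [lab + 1 for lab in labels]  # 1-based group numbers
print(sorted(set(optics_groups)))
```

Each micrograph then carries the group number of the cluster its beam-image shift falls into, and that number is what would end up as its optics group.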

Please let me know as always if you have any comments, suggestions or questions about this topic. I am always happy to talk further. I would like to hope that our community will more actively engage in discussion when we have time to read and digest the topics, and when there is maybe not the same stage fright as speaking up in the bi-weekly meeting.

Marta_Carroni · 21 June 2020 17:53

Super nice Dustin, thanks!
I am a bit confused about what exactly would happen in the presence of a Cs corrector. There will still be a problem induced by beam tilt I guess. Is this correct?
xxx

dustin.morado · 21 June 2020 23:52

Yes! C_s-corrected microscopes have to use the corrector to manage all of the other aberrations, which, with no remaining C_s, increase in their impact on imaging (there are these slides from EMBO 2015 which are useful). This means that AFIS, as far as I am aware, is not really usable in these scopes. In the TFS poster I think that’s shown with Note 2:

Coma correction via beam tilt is not applied in Cs image-corrected systems

I even think that the corrected microscopes can use hardly any beam-image shift because of off-axial aberrations the system is not suited for, but this could be changing in the future with developments which account for off-axial effects: WO2019133433A1 - Method, device and system for reducing off-axial aberration in electron microscopy - Google Patents

dustin.morado · 29 June 2020 14:39

So it was brought to my attention from the cryoSPARC documentation that template positions are indicated in the filenames of EPU movies. However, upon further inspection EPU is not creating different tags for different holes (hopefully this is something they will be adding soon):

github.com/DustinMorado/EPU_group_AFIS

EPU can indicate image shift group in file names

opened 11:11AM - 29 Jun 20 UTC

closed 02:31PM - 29 Jun 20 UTC

Guillawme

Hi, This is very helpful, thank you! I especially like the visualization of image shift groups. But it seems a bit crazy that one has to do all this to figure out which movies belong to which image shift group. Surely, EPU must be able to indicate the image shift group in filenames? It would then be easy to match files belonging to only one group during import in RELION, to import each group independently. So, I tried to find whether the image shift group information is already present in the file name, and according to [CryoSPARC's documentation](https://cryosparc.com/docs/tutorials/ctf-refinement#exposure-groups) this is indeed the case: the first number after `Data` in the file name supposedly encodes image shift group. I checked in a bunch of movies I have that were recorded with EPU, and this first number after `Data` indeed only ever takes 4 different values across all my movies, so this is consistent with an image shift group (although not obvious at all, since these numbers look completely arbitrary and are big). I am not sure this is done by default, maybe it was set this way in the EPU session that produced this particular set of movies I have? But I would suggest you check what your EPU does, because it seems you can maybe make your life a lot easier to assign image shift groups.

Here’s the link to the cryoSPARC information in case you don’t have the patience to read through the issue.

But regardless, this was news to me and so I wanted to share the information with you all as well in case you find it useful.
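To check what your own EPU writes, a minimal sketch along these lines may help. It pulls out the first number after `Data` and counts how many movies share each value. The file names below are made up for illustration, and the function name `group_by_data_id` is hypothetical; only the "first number after `Data`" rule comes from the issue above.

```python
import re
from collections import defaultdict

# First run of digits following "_Data_" in the file name.
DATA_ID = re.compile(r"_Data_(\d+)_")


def group_by_data_id(filenames):
    """Map the first number after 'Data' to the movies that carry it."""
    groups = defaultdict(list)
    for name in filenames:
        match = DATA_ID.search(name)
        if match is None:
            continue  # not an EPU-style data file name, skip it
        groups[match.group(1)].append(name)
    return dict(groups)


# Hypothetical file names, only for illustration.
movies = [
    "FoilHole_1001_Data_2001_3001_20200629_111100_Fractions.mrc",
    "FoilHole_1002_Data_2001_3002_20200629_111130_Fractions.mrc",
    "FoilHole_1003_Data_2002_3003_20200629_111200_Fractions.mrc",
]

groups = group_by_data_id(movies)
for data_id, names in sorted(groups.items()):
    print(data_id, len(names))
```

If the number of distinct values matches the number of exposure positions you set up in your template, the tag is a candidate for a group label; if not, you are back to the beam-image shift values themselves.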

dustin.morado · 26 January 2021 11:20

When K-means fails

One problem I recently ran into sorting optics groups from EPU AFIS is when the holes are not uniformly visited.

Let’s look at this example where the K-means did not properly cluster the holes:

The holes at the top and the bottom are clustered with the neighbouring holes and because we asked for 13 groups, the centre and another hole were split into two groups.

What is going wrong

The problem here is that the points are not of uniform density, with many more micrographs taken from the central 9 holes and far fewer from the surrounding 4. In this case there are only 21 micrographs at the top hole and 14 at the bottom one, while the neighbouring holes have 791 and 812 shots respectively. This throws off the centroid of each cluster used in K-means clustering.

Hierarchical Ascendant Clustering

So I made an addition to the code and added an option --algorithm to select Hierarchical Ascendant Clustering (HAC). This is an alternative clustering strategy that builds a dendrogram of the points based on their distances and then links the branches of this tree according to a selected linkage condition. The traditional linkage condition is the Ward criterion, which minimizes the variance within a branch while maximizing the variance between branches. However, this is subject to the same problems caused by the non-uniform density. So here we use a “complete” linkage, which considers only the largest distance between the members of two clusters. Applying this new clustering method to the same data, we see that we now separate the holes successfully.
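For anyone curious what complete linkage does, here is a minimal plain-Python sketch. It is not the implementation in the repository, and the data are synthetic: two densely sampled holes and one with only three shots, 3um apart. The function name `complete_linkage` is only illustrative.

```python
import math
import random


def complete_linkage(points, n_clusters):
    """Naive agglomerative clustering with complete linkage.

    The distance between two clusters is the largest distance between any
    pair of their members. At every step the two closest clusters merge.
    """
    clusters = {i: [i] for i in range(len(points))}
    dist = {
        (i, j): math.dist(points[i], points[j])
        for i in clusters
        for j in clusters
        if i < j
    }
    while len(clusters) > n_clusters:
        a, b = min(dist, key=dist.get)  # closest pair of clusters, a < b
        clusters[a].extend(clusters.pop(b))
        for k in clusters:
            if k == a:
                continue
            pair_a = (min(a, k), max(a, k))
            pair_b = (min(b, k), max(b, k))
            # Complete linkage: keep the larger of the two old distances.
            dist[pair_a] = max(dist[pair_a], dist[pair_b])
        # Drop every entry that still refers to the absorbed cluster b.
        dist = {pair: d for pair, d in dist.items() if b not in pair}
    labels = [0] * len(points)
    for label, members in enumerate(clusters.values()):
        for i in members:
            labels[i] = label
    return labels


# Synthetic shifts in um (assumption): shots per hole are very uneven.
rng = random.Random(1)
holes = {(0.0, 0.0): 40, (0.0, 3.0): 40, (0.0, 6.0): 3}
shifts, truth = [], []
for hole_id, ((cx, cy), n_shots) in enumerate(holes.items()):
    for _ in range(n_shots):
        shifts.append((cx + rng.uniform(-0.3, 0.3), cy + rng.uniform(-0.3, 0.3)))
        truth.append(hole_id)

labels = complete_linkage(shifts, n_clusters=3)
for hole_id in range(3):
    print(hole_id, {lab for lab, t in zip(labels, truth) if t == hole_id})
```

Because only distances matter, and not how many points sit in each hole, the three-shot hole keeps its own group. This naive version scales poorly with the number of micrographs; for real data sets a library implementation such as scikit-learn's AgglomerativeClustering with linkage="complete" is the more sensible choice.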

Wrapping up

So if you run into this issue with your data, please download the latest version of the code and try again passing the option ‘--algorithm hac’.

Hope you find it useful!

YYang · 28 June 2023 07:09

Hi Dustin,
Thanks for all this detailed information and the very useful EPU_group_AFIS.py script. I am wondering if it is also possible to create optics groups from the .gtg files generated by the Gatan Latitude S software? I am currently processing several datasets collected using the Gatan Latitude S software. I am pretty sure the data collection utilized a strategy similar to EPU AFIS, but I don’t know how to read the beam shift data from the .gtg file associated with each movie file. It would be great if EPU_group_AFIS.py could also extract such information for datasets collected using Gatan Latitude S.

Guillaume · 7 November 2023 11:33

With EPU 3.6 this will become much easier: simply matching file names will generate the correct groups.

Guillaume · 5 April 2024 09:56

In case this is helpful to others: I could get the correct grouping without the XML files, using the new field in file names as described in the EPU manual, with the following parameters in the cryoSPARC job “Exposure Group Utilities” (I ran it after Patch CTF, but before Curate Exposures).

Input Selection: exposure
Action: split
Field to use to split Dataset: movie_blob/path
Token Creation Strategy: string_split
File path separator: _
Split Group Index: 5

benis_froms · 19 May 2024 15:17

This information could indeed be helpful to others working in structural biology and using cryo-electron microscopy for their research. It’s always beneficial to share knowledge and tips within the scientific community.

benis_froms · 21 May 2024 20:52

Lawatson · 10 June 2024 05:48

Hi, thanks for sharing your method. As a beginner I found it really helpful.
