ambigator: Resolve indexing ambiguities

Synopsis

ambigator input.stream [-o output.stream] [options]
ambigator --help

Description

This program resolves indexing ambiguities using a simplified variant of the clustering algorithm described by Brehm and Diederichs, Acta Crystallographica D70 (2014) p101.

The algorithm starts by making a random indexing assignment to each crystal. The indexing assignment is a flag which indicates whether the crystal should be re-indexed according to the ambiguity operator.

The algorithm proceeds by calculating the individual correlation coefficients between the intensities from one crystal and those from each of the other crystals in turn. The mean correlation coefficient, f, is taken over all crystals which have the same indexing assignment as the current crystal. Separately, the mean correlation coefficient, g, is taken over all crystals which have the opposite indexing assignment to the current crystal. The indexing assignment for the current crystal is changed if g > f. Every crystal is visited once in turn, and the pass over all the crystals is repeated several times.
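The passes described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used by ambigator: the function name resolve_ambiguity, the precomputed matrix cc and the choice to treat an empty group as a mean of zero are all hypothetical.

```python
import random
from statistics import mean


def resolve_ambiguity(cc, n_iterations=6, start=None, seed=0):
    """Return one boolean flag per crystal (True = reindex).

    cc[i][j] is the correlation coefficient between crystals i and j as
    originally indexed, or None where no value is available.
    """
    n = len(cc)
    rng = random.Random(seed)
    # Random starting assignment unless one is supplied.
    if start is not None:
        flags = list(start)
    else:
        flags = [rng.random() < 0.5 for _ in range(n)]

    for _ in range(n_iterations):
        for i in range(n):
            same, opposite = [], []
            for j in range(n):
                if j == i or cc[i][j] is None:
                    continue
                if flags[j] == flags[i]:
                    same.append(cc[i][j])
                else:
                    opposite.append(cc[i][j])
            # Assumption: an empty group contributes a mean of zero.
            f = mean(same) if same else 0.0
            g = mean(opposite) if opposite else 0.0
            if g > f:
                flags[i] = not flags[i]
    return flags
```

Because each flag is updated as soon as its crystal is visited, later crystals in the same pass already see the changed assignment.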

Only one indexing ambiguity can be resolved at a time. In other words, each crystal is considered to be indexable in one of only two ways. If the true indexing ambiguity has more possibilities than this, the resolution must be performed by running ambigator multiple times with a different ambiguity operator each time.

If the ambiguity operator is known (or, equivalently, the actual and apparent symmetries are both known), then the algorithm can be enhanced by including in f the correlation coefficients of all the crystals with the opposite indexing assignment to the current one, but after reindexing the other crystal first. Likewise, g includes the correlation coefficients of the crystals with the same indexing assignment after reindexing. This enhances the algorithm to an extent roughly equivalent to doubling the number of crystals.
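A possible way to picture the enhanced f and g is shown below. It is a sketch under assumptions: the names f_and_g and cc_reindexed are hypothetical, and cc_reindexed[i][j] is taken to mean the correlation coefficient between crystal i as originally indexed and crystal j after applying the ambiguity operator.

```python
from statistics import mean


def f_and_g(i, flags, cc, cc_reindexed):
    """Return (f, g) for crystal i when the ambiguity operator is known."""
    agree, disagree = [], []
    for j in range(len(flags)):
        if j == i:
            continue
        if flags[j] == flags[i]:
            # Same assignment: the plain CC supports the current flag,
            # the CC after reindexing crystal j argues against it.
            agree.append(cc[i][j])
            disagree.append(cc_reindexed[i][j])
        else:
            # Opposite assignment: the roles are swapped.
            agree.append(cc_reindexed[i][j])
            disagree.append(cc[i][j])
    # Assumption: an empty group contributes a mean of zero.
    f = mean(agree) if agree else 0.0
    g = mean(disagree) if disagree else 0.0
    return f, g
```

Every other crystal now contributes one value to f and one to g, which is one way to see why the improvement is comparable to doubling the number of crystals.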

The default behaviour is to compare each crystal to every other crystal. This leads to a computational complexity proportional to the square of the number of crystals. If the number of crystals is large, the number of comparisons can be limited without compromising the algorithm much. In this case, the crystals to correlate against will be selected randomly.
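The random selection of comparison partners might look roughly like this. The helper choose_partners is hypothetical, and the sketch ignores the requirement that two crystals share enough common reflections to be correlated at all.

```python
import random


def choose_partners(i, n_crystals, ncorr, rng):
    """Pick the crystals that crystal i will be correlated against."""
    others = [j for j in range(n_crystals) if j != i]
    # No limit, or a limit larger than the pool: compare against everyone.
    if ncorr is None or ncorr >= len(others):
        return others
    # Otherwise draw ncorr distinct partners at random.
    return rng.sample(others, ncorr)


rng = random.Random(0)
partners = choose_partners(3, 10000, 1000, rng)
```

With a limit of ncorr partners, the number of comparisons grows as the number of crystals times ncorr rather than as the square of the number of crystals.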

By default, the reflections will be compared and correlated in three resolution bins: up to 10, 10-2.5 and above 2.5 Angstrom. You can override this by using --highres and --lowres, in which case only one resolution bin will be used for all reflections.

Options

-o filename
--output=filename

Write a re-indexed version of the input stream to filename. This stream can then be merged normally using process_hkl or partialator, but using the actual symmetry instead of the apparent one.

Warning: There is no default filename. The default behaviour is not to output any reindexed stream!

-y pg
--symmetry=pg

Set the actual symmetry of the crystals. If you're not sure, set this to the highest symmetry which you want to assume, which might be -1 to assume Friedel's Law alone or 1 (the default) for no symmetry at all. The algorithm will work significantly better if you can use a higher symmetry here.

-w pg

Set the apparent symmetry of the crystals. The ambiguity operator will be determined by comparing this to the actual symmetry.

If you prefer (or the scenario demands it), you can specify the ambiguity operator directly using --operator.

Using this option (or --operator) improves the algorithm to an extent roughly equivalent to doubling the number of crystals.

--operator=op

Specify the indexing ambiguity operator. Example: --operator=k,h,-l.
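To illustrate what an operator such as k,h,-l does to a reflection, here is a small Python sketch. The function parse_operator is hypothetical and handles only signed single-index terms like those in the example above; it is not a description of the syntax ambigator itself accepts.

```python
def parse_operator(op):
    """Turn a string such as 'k,h,-l' into a function on (h, k, l)."""
    terms = [t.strip() for t in op.split(",")]
    if len(terms) != 3:
        raise ValueError("expected three comma-separated terms")
    spec = []
    for term in terms:
        sign = -1 if term.startswith("-") else 1
        name = term.lstrip("+-")
        if name not in ("h", "k", "l"):
            raise ValueError(f"unsupported term: {term!r}")
        # Remember the sign and which input index to take.
        spec.append((sign, "hkl".index(name)))

    def apply(hkl):
        return tuple(sign * hkl[idx] for sign, idx in spec)

    return apply


reindex = parse_operator("k,h,-l")
print(reindex((1, 2, 3)))  # (2, 1, -3)
```

Applying this particular operator twice returns the original indices, which is consistent with each crystal being indexable in one of only two ways.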

If you prefer, you can specify the ambiguity operator by specifying the apparent symmetry using -w.

Using this option (or -w) improves the algorithm to an extent roughly equivalent to doubling the number of crystals.

-n n
--iterations=n

The number of passes through the data to make. Extra iterations are not expensive once the initial correlation calculation has been performed, so set this value quite high. Two or three iterations are normally sufficient unless the number of correlations (see --ncorr) is small compared to the number of crystals. The default is --iterations=6.

-j n

Number of threads to use for the CC calculation.

--highres=d

High resolution cutoff in Angstroms.

--lowres=d

Low resolution cutoff in Angstroms.

--start-assignments=filename

Read the starting assignments from filename. The file must be a list of 0 or 1, one value per line, in the same order as the crystals appear in the input stream. 1 means that the pattern should be reindexed according to the ambiguity operator. The length of the file must be at least equal to the number of crystals in the input stream.

--end-assignments=filename

Write the end assignments to filename. The file will be a list of 0 or 1, one value per line, in the same order as the crystals appear in the input stream. 1 means that the pattern should be reindexed according to the ambiguity operator.
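The assignment files used by --start-assignments and --end-assignments are simple enough to handle with a short script. The helpers below are a hypothetical sketch that follows the format described above; skipping blank lines is an assumption, not documented behaviour.

```python
def write_assignments(path, flags):
    """Write one 0 or 1 per line, in crystal order."""
    with open(path, "w") as fh:
        for flag in flags:
            fh.write("1\n" if flag else "0\n")


def read_assignments(path, n_crystals):
    """Read at least n_crystals flags (True = reindex) from path."""
    flags = []
    with open(path) as fh:
        for line in fh:
            value = line.strip()
            if value == "":
                continue  # assumption: blank lines are ignored
            if value not in ("0", "1"):
                raise ValueError(f"unexpected value: {value!r}")
            flags.append(value == "1")
    if len(flags) < n_crystals:
        raise ValueError("fewer assignments than crystals")
    # Extra entries beyond the number of crystals are dropped.
    return flags[:n_crystals]
```

For example, sum(read_assignments(path, n)) gives the number of crystals flagged for reindexing.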

--fg-graph=filename

Write f and g values to filename, one line per crystal, repeating all crystals as they are visited by the algorithm. Plot these using fg-graph from the CrystFEL script folder to evaluate the ambiguity resolution.

--ncorr=n

Use n correlations per crystal. The default is to correlate against every crystal. If the CC calculation is too slow, try --ncorr=1000. Note that this option sets the maximum number of correlations, and some crystals might not have enough common reflections to correlate to the number requested. The mean number of actual correlations per crystal will be output by the program after the CC calculation, and if this number is much smaller than n then this option will not have a significant effect.

--really-random

Be non-deterministic by seeding the random number generator (used to make the initial indexing assignments and select patterns to correlate against) from /dev/urandom. Otherwise, with single-threaded operation (-j 1) on the same data, the results from this program should be the same if it is re-run. Using more than one thread already introduces some non-deterministic behaviour.

--corr-matrix=filename

Write the correlation matrices in HDF5 format to filename. The file will contain two datasets: correlation_matrix and correlation_matrix_reindexed. They contain, respectively, the correlation matrix with all crystals in their original orientations and all crystals in the reindexed orientations. If the ambiguity operator is unknown (i.e. neither --operator nor -w was used), then the latter will be zero everywhere.

Author

This page was written by Thomas White.

Reporting bugs

Report bugs to taw@physics.org

Copyright and Disclaimer

Copyright © 2014-2020 Deutsches Elektronen-Synchrotron DESY, a research centre of the Helmholtz Association.

Please read the AUTHORS file in the CrystFEL source code distribution for a full list of contributions and contributors.

CrystFEL is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

CrystFEL is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with CrystFEL. If not, see http://www.gnu.org/licenses/.

See also

crystfel indexamajig

If CrystFEL is installed on your computer, you can read this manual page offline using the command man ambigator.