There are two major paradigms for running AlignTK tools in parallel for faster throughput on large datasets.  The first paradigm is used for simple image processing operations that individually run fairly quickly but must be applied to thousands of images.  Here, a program called prun manages the parallel execution of individual commands.  It takes a command template and a list of strings, substitutes each list entry into the template, and then runs the resulting commands on a set of worker processes.  For example, if the file reduce.cmd contains:

reduce -factor 8 -input images/%s.tif -output images8/%s.tif

and the file images.lst contains:

z01
z02
z03

then running the following MPI program (possibly inside a batch job):

mpirun -np 4 prun -command reduce.cmd -list images.lst

will cause these individual commands to be delegated to the workers:

reduce -factor 8 -input images/z01.tif -output images8/z01.tif
reduce -factor 8 -input images/z02.tif -output images8/z02.tif
reduce -factor 8 -input images/z03.tif -output images8/z03.tif

Since there are 4 processes in total, there will be one master and three workers, and all three commands above will run in parallel.  If the list had contained more entries, each worker would receive the next command as soon as it became free, until every item in the list had been processed.
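
The substitution and scheduling described above can be illustrated with a short Python sketch.  This is not prun's implementation: expand_commands and simulate_dispatch are hypothetical helper names, the assumption that every %s in the template receives the same list entry is inferred from the example above, and the run times passed to simulate_dispatch are made-up numbers used only to show how the first free worker picks up the next command.

```python
import heapq


def expand_commands(template, entries):
    """Substitute each list entry for every %s in the template (assumed behavior)."""
    return [template.replace("%s", entry) for entry in entries]


def simulate_dispatch(durations, n_processes):
    """Simulate handing each command, in list order, to the first free worker.

    durations   -- hypothetical run time of each command
    n_processes -- total process count given to mpirun -np
    Returns (worker index used for each command, total elapsed time).
    """
    n_workers = n_processes - 1  # one process acts as the master
    if n_workers < 1:
        raise ValueError("need at least 2 processes: one master and one worker")

    # Min-heap of (time at which the worker becomes free, worker index).
    free_at = [(0.0, w) for w in range(n_workers)]
    heapq.heapify(free_at)

    assignment = []
    elapsed = 0.0
    for duration in durations:
        start, worker = heapq.heappop(free_at)  # earliest free worker
        end = start + duration
        assignment.append(worker)
        elapsed = max(elapsed, end)
        heapq.heappush(free_at, (end, worker))
    return assignment, elapsed


template = "reduce -factor 8 -input images/%s.tif -output images8/%s.tif"
entries = ["z01", "z02", "z03"]
commands = expand_commands(template, entries)
for command in commands:
    print(command)

# Five commands with made-up run times, spread over 3 workers (4 processes).
assignment, elapsed = simulate_dispatch([5.0, 2.0, 3.0, 4.0, 1.0], n_processes=4)
print(assignment, elapsed)
```

In the simulated run the fourth command goes to the worker that finished the shortest of the first three commands, which is the behavior described above for lists longer than the number of workers.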

The more CPU-intensive programs such as find_rst, register, align, and ortho handle their parallelism internally, and do not need to be invoked via prun.  These programs take lists of images or maps as input, and distribute these over the available processes.  For example, to start a parallel run of register, one would issue a command like this (either interactively or from within a batch job):

mpirun -np 128 register -pairs pairs.lst -input images/ -output maps/  ...<other options>...

This would start up 128 instances of the register program, with one being the master and the other 127 being workers.  The number of processes should not exceed the number of elements in the list by more than 1; otherwise, there will be idle processes that never have any work assigned to them.  The -update option to find_rst and register is often useful when adding new images to a list, or when manually adding correspondence points using inspector.  With this option, these codes recompute maps only for images that have changed, or to which correspondence points have been added, since the last time the codes were run.
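
The sizing rule for the master/worker programs can be checked before submitting a job.  The following Python sketch is only an illustration: count_entries and useful_process_count are hypothetical helpers, and the assumption that a list file holds one entry per non-blank line is not confirmed by this section.

```python
def count_entries(list_text):
    """Count non-blank lines, assuming one list entry per line."""
    return sum(1 for line in list_text.splitlines() if line.strip())


def useful_process_count(n_entries, n_available):
    """Largest process count that leaves no worker permanently idle.

    One process is the master, so at most n_entries workers can be busy.
    """
    return min(n_available, n_entries + 1)


# A short hypothetical pairs list: 3 entries, so 4 processes are enough
# even if 128 are available.
sample_list = "z01 z02\nz02 z03\nz03 z04\n"
n_entries = count_entries(sample_list)
print(useful_process_count(n_entries, 128))

# With a long list, the available process count is the limit instead.
print(useful_process_count(5000, 128))
```

The result would be passed as the -np value to mpirun.  This rule is stated above only for the master/worker codes; it is not claimed here for align and ortho, which use a different decomposition.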

align and ortho are the only codes that require fast, tight synchronization among their component processes.  They do not use a master/worker decomposition, and any process may potentially communicate with any other process.  For these codes, a high-bandwidth, low-latency interconnect such as InfiniBand is recommended.  Since align requires only the maps and not the images themselves, the amount of data it needs is only a fraction of the raw image data size.  One might therefore consider transferring the maps to a tightly-coupled cluster with a low-latency interconnect for processing with align, while doing the remainder of the processing on a more loosely-coupled cluster.


Copyright © 2019 National Center for Multiscale Modeling of Biological Systems. All Rights Reserved.