Data analysis at the European XFEL

 


Provided on a best effort basis - and may go out of date!


Links to useful resources:

  - XFEL data analysis documentation

  - Maxwell cluster (offline analysis)

  - Detector-specific analysis

  - XFEL detector calibration



Offline analysis

  - Log on to the Maxwell cluster: max-exfl.desy.de

                User notes on remote access, installed software and related topics: Maxwell cluster

  - Data and computing are physically located at DESY.  Data must be moved to DESY before it is available there.

  - Subscribe to the maxwell-user mailing list for updates on cluster status, file system problems and similar issues.
                Users can self-subscribe here: https://lists.desy.de/sympa/info/maxwell-user

  - Data location: /gpfs/exfel/exp/<instrument>/<proposal_cycle>/<proposal_id>

                Example: /gpfs/exfel/exp/SPB/201701/p002012/

                    /raw = raw data

                    /scratch = temporary data (really scratch - may be wiped as needed)

                    /usr = where to put your scripts (synchronised between online and offline, limited space)

                    /proc = data output by the XFEL calibration pipeline

  - The batch queue is managed by Slurm; the queue name for XFEL analysis is upex

                Instructions: Getting started, and a more detailed guide

                Example: > sbatch -p upex --wrap hostname
                             Submitted batch job 1516
                         > cat slurm-1516.out
                             max-exfl001.desy.de

  - Don’t run big jobs on the login nodes; request an interactive node instead

                Example: > salloc -p upex -t 10:00:00

                then ssh directly to the node allocated to you

  - Docker containers are used to distribute Karabo for use on Maxwell:  Instructions

                Note: you first need to request access to Docker containers by sending an email to maxwell.service@desy.de,

                otherwise you will get the error “Cannot connect to the Docker daemon. Is the docker daemon running on this host?”

  - Remote access to the cluster is possible through ssh://bastion.desy.de

  - FastX graphics connections are available using a FastX client.
            In an emergency there is also a web browser interface, but its performance is not as good.

  - Transferring large amounts of data back home is best done with Globus Online
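
The batch-queue steps above can also be driven from a script. The following is a minimal Python sketch, assuming only the `sbatch -p upex --wrap` form and the "Submitted batch job N" reply shown in the example; the helper names are hypothetical, and the log file name relies on Slurm's default `slurm-<jobid>.out` pattern.

```python
import re
import shlex

PARTITION = "upex"  # queue name for XFEL analysis, taken from the notes above


def sbatch_command(shell_command, partition=PARTITION):
    """Return the argument list for a one-line wrapped batch job."""
    return ["sbatch", "-p", partition, "--wrap", shell_command]


def parse_job_id(sbatch_reply):
    """Extract the job number from sbatch's 'Submitted batch job N' reply."""
    match = re.search(r"Submitted batch job (\d+)", sbatch_reply)
    if match is None:
        raise ValueError(f"unexpected sbatch reply: {sbatch_reply!r}")
    return int(match.group(1))


def default_output_file(job_id):
    """Name of the log file Slurm writes when no output option is given."""
    return f"slurm-{job_id}.out"


if __name__ == "__main__":
    # Only prints the command line; nothing is submitted here.
    print(shlex.join(sbatch_command("hostname")))
```

On Maxwell the list returned by `sbatch_command` could be passed to `subprocess.run(..., capture_output=True, text=True)` and the captured reply fed to `parse_job_id` to find the matching log file.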



Available software on Maxwell

  - An extensive software stack is available on Maxwell.
            See the list of installed software and photon science specific packages


  - Standard installation modules available on Maxwell can be listed with
                 > module avail

            This includes python3, IDL, Matlab and many other common programs


  - A public version of the CFEL software stack is available:

                > source /gpfs/cfel/cxi/common/public/cfelsoft-rh7-public/conda-setup.sh

                > conda info --envs

                        # conda environments:
                        #
                        base
                        ana-1.4.2
                        conda_build
                        crystallography

                > conda activate crystallography

             

  - The normal CFEL stack is also available for CFEL people.
           To use the public stack instead, CFEL people first have to unload our own internal version:

                > unset MODULEPATH     (to start from a clean slate)



Rapid disk-based analysis (“online”) (at Schenefeld facility)

  - Data is held on the online cluster before it is moved to the offline cluster at DESY (the move must be done manually)

  - Machines are on the private control network; access is only available from certain machines physically at XFEL

  - Shared access to 6 nodes.  Usage starts and stops with your 12-hour experiment shift (!!!)

                (exflonc06, exflonc07, exflonc08, exflonc09, exflonc10, exflonc11)

  - Off-shift access to one machine for installation and testing

                SPB/SFX: exflonc05

                FXE: exflonc12

  - Data is located in /gpfs/p<proposal_id>/(raw|usr|proc|scratch)

                Note that this path differs from the one on the offline cluster (!!!)

  - The /usr folder is synchronised between online and offline and has limited space, so use it for scripts and software but not data.
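
Because the same proposal lives under different paths on the two clusters, scripts kept in /usr tend to need a small translation step. The sketch below is one hedged way to do that in Python. It assumes the proposal directory carries the same name on both clusters (for example p002012), which these notes do not state explicitly; the function names and the online example path are hypothetical.

```python
from pathlib import PurePosixPath

OFFLINE_ROOT = PurePosixPath("/gpfs/exfel/exp")  # Maxwell (offline) layout
ONLINE_ROOT = PurePosixPath("/gpfs")             # online cluster layout
SUBDIRS = ("raw", "usr", "proc", "scratch")


def offline_dir(instrument, cycle, proposal_dir, subdir):
    """Build <root>/<instrument>/<proposal_cycle>/<proposal_dir>/<subdir>."""
    if subdir not in SUBDIRS:
        raise ValueError(f"unknown proposal subdirectory: {subdir!r}")
    return OFFLINE_ROOT / instrument / cycle / proposal_dir / subdir


def online_to_offline(online_path, instrument, cycle):
    """Map /gpfs/<proposal_dir>/<subdir>/... to the offline layout.

    Assumption: the proposal directory name is identical on both clusters.
    """
    relative = PurePosixPath(online_path).relative_to(ONLINE_ROOT)
    proposal_dir, subdir, *rest = relative.parts
    return offline_dir(instrument, cycle, proposal_dir, subdir).joinpath(*rest)


if __name__ == "__main__":
    # Hypothetical online path, mapped for instrument SPB and cycle 201701.
    print(online_to_offline("/gpfs/p002012/raw/r0283", "SPB", "201701"))
```

The instrument and proposal cycle have to be supplied by hand because the online path does not contain them.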



Real time analysis

  - Needs to be integrated with Karabo operation. Experts only at the moment. See the documentation from XFEL

  - OnDA is running, but in development mode at the moment



Reading data files

  - EuXFEL data is saved in highly structured HDF5 files.

  - Example data can be found in /gpfs/exfel/data/scratch/example_data/

  - A collection of file readers is available from this repository:

                    https://stash.desy.de/projects/ELBE/repos/euxfel/browse

                which is mirrored here to give push access to those without a stash account:

                    https://github.com/antonbarty/EuXFEL

                Contributions welcome - there is no point reinventing the wheel here.  Please email for push access.
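
Before picking a reader, it can help to look at how one of the example files is laid out. The following is a minimal sketch, not one of the readers from the repositories above: it uses the third-party h5py package (assumed to be available in your Python environment) to list every dataset in a file with its shape and type. The function names are hypothetical; the file pattern is the one used in the example data directory.

```python
import glob


def describe(name, obj):
    """Return a one-line summary for a dataset, or None for a group."""
    shape = getattr(obj, "shape", None)
    if shape is None:  # HDF5 groups carry no shape; only datasets do
        return None
    return f"{name}  shape={shape}  dtype={obj.dtype}"


def list_datasets(filename):
    """Walk one HDF5 file and summarise every dataset it contains."""
    import h5py  # third-party; imported here so describe() works without it

    summaries = []

    def visit(name, obj):
        line = describe(name, obj)
        if line is not None:
            summaries.append(line)
        # Returning None tells h5py to keep walking the file.

    with h5py.File(filename, "r") as handle:
        handle.visititems(visit)
    return summaries


if __name__ == "__main__":
    pattern = "/gpfs/exfel/data/scratch/example_data/r0283/RAW-R0283-AGIPD*0.h5"
    for filename in sorted(glob.glob(pattern)):
        print(filename)
        for line in list_datasets(filename):
            print("   ", line)
```

Run on Maxwell, this prints the internal paths of each file; run anywhere else, the glob simply matches nothing.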



Detector data

  - AGIPD is a complex detector: each pixel has 352 memory cells, each with a 3-stage dynamically-switching gain.
           It is effectively 352 detectors in one. Data is saved directly from the detector and converted to linearised units in
           software post-processing.  This lossless approach allows improved calibrations and photon conversion to be
           applied after the data is taken, but requires a script to be run in order to perform that conversion.

  - Example data can be found in /gpfs/exfel/data/scratch/example_data/

  - A sample calibration script is /gpfs/exfel/data/scratch/example_data/calibrate.py

                Example: > python calibrate.py --input /gpfs/exfel/data/scratch/example_data/r0283/RAW-R0283-AGIPD*0.h5 \
                               --output ../offline_data/calibrated_agipd/ \
                               --local-cal-store ../offline_ana/agipd_store.h5 \
                               --mem-cells 30 --cores 64 \
                               --instance SPB_DET_AGIPD1M-1 \
                               --type correct --nodes 5 \
                               --partition upex

  - Running this script requires access to Docker containers (see the note in the offline analysis section above)

  - Notes from XFEL: offline_calibration.pdf
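
To illustrate why the conversion needs a dedicated script, here is a deliberately simplified Python sketch of a generic offset-and-gain correction for a single pixel. It is not the algorithm used by calibrate.py or the XFEL calibration pipeline; it only shows that every sample must be corrected with constants selected by both the memory cell and the gain stage that recorded it. All names and numbers are made up for illustration.

```python
N_CELLS = 352        # memory cells, from the description above
N_GAIN_STAGES = 3    # gain stages, from the description above


def linearise(raw_adc, cell, gain_stage, offset, gain_factor):
    """Convert one raw sample to linear units.

    Subtracts the offset of the (cell, gain stage) that recorded the sample,
    then scales by the gain factor of that same (cell, gain stage).
    """
    if not 0 <= cell < N_CELLS:
        raise IndexError(f"memory cell out of range: {cell}")
    if not 0 <= gain_stage < N_GAIN_STAGES:
        raise IndexError(f"gain stage out of range: {gain_stage}")
    return (raw_adc - offset[cell][gain_stage]) * gain_factor[cell][gain_stage]


# Made-up constants for ONE pixel, indexed as table[cell][gain_stage].
# Real values would come from a calibration store and differ per cell.
offset = [[4000.0, 5000.0, 6000.0] for _ in range(N_CELLS)]
gain_factor = [[1.0, 30.0, 200.0] for _ in range(N_CELLS)]

if __name__ == "__main__":
    print(linearise(4500.0, cell=0, gain_stage=0, offset=offset, gain_factor=gain_factor))
    print(linearise(5100.0, cell=10, gain_stage=1, offset=offset, gain_factor=gain_factor))
```

Even in this toy form, one pixel needs 352 x 3 pairs of constants, which is why keeping the raw data and applying the calibration afterwards allows the constants to be improved later.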



   



Note:

First experiments at EuXFEL are in mid-September 2017.  Things are changing very quickly and this page may go out of date. Please be patient regarding any errors.