# Running ORAC
Three Python scripts are provided to simplify the process of running ORAC:

- `orac.py` fully processes a single file and is the main script you should use;
- `single_process.py` runs one step of the ORAC processor on a single file;
- `regression.py` runs a suite of regression tests.
A full list of arguments for each script can be found by calling any script with the `--help` argument. This page introduces the most common arguments.
## orac.py

This script has a single mandatory argument: the name of a satellite imagery file to process. If this is not an absolute path, specify the directory with `--in_dir`. Please do not rename satellite files, as the file name format is used to determine much of the file's metadata. These formats are specified in the `FileName` class, which you may edit to accommodate new sensors.
The most commonly used arguments are:

- `--out_dir` specifies the output directory;
- `--preset_settings` indicates which predefined settings (from your local defaults) should be used;
- `--limit X0 X1 Y0 Y1` limits processing to the rectangle specified (in 1-indexed satellite pixel counts);
- `--l1_land_mask` uses the land mask in the satellite data and is highly recommended for polar-orbiting satellites;
- `--skip_ecmwf_hr` is recommended for new users, as this feature isn't very important when using new meteorological data;
- `--use_oc` uses Ocean Colour CCI data in the sea-surface reflectance calculation and is highly recommended for aerosol retrievals;
- `--revision` sets the revision number, which is written into the file names. It must be set if not using a Git repository;
- `--procs N` uses N cores during processing;
- `--clobber X` sets the clobber level: 3 overwrites all existing files, 2 only overwrites final results, 1 overwrites everything except pre-processed files, and 0 leaves all existing files in place.
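Putting the common arguments together, a typical call might look like the following sketch (the file name, paths and revision number are placeholders; substitute your own):

```shell
# Process one satellite file with the most common options
# (SATELLITE_FILE and all paths/values are placeholders)
orac.py SATELLITE_FILE \
    --in_dir /path/to/level1 --out_dir /path/to/output \
    --limit 1 500 1 500 \
    --l1_land_mask --use_oc \
    --revision 1234 --procs 4 --clobber 2
```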
ORAC can be run through a batch queuing system, rather than on your local machine, with the `--batch` argument. The batch system is specified at the bottom of your `local_defaults.py`. The controls for it are:

- `--label` sets the name of the job;
- `--dur X Y Z` specifies the maximum allowed duration (in HH:MM) for the pre, main and post processors (X, Y, Z, respectively);
- `--ram X Y Z` specifies the maximum allowed memory (in MB) for the pre, main and post processors.
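For example, a sketch of a batch submission (the file name, job label, durations and memory values below are all placeholders):

```shell
# Submit to the batch system defined in local_defaults.py
# (SATELLITE_FILE, label and numeric values are placeholders)
orac.py SATELLITE_FILE --out_dir /path/to/output --batch \
    --label my_orac_run \
    --dur 01:00 06:00 00:30 \
    --ram 8000 16000 8000
```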
The following arguments are useful for debugging:

- `--script_verbose` prints the driver files to the screen;
- `--verbose` activates full verbosity, printing all progress within the program;
- `--dry_run` prints the driver files to the screen but does not call any executables;
- `--keep_driver` keeps the driver files after processing;
- `--timing` prints the duration of each process;
- `--available_channels` specifies the channels to read from the satellite data (though they may not necessarily be used);
- `--settings` allows direct specification of settings (without `--preset_settings`). It may be given multiple times to specify multiple processings;
- `--settings_file FILE` works like `--settings`, but each line of FILE specifies a processing.
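For instance, to inspect what would be run without executing anything (the file name is a placeholder):

```shell
# Print the driver files that would be used, but call no executables
orac.py SATELLITE_FILE --out_dir /tmp/orac_test --dry_run

# Run for real, keeping the driver files and timing each stage
orac.py SATELLITE_FILE --out_dir /tmp/orac_test --keep_driver --timing
```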
If you wish to run with a different ORAC executable (e.g. because you have made some changes and wish to compare the results with and without them), use `--orac_dir` to specify the root of the altered source-code directory and/or `--orac_lib` to specify the new library dependencies.
For users familiar with the driver file format, it is possible to write lines directly by two means. In these, SECTION can take the values `pre`, `main` or `post` to indicate which part of the processor to affect.

- `--additional SECTION KEY VALUE` sets the variable KEY to equal VALUE in the driver file when running SECTION;
- `--extra_lines SECTION FILE` copies the contents of FILE into the driver file of SECTION.
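As an illustration (the key, value and file names here are placeholders, not real driver-file variables):

```shell
# Set a single variable in the main processor's driver file
# (SOME_KEY/some_value stand in for a real driver key and value)
orac.py SATELLITE_FILE --additional main SOME_KEY some_value

# Append the contents of a file of extra lines to the pre-processor's driver file
orac.py SATELLITE_FILE --extra_lines pre ~/extra_driver_lines.txt
```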
## local_defaults.py

The main job of the Python scripts is, given a satellite file, to locate the appropriate auxiliary files to pass to ORAC. They do so by searching various paths for expected file names. To save typing them in each call, these paths are consolidated in a single file: `local_defaults.py`. You will need to prepare one to describe your local environment. A description of each variable is provided in our general example, while a specific example is available for processing on JASMIN.
If you installed ORAC using Anaconda, your local defaults file should be stored at `${CONDA_PREFIX}/lib/python3.7/site-packages/pyorac` (adjusting the Python version as appropriate). Otherwise, leave it in `tools/pyorac`.
The values defined in this file are only defaults. All can be overridden for a call using the arguments `--aux` (for paths), `--global_att` (for NCDF attributes) or `--batch_settings` (for batch processing settings). These all use the syntax `-x KEY VALUE`, where KEY is the name of the variable you wish to set and VALUE is its new value.
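For example, reusing keys that appear elsewhere on this page (`ecmwf_dir`, `project`, `queue`; the file name, paths and values are placeholders):

```shell
# Override individual defaults for a single call
orac.py SATELLITE_FILE \
    --aux ecmwf_dir /path/to/ecmwf \
    --global_att project MyProject \
    --batch_settings queue short
```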
## single_process.py

This script takes a single argument, as above, and:
- if it is a satellite image, runs the pre-processor;
- if it is any output of the pre-processor, runs the main processor once;
- if it is any output of the main processor, runs the post-processor.
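A sketch of stepping through the chain manually (the intermediate file names are placeholders; use the actual outputs written to your output directory):

```shell
# 1. Pre-process the satellite image
single_process.py SATELLITE_FILE --out_dir /path/to/output

# 2. Run the main processor on one of the pre-processor's outputs
single_process.py /path/to/output/PREPROCESSOR_OUTPUT

# 3. Run the post-processor on one of the main processor's outputs
single_process.py /path/to/output/MAIN_OUTPUT
```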
Each of these has associated arguments to control the operation of ORAC. The same arguments are used by `--settings` or the `retrieval_settings` in your local defaults.
### Pre-processor arguments

- `--day_flag N` specifies whether only day (1) or night (2) pixels should be processed. The default behaviour (0) is to process everything. (Twilight is neither day nor night.)
- `--dellat` and `--dellon` set the reciprocal of the resolution of the pre-processing grid. RTTOV is only run over that grid and then interpolated to each satellite pixel. These should be less than or equal to the equivalent values for the meteorological data used.
- `--ir_only` skips all visible channels, saving time for cloud-top height retrievals.
- `--camel_emis` uses the CAMEL surface emissivity library rather than the RTTOV atlas.
- `--ecmwf_nlevels N` specifies the number of levels in the meteorological data given (60, 91 or 137).
- `--use_ecmwf_snow` uses the snow/ice fields in the meteorological data rather than those from NISE.
- `--no_snow_corr` skips the snow/ice correction, saving time for geostationary imagery.
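For example, a sketch of a fast infrared-only run on day-time pixels (the file name, paths and values are illustrative):

```shell
# Infrared-only pre-processing of day-time pixels on a 1-degree grid
single_process.py SATELLITE_FILE --out_dir /path/to/output \
    --day_flag 1 --ir_only --no_snow_corr \
    --dellon 1.0 --dellat 1.0 --ecmwf_nlevels 137
```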
### Main processor arguments

- `--approach` gives the forward model to be used. These are:
  - `AppCld1l` for single-layer cloud;
  - `AppCld2l` for two-layer cloud;
  - `AppAerOx` for aerosol over sea (using a BRDF surface model);
  - `AppAerSw` for aerosol over land and multiple-view imagery (using the Swansea surface model);
  - `AppAerO1` for aerosol over land and single-view imagery (using a BRDF surface model).
- `--ret_class` allows alteration of the approach and only needs to be set if you wish to experiment with the forward model. The options are:
  - `ClsCldWat` for water cloud;
  - `ClsCldIce` for ice cloud;
  - `ClsAerOx` for aerosol over sea (using a BRDF surface model);
  - `ClsAerSw` for multiple-view aerosol (using the Swansea surface model);
  - `ClsAerBR` for multiple-view aerosol (using a BRDF-resolving Swansea surface model);
  - `ClsAshEyj` for ash.
- `--phase` gives the type of particle to evaluate. These are:
  - `WAT` for water cloud;
  - `ICE` for ice cloud;
  - `A70` for dust;
  - `A71` for polluted dust;
  - `A72` for light polluted dust;
  - `A73` for light dust;
  - `A74` for light clean dust;
  - `A75` for Northern Hemisphere background;
  - `A76` for clean maritime;
  - `A77` for dirty maritime;
  - `A78` for polluted maritime;
  - `A79` for smoke;
  - `EYJ` for ash.
- `--multilayer PHS CLS` sets the `--phase` and `--ret_class` for the lower layer in a two-layer retrieval.
- `--use_channels` sets which channels should be used. Requested channels that weren't made available with `-c` are quietly ignored.
- `--types` allows the user to limit which pixels are processed to those listed. Pixels are flagged by type in pre-processing as one of CLEAR, SWITCHED_TO_WATER, FOG, WATER, SUPERCOOLED, SWITCHED_TO_ICE, OPAQUE_ICE, CIRRUS, OVERLAP, PROB_OPAQUE_ICE, or PROB_CLEAR. By default, all are processed.
- `--no_land` skips all land pixels;
- `--no_sea` skips all ocean pixels;
- `--cloud_only` skips all clear-sky pixels;
- `--aerosol_only` skips all cloudy pixels.
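For example, a sketch of a single-layer water-cloud retrieval over cloudy sea pixels (the file name is a placeholder and the channel numbers are illustrative; the right channels depend on your sensor):

```shell
# Single-layer water cloud over sea, cloudy pixels only
orac.py SATELLITE_FILE --out_dir /path/to/output \
    --approach AppCld1l --phase WAT --ret_class ClsCldWat \
    --use_channels 1 2 3 4 --cloud_only --no_land
```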
### Post-processor arguments

- `--phases` specifies which phases should be combined into this file. The code will not automatically work out which ones you want during single processing (but will work fine during normal running).
- `--chunking` splits the satellite orbit into 4096-line chunks, which is useful for machines with limited memory.
- `--compress` compresses the data in the final output. This can significantly reduce the size of aerosol files, which contain many fill values.
- `--no_night_opt` suppresses the output of cloud optical properties at night.
- `--switch_phase` is a correction for cloud-only processing, whereby water pixels with a CTT below the freezing point are forced to ice (and vice versa).
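For example, a sketch of combining water and ice retrievals into one compressed file (the input file name is a placeholder, and the exact value syntax for `--phases` may differ; check `--help`):

```shell
# Combine WAT and ICE retrievals, compressing the output
single_process.py /path/to/output/MAIN_PROCESSOR_OUTPUT \
    --phases WAT ICE --compress --no_night_opt
```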
## regression.py

This runs the ORAC regression tests: a sampling of orbits over Australia on 20 June 2008. If you intend to commit code to this repository, make certain that it compiles and can run these tests without unexpected changes.
The script accepts all of the arguments from the scripts above but ignores any `--settings` in favour of the built-in tests. Additional arguments are:

- `--tests` specifies which tests should be run. They are:
  - the short tests (five lines containing both cloud and clear sky): DAYMYDS, NITMYDS, DAYAATSRS, NITAATSRS, DAYAVHRRS, NITAVHRRS. These are sufficient in most circumstances and are run by default;
  - the long tests (processing the entire image): DAYMYD, NITMYD, AATSR, AVHRR. All of these can be called by `--long`.
- `--test_type` specifies which manner of test should be run (specifically, which suffix to use when setting `--preset_settings`): C for cloud, A for aerosol, or J for joint.
- `--benchmark` suppresses the comparison. By default, the script will increment the revision number of the repository by 1 and compare the new outputs to the previous ones.
## Examples

```shell
orac.py /network/group/aopp/eodg/atsr/aatsr/v3/2008/06/03/ATS_TOA_1PUUPA20080603_160329_000065272069_00111_32730_5967.N1 \
    --out_dir /data/MEXICO --available_channels 1 2 3 4 5 6 7 8 9 10 11 12 13 14 \
    --limit 1 512 17200 18200 -S AATSR_J --l1_land_mask --use_oc --procs 7
```
This will process an AATSR orbit from 3 June 2008 stored in Oxford, saving the results to the folder /data/MEXICO. All 14 channels in the data will be used (`--available_channels`) over a 512×1001-pixel block of the orbit (`--limit`). The orbit will be evaluated using the preset settings for a joint retrieval (`-S AATSR_J`; 23 runs covering two single-layer clouds, one multilayer cloud, ten sea-only BRDF-surface aerosol retrievals, and ten land-only Swansea-surface aerosol retrievals) but using the satellite data's own land/sea mask (`--l1_land_mask`) and input from Ocean Colour CCI data (`--use_oc`). Seven cores will be used for this processing (`--procs 7`).
```shell
orac.py -i /network/group/aopp/eodg/atsr/aatsr/v3/2008/01/19 \
    -o /network/aopp/apres/users/povey/settings_eval/default_land \
    --day_flag 1 --dellon 1.5 --dellat 1.5 --ecmwf_flag 4 \
    -x ecmwf_dir /network/aopp/matin/eodg/ecmwf/Analysis/REZ_0750 --skip_ecmwf_hr \
    --settings_file ~/new_retrieval --use_oc --l1_land_mask --keep_driver \
    --batch --ram 4000 4000 4000 -b queue legacy \
    -g project AERONETCOLLOCATION -g product_name N0183-L2 \
    ATS_TOA_1PUUPA20080119_103548_000065272065_00165_30780_4013.N1
```
This will process the day-time segment (`--day_flag`) of an AATSR file from 19 January 2008, saving the result to a folder called default_land (`-o`). The pre-processing grid will have a resolution of 0.75° (`--dellon --dellat`) and draw from operational ECMWF forecasts (`--ecmwf_flag`, `-x ecmwf_dir`) only (`--skip_ecmwf_hr`). The settings are drawn from the file new_retrieval in my home directory (`--settings_file`), though Ocean Colour CCI data and the L1 land mask are added. The driver files will be retained after processing (`--keep_driver`). The job will be batch processed on the queue legacy (`--batch`, `-b queue`), allocating 4 GB of RAM to each stage (`--ram`). The project name will be AERONETCOLLOCATION and the product name will be N0183-L2 (`-g project`, `-g product_name`).
```shell
regression.py -o /data/testoutput --l1_land_mask --procs 8 -r 1870 -C1
```
This runs the six short, cloud regression tests (the default), saving the results to /data/testoutput. The L1 land/sea mask is used. Eight cores will be used and the result labelled as revision 1870. Any existing pre-processor files of that revision will be kept (`-C1`). (This call was made during debugging, when the pre-processor worked fine but the main processor failed, so there was no need to repeat the successful steps.)
## The pyorac module

The Python scripts in `orac/tools` are fairly simple wrappers for the code in `orac/tools/pyorac`. The files there are:

- `arguments.py` defines all of the command-line arguments for the various scripts and the functions that check the inputs are valid;
- `batch.py` defines the interface to call a batch queuing system;
- `definitions.py` defines some classes used throughout the code. All satellite instruments and particle types need to be defined in here;
- `drivers.py` contains functions that create driver files for each part of ORAC. This is where most of the work actually happens;
- `local_defaults.py` defines the default locations that the scripts search for input files, so you don't have to type the full paths every time;
- `mappable.py` contains a class used for plotting satellite swaths on maps (it's somewhat like Mappoints from IDL);
- `regression_tests.py` defines which files and pixels to run during testing;
- `run.py` contains functions that call everything else (`process_all` contains everything needed to run ORAC);
- `swath.py` contains a class used for loading and filtering ORAC data;
- `util.py` collects various routine functions used throughout the code.
## Troubleshooting

The most common error is an environment error. To run, the scripts require a number of external libraries to be installed and must be able to find the `orac/tools/pyorac` folder. The conda installation should do all of that. When something goes wrong, try `cd $ORACDIR/tools` (if that helps, it means PYTHONPATH hasn't been updated correctly). If you want to quickly check that the scripts compile, call `orac.py -h` for the help prompt.
The next most common error comes from the `local_defaults.py` file pointing to folders that don't exist, which usually appears as some sort of "File not found" error. Then there are file-name errors: the script makes certain assumptions about the format of the input file names and, when those don't hold, it fails with unhelpful error messages from `drivers.py`.
Other common errors:

- When multiple syntax errors occur, it means Python can't compile the code. Check that your ORAC environment is activated, as you might otherwise be using the system's default Python version, which is rather old.
- If you get error code -11, try `ulimit -s 500000` to increase your stack size. If that works, consider adding that line to your `.bashrc` file or the ORAC activation script.
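The stack-size workaround looks like this (the value 500000 is the one suggested above; raising the soft limit may fail if your system's hard limit is lower):

```shell
# Raise the stack size (in KB) for the current shell session
ulimit -s 500000

# To make this permanent, add the same line to your ~/.bashrc
# or to the script that activates your ORAC environment.
```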