dlpick¶
Phase detection and picking on waveforms.
Description¶
dlpick is a highly configurable SeisComP StreamApplication client that reads waveforms from a RecordStream and outputs picks produced by Deep Learning (DL) models that were created and trained with the SeisBench framework.
Installation¶
Install dlpick using GSM:
$ gsm install dlpick
This installs three packages:
dlpick - this application
dlbase - dependency package for deep learning applications
dlmodels-pick - DL models, installed to /home/data/dlmodels, by default
Workflow¶

Figure 1: simplified architecture of dlpick¶
Quick start¶
Like any other SeisComP StreamApplication client, dlpick can read from any RecordStream and connect to a database and the messaging. This also means that dlpick can operate in both online and offline mode.
Let’s start with online mode.
Example: Online mode¶
Make sure the necessary SeisComP modules are running. For example, if you provide waveforms via seedlink:
$ seiscomp start scmaster seedlink
While it is possible to start dlpick as a module like above (or via scconfig), we will first have a look at the command-line options.
Without any arguments, dlpick uses the local configuration and connects to the local database and messaging:
$ dlpick
If you already have global bindings attached to your streams that provide detecStream and detecLocid, dlpick will use its default settings to process the streams and send the resulting picks to the messaging. One major difference to scautopick is that DL models generally need three components. Therefore, dlpick tries to find three-component channels based on the parameters above.
The DL model¶
The default DL model is PhaseNet with its original weights. This model needs 30 s of raw three-component input data at a sample rate of 100 sps. The app takes care of resampling and all pre-processing steps that are specific to the model.
Other models can be used; currently, only the EQTransformer and PhaseNet architectures are supported. To list the models that shipped with your installation:
$ dlpick --print-models
Try another model. For local seismicity, instance weights are a good choice.
$ dlpick --model eqtransformer --weights instance
Note that the model and the weights should be written in lower case. Weights can also be abbreviated: you could type --weights in instead.
In contrast to PhaseNet, the EQTransformer architecture needs 60 s of input data.
Confidence thresholds¶
The applied DL model outputs, for each data sample, a value that can be interpreted as the model’s confidence in a P or an S pick. This allows confidence thresholds to restrict or relax the number (and quality) of picks. The default thresholds are very low (0.1). For most models and data, a threshold of about 0.3 is appropriate for both P and S.
$ dlpick --p-threshold 0.3 --s-threshold 0.3
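To illustrate how a threshold restricts the number of picks, here is a minimal sketch that selects local maxima of a per-sample confidence trace. It is not dlpick’s actual picking code; the function name pick_peaks, the synthetic trace and the simple peak criterion are assumptions made for this example.

```python
def pick_peaks(confidence, threshold, sample_rate=100.0):
    """Return (time_s, confidence) for local maxima at or above threshold.

    Hypothetical helper: a simple local-maximum search, not dlpick's algorithm.
    """
    picks = []
    for i in range(1, len(confidence) - 1):
        c = confidence[i]
        is_peak = c > confidence[i - 1] and c >= confidence[i + 1]
        if c >= threshold and is_peak:
            picks.append((i / sample_rate, c))
    return picks


# Synthetic confidence trace with a weak and a strong peak.
trace = [0.0, 0.05, 0.15, 0.05, 0.0, 0.2, 0.6, 0.9, 0.6, 0.1, 0.0]

print(pick_peaks(trace, 0.1))  # both peaks pass the low default threshold
print(pick_peaks(trace, 0.3))  # only the strong peak remains
```

Raising the threshold from 0.1 to 0.3 drops the weak peak and keeps only the confident one.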
Overlapping¶
It is advisable to process waveforms with overlapping time windows. This way the model gets more chances to find phase onsets that are close to the window’s borders. But this also means more picks. dlpick uses a reduction algorithm that searches for very dense picks and selects only those with the highest confidence.
Set a useful overlap ratio (between 0.5 and 0.9), but keep in mind that this comes at the expense of performance:
$ dlpick --ol-ratio 0.9
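The performance cost of a higher overlap ratio can be estimated by counting windows. The following sketch assumes that consecutive windows are shifted by window length times (1 - ratio); the function window_starts and this stride formula are illustrative assumptions, not taken from the dlpick source.

```python
def window_starts(total_s, window_s, ol_ratio):
    """Start times (s) of fixed-length windows that overlap by ol_ratio.

    Assumption: the stride between windows is window_s * (1 - ol_ratio).
    """
    if not 0.0 <= ol_ratio < 1.0:
        raise ValueError("ol_ratio must be in [0, 1)")
    step = window_s * (1.0 - ol_ratio)
    starts = []
    k = 0
    # Small epsilon guards against floating-point round-off at the end.
    while k * step + window_s <= total_s + 1e-9:
        starts.append(round(k * step, 6))
        k += 1
    return starts


# Number of 30 s windows needed to cover 120 s of data.
for ratio in (0.05, 0.5, 0.9):
    print(ratio, len(window_starts(120.0, 30.0, ratio)))
```

Under this assumption, 120 s of data yield 4 windows at a ratio of 0.05, 7 at 0.5 and 31 at 0.9, which is why a high overlap increases the inference load.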
By default, the reduction algorithm ensures that picks are at least 5 s apart. If you think this interval is too wide, you can change it (currently only per model architecture), say to 1.9 s:
$ dlpick --model eqtransformer --weights in --ol-ratio 0.9 --models.eqtransformer.reducePicksInterval=1.9
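The effect of the reduction interval can be sketched with a greedy selection that keeps the most confident pick and discards neighbours closer than the interval. This is one possible way to implement such a reduction, not necessarily the algorithm dlpick uses; reduce_picks and the sample picks are hypothetical.

```python
def reduce_picks(picks, interval_s):
    """Keep the highest-confidence picks that are at least interval_s apart.

    picks: list of (time_s, confidence) tuples.
    Greedy sketch: visit picks from most to least confident and keep a pick
    only if no already-kept pick lies closer than interval_s.
    """
    kept = []
    for t, c in sorted(picks, key=lambda p: p[1], reverse=True):
        if all(abs(t - kept_t) >= interval_s for kept_t, _ in kept):
            kept.append((t, c))
    return sorted(kept)


picks = [(10.0, 0.4), (10.8, 0.9), (13.0, 0.5), (20.0, 0.7)]

print(reduce_picks(picks, 5.0))  # default-like interval
print(reduce_picks(picks, 1.9))  # narrower interval keeps the pick at 13.0 s
```

With a 5 s interval only the picks at 10.8 s and 20.0 s survive; with 1.9 s the pick at 13.0 s is kept as well.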
Performance¶
Batches
To improve performance, waveforms should be processed in batches and, if possible, on the GPU. By default, the batch size is 32, which might lead to longer waiting times. If this is a problem, set the batch size to 1 to process each data window instantly:
$ dlpick --batch-size 1
Batches are packed for each Picker configuration. This means, if multiple streams are processed with the same configuration, only one Picker is in charge. Each picker will try to fill a batch of the same size before processing the data.
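The latency trade-off of batching can be illustrated with a small accumulator that only releases data once a batch is full. The class Batcher is a hypothetical sketch of the described behavior, not part of dlpick.

```python
class Batcher:
    """Collect data windows and release them only when a batch is full."""

    def __init__(self, batch_size):
        if batch_size < 1:
            raise ValueError("batch_size must be >= 1")
        self.batch_size = batch_size
        self.pending = []

    def push(self, window):
        """Add a window; return a full batch, or None while still filling."""
        self.pending.append(window)
        if len(self.pending) >= self.batch_size:
            batch, self.pending = self.pending, []
            return batch
        return None


batcher = Batcher(batch_size=3)
print(batcher.push("win-1"))  # None: batch not yet full, window waits
print(batcher.push("win-2"))  # None
print(batcher.push("win-3"))  # ['win-1', 'win-2', 'win-3']

# With batch size 1, every window is released immediately.
print(Batcher(batch_size=1).push("win-1"))  # ['win-1']
```

A larger batch size means the first window waits until enough further windows have arrived, which is the waiting time mentioned above.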
Hint
This behavior might change in future releases, as it can be desirable to have one picker for many streams running with large batch size and another one, say, for only one station that should be
At the moment, we only have one Picker configuration, since we are doing everything from the command line without station-specific dlpick profiles. Binding profiles allow dlpick to run with a specific Picker configuration per station.
GPU
If your machine has a CUDA GPU, you should use it to accelerate predictions, together with appropriate batch sizes:
$ dlpick --gpu --batch-size 128
However, this is more of an advantage in offline mode.
Phase hints¶
By default, dlpick puts a ‘d’ in front of the phase name and appends the confidence as a percentage:
"dP (93.3%)", "dS (10.1%)"
If you need clean phaseHints for your pipeline, change that like this:
$ dlpick --phase-prefix "" --phase-postfix ""
Output¶
If you don’t want to send the picks to the messaging, use one or more of these output options:
Write to SCML file: --ep
Write CSV file: --csv
Print to terminal: --stdout
These options don’t take arguments (file names are numbered but always have the same name stem), and they can be combined. Note that if one of them is provided, sending to the messaging is deactivated.
To activate messaging:
-H localhost
Example: Offline mode¶
Of course, dlpick can also process miniSEED files and work entirely in offline / playback mode:
$ dlpick -I waveforms.mseed --inventory-db inventory.xml --config-db config.xml --config-file dlpick.cfg \
--ep --offline --batch-size 512 --gpu --stdout --out /tmp
Module Configuration¶
etc/defaults/global.cfg
etc/defaults/dlpick.cfg
etc/global.cfg
etc/dlpick.cfg
~/.seiscomp/global.cfg
~/.seiscomp/dlpick.cfg
dlpick inherits global options.
Note
Modules/plugins may require a license file. The default path to license files is @DATADIR@/licenses/, which can be overridden by global configuration of the parameter gempa.licensePath. Example:
gempa.licensePath = @CONFIGDIR@/licenses
- leadTime¶
Default:
300
Unit: s
Type: int
(Not implemented!) Defines the time in seconds to start picking on waveforms before current time.
- modelsPath¶
Default:
/home/data/dlmodels/dlmodels-pick
Type: path
Search path for Deep Learning models. This path is also used to look up the file "dlmodels_maps.json" holding additional information on the available models.
- model¶
Type: string
Name of the picker model. Possible values: phasenet, phasenetv, eqtransformer - for PhaseNet, VariableLengthPhaseNet and EQTransformer, respectively.
- weights¶
Type: string
Name of the weights to load model with. Following the SeisBench naming convention, this is usually the name of the dataset the model was trained on. For example, if you want to load a PhaseNet model with pre-trained weights on the ETHZ dataset, you would use ‘phasenet’ as model and ‘ethz’ as weights parameter. Apart from that, Gempa’s "dlmodels naming convention" can be used, but needs to be split in two: E.g., you may find this model in /home/data/dlmodels/dlmodels-pick/ with file stem "pt-et-0". You can load it by setting the model parameter to ‘phasenet’ and ‘et-0’ as weights parameter (instead of "ethz"). This way you can load any model in that directory, without the need of knowing the actual dataset name. For a list of available models and weights, call dlpick --print-models.
- phasePrefix¶
Default:
d
Type: string
Prefix that will be put in front of the phase hint.
- phasePostfix¶
Type: string
Suffix for the phaseHint string of the pick. This string is interpolated like a Python format string. You can use "$c" as placeholder for the pick confidence. Example: "({$c:.1%})" would result in "(56.0%)" as a suffix for a pick with confidence 0.56012345, whereas "{$c}" would simply add the confidence to the phase string: "0.56012345". Note that "$c" needs to be inside a curly-brace context to work as a variable.
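The placeholder behavior can be reproduced with Python’s own format strings. The sketch below assumes that "$c" is simply turned into a named format field; the helper render_postfix is hypothetical and only mimics the documented examples.

```python
def render_postfix(template, confidence):
    """Render a phasePostfix-style template for a given confidence.

    Assumption: "$c" inside curly braces maps to a named format field "c",
    so the usual Python format specifications (e.g. ".1%") apply.
    """
    return template.replace("$c", "c").format(c=confidence)


print(render_postfix("({$c:.1%})", 0.56012345))  # (56.0%)
print(render_postfix("{$c}", 0.56012345))        # 0.56012345
```

The format specification ".1%" multiplies the value by 100 and prints one decimal place followed by a percent sign.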
- phaseMap¶
Type: string
Rename phase hints. E.g., to rename P -> Is and S -> It: "P:Is,S:It"
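The mapping string is a comma-separated list of "old:new" pairs. A minimal parser for this format could look as follows; parse_phase_map and rename_phase are hypothetical helpers for illustration, and leaving unmapped phases unchanged is an assumption.

```python
def parse_phase_map(spec):
    """Parse a string such as "P:Is,S:It" into a dict {"P": "Is", "S": "It"}."""
    mapping = {}
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        old, sep, new = item.partition(":")
        if not sep or not old.strip() or not new.strip():
            raise ValueError(f"invalid phase map entry: {item!r}")
        mapping[old.strip()] = new.strip()
    return mapping


def rename_phase(phase, mapping):
    """Return the mapped phase hint; unmapped phases stay unchanged (assumed)."""
    return mapping.get(phase, phase)


phase_map = parse_phase_map("P:Is,S:It")
print(phase_map)                     # {'P': 'Is', 'S': 'It'}
print(rename_phase("P", phase_map))  # Is
```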
- ep¶
Type: boolean
If set, picks will be dumped to an XML file. Can be combined with --csv. You can change the output path using the parameter outDir or --out on the command line. During a run, dlpick generates consecutively numbered files with the file name stem "picks". When the picker is about to finish, those files are merged into one big file with the stem "picks-merged". If one or more files with the same name already exist, dlpick will also number the file name.
- csv¶
Type: boolean
If set, picks will be dumped to a CSV file. Can be combined with --ep. You can change the output path using the parameter outDir or --out on the command line. For more details see option --ep.
- stdout¶
Type: boolean
If set, picks will be dumped to stdout in the form of "streamID phaseHint : time confidence: confidenceValue model weights".
- gpu¶
Type: boolean
If set, model and data are placed on the GPU if CUDA is available on the machine; otherwise the CPU is used.
- outDir¶
Type: string
Set output folder for pick xml/csv files.
- filter¶
Type: string
The filter to apply on the data before it is sent to the picker. This parameter overrides any model filter parameter.
- minGatherPicksBeforeReduce¶
Default:
10
Type: int
This parameter sets the number of picks that must be staged before the reduction algorithm is applied to them.
Details: Minimum-size criterion for applying the reduction algorithm to a pick buffer. When pushing to or popping from a pick buffer, this criterion is tested as well as the timeout. If one of them is true, the reduction will be applied. (There is a pick buffer for each stream.)
- minTimeBeforeReduce¶
Default:
10.0
Unit: s
Type: double
This parameter sets the timeout for staging picks until the reduction algorithm is applied to them. After reduction, the timer starts over again.
Details: Timeout criterion for applying the reduction algorithm to a pick buffer. When pushing to or popping from a pick buffer, this criterion is tested as well as the minimum picks number. If one of them is true, the reduction will be applied. (There is a pick buffer for each stream.)
- minGatherPicksBeforeFlush¶
Default:
10
Type: int
This parameter sets the number of picks that must be gathered before publishing.
Details: Minimum-size criterion for flushing a pick buffer. When the publisher process tries to pop picks from the pick buffers, this criterion is tested as well as the timeout. If one of them is true, those picks considered as safe-to-flush will be sent to the output sinks.
- minTimeBeforeFlush¶
Default:
10.0
Unit: s
Type: double
This parameter sets the timeout for gathering picks until they are published. After publishing, the timer starts over again with the next staging of picks, in a buffer.
Details: Time criterion for flushing a pick buffer. When the publisher process tries to pop picks from the pick buffers, this criterion is tested as well as the minimum picks number. If one of them is true, those picks considered as safe-to-flush will be sent to the output sinks.
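The reduce and flush parameters above share the same pattern: an action is triggered as soon as either the minimum number of picks or the timeout is reached. A minimal sketch of this OR logic follows; the function criterion_met is hypothetical, and its defaults mirror the documented default values.

```python
def criterion_met(n_picks, seconds_waited, min_picks=10, timeout_s=10.0):
    """True if the buffer holds enough picks OR the timeout has expired."""
    return n_picks >= min_picks or seconds_waited >= timeout_s


print(criterion_met(n_picks=3, seconds_waited=2.0))    # False: keep waiting
print(criterion_met(n_picks=12, seconds_waited=2.0))   # True: enough picks
print(criterion_met(n_picks=3, seconds_waited=10.5))   # True: timeout reached
```

Lowering either value makes picks appear sooner; raising them gives the reduction more picks to compare before anything is published.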
- uncertainPThresh¶
Type: double
If set, adds additional picks with a lower threshold, marking them with the prefix ‘u’. Picks above the normal PThreshold are not touched. This currently only applies to xml output. This value can be overridden by the same-named bindings’ config parameter for the corresponding set of stations.
- uncertainSThresh¶
Type: double
If set, adds additional picks with a lower threshold, marking them with the prefix ‘u’. Picks above the normal SThreshold are not touched. This currently only applies to xml output. This value can be overridden by the same-named bindings’ config parameter for the corresponding set of stations.
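How the normal and the uncertain threshold interact can be sketched as a three-way classification. The function label_pick is hypothetical, and how the ‘u’ prefix combines with phasePrefix is not specified here, so the sketch only prepends ‘u’ to the bare phase name.

```python
def label_pick(phase, confidence, threshold, uncertain_threshold=None):
    """Return the phase label for a pick, or None if it is discarded.

    - confidence >= threshold: regular pick, label unchanged
    - uncertain_threshold <= confidence < threshold: marked with 'u'
    - otherwise: no pick
    """
    if confidence >= threshold:
        return phase
    if uncertain_threshold is not None and confidence >= uncertain_threshold:
        return "u" + phase
    return None


print(label_pick("P", 0.45, threshold=0.3, uncertain_threshold=0.1))  # P
print(label_pick("P", 0.20, threshold=0.3, uncertain_threshold=0.1))  # uP
print(label_pick("P", 0.05, threshold=0.3, uncertain_threshold=0.1))  # None
```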
Note
models.* Parameters for the picker model types.
- models.eqtransformer.traceWindowOlRatio¶
Default:
0.05
Type: double
Ratio defining how much trace windows are overlapping each other.
- models.eqtransformer.reducePicksOrder¶
Default:
2
Type: int
(Deprecated! Don’t touch this value!) Sets the number of passes for the picks reducing algorithm.
The combination of the specific picking algorithm the model applies, the picking confidence threshold and models.eqtransformer.traceWindowOlRatio can lead to unwanted dense picks, which can be reduced. If set to 0, no reduction is done. If set to 1, the reducer is applied once, which is good for performance but may leave some dense picks. If set to 2, in theory more dense picks are left. For practice, see the note in models.eqtransformer.reducePicksInterval.
- models.eqtransformer.reducePicksInterval¶
Default:
5.0
Unit: s
Type: double
Sets the time span in seconds before and after a pick where no other pick is supposed to exist.
The algorithm leaves only those picks with the highest confidence in the interval.
- models.eqtransformer.PThreshold¶
Default:
0.1
Type: double
Threshold defining the minimum confidence for a P pick.
- models.eqtransformer.SThreshold¶
Default:
0.1
Type: double
Threshold defining the minimum confidence for an S pick.
- models.eqtransformer.minPrefDataRatio¶
Default:
1.33
Type: double
(Surplus) ratio of data in relation to the actually necessary (model-defined) length that the app shall wait for before passing the data to the inference step. Having surplus data allows finding component overlaps without padding missing data with zeros.
- models.eqtransformer.maxGap¶
Default:
0.5
Unit: s
Type: double
(Not implemented!) Maximum gap in seconds that is tolerated for records to be considered as continuous. Note that the gap will be linearly interpolated.
- models.eqtransformer.snrThreshold¶
Default:
0
Type: double
A pick whose SNR is up to this threshold won’t be published. Won’t be applied if set to 0.
- models.eqtransformer.filter¶
Type: string
The filter to apply on the data before it is sent to the picker. This parameter is overridden by filter.
- models.phasenet.traceWindowOlRatio¶
Default:
0.05
Type: double
Ratio defining how much trace windows are overlapping each other.
- models.phasenet.reducePicksOrder¶
Default:
2
Type: int
(Deprecated! Don’t touch this value!) Sets the number of passes for the picks reducing algorithm.
The combination of the specific picking algorithm the model applies, the picking confidence threshold and models.phasenet.traceWindowOlRatio can lead to unwanted dense picks, which can be reduced. If set to 0, no reduction is done. If set to 1, the reducer is applied once, which is good for performance but may leave some dense picks. If set to 2, in theory more dense picks are left. For practice, see the note in models.phasenet.reducePicksInterval.
- models.phasenet.reducePicksInterval¶
Default:
5.0
Unit: s
Type: double
Sets the time span in seconds before and after a pick where no other pick is supposed to exist.
The algorithm leaves only those picks with the highest confidence in the interval.
- models.phasenet.PThreshold¶
Default:
0.1
Type: double
Threshold defining the minimum confidence for a P pick.
- models.phasenet.SThreshold¶
Default:
0.1
Type: double
Threshold defining the minimum confidence for an S pick.
- models.phasenet.minPrefDataRatio¶
Default:
1.33
Type: double
(Surplus) ratio of data in relation to the actually necessary (model-defined) length that the app shall wait for before passing the data to the inference step. Having surplus data allows finding component overlaps without padding missing data with zeros.
- models.phasenet.maxGap¶
Default:
0.5
Unit: s
Type: double
(Not implemented!) Maximum gap in seconds that is tolerated for records to be considered as continuous. Note that the gap will be linearly interpolated.
- models.phasenet.snrThreshold¶
Default:
0
Type: double
A pick whose SNR is up to this threshold won’t be published. Won’t be applied if set to 0.
- models.phasenet.filter¶
Type: string
The filter to apply on the data before it is sent to the picker. This parameter is overridden by filter.
Bindings Parameters¶
- streamChannels¶
Type: string
A list of channels as input for the picker model. Overrides global.detecStream. The order of channels is determined by the model’s "component_order" parameter. Therefore, the last letters must be distinct. E.g., ‘EHZ,HDF’. ‘BH’ would tell the model to try to log all three components (i.e., BHZ, BHN, BHE), exactly as global.detecStream would do. You can further define that a channel can be neglected if no data is available by annotating it with a ‘:0’, e.g., ‘HDF:1,EHZ:0’. Another feature is the possibility to force a specific location code, even though it might not be found in the inventory: ‘DN.EHZ:1,HDF:1’. This can be useful if the record stream provides processed data via a fake location ‘DN’.
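The channel list syntax can be made concrete with a small parser. This is a sketch under assumptions: parse_stream_channels is a hypothetical helper, a missing flag is treated as "necessary", and a location prefix is recorded only for the entry it is written on.

```python
def parse_stream_channels(spec):
    """Parse e.g. 'DN.EHZ:1,HDF:0' into a list of channel entries.

    Each entry holds the optional location code, the channel code and
    whether data for this channel is required (':1') or negligible (':0').
    """
    entries = []
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        name, sep, flag = item.partition(":")
        if sep and flag not in ("0", "1"):
            raise ValueError(f"invalid flag in {item!r}, expected 0 or 1")
        location, dot, channel = name.rpartition(".")
        entries.append({
            "location": location if dot else None,
            "channel": channel,
            "required": flag != "0",  # assumption: no flag means required
        })
    return entries


for entry in parse_stream_channels("DN.EHZ:1,HDF:0"):
    print(entry)
```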
- model¶
Type: string
Name of the picker model to use. See module config for details.
- weights¶
Type: string
Name of the weights model. Usually the name of the dataset the model was trained on. See module config for details.
- traceWindowOlRatio¶
Type: double
Ratio defining how much trace windows are overlapping each other.
- batchSize¶
Type: int
Number of data windows to process per picker at the same time.
- PThreshold¶
Type: double
Set the confidence threshold for P picks.
This value overrides the same-named config parameter of the chosen model type for the corresponding set of station(s).
- SThreshold¶
Type: double
Set the confidence threshold for S picks.
This value overrides the same-named config parameter of the chosen model type for the corresponding set of station(s).
- uncertainPThresh¶
Type: double
If set, adds additional picks with a lower threshold, marking them with the prefix ‘u’. Picks above the normal PThreshold are not touched. This currently only applies to xml output. This value overrides the same-named global config parameter for the corresponding set of stations.
- uncertainSThresh¶
Type: double
If set, adds additional picks with a lower threshold, marking them with the prefix ‘u’. Picks above the normal SThreshold are not touched. This currently only applies to xml output. This value overrides the same-named global config parameter for the corresponding set of stations.
- minPrefDataRatio¶
Type: double
(Surplus) ratio of data in relation to the actually necessary (model-defined) length that the app shall wait for before passing the data to the inference step. Having surplus data allows finding component overlaps without padding missing data with zeros.
This value overrides the same-named config parameter of the chosen model type for the corresponding set of station(s).
- phaseToComp¶
Type: string
Map a pick to a component. E.g., ‘P:Z,S:Z’ would output both P and S on the Z component. Make sure that the station of this binding actually has the corresponding component.
If not set, per default, dlpick puts P picks on the vertical and S picks on the first horizontal component.
- maxGap¶
Unit: s
Type: double
(Not implemented!) Maximum gap in seconds that is tolerated for records to be considered as continuous. Note that the gap will be linearly interpolated.
- minGatherPicksBeforeReduce¶
Default:
10
Type: int
This parameter sets the number of picks that must be staged before the reduction algorithm is applied to them.
Details: Minimum-size criterion for applying the reduction algorithm to a pick buffer. When pushing to or popping from a pick buffer, this criterion is tested as well as the timeout. If one of them is true, the reduction will be applied. (There is a pick buffer for each stream.)
This value overrides the same-named global config parameter for the corresponding set of stations.
- minTimeBeforeReduce¶
Default:
10.0
Unit: s
Type: double
This parameter sets the timeout for staging picks until the reduction algorithm is applied to them. After reduction, the timer starts over again.
Details: Timeout criterion for applying the reduction algorithm to a pick buffer. When pushing to or popping from a pick buffer, this criterion is tested as well as the minimum picks number. If one of them is true, the reduction will be applied. (There is a pick buffer for each stream.)
This value overrides the same-named global config parameter for the corresponding set of stations.
- minGatherPicksBeforeFlush¶
Default:
10
Type: int
This parameter sets the number of picks that must be gathered before publishing.
Details: Minimum-size criterion for flushing a pick buffer. When the publisher process tries to pop picks from the pick buffers, this criterion is tested as well as the timeout. If one of them is true, those picks considered as safe-to-flush will be sent to the output sinks.
This value overrides the same-named global config parameter for the corresponding set of stations.
- minTimeBeforeFlush¶
Default:
10.0
Unit: s
Type: double
This parameter sets the timeout for gathering picks until they are published. After publishing, the timer starts over again with the next staging of picks, in a buffer.
Details: Time criterion for flushing a pick buffer. When the publisher process tries to pop picks from the pick buffers, this criterion is tested as well as the minimum picks number. If one of them is true, those picks considered as safe-to-flush will be sent to the output sinks.
This value overrides the same-named global config parameter for the corresponding set of stations.
Command-Line Options¶
dlpick [options]
Generic¶
- -h, --help¶
Show help message.
- -V, --version¶
Show version information.
- --config-file arg¶
Use alternative configuration file. When this option is used the loading of all stages is disabled. Only the given configuration file is parsed and used. To use another name for the configuration create a symbolic link of the application or copy it. Example: scautopick -> scautopick2.
- --plugins arg¶
Load given plugins.
- -D, --daemon¶
Run as daemon. This means the application will fork itself and doesn’t need to be started with &.
- --auto-shutdown arg¶
Enable/disable self-shutdown because a master module shutdown. This only works when messaging is enabled and the master module sends a shutdown message (enabled with --start-stop-msg for the master module).
- --shutdown-master-module arg¶
Set the name of the master-module used for auto-shutdown. This is the application name of the module actually started. If symlinks are used, then it is the name of the symlinked application.
- --shutdown-master-username arg¶
Set the name of the master-username of the messaging used for auto-shutdown. If "shutdown-master-module" is given as well, this parameter is ignored.
- --print-models¶
Print list of available models.
Verbosity¶
- --verbosity arg¶
Verbosity level [0..4]. 0:quiet, 1:error, 2:warning, 3:info, 4:debug.
- -v, --v¶
Increase verbosity level (may be repeated, e.g., -vv).
- -q, --quiet¶
Quiet mode: no logging output.
- --component arg¶
Limit the logging to a certain component. This option can be given more than once.
- -s, --syslog¶
Use syslog logging backend. The output usually goes to /var/log/messages.
- -l, --lockfile arg¶
Path to lock file.
- --console arg¶
Send log output to stdout.
- --debug¶
Execute in debug mode. Equivalent to --verbosity=4 --console=1 .
- --log-file arg¶
Use alternative log file.
Messaging¶
- -u, --user arg¶
Overrides configuration parameter connection.username.
- -H, --host arg¶
Overrides configuration parameter connection.server.
- -t, --timeout arg¶
Overrides configuration parameter connection.timeout.
- -g, --primary-group arg¶
Overrides configuration parameter connection.primaryGroup.
- -S, --subscribe-group arg¶
A group to subscribe to. This option can be given more than once.
- --content-type arg¶
Overrides configuration parameter connection.contentType.
Default:
binary
- --start-stop-msg arg¶
Default:
0
Set sending of a start and a stop message.
Database¶
- --db-driver-list¶
List all supported database drivers.
- -d, --database arg¶
The database connection string, format: service://user:pwd@host/database. "service" is the name of the database driver which can be queried with "--db-driver-list".
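The parts of such a connection string can be taken apart with Python’s standard URL parser. The example URL, including its driver name and credentials, is made up for illustration, and parse_db_url is a hypothetical helper.

```python
from urllib.parse import urlsplit


def parse_db_url(url):
    """Split 'service://user:pwd@host/database' into its components."""
    parts = urlsplit(url)
    return {
        "service": parts.scheme,
        "user": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "database": parts.path.lstrip("/"),
    }


# Hypothetical connection string following the documented format.
print(parse_db_url("mysql://sysop:secret@localhost/seiscomp"))
```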
- --config-module arg¶
The config module to use.
- --inventory-db arg¶
Load the inventory from the given database or file, format: [service://]location .
Records¶
- --record-driver-list¶
List all supported record stream drivers.
- -I, --record-url arg¶
The recordstream source URL, format: [service://]location[#type]. "service" is the name of the recordstream driver which can be queried with "--record-driver-list". If "service" is not given, "file://" is used.
- --record-file arg¶
Specify a file as record source.
- --record-type arg¶
Specify a type for the records being read.
Output¶
- --phase-prefix¶
Overrides configuration parameter phasePrefix.
- --phase-postfix¶
Overrides configuration parameter phasePostfix.
- --no-file-merging¶
This option prevents that, when the application finishes, all output files are merged into one, which it does per default.
Picking¶
- --snr-threshold¶
Overrides configuration parameter models.eqtransformer.snrThreshold.
- --p-threshold¶
Overrides configuration parameter PThreshold.
- --s-threshold¶
Overrides configuration parameter SThreshold.
- --p-uncertain¶
Overrides configuration parameter uncertainPThresh.
- --s-uncertain¶
Overrides configuration parameter uncertainSThresh.
- --ol-ratio¶
Overrides configuration parameter traceWindowOlRatio.
Mode¶
- --force-channels¶
Load stations by config but ignore channels specified in detecStream. Force loading streams with channels as listed. E.g., ‘HDF:1,EHZ:0’, where ‘1’ means ‘necessary’ and ‘0’ means ‘negligible’. In this example, if EHZ data is missing, zeros will be passed to the model, in its place. Attention: Since this parameter is global, use it only with a consistent model architecture.
- --offline¶
Do not connect to the messaging server.
- --ignore-bindings¶
Ignore station bindings read from config. Parameters from module configuration (overridden by commandline parameters) will be used for all stations.
- --prefer-bindings¶
Ignore options set by command line arguments, in case there is a specific station binding for the same parameter. This is useful when you want to test a certain value only on stations that have no bindings, but not on those with bindings, since they have their special settings for a good reason.
- --lead-time¶
(Not implemented!) Look at leadTime seconds since now.