fdsnws¶
Provides FDSN Web Services
Description¶
fdsnws is a server that provides FDSN Web Services from a SeisComP3 database and RecordStream source. It may also be configured to serve data availability information similar to the IRIS DMC IRISWS availability Web Service.
Caution
If you expose the FDSN Web Service as a public service, make sure that the database connection is read-only. fdsnws will never attempt to write into the database.
Service Overview¶
The following services are available:
Service | Provides | Provided format |
---|---|---|
fdsnws-dataselect | time series data | miniSEED |
fdsnws-station | network, station, channel, response metadata | FDSN Station XML, StationXML, SC3ML |
fdsnws-event | earthquake origin and magnitude estimates | QuakeML, SC3ML |
ext-availability | waveform data availability information | text, geocsv, json, sync, request (fdsnws-dataselect) |
The available services can be reached from the fdsnws start page. The services also provide an interactive URL builder constructing the request URL based on user inputs. The FDSN specifications can be found on FDSN Web Services.
URL¶
If fdsnws is started, it accepts connections by default on port 8080, which can be changed in the configuration. Also please read Changing the service port for running the services on a privileged port, e.g. port 80 as requested by the FDSNWS specification.
Note
If you decide to run the service on a different URL than localhost:8080 you have to change the URL string in the *.wadl documents located under $DATADIR/fdsnws.
DataSelect¶
- provides time series data in miniSEED format
- request type: HTTP-GET, HTTP-POST
URL¶
Example¶
Request URL for querying waveform data from the GE station BKNI, all BH channels on 11 April 2013 between 00:00:00 and 12:00:00:
http://localhost:8080/fdsnws/dataselect/1/query?net=GE&sta=BKNI&cha=BH?&start=2013-04-11T00:00:00&end=2013-04-11T12:00:00
To submit HTTP-POST requests the command line tool curl may be used:
sysop@host:~$ curl -X POST --data-binary @request.txt "http://localhost:8080/fdsnws/dataselect/1/query"
where request.txt contains the POST message body. For details read the FDSN specifications.
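As a minimal sketch of what request.txt could contain, the Python snippet below assembles a POST body in the layout described by the FDSN dataselect specification: optional parameter=value lines followed by one "NET STA LOC CHA START END" line per selection. The function name build_post_body is hypothetical and only serves the illustration.

```python
from datetime import datetime


def build_post_body(selections, **options):
    """Return a dataselect POST body as text.

    selections: iterable of (net, sta, loc, cha, start, end) tuples,
                with start and end given as datetime objects.
    options:    optional request parameters written as key=value lines.
    """
    lines = [f"{key}={value}" for key, value in options.items()]
    for net, sta, loc, cha, start, end in selections:
        # An empty location code is written as "--" in POST bodies.
        loc = loc or "--"
        lines.append(" ".join(
            [net, sta, loc, cha, start.isoformat(), end.isoformat()]))
    return "\n".join(lines) + "\n"


# Same selection as the GET example above: GE.BKNI, all BH channels.
body = build_post_body(
    [("GE", "BKNI", "*", "BH?",
      datetime(2013, 4, 11), datetime(2013, 4, 11, 12))]
)
print(body, end="")
```

Writing the returned string to request.txt yields a file that can be passed to the curl command shown above.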
Feature Notes¶
- quality parameter not implemented (information not available in SeisComP)
- minimumlength parameter is not implemented
- longestonly parameter is not implemented
- access to restricted networks and stations is only granted through the queryauth method
The data channels exposed by this service may be restricted by defining an inventory filter, see section Filtering the inventory.
Service Configuration¶
- activate serveDataSelect in the module configuration
- configure the RecordStream in the module’s global configuration. If the data is stored in a local waveform archive the SDSArchive provides fast access to the data. For archives on remote hosts use ArcLink or FDSNWS instead.
Warning
Requesting future or delayed data may block the DataSelect service. Therefore, real-time RecordStream requests such as SeedLink should be avoided. If SeedLink cannot be avoided, make use of the timeout and retries parameters. E.g. set the recordstream.source to localhost:18000?timeout=1&retries=0 or in case of the Combined service to slink/localhost:18000?timeout=1&retries=0;sdsarchive//home/sysop/seiscomp3/var/lib/archive.
Station¶
- provides network, station, channel, response metadata
- request type: HTTP-GET, HTTP-POST
- stations may be filtered e.g. by geographic region and time, also the information depth level is selectable
URL¶
Example¶
Request URL for querying the information for the GE network on response level:
http://localhost:8080/fdsnws/station/1/query?net=GE&cha=BH%3F&level=response&nodata=404
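Request URLs like the one above can also be assembled programmatically. The following sketch uses only the Python standard library and assumes the default host and port of this document; build_query_url is a hypothetical helper name. Note how the wildcard character "?" in the channel code is percent-encoded as "%3F".

```python
from urllib.parse import urlencode

# Default address used throughout this document; adjust to your setup.
BASE_URL = "http://localhost:8080/fdsnws"


def build_query_url(service, **params):
    """Build a query URL for the given service (station, event, dataselect)."""
    # urlencode percent-encodes reserved characters, e.g. "?" becomes "%3F".
    return f"{BASE_URL}/{service}/1/query?{urlencode(params)}"


url = build_query_url("station", net="GE", cha="BH?",
                      level="response", nodata=404)
print(url)
```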
Feature Notes¶
- to enable FDSNXML or StationXML support the plugins fdsnxml or staxml, respectively, have to be loaded
- updatedafter request parameter not implemented: The last modification time in SeisComP is tracked on the object level. If a child of an object is updated the update time is not propagated to all parents. In order to check if a station was updated all children must be evaluated recursively. This operation would be much too expensive.
- formatted: boolean, default: false
- additional values of request parameters:
  - format:
    - standard: [xml, text]
    - additional: [fdsnxml (=xml), stationxml, sc3ml]
    - default: xml
The inventory exposed by this service may be restricted, see section Filtering the inventory.
Event¶
- provides earthquake origin and magnitude estimates
- request type: HTTP-GET
- events may be filtered e.g. by hypocenter, time and magnitude
URL¶
Example¶
Request URL for fetching the event parameters within 10 degrees around 50°N/11°E starting on 1 June 2018:
http://localhost:8080/fdsnws/event/1/query?start=2018-06-01&lat=50&lon=11&maxradius=10&nodata=404
Feature Notes¶
- SeisComP does not distinguish between catalogs and contributors, but supports agencyIDs. Hence, if specified, the value of the contributor parameter is mapped to the agencyID. The file @DATADIR@/share/fdsn/contributors.xml has to be filled manually with all available agency IDs
- origin and magnitude filter parameters are always applied to the preferred origin and the preferred magnitude, respectively
- updatedafter request parameter not implemented: The last modification time in SeisComP is tracked on the object level. If a child of an object is updated the update time is not propagated to all parents. In order to check if an event was updated all children must be evaluated recursively. This operation would be much too expensive.
- additional request parameters:
  - includepicks: boolean, default: false, works only in combination with includearrivals set to true
  - includecomments: boolean, default: true
  - formatted: boolean, default: false
- additional values of request parameters:
  - format:
    - standard: [xml, text]
    - additional: [qml (=xml), qml-rt, sc3ml, csv]
    - default: xml
Data Availability¶
The data availability web service returns detailed time span information of what time series data is available at the DMC archive. The availability information can be created by scardac in the SeisComP3 database from where it is fetched by fdsnws.
The availability service is not an official standard yet. This implementation aims to be compatible with the IRIS DMC IRISWS availability Web Service implementation.
- request type: HTTP-GET, HTTP-POST
- results may be filtered e.g. by channel code, time and quality
URL¶
- http://localhost:8080/fdsnws/ext/availability/1/extent - Produces list of available time extents (earliest to latest) for selected channels (network, station, location and quality) and time ranges.
- http://localhost:8080/fdsnws/ext/availability/1/builder-extent - URL builder helping you to form your data extent requests
- http://localhost:8080/fdsnws/ext/availability/1/query - Produces list of contiguous time spans for selected channels (network, station, location, channel and quality) and time ranges.
- http://localhost:8080/fdsnws/ext/availability/1/builder - URL builder helping you to form your data time span requests
- http://localhost:8080/fdsnws/ext/availability/1/version
Examples¶
Request URL for data extents of seismic network IU:
http://localhost:8080/fdsnws/ext/availability/1/extent?net=IU
Further limit the extents to those providing data for August 1st 2018:
http://localhost:8080/fdsnws/ext/availability/1/extent?net=IU&start=2018-08-01
Request URL for continuous time spans of station ANMO in July 2018:
http://localhost:8080/fdsnws/ext/availability/1/query?sta=ANMO&start=2018-07-01&end=2018-08-01
Note
Use scardac for creating the availability information.
Feature Notes¶
The IRISWS availability implementation truncates the time spans of the returned data extents and segments to the requested start and end times (if any). This implementation truncates the start and end time only for the formats sync and request. The text, geocsv and json formats return the exact time windows extracted from the waveform archive.
The reasons for this deviation are:
- Performance: With the /extent query the text, geocsv and json formats can display the number of included time spans (show=timespancount). The data model offers no efficient way to recalculate the number of time spans represented by an extent if the extent’s time window is altered by the requested start and end times. The sync and request formats do not provide this counter and it is convenient to use their outputs for subsequent data requests.
- By truncating the time windows information is lost. There would be no efficient way for a client to retrieve the exact time windows falling into a specific time span.
- Network and station epochs returned by the Station service are also not truncated to the requested start and end times.
- Truncation can easily be done on client side. No additional network traffic is generated.
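A client-side truncation as mentioned in the last point might look like the following sketch. It assumes the time spans have already been parsed from a text, geocsv or json response into pairs of datetime objects; the function name truncate_spans is hypothetical.

```python
from datetime import datetime


def truncate_spans(spans, start=None, end=None):
    """Clip (span_start, span_end) pairs to the window [start, end].

    Either bound may be None, meaning the window is open on that side.
    Spans lying entirely outside the window are dropped.
    """
    clipped = []
    for span_start, span_end in spans:
        new_start = span_start if start is None else max(span_start, start)
        new_end = span_end if end is None else min(span_end, end)
        if new_start < new_end:
            clipped.append((new_start, new_end))
    return clipped


spans = [
    (datetime(2018, 6, 20), datetime(2018, 7, 5)),
    (datetime(2018, 7, 10), datetime(2018, 8, 15)),
    (datetime(2018, 9, 1), datetime(2018, 9, 2)),
]
july = truncate_spans(spans, datetime(2018, 7, 1), datetime(2018, 8, 1))
```

Here july holds two spans, clipped to 1 July through 5 July and 10 July through 1 August; the September span is dropped because it lies outside the window.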
Filtering the inventory¶
The channels served by the Station and DataSelect service may be filtered by specifying an INI file in the stationFilter and dataSelectFilter configuration parameter. You may use the same file for both services or define a separate configuration set. Note: If distinct file names are specified and both services are activated, the inventory is loaded twice which will increase the memory consumption of this module.
[Chile]
code = CX.*.*.*
[!Exclude station APE]
code = GE.APE.*.*
[German (not restricted)]
code = GE.*.*.*
restricted = false
shared = true
archive = GFZ
The listing above shows a configuration example which includes all Chilean stations. In addition, all unrestricted German stations, with the exception of station GE.APE, are included.
The configuration is divided into several rules. The rule name is given in square brackets. A name starting with an exclamation mark defines an exclude rule, otherwise the rule is an include rule. The rule name is not evaluated by the application but is printed when debugging the rule set, see configuration parameter debugFilter.
Each rule consists of a set of attributes. The first and mandatory attribute is code which defines a regular expression for the channel code (network, station, location, channel). In addition the following optional attributes exist:
Attribute | Type | Network | Station | Location | Channel |
---|---|---|---|---|---|
restricted | Boolean | X | X | X | |
shared | Boolean | X | X | X | |
netClass | String | X | | | |
archive | String | X | X | | |
A rule matches if all of its attributes match. The optional attributes are evaluated bottom-up wherever they are applicable. E.g. if a rule defines restricted = false but the restricted flag is not present on channel level then it is searched on station and then on network level. If no restricted attribute is found in the hierarchy, the rule will not match even if the value was set to false.
The individual rules are evaluated in order of their definition. The processing stops once a matching rule is found and the channel is included or excluded immediately. So the order of the rules is important.
One may decide to specify a pure whitelist, a pure blacklist, or to mix include and exclude rules. If neither a matching include nor exclude rule is found, then the channel is only added if no other include rule exists in the entire rule set.
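The evaluation order described above can be illustrated with a simplified Python sketch. It is not the module’s actual implementation: rules are plain dictionaries, the code pattern is matched with shell-style wildcards via fnmatch (an assumption made to keep the example short), and the inventory hierarchy of a channel is passed as a list of attribute dictionaries ordered from channel level up to network level. All function names are hypothetical.

```python
from fnmatch import fnmatchcase


def lookup(attr, levels):
    """Search an attribute bottom-up (channel, station, network)."""
    for level in levels:
        if attr in level:
            return level[attr]
    return None  # attribute not present anywhere in the hierarchy


def rule_matches(rule, code, levels):
    """A rule matches only if the code and all optional attributes match."""
    if not fnmatchcase(code, rule["code"]):
        return False
    for attr, wanted in rule.get("attrs", {}).items():
        # A missing attribute (None) never equals the wanted value.
        if lookup(attr, levels) != wanted:
            return False
    return True


def is_included(rules, code, levels):
    """Evaluate rules in order; the first matching rule decides."""
    for rule in rules:
        if rule_matches(rule, code, levels):
            return not rule["name"].startswith("!")
    # No rule matched: include only if the set contains no include rule.
    return all(rule["name"].startswith("!") for rule in rules)


# Rule set from the listing above, reduced to the restricted attribute.
rules = [
    {"name": "Chile", "code": "CX.*.*.*"},
    {"name": "!Exclude station APE", "code": "GE.APE.*.*"},
    {"name": "German (not restricted)", "code": "GE.*.*.*",
     "attrs": {"restricted": False}},
]
open_net = [{}, {}, {"restricted": False}]  # flag set on network level only
print(is_included(rules, "GE.BKNI..BHZ", open_net))
```

Because the exclude rule for GE.APE precedes the general GE rule, swapping the two would include that station, which is why the order of the rules matters.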
Changing the service port¶
The FDSN Web service specification defines that the Service SHOULD be available under port 80. Typically SeisComP3 runs under a user without root permissions and therefore is not allowed to bind to privileged ports (<1024). To serve on port 80 you may for instance
- run SeisComP3 with root privileges (not recommended)
- use a proxy Webserver, e.g. Apache with mod-proxy module
- configure and use Authbind
- set up Firewall redirect rules
Authbind¶
authbind allows a program which does not or should not run as root to bind to low-numbered ports in a controlled way. Please refer to man authbind for program descriptions. The following lines show how to install and set up authbind for the user sysop under the Ubuntu OS.
sysop@host:~$ sudo apt-get install authbind
sysop@host:~$ sudo touch /etc/authbind/byport/80
sysop@host:~$ sudo chown sysop /etc/authbind/byport/80
sysop@host:~$ sudo chmod 500 /etc/authbind/byport/80
Once authbind is configured correctly the FDSN Web services may be started as follows:
sysop@host:~$ authbind --deep seiscomp exec fdsnws
In order to use authbind when starting fdsnws as a SeisComP service, the last line in ~/seiscomp3/etc/init/fdsnws.py has to be uncommented.
Firewall¶
All major Linux distributions ship with their own firewall implementations which are front-ends for the iptables kernel functions. The following line temporarily adds a firewall rule which redirects all incoming traffic on port 80 to port 8080.
sysop@host:~$ sudo iptables -t nat -A PREROUTING -p tcp --dport 80 -j REDIRECT --to 8080
Please refer to the documentation of your particular firewall solution on how to set up this rule permanently.
Authentication extension¶
The FDSNWS standard requires HTTP digest authentication as the authentication mechanism. The “htpasswd” configuration option is used to define the location of the file storing usernames and passwords of users who are allowed to get restricted data. Any user with valid credentials would have access to all restricted data.
An extension to the FDSNWS protocol has been developed in order to use email-pattern-based access control lists, which is an established authorization mechanism in SC3 (used by Arclink). It works as follows:
- The user contacts an authentication service (based on eduGAIN AAI, e-mail, etc.) and receives a list of attributes (a token), signed by the authentication service. The validity of the token is typically 30 days.
- The user presents the token to /auth method (HTTPS) of the dataselect service. This method is the only extension to standard FDSNWS that is required.
- If the digital signature is valid, a temporary account for /queryauth is created. The /auth method returns username and password of this account, separated by ‘:’. The account is typically valid for 24 hours.
- The username and password are to be used with /queryauth as usual.
- Authorization is based on user’s e-mail address in the token and arclink-access bindings.
Configuration¶
The authentication extension is enabled by setting the “auth.enable” configuration option to “true” and pointing “auth.gnupgHome” to a directory where GPG stores its files. Let’s use the directory ~/seiscomp3/var/lib/gpg, which is the default.
- First create the directory and your own signing key:
sysop@host:~$ mkdir -m 700 ~/seiscomp3/var/lib/gpg
sysop@host:~$ gpg --homedir ~/seiscomp3/var/lib/gpg --gen-key
- Now import GPG keys of all authentication services you trust:
sysop@host:~$ gpg --homedir ~/seiscomp3/var/lib/gpg --import <keys.asc
- Finally sign all imported keys with your own key (XXXXXXXX is the ID of an imported key):
sysop@host:~$ gpg --homedir ~/seiscomp3/var/lib/gpg --edit-key XXXXXXXX sign save
- …and set auth.enable, either using the “scconfig” tool or:
sysop@host:~$ echo "auth.enable = true" >>~/seiscomp3/etc/fdsnws.cfg
Usage example¶
A client like fdsnws_fetch is recommended, but also tools like wget and curl can be used. As an example, let’s request data from the restricted station AAI (assuming that we are authorized to get data of this station).
- The first step is to obtain the token from an authentication service. Assuming that the token is saved in “token.asc”, credentials of the temporary account can be requested using one of the following commands:
sysop@host:~$ wget --post-file token.asc https://geofon.gfz-potsdam.de/fdsnws/dataselect/1/auth -O cred.txt
sysop@host:~$ curl --data-binary @token.asc https://geofon.gfz-potsdam.de/fdsnws/dataselect/1/auth -o cred.txt
- The resulting file “cred.txt” contains username and password separated by a colon, so one can conveniently use a shell expansion:
sysop@host:~$ wget "http://`cat cred.txt`@geofon.gfz-potsdam.de/fdsnws/dataselect/1/queryauth?starttime=2015-12-15T16:00:00Z&endtime=2015-12-15T16:10:00Z&network=IA&station=AAI" -O data.mseed
sysop@host:~$ curl --digest "http://`cat cred.txt`@geofon.gfz-potsdam.de/fdsnws/dataselect/1/queryauth?starttime=2015-12-15T16:00:00Z&endtime=2015-12-15T16:10:00Z&network=IA&station=AAI" -o data.mseed
- Using the fdsnws_fetch utility, the two steps above can be combined into one:
sysop@host:~$ fdsnws_fetch -a token.asc -s 2015-12-15T16:00:00Z -e 2015-12-15T16:10:00Z -N IA -S AAI -o data.mseed
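For clients that cannot rely on the command line tools, the two-step flow above can be sketched with the Python standard library. This is an illustration under assumptions rather than a reference client: the function names are hypothetical, error handling is omitted, and base_url stands for the dataselect root of a server with the auth extension enabled.

```python
import urllib.request


def split_credentials(text):
    """Split the 'username:password' string returned by the /auth method."""
    user, _, password = text.strip().partition(":")
    return user, password


def fetch_restricted(base_url, token_path, query):
    """Fetch restricted data, e.g. from 'https://host/fdsnws/dataselect/1'."""
    with open(token_path, "rb") as token_file:
        token = token_file.read()

    # Step 1: POST the signed token to /auth to obtain a temporary account.
    with urllib.request.urlopen(base_url + "/auth", data=token) as response:
        user, password = split_credentials(response.read().decode())

    # Step 2: use the account with HTTP digest authentication on /queryauth.
    manager = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    manager.add_password(None, base_url, user, password)
    opener = urllib.request.build_opener(
        urllib.request.HTTPDigestAuthHandler(manager))
    with opener.open(base_url + "/queryauth?" + query) as response:
        return response.read()
```

The query argument takes the same parameter string as in the wget and curl examples, for instance "starttime=2015-12-15T16:00:00Z&endtime=2015-12-15T16:10:00Z&network=IA&station=AAI".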
Logging¶
In addition to normal SC3 logs, fdsnws can create a simple HTTP access log and/or a detailed request log. The locations of log files are specified by “accessLog” and “requestLog” in fdsnws.cfg.
Both logs are text-based and line-oriented. Each line of access log contains the following fields, separated by ‘|’ (some fields can be empty):
- service name;
- hostname of service;
- access time;
- hostname of user;
- IP address of user (proxy);
- length of data in bytes;
- processing time in milliseconds;
- error message;
- agent string;
- HTTP response code;
- username (if authenticated);
- network code of GET request;
- station code of GET request;
- location code of GET request;
- channel code of GET request;
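Since the access log is a plain ‘|’-separated format, a line can be mapped to the fields listed above with a few lines of Python. The sketch below is an illustration only: the dictionary keys are hypothetical names chosen for this example, and the sample line, including its timestamp format, is made up.

```python
# Hypothetical key names, in the order of the field list above.
ACCESS_LOG_FIELDS = [
    "service", "service_host", "access_time", "user_host", "user_ip",
    "bytes", "processing_ms", "error", "agent", "http_status",
    "username", "net", "sta", "loc", "cha",
]


def parse_access_line(line):
    """Map one access log line to a dict; empty numeric fields become None."""
    values = line.rstrip("\n").split("|")
    record = dict(zip(ACCESS_LOG_FIELDS, values))
    for key in ("bytes", "processing_ms", "http_status"):
        record[key] = int(record[key]) if record.get(key) else None
    return record


# Made-up sample line with empty error, username and location fields.
sample = ("fdsnws-dataselect|host.example.org|2018-08-01T00:00:00|"
          "client.example.org|192.0.2.1|4096|12||curl/7.58.0|200||"
          "GE|BKNI||BH?")
record = parse_access_line(sample)
print(record["service"], record["bytes"], record["http_status"])
```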
Each line of request log contains a JSON object, which has the following attributes:
- service
- service name
- userID
- anonymized (numeric) user ID for statistic purposes
- clientID
- agent string
- userEmail
- e-mail address of authenticated user if using restricted data.
- userLocation
- JSON object containing rough user location (e.g. country) for statistical purposes
- created
- time of request creation
- status
- “OK”, “NODATA”, “ERROR” or “DENIED”
- bytes
- length of data in bytes
- finished
- time of request completion
- trace
- request content after wildcard expansion (array of JSON objects)
Each trace object has the following attributes:
- net
- network code
- sta
- station code
- loc
- location code
- cha
- channel code
- start
- start time
- end
- end time
- restricted
- True if the data requires authorization
- status
- “OK”, “NODATA”, “ERROR” or “DENIED”
- bytes
- length of trace in bytes
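Because each request log line is a self-contained JSON object, simple statistics can be derived with the standard json module. The following sketch sums the delivered bytes per trace status; the function name and the sample line are made up for illustration and use only the attributes listed above.

```python
import json
from collections import Counter


def bytes_by_status(lines):
    """Sum the trace bytes per trace status over request log lines."""
    totals = Counter()
    for line in lines:
        request = json.loads(line)
        for trace in request.get("trace", []):
            totals[trace["status"]] += trace.get("bytes") or 0
    return totals


# Made-up request log line with two traces.
sample = json.dumps({
    "service": "fdsnws-dataselect",
    "status": "OK",
    "bytes": 1024,
    "trace": [
        {"net": "GE", "sta": "BKNI", "loc": "", "cha": "BHZ",
         "restricted": False, "status": "OK", "bytes": 1024},
        {"net": "IA", "sta": "AAI", "loc": "", "cha": "BHZ",
         "restricted": True, "status": "DENIED", "bytes": 0},
    ],
})
totals = bytes_by_status([sample])
print(dict(totals))
```

For rotated request logs, which are bzip2-compressed, the lines could be read with bz2.open(path, "rt") from the standard library instead of a plain open().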
Both logs are rotated daily. In case of access log, one week of data is kept. Request logs are compressed using bzip2 and not removed.
If trackdb.enable=true in fdsnws.cfg, then requests are additionally logged into the SC3 database using the ArcLink request log schema. Be aware that the number of requests in a production system can be rather large. For example, the GEOFON datacentre is currently serving between 0.5 and 1 million FDSNWS requests per day.
Configuration¶
etc/defaults/global.cfg
etc/defaults/fdsnws.cfg
etc/global.cfg
etc/fdsnws.cfg
~/.seiscomp3/global.cfg
~/.seiscomp3/fdsnws.cfg
fdsnws inherits global options.
-
listenAddress
¶ Type: IP
Defines the bind address of the server. “0.0.0.0” accepts connections on any interface whereas “127.0.0.1” only allows connections from localhost. Default is
0.0.0.0
.
-
port
¶ Type: int
Server port to listen for incoming requests. Note: The FDSN Web service specification defines the service port 80. Please refer to the documentation on how to serve on privileged ports. Default is
8080
.
-
connections
¶ Type: int
Number of maximum simultaneous requests. Default is
5
.
-
queryObjects
¶ Type: int
Maximum number of objects per query, used in fdsnws-station and fdsnws-event to limit main memory consumption. Default is
10000
.
-
realtimeGap
¶ Type: int
Unit: s
Restricts end time of requests to current time - realtimeGap seconds. Negative values allowed. Used in fdsnws-dataselect. WARNING: If this value is unset and a realtime recordsource (e.g. slink) is used, requests may block if end time in future is requested.
-
samplesM
¶ Type: float
Maximum number of samples (in units of million) per query, used in fdsnws-dataselect to prevent a single user from blocking one connection with a large request.
-
recordBulkSize
¶ Type: int
Unit: bytes
Set the number of bytes to buffer for each chunk of waveform data served to the client. The lower the buffer the higher the overhead of Python Twisted. The higher the buffer the higher the memory usage per request. 100kB seems to be a good trade-off. Default is
102400
.
-
htpasswd
¶ Type: string
Path to password file used in fdsnws-station/queryauth. The format is ‘username:password’ separated by lines. Because of the HTTP digest authentication method required by the FDSN specification, the passwords have to be stored in plain text. Default is
@CONFIGDIR@/fdsnws.htpasswd
.
-
accessLog
¶ Type: string
Path to access log file. If unset no access log is created.
-
requestLog
¶ Type: string
Path to request log file. If unset no request log is created.
-
corsOrigins
¶ Type: list:string:
List of domain names from which Cross-Origin Resource Sharing (CORS) requests may originate. A value of ‘*’ allows any web page to embed your service. An empty value will switch off CORS requests entirely. An example of multiple domains might be: ‘https://test.domain.de, https://production.domain.de’. Default is
*
.
-
allowRestricted
¶ Type: boolean
Enables/disables access to restricted inventory data. Default is
true
.
-
useArclinkAccess
¶ Type: boolean
If enabled, then access to restricted waveform data is controlled by arclink-access bindings. By default authenticated users have access to all data. Default is
false
.
-
hideAuthor
¶ Type: boolean
If enabled author information is removed from any event creationInfo element. Default is
false
.
-
hideComments
¶ Type: boolean
If enabled event comment elements are no longer accessible. Default is
false
.
-
evaluationMode
¶ Type: string
If set the event service will only return events having a preferred origin with a matching evaluationMode property.
-
eventFormats
¶ Type: list:string
List of enabled event formats. If unspecified all supported formats are enabled.
-
serveDataSelect
¶ Type: boolean
Enables/disables the DataSelect service. Default is
true
.
-
serveEvent
¶ Type: boolean
Enables/disables the Event service. Default is
true
.
-
serveStation
¶ Type: boolean
Enables/disables the Station service. Default is
true
.
-
serveAvailability
¶ Type: boolean
Enables/disables the Availability service. Note: This is a non standard FDSNWS extension served under fdsnws/ext/availability. Default is
false
.
-
stationFilter
¶ Type: string
Path to station inventory filter file.
-
dataSelectFilter
¶ Type: string
Path to dataselect inventory filter file.
-
debugFilter
¶ Type: boolean
If enabled a debug line is written for each stream ID explaining why a stream was added/removed by an inventory filter. Default is
false
.
-
fileNamePrefix
¶ Type: string
Defines the prefix for the default filenames if downloading and saving data from within a browser. For data loaded using dataselect, it is thus fdsnws.mseed by default. Default is
fdsnws
.
-
eventType.whitelist
¶ Type: list:string
List of enabled event types
-
eventType.blacklist
¶ Type: list:string
List of disabled event types
Note
dataAvailability.* Provide access to waveform data availability information stored in the SeisComP3 database. In case of a SDS archive this information may be collected by scardac (SeisComP archive data availability collector).
-
dataAvailability.enable
¶ Type: boolean
Enable loading of data availability information from the SeisComP3 database. Availability information is used by the station and ext/availability services. Default is
false
.
-
dataAvailability.cacheDuration
¶ Type: int
Unit: s
Number of seconds data availability information is considered valid. If the duration time is exceeded the information is fetched again from the database. Default is
300
.
-
dataAvailability.dccName
¶ Type: string
Name of the archive used in the sync format of the data availability extent service. Default is
DCC
.
-
dataAvailability.repositoryName
¶ Type: string
Name of the archive used in some formats of the data availability extent service. Default is
primary
.
-
trackdb.enable
¶ Type: boolean
Save request log to database. Default is
false
.
-
trackdb.defaultUser
¶ Type: string
Default user. Default is
fdsnws
.
-
auth.enable
¶ Type: boolean
Enable auth extension. Default is
false
.
-
auth.gnupgHome
¶ Type: string
GnuPG home directory. Default is
@ROOTDIR@/var/lib/gpg
.
-
auth.blacklist
¶ Type: list:string
List of revoked token IDs.
Command-line¶
Generic¶
-
-h
,
--help
¶
show help message.
-
-V
,
--version
¶
show version information
-
--config-file
arg
¶ Use alternative configuration file. When this option is used the loading of all stages is disabled. Only the given configuration file is parsed and used. To use another name for the configuration create a symbolic link of the application or copy it, eg scautopick -> scautopick2.
-
--plugins
arg
¶ Load given plugins.
-
-D
,
--daemon
¶
Run as daemon. This means the application will fork itself and doesn’t need to be started with &.
-
--auto-shutdown
arg
¶ Enable/disable self-shutdown because a master module shutdown. This only works when messaging is enabled and the master module sends a shutdown message (enabled with --start-stop-msg for the master module).
-
--shutdown-master-module
arg
¶ Sets the name of the master-module used for auto-shutdown. This is the application name of the module actually started. If symlinks are used then it is the name of the symlinked application.
-
--shutdown-master-username
arg
¶ Sets the name of the master-username of the messaging used for auto-shutdown. If “shutdown-master-module” is given as well this parameter is ignored.
Verbosity¶
-
--verbosity
arg
¶ Verbosity level [0..4]. 0:quiet, 1:error, 2:warning, 3:info, 4:debug
-
-v
,
--v
¶
Increase verbosity level (may be repeated, eg. -vv)
-
-q
,
--quiet
¶
Quiet mode: no logging output
-
--component
arg
¶ Limits the logging to a certain component. This option can be given more than once.
-
-s
,
--syslog
¶
Use syslog logging back end. The output usually goes to /var/log/messages.
-
-l
,
--lockfile
arg
¶ Path to lock file.
-
--console
arg
¶ Send log output to stdout.
-
--debug
¶
Debug mode: --verbosity=4 --console=1
-
--log-file
arg
¶ Use alternative log file.
Database¶
-
--db-driver-list
¶
List all supported database drivers.
-
-d
,
--database
arg
¶ The database connection string, format: service://user:pwd@host/database. “service” is the name of the database driver which can be queried with “--db-driver-list”.
-
--config-module
arg
¶ The configmodule to use.
-
--inventory-db
arg
¶ Load the inventory from the given database or file, format: [service://]location
-
--db-disable
¶
Do not use the database at all
Records¶
-
--record-driver-list
¶
List all supported record stream drivers
-
-I
,
--record-url
arg
¶ The recordstream source URL, format: [service://]location[#type]. “service” is the name of the recordstream driver which can be queried with “--record-driver-list”. If “service” is not given “file://” is used.
-
--record-file
arg
¶ Specify a file as record source.
-
--record-type
arg
¶ Specify a type for the records being read.