fdsnws

Provides FDSN Web Services

Description

fdsnws is a server that provides FDSN Web Services from a SeisComP3 database and RecordStream source. It may also be configured to serve data availability information similar to the IRIS DMC IRISWS availability Web Service.

Caution

If you expose the FDSN Web Service as a public service, make sure that the database connection is read-only. fdsnws will never attempt to write into the database.

Service Overview

The following services are available:

Service            Provides                                      Provided format
fdsnws-dataselect  time series data                              miniSEED
fdsnws-station     network, station, channel, response metadata  FDSN Station XML, StationXML, SC3ML
fdsnws-event       earthquake origin and magnitude estimates     QuakeML, SC3ML
ext-availability   waveform data availability information        text, geocsv, json, sync, request (fdsnws-dataselect)

The available services can be reached from the fdsnws start page. The services also provide an interactive URL builder constructing the request URL based on user inputs. The FDSN specifications can be found on FDSN Web Services.

URL

If fdsnws is started, it accepts connections by default on port 8080, which can be changed in the configuration. Please also read Changing the service port for running the services on a privileged port, e.g. port 80 as requested by the FDSNWS specification.

Note

If you decide to run the service on a different URL than localhost:8080 you have to change the URL string in the *.wadl documents located under $DATADIR/fdsnws.

DataSelect

  • provides time series data in miniSEED format
  • request type: HTTP-GET, HTTP-POST

Example

  • Request URL for querying waveform data from the GE station BKNI, all BH channels on 11 April 2013 between 00:00:00 and 12:00:00:

    http://localhost:8080/fdsnws/dataselect/1/query?net=GE&sta=BKNI&cha=BH?&start=2013-04-11T00:00:00&end=2013-04-11T12:00:00
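
The same request URL can also be assembled programmatically. The following is a minimal Python sketch using only the standard library; the helper name dataselect_url is hypothetical, and the base URL assumes the default localhost:8080.

```python
from urllib.parse import urlencode

# Assumption: service reachable under the default address used in this document
BASE_URL = "http://localhost:8080/fdsnws/dataselect/1/query"


def dataselect_url(net, sta, cha, start, end, base_url=BASE_URL):
    """Build a DataSelect GET URL; '?' and ':' are kept unescaped for readability."""
    params = {"net": net, "sta": sta, "cha": cha, "start": start, "end": end}
    return base_url + "?" + urlencode(params, safe="?:")


url = dataselect_url("GE", "BKNI", "BH?", "2013-04-11T00:00:00", "2013-04-11T12:00:00")
print(url)
```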

To submit HTTP-POST requests the command line tool curl may be used:

sysop@host:~$ curl -X POST --data-binary @request.txt "http://localhost:8080/fdsnws/dataselect/1/query"

where request.txt contains the POST message body. For details read the FDSN specifications.
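
As a rough illustration of such a message body: the FDSN dataselect specification describes one line per channel request in the form NET STA LOC CHA STARTTIME ENDTIME, with an empty location code written as --. The Python sketch below builds such a body; the helper name post_body is hypothetical, and the FDSN specification remains the authoritative reference for the format.

```python
def post_body(streams):
    """Build a dataselect POST body from (net, sta, loc, cha, start, end) tuples."""
    lines = []
    for net, sta, loc, cha, start, end in streams:
        # an empty location code is written as "--" in the POST syntax
        lines.append(" ".join((net, sta, loc or "--", cha, start, end)))
    return "\n".join(lines) + "\n"


body = post_body([
    ("GE", "BKNI", "", "BH?", "2013-04-11T00:00:00", "2013-04-11T12:00:00"),
])
print(body, end="")
```

The resulting text could be written to request.txt and submitted with the curl command shown above.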

Feature Notes

  • quality parameter not implemented (information not available in SeisComP)
  • minimumlength parameter is not implemented
  • longestonly parameter is not implemented
  • access to restricted networks and stations is only granted through the queryauth method

The data channels exposed by this service may be restricted by defining an inventory filter, see section Filtering the inventory.

Service Configuration

  • activate serveDataSelect in the module configuration
  • configure the RecordStream in the module’s global configuration. If the data is stored in a local waveform archive the SDSArchive provides fast access to the data. For archives on remote hosts use ArcLink or FDSNWS instead.

Warning

Requesting future or delayed data may block the DataSelect service. Therefore, real-time RecordStream requests such as SeedLink should be avoided. If SeedLink is unavoidable, make use of the timeout and retries parameters. E.g. set the recordstream.source to localhost:18000?timeout=1&retries=0 or, in case of the Combined service, to slink/localhost:18000?timeout=1&retries=0;sdsarchive//home/sysop/seiscomp3/var/lib/archive.

Station

  • provides network, station, channel, response metadata
  • request type: HTTP-GET, HTTP-POST
  • stations may be filtered e.g. by geographic region and time, also the information depth level is selectable

Example
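
A minimal Python sketch of a Station request URL, assuming the default localhost:8080 address and the standard fdsnws/station/1/query path. The helper name station_url is hypothetical; level is a standard FDSN parameter, while sc3ml and formatted are the extensions listed in the feature notes of this section.

```python
from urllib.parse import urlencode


def station_url(base_url="http://localhost:8080", **params):
    """Build a Station query URL from keyword parameters."""
    return f"{base_url}/fdsnws/station/1/query?" + urlencode(params)


# Response-level metadata of network GE, returned as formatted SC3ML
url = station_url(net="GE", level="response", format="sc3ml", formatted="true")
print(url)
```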

Feature Notes

  • to enable FDSNXML or StationXML support, the plugin fdsnxml or staxml, respectively, has to be loaded
  • updatedafter request parameter not implemented: The last modification time in SeisComP is tracked on the object level. If a child of an object is updated the update time is not propagated to all parents. In order to check if a station was updated all children must be evaluated recursively. This operation would be much too expensive.
  • additional request parameters:
    • formatted: boolean, default: false
  • additional values of request parameters:
    • format:
      • standard: [xml, text]
      • additional: [fdsnxml (=xml), stationxml, sc3ml]
      • default: xml

The inventory exposed by this service may be restricted, see section Filtering the inventory.

Event

  • provides earthquake origin and magnitude estimates
  • request type: HTTP-GET
  • events may be filtered e.g. by hypocenter, time and magnitude

Example
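
A minimal Python sketch of an Event request URL, assuming the default localhost:8080 address and the standard fdsnws/event/1/query path. The helper name event_url and the include_picks argument are hypothetical; starttime and minmagnitude are standard FDSN parameters. The sketch always sets includearrivals when picks are requested, following the dependency described in the feature notes of this section.

```python
from urllib.parse import urlencode


def event_url(base_url="http://localhost:8080", include_picks=False, **params):
    """Build an Event query URL; picks are requested together with arrivals."""
    query = dict(params)
    if include_picks:
        # includepicks only works in combination with includearrivals=true
        query["includearrivals"] = "true"
        query["includepicks"] = "true"
    return f"{base_url}/fdsnws/event/1/query?" + urlencode(query, safe=":")


url = event_url(starttime="2013-04-11T00:00:00", minmagnitude=5,
                format="sc3ml", include_picks=True)
print(url)
```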

Feature Notes

  • SeisComP does not distinguish between catalogs and contributors, but supports agencyIDs. Hence, if specified, the value of the contributor parameter is mapped to the agencyID. The file @DATADIR@/share/fdsn/contributors.xml has to be filled manually with all available agency IDs.
  • origin and magnitude filter parameters are always applied to the preferred origin and the preferred magnitude, respectively
  • updatedafter request parameter not implemented: The last modification time in SeisComP is tracked on the object level. If a child of an object is updated the update time is not propagated to all parents. In order to check if an event was updated all children must be evaluated recursively. This operation would be much too expensive.
  • additional request parameters:
    • includepicks: boolean, default: false, works only in combination with includearrivals set to true
    • includecomments: boolean, default: true
    • formatted: boolean, default: false
  • additional values of request parameters:
    • format:
      • standard: [xml, text]
      • additional: [qml (=xml), qml-rt, sc3ml, csv]
      • default: xml

Data Availability

The data availability web service returns detailed time span information of what time series data is available at the DMC archive. The availability information can be created by scardac in the SeisComP3 database from where it is fetched by fdsnws.

The availability service is not an official standard yet. This implementation aims to be compatible with the IRIS DMC IRISWS availability Web Service implementation.

  • request type: HTTP-GET, HTTP-POST
  • results may be filtered e.g. by channel code, time and quality

URL

Examples

Note

Use scardac for creating the availability information.

Feature Notes

  • The IRISWS availability implementation truncates the time spans of the returned data extents and segments to the requested start and end times (if any). This implementation truncates the start and end time only for the formats: sync and request. The text, geocsv and json format will return the exact time windows extracted from the waveform archive.

    The reasons for this deviation are:

    • Performance: With the /extent query, the text, geocsv and json formats offer the display of the number of included time spans (show=timespancount). The data model offers no efficient way to recalculate the number of time spans represented by an extent if the extent’s time window is altered by the requested start and end times. The sync and request formats do not provide this counter, and it is convenient to use their outputs for subsequent data requests.
    • By truncating the time windows, information is lost. There would be no efficient way for a client to retrieve the exact time windows falling into a specific time span.
    • Network and station epochs returned by the Station service are also not truncated to the requested start and end times.
    • Truncation can easily be done on client side. No additional network traffic is generated.
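
Client-side truncation as mentioned in the last point can be sketched in a few lines of Python. The function name truncate_spans and the tuple representation of time spans are assumptions made for illustration; parsing of the service output is left out.

```python
from datetime import datetime


def truncate_spans(spans, start=None, end=None):
    """Clip (span_start, span_end) tuples to the requested window."""
    clipped = []
    for span_start, span_end in spans:
        lo = span_start if start is None else max(span_start, start)
        hi = span_end if end is None else min(span_end, end)
        if lo < hi:  # keep only spans overlapping the requested window
            clipped.append((lo, hi))
    return clipped


spans = [
    (datetime(2013, 4, 11, 0, 0), datetime(2013, 4, 11, 6, 0)),
    (datetime(2013, 4, 11, 8, 0), datetime(2013, 4, 12, 0, 0)),
]
window = truncate_spans(spans,
                        start=datetime(2013, 4, 11, 4, 0),
                        end=datetime(2013, 4, 11, 10, 0))
print(window)
```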

Filtering the inventory

The channels served by the Station and DataSelect services may be filtered by specifying an INI file in the stationFilter and dataSelectFilter configuration parameters. You may use the same file for both services or define a separate configuration set. Note: If distinct file names are specified and both services are activated, the inventory is loaded twice, which will increase the memory consumption of this module.

[Chile]
code = CX.*.*.*

[!Exclude station APE]
code = GE.APE.*.*

[German (not restricted)]
code = GE.*.*.*
restricted = false
shared = true
archive = GFZ

The listing above shows a configuration example which includes all Chile stations. In addition, all unrestricted German stations, with the exception of station GE.APE, are included.

The configuration is divided into several rules. The rule name is given in square brackets. A name starting with an exclamation mark defines an exclude rule; otherwise the rule is an include rule. The rule name is not evaluated by the application but is printed when debugging the rule set, see configuration parameter debugFilter.

Each rule consists of a set of attributes. The first and mandatory attribute is code which defines a regular expression for the channel code (network, station, location, channel). In addition the following optional attributes exist:

Attribute   Type     Network  Station  Location  Channel
restricted  Boolean  X        X                  X
shared      Boolean  X        X                  X
netClass    String   X
archive     String   X        X

A rule matches if all of its attributes match. The optional attributes are evaluated bottom-up wherever they are applicable. E.g. if a rule defines restricted = false but the restricted flag is not present on channel level, then it is searched on station and then on network level. If no restricted attribute is found in the hierarchy, the rule will not match even if the value was set to false.

The individual rules are evaluated in order of their definition. The processing stops once a matching rule is found and the channel is included or excluded immediately. So the order of the rules is important.

One may decide to specify a pure whitelist, a pure blacklist, or to mix include and exclude rules. If neither a matching include nor a matching exclude rule is found, then the channel is only added if no include rule exists in the entire rule set.
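
The evaluation order described in this section can be sketched in Python as follows. This is an illustration of the documented behaviour, not the code used by fdsnws: all function names are hypothetical, the code attribute is matched here with shell-style wildcards as an assumption, and the inventory attributes of a channel are passed in as a plain dictionary keyed by level.

```python
import configparser
from fnmatch import fnmatchcase

FILTER_INI = """
[Chile]
code = CX.*.*.*

[!Exclude station APE]
code = GE.APE.*.*

[German (not restricted)]
code = GE.*.*.*
restricted = false
shared = true
archive = GFZ
"""


def load_rules(text):
    """Parse the INI text into a list of rules, keeping definition order."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep attribute names such as netClass as written
    parser.read_string(text)
    rules = []
    for name in parser.sections():
        attrs = dict(parser[name])
        code = attrs.pop("code")  # mandatory attribute
        rules.append({"name": name, "exclude": name.startswith("!"),
                      "code": code, "attrs": attrs})
    return rules


def convert(value):
    """Map 'true'/'false' to booleans, leave other strings untouched."""
    return {"true": True, "false": False}.get(value.lower(), value)


def lookup(attr, levels):
    """Search an attribute bottom-up: channel, station, then network level."""
    for level in ("channel", "station", "network"):
        if attr in levels.get(level, {}):
            return levels[level][attr]
    return None  # attribute not present anywhere in the hierarchy


def is_included(code, levels, rules):
    """Return True if the channel passes the rule set."""
    for rule in rules:
        if not fnmatchcase(code, rule["code"]):
            continue
        if all(lookup(a, levels) == convert(v) for a, v in rule["attrs"].items()):
            return not rule["exclude"]  # the first matching rule decides
    # no rule matched: include only if the rule set contains no include rule
    return not any(not rule["exclude"] for rule in rules)


rules = load_rules(FILTER_INI)
ge_attrs = {"network": {"restricted": False, "shared": True, "archive": "GFZ"}}
print(is_included("GE.BKNI..BHZ", ge_attrs, rules))
print(is_included("GE.APE..BHZ", ge_attrs, rules))
```

With the rule set from the listing, a GE channel without any restricted attribute in its hierarchy is not included, because the last rule cannot match and the rule set contains include rules.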

Changing the service port

The FDSN Web service specification defines that the Service SHOULD be available under port 80. Typically SeisComP3 runs under a user without root permissions and therefore is not allowed to bind to privileged ports (<1024). To serve on port 80 you may for instance:

  • run SeisComP3 with root privileges (not recommended)
  • use a proxy Webserver, e.g. Apache with the mod_proxy module
  • configure and use Authbind
  • set up firewall redirect rules

Authbind

authbind allows a program which does not or should not run as root to bind to low-numbered ports in a controlled way. Please refer to man authbind for program descriptions. The following lines show how to install and set up authbind for the user sysop under the Ubuntu OS.

sysop@host:~$ sudo apt-get install authbind
sysop@host:~$ sudo touch /etc/authbind/byport/80
sysop@host:~$ sudo chown sysop /etc/authbind/byport/80
sysop@host:~$ sudo chmod 500 /etc/authbind/byport/80

Once authbind is configured correctly the FDSN Web services may be started as follows:

sysop@host:~$ authbind --deep seiscomp exec fdsnws

In order to use authbind when starting fdsnws as a SeisComP service, the last line in ~/seiscomp3/etc/init/fdsnws.py has to be uncommented.

Firewall

All major Linux distributions ship with their own firewall implementations which are front-ends for the iptables kernel functions. The following line temporarily adds a firewall rule which redirects all incoming traffic on port 80 to port 8080.

sysop@host:~$ sudo iptables -t nat -A PREROUTING -p tcp --dport 80 -j REDIRECT --to-ports 8080

Please refer to the documentation of your particular firewall solution on how to set up this rule permanently.

Authentication extension

The FDSNWS standard requires HTTP digest authentication as the authentication mechanism. The “htpasswd” configuration option is used to define the location of the file storing usernames and passwords of users who are allowed to get restricted data. Any user with valid credentials would have access to all restricted data.

An extension to the FDSNWS protocol has been developed in order to use email-pattern-based access control lists, which is an established authorization mechanism in SC3 (used by Arclink). It works as follows:

  • The user contacts an authentication service (based on eduGAIN AAI, e-mail, etc.) and receives a list of attributes (a token), signed by the authentication service. The validity of the token is typically 30 days.
  • The user presents the token to the /auth method (HTTPS) of the dataselect service. This method is the only extension to standard FDSNWS that is required.
  • If the digital signature is valid, a temporary account for /queryauth is created. The /auth method returns username and password of this account, separated by ‘:’. The account is typically valid for 24 hours.
  • The username and password are to be used with /queryauth as usual.
  • Authorization is based on the user’s e-mail address in the token and arclink-access bindings.

Configuration

The authentication extension is enabled by setting the “auth.enable” configuration option to “true” and pointing “auth.gnupgHome” to a directory where GPG stores its files. Let’s use the directory ~/seiscomp3/var/lib/gpg, which is the default.

  • First create the directory and your own signing key:
sysop@host:~$ mkdir -m 700 ~/seiscomp3/var/lib/gpg
sysop@host:~$ gpg --homedir ~/seiscomp3/var/lib/gpg --gen-key
  • Now import GPG keys of all authentication services you trust:
sysop@host:~$ gpg --homedir ~/seiscomp3/var/lib/gpg --import <keys.asc
  • Finally sign all imported keys with your own key (XXXXXXXX is the ID of an imported key):
sysop@host:~$ gpg --homedir ~/seiscomp3/var/lib/gpg --edit-key XXXXXXXX sign save
  • …and set auth.enable, either using the “scconfig” tool or:
sysop@host:~$ echo "auth.enable = true" >>~/seiscomp3/etc/fdsnws.cfg

Usage example

A client like fdsnws_fetch is recommended, but also tools like wget and curl can be used. As an example, let’s request data from the restricted station AAI (assuming that we are authorized to get data of this station).

  • The first step is to obtain the token from an authentication service. Assuming that the token is saved in “token.asc”, credentials of the temporary account can be requested using one of the following commands:
sysop@host:~$ wget --post-file token.asc https://geofon.gfz-potsdam.de/fdsnws/dataselect/1/auth -O cred.txt
sysop@host:~$ curl --data-binary @token.asc https://geofon.gfz-potsdam.de/fdsnws/dataselect/1/auth -o cred.txt
  • The resulting file “cred.txt” contains username and password separated by a colon, so one can conveniently use a shell expansion:
sysop@host:~$ wget "http://`cat cred.txt`@geofon.gfz-potsdam.de/fdsnws/dataselect/1/queryauth?starttime=2015-12-15T16:00:00Z&endtime=2015-12-15T16:10:00Z&network=IA&station=AAI" -O data.mseed
sysop@host:~$ curl --digest "http://`cat cred.txt`@geofon.gfz-potsdam.de/fdsnws/dataselect/1/queryauth?starttime=2015-12-15T16:00:00Z&endtime=2015-12-15T16:10:00Z&network=IA&station=AAI" -o data.mseed
  • Using the fdsnws_fetch utility, the two steps above can be combined into one:
sysop@host:~$ fdsnws_fetch -a token.asc -s 2015-12-15T16:00:00Z -e 2015-12-15T16:10:00Z -N IA -S AAI -o data.mseed
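
The second step can also be done from Python with the standard library only. The sketch below is an illustration under assumptions: the helper names parse_credentials and digest_opener are hypothetical, and the actual request is left as a comment because it requires a valid token and network access.

```python
import urllib.request


def parse_credentials(text):
    """Split the 'username:password' content of cred.txt at the first colon."""
    user, sep, password = text.strip().partition(":")
    if not sep or not user:
        raise ValueError("expected 'username:password'")
    return user, password


def digest_opener(url, user, password):
    """Create a URL opener that answers HTTP digest authentication challenges."""
    manager = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    manager.add_password(None, url, user, password)
    handler = urllib.request.HTTPDigestAuthHandler(manager)
    return urllib.request.build_opener(handler)


# Usage (not executed here):
# with open("cred.txt") as f:
#     user, password = parse_credentials(f.read())
# url = "http://geofon.gfz-potsdam.de/fdsnws/dataselect/1/queryauth"
# opener = digest_opener(url, user, password)
# data = opener.open(url + "?network=IA&station=AAI"
#                    "&starttime=2015-12-15T16:00:00Z"
#                    "&endtime=2015-12-15T16:10:00Z").read()
```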

Logging

In addition to normal SC3 logs, fdsnws can create a simple HTTP access log and/or a detailed request log. The locations of log files are specified by “accessLog” and “requestLog” in fdsnws.cfg.

Both logs are text-based and line-oriented. Each line of the access log contains the following fields, separated by ‘|’ (some fields can be empty):

  • service name;
  • hostname of service;
  • access time;
  • hostname of user;
  • IP address of user (proxy);
  • length of data in bytes;
  • processing time in milliseconds;
  • error message;
  • agent string;
  • HTTP response code;
  • username (if authenticated);
  • network code of GET request;
  • station code of GET request;
  • location code of GET request;
  • channel code of GET request;
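
A line of the access log can be split into these fields with a few lines of Python. The dictionary keys below are hypothetical labels chosen for this sketch (the log itself contains no field names), and the sketch assumes that no field value contains the ‘|’ character.

```python
# Hypothetical labels, in the order of the field list above
ACCESS_LOG_FIELDS = (
    "service", "service_host", "access_time", "user_host", "user_ip",
    "bytes", "processing_ms", "error", "agent", "http_status",
    "username", "net", "sta", "loc", "cha",
)


def parse_access_line(line):
    """Map one '|'-separated access log line to a dict of strings."""
    values = line.rstrip("\n").split("|")
    if len(values) != len(ACCESS_LOG_FIELDS):
        raise ValueError(f"expected {len(ACCESS_LOG_FIELDS)} fields, got {len(values)}")
    return dict(zip(ACCESS_LOG_FIELDS, values))
```

Empty fields are returned as empty strings; numeric fields such as the data length are left as strings and may be converted by the caller.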

Each line of the request log contains a JSON object, which has the following attributes:

service
    service name
userID
    anonymized (numeric) user ID for statistical purposes
clientID
    agent string
userEmail
    e-mail address of the authenticated user if using restricted data
userLocation
    JSON object containing rough user location (e.g., country) for statistical purposes
created
    time of request creation
status
    “OK”, “NODATA”, “ERROR” or “DENIED”
bytes
    length of data in bytes
finished
    time of request completion
trace
    request content after wildcard expansion (array of JSON objects)

Each trace object has the following attributes:

net
    network code
sta
    station code
loc
    location code
cha
    channel code
start
    start time
end
    end time
restricted
    True if the data requires authorization
status
    “OK”, “NODATA”, “ERROR” or “DENIED”
bytes
    length of trace in bytes

Both logs are rotated daily. In the case of the access log, one week of data is kept. Request logs are compressed using bzip2 and not removed.
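
Since every request log line is a self-contained JSON object, simple statistics can be derived with the Python standard library. The sketch below sums the delivered bytes per network code from the trace objects; the function name bytes_per_network is hypothetical, and the sketch assumes that the bytes attribute is stored as a JSON number.

```python
import json
from collections import Counter


def bytes_per_network(lines):
    """Sum the bytes of all traces with status OK, grouped by network code."""
    totals = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        request = json.loads(line)
        for trace in request.get("trace", []):
            if trace.get("status") == "OK":
                totals[trace["net"]] += trace.get("bytes", 0)
    return totals
```

The function accepts any iterable of lines, e.g. an open file. Rotated logs compressed with bzip2 can be opened in text mode with bz2.open(path, "rt") from the standard library.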

If trackdb.enable=true in fdsnws.cfg, then requests are additionally logged into the SC3 database using the ArcLink request log schema. Be aware that the number of requests in a production system can be rather large. For example, the GEOFON datacentre is currently serving between 0.5 and 1 million FDSNWS requests per day.

Configuration

etc/defaults/global.cfg
etc/defaults/fdsnws.cfg
etc/global.cfg
etc/fdsnws.cfg
~/.seiscomp3/global.cfg
~/.seiscomp3/fdsnws.cfg

fdsnws inherits global options.

listenAddress

Type: IP

Defines the bind address of the server. “0.0.0.0” allows any interface to connect to this server whereas “127.0.0.1” only allows connections from localhost. Default is 0.0.0.0.

port

Type: int

Server port to listen for incoming requests. Note: The FDSN Web service specification defines the service port 80. Please refer to the documentation on how to serve on privileged ports. Default is 8080.

connections

Type: int

Maximum number of simultaneous requests. Default is 5.

queryObjects

Type: int

Maximum number of objects per query, used in fdsnws-station and fdsnws-event to limit main memory consumption. Default is 10000.

realtimeGap

Type: int

Unit: s

Restricts the end time of requests to current time - realtimeGap seconds. Negative values are allowed. Used in fdsnws-dataselect. WARNING: If this value is unset and a real-time recordsource (e.g. slink) is used, requests may block if an end time in the future is requested.
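
The effect of this parameter can be illustrated with a short Python sketch. It mirrors the description above and is not taken from the fdsnws source; the function name clamp_end_time is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def clamp_end_time(requested_end, realtime_gap, now=None):
    """Limit the requested end time to now - realtime_gap seconds."""
    if now is None:
        now = datetime.now(timezone.utc)
    latest_allowed = now - timedelta(seconds=realtime_gap)
    return min(requested_end, latest_allowed)


now = datetime(2013, 4, 11, 12, 0, 0, tzinfo=timezone.utc)
requested = datetime(2013, 4, 11, 13, 0, 0, tzinfo=timezone.utc)
end = clamp_end_time(requested, 60, now=now)
print(end.isoformat())
```

A negative value moves the limit into the future, i.e. it allows end times up to that many seconds after the current time.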

samplesM

Type: float

Maximum number of samples (in units of million) per query, used in fdsnws-dataselect to prevent a single user from blocking one connection with a large request.
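
To get a feeling for suitable values, the number of samples of a request can be estimated from its duration, the sampling rate and the number of channels. The Python sketch below is a back-of-the-envelope estimate only; how fdsnws itself counts samples is not described here, and the function names and the limit of 20 are hypothetical.

```python
def estimated_samples(duration_s, sample_rate_hz, channels=1):
    """Rough number of samples: duration x sampling rate x channel count."""
    return duration_s * sample_rate_hz * channels


def exceeds_limit(samples, samples_m):
    """Compare a sample count with a samplesM value given in millions."""
    return samples > samples_m * 1_000_000


# One day of three 100 Hz channels
samples = estimated_samples(86400, 100, channels=3)
print(samples, exceeds_limit(samples, 20))
```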

recordBulkSize

Type: int

Unit: bytes

Set the number of bytes to buffer for each chunk of waveform data served to the client. The lower the buffer the higher the overhead of Python Twisted. The higher the buffer the higher the memory usage per request. 100kB seems to be a good trade-off. Default is 102400.

htpasswd

Type: string

Path to password file used in fdsnws-station/queryauth. The file contains one ‘username:password’ entry per line. Because of the HTTP digest authentication method required by the FDSN specification, the passwords have to be stored in plain text. Default is @CONFIGDIR@/fdsnws.htpasswd.

accessLog

Type: string

Path to access log file. If unset no access log is created.

requestLog

Type: string

Path to request log file. If unset no request log is created.

corsOrigins

Type: list:string

List of domain names Cross-Origin Resource Sharing (CORS) requests may originate from. A value of ‘*’ allows any web page to embed your service. An empty value will switch off CORS requests entirely. An example of multiple domains might be: ‘https://test.domain.de, https://production.domain.de’. Default is *.

allowRestricted

Type: boolean

Enables/disables access to restricted inventory data. Default is true.

useArclinkAccess

Type: boolean

If enabled, then access to restricted waveform data is controlled by arclink-access bindings. By default authenticated users have access to all data. Default is false.

hideAuthor

Type: boolean

If enabled, author information is removed from any event creationInfo element. Default is false.

hideComments

Type: boolean

If enabled, event comment elements are no longer accessible. Default is false.

evaluationMode

Type: string

If set, the event service will only return events having a preferred origin with a matching evaluationMode property.

eventFormats

Type: list:string

List of enabled event formats. If unspecified all supported formats are enabled.

serveDataSelect

Type: boolean

Enables/disables the DataSelect service. Default is true.

serveEvent

Type: boolean

Enables/disables the Event service. Default is true.

serveStation

Type: boolean

Enables/disables the Station service. Default is true.

serveAvailability

Type: boolean

Enables/disables the Availability service. Note: This is a non standard FDSNWS extension served under fdsnws/ext/availability. Default is false.

stationFilter

Type: string

Path to station inventory filter file.

dataSelectFilter

Type: string

Path to dataselect inventory filter file.

debugFilter

Type: boolean

If enabled, a debug line is written for each stream ID explaining why a stream was added/removed by an inventory filter. Default is false.

fileNamePrefix

Type: string

Defines the prefix for the default filenames if downloading and saving data from within a browser. For data loaded using dataselect, it is thus fdsnws.mseed by default. Default is fdsnws.

eventType.whitelist

Type: list:string

List of enabled event types.

eventType.blacklist

Type: list:string

List of disabled event types.

Note

dataAvailability.* Provide access to waveform data availability information stored in the SeisComP3 database. In case of an SDS archive this information may be collected by scardac (SeisComP archive data availability collector).

dataAvailability.enable

Type: boolean

Enable loading of data availability information from the SeisComP3 database. Availability information is used by the station and ext/availability services. Default is false.

dataAvailability.cacheDuration

Type: int

Unit: s

Number of seconds data availability information is considered valid. If the duration time is exceeded the information is fetched again from the database. Default is 300.

dataAvailability.dccName

Type: string

Name of the archive used in the sync format of the data availability extent service. Default is DCC.

dataAvailability.repositoryName

Type: string

Name of the archive used in some formats of the data availability extent service. Default is primary.

trackdb.enable

Type: boolean

Save request log to database. Default is false.

trackdb.defaultUser

Type: string

Default user. Default is fdsnws.

auth.enable

Type: boolean

Enable auth extension. Default is false.

auth.gnupgHome

Type: string

GnuPG home directory. Default is @ROOTDIR@/var/lib/gpg.

auth.blacklist

Type: list:string

List of revoked token IDs.

Command-line

Generic

-h, --help

show help message.

-V, --version

show version information

--config-file arg

Use alternative configuration file. When this option is used the loading of all stages is disabled. Only the given configuration file is parsed and used. To use another name for the configuration create a symbolic link of the application or copy it, e.g. scautopick -> scautopick2.

--plugins arg

Load given plugins.

-D, --daemon

Run as daemon. This means the application will fork itself and doesn’t need to be started with &.

--auto-shutdown arg

Enable/disable self-shutdown because of a master module shutdown. This only works when messaging is enabled and the master module sends a shutdown message (enabled with --start-stop-msg for the master module).

--shutdown-master-module arg

Sets the name of the master-module used for auto-shutdown. This is the application name of the module actually started. If symlinks are used then it is the name of the symlinked application.

--shutdown-master-username arg

Sets the name of the master-username of the messaging used for auto-shutdown. If “shutdown-master-module” is given as well this parameter is ignored.

Verbosity

--verbosity arg

Verbosity level [0..4]. 0:quiet, 1:error, 2:warning, 3:info, 4:debug

-v, --v

Increase verbosity level (may be repeated, e.g. -vv)

-q, --quiet

Quiet mode: no logging output

--component arg

Limits the logging to a certain component. This option can be given more than once.

-s, --syslog

Use syslog logging back end. The output usually goes to /var/log/messages.

-l, --lockfile arg

Path to lock file.

--console arg

Send log output to stdout.

--debug

Debug mode: --verbosity=4 --console=1

--log-file arg

Use alternative log file.

Database

--db-driver-list

List all supported database drivers.

-d, --database arg

The database connection string, format: service://user:pwd@host/database. “service” is the name of the database driver which can be queried with “--db-driver-list”.

--config-module arg

The configmodule to use.

--inventory-db arg

Load the inventory from the given database or file, format: [service://]location

--db-disable

Do not use the database at all

Records

--record-driver-list

List all supported record stream drivers

-I, --record-url arg

The recordstream source URL, format: [service://]location[#type]. “service” is the name of the recordstream driver which can be queried with “--record-driver-list”. If “service” is not given “file://” is used.

--record-file arg

Specify a file as record source.

--record-type arg

Specify a type for the records being read.