fdsnws

Provide FDSN Web Services.

Description

fdsnws is a server that provides event and station information by FDSN Web Services (FDSNWS [8]) from a SeisComP database and waveforms from a RecordStream source. Also it may be configured to serve data availability information as described in the IRIS DMC FDSNWS availability Web Service Documentation (IRIS DMC [12]).

Caution

If you expose the FDSN Web Service as a public service, make sure that the database connection is read-only. fdsnws will never attempt to write into the database.

Service Overview

The following services are available:

Service

Provides

Provided format

fdsnws-dataselect

time series data

miniSEED

fdsnws-station

network, station, channel, response metadata

FDSN Station XML, StationXML, SCML

fdsnws-event

earthquake origin and magnitude estimates

QuakeML, SCML

ext-availability

waveform data availability information

text, geocsv, json, sync, request (fdsnws-dataselect)

The available services can be reached from the fdsnws start page. The services also provide an interactive URL builder constructing the request URL based on user inputs. The FDSN specifications can be found on FDSNWS [8].

URL

If fdsnws is started, it accepts connections by default on port 8080 which can be changed in the configuration. Also please read Changing the Service Port for running the services on a privileged port, e.g. port 80 as requested by the FDSNWS specification.

Note

If you decide to run the service on a different URL than localhost:8080 you have to change the URL string in the *.wadl documents located under $DATADIR/fdsnws.

DataSelect

  • Provides time series data in miniSEED format

  • Request type: HTTP-GET, HTTP-POST

URL

Example

  • Request URL for querying waveform data from the GE station BKNI, all BH channels on 11 April 2013 between 00:00:00 and 12:00:00:

    http://localhost:8080/fdsnws/dataselect/1/query?net=GE&sta=BKNI&cha=BH?&start=2013-04-11T00:00:00&end=2013-04-11T12:00:00

To submit HTTP-POST requests the command line tool curl may be used:

sysop@host:~$ curl -X POST --data-binary @request.txt "http://localhost:8080/fdsnws/dataselect/1/query"

where request.txt contains the POST message body. For details read the FDSN specifications.

Feature Notes

  • quality parameter not implemented (information not available in SeisComP)

  • minimumlength parameter is not implemented

  • longestonly parameter is not implemented

  • Access to restricted networks and stations is only granted through the queryauth method

  • The data channels exposed by this service may be restricted by defining an inventory filter, see section Filtering Inventories.

  • No trimming of miniSEED records to requested time window – This FDSNWS implementation returns the records as available in its data source, e.g., SDS archive. It is guaranteed that the requested time range is within the returned data (if available in the archive) but not that it is exactly the requested time range. FDSNWS does not trim and therefore re-encode miniSEED records. The rationale for that is that miniSEED records are probably further distributed and stored in other archives and we do not see the point in modifying them. Furthermore we do not want to increase the load on the web server with that extra processing step.

Service Configuration

  • Activate serveDataSelect in the module configuration

  • Configure the RecordStream in the module’s global configuration. If the data is stored in a local waveform archive the SDSArchive provides fast access to the data. For archives on remote hosts use ArcLink or FDSNWS instead.

Warning

Requesting future or delayed data may block the DataSelect service. Therefore, real-time RecordStream requests such as SeedLink should be avoided. If SeedLink is inevitable make use of the timeout and retries parameters. E.g. set the recordstream to slink://localhost:18000?timeout=1&retries=0 or in case of the Combined service to combined://slink/localhost:18000?timeout=1&retries=0;sdsarchive//home/sysop/seiscomp/var/lib/archive.

Station

  • Provides network, station, channel, response metadata

  • Request type: HTTP-GET, HTTP-POST

  • Stations may be filtered e.g. by geographic region and time, also the information depth level is selectable

  • Optionally handles time-based conditional HTTP-GET requests as specified by RFC 7232.

URL

Example

Feature Notes

  • To enable FDSNXML or StationXML support load the plugin fdsnxml. The plugin is loaded by default configuration.

  • updatedafter request parameter not implemented: The last modification time in SeisComP is tracked on the object level. If a child of an object is updated the update time is not propagated to all parents. In order to check if a station was updated all children must be evaluated recursively. This operation would be much too expensive.

  • formatted: boolean, default: false

  • Additional values of request parameters:

    • format:

      • standard: [xml, text]

      • additional: [fdsnxml (=xml), stationxml, sc3ml]

      • default: xml

The inventory exposed by this service may be restricted, see section Filtering Inventories.

Event

  • Provides earthquake origin and magnitude estimates

  • Request type: HTTP-GET

  • Events may be filtered e.g. by hypocenter, time and magnitude

URL

Example

Feature Notes

  • SeisComP does not distinguish between catalogs and contributors, but supports agencyIDs. Hence, if specified, the value of the contributor parameter is mapped to the agencyID. The file @DATADIR@/share/fdsn/contributors.xml has to be filled manually with all available agency ids

  • Origin and magnitude filter parameters are always applied to preferred origin resp. preferred magnitude

  • updatedafter request parameter not implemented: The last modification time in SeisComP is tracked on the object level. If a child of an object is updated the update time is not propagated to all parents. In order to check if a station was updated all children must be evaluated recursively. This operation would be much too expensive.

  • Additional request parameters:

    • includepicks: boolean, default: false, works only in combination with includearrivals set to true

    • includecomments: boolean, default: true

    • formatted: boolean, default: false

  • Additional values of request parameters:

    • format:

      • standard: [xml, text]

      • additional: [qml (=xml), qml-rt, sc3ml, csv]

      • default: xml

Data Availability

The data availability web service returns detailed time span information of what time series data is available at the DMC archive. The availability information can be created by scardac in the SeisComP database from where it is fetched by fdsnws.

The availability service is no official standard yet. This implementation aims to be compatible with the IRIS DMC availability FDSN Web Service (IRIS DMC [12]) implementation.

  • request type: HTTP-GET, HTTP-POST

  • results may be filtered e.g. by channel code, time and quality

URL

Examples

Note

Use scardac for creating the availability information.

Feature Notes

  • The IRISWS availability implementation truncates the time spans of the returned data extents and segments to the requested start and end times (if any). This implementation truncates the start and end time only for the formats: sync and request. The text, geocsv and json format will return the exact time windows extracted from the waveform archive.

    The reasons for this derivation are:

    • performance: With the /extent query the text, geocsv and json offer the display of the number of included time spans (show=timespancount). The data model offers no efficient way to recalculate the number of time spans represented by an extent if the extents time window is altered by the requested start and end times. The sync and request formats do not provided this counter and it is convenient to use their outputs for subsequent data requests.

    • by truncating the time windows information is lost. There would be no efficient way for a client to retrieve the exact time windows falling into a specific time span.

    • network and station epochs returned by the Station service are also not truncated to the requested start and end times.

    • truncation can easily be done on client side. No additional network traffic is generated.

Filtering Inventories

The channels served by the Station and DataSelect services may be filtered by specified an INI file in the stationFilter and dataSelectFilter configuration parameter. You may use the same file for both services or define a separate configuration set. Note: If distinct file names are specified and both services are activated, the inventory is loaded twice which will increase the memory consumption of this module.

[Chile]
code = CX.*.*.*

[!Exclude station APE]
code = GE.APE.*.*

[German (not restricted)]
code = GE.*.*.*
restricted = false
shared = true
archive = GFZ

The listing above shows a configuration example which includes all Chile stations. Also all not restricted German stations, with exception of the station GE.APE, are included.

The configuration is divided into several rules. The rule name is given in square brackets. A name starting with an exclamation mark defines an exclude rule, else the rule is an include. The rule name is not evaluated by the application but is plotted when debugging the rule set, see configuration parameter debugFilter.

Each rule consists of a set of attributes. The first and mandatory attribute is code which defines a regular expression for the channel code (network, station, location, channel). In addition the following optional attributes exist:

Attribute

Type

Network

Station

Location

Channel

restricted

Boolean

X

X

X

shared

Boolean

X

X

X

netClass

String

X

archive

String

X

X

A rule matches if all of its attributes match. The optional attributes are evaluated bottom-up where ever they are applicable. E.g. if a rule defines restricted = false but the restricted flag is not present on channel level then it is searched on station and then on network level. If no restricted attribute is found in the hierarchy, the rule will not match even if the value was set to false.

The individual rules are evaluated in order of their definition. The processing stops once a matching rule is found and the channel is included or excluded immediately. So the order of the rules is important.

One may decided to specify a pure whitelist, a pure blacklist, or to mix include and exclude rules. If neither a matching include nor exclude rule is found, then channel is only added if no other include rule exists in the entire rule set.

Changing the Service Port

The FDSN Web service specification defines that the Service SHOULD be available under port 80. Typically SeisComP runs under a user without root permissions and therefore is not allowed to bind to privileged ports (<1024). To serve on port 80 you may for instance

  • Run SeisComP with root privileged (not recommended)

  • Use a proxy Webserver, e.g. Apache with mod-proxy module

  • Configure and use Authbind

  • Setup Firewall redirect rules

Authbind

authbind allows a program which does not or should not run as root to bind to low-numbered ports in a controlled way. Please refer to man authbind for program descriptions. The following lines show how to install and setup authbind for the user sysop under the Ubuntu OS.

sysop@host:~$ sudo apt-get install authbind
sysop@host:~$ sudo touch /etc/authbind/byport/80
sysop@host:~$ sudo chown sysop /etc/authbind/byport/80
sysop@host:~$ sudo chmod 500 /etc/authbind/byport/80

Once authbind is configured correctly the FDSN Web services may be started as follows:

sysop@host:~$ authbind --deep seiscomp exec fdsnws

In order use authbind when starting fdsnws as SeisComP service the last line in the ~/seiscomp/etc/init/fdsnws.py have to be commented in.

Firewall

All major Linux distributions ship with their own firewall implementations which are front-ends for the iptables kernel functions. The following line temporary adds a firewall rule which redirects all incoming traffic on port 8080 to port 80.

sysop@host:~$ sudo iptables -t nat -A PREROUTING -p tcp --dport 80 -j REDIRECT --to 8080

Please refer to the documentation of your particular firewall solution on how to set up this rule permanently.

Authentication Extension

The FDSNWS standard requires HTTP digest authentication as the authentication mechanism. The “htpasswd” configuration option is used to define the location of the file storing usernames and passwords of users who are allowed to get restricted data. Any user with valid credentials would have access to all restricted data.

An extension to the FDSNWS protocol has been developed in order to use email-pattern-based access control lists, which is an established authorization mechanism in SeisComP3 (used by Arclink). It works as follows:

  • The user contacts an authentication service (based on eduGAIN AAI, e-mail, etc.) and receives a list of attributes (a token), signed by the authentication service. The validity of the token is typically 30 days.

  • The user presents the token to /auth method (HTTPS) of the dataselect service. This method is the only extension to standard FDSNWS that is required.

  • If the digital signature is valid, a temporary account for /queryauth is created. The /auth method returns username and password of this account, separated by ‘:’. The account is typically valid for 24 hours.

  • The username and password are to be used with /queryauth as usual.

  • Authorization is based on user’s e-mail address in the token and arclink-access bindings.

Configuration

The authentication extension is enabled by setting the “auth.enable” configuration option to “true” and pointing “auth.gnupgHome” to a directory where GPG stores its files. Let’s use the directory ~/seiscomp/var/lib/gpg, which is the default.

  • First create the direcory and your own signing key:

    sysop@host:~$ mkdir -m 700 ~/seiscomp/var/lib/gpg
    sysop@host:~$ gpg --homedir ~/seiscomp/var/lib/gpg --gen-key
    
  • Now import GPG keys of all authentication services you trust:

    sysop@host:~$ gpg --homedir ~/seiscomp/var/lib/gpg --import <keys.asc
    
  • Finally sign all imported keys with your own key (XXXXXXXX is the ID of an imported key):

    sysop@host:~$ gpg --homedir ~/seiscomp/var/lib/gpg --edit-key XXXXXXXX sign save
    
  • …and set auth.enable, either using the “scconfig” tool or:

    sysop@host:~$ echo "auth.enable = true" >> ~/seiscomp/etc/fdsnws.cfg
    

Usage Example

A client like fdsnws_fetch is recommended, but also tools like wget and curl can be used. As an example, let’s request data from the restricted station AAI (assuming that we are authorized to get data of this station).

  • The first step is to obtain the token from an authentication service. Assuming that the token is saved in “token.asc”, credentials of the temporary account can be requested using one of the following commands:

    sysop@host:~$ wget --post-file token.asc https://geofon.gfz-potsdam.de/fdsnws/dataselect/1/auth -O cred.txt
    sysop@host:~$ curl --data-binary @token.asc https://geofon.gfz-potsdam.de/fdsnws/dataselect/1/auth -o cred.txt
    
  • The resulting file “cred.txt” contains username and password separated by a colon, so one can conveniently use a shell expansion:

    sysop@host:~$ wget "http://`cat cred.txt`@geofon.gfz-potsdam.de/fdsnws/dataselect/1/queryauth?starttime=2015-12-15T16:00:00Z&endtime=2015-12-15T16:10:00Z&network=IA&station=AAI" -O data.mseed
    sysop@host:~$ curl --digest "http://`cat cred.txt`@geofon.gfz-potsdam.de/fdsnws/dataselect/1/queryauth?starttime=2015-12-15T16:00:00Z&endtime=2015-12-15T16:10:00Z&network=IA&station=AAI" -o data.mseed
    
  • Using the fdsnws_fetch utility, the two steps above can be combined into one:

    sysop@host:~$ fdsnws_fetch -a token.asc -s 2015-12-15T16:00:00Z -e 2015-12-15T16:10:00Z -N IA -S AAI -o data.mseed
    

Logging

In addition to normal SeisComP logs, fdsnws can create a simple HTTP access log and/or a detailed request log. The locations of log files are specified by “accessLog” and “requestLog” in fdsnws.cfg.

Both logs are text-based and line-oriented. Each line of access log contains the following fields, separated by ‘|’ (some fields can be empty):

  • service name;

  • hostname of service;

  • access time;

  • hostname of user;

  • IP address of user (proxy);

  • length of data in bytes;

  • processing time in milliseconds;

  • error message;

  • agent string;

  • HTTP response code;

  • username (if authenticated);

  • network code of GET request;

  • station code of GET request;

  • location code of GET request;

  • channel code of GET request;

Each line of request log contains a JSON object, which has the following attributes:

service

service name

userID

anonymized (numeric) user ID for statistic purposes

clientID

agent string

userEmail

e-mail address of authenticated user if using restricted data

auth

True if user is authenticated (not anonymous)

userLocation

JSON object containing rough user location (eg., country) for statistic purposes

created

time of request creation

status

“OK”, “NODATA”, “ERROR” or “DENIED”

bytes

length of data in bytes

finished

time of request completion

trace

request content after wildcard expansion (array of JSON objects)

Each trace object has the following attributes:

net

network code

sta

station code

loc

location code

cha

channel code

start

start time

end

end time

restricted

True if the data requires authorization

status

“OK”, “NODATA”, “ERROR” or “DENIED”

bytes

length of trace in bytes

Both logs are rotated daily. In case of access log, one week of data is kept. Request logs are compressed using bzip2 and not removed.

If trackdb.enable=true in fdsnws.cfg, then requests are additionally logged into SeisComP database using the ArcLink request log schema. Be aware that the number of requests in a production system can be rather large. For example, the GEOFON datacentre is currently serving between 0.5..1 million FDSNWS requests per day.

Public FDSN Web Servers

IRIS maintains a list of data centers (FDSN data centers [6]) supporting FDSN Web Services (FDSNWS [8]).

Module Configuration

etc/defaults/global.cfg
etc/defaults/fdsnws.cfg
etc/global.cfg
etc/fdsnws.cfg
~/.seiscomp/global.cfg
~/.seiscomp/fdsnws.cfg

fdsnws inherits global options.

listenAddress

Default: 0.0.0.0

Type: IP

Define the bind address of the server. "0.0.0.0" allows any interface to connect to this server whereas "127.0.0.0" only allows connections from localhost.

port

Default: 8080

Type: int

Server port to listen for incoming requests. Note: The FDSN Web service specification defines the service port 80. Please refer to the documentation on how to serve on privileged ports.

connections

Default: 5

Type: int

Number of maximum simultaneous requests.

queryObjects

Default: 10000

Type: int

Maximum number of objects per query, used in fdsnws-station and fdsnws-event to limit main memory consumption.

realtimeGap

Type: int

Unit: s

Restrict end time of requests to current time - realtimeGap seconds. Negative values allowed. Used in fdsnws-dataselect. WARNING: If this value is unset and a realtime recordsource (e.g. slink) is used, requests may block if end time in future is requested.

samplesM

Type: float

Maximum number of samples (in units of million) per query, used in fdsnws-dataselect to prevent a single user to block one connection with a large request.

recordBulkSize

Default: 102400

Type: int

Unit: bytes

Set the number of bytes to buffer for each chunk of waveform data served to the client. The lower the buffer the higher the overhead of Python Twisted. The higher the buffer the higher the memory usage per request. 100kB seems to be a good trade-off.

htpasswd

Default: @CONFIGDIR@/fdsnws.htpasswd

Type: string

Path to password file used in fdsnws-dataselect/queryauth. The format is ‘username:password’ separated by lines. Because of the HTTP digest authentication method required by the FDSN specification, the passwords have to be stored in plain text.

accessLog

Type: string

Path to access log file. If unset no access log is created.

requestLog

Type: string

Path to request log file. If unset no request log is created.

userSalt

Type: string

Secret salt for calculating userID.

corsOrigins

Default: *

Type: list:string:

List of domain names Cross-Origin Resource Sharing (CORS) request may originate from. A value of ‘*’ allows any web page to embed your service. An empty value will switch of CORS requests entirely. An example of multiple domains might be: ‘https://test.domain.de, https://production.domain.de’.

allowRestricted

Default: true

Type: boolean

Enable/disable access to restricted inventory data.

handleConditionalRequests

Default: false

Type: boolean

Enable/disable handling of time-based conditional requests (RFC 7232) by the fdsnws-station resource.

useArclinkAccess

Default: false

Type: boolean

If enabled, then access to restricted waveform data is controlled by arclink-access bindings. By default authenticated users have access to all data.

hideAuthor

Default: false

Type: boolean

If enabled, author information is removed from any event creationInfo element.

hideComments

Default: false

Type: boolean

If enabled, event comment elements are no longer accessible.

evaluationMode

Type: string

If set, the event service will only return events having a preferred origin with a matching evaluationMode property.

eventFormats

Type: list:string

List of enabled event formats. If unspecified, all supported formats are enabled.

serveDataSelect

Default: true

Type: boolean

Enable/disable the DataSelect service.

serveEvent

Default: true

Type: boolean

Enable/disable the Event service.

serveStation

Default: true

Type: boolean

Enable/disable the Station service.

serveAvailability

Default: false

Type: boolean

Enable/disable the Availability service. Note: This is a non standard FDSNWS extension served under fdsnws/ext/availability.

stationFilter

Type: string

Path to station inventory filter file.

dataSelectFilter

Type: string

Path to dataselect inventory filter file.

debugFilter

Default: false

Type: boolean

If enabled, a debug line is written for each stream ID explaining why a stream was added/removed by a inventory filter.

fileNamePrefix

Default: fdsnws

Type: string

Define the prefix for the default filenames if downloading and saving data from within a browser. For data loaded using dataselect, it is thus fdsnws.mseed by default.

eventType.whitelist

Type: list:string

List of enabled event types

eventType.blacklist

Type: list:string

List of disabled event types

Note

dataAvailability.* Provide access to waveform data availability information stored in the SeisComP database. In case of a SDS archive, this information may be collected by scardac (SeisComP archive data availability collector).

dataAvailability.enable

Default: false

Type: boolean

Enable loading of data availabilty information from SeisComP database. Availability information is used by station and ext/availability service.

dataAvailability.cacheDuration

Default: 300

Type: int

Unit: s

Number of seconds data availabilty information is considered valid. If the duration time is exceeded, the information is fetched again from the database.

dataAvailability.dccName

Default: DCC

Type: string

Name of the archive use in sync format of dataavailability extent service.

dataAvailability.repositoryName

Default: primary

Type: string

Name of the archive use in some format of data availability extent service.

trackdb.enable

Default: false

Type: boolean

Save request log to database.

trackdb.defaultUser

Default: fdsnws

Type: string

Default user.

auth.enable

Default: false

Type: boolean

Enable auth extension.

auth.gnupgHome

Default: @ROOTDIR@/var/lib/gpg

Type: string

GnuPG home directory.

auth.blacklist

Type: list:string

List of revoked token IDs.

Command-Line Options

fdsnws [options]

Generic

-h, --help

Show help message.

-V, --version

Show version information.

--config-file arg

Use alternative configuration file. When this option is used the loading of all stages is disabled. Only the given configuration file is parsed and used. To use another name for the configuration create a symbolic link of the application or copy it. Example: scautopick -> scautopick2.

--plugins arg

Load given plugins.

-D, --daemon

Run as daemon. This means the application will fork itself and doesn’t need to be started with &.

--auto-shutdown arg

Enable/disable self-shutdown because a master module shutdown. This only works when messaging is enabled and the master module sends a shutdown message (enabled with --start-stop-msg for the master module).

--shutdown-master-module arg

Set the name of the master-module used for auto-shutdown. This is the application name of the module actually started. If symlinks are used, then it is the name of the symlinked application.

--shutdown-master-username arg

Set the name of the master-username of the messaging used for auto-shutdown. If "shutdown-master-module" is given as well, this parameter is ignored.

Verbosity

--verbosity arg

Verbosity level [0..4]. 0:quiet, 1:error, 2:warning, 3:info, 4:debug.

-v, --v

Increase verbosity level (may be repeated, eg. -vv).

-q, --quiet

Quiet mode: no logging output.

--component arg

Limit the logging to a certain component. This option can be given more than once.

-s, --syslog

Use syslog logging backend. The output usually goes to /var/lib/messages.

-l, --lockfile arg

Path to lock file.

--console arg

Send log output to stdout.

--debug

Execute in debug mode. Equivalent to --verbosity=4 --console=1 .

--log-file arg

Use alternative log file.

Database

--db-driver-list

List all supported database drivers.

-d, --database arg

The database connection string, format: service://user:pwd@host/database. "service" is the name of the database driver which can be queried with "--db-driver-list".

--config-module arg

The config module to use.

--inventory-db arg

Load the inventory from the given database or file, format: [service://]location .

--db-disable

Do not use the database at all

Records

--record-driver-list

List all supported record stream drivers.

-I, --record-url arg

The recordstream source URL, format: [service://]location[#type]. "service" is the name of the recordstream driver which can be queried with "--record-driver-list". If "service" is not given, "file://" is used.

--record-file arg

Specify a file as record source.

--record-type arg

Specify a type for the records being read.