.. highlight:: rst

.. _capssds:

#######
capssds
#######

**Virtual overlay file system presenting a CAPS archive directory as a read-only SDS archive.**


Description
===========

:program:`capssds` is a virtual overlay file system presenting a CAPS archive
directory as a read-only :term:`SDS` archive with no extra disk space
requirement.

CAPS directory and file names are mapped to their SDS counterparts. An
application reading from a file will only see :term:`miniSEED` records ordered
by record start time. You may connect to the virtual SDS archive using the
RecordStream SDS or read individual :term:`miniSEED` files directly. Other
seismological software such as ObsPy or Seisan may read directly from the SDS
archive or the files therein.


.. _sec-capssds-usage:

Usage
=====

The virtual file system may be mounted by an unprivileged system user like
`sysop` or configured by the `root` user to be mounted automatically on machine
startup via an `/etc/fstab` entry or a systemd mount unit. The following
sections assume that the CAPS archive is located under
`/home/sysop/seiscomp/var/lib/caps/archive` and that the SDS archive should
appear under `/tmp/sds` with all files and directories being owned by the
`sysop` user.

Regardless of which of the following mount strategies is chosen, make sure to
create the target directory first:

.. code-block:: sh

   mkdir -p /tmp/sds


.. _sec-capssds-usage-unpriv:

Unprivileged user
-----------------

Mount the archive:

.. code-block:: sh

   capssds ~/seiscomp/var/lib/caps/archive /tmp/sds

Unmount the archive:

.. code-block:: sh

   fusermount -u /tmp/sds


.. _sec-capssds-usage-fstab:

System administrator - /etc/fstab
---------------------------------

Create the /etc/fstab entry:

.. code-block:: plaintext

   /home/sysop/seiscomp/var/lib/caps/archive /tmp/sds fuse.capssds defaults 0 0

Alternatively you may define mount options, e.g., to deactivate the automatic
mount, to allow users to mount the directory themselves or to use the
sloppy_size feature:

.. code-block:: plaintext

   /home/sysop/seiscomp/var/lib/caps/archive /tmp/sds fuse.capssds noauto,sloppy_size,user 0 0

Mount the archive:

.. code-block:: sh

   mount /tmp/sds

Unmount the archive:

.. code-block:: sh

   umount /tmp/sds


.. _sec-capssds-usage-systemd:

System administrator - systemd
------------------------------

Create the following file under `/etc/systemd/system/tmp-sds.mount`. Please
note that the file name must match the path specified under `Where` with the
leading slash removed and all remaining slashes replaced by dashes:

.. code-block:: ini

   [Unit]
   Description=Mount CAPS archive as readonly miniSEED SDS
   After=network.target

   [Mount]
   What=/home/sysop/seiscomp/var/lib/caps/archive
   Where=/tmp/sds
   Type=fuse.capssds
   Options=defaults,allow_other

   [Install]
   WantedBy=multi-user.target

Mount the archive:

.. code-block:: sh

   systemctl start tmp-sds.mount

Unmount the archive:

.. code-block:: sh

   systemctl stop tmp-sds.mount

Automatic startup:

.. code-block:: sh

   systemctl enable tmp-sds.mount


.. _sec-capssds-impl:

Implementation Details
======================

:program:`capssds` makes use of FUSE :cite:p:`fuse`, a userspace filesystem
framework provided by the Linux kernel, as well as the libfuse
:cite:p:`libfuse` user space library. The file system provides only read access
to the data files and implements only the
:ref:`basic operations <sec-capssds-impl-ops>` required to list and read data
files. It has to fulfill two main tasks: the :ref:`sec-capssds-impl-pathmap` of
CAPS and SDS directory tree entries and the :ref:`sec-capssds-impl-conv`.
:ref:`Caches <sec-capssds-impl-perf>` are used to improve the performance.


.. _sec-capssds-impl-ops:

Supported operations
--------------------

* `init` - initializes the file system
* `getattr` - get file and directory attributes such as size and access rights
* `access` - check for specific access rights
* `open` - open a file
* `read` - read data at a specific file position
* `readdir` - list directory entries
* `release` - release a file handle
* `destroy` - shutdown the file system

Please refer to `fuse.h `_ for a complete list of FUSE operations.


.. _sec-capssds-impl-pathmap:

Path mapping
------------

CAPS uses a :ref:`comparable directory structure ` to SDS with three
differences:

* The channel does not use the `.D` suffix.
* The day of year index is zero-based (0-365) whereas SDS uses an index
  starting with 1 (1-366).
* CAPS data files use the extension `.data`.

The following example shows the translation from a CAPS data file path to an SDS
file path for the stream AM.R0F05.00.SHZ for data on January 1st 2025:

`2025/AM/R0F05/SHZ/AM.R0F05.00.SHZ.2025.000.data -> 2025/AM/R0F05/SHZ.D/AM.R0F05.00.SHZ.D.2025.001`

Directories and file names not fulfilling the :term:`miniSEED` format
specification are not listed.


.. _sec-capssds-impl-conv:

Data file conversion
--------------------

A :ref:`CAPS data file ` contains records of certain types in the order of
their arrival together with a record index for record lookup and sorting. If a
process reads data, only :term:`miniSEED` records contained in the CAPS data
file are returned, ordered by record start time rather than by arrival.
Likewise, only :term:`miniSEED` records are counted for the reported file size
unless the `-o sloppy_size` option is specified.


.. _sec-capssds-impl-perf:

Performance optimization
------------------------

When a file is opened all :term:`miniSEED` records are copied to a memory
buffer. This allows fast index-based data access at the cost of main memory
consumption. The number of simultaneously opened data files can be configured
through the `-o cached_files` option and must be chosen to fit the available
memory. If an application tries to open more files than available, the action
will fail.

To obtain the mapped SDS file size the CAPS data file must be scanned for
:term:`miniSEED` records. Although only the header data is read, this is still
an expensive operation for hundreds of files. A file size cache is used
containing up to `-o cached_file_sizes` entries, each consuming 56 bytes of
memory. File sizes recently accessed are pushed to the front of the cache. A
cache item is invalidated if the modification time of the CAPS data file is
more recent than the entry creation time.

If your use case does not require the listing of the exact file size, you may
use the `-o sloppy_size` option, which skips computing the :term:`miniSEED`
file size and returns the size of the CAPS file instead.


Command-Line Options
====================

:program:`capssds [options] [capsdir] mountpoint`


.. _File-system specific options:

File-system specific options
----------------------------

.. option:: -o caps_dir=DIR

   Default: ``Current working directory``

   Path to the CAPS archive directory.

.. option:: -o sloppy_size

   Return the size of the CAPS data file instead of summing up the
   size of all MSEED records. Although there is a cache for the
   MSEED file size, calculating the real size is an expensive
   operation. If your use case does not depend on the exact size
   you may activate this flag for speedup.

.. option:: -o cached_file_sizes=int

   Default: ``100000``

   Type: *int*

   Number of file sizes to cache. Used when sloppy_size is off to
   avoid unnecessary recomputation of MSEED sizes. A cache entry
   is valid as long as neither the mtime nor the size of the CAPS
   data file changed. Each entry consumes 56 bytes of memory.

.. option:: -o cached_files=int

   Default: ``100``

   Type: *int*

   Number of CAPS data files to cache \(100\). The file handle for
   each cached file will be kept open to speed up data access.


.. _FUSE Options:

FUSE Options
------------

.. option:: -h, --help

   Print this help text.

.. option:: -V, --version

   Print version.

.. option:: -d

   Enable debug output \(implies \-f\).

.. option:: -o debug

   Enable debug output \(implies \-f\).

.. option:: -f

   Enable foreground operation.

.. option:: -s

   Disable multi\-threaded operation.

.. option:: -o clone_fd

   Use separate fuse device fd for each thread \(may improve
   performance\).

.. option:: -o max_idle_threads=int

   Default: ``-1``

   Type: *int*

   The maximum number of idle worker threads allowed.

.. option:: -o max_threads=int

   Default: ``10``

   Type: *int*

   The maximum number of worker threads allowed.

.. option:: -o kernel_cache

   Cache files in kernel.

.. option:: -o [no]auto_cache

   Enable caching based on modification times.

.. option:: -o no_rofd_flush

   Disable flushing of read\-only fd on close.

.. option:: -o umask=M

   Type: *octal*

   Set file permissions.

.. option:: -o uid=N

   Set file owner.

.. option:: -o gid=N

   Set file group.

.. option:: -o entry_timeout=T

   Default: ``1``

   Unit: *s*

   Type: *float*

   Cache timeout for names.

.. option:: -o negative_timeout=T

   Default: ``0``

   Unit: *s*

   Type: *float*

   Cache timeout for deleted names.

.. option:: -o attr_timeout=T

   Default: ``1``

   Unit: *s*

   Type: *float*

   Cache timeout for attributes.

.. option:: -o ac_attr_timeout=T

   Default: ``attr_timeout``

   Unit: *s*

   Type: *float*

   Auto cache timeout for attributes.

.. option:: -o noforget

   Never forget cached inodes.

.. option:: -o remember=T

   Default: ``0``

   Unit: *s*

   Type: *float*

   Remember cached inodes for T seconds.

.. option:: -o modules=M1[:M2...]

   Names of modules to push onto filesystem stack.

.. option:: -o allow_other

   Allow access by all users.

.. option:: -o allow_root

   Allow access by root.

.. option:: -o auto_unmount

   Auto unmount on process termination.


.. _Options for subdir module:

Options for subdir module
-------------------------

.. option:: -o subdir=DIR

   Prepend this directory to all paths \(mandatory\).

.. option:: -o [no]rellinks

   Transform absolute symlinks to relative.


.. _Options for iconv module:

Options for iconv module
------------------------

.. option:: -o from_code=CHARSET

   Default: ``UTF-8``

   Original encoding of file names.

.. option:: -o to_code=CHARSET

   Default: ``UTF-8``

   New encoding of the file names.
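
Example: path mapping
=====================

The three naming differences listed under :ref:`sec-capssds-impl-pathmap` can be
illustrated with a short Python sketch. It is not taken from the
:program:`capssds` sources: the helper name ``caps_to_sds`` is hypothetical, and
the sketch assumes that CAPS file names always follow the
``NET.STA.LOC.CHA.YEAR.DOY.data`` layout shown in the example of that section.

.. code-block:: python

   from pathlib import PurePosixPath


   def caps_to_sds(caps_path: str) -> str:
       """Translate a CAPS data file path into the matching SDS path.

       Hypothetical helper for illustration only, not part of capssds.
       """
       path = PurePosixPath(caps_path)
       # Directory layout assumed for both archives: YEAR/NET/STA/CHA/<file>
       year, net, sta, cha = path.parts[-5:-1]
       # Assumed file name layout: NET.STA.LOC.CHA.YEAR.DOY.data
       f_net, f_sta, loc, f_cha, f_year, doy, ext = path.name.split(".")
       if ext != "data":
           raise ValueError(f"not a CAPS data file: {caps_path}")
       # CAPS counts the day of year from 0, SDS counts from 1.
       sds_doy = int(doy) + 1
       # SDS appends the data type 'D' to the channel in directory and file name.
       name = f"{f_net}.{f_sta}.{loc}.{f_cha}.D.{f_year}.{sds_doy:03d}"
       return str(PurePosixPath(*path.parts[:-5], year, net, sta, f"{cha}.D", name))


   print(caps_to_sds("2025/AM/R0F05/SHZ/AM.R0F05.00.SHZ.2025.000.data"))

Called with the CAPS path from the path mapping example, the helper returns
`2025/AM/R0F05/SHZ.D/AM.R0F05.00.SHZ.D.2025.001`. The reverse direction, which
the file system needs when an application requests an SDS path, would subtract
one from the day of year and strip the `.D` parts again.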