= Exporter Script Server
Elizabeth Paige Harper <[email protected]>
v1.0.0
// General Doc Settings
:toc: left
:source-highlighter: highlightjs
:icons: font
// Github specifics
ifdef::env-github[]
:toc: preamble
:tip-caption: :bulb:
:note-caption: :information_source:
:important-caption: :heavy_exclamation_mark:
:caution-caption: :fire:
:warning-caption: :warning:
endif::[]
// Custom Config
:repo-url: https://github.com/VEuPathDB/util-user-dataset-handler-server
:site-url: https://veupathdb.github.io/util-user-dataset-handler-server
:repo-file-base: {repo-url}/blob/master
image:https://www.travis-ci.org/VEuPathDB/util-user-dataset-handler-server.svg?branch=master["Build Status", link="https://www.travis-ci.org/VEuPathDB/util-user-dataset-handler-server"]
image:https://goreportcard.com/badge/github.com/VEuPathDB/util-user-dataset-handler-server["Go Report Card", link="https://goreportcard.com/report/github.com/VEuPathDB/util-user-dataset-handler-server"]
image:https://img.shields.io/github/v/release/VEuPathDB/util-user-dataset-handler-server["Latest Release", link="https://github.com/VEuPathDB/util-user-dataset-handler-server/releases/latest"]
Exposes configured exporter / validation scripts over HTTP.
ifdef::env-github[]
{site-url}[Rendered Readme] |
endif::[]
{site-url}/api.html[API Docs]
== Usage
=== In Script Container
==== Standard Use
https://github.com/VEuPathDB/dataset-handler-biom[Working example project]
==== Custom Use
To create a custom setup that is not based on the demo container, configure
the server by performing the following steps.
CAUTION: Currently, this server application only supports Linux & macOS environments.
. Download the latest release from the {repo-url}/releases/latest[releases page]
and unpack the archive in your project directory.
. Run the command `./service gen-config` to generate a server configuration
  template file. This file will be named `config.tpl.yml`.
. Edit the configuration file with your desired service name and command
configurations.
+
NOTE: Check the {file-config-readme}[config file readme] for more
information about the configuration file.
. Once you have edited the configuration file, rename it and place it in the
  desired location relative to the service binary.
. Validate the configuration by running the command
  `./service check-config --config=path/to/your/config.yml`. This command will
  check the configuration and report any errors.
+
.Bad Config Example
[source, bash-session]
----
$ ./service --config=config.yml check-config
----
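
Taken together, a setup session might look like the following sketch (download
and editing steps omitted; file names follow the defaults described above):

[source, bash-session]
----
$ ./service gen-config
$ mv config.tpl.yml config.yml
$ ./service check-config --config=config.yml
----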
== Script Expectations
This server is language agnostic when it comes to calling scripts; anything
may be used as long as it can be executed via a CLI call.
=== Output
The script expectations are based on the Galaxy tooling. This means scripts are
expected to output a `.tgz` file suitable for being directly loaded into iRODS.

The pattern for the tar file output name must match the iRODS trigger's expected
format of `dataset_u\{user-id}_t\{timestamp-int64}_p\{process-id}.tgz`. For
example: `dataset_u12345_t1647364816_p132.tgz`.
This means the tar file must contain a `meta.json`, a `dataset.json`, and a
directory named `datafiles` containing the actual dataset file(s).
This server will automatically pick up that file and return it to the UDIS
service to be loaded into iRODS. The reason for this is that the HTTP server,
which unpacks the upload and starts the command, does not know what does or
does not belong in the iRODS tar.
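
As a rough sketch, assuming a bash script (the placeholder metadata and the
use of the script's own PID are illustrative only), the final packaging step
might look like:

[source, bash]
----
#!/usr/bin/env bash
set -euo pipefail

# Illustrative values; a real script would derive these from injected
# variables such as <<ds-user-id>> and <<timestamp>>.
user_id=12345
timestamp=$(date +%s)
process_id=$$

# Stage the required tar contents (placeholder metadata for illustration).
echo '{}' > meta.json
echo '{}' > dataset.json
mkdir -p datafiles
# ... write the actual dataset file(s) into datafiles/ here ...

# Package everything under the name pattern the iRODS trigger expects.
tar -czf "dataset_u${user_id}_t${timestamp}_p${process_id}.tgz" \
    meta.json dataset.json datafiles
----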
=== Error Handling
A script can indicate a fatal error by exiting with a non-zero status code. An
error message will be generated using the text printed to `stderr` by the
script.
Errors may be plaintext, a JSON object, or plaintext followed by a JSON object.
To indicate that an error is a user error and not a script failure, a JSON
object error must be returned.
Valid `stderr` output examples:

.A plaintext error
----
Some error message that will be recorded as a script/server error.
----

.Error output followed by a final fatal error
----
Some error text followed by a
{"error": "json", "message": "object"}
----

.A structured error
----
{"error": "user", "message": "some user error"}
----
When returning both plaintext and a JSON object, the plaintext will be
prepended to the error message in the JSON output.
For example:

.Raw `stderr` output
----
Some error messages
Logged while executing the script
{"error": "fatal", "message": "failed to open input file"}
----

.Recorded error
[source, json]
----
{
  "error": "fatal",
  "message": "Some error messages\nLogged while executing the script\nfailed to open input file"
}
----
==== JSON Object Error Format
[source, json]
----
{
  "error": "user", <1>
  "message": "some error text" <2>
}
----
<1> The "error"
field is used to define the type of error. A value of
"user"
indicates that the error was a user error and the user import
service will record the error as such. Any other value will be recorded as
a script error.
<2> The "message"
field defines the error message to record.
CAUTION: Both fields are required.
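
A minimal sketch of a script signalling a user error, assuming bash and that
the input file path arrives as the first CLI argument (an assumption, not a
server requirement):

[source, bash]
----
#!/usr/bin/env bash
# Hypothetical validation step: treat an empty input file as a user error
# by printing a JSON error object to stderr and exiting non-zero.
if [ ! -s "$1" ]; then
  echo '{"error": "user", "message": "The uploaded file is empty."}' >&2
  exit 1
fi
----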
== Config File
=== Components
.config.yml
[source, yaml, linenums]
----
service-name: my service <1>
command:
  executable: /some/path/to/your/executable <2>
  args: <3>
  - <<cwd>> <4>
----
<1> The display name of the running service.
<2> Path to an executable script or binary to be called by the server to
    process uploaded datasets.
<3> An array of arguments that will be passed to the configured executable.
    Each array element is effectively a single, quote-wrapped CLI arg.
+
WARNING: Space separated entries in the args array will be treated as a single
argument.
<4> An <<Variables,Injected Variable>>.
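
Putting it together, a filled-out configuration might look like the following
sketch (the executable path and flag name are illustrative; the injected
variables are described under <<Variables>>):

[source, yaml]
----
service-name: example exporter
command:
  executable: /opt/handler/export.sh
  args:
  - --user-id
  - <<ds-user-id>>
  # One array element is one CLI argument, so the expanded, space separated
  # <<input-files>> list below reaches the script as a single argument.
  - <<input-files>>
----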
=== Command Context
The configured command will be executed in an isolated subshell, but will be
provided the same environment as the server itself. This means it is possible
to set environment variables for the script simply by setting them on the
Docker container.
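
For example (the variable name and image are hypothetical), a value set on the
container is visible to the configured script:

[source, bash]
----
# Set on the container:
#   docker run -e EXPORT_MODE=strict my-handler-image

# The configured script can then read it like any other environment variable:
echo "running in mode: ${EXPORT_MODE}"
----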
The output of the script's `stdout` will be piped through the server's logging
mechanism and will appear in the container logs.

The output of the script's `stderr` will be piped through the server's logging
mechanism (like `stdout`), but will also be captured for parsing and returning
to the caller.
=== Variables
==== +<<cwd>>+
The working directory for the job that is running in the current request.
.Example
----
/workspace/12345
----
==== +<<date>>+
The current date formatted as `YYYY-MM-DD`.

.Example
----
1994-02-13
----
==== +<<date-time>>+
The current datetime in RFC3339 format.
.Example
----
2018-10-31T23:37:18.013557+05:00
----
==== +<<ds-description>>+
The user provided description for the current dataset upload.
.Example
----
My dataset upload containing foo and bar
----
==== +<<ds-name>>+
The user provided name for the current dataset upload.
.Example
----
My Dataset 3
----
==== +<<ds-summary>>+
The user provided summary of the current dataset upload.
.Example
----
Some summary text for my dataset upload
----
==== +<<ds-origin>>+
The source/origin of the user dataset. This will be either `galaxy` or
`direct-upload`.

.Example
----
direct-upload
----
==== +<<ds-user-id>>+
WDK user ID of the user that uploaded the dataset.
.Example
----
123456
----
==== +<<input-files>>+
A space-separated list of the files that were unpacked from the uploaded zip
or tar file, sorted by name ascending.
===== Examples
[#upload-tgz]
.Upload Contents
----
dataset.tgz
├─ foo.txt
├─ bar.xml
├─ fizz.json
└─ buzz/
   └─ fazz.yml
----

.+<<input-files>>+
----
bar.xml buzz fizz.json foo.txt
----
===== +<<input-files[n]>>+
Array-style access of the input file list, allowing retrieval of a single file
name from the list.
====== Examples
These use <<upload-tgz, this example input>>.
.+<<input-files[0]>>+
----
bar.xml
----

.+<<input-files[2]>>+
----
fizz.json
----
==== +<<time>>+
The current time formatted as `HH:MM:SS`.

.Example
----
03:47:58
----
==== +<<timestamp>>+
The current Unix timestamp in seconds.

.Example
----
783647299
----
==== +<<projects>>+
A comma-separated list of project identifiers.
.Example
----
VectorBase,PlasmoDB
----
==== +<<handler-params.p1>>+
Provides access to handler-specific parameter values from the body of the
import request.
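
For example, assuming the import request body included a handler parameter
`p1` with the value `foo` (an illustrative value, not from the source), the
variable would resolve as follows.

.Example
----
foo
----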