Factotum Server User Guide
To operate Factotum Server, you need a local copy of Factotum and a Consul agent running on the same server.
The only required argument is `--factotum-bin`, which defines the path to the Factotum binary. All other arguments are optional and have the defaults listed in the table below; an example invocation follows the table.
| Arg | Default | Description |
|---|---|---|
| `--factotum-bin=<path>` | None (must be provided) | Path to the Factotum binary file. |
| `--ip=<address>` | `0.0.0.0` | Binding IP address. |
| `--port=<number>` | `3000` | Port number. |
| `--log-level=<level>` | `WARN` | Logging level. |
| `--max-jobs=<size>` | `1000` | Maximum size of the job request queue. |
| `--max-workers=<size>` | `20` | Maximum number of workers. |
| `--webhook=<url>` | None | Factotum arg to post updates on job execution to the specified URL. |
| `--no-colour` | `false` | Factotum arg to turn off ANSI terminal colours/formatting in output. |
| `--consul-name=<name>` | `factotum` | Node name of the Consul server agent. |
| `--consul-ip=<address>` | `127.0.0.1` | IP address of the Consul server agent. |
| `--consul-port=<number>` | `8500` | Port number of the Consul server agent. |
| `--consul-namespace=<namespace>` | `com.snowplowanalytics/factotum` | Namespace of job references stored in Consul persistence. |
| `--max-stdouterr-size=<bytes>` | `10000` | Maximum size of the individual stdout/stderr output sent via the webhook for job updates. |
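As a minimal sketch of starting the server, assuming the binary is named `factotum-server` and Factotum is installed at `/usr/local/bin/factotum` (both names are illustrative, adjust for your installation):

```bash
# Minimal illustrative startup: only --factotum-bin is required,
# the remaining flags simply restate their defaults.
factotum-server --factotum-bin=/usr/local/bin/factotum \
  --ip=0.0.0.0 \
  --port=3000 \
  --consul-ip=127.0.0.1 \
  --consul-port=8500
```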
The server currently has two modes of operation, `run` and `drain`:

- `run` mode: normal behaviour, where job requests are validated, accepted, and queued for processing
- `drain` mode: stops accepting any further job requests, but continues to work on anything already queued
Jobs are scheduled by sending job requests in a JSON format that specifies the factfile location and any additional arguments that would normally be passed to the Factotum CLI. Here is a simple example:
```json
{
  "jobName": "echotest",
  "factfilePath": "/factotum/samples/echo.factfile",
  "factfileArgs": [ "--tag", "foo,bar", "--no-colour" ]
}
```
An additional `jobId` field is generated in the same way as Factotum's job reference: a hash of the factfile contents with any tags appended. This ID, unique to that particular job, is used to ensure that no duplicate jobs are queued or running at the same time; persisting state in Consul makes this check possible.
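The exact hashing scheme belongs to Factotum itself; purely as an illustration of the ID's shape (an assumption, not the real algorithm), a SHA-256 digest over the factfile contents with the tags appended produces a 64-character hex string like the `jobId` in the entry below:

```bash
# Illustration only: the real job reference is computed by Factotum.
# A SHA-256 over the factfile plus its tags has the same 64-hex-char shape.
cat /factotum/samples/echo.factfile <(printf 'foo,bar') | sha256sum
```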
Consul is used as a persistence layer to track the state of job requests in a way that also works in a distributed setup. Keyed by the job ID, a job entry exists in storage for each unique job that has been submitted:
```json
{
  "state": "DONE",
  "jobRequest": {
    "jobId": "6ef5cf55f2815a098695861ec5f8e9c7122dd2e5339cf954d7e2cb2e761dd583",
    "jobName": "echotest",
    "factfilePath": "/vagrant/sleep.factfile",
    "factfileArgs": ["--tag", "foo,bar", "--no-colour"]
  },
  "lastRunFrom": "factotum",
  "lastOutcome": "SUCCEEDED"
}
```
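Since entries live in Consul's key/value store, you can also inspect one directly through Consul's HTTP API. The key layout below (the default `--consul-namespace` followed by the job ID) is an assumption; check your deployment for the exact key structure:

```bash
# Inspect a persisted job entry via Consul's KV HTTP API.
# Key path (namespace + job ID) is assumed from the defaults above.
curl 'http://127.0.0.1:8500/v1/kv/com.snowplowanalytics/factotum/6ef5cf55f2815a098695861ec5f8e9c7122dd2e5339cf954d7e2cb2e761dd583?raw'
```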
In addition to the original job request, the other fields associated with an entry are:
- `state`: one of `QUEUED`, `WORKING`, or `DONE`
  - If an entry is `QUEUED` or `WORKING`, any new requests for that particular job will not be accepted by the server
  - Only jobs that have completed (`DONE`) or have never run before are valid to be scheduled
- `lastRunFrom`: the Consul node name of the server that last ran the job and updated the entry
- `lastOutcome`: outcome of the last run: `WAITING`, `RUNNING`, `SUCCEEDED`, or `FAILED`
A "worker queue" model is used for scheduling:
- New requests are validated and appended to a queue of job requests (`state == QUEUED`)
- The next available worker is notified to check the queue
- The worker processes the job by executing Factotum in a separate thread (`state == WORKING`)
- The worker persists the outcome on completion (`state == DONE`)
- The worker checks the queue again for any further requests
- `GET /status`

  General statistics on the server and queue:

  - version
  - state
  - start time, uptime
  - workers
  - queue size

  ```bash
  curl 'http://localhost:3000/status?pretty=1'
  ```
- `GET /check`

  Check the status of a job request:

  - retrieves the job entry from persistence storage by id (a polling sketch follows this list)

  ```bash
  curl 'http://localhost:3000/check?pretty=1&id=6ef5cf55f2815a098695861ec5f8e9c7122dd2e5339cf954d7e2cb2e761dd583'
  ```
- `POST /settings`

  Update server settings and behaviour:

  - `run`: all normal behaviour
  - `drain`: refuse any more job requests (see the drain example after this list)

  ```bash
  curl http://localhost:3000/settings -X POST -d '{"state":"run"}'
  ```
- `POST /submit`

  Submit a job with a DAG to the queue:

  - validates the job request
  - performs a dry run
  - checks the job is not already running
  - adds the request to the queue

  ```bash
  curl http://localhost:3000/submit -X POST -d '{"jobName":"echotest","factfilePath":"/tmp/echo.factfile","factfileArgs":["--tag","foo,bar","--no-colour"]}'
  ```
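As a usage sketch, a submitted job can be polled via `/check` until it finishes. This assumes the `/check` response is the persisted entry shown earlier, with top-level `state` and `lastOutcome` fields; verify the shape against your server before relying on it:

```bash
# Poll /check until the job entry reaches DONE, then print the last outcome.
# Assumes the response is the persisted entry format shown above; requires jq.
JOB_ID=6ef5cf55f2815a098695861ec5f8e9c7122dd2e5339cf954d7e2cb2e761dd583
until [ "$(curl -s "http://localhost:3000/check?id=$JOB_ID" | jq -r '.state')" = "DONE" ]; do
  sleep 5
done
curl -s "http://localhost:3000/check?id=$JOB_ID" | jq -r '.lastOutcome'
```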
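To stop the server accepting new work before a deploy or shutdown, switch it into `drain` mode with the same `/settings` endpoint; anything already queued continues to be worked on:

```bash
# Stop accepting new job requests; queued work keeps running.
curl http://localhost:3000/settings -X POST -d '{"state":"drain"}'
```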