It is designed as a web service that represents the main interface (REST API) and the controlling service for the other monitoring components. These components include:
- ElasticSearch
- Logstash Server
- Kibana
- collectd
- logstash-forwarder
- jmxtrans
It is designed for:
- Design choices - See Deliverable D4.1
##Changelog
NOTE: The changelog has moved to a separate file. Please follow this link.
##Installation
The installation is mostly based on bash scripts. Future versions will likely be based on Chef recipes and/or deb/rpm packages. Currently there are two types of supported installation procedures.
This type of installation is for client/cloud deployment. It installs all python modules as well as the ELK stack (ElasticSearch, logstash and kibana 4). In this particular case only local deployment is currently supported.
- Download the installation script to the desired host and make it executable, for example using wget and chmod:
wget https://github.com/dice-project/DICE-Monitoring/releases/download/v1.0.0/install-dmon.sh && sudo chmod +x install-dmon.sh
- Afterwards execute the installation script using sudo
sudo ./install-dmon.sh
Note: This script will clone the D-Mon repository into /opt and change the owner of this directory to ubuntu.ubuntu!
- Next change directory to the newly cloned repository and run
sudo ./dmon-start.sh -i -p 5001
The '-i' flag installs all core components of the monitoring platform (i.e. the ELK stack) and sets the appropriate permissions for all folders and files. The '-p' flag specifies the port on which D-Mon will be deployed.
- For local deployment of D-Mon one needs to issue the following command:
./dmon-start.sh -l -p 5001
The '-l' flag signals to the service that this is a local deployment of both ElasticSearch and Logstash server. The service will start logging to stdout.
Note: Do not execute this command as root! It will corrupt the previously set permissions and the service will be inoperable.
In case one is not interested in creating a local deployment of the service, issue the following command:
./dmon-start.sh -p 5001
This only starts the service and does not load the local deployment module.
Note: By default all IPs are set to 0.0.0.0. This can be changed using the '-e' flag when issuing the command.
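The start-up variants described above can be collected in one small helper. This is only a sketch and not part of D-Mon: the function name `build_start_command` is hypothetical, and passing the IP address as the value of '-e' is an assumption based on the note above.

```python
def build_start_command(port, local=False, install=False, endpoint_ip=None):
    """Build the argument list for dmon-start.sh (hypothetical helper)."""
    if install and local:
        # '-i' needs sudo, while the '-l' run must not be executed as root.
        raise ValueError("run the '-i' and '-l' steps as separate commands")
    cmd = ["./dmon-start.sh"]
    if install:
        # '-i' installs the core components and is run with sudo.
        cmd = ["sudo"] + cmd + ["-i"]
    if local:
        # '-l' marks a local deployment of ElasticSearch and Logstash server.
        cmd.append("-l")
    if endpoint_ip is not None:
        # Assumption: '-e' takes the IP that replaces the default 0.0.0.0.
        cmd += ["-e", endpoint_ip]
    cmd += ["-p", str(port)]
    return cmd


print(" ".join(build_start_command(5001, local=True)))
# prints: ./dmon-start.sh -l -p 5001
```

The list form can be handed to `subprocess.run` directly, which avoids shell quoting issues.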
Observation: Kibana 4 service is started during the bootstrapping process. After this step one can check the service status by running:
sudo service kibana4 status
To start, stop or restart the service, replace the status command with start, stop or restart.
There are two vagrant files in this repository. The first file creates a deployment of 4 VMs on which it automatically installs the Cloudera Manager suite.
The second file is a script that installs D-Mon and the ELK stack. This script replaces the use of '-i' flag in the above presented instructions. In order to create a local deployment for the D-Mon one has to follow the same steps as the ones previously described.
- TODO This feature is still under investigation (not scheduled for M18)
##REST API Structure
NOTE: This is a preliminary structure of the REST API. It may be subject to changes!
The API has two main components:
- First we have the management and deployment/provisioning component called Overlord (Monitoring Management API).
- It is responsible for deployment and management of the Monitoring Core components: ElasticSearch, Logstash Server and Kibana.
- It is also responsible for the deployment and management of the auxiliary components: Collectd and Logstash-forwarder.
- Second, we have the interface used by other applications to query the DataWarehouse represented by ElasticSearch. This component is called Observer.
- It is responsible for returning the monitoring metrics in various formats (CSV, JSON, simple output).
NOTE: Future versions will include authentication for the Overlord resources.
The Overlord is composed of two major components:
- Monitoring Core represented by: ElasticSearch, LogstashServer and Kibana
- Monitoring Auxiliary represented by: Collectd, Logstash-Forwarder
GET
/v1/log
Returns the log of dmon. It contains information about the last requests and the IPs from which they originated, as well as status information from various subcomponents.
The D-Mon internal logging system lists 3 types of messages. INFO messages represent debug level information, WARNING is for handled exceptions and finally ERROR is for caught errors.
GET
/v1/overlord
Returns information regarding the current version of the Monitoring Platform.
GET
/v1/overlord/framework
Returns the currently supported frameworks.
{
"Supported Frameworks":["list_of_frameworks"]
}
GET
/v1/overlord/framework/{fwork}
Returns the metrics configuration file for big data technologies. The response will have the file mime-type encoded. For HDFS,Yarn and Spark it is set to 'text/x-java-properties' while for Storm it is 'text/yaml'.
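A client can check the returned Content-Type against the framework it asked for. The sketch below only encodes the mapping stated above; the lower-case framework keys and the function name `expected_mime` are assumptions, not part of the D-Mon API.

```python
# MIME types as described above; the lower-case keys are an assumption.
FRAMEWORK_MIME = {
    "hdfs": "text/x-java-properties",
    "yarn": "text/x-java-properties",
    "spark": "text/x-java-properties",
    "storm": "text/yaml",
}


def expected_mime(framework):
    """Return the MIME type expected for a framework's metrics configuration."""
    key = framework.lower()
    if key not in FRAMEWORK_MIME:
        raise ValueError("unsupported framework: %s" % framework)
    return FRAMEWORK_MIME[key]


print(expected_mime("Storm"))
# prints: text/yaml
```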
PUT
/v1/overlord/application/{appID}
Registers an application with D-MON and creates a unique tag for the monitored data. The tag is defined by appID.
NOTE: This feature is scheduled for development in future versions!
POST
/v1/overlord/core
Deploys all monitoring core components provided that they have values for the preset hosts. If not it deploys all components locally with default settings.
NOTE: Currently the '-l' flag of the start script dmon-start.sh does the same as the latter option. Scheduled for M18.
GET
/v1/overlord/core/database
Returns the current internal state of D-MON in the form of an sqlite3 database. The response has the application/x-sqlite3 mimetype.
PUT
/v1/overlord/core/database
Submits a new version of the internal database to dmon, replacing the current states with new ones. The old states are backed up before the changes are applied. The database should be formatted as an sqlite3 database file and sent using the application/x-sqlite3 mimetype.
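As an illustration, such an upload request could be prepared with the Python standard library as follows. This is a minimal sketch: the base URL (host and port) is an assumed local deployment, the helper name `build_database_upload` is hypothetical, and the request is only built here, not sent.

```python
import urllib.request


def build_database_upload(base_url, db_bytes):
    """Prepare a PUT request carrying an sqlite3 database file."""
    url = base_url.rstrip("/") + "/v1/overlord/core/database"
    return urllib.request.Request(
        url,
        data=db_bytes,
        method="PUT",
        headers={"Content-Type": "application/x-sqlite3"},
    )


# Placeholder bytes; a real call would read the database file from disk.
req = build_database_upload("http://127.0.0.1:5001", b"SQLite format 3\x00")
print(req.get_method(), req.full_url)
# urllib.request.urlopen(req) would send the request to a running service.
```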
GET
/v1/overlord/core/status
Returns the current status of the Monitoring platform.
{
"ElasticSearch":{
"Status":"<HTTP_CODE>",
"Name":"<NAME>",
"ClusterName":"<CLUSTER_NAME>",
"version":{
"number":"<ES_VERSION>",
"BuildHash":"<HASH>",
"BuildTimestamp":"<TIMESTAMP>",
"BuildSnapshot":"<BOOL>",
"LuceneVersion":"<LC_VERSION>"
}
},
"Logstash":{
"Status":"<HTTP_CODE>",
"Version":"<VNUMBER>"
},
"Kibana":{
"Status":"<HTTP_CODE>",
"Version":"<VNUMBER>"
}
}
NOTE: Only works for local deployments. It returns the current state of local ElasticSearch, Logstash server and Kibana status information.
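A client might reduce this response to a simple per-component health flag. The sketch below assumes that an HTTP code of 200 in the Status field means the component is reachable; the sample values and the function name `component_health` are hypothetical.

```python
def component_health(status):
    """Map each component name to True when its Status field is HTTP 200."""
    # Assumption: 200 indicates a healthy component, anything else does not.
    return {
        name: str(info.get("Status")) == "200"
        for name, info in status.items()
    }


# Hypothetical response, following the structure shown above.
sample = {
    "ElasticSearch": {"Status": "200", "Name": "esCoreMaster"},
    "Logstash": {"Status": "200", "Version": "<VNUMBER>"},
    "Kibana": {"Status": "503", "Version": "<VNUMBER>"},
}
print(component_health(sample))
```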
POST
/v1/overlord/detect/storm
Tries to detect whether the currently registered nodes have a valid Storm deployment. It first tests whether any nodes have a Storm endpoint and port set. If this step fails, it starts to scan all registered nodes. If it finds the endpoint, the first topology is selected for monitoring. It then automatically sets all configurations necessary for monitoring Storm.
GET
/v1/overlord/nodes
Returns the current monitored nodes list. It is the same as /v1/observer/chef.
{
"Nodes":[
{"<NodeFQDN1>":"NodeIP1"},
{"<NodeFQDN2>":"NodeIP2"},
{"<NodeFQDNn>":"NodeIPn"}
]
}
PUT
/v1/overlord/nodes
Includes the given nodes into the monitored node pools. In essence nodes are represented as a list of dictionaries. Thus, it is possible to register one to many nodes at the same time. It is possible to assign different user names and passwords to each node.
Input:
{
"Nodes":[
{
"NodeName":"<NodeFQDN1>",
"NodeIP":"<IP>",
"key":"<keyName|null>",
"username":"<uname|null>",
"password":"<pass|null>"
},
{
"NodeName":"<NodeFQDNn>",
"NodeIP":"<IP>",
"key":"<keyName|null>",
"username":"<uname|null>",
"password":"<pass|null>"
}
]
}
NOTE: Only username and key authentication is currently supported. There is a facility to use public/private key authentication which is currently undergoing testing.
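The input document can be generated from a plain list of node descriptions. This is a sketch only: the input keys `fqdn` and `ip` and the function name are hypothetical, and sending missing credentials as JSON null is an assumption based on the `<...|null>` placeholders above.

```python
import json


def build_nodes_payload(nodes):
    """Build the body for PUT /v1/overlord/nodes from simple node dicts."""
    payload = {"Nodes": []}
    for node in nodes:
        payload["Nodes"].append({
            "NodeName": node["fqdn"],
            "NodeIP": node["ip"],
            # Missing credentials become JSON null (assumption).
            "key": node.get("key"),
            "username": node.get("username"),
            "password": node.get("password"),
        })
    return payload


body = build_nodes_payload([
    {"fqdn": "dice.cdh5.w1.internal", "ip": "10.0.0.11", "username": "ubuntu"},
    {"fqdn": "dice.cdh5.w2.internal", "ip": "10.0.0.12", "username": "ubuntu"},
])
print(json.dumps(body, indent=2))
```

Because each entry carries its own credentials, different user names and passwords can be set per node in a single request.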
POST
/v1/overlord/nodes
Bootstraps all non-monitored nodes. Installs, configures and starts collectd and logstash-forwarder on them. This feature is not recommended for testing; the use of separate commands is preferred in order to detect network failures.
NOTE: Meant for M24. Define one json to completely populate and set up dmon-controller. It can be then used to save and share internal state by sending the json between controller instances.
GET
/v1/overlord/nodes/roles
Returns the roles currently held by each computational node.
{
"Nodes": [
{
"dice.cdh5.mng.internal": [
"storm",
"spark"
]
},
{
"dice.cdh5.w1.internal": [
"unknown"
]
},
{
"dice.cdh5.w2.internal": [
"yarn",
"spark",
"storm"
]
},
{
"dice.cdh5.w3.internal": [
"unknown"
]
}
]
}
If the node has an unknown service installed, or the roles are not specified, the type is set to unknown.
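For many tasks it is more convenient to look up nodes by role than roles by node. The following sketch inverts a response of the form shown above; the function name `nodes_by_role` is hypothetical and the sample is a shortened copy of that response.

```python
from collections import defaultdict


def nodes_by_role(response):
    """Invert the roles response into a role -> list of node FQDNs mapping."""
    grouped = defaultdict(list)
    for entry in response["Nodes"]:
        # Each entry is a single-key dictionary: {nodeFQDN: [roles]}.
        for fqdn, roles in entry.items():
            for role in roles:
                grouped[role].append(fqdn)
    return dict(grouped)


sample = {
    "Nodes": [
        {"dice.cdh5.mng.internal": ["storm", "spark"]},
        {"dice.cdh5.w1.internal": ["unknown"]},
        {"dice.cdh5.w2.internal": ["yarn", "spark", "storm"]},
    ]
}
print(nodes_by_role(sample)["storm"])
# prints: ['dice.cdh5.mng.internal', 'dice.cdh5.w2.internal']
```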
PUT
/v1/overlord/nodes/roles
Modifies the roles of each node.
Input:
{
"Nodes": [
{
"NodeName": "<nodeFQDN>",
"Roles": [
"yarn"
]
}
]
}
POST
/v1/overlord/nodes/roles
Generates metrics configuration files for each role assigned to a node and uploads them to the required directory. It returns a list of all nodes to which a configuration of a certain type (i.e. yarn, spark, storm etc) has been uploaded.
{
"Status":{
"yarn":["list_of_yarn_nodes"],
"spark":["list_of_spark_nodes"],
"storm":["list_of_storm_nodes"],
"unknown":["list_of_unknown_nodes"]
}
}
NOTE: The directory structure is based on the Vanilla and Cloudera distributions of HDFS, Yarn and Spark. Custom installations are not yet supported. As Yarn and HDFS have the same metrics system, their tags (i.e. hdfs and yarn) are interchangeable in the context of D-Mon.
GET
/v1/overlord/nodes/{nodeFQDN}
Returns information of a particular monitored node identified by nodeFQDN.
Response:
{
"NodeName":"nodeFQDN",
"Status":"<online|offline>",
"IP":"<NodeIP>",
"OS":"<Operating_System>",
"key":"<keyName|null>",
"username":"<uname|null>",
"password":"<pass|null>",
"chefclient":"<True|False>",
"Roles":"[listofroles]"
}
FUTURE Version: A more fine grained node status will be implemented. Currently it is boolean - online/offline. The last three elements are not implemented. These are scheduled for future versions.
PUT
/v1/overlord/nodes/{NodeFQDN}
Changes the current information of a given node. Node FQDN may not change from one version to another.
Input:
{
"NodeName":"<nodeFQDN>",
"IP":"<NodeIP>",
"OS":"<Operating_System>",
"Key":"<keyName|null>",
"Username":"<uname|null>",
"Password":"<pass|null>",
"LogstashInstance": "<ip_logstash>"
}
POST
/v1/overlord/nodes/{NodeFQDN}
Bootstraps specified node.
NOTE: Possible duplication with ../aux/.. branch. DEPRECATED.
DELETE
/v1/overlord/nodes/{nodeFQDN}
Stops all auxiliary monitoring components associated with a particular node.
NOTE: This does not delete nodes or configurations; it only stops collectd and logstash-forwarder on the selected nodes. DEPRECATED.
PUT
/v1/overlord/nodes/{nodeFQDN}/roles
Defines the roles each node has inside the cluster.
Input:
{
"Roles":"[list_of_roles]"
}
POST
/v1/overlord/nodes/{nodeFQDN}/roles
Redeploys metrics configuration for a specific node based on the roles assigned to it.
FUTURE WORK: This feature will be developed for future versions.
DELETE
/v1/overlord/nodes/{nodeFQDN}/purge
This resource deletes the auxiliary tools from a given node and also removes all its settings from D-Mon. This process is irreversible.
GET
/v1/overlord/core/es
Return a list of current hosts comprising the ES cluster core components. The first registered host is set as the default master node. All subsequent nodes are set as workers.
{
"ES Instances": [
{
"DataNode": true,
"ESClusterName": "diceMonit",
"ESCoreDebug": "0",
"ESCoreHeap": "3g",
"FieldDataCacheExpire": "6h",
"FieldDataCacheFilterExpires": "6h",
"FieldDataCacheFilterSize": "20%",
"FieldDataCacheSize": "20%",
"HostFQDN": "dice.cdh5.dmon.internal",
"IP": "127.0.0.1",
"IndexBufferSize": "30%",
"MasterNode": true,
"MinIndexBufferSize": "96mb",
"MinShardIndexBufferSize": "12mb",
"NodeName": "esCoreMaster",
"NodePort": 9200,
"NumOfReplicas": 1,
"NumOfShards": 5,
"OS": "ubuntu",
"PID": 2531,
"Status": "Running"
}
]
}
POST
/v1/overlord/core/es
Generates and applies the new configuration options for the ES Core components. During this request the new configuration will be generated.
NOTE: If the configuration is unchanged ES Core will not be restarted! It is possible to deploy the monitoring platform on different hosts than elasticsearch only in case that the FQDN or IP is provided.
FUTURE Work: This process needs more streamlining. It is recommended to use only local deployments for this version.
GET
/v1/overlord/core/es/config
Returns the current configuration file for ElasticSearch in the form of a YAML file.
NOTE: The first registered ElasticSearch information will be set by default to be the master node.
PUT
/v1/overlord/core/es/config
Changes the current configuration options for the Elasticsearch instance defined by its FQDN and IP.
Input:
{
"DataNode": true,
"ESClusterName": "string",
"ESCoreDebug": 1,
"ESCoreHeap": "4g",
"FieldDataCacheExpires": "6h",
"FieldDataCacheFilterExpires": "6h",
"FieldDataCacheFilterSize": "20%",
"FieldDataCacheSize": "20%",
"HostFQDN": "string",
"IP": "string",
"IndexBufferSize": "30%",
"MasterNode": true,
"MinIndexBufferSize": "96mb",
"MinShardIndexBufferSize": "12mb",
"NodeName": "string",
"NodePort": 9200,
"NumOfReplicas": 0,
"NumOfShards": 1,
"OS": "unknown"
}
NOTE: The new configuration will not be generated at this step. Currently only ESClusterName, HostFQDN, IP, NodeName, NodePort are required. This will be changed in future versions.
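Since only some fields are currently required, a client can verify a configuration before submitting it. The sketch below checks only the five fields named in the note above; the function name `missing_es_fields` and the sample values are hypothetical.

```python
# Required fields as listed in the note above.
REQUIRED_ES_FIELDS = ("ESClusterName", "HostFQDN", "IP", "NodeName", "NodePort")


def missing_es_fields(config):
    """Return the required fields that are absent or empty in config."""
    return [
        field for field in REQUIRED_ES_FIELDS
        if config.get(field) in (None, "")
    ]


# Hypothetical, deliberately incomplete configuration.
candidate = {
    "ESClusterName": "diceMonit",
    "HostFQDN": "dice.cdh5.dmon.internal",
    "IP": "127.0.0.1",
    "NodePort": 9200,
}
print(missing_es_fields(candidate))
# prints: ['NodeName']
```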
GET
/v1/overlord/core/es/status/<intComp>/property/<intProp>
Returns diagnostic data about the master elasticsearch instance.
DELETE
/v1/overlord/core/es/<hostFQDN>
Stops the ElasticSearch (es) instance on a given host and removes all configuration data from DMON.
POST
/v1/overlord/core/es/<hostFQDN>/start
Start the es instance on the host identified by hostFQDN. It uses the last good generated es configuration.
POST
/v1/overlord/core/es/<hostFQDN>/stop
Stops the es instance on the host identified by hostFQDN.
POST
/v1/overlord/core/halt
Stops all core components on every node.
NOTE: Future release.
GET
/v1/overlord/core/es/<hostFQDN>/status
Returns the current status (Running, Stopped, Unknown) and PID of the es instance on the host identified by hostFQDN.
GET
/v1/overlord/core/ls
Returns the current status of all logstash server instances registered with D-Mon.
Response:
{
"LS Instances":[
{
"ESClusterName": "diceMonit",
"HostFQDN": "dice.cdh5.dmon.internal",
"IP": "109.231.121.210",
"LPort": 5000,
"LSCoreHeap": "512m",
"LSCoreSparkEndpoint": "None",
"LSCoreSparkPort": "None",
"LSCoreStormEndpoint": "None",
"LSCoreStormPort": "None",
"LSCoreStormTopology": "None",
"OS": "ubuntu",
"Status": "Running",
"udpPort": 25680
}
]
}
POST
/v1/overlord/core/ls
Starts the logstash server based on the configuration information. During this step the configuration file is first generated.
FUTURE Work: Better support for distributed deployment of logstash core service instances.
DELETE
/v1/overlord/core/ls/<hostFQDN>
Stops the logstash server instance on a given host and removes all configuration data from DMON.
GET
/v1/overlord/core/ls/config
Returns the current configuration file of Logstash Server.
PUT
/v1/overlord/core/ls/config
Changes the current configuration of Logstash Server.
Input:
{
"ESClusterName": "diceMonit",
"HostFQDN": "string",
"IP": "string",
"Index": "logstash",
"LPort": 5000,
"LSCoreHeap": "512m",
"LSCoreSparkEndpoint": "None",
"LSCoreSparkPort": "None",
"LSCoreStormEndpoint": "None",
"LSCoreStormPort": "None",
"LSCoreStormTopology": "None",
"LSCoreWorkers": "4",
"OS": "string",
"udpPort": 25826
}
NOTE: LS instances are bound to their FQDN, which means that the FQDN can't change. Future Work: Only local deployment of the logstash server core service is currently supported. Future versions will include distributed deployment.
GET
/v1/overlord/core/ls/<hostFQDN>/status
Return the status of the logstash server running on the host identified by hostFQDN.
POST
/v1/overlord/core/ls/<hostFQDN>/start
Start the logstash server instance on the host identified by hostFQDN. It will use the last good configuration.
POST
/v1/overlord/core/ls/<hostFQDN>/stop
Stops the logstash server instance on the host identified by hostFQDN.
GET
/v1/overlord/core/ls/credentials
Returns the current credentials for logstash server core service.
Response:
{
"Credentials": [
{
"Certificate":"<certificate name>",
"Key":"<key name>",
"LS Host":"<host fqdn>"
}
]
}
NOTE: Logstash server and the logstash forwarder need a private/public key pair in order to establish secure communications. During local deployment ('-l' flag) a default public/private key pair is created.
GET
/v1/overlord/core/ls/cert/{certName}
Returns the hosts using a specified certificate. The certificate is identified by its certName.
Response:
{
"Host":"[listofhosts]"
}
Note: By default all Nodes use the default certificate created during D-Mon initialization. This request returns a list of hosts using the specified certificate.
PUT
/v1/overlord/core/ls/cert/{certName}/{hostFQDN}
Uploads a certificate with the name given by certName and associates it with the given host identified by hostFQDN.
NOTE: The submitted certificate must use the application/x-pem-file Content-Type.
GET
/v1/overlord/core/ls/key/{keyName}
Returns the host associated with the given key, identified by the keyName parameter.
Response:
{
"Host":"<LS host name>",
"Key":"<key name>"
}
PUT
/v1/overlord/core/ls/key/{keyName}/{hostFQDN}
Uploads a private key with the name given by keyName and associates it with the given host identified by hostFQDN.
NOTE: The submitted private key must use the application/x-pem-file Content-Type.
GET
/v1/overlord/core/kb
Returns information for all kibana instances.
{
"KB Instances":[{
"HostFQDN":"<FQDN>",
"IP":"<host_ip>",
"OS":"<os_type>",
"KBPort":"<kibana_port>",
"PID":"<kibana_pid>",
"KBStatus":"<Running|Stopped|Unknown>"
}
]
}
POST
/v1/overlord/core/kb
Generates the configuration file and starts or restarts a Kibana session.
NOTE: Currently supports only one instance. No distributed deployment.
GET
/v1/overlord/core/kb/config
Returns the current configuration file for Kibana. Uses the mime-type 'text/yaml'.
PUT
/v1/overlord/core/kb/config
Changes the current configuration for Kibana.
Input:
{
"HostFQDN":"<FQDN>",
"IP":"<host_ip>",
"OS":"<os_type>",
"KBPort":"<kibana_port>"
}
GET
/v1/overlord/aux
Returns basic information about auxiliary components.
FUTURE Work: Information will basically be a kind of Readme.
GET
/v1/overlord/aux/agent
Returns the current deployment status of dmon-agents.
{
"Agents": [
{
"Agent": false,
"NodeFQDN": "dice.cdh5.mng.internal"
},
{
"Agent": false,
"NodeFQDN": "dice.cdh5.w1.internal"
},
{
"Agent": false,
"NodeFQDN": "dice.cdh5.w2.internal"
},
{
"Agent": false,
"NodeFQDN": "dice.cdh5.w3.internal"
}
]
}
POST
/v1/overlord/aux/agent
Bootstraps the installation of dmon-agent services on nodes that are not marked as already active. It only works if all nodes have the same authentication.
GET
/v1/overlord/aux/deploy
Returns monitoring component status of all nodes. Similar to v2 of this resource.
{
{
"NodeFQDN":"<nodeFQDN>",
"NodeIP":"<nodeIP>",
"Monitored":"<boolean>",
"Collectd":"<status>",
"LSF":"<status>",
"LSInstance": "<ip_logstash>"
},
............................
}
NOTE: Marked as DEPRECATED. Will be deleted in future versions.
POST
/v1/overlord/aux/deploy
Deploys all auxiliary monitoring components on registered nodes and configures them.
NOTE: There are three statuses associated with each auxiliary component.
- None -> There is no aux component on the registered node
- Running -> There is the aux component on the registered node and it is currently running
- Stopped -> There is the aux component on the registered node and it is currently stopped
If the status is None then this resource will install and configure the monitoring components. However, if the status is Running nothing will be done. The services with status Stopped will be restarted.
All nodes can be restarted independent from their current state using the --redeploy-all parameter.
NOTE: Marked as DEPRECATED. Will be deleted in future versions. Use v2 version of the same resource for parallel implementation of this resource.
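The status rules for this resource amount to a small decision table, sketched below. This is an interpretation, not D-Mon code: the function name and the action labels are hypothetical, and treating --redeploy-all as "restart everything that is already installed" is an assumption.

```python
def plan_action(status, redeploy_all=False):
    """Decide what the deploy resource does for one auxiliary component."""
    if status in (None, "None"):
        # Nothing installed yet: install and configure the component.
        return "install"
    if status not in ("Running", "Stopped"):
        raise ValueError("unknown status: %r" % (status,))
    if redeploy_all:
        # Assumption: --redeploy-all restarts regardless of current state.
        return "restart"
    # Running components are left alone, stopped ones are restarted.
    return "skip" if status == "Running" else "restart"


for state in ("None", "Running", "Stopped"):
    print(state, "->", plan_action(state))
```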
POST
/v1/overlord/aux/deploy/{collectd|logstashfw}/{NodeName}
Deploys either collectd or logstash-forwarder to the specified node. In order to reload the configuration file the --redeploy parameter has to be set. If the current node status is None then the defined component (collectd or lsf) will be installed.
FUTURE Work: Currently configurations of both collectd and logstash-forwarder are global and can't be changed on a node by node basis.
GET
/v1/overlord/aux/interval
Returns the current polling time interval for all tools. This is a global setting. Future versions may be implemented for a node by node interval setting.
Output:
{
"Spark": "5",
"Storm": "60",
"System": "15",
"YARN": "15"
}
PUT
/v1/overlord/aux/interval
Changes the settings for all monitored systems.
Input:
{
"Spark": "5",
"Storm": "60",
"System": "15",
"YARN": "15"
}
GET
/v1/overlord/aux/{collectd|logstashfw}/config
Returns the current collectd or logstashforwarder configuration file.
PUT
/v1/overlord/aux/{collectd|logstashfw}/config
Changes the configuration/status of collectd or logstashforwarder and restarts all auxiliary components.
POST
/v1/overlord/aux/{auxComp}/start
Starts the specified auxiliary component on all nodes.
NOTE: This resource is DEPRECATED. Use v2 instead.
POST
/v1/overlord/aux/{auxComp}/stop
Stops the specified auxiliary components on all nodes.
NOTE: This resource is DEPRECATED. Use v2 instead.
POST
/v1/overlord/aux/{auxComp}/{nodeFQDN}/start
Starts the specified auxiliary component on a specific node.
NOTE: This resource is DEPRECATED. Use v2 instead.
POST
/v1/overlord/aux/{auxComp}/{nodeFQDN}/stop
Stops the specified auxiliary component on a specific node.
NOTE: This resource is DEPRECATED. Use v2 instead.
Note: Some resources have been redesigned with parallel processing in mind. These use greenlets (gevent) to parallelize as much as possible of the work done by the first version of the resources. These parallel resources are marked with ../v2/... All other functionality and return values are the same.
For the sake of brevity these resources will not be detailed. Only additional functionality will be documented.
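The fan-out idea behind the v2 resources can be illustrated as follows. D-Mon itself uses gevent greenlets; this sketch deliberately swaps in a thread pool from the Python standard library so that it runs without extra dependencies. The function names and the `fake_start` stand-in are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def run_on_all_nodes(nodes, action, max_workers=8):
    """Apply action(node) to all nodes concurrently.

    Returns a dict mapping each node to the result, or to the exception
    raised for that node, so one failing node does not hide the others.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {node: pool.submit(action, node) for node in nodes}
        for node, future in futures.items():
            try:
                results[node] = future.result()
            except Exception as exc:
                results[node] = exc
    return results


def fake_start(node):
    # Stand-in for a call to the dmon-agent on the given node.
    return "started on " + node


print(run_on_all_nodes(["dice.cdh5.w1.internal", "dice.cdh5.w2.internal"], fake_start))
```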
POST
/v2/overlord/aux/deploy
Sets up the dmon-agent based on the roles registered for each node.
POST
/v2/overlord/aux/{auxComp}/{nodeFQDN}/configure
Configures dmon-agent auxComp on node nodeFQDN.
POST
/v2/overlord/aux/{auxComp}/configure
Configures dmon-agent auxComp on all nodes.
GET
/v2/overlord/aux/deploy/check
Polls dmon-agents from the current monitored cluster.
{
"Failed": [],
"Message": "Nodes updated!",
"Status": "Update"
}
If nodes don't respond, they are added to the Failed list together with the appropriate HTTP error code.
GET
/v2/overlord/aux/status
Returns the current status of all nodes and auxiliary components:
Outputs:
{
"Aux Status": [
{
"Collectd": "Running",
"LSF": "Running",
"Monitored": true,
"NodeFQDN": "dice.cdh5.mng.internal",
"NodeIP": "109.231.121.135",
"Status": true
},
{
"Collectd": "Running",
"LSF": "Running",
"Monitored": true,
"NodeFQDN": "dice.cdh5.w1.internal",
"NodeIP": "109.231.121.194",
"Status": true
},
{
"Collectd": "Running",
"LSF": "Running",
"Monitored": true,
"NodeFQDN": "dice.cdh5.w2.internal",
"NodeIP": "109.231.121.134",
"Status": true
},
{
"Collectd": "Running",
"LSF": "Running",
"Monitored": true,
"NodeFQDN": "dice.cdh5.w3.internal",
"NodeIP": "109.231.121.156",
"Status": true
}
]
}
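A response of this form can be filtered for nodes that need attention. In the sketch below a node is flagged when Collectd or LSF is not "Running" or when Monitored is false; that criterion, the function name and the sample values are assumptions for illustration.

```python
def nodes_needing_attention(response):
    """Return (NodeFQDN, [components not running]) for problematic nodes."""
    flagged = []
    for node in response["Aux Status"]:
        not_running = [
            comp for comp in ("Collectd", "LSF")
            if node.get(comp) != "Running"
        ]
        if not_running or not node.get("Monitored"):
            flagged.append((node["NodeFQDN"], not_running))
    return flagged


# Hypothetical response in which one node has a stopped forwarder.
sample = {
    "Aux Status": [
        {"Collectd": "Running", "LSF": "Running", "Monitored": True,
         "NodeFQDN": "dice.cdh5.mng.internal", "Status": True},
        {"Collectd": "Running", "LSF": "Stopped", "Monitored": True,
         "NodeFQDN": "dice.cdh5.w1.internal", "Status": True},
    ]
}
print(nodes_needing_attention(sample))
# prints: [('dice.cdh5.w1.internal', ['LSF'])]
```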
POST
/v2/overlord/aux/{auxComp}/start
Start auxComp on all nodes using parallel calls to the dmon-agent.
POST
/v2/overlord/aux/{auxComp}/stop
Stops auxComp on all nodes using parallel calls to the dmon-agent.
POST
/v2/overlord/aux/{auxComp}/{nodeFQDN}/start
Start auxComp on node nodeFQDN using parallel calls to the dmon-agent.
POST
/v2/overlord/aux/{auxComp}/{nodeFQDN}/stop
Stops auxComp on node nodeFQDN using parallel calls to the dmon-agent.
GET
/v1/observer/applications
Returns a list of all YARN applications/jobs on the current monitored big data cluster.
NOTE: Scheduled for future release. After M18.
GET
/v1/observer/applications/{appID}
Returns information on a particular YARN application identified by {appID}. The information will not contain monitoring data, only a general overview, similar to the YARN History Server.
NOTE: Scheduled for future release. After M18
GET
/v1/observer/nodes
Returns the current monitored nodes list. Listing is only limited to node FQDN and current node IP.
NOTE: Some cloud providers assign the IP dynamically at VM startup. Because of this D-Mon treats the FQDN as a form of UUID. In future versions this might change, the FQDN being replaced/augmented with a hash.
Response:
{
"Nodes":[
{"<NodeFQDN1>":"NodeIP1"},
{"<NodeFQDN2>":"NodeIP2"},
{"<NodeFQDNn>":"NodeIPn"}
]
}
GET
/v1/observer/nodes/{nodeFQDN}
Returns information about a particular monitored node. The information is limited to non-confidential data; no authentication credentials are returned. The Status field is true if dmon-agent has already been deployed, while Monitored is true if it is also started. LSInstance identifies the logstash instance to which the node is assigned.
Response:
{
"<nodeFQDN>":{
"Status":"<boolean>",
"IP":"<NodeIP>",
"Monitored":"<boolean>",
"OS":"<Operating_System>",
"LSInstance": "<ip_of_logstash>"
}
}
GET
/v1/observer/nodes/{nodeFQDN}/roles
Returns roles of the node identified by 'nodeFQDN'.
Response:
{
"Roles":["yarn","spark"]
}
NOTE: Roles are returned as a list. Some elements represent in fact more than one service, for example 'yarn' represents both 'HDFS' and 'Yarn'.
POST
/v1/observer/query/{csv/json/plain}
Returns the required metrics in csv, json or plain format.
Input:
{
"DMON":{
"query":{
"size":"<SIZEinINT>",
"ordering":"<asc|desc>",
"queryString":"<query>",
"tstart":"<startDate>",
"tstop":"<stopDate>"
}
}
}
Output depends on the option selected by the user: csv, json or plain.
NOTE: The filter metrics must be in the form of a list. Also, filtering currently works only for CSV and plain output. Future versions will include the ability to export metrics in the form of RDF+XML in concordance with the OSLC Performance Monitoring 2.0 standard. Only system metrics will be exported in this format, not Big Data framework specific metrics.
From version 0.1.3 it is possible to omit the tstop parameter and instead define a time window relative to the current system time:
{
"DMON":{
"query":{
"size":"<SIZEinINT>",
"ordering":"<asc|desc>",
"queryString":"<query>",
"tstart":"now-30s"
}
}
}
where s stands for seconds, m for minutes and h for hours.
From version 0.2.0 it is possible to specify a custom index to be used in the query. The index definition supports the * wildcard character.
{
"DMON":{
"query":{
"size":"<SIZEinINT>",
"index":"logstash-*",
"ordering":"<asc|desc>",
"queryString":"<query>",
"tstart":"now-30s"
}
}
}
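The three query variants above differ only in which optional fields are present, so a single helper can build all of them. This is a sketch under assumptions: the function name is hypothetical, the check that a query without tstop must use a relative `now-<n><s|m|h>` window is inferred from the description above rather than taken from D-Mon, and the size value is passed through unchanged.

```python
import re

# Relative time window such as now-30s, now-5m or now-1h.
_RELATIVE = re.compile(r"^now-\d+[smh]$")


def build_query(query_string, size, ordering="desc",
                tstart="now-30s", tstop=None, index=None):
    """Build the body for POST /v1/observer/query/{csv/json/plain}."""
    if ordering not in ("asc", "desc"):
        raise ValueError("ordering must be 'asc' or 'desc'")
    if tstop is None and not _RELATIVE.match(tstart):
        # Assumption: without tstop, tstart must be a relative window.
        raise ValueError("tstart must look like now-30s when tstop is omitted")
    query = {
        "size": size,
        "ordering": ordering,
        "queryString": query_string,
        "tstart": tstart,
    }
    if tstop is not None:
        query["tstop"] = tstop
    if index is not None:
        # Custom index (version 0.2.0 and later); '*' is a wildcard.
        query["index"] = index
    return {"DMON": {"query": query}}


# "*" is used here as a placeholder query string.
print(build_query("*", 10, tstart="now-5m", index="logstash-*"))
```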
#License
DICE Monitoring Platform
Copyright (C) 2015 Gabriel Iuhasz, Institute e-Austria Romania
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.