This project is based on the Simple Metrics Collector, which is a simple way of collecting web analytics and storing the data in a Cloudant database. This project breaks that concept down using a microservices architecture: instead of just writing data to a Cloudant database, it sends data to a variety of outputs, depending on a runtime environment variable:
- `stdout` - to the terminal only
- `redis_queue` - to a Redis queue
- `redis_pubsub` - to a Redis pubsub channel
- `rabbit_queue` - to a RabbitMQ queue
- `rabbit_pubsub` - to a RabbitMQ pubsub channel
- `kafka` - to an Apache Kafka or IBM Message Hub topic
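As a rough illustration of the pattern, a dispatch table keyed on `QUEUE_TYPE` might look like the sketch below. This is hypothetical code, not the app's actual implementation; only the `stdout` writer is filled in, and the other entries are stubs marking where each client call would go.

```javascript
// Hypothetical sketch of dispatching events by QUEUE_TYPE; the real
// server.js may be structured differently.
const writers = {
  stdout:        (event) => process.stdout.write(JSON.stringify(event) + '\n'),
  redis_queue:   (event) => { /* LPUSH onto a Redis list */ },
  redis_pubsub:  (event) => { /* PUBLISH to a Redis channel */ },
  rabbit_queue:  (event) => { /* sendToQueue on an AMQP channel */ },
  rabbit_pubsub: (event) => { /* publish to a fanout exchange */ },
  kafka:         (event) => { /* produce to a Kafka/Message Hub topic */ }
};

function write(event) {
  const type = process.env.QUEUE_TYPE || 'stdout'; // 'stdout' is the default
  const writer = writers[type];
  if (!writer) throw new Error('Unknown QUEUE_TYPE: ' + type);
  return writer(event);
}
```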
The fastest way to deploy this application to Bluemix is to click this Deploy to Bluemix button. Or, if you prefer working from the command line, skip to the Deploy Manually section.
Don't have a Bluemix account? If you haven't already, you'll be prompted to sign up for a Bluemix account when you click the button. Sign up, verify your email address, then return here and click the Deploy to Bluemix button again. Your new credentials let you deploy to the platform and also to code online with Bluemix and Git. If you have questions about working in Bluemix, find answers in the Bluemix Docs.
If you haven't already, install the Cloud Foundry command line interface and connect to Bluemix.
To deploy to Bluemix, simply:
```sh
$ cf push
```
Note: You may notice that Bluemix assigns a URL to your application containing a random word. This is defined in the `manifest.yml` file, where the `random-route` key is set to `true`. This ensures that multiple people deploying this application to Bluemix do not run into naming collisions. To specify your own route, remove the `random-route` line from the `manifest.yml` file and add a `host` key with the unique value you would like to use for the host name.
Privacy Notice: This web application includes code to track deployments to IBM Bluemix and other Cloud Foundry platforms. Tracking helps us measure our samples' usefulness, so we can continuously improve the content we offer to you. The following information is sent to a Deployment Tracker service on each deployment:
- Application Name (`application_name`)
- Space ID (`space_id`)
- Application Version (`application_version`)
- Application URIs (`application_uris`)
This data is collected from the `VCAP_APPLICATION` environment variable in IBM Bluemix and other Cloud Foundry platforms. IBM uses this data to track metrics around deployments of sample applications to Bluemix.
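For illustration, the four fields listed above could be pulled out of `VCAP_APPLICATION` as in this sketch; the actual `cf-deployment-tracker-client` code may differ.

```javascript
// Sketch: extract the tracked fields from the VCAP_APPLICATION JSON.
// Field names match the list above; everything else is ignored.
function deploymentInfo(vcapApplicationJson) {
  const app = JSON.parse(vcapApplicationJson || '{}');
  return {
    application_name: app.application_name,
    space_id: app.space_id,
    application_version: app.application_version,
    application_uris: app.application_uris
  };
}
```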
To disable deployment tracking, remove the following line from `server.js`:
```js
require("cf-deployment-tracker-client").track();
```
Once that line is removed, you may also uninstall the `cf-deployment-tracker-client` npm package.
You can configure the installation by adding a number of custom environment variables and then restarting the application.
The value of `QUEUE_TYPE` can be one of `stdout`, `redis_queue`, `redis_pubsub`, `rabbit_queue`, `rabbit_pubsub` or `kafka`. If a value is not set, `stdout` is assumed.
The value of `QUEUE_NAME` determines which queue/topic the data is written to. If omitted, it takes the following default for each queue type:
- `stdout` - n/a
- `redis_queue` - `mcqueue`
- `redis_pubsub` - `mcpubsub`
- `rabbit_queue` - `mcqueue`
- `rabbit_pubsub` - `mcpubsub`
- `kafka` - `mcqueue`
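Putting the two variables together, the configuration could be resolved as in this sketch (a hypothetical helper, not taken from server.js):

```javascript
// Default queue/topic name for each QUEUE_TYPE, per the table above.
const DEFAULT_QUEUE_NAMES = {
  stdout: null, // n/a - nothing is written to a queue
  redis_queue: 'mcqueue',
  redis_pubsub: 'mcpubsub',
  rabbit_queue: 'mcqueue',
  rabbit_pubsub: 'mcpubsub',
  kafka: 'mcqueue'
};

// Resolve QUEUE_TYPE and QUEUE_NAME from an environment object,
// falling back to 'stdout' and the per-type default name.
function resolveQueueConfig(env) {
  const type = env.QUEUE_TYPE || 'stdout';
  if (!(type in DEFAULT_QUEUE_NAMES)) {
    throw new Error('Unknown QUEUE_TYPE: ' + type);
  }
  return { type, name: env.QUEUE_NAME || DEFAULT_QUEUE_NAMES[type] };
}
```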
`ETCD_URL` determines which Etcd instance should be used for the Service Registry.
The Service Registry allows the Metrics Collector Microservice to be utilised by the Simple Search Service to log searches. This is achieved by using the Simple Service Registry module.
`VCAP_SERVICES` is created for you by the Bluemix Cloud Foundry service. It defines the credentials of the attached services that this app can connect to.
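A minimal sketch of reading credentials out of `VCAP_SERVICES`; the service name `compose-for-redis` and the `uri` field below are only illustrative assumptions.

```javascript
// Sketch: look up the first bound instance of a named service in a
// VCAP_SERVICES-style JSON string and return its credentials object.
function getServiceCredentials(vcapJson, serviceName) {
  const services = JSON.parse(vcapJson || '{}');
  const entries = services[serviceName];
  return entries && entries.length ? entries[0].credentials : null;
}

// Example VCAP_SERVICES-style payload (hypothetical service name):
const vcap = JSON.stringify({
  'compose-for-redis': [{ credentials: { uri: 'redis://localhost:6379' } }]
});
```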
Once the application is installed and configured, your web page needs code inserted into it to allow data to be collected, e.g.:
```html
<html>
  <body>
    <div>
      <a href="https://www.google.com" title="this will be tracked">Tracked Link</a>
    </div>
    <div>
      <a href="#" onclick="javascript:_paq.push(['trackEvent', 'Menu', 'Freedom']);" title="this will be tracked">Async Tracked Link</a>
    </div>
    <script type="text/javascript">
      var _paq = _paq || [];
      _paq.push(['trackPageView']);
      _paq.push(['enableLinkTracking']);
      (function() {
        var u="http://mydomain.mybluemix.net/";
        _paq.push(['setTrackerUrl', u+'tracker']);
        _paq.push(['setSiteId', "mysite"]);
        var d=document, g=d.createElement('script'), s=d.getElementsByTagName('script')[0];
        g.type='text/javascript'; g.async=true; g.defer=true; g.src=u+'piwik.js'; s.parentNode.insertBefore(g,s);
      })();
    </script>
  </body>
</html>
```
The main script tag loads the `piwik.js` JavaScript from the server and records a page-tracking event. It also ensures that any link clicks are tracked too (`enableLinkTracking`). The example above also shows how asynchronous actions can be recorded by calling `_paq.push` when the event occurs.
The only things you need to alter in this code snippet are the URL assigned to the variable `u`, which should be the URL of your installation, and the value passed to `setSiteId`.
Streaming events to `stdout` is the default behaviour of the Simple Logging Service. Simply run the app and events will appear in the terminal.
Read the Getting Started with Redis on Compose.io guide. Spin up a new Redis cluster on Compose.io and feed the credentials you get into your Bluemix Redis by Compose service.
Define your environment variable and run the process
```sh
> export QUEUE_TYPE=redis_queue
> node server.js
Queue mode: redis_queue
Connecting to Redis server on localhost:6379
CDS Labs Simple Logging Service started on port 8081 : Thu Nov 26 2015 16:32:15 GMT+0000 (GMT)
```
After generating some data in your web application, you can use the Redis command-line interface to check the collected data. The `LLEN` command tells you how many items have accumulated on the queue:
```sh
> redis-cli
127.0.0.1:6379> LLEN mcqueue
(integer) 26
```
while the `RPOP` command retrieves the oldest item on the queue:
```sh
> redis-cli
127.0.0.1:6379> RPOP mcqueue
"{\"action_name\":\"\",\"idsite\":\"mysite\",\"rec\":1,\"r\":176450,\"h\":16,\"m\":28,\"s\":14,\"url\":\"http://localhost:8000/metrics.html#\",\"$_id\":\"772aa0d070215d3b\",\"$_idts\":1448553217,\"$_idvc\":1,\"$_idn\":0,\"$_refts\":0,\"$_viewts\":1448553217,\"cs\":\"windows-1252\",\"send_image\":0,\"pdf\":1,\"qt\":0,\"realp\":0,\"wma\":0,\"dir\":0,\"fla\":1,\"java\":1,\"gears\":0,\"ag\":0,\"cookie\":1,\"res\":\"1440x900\",\"gt_ms\":7,\"type\":\"pageView\",\"ip\":\"::1\"}"
```
Note: if you have supplied a `QUEUE_NAME` environment variable, then use that value rather than `mcqueue` in the above examples.
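A consumer popping items off the queue receives each event as a JSON string like the one shown above; decoding it is a one-liner, sketched here using field names taken from that sample payload:

```javascript
// Sketch: decode an event string popped from the queue (e.g. via RPOP)
// and pick out a few fields from the sample payload shown above.
function decodeEvent(raw) {
  const event = JSON.parse(raw);
  return { type: event.type, site: event.idsite, url: event.url };
}

// Abbreviated example of a raw queue item:
const raw = '{"idsite":"mysite","url":"http://localhost:8000/metrics.html#","type":"pageView"}';
```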
Define your environment variable and run the process.
```sh
> export QUEUE_TYPE=redis_pubsub
> node server.js
Queue mode: redis_pubsub
Connecting to Redis server on localhost:6379
CDS Labs Simple Logging Service started on port 8081 : Thu Nov 26 2015 16:32:15 GMT+0000 (GMT)
```
Using the Redis command-line interface, you can subscribe to the pubsub channel (`mcpubsub`, or the value of `QUEUE_NAME` you supplied):
```sh
> redis-cli
127.0.0.1:6379> SUBSCRIBE mcpubsub
Reading messages... (press Ctrl-C to quit)
1) "subscribe"
2) "mcpubsub"
3) (integer) 1
```
As you generate data in the application, you see it appear in your `redis-cli` terminal:
```sh
1) "message"
2) "mcpubsub"
3) "{\"action_name\":\"\",\"idsite\":\"mysite\",\"rec\":1,\"r\":578292,\"h\":16,\"m\":35,\"s\":44,\"url\":\"http://localhost:8000/metrics.html#\",\"$_id\":\"772aa0d070215d3b\",\"$_idts\":1448553217,\"$_idvc\":1,\"$_idn\":0,\"$_refts\":0,\"$_viewts\":1448553217,\"cs\":\"windows-1252\",\"send_image\":0,\"pdf\":1,\"qt\":0,\"realp\":0,\"wma\":0,\"dir\":0,\"fla\":1,\"java\":1,\"gears\":0,\"ag\":0,\"cookie\":1,\"res\":\"1440x900\",\"gt_ms\":13,\"type\":\"pageView\",\"ip\":\"::1\"}"
```
Read the Getting Started with RabbitMQ on Compose.io guide. You need to create a RabbitMQ cluster and create a user with `.*` access, as described in that document. As the Compose.io RabbitMQ service is very new and there isn't a Bluemix service for it yet, you need to define the URL of your RabbitMQ service as a custom environment variable `RABBITMQ_URL` in Bluemix or in the local environment:

```sh
export RABBITMQ_URL=amqps://myrabbbituser:[email protected]:10705/amazing-rabbitmq-72
```

or

```sh
export RABBITMQ_URL=amqp://localhost
```
Define your environment variable and run the process
```sh
> export QUEUE_TYPE=rabbit_queue
> node server.js
Queue mode: rabbit_queue
Connecting to Rabbit MQ server on amqps:*****@aws-us-east-1-portal.8.dblayer.com:10705/dazzling-rabbitmq-72
CDS Labs Simple Logging Service started on port 8081 : Fri Nov 27 2015 14:04:35 GMT+0000 (GMT)
Connected to RabbitMQ queue 'mcqueue'
```
After generating some data in your web application, you should be able to use Compose.io's RabbitMQ Admin page to see the data coming in:
Define your environment variable and run the process.
```sh
> export QUEUE_TYPE=rabbit_pubsub
> node server.js
Queue mode: rabbit_pubsub
Connecting to Rabbit MQ server on amqps:*****@aws-us-east-1-portal.8.dblayer.com:10705/dazzling-rabbitmq-72
CDS Labs Simple Logging Service started on port 8081 : Fri Nov 27 2015 15:08:53 GMT+0000 (GMT)
Connected to RabbitMQ pubsub channel 'mcpubsub'
```
After generating some data in your web application, you should be able to use Compose.io's RabbitMQ Admin page to see the data coming in:
Create a Message Hub instance in Bluemix. Bluemix will create the necessary environment variables.
Define your environment variable and run the process.
```sh
> export QUEUE_TYPE=kafka
> node server.js
Queue mode: kafka
Connecting to Kafka MQ server
CDS Labs Simple Logging Service started on port 8081 : Fri Nov 27 2015 15:57:31 GMT+0000 (GMT)
Created topic 'mcqueue'
```
There is a realtime web output of each item logged at `/output`. Load this page and wait for events to be displayed at the front end. This page does not load any historical data.
The Simple Logging Service is a Bluemix app that collects web metrics. Instead of storing the metrics directly in a database, it writes the data to a choice of queues (Redis, RabbitMQ and Apache Kafka). You can run this app on many instances to share the data collection load and couple it with other microservices that consume and analyse the data. It could serve as the basis of a high-volume metrics collection service.
© "Apache", "CouchDB", "Apache CouchDB" and the CouchDB logo are trademarks or registered trademarks of The Apache Software Foundation. All other brands and trademarks are the property of their respective owners.