I was previously tasked with looking into a viable monitoring system flexible enough to suit our plans for a well-monitored and orchestrated infrastructure, from apps to servers, possibly built entirely on open-source tools. I have yet to write up the specific test and installation methods and shortcuts (if any), but I will find time to do so.
I recently looked into check_mk + Nagios, and possibly a Nagios + collectd + Graphite kind of setup, with orchestration from Ansible. I must say I am a newbie to some of these open-source tools, but this adventure has taught me more than I expected to learn, and it has grown my curiosity and interest in going deeper into certain areas.
In this post I am sharing the Riemann monitoring system, which seems very flexible and promising, though I must say that getting Ruby to work nicely on Ubuntu can be quite a pain initially for a newbie who has not been exposed to a Ruby & Rails or Clojure kind of environment.
Future posts on Riemann will be shared soon. Meanwhile, have a look at Riemann’s introduction video from Riemann.io (the official site).
Riemann aggregates events from your servers and applications with a powerful stream processing language. Send an email for every exception raised by your code. Track the latency distribution of your web app. See the top processes on any host, by memory and CPU. Combine statistics from every Riak node in your cluster and forward to Graphite. Send alerts when a key process fails to check in. Know how many users signed up right this second.
Riemann provides low-latency, transient shared state for systems with many moving parts.
Riemann is a monitoring tool for distributed systems.
It seems to be named after the famous mathematician Bernhard Riemann.
Riemann and its configuration files are written in Clojure.
I came across Riemann recently, took a brief look at it and at some of the related links below, and found it pretty interesting.
A brief overview of Riemann:
You write a Riemann configuration file in Clojure.
This file describes what events from what systems (i.e. hosts on your network) you are interested in, and how you want Riemann to handle them.
Though Clojure is a Lisp, the Riemann config file syntax is easy to understand, even without looking at the documentation (for simple uses, anyway, such as in the easier examples).
Processing can include things like summarization (within or across hosts, event types, thresholds, etc.), grouping, filtering, emailing alerts to concerned entities based on events or the (conditional) results of processing events, and even some support for taking action on events, such as restarting a process that has failed.
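To make the filtering, grouping, summarization and threshold ideas a bit more concrete, here is a rough sketch in plain Python. It is not Riemann's API or its Clojure config syntax; the function names (`where`, `mean_by_host`, `over_threshold`) and the sample events are made up purely for illustration.

```python
from collections import defaultdict
from statistics import mean


def where(events, predicate):
    """Filtering: keep only the events that satisfy the predicate."""
    return [e for e in events if predicate(e)]


def mean_by_host(events):
    """Grouping plus summarization: average metric per host."""
    groups = defaultdict(list)
    for e in events:
        groups[e["host"]].append(e["metric"])
    return {host: mean(values) for host, values in groups.items()}


def over_threshold(summary, limit):
    """Thresholding: hosts whose summarized metric exceeds the limit."""
    return sorted(host for host, value in summary.items() if value > limit)


# Made-up sample events; the values are purely illustrative.
events = [
    {"host": "web1", "service": "cpu", "metric": 0.9},
    {"host": "web1", "service": "cpu", "metric": 0.7},
    {"host": "web2", "service": "cpu", "metric": 0.2},
    {"host": "web2", "service": "mem", "metric": 0.95},
]

cpu_events = where(events, lambda e: e["service"] == "cpu")
cpu_summary = mean_by_host(cpu_events)
hosts_to_alert = over_threshold(cpu_summary, limit=0.5)
print(hosts_to_alert)
```

In a real setup the last step would send an email or trigger an action instead of printing, and the events would arrive continuously rather than from a fixed list.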
Riemann also has integrations with Graphite and Librato Metrics.
Riemann clients, which can of course be servers of various kinds, send events to Riemann using (Google’s) Protocol Buffers, over TCP or UDP.
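As far as I understand the wire format, a message sent over TCP is preceded by a 4-byte big-endian length header, while a UDP datagram carries the serialized message on its own. Treat that as my assumption and check the client-writing guide before relying on it. The sketch below only shows the framing step; the payload is an opaque byte string standing in for a serialized Protocol Buffers message, and the helper names are hypothetical.

```python
import struct


def frame_for_tcp(payload: bytes) -> bytes:
    """Prefix the payload with its length as a 4-byte big-endian integer.

    Assumption: this matches Riemann's TCP framing; verify against the
    official client guide.
    """
    return struct.pack("!I", len(payload)) + payload


def split_frame(data: bytes):
    """Return (payload, remaining bytes), or (None, data) if incomplete."""
    if len(data) < 4:
        return None, data
    (length,) = struct.unpack("!I", data[:4])
    if len(data) < 4 + length:
        return None, data
    return data[4:4 + length], data[4 + length:]


# Stand-in bytes; a real client would serialize a protobuf message here.
framed = frame_for_tcp(b"serialized-msg")
payload, rest = split_frame(framed + b"next")
```

Existing client libraries take care of this for you, so it matters mainly if you write your own client.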
Client libraries for Riemann are available for several popular languages, and there is a guide to writing your own client.
Bernhard is a Python client library for Riemann.
It seems straightforward to use for simple cases:
you import the Client class from Bernhard, create an instance of it, and call methods on it, to send events that are of interest to Riemann, to be processed and acted upon.
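A minimal sketch of that flow, assuming Bernhard is installed and a Riemann server is listening on 127.0.0.1:5555 (both are assumptions about your environment). The helpers `build_event`, `report` and `send_with_bernhard` are my own illustrative names, not part of Bernhard; only `Client` and its `send` method come from the library.

```python
import socket


def build_event(service, metric, state="ok"):
    """Build an event dict with host, service, metric and state fields."""
    return {
        "host": socket.gethostname(),
        "service": service,
        "metric": metric,
        "state": state,
    }


def report(client, service, metric, warn_at):
    """Send one event through any object that exposes send(dict)."""
    state = "warning" if metric >= warn_at else "ok"
    event = build_event(service, metric, state)
    client.send(event)
    return event


def send_with_bernhard(service, metric, warn_at):
    """Send a single event via Bernhard.

    Assumes the bernhard package is installed and a Riemann server is
    reachable at 127.0.0.1:5555 (hypothetical local setup).
    """
    from bernhard import Client  # imported here so the helpers above work without it

    client = Client(host="127.0.0.1", port=5555)
    return report(client, service, metric, warn_at)
```

Calling `send_with_bernhard("demo load", 0.42, warn_at=1.0)` would send one event with state "ok"; what Riemann then does with it depends on your config file.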
Riemann also comes with a web dashboard written with Ruby and Sinatra.
I wrote to Kyle Kingsbury, the creator of Riemann, and he said it is used by a few big organizations like the BBC, The Guardian, and Blue Mountain Capital.
Related links for Riemann:
A Python wrapper for Riemann that uses Bernhard:
A Node.js tool inspired partly by Riemann:
– Vasudev Ram
via Planet Python http://jugad2.blogspot.com/2013/06/riemann-and-bernhard-distributed.html