Solving the open source problem for nToggle

June 22, 2015
By Ashley Penney

The nToggle operations team faces a problem that most SaaS platforms never encounter: part of our software must be deployed on servers hosted by our customers, across a wild variety of data centers and cloud environments.

To make this work, we’ve built a deployment architecture driven by Ansible and Docker. If you haven’t heard of these tools before, Ansible is a powerful automation system that lets you orchestrate the deployment of software across large numbers of servers. Docker is a service that runs and manages containers (bundles that package software together with its dependencies), which lets us abstract away the underlying operating systems we’re deployed on and gives us a more uniform operational environment.

When we onboard new customers, we rely on these tools to perform a fully automated deployment of our nRoute virtual appliance. nRoute is our high-performance Scala router, designed to handle massive throughput while keeping latency extremely low. If this sounds like a challenging problem you’re interested in, our engineering team is hiring and would love to speak with you!

One of the earliest problems we faced was how to monitor both the base servers and the deployed software in environments where we need as small a footprint as possible. We turned to the wonderful cloud monitoring platform built by Datadog, which lets us centralize all of our monitoring in one location that is easily accessible to all of our customers.

Datadog provides a robust API that lets you create graphs, screenboards to display those graphs, monitors, and more. As an operations team, we want to ensure that we never have to manage monitoring by hand and that any new deployment automatically creates all of the dashboards and monitors we require.
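To make the idea of creating a monitor through the API concrete, here is a minimal Python sketch. It is not our production code: it assumes Datadog's v1 monitor endpoint and its `DD-API-KEY` / `DD-APPLICATION-KEY` headers, and the function name, monitor name, query, and host tag are hypothetical placeholders. The request is only built, never sent, so it runs without credentials.

```python
import json
import urllib.request

# Assumption: the v1 monitor endpoint of Datadog's public HTTP API.
DATADOG_MONITOR_URL = "https://api.datadoghq.com/api/v1/monitor"


def build_monitor_request(api_key, app_key, name, query, message, tags=()):
    """Build (but do not send) a request that would create a metric monitor.

    Hypothetical helper for illustration only.
    """
    payload = {
        "type": "metric alert",
        "name": name,
        "query": query,
        "message": message,
        "tags": list(tags),
    }
    return urllib.request.Request(
        DATADOG_MONITOR_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "DD-API-KEY": api_key,
            "DD-APPLICATION-KEY": app_key,
        },
        method="POST",
    )


# Placeholder values; a real deployment would inject keys and host names.
request = build_monitor_request(
    api_key="example-api-key",
    app_key="example-app-key",
    name="High CPU on example-host",
    query="avg(last_5m):avg:system.cpu.user{host:example-host} > 90",
    message="CPU usage is high on example-host.",
    tags=["service:example"],
)
```

Passing `request` to `urllib.request.urlopen` would actually submit it; keeping construction separate from sending makes the payload easy to inspect and test.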

Thanks to the hard work of our fantastic interns, we have begun writing Ansible modules for the functionality that the Datadog API offers.

We wrote a Datadog monitor module that was quickly picked up by the Ansible community and improved further, and we have also submitted a Datadog_tags module that allows you to manage the tags associated with hosts. We’re now working on Datadog_screenboard and Datadog_downtime modules so that we can fully automate everything a hands-off deployment requires.
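The core of a module like this is an idempotent check: compare the monitor you want with what already exists and act only when they differ, so repeated deployments do not create duplicates. The following Python sketch illustrates that decision step only. It is not the code of the modules mentioned above; the function name and the dictionary fields (`id`, `name`, `query`, `message`) are assumptions made for the example.

```python
def plan_monitor_change(desired, existing_monitors):
    """Decide what to do so that `desired` ends up present exactly once.

    Returns a tuple (action, monitor_id) where action is one of
    "create", "update", or "unchanged". Monitors are matched by name.
    Hypothetical helper for illustration only.
    """
    for monitor in existing_monitors:
        if monitor["name"] == desired["name"]:
            # Only the fields we manage are compared; extra fields
            # on the existing monitor are ignored.
            differs = any(monitor.get(key) != value
                          for key, value in desired.items())
            return ("update" if differs else "unchanged", monitor["id"])
    return ("create", None)


# Placeholder data standing in for a list of existing monitors.
existing = [
    {"id": 1, "name": "High CPU", "query": "q1", "message": "m1"},
]
same = {"name": "High CPU", "query": "q1", "message": "m1"}
changed = {"name": "High CPU", "query": "q2", "message": "m1"}
new = {"name": "Low disk", "query": "q3", "message": "m3"}
```

With this data, `same` needs no change, `changed` maps to an update of monitor 1, and `new` maps to a create. A run that reports "unchanged" for everything confirms the deployment is already in the desired state.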

As we grow the team, we plan to keep contributing back to the projects that underpin our operations and to work within the community to constantly improve the toolchain for everyone! We’re actively hiring DevOps engineers to help us further this work, so please reach out to us if you’re interested in helping us tackle these kinds of improvements.
