With DevOps now firmly in the enterprise's sights, visibility into automation is becoming critical. Having a view of the state of the delivery pipeline increases awareness of the system as a whole. The question that often arises is “Yes, we need monitoring, but what do we monitor?” There are numerous components of your environments that you should be monitoring, and this blog explores what they are, how to track changes in your cloud environment and why it’s important to monitor them.
Often, we find people jumping straight into tools and leaving the thinking for later. It is best for companies to first understand why they want to monitor and what they want to monitor before picking tools. Once you know what you want to monitor and why, it is easy to find services that offer a trial period or free tier, so you can test the monitoring integration and its benefits for the organisation.
Infrastructure and service monitoring were around long before DevOps, so how does DevOps really affect monitoring strategy, and is DevOps even needed for monitoring? Strangely, yes, in a way.
While monitoring predates DevOps, DevOps has furthered the software development process to such a degree that monitoring can't help but evolve as well. As a community, we are beyond writing cool application code; we are now writing cool infrastructure as code, automating integration and testing, and deploying everything in the cloud. Generally, the pace of development has increased, which imposes greater load on the customer feedback loop and deployment tooling. There is more to monitor, so where we use DevOps-style tooling to automate integration, testing, provisioning, and deployment, we need to use DevOps-style tooling to monitor our builds, resources, and performance.
Monitoring Categories Explained
You will likely want to cover at least one aspect of each category listed below:
With strong DevOps adoption in your organisation, monitoring development milestones gives you good visibility into how effective your team is. How often are you hitting bugs? When are they fixed? How long does it take to get a new release out? There are many issue tracking systems out there, but it’s important to seek tools that integrate with your other DevOps systems, like your ChatOps or CD pipeline technologies, so that the monitoring can eventually drive event-driven changes to your development systems. There are a lot of metrics, though, so you should make sure your monitoring answers a real question, such as “What drivers caused us to deliver releases every month rather than every week?”
Possible Solutions: JIRA
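As a rough illustration of the release and bug questions above, here is a minimal Python sketch that turns milestone dates into two numbers: the average gap between releases and the average time to fix a bug. The function names and sample dates are made up for illustration; in practice the dates would come from an export of your issue tracker.

```python
from datetime import date
from statistics import mean


def mean_days_between(release_dates):
    """Average gap in days between consecutive releases (None if fewer than two)."""
    ordered = sorted(release_dates)
    gaps = [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]
    return mean(gaps) if gaps else None


def mean_days_to_fix(bugs):
    """Average days from opened to closed, counting closed bugs only."""
    durations = [(closed - opened).days for opened, closed in bugs if closed is not None]
    return mean(durations) if durations else None


# Hypothetical sample data: three releases and three bugs (one still open).
releases = [date(2024, 1, 8), date(2024, 1, 15), date(2024, 2, 12)]
bugs = [
    (date(2024, 1, 2), date(2024, 1, 5)),
    (date(2024, 1, 10), None),
    (date(2024, 1, 20), date(2024, 1, 27)),
]

print("Mean days between releases:", mean_days_between(releases))
print("Mean days to fix a bug:", mean_days_to_fix(bugs))
```

Tracking a couple of numbers like these over time is usually enough to show whether releases are slipping from weekly towards monthly.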
There are two types of monitoring to think about when it comes to vulnerabilities. First, there are the vulnerabilities listed in the National Vulnerability Database, coupled with vulnerabilities that may be introduced through your own insecure coding practices. Second, there are the vulnerabilities that arise from poor infrastructure and server security. Coding vulnerabilities are a much larger subject and should be looked at independently, but there are a lot of tools out there today that can monitor (even visually) your environments for vulnerabilities, and integrate with your other automation systems to prevent the release of environments that may introduce insecurities to your design.
Possible Solutions: Trend Micro, Nessus, Hava.io
Deployment monitoring is sometimes as straightforward as configuring your build servers to notify the team or a designated team member that something is wrong. These notifications are cheap (i.e., they are easy to set up), but very important, so it pays to have this process fail loudly. Chances are, if you are already using DevOps, you already have some monitoring built into your process. Many continuous integration servers are notification-capable and can communicate with chat servers to alert teams of failed builds and deployments. As an extension of this, once you have implemented continuous delivery, you want to be able to monitor changes to your environments in real time.
Possible Solutions: AWS Trusted Advisor, Hava.io
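To show how cheap a "fail loudly" notification can be, here is a minimal Python sketch of a post-build step that stays quiet on success and posts to a chat webhook otherwise. Everything specific here is an assumption: the function names, the example URLs, and the JSON payload with a single "text" field (a shape many chat webhooks accept, but check what your chat server expects).

```python
import json
import urllib.request


def build_alert(job, status, build_url):
    """Return a chat message for an unsuccessful build, or None on success."""
    if status == "success":
        return None  # Stay quiet when things work, so failures stand out.
    return {"text": f"BUILD {status.upper()}: {job} ({build_url})"}


def send_alert(webhook_url, message):
    """POST the message as JSON to a chat webhook and return the HTTP status."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(message).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


# Hypothetical usage from a post-build step (URLs are placeholders):
# alert = build_alert("api-deploy", "failed", "https://ci.example.com/builds/42")
# if alert is not None:
#     send_alert("https://chat.example.com/hooks/your-hook-id", alert)
```

Keeping the message building separate from the sending makes the logic easy to test without a chat server.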
Often we see companies enable log collection, and it ends there. How many times have we found servers configured to ship logs to a syslog server, only to look and discover that someone turned that server off a year ago? There is a wealth of knowledge about your applications and systems in your logs, and they should be monitored appropriately. Not only should you be aggregating and monitoring the content of the logs themselves, you also need to monitor the log collection system itself and ensure you’re not losing data.
Possible Solutions: Splunk, ELK Stack, PaperTrail, CloudWatch
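One simple way to monitor the log collection system itself is to check when each source last delivered a log line and flag the ones that have gone quiet. The Python sketch below assumes you can obtain a "last seen" timestamp per host from your log platform; the function name, host names and 15-minute threshold are illustrative only.

```python
from datetime import datetime, timedelta


def silent_sources(last_seen, now, max_silence=timedelta(minutes=15)):
    """Return the hosts whose most recent log line is older than max_silence."""
    return sorted(host for host, seen in last_seen.items() if now - seen > max_silence)


# Hypothetical snapshot of the latest log timestamp received per host.
now = datetime(2024, 1, 1, 12, 0)
last_seen = {
    "web-1": datetime(2024, 1, 1, 11, 55),
    "web-2": datetime(2024, 1, 1, 11, 44),
    "db-1": datetime(2024, 1, 1, 10, 0),
}

for host in silent_sources(last_seen, now):
    print(f"No logs received from {host} in the last 15 minutes")
```

A check like this, run on a schedule, would catch the switched-off syslog server within minutes rather than a year later.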
The original beast, server monitoring is the most obvious monitoring that IT departments think of. Traditional server health, uptime, performance and resource management monitoring is still important and will be required for some time to come. But you need to be going further than this. There are now services and tools out there that let you actively alter your server and environment layout based on the health monitoring implemented in your environment, and you should be leveraging this. Coupled with this, it is becoming more and more critical to integrate application performance monitoring into your monitoring services to get an overall system and business view of the impact service health is having on your business.
Possible Solutions: Datadog, NewRelic, SumoLogic
Whilst server health has traditionally been the key monitoring category that companies have focused on, it’s now increasingly important for businesses to adopt a healthy ecosystem of aggregated monitoring across all of the categories that I’ve mentioned in this blog. I’d recommend that you start investigating and adopting free or trial applications in each of these categories to ensure you’re growing your monitoring capabilities at the same rate that you adopt your DevOps culture and tooling.