In our last article we looked at what performance and capacity mean when it comes to technology, and why assessing both is important for maintaining high productivity.
In this article we explore the importance of monitoring, how to utilise monitoring tools, what kind of information they can provide you and how to interpret that information to ensure you are planning for the future.
Why and how do we monitor performance and capacity?
Monitoring is essential. We want to proactively check that we have enough hard drive space, CPU, RAM, network and internet performance to meet the needs of the business now and into the short-term future. This helps avoid situations that cause a lot of downtime, such as being unable to save files because of low disk space, or crushing performance slowdowns. By investing in proactive monitoring, we avoid a reduction in productivity.
The easiest way to make sure you are monitored is to check that your IT service provider has you covered! An IT service provider should have the tools in place to monitor your devices and manage the complexity for you.
However, if you have the expertise, time and money to do it yourself, there are a number of options!
Spiceworks is free and funded by advertising. It has a very large community behind it and can do the basics quite well.
PRTG is a much more complex tool that is really powerful for network monitoring and graphing. There is a free option, but costs rise pretty quickly once you outgrow it.
Nagios is a very mature, open source monitoring tool. Open source means that it is free to use, and there is a big community behind it. However, setup is very complex and requires time and expertise!
What can you monitor with monitoring tools?
There are many metrics to monitor over time to ensure you are covered. Below are some of the most critical metrics you should be keeping an eye on and checking for trends.
CPU usage
How often is your computer using 100% of its CPU? If this happens frequently, it is often an indication that your computer is not fit for purpose. If you are maxing out your CPU for extended periods of time, the user takes a significant performance hit and the cause should be investigated.
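To make "maxed out for extended periods" concrete, here is a minimal sketch that looks for sustained runs of high CPU readings rather than one-off spikes. The 95% threshold, the five-sample minimum and the sample readings are all assumptions for illustration, not values from any particular monitoring tool.

```python
from typing import Sequence


def longest_maxed_run(samples: Sequence[float], threshold: float = 95.0) -> int:
    """Return the longest run of consecutive samples at or above threshold."""
    longest = current = 0
    for value in samples:
        if value >= threshold:
            current += 1
            longest = max(longest, current)
        else:
            current = 0  # the run is broken by a lower reading
    return longest


def is_cpu_saturated(samples: Sequence[float],
                     threshold: float = 95.0,
                     min_consecutive: int = 5) -> bool:
    """Flag the machine only when high usage is sustained, not momentary."""
    return longest_maxed_run(samples, threshold) >= min_consecutive


# Hypothetical CPU readings in percent, one per minute.
readings = [40, 100, 100, 100, 100, 100, 100, 35, 100, 20]
print(longest_maxed_run(readings))  # 6
print(is_cpu_saturated(readings))   # True
```

Counting consecutive samples is one simple way to separate a brief spike (normal) from sustained saturation (worth investigating); a real tool would likely let you tune both numbers.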
Memory usage
Memory (RAM) usage consistently sitting above 70% indicates that you are over capacity and may need to upgrade. There are certain circumstances where this doesn’t hold true (e.g. servers hosting databases), however in general, consistently high memory usage is a sign of performance issues.
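One way to judge whether memory usage is "consistently" above 70% is to measure what share of your samples exceed that line. The sketch below assumes you already have a list of memory readings in percent; the sample data and the idea of treating 80% of samples as "consistent" are illustrative assumptions.

```python
from typing import Sequence


def share_above(samples: Sequence[float], threshold: float = 70.0) -> float:
    """Return the fraction of samples strictly above threshold (0.0 to 1.0)."""
    if not samples:
        return 0.0
    return sum(1 for s in samples if s > threshold) / len(samples)


# Hypothetical memory usage readings in percent, taken at regular intervals.
memory_percent = [62, 71, 75, 80, 78, 69, 74, 82, 77, 73]

share = share_above(memory_percent)
print(f"{share:.0%} of samples above 70%")  # 80% of samples above 70%

# Assumption: call usage "consistently high" when 80% or more of samples exceed it.
print(share >= 0.8)  # True
```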
Disk queue length
This metric indicates how many read and write requests are waiting on your hard drive, which reflects how long it is taking to read and write data. Long disk queue lengths could indicate that hard drive performance is too low for the applications you are using, or for how you use your computer.
A long disk queue could indicate either that you have too little RAM (so the computer is constantly swapping data to disk) or that your disk performance isn’t appropriate for your use case.
These symptoms often show up as slowness when opening applications, or general slowness as you use your computer.
Disk space
Monitoring how much disk space you have over time allows you to plan upgrades or schedule clean ups, so that everything keeps running smoothly and you don’t have unexpected downtime!
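As a rough illustration of planning from a disk space trend, the sketch below fits a straight line through past usage readings and estimates how many days remain before the disk is full. The history values and the 500 GB capacity are made-up assumptions, and real growth is rarely perfectly linear, so treat the result as a planning hint only.

```python
import shutil
from typing import List, Optional, Tuple


def days_until_full(history: List[Tuple[float, float]],
                    capacity_gb: float) -> Optional[float]:
    """Estimate days from the last reading until the disk is full.

    history holds (day_number, used_gb) pairs. A least-squares line gives
    the growth rate; returns None when usage is flat or shrinking.
    """
    n = len(history)
    if n < 2:
        return None
    mean_x = sum(day for day, _ in history) / n
    mean_y = sum(used for _, used in history) / n
    var_x = sum((day - mean_x) ** 2 for day, _ in history)
    if var_x == 0:
        return None
    slope = sum((day - mean_x) * (used - mean_y) for day, used in history) / var_x
    if slope <= 0:
        return None
    _, last_used = history[-1]
    return (capacity_gb - last_used) / slope


# Today's real reading for the current drive (values are in bytes).
usage = shutil.disk_usage(".")
print(f"Used now: {usage.used / 1e9:.1f} GB of {usage.total / 1e9:.1f} GB")

# Hypothetical weekly readings: (day number, GB used) on a 500 GB disk.
history = [(0, 300), (7, 307), (14, 314), (21, 321)]
print(days_until_full(history, capacity_gb=500))  # 179.0
```

Here usage grows by 1 GB per day, so the remaining 179 GB lasts roughly 179 days, which is plenty of notice to plan a clean up or an upgrade.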
Number of applications that start when you boot your computer
As you install applications over time, more applications are opened when you boot up. This causes your computer to start slower, and perform slower in general. Monitoring this metric allows you to regularly review and identify application clean ups as necessary.
Patches to be installed
Vendors such as Microsoft often release patches for Windows and other software which either increase performance or resolve performance bugs. Monitoring and ensuring that patches are getting installed quickly and consistently is important.
What do I do with this information? What are my next steps?
To get the most value out of the monitoring system you choose, not only do you need to gather the data, but you need to act on the data. By doing the following three things, you will get the most value out of your monitoring efforts:
- Ensure alerts are configured: Monitor key performance indicators on all your devices and make sure that you (or your IT service provider) are alerted about items that could lead to problems, so you can prevent downtime and issues from occurring in the first place
- Monitor trends of performance over time: Review reports that summarise your monitoring data over time so you can plan upgrades and ensure you have enough capacity for future growth
- Perform proactive maintenance: Regularly fix or adjust potential problems identified by the monitoring tools before they become major issues and cause downtime
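The alerting step above can be sketched as a small set of threshold rules checked against the latest readings. This is only an illustration of the idea: the metric names, the `AlertRule` class and the CPU and disk thresholds are hypothetical, and a real monitoring tool will have its own way of configuring alerts.

```python
from dataclasses import dataclass
from typing import List, Mapping, Sequence


@dataclass(frozen=True)
class AlertRule:
    metric: str       # hypothetical metric name
    threshold: float  # alert when the reading is at or above this value
    message: str


# Assumed thresholds; 70% memory follows the article, the rest are examples.
RULES = [
    AlertRule("cpu_percent", 95.0, "CPU is close to maxed out"),
    AlertRule("memory_percent", 70.0, "memory usage is high"),
    AlertRule("disk_used_percent", 90.0, "disk is nearly full"),
]


def evaluate(readings: Mapping[str, float],
             rules: Sequence[AlertRule]) -> List[str]:
    """Return one alert message per rule whose threshold is reached."""
    alerts = []
    for rule in rules:
        value = readings.get(rule.metric)
        if value is not None and value >= rule.threshold:
            alerts.append(f"{rule.metric}={value}: {rule.message}")
    return alerts


# Hypothetical latest readings from one device.
latest = {"cpu_percent": 50, "memory_percent": 85, "disk_used_percent": 92}
for alert in evaluate(latest, RULES):
    print(alert)
```

With these sample readings, the memory and disk rules fire and the CPU rule does not.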
By regularly monitoring, configuring real-time alerts for issues and performing proactive maintenance, you avoid issues, which we believe saves you money in the long term. Regular cleanups could take as little as 30 minutes but could save hours and hours of downtime and unnecessary hardware upgrades!