#143: Legacy Software - a risk matrix

This is part of a new mini-series looking at Legacy software - the term "legacy" is often seen as a positive - yet within computing it is a negative term, generally used to indicate the need to replace hardware or software.

In this episode, I'll describe a simple risk matrix that can be used to monitor and highlight how legacy your software products are.

Or listen at:

Published: Wed, 03 Aug 2022 16:07:12 GMT

Links

Transcript

Hello and welcome back to the Better ROI from Software Development Podcast.

I'm currently going through a mini-series looking at Legacy software - what it is, how it occurs, and the various strategies to deal with it.

Over the last few episodes, I've introduced it and I've talked about how all software is probably on a sliding scale rather than being an absolute of legacy or not.

I've talked about how much of an impact it has, based on how much effort the software development industry puts into trying to explain it - using things like the technical debt analogy, the broken window theory, and the messy campground example. They all warn that seemingly small problems mount up over time until the system is no longer viable - ultimately leading to expensive replacement work for something you've already invested in.

And I've taken a look at some of the causes of how we get to legacy software.

In this episode, I want to suggest a way of highlighting and monitoring the problem using a risk matrix.

Wikipedia describes a risk matrix as:

"A risk matrix is a matrix that is used during risk assessment to define the level of risk by considering the category of probability or likelihood against the category of consequence severity. This is a simple mechanism to increase visibility of risks and assist management decision making"

To understand the risk represented by our software products, I would suggest a similar approach - a list of your organisation's software products and, for each, a score against various characteristics, giving you an idea of where the highest risk lies.

I find when doing this type of exercise, the key is to do it little and often with a focus on getting an impression of the risks rather than getting too caught up in precision.

This sort of work soon struggles to get anywhere if the effort to produce it is too onerous. People simply do not have the time, and in this case, any impression is better than nothing.

So when building it, time-box any activity and, where empirical numbers cannot be found, accept estimates from the people closest to the product. Remember that over time the matrix can and should be refined, so don't expect perfection on the first try.

When building the matrix, we can start with a simple spreadsheet, beginning with a list of software products within the organisation. What you define as a software product will depend on your organisation, but I generally treat this as a discretely deployable system.
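
If you prefer something scriptable over a spreadsheet, the starting point can be as simple as a structured list. Here's a minimal sketch in Python - the product names, team names, and fields are purely illustrative, not a prescription:

```python
from dataclasses import dataclass, field

@dataclass
class Product:
    """One row of the risk matrix - a discretely deployable system."""
    name: str
    owner_team: str = "unknown"                  # who looks after it day to day
    scores: dict = field(default_factory=dict)   # characteristic -> score, filled in later

# Illustrative starting list - in practice this comes from asking around the organisation
matrix = [
    Product("customer-portal", owner_team="web"),
    Product("invoice-batch-job", owner_team="finance-it"),
    Product("legacy-crm-sync"),
]

for product in matrix:
    print(product.name, product.owner_team)
```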

This simple step can be an eye-opener for some organisations. It can be surprising just how many software products an organisation can acquire. When I went through a similar exercise for a client, I found almost 50 software products for just part of their organisation - a good number of which the development team had little to no knowledge of.

The next task is establishing how important each of these products is to the organisation. This can be a little trickier than it sounds and can be very dependent on who you ask - and how much of their world is dependent on that product.

Be careful not to mark large numbers of products as being the most critical. Otherwise, it can make any form of prioritisation difficult. Remember, if you have more than one priority number one, then you have no priority number one.

Maybe consider some form of numeric system that allows you to rate each system relative to the others, effectively providing an ordered list from most to least important.
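
As a sketch of that kind of comparative rating - with the scores entirely invented for illustration - a simple numeric importance score per product is enough to produce the ordered list:

```python
# Hypothetical importance scores, 1 (lowest) to 10 (highest), agreed with the business
importance = {
    "customer-portal": 9,
    "invoice-batch-job": 6,
    "legacy-crm-sync": 2,
}

# Sort from most to least important - ties are worth revisiting so you
# don't end up with several "priority number ones"
ordered = sorted(importance.items(), key=lambda item: item[1], reverse=True)

for name, score in ordered:
    print(f"{score:>2}  {name}")
```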

At this point, it is also worth highlighting any that seem to have little or no importance. These are likely to be candidates for asking the question "do we need these anymore?" - it's not uncommon to find products still running that provide little benefit to the organisation.

In the previous example, where we found almost 50 software products for a client, 10 of them could have been switched off with no business impact. They simply were no longer used but were costing money to host and run every month.

I would suggest the next assessment is how comfortable your development team are working with each of the products. This is obviously going to be very subjective, and that is partly the point. If the development team don't like working with a product for some reason, then it's good to highlight that, as it allows for further investigation.

It is also good to get this assessment from individual team members. This should highlight islands of knowledge where only one person on the team is comfortable working on the product, leading to a single point of failure, which of course is a substantial risk.

And it's worth the team spending some time agreeing what criteria should be part of their decision - maybe including things like how up to date they think the software is, how enjoyable it is to work with, and how easy it is to find meaningful documentation online. This can be useful for creating a score across a number of very subjective metrics.
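
One way to turn those subjective answers into something comparable is to ask each team member to score each criterion, say 1 to 5, and average them. A rough sketch, with invented names, criteria and scores:

```python
from statistics import mean

# Each developer rates the product 1 (poor) to 5 (good) per criterion - all values illustrative
comfort_scores = {
    "alice": {"up_to_date": 2, "enjoyable": 1, "documentation": 2},
    "bob":   {"up_to_date": 4, "enjoyable": 4, "documentation": 3},
}

# Averaging per criterion highlights what the team agrees is weak...
per_criterion = {
    criterion: mean(person[criterion] for person in comfort_scores.values())
    for criterion in next(iter(comfort_scores.values()))
}

# ...while a large spread between individuals can point to islands of knowledge
print("per criterion:", per_criterion)
print("overall comfort:", mean(mean(scores.values()) for scores in comfort_scores.values()))
```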

And I'd recommend asking the team to be as honest as possible - and definitely avoid this being linked to any form of review of an individual's capabilities. At one client, the developers felt that they were being assessed on their knowledge, not the suitability of the code base. This led to a number of junior developers overestimating their knowledge to present themselves in a better light.

Without a level of honesty, there is a real danger of critical problems being unseen right up to the point where they can no longer be resolved.

The next thing I would look at is when the product was last deployed. There can be a false sense of security if a product has not been deployed recently - and this becomes a massive issue if there is a need to suddenly release a critical fix. How much of a concern this is will depend on how often your organisation releases its software, but personally, anything over six months would cause me a level of concern.

In many cases of dormant systems, I would recommend automating regular releases to ensure that the organisation retains the ability to release when it needs to.
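
A small script can flag anything that hasn't been deployed within whatever window concerns you - six months in this sketch, with made-up deployment dates standing in for whatever your deployment pipeline records:

```python
from datetime import date, timedelta

# Illustrative last-deployment dates - in practice pull these from your deployment tooling
last_deployed = {
    "customer-portal": date(2022, 7, 20),
    "invoice-batch-job": date(2022, 1, 5),
    "legacy-crm-sync": date(2020, 11, 30),
}

THRESHOLD = timedelta(days=182)  # roughly six months - adjust to your own release cadence
today = date(2022, 8, 3)

for name, deployed in last_deployed.items():
    if today - deployed > THRESHOLD:
        print(f"WARNING: {name} last deployed {deployed} - can we still release it safely?")
```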

And finally, I would take a look at historic outages - how often do we have problems with that product? Are they related to releases? How quickly are they traditionally resolved? How impactful are those problems?

I wouldn't be surprised to see products with a low score on reliability also being the same ones that score poorly for developer comfort. Often we find that the products with a lot of errors are the hardest and least enjoyable software to work on.

By this point, you should have a reasonable matrix to work from using the data you have. It should be enough to give you an idea of your worst products - the ones you should really be investing effort in.
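
Pulling those characteristics together doesn't need anything sophisticated - a weighted sum is enough to surface the worst offenders. A sketch, with invented scores and weights that you would tune to your own organisation:

```python
# Each product scored 0 (good) to 10 (bad) per characteristic - all values illustrative
characteristics = {
    "customer-portal":   {"importance": 9, "discomfort": 3, "staleness": 1, "outages": 2},
    "invoice-batch-job": {"importance": 6, "discomfort": 7, "staleness": 6, "outages": 5},
    "legacy-crm-sync":   {"importance": 2, "discomfort": 9, "staleness": 9, "outages": 8},
}

# Weights reflect what your organisation cares about most - these are just a starting point
weights = {"importance": 2.0, "discomfort": 1.0, "staleness": 1.0, "outages": 1.5}

def risk_score(scores: dict) -> float:
    """Weighted sum of the characteristic scores - higher means riskier."""
    return sum(weights[c] * value for c, value in scores.items())

# Highest score first: the products most worth drilling into
for name, scores in sorted(characteristics.items(), key=lambda kv: risk_score(kv[1]), reverse=True):
    print(f"{risk_score(scores):6.1f}  {name}")
```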

Over time, as an organisation, you will decide what constitutes "good enough". So while everything may not be perfect, you at least have some visibility into what is otherwise quite a difficult thing to get a handle on.

But initially I'd just start with those that jump out: start small, pick the most concerning and drill in deeper. Largely, this will mean talking to the development team and establishing some form of remedial plan to improve the situation. Then make sure to review the effects of that remedial plan on a periodic basis, along with reviewing the whole risk matrix. This will allow you to look for patterns over time.

The key thing is to watch for any products slowly but steadily getting worse. These are likely to be the legacy software in waiting.

The matrix I've suggested in this podcast is basic - and it's intended to be.

Keeping it relatively simple and low-tech gives it a much greater chance of getting off the ground. Once it's established, then it may be worth looking at additional factors - maybe the cost to run, for example.

You can also look at tools that will help you generate empirical numbers on the quality of the code base. For example, a product like SonarQube can be used to establish various metrics on the code base. I'd certainly recommend trying it and including its health scores alongside the comfort rating of the developers. I'll include a link in the show notes. This is not a product that I have any commercial relationship with - just a tool I've had a good experience with.

This episode has been about helping us to highlight problems, hopefully before they get too big.

In the next episode, I'll take a look at approaches once you know that your software product is legacy.

Thank you for taking the time to listen to this episode, and I look forward to speaking to you again next week.