Legacy code can be a beast. In most cases it is, one way or another. A lot of the time, there is not even anyone left from the original team that created it. No help, scarce documentation, and this big pile of bricks that you somehow need to handle. That is what your team lead expects from you: to pull that legacy project from a CVS repository (yes, these things still exist in some places, and they say they haunt these companies at night) and prepare it for a substantial makeover. The marketing team has managed to sell that old sock to a new client, but the client wants quite a few things altered and added (sound familiar?).
Why not re-create the project using state-of-the-art technology, you say? Management has decided otherwise: no discussion, just get to it. So you sit down, take a few deep breaths, grab a piece of paper and start to plan. You say to yourself: I do not want to just add bricks to an already formless pile. I want to approach this as a true engineer and an artist in my craft. I want to make sure that no bugs are introduced and that whatever I work on, along with its surroundings, is better afterwards.
Now, I cannot tell you the exact steps to follow when you find yourself in a situation like that, apart from testing the hell out of everything around what you are changing. Below you will find a few things that you should AVOID when tackling legacy code and applying automated testing in the process.
2) Legacy code testing mistake #1: lack of deep understanding
Testing legacy code, and the approach to it in general, is almost the exact opposite of TDD. In TDD, you do not think up front about how the solution should be implemented and how exactly it will work. You simply start by writing tests that expect certain behaviours, and the implementation unfolds over time. When dealing with legacy code, that is exactly what you should avoid: taking a deep dive straight away without a real understanding of how the thing works.
One of the best approaches, in my experience, is to start drawing. Yes, go to a whiteboard and sketch out the most important things about a certain unit of code (ask your manager to buy one if there is none nearby; there are workplaces like that, unfortunately). Then try to figure out the relationships and dependencies between certain parts and start drawing lines. You do not need to go extra formal here with UML diagrams. It can look like a bunch of gibberish to an outsider. The most important thing is that it helps you understand the complexity of the task at hand. So stop sitting there for hours, stuck in your head. Get physical and visual, and things will become much clearer.
3) Legacy code testing mistake #2: covering the entire thing
I am sure you have heard that you should cover the public method that needs a change with a full set of unit tests before even touching it. That might be true if the method is relatively small, but that is rare in a legacy codebase. The methods are mostly humongous, and to cover them you would need tens if not hundreds of tests. All of that to change a single line? Sure, you would be far more confident that you did not break anything, but being stuck on one line for a week or two would not make your team lead ecstatic.
Taking the previous point into consideration, you should now have a deep understanding of the unit of code that you need to alter. In most cases, the change will touch a few lines of code at most, while the average legacy public method runs to 100+ lines, to put it kindly. The safest way in my experience, one that does not waste a week covering every possible logical path with tests, is to take advantage of the Sprout Method / Sprout Class technique. You basically extract the smallest possible part of the public method into a separate method or class. You cover that with tests, and then you can run a few TDD cycles to make the actual change.
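To make the sprouting idea a bit more tangible, here is a minimal sketch. Everything in it is made up for illustration: `LegacyInvoiceProcessor`, `LateFeeCalculator`, the fee amounts and the method names are assumptions, not code from any real project. The point is only the shape: the small piece of logic that has to change lives in its own class, and the huge legacy method is touched in a single line.

```java
// Hypothetical example: all class names, method names and amounts are invented.

/** The sprouted class: the small piece of logic that needs to change, isolated and testable. */
final class LateFeeCalculator {
    private static final long DAILY_FEE_CENTS = 50;
    private static final long MAX_FEE_CENTS = 1_000;

    /** Returns the late fee in cents; zero when the invoice is not overdue. */
    long feeFor(int daysOverdue) {
        if (daysOverdue <= 0) {
            return 0;
        }
        // The fee grows daily but is capped.
        return Math.min(daysOverdue * DAILY_FEE_CENTS, MAX_FEE_CENTS);
    }
}

/** Stand-in for the legacy class; imagine process() being hundreds of lines long. */
class LegacyInvoiceProcessor {
    private final LateFeeCalculator lateFees = new LateFeeCalculator();

    long process(long amountCents, int daysOverdue) {
        // ... many lines of untouched legacy logic ...
        long total = amountCents + lateFees.feeFor(daysOverdue); // the only edited line
        // ... many more lines of untouched legacy logic ...
        return total;
    }
}
```

With this shape, `LateFeeCalculator` can be tested on its own, without constructing whatever the legacy method drags along, and the following TDD cycles only ever touch the small class.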
4) Legacy code testing mistake #3: not covering the adjacent code with tests
A lot of the time we have so much in our backlog that we just want to get to the finish line as fast as possible. We follow the previous points and, once we understand the code well enough, we refactor, test and implement the change. Now, that process takes quite a bit of time in legacy code, and introducing a change is way harder than in a greenfield project. Why would you not take full advantage of the situation? You have worked so hard to get to this point: the hours of sketching, figuring out the design and finally applying the change. Why stop there?
In most cases, to isolate the part of the code that needs to change, we had to separate a bit more than that for the sprouting to be possible at all. Try to cover all the possible paths of the isolated code: not only the happy path, but the exceptional and boundary paths as well. Make this piece of code rock solid from now on.
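As a rough illustration of what covering all three kinds of paths can look like, here is a small sketch. `QuantityParser`, its 1 to 999 range and the check helpers are hypothetical, invented only for this example. I use plain Java checks so the snippet runs without any dependencies; in a real project you would most likely write the same cases in whatever test framework the team already uses.

```java
// Hypothetical example: QuantityParser and its valid range are invented for illustration.

/** A small piece of isolated code: parses an order quantity between 1 and 999. */
final class QuantityParser {
    static final int MAX_QUANTITY = 999;

    static int parse(String raw) {
        if (raw == null || raw.trim().isEmpty()) {
            throw new IllegalArgumentException("quantity is missing");
        }
        int value;
        try {
            value = Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("quantity is not a number: " + raw, e);
        }
        if (value < 1 || value > MAX_QUANTITY) {
            throw new IllegalArgumentException("quantity out of range: " + value);
        }
        return value;
    }
}

/** Dependency-free checks for the happy, boundary and exceptional paths. */
final class QuantityParserChecks {
    static void expect(boolean condition, String description) {
        if (!condition) {
            throw new AssertionError(description);
        }
    }

    /** Returns true when the parser rejects the input with IllegalArgumentException. */
    static boolean rejects(String raw) {
        try {
            QuantityParser.parse(raw);
            return false;
        } catch (IllegalArgumentException expected) {
            return true;
        }
    }

    static void runAll() {
        // Happy path
        expect(QuantityParser.parse("5") == 5, "plain number");
        expect(QuantityParser.parse(" 42 ") == 42, "surrounding whitespace is ignored");

        // Boundary paths: both edges of the range and the first values outside it
        expect(QuantityParser.parse("1") == 1, "lower bound is accepted");
        expect(QuantityParser.parse("999") == 999, "upper bound is accepted");
        expect(rejects("0"), "just below the range is rejected");
        expect(rejects("1000"), "just above the range is rejected");

        // Exceptional paths
        expect(rejects(null), "null is rejected");
        expect(rejects("   "), "blank input is rejected");
        expect(rejects("abc"), "non-numeric input is rejected");
    }

    public static void main(String[] args) {
        runAll();
        System.out.println("all quantity checks passed");
    }
}
```

Notice that the boundary cases come in pairs: the last value that is still accepted and the first one that is not. Off-by-one mistakes tend to hide exactly there.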
Another step you can take is to introduce one or two integration tests (ITs). I know it is not easy in legacy code, but once you understand how a particular feature works, there is usually a way to test the most critical path that runs through a bunch of components. That way you make sure that not only does the changed unit work as it should, but the collaboration between the components works as well.
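Here is a minimal sketch of such a critical-path check. All of the names (`PriceCatalog`, `InMemoryOrderLog`, `OrderService`, the `SOCK-01` item) are hypothetical, and the in-memory log is an assumption standing in for whatever the real system uses to persist orders. What matters is that the components are wired together for real rather than tested one by one.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical example: every class here is invented for illustration.

/** Collaborator #1: looks up unit prices in cents. */
final class PriceCatalog {
    private final Map<String, Long> pricesInCents = new HashMap<>();

    void register(String sku, long priceInCents) {
        pricesInCents.put(sku, priceInCents);
    }

    long priceOf(String sku) {
        Long price = pricesInCents.get(sku);
        if (price == null) {
            throw new IllegalArgumentException("unknown SKU: " + sku);
        }
        return price;
    }
}

/** Collaborator #2: an in-memory stand-in for whatever persists orders. */
final class InMemoryOrderLog {
    private final List<String> entries = new ArrayList<>();

    void record(String entry) {
        entries.add(entry);
    }

    List<String> entries() {
        return entries;
    }
}

/** The unit that was changed; it relies on both collaborators. */
final class OrderService {
    private final PriceCatalog catalog;
    private final InMemoryOrderLog log;

    OrderService(PriceCatalog catalog, InMemoryOrderLog log) {
        this.catalog = catalog;
        this.log = log;
    }

    long placeOrder(String sku, int quantity) {
        long total = catalog.priceOf(sku) * quantity;
        log.record(sku + " x" + quantity + " = " + total);
        return total;
    }
}

/** Exercises the critical path through all three components at once. */
final class OrderFlowIntegrationCheck {
    public static void main(String[] args) {
        PriceCatalog catalog = new PriceCatalog();
        catalog.register("SOCK-01", 250);
        InMemoryOrderLog log = new InMemoryOrderLog();
        OrderService service = new OrderService(catalog, log);

        long total = service.placeOrder("SOCK-01", 4);

        if (total != 1_000) {
            throw new AssertionError("unexpected total: " + total);
        }
        if (log.entries().size() != 1 || !log.entries().get(0).equals("SOCK-01 x4 = 1000")) {
            throw new AssertionError("order was not recorded as expected: " + log.entries());
        }
        System.out.println("critical path works end to end");
    }
}
```

One or two checks like this will not catch everything, but they do tell you quickly when the collaboration itself breaks, which unit tests on the changed class alone cannot do.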
5) Legacy code testing mistake #4: not being a scout
Finally, after we have understood the code, applied the change and tested the hell out of it, we can lie on the couch and be proud of ourselves for the rest of the week. While that is tempting and, of course, justified, there is still a bit more that we can achieve here. As with the previous point, we get all excited that what was supposed to work is now working, and we want to grab another story from the backlog. Hold your horses.
Finish the job off with a bunch of small cleanup actions:
● Make the method names clearer and more self-explanatory.
● Add missing Javadoc on any adjacent interfaces.
● Remove any inline comments. There are usually a lot of them in legacy code.
● Remove obsolete or commented-out code. There are always chunks of it lying around somewhere.
We have gone through a few of the most common mistakes in testing and handling legacy code. Do not be afraid of the old stuff. Just keep calm, figure out how it works, figure out the connections, test and implement the change following best practices, and your legacy code will keep working in harmony.