Code Refactoring: Why, When, and How Explained
Posted 05 Apr by Pavel Gorbachenko
What is Code Refactoring?
Refactoring is taking software code and making modifications to improve it without changing the code's functionality.
If software refactoring is done correctly, the end-user will not notice that the code has changed other than seeing improvements in responsiveness.
Reasons to refactor code include:
- To make it easier to maintain
- To make it run faster
- To use fewer resources (memory, storage)
- To take advantage of the latest technologies and best practices
As software is modified over time with updates and patches, changes are often made that introduce inefficiencies, duplicated functionality, or conflicts with existing functions.
Without remedial action, software tends to become bloated, slow, and bug-ridden. Code refactoring can eliminate these problems.
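As a minimal sketch of the duplicated functionality described above, consider two hypothetical functions that ended up with the same tax calculation after separate patches. The function names, the 20% rate, and the use of integer cents are assumptions made for illustration only.

```python
# Before: two functions carry their own copy of the same tax logic.
def invoice_total_before(prices_cents):
    total = 0
    for price in prices_cents:
        total += price + (price * 20) // 100
    return total


def quote_total_before(prices_cents):
    total = 0
    for price in prices_cents:
        tax = (price * 20) // 100
        total += price
        total += tax
    return total


# After: the shared rule lives in one place, so a rate change is a one-line edit.
TAX_PERCENT = 20  # assumed rate, for illustration


def price_with_tax(price_cents):
    """Return a single price in cents with tax added (rounded down)."""
    return price_cents + (price_cents * TAX_PERCENT) // 100


def total_with_tax(prices_cents):
    """Return the tax-inclusive total for a list of prices in cents."""
    return sum(price_with_tax(price) for price in prices_cents)
```

Both versions return the same totals for the same input; only the structure has changed.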
Why is Refactoring your Code Important?
- Where refactoring makes the code perform more efficiently, the user will benefit from a more responsive product.
- Where it makes the code more maintainable, changing it and adding new functions will be quicker, cheaper, and less likely to introduce bugs.
- Where it makes the code easier to read, security vulnerabilities and programming bugs will be easier to identify and resolve.
- Where it makes resource usage more efficient, reductions in processing requirements, memory usage, or storage needs will reduce operating costs.
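To illustrate the performance point, here is a small sketch of a refactor that keeps the result identical while reducing the work done. The function names are hypothetical, and the sketch assumes the items are hashable. Whether such a change pays off depends on the input sizes, so it is worth measuring before and after.

```python
# Before: every membership test scans a list, so the work grows with
# len(first) * len(second).
def common_ids_before(first, second):
    result = []
    for item in first:
        if item in second and item not in result:
            result.append(item)
    return result


# After: set lookups replace the list scans. The order of the output and the
# removal of duplicates are unchanged.
def common_ids(first, second):
    lookup = set(second)
    seen = set()
    result = []
    for item in first:
        if item in lookup and item not in seen:
            seen.add(item)
            result.append(item)
    return result
```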
Refactor or Rewrite?
These terms are often used interchangeably, but they mean different things.
Refactoring code is taking an existing application and changing its programming to improve the code without changing its behavior.
Rewriting code discards the existing application and starts from scratch to write code that fulfills the same requirements as the original code.
The refactored code and rewritten code may deliver the same results to the user, but there is no guarantee that this will always be the case. It depends on the development team's interpretation of the requirements.
Choosing whether to refactor or rewrite will depend on the individual circumstances. There's never a simple answer, and each option has its own pros and cons.
Examples of refactoring pros and cons
- Modifying existing code can potentially deliver results quicker. The scope of how much code is revised can be tailored to the available resources and project timescales.
- Partially re-engineered code can be released, and users can continue to use the product.
- Refactoring cannot resolve issues related to underlying architectures or programming technologies that underpin the original code.
Examples of rewriting pros and cons
- Rewriting will generally take longer to deliver results but may, in the long term, be simpler and cheaper to complete.
- Rewriting can deliver better performance using fewer server resources.
- Rewriting can deliver more stable code with fewer security vulnerabilities.
- Rewriting can make the inclusion of new requirements or compatibility with new technologies simpler to implement.
- Rewriting carries the risk that the rewritten code may not work the same as the original code.
- The original flawed code will need to be supported until the code rewriting project releases a production version of the new code. This can potentially be a significant period.
The decision will depend on the original code's nature and the refactoring/rewriting process's goals.
The University of Texas, in conjunction with Microsoft, has produced an interesting paper, "A Field Study of Refactoring Challenges and Benefits" (Microsoft Research).
The key finding was that refactored code modules showed a greater reduction in inter-module dependencies and post-release defects than rewritten code modules, which led to increased reliability.
When to Refactor Code
The recommended time to consider refactoring is at a significant milestone in the code's software engineering lifecycle. Examples of trigger events include:
- When there is a requirement for new functionality to be added.
- When the code is going to be migrated to a new hosting environment.
- When a critical bug has been uncovered that will require significant changes to resolve.
- When the development team has no scheduled activities, leaving them free to take on this task.
Refactoring is often a project management decision based on resource availability rather than a response to a specific event.
We also have many examples where new clients have turned to us because their previous software development company could not deliver the required results.
We are often presented with unstable software that does not work correctly, meaning refactoring or rewriting is often the only option to resolve the issues.
Another factor to consider is the original developers' availability if the existing code is difficult to understand. Sometimes new developers need more effort to understand legacy code than to simply rewrite that code from scratch.
The University of Alberta, in conjunction with MacDonald, Dettwiler and Associates Ltd., has produced a paper, "Understanding the Economics of Refactoring".
This study's key outcome was that refactoring low-level code modules had a beneficial impact on system dependency structure.
The investment required for refactoring could be recovered by reducing maintenance activities and the associated costs of regression testing activities.
How to Refactor Code
The refactoring exercise should have a clear goal and a plan of how to achieve that goal. Without careful planning, refactoring can result in the expenditure of effort that does not deliver noticeable results. The process should:
- Identify which parts of the existing code would benefit most from re-engineering
- Ensure the revised code works the same as or better than the original code
- Deliver improved efficiency
- Be a staged and controlled process that follows software engineering best practices
Refactoring an entire application in one go risks introducing errors that must then be traced back to the modified function that caused them. Where failures stem from conflicts between changes to two different functions, tracing them can be complex and time-consuming.
Processing changes to functions sequentially offers the lowest risk of creating conflicts across interdependencies. Once one function has been modified, its functionality verified, and its operation within the overall application validated, the next function can be refactored.
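The sequential approach can be sketched as a small loop that swaps in one refactored function at a time and only moves on once an application-level check still passes. Everything here is hypothetical: the `app` dictionary stands in for a real application, and the second candidate contains a deliberate mistake to show how a failure is isolated.

```python
def refactor_in_stages(app, candidates, check):
    """Apply candidates one at a time; roll back and stop at the first failure."""
    applied = []
    for name, new_func in candidates:
        old_func = app[name]
        app[name] = new_func
        if check(app):
            applied.append(name)
        else:
            app[name] = old_func  # restore the original so the app keeps working
            break                 # only one change is in play, so the cause is clear
    return applied


def parse_original(text):
    numbers = []
    for part in text.split(","):
        numbers.append(int(part.strip()))
    return numbers


def parse_refactored(text):
    # int() already ignores surrounding whitespace, so strip() is not needed.
    return [int(part) for part in text.split(",")]


def total_original(numbers):
    result = 0
    for number in numbers:
        result = result + number
    return result


def total_refactored_buggy(numbers):
    return sum(numbers[1:])  # deliberate mistake: skips the first number


def end_to_end_check(app):
    """Validate the functions working together, not just in isolation."""
    return app["total"](app["parse"]("1, 2, 3")) == 6


app = {"parse": parse_original, "total": total_original}
candidates = [("parse", parse_refactored), ("total", total_refactored_buggy)]
applied = refactor_in_stages(app, candidates, end_to_end_check)
print(applied)
```

In this run, the refactored parser is kept and the faulty total function is rolled back, leaving the application in a working state.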
IBM has published some valuable resources as part of its development practices, including guidance on how to improve code through refactoring (IBM Garage Practices).
Their recommended key steps are:
- Use a suitable editor
- Focus initially on problem functions
- Create a comprehensive unit test suite
- Choose the best refactoring technique for each code function
- Only put modified code into the repository when it passes all unit tests
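The last two test-related steps can be sketched with Python's standard `unittest` module. This is not taken from IBM's guidance; the function under test and the `ready_to_commit` flag are hypothetical, and in practice the gate would usually be enforced by a CI pipeline or a pre-commit hook rather than a variable.

```python
import io
import unittest


def normalize_name(raw):
    """Hypothetical function being refactored: trims and capitalizes a name."""
    return " ".join(word.capitalize() for word in raw.split())


class NormalizeNameTests(unittest.TestCase):
    """Pin down the current behavior before any refactoring begins."""

    def test_trims_extra_whitespace(self):
        self.assertEqual(normalize_name("  ada  lovelace "), "Ada Lovelace")

    def test_fixes_mixed_case(self):
        self.assertEqual(normalize_name("gRACE hOPPER"), "Grace Hopper")

    def test_empty_input(self):
        self.assertEqual(normalize_name(""), "")


def suite_passes():
    """Run the suite and report whether every test passed."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(NormalizeNameTests)
    runner = unittest.TextTestRunner(stream=io.StringIO(), verbosity=0)
    return runner.run(suite).wasSuccessful()


# Only treat the change as ready for the repository if the suite is green.
ready_to_commit = suite_passes()
```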
A technique suited to the agile development methodology is an iterative test-driven approach. A function is chosen, refactored, and then tested. The code is then modified to resolve any test failures that occurred. This modified version of the code is refactored again. This iterative cycle continues until the modified function passes its tests.
Complex functions can be decomposed into self-contained fragments, which are then extracted into sub-functions. Each extracted sub-function can then be refactored in isolation from the rest of the original function.
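A minimal sketch of this extraction, using a hypothetical order report: the discount rule and the line formatting are pulled out of one long function into two small ones. The field names and the 10% bulk discount are assumptions for illustration.

```python
# Before: calculation, discount rule, and formatting are tangled together.
def report_before(orders):
    lines = []
    total = 0
    for order in orders:
        subtotal = order["qty"] * order["unit_price"]
        if order["qty"] >= 10:
            subtotal = subtotal * 90 // 100
        total += subtotal
        lines.append(f"{order['name']}: {subtotal}")
    lines.append(f"TOTAL: {total}")
    return "\n".join(lines)


# After: each fragment is a sub-function that can be tested and changed alone.
def line_subtotal(order):
    """Return the order subtotal, with 10% off for ten or more items."""
    subtotal = order["qty"] * order["unit_price"]
    if order["qty"] >= 10:
        subtotal = subtotal * 90 // 100
    return subtotal


def format_line(label, amount):
    return f"{label}: {amount}"


def report(orders):
    subtotals = [line_subtotal(order) for order in orders]
    lines = [format_line(o["name"], s) for o, s in zip(orders, subtotals)]
    lines.append(format_line("TOTAL", sum(subtotals)))
    return "\n".join(lines)
```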
Where existing functions have become so complex as to be unmanageable, they can be refactored to simplify their logic. Techniques such as refining the parameters passed to a function or using polymorphism in place of conditional processing can be employed to achieve this.
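As a sketch of replacing conditional processing with polymorphism, the hypothetical example below moves each branch of an `if`/`elif` chain into its own class. The shape types are illustrative assumptions, not taken from any particular codebase.

```python
import math
from abc import ABC, abstractmethod
from dataclasses import dataclass


# Before: every new shape means another branch in this function.
def area_before(shape):
    if shape["kind"] == "circle":
        return math.pi * shape["radius"] ** 2
    elif shape["kind"] == "rectangle":
        return shape["width"] * shape["height"]
    raise ValueError(f"unknown shape kind: {shape['kind']}")


# After: each shape owns its own rule, and callers simply call area().
class Shape(ABC):
    @abstractmethod
    def area(self):
        """Return the area of the shape."""


@dataclass
class Circle(Shape):
    radius: float

    def area(self):
        return math.pi * self.radius ** 2


@dataclass
class Rectangle(Shape):
    width: float
    height: float

    def area(self):
        return self.width * self.height


def total_area(shapes):
    return sum(shape.area() for shape in shapes)
```

Adding a new shape now means adding a class rather than editing an existing function, which keeps each change local.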
Where existing functions have become so simplistic that the overhead of calling them is counterproductive, these functions can be removed, and their functionality moved up into the calling function.
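The reverse case can be sketched just as briefly. In this hypothetical example, a one-line helper adds a call without adding clarity, so its body is moved up into the caller.

```python
# Before: the helper only wraps a single comparison.
def is_positive(number):
    return number > 0


def count_positive_before(values):
    return sum(1 for value in values if is_positive(value))


# After: the comparison is inlined and the helper can be deleted.
def count_positive(values):
    return sum(1 for value in values if value > 0)
```

Whether inlining is worthwhile is a judgment call: a helper with a meaningful name used in many places may still earn its keep.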
Knowing When to Stop
Refactoring code has the potential to be a never-ending activity.
Even once all functions have been modified, further improvements are likely to remain possible. It is often best practice to consider refactoring as an ongoing maintenance activity.
The planning process should be used to identify which functions should be scheduled for consideration and when. Breaking the task down into bite-sized chunks prevents the exercise from spiraling out of control into an all-consuming never-ending task that prevents further development.
Verification of Refactored Code
The key to refactoring a function is the assurance that the function still operates correctly following its revision. Code testing and QA reviews are an essential part of the process.
- The re-engineering of any function will require the developers to identify all interface changes that affect the existing test suite.
- Updating tests to track changes to the functions is vital to ensure testing does not hold up progress.
- Test updates must also be undertaken independently from the code changes to prevent tests from being changed simply to pass rather than verifying the function remains correct.
Validation of Refactored Code
The end result of the exercise should be code that behaves identically to the original code.
Validation of the re-engineered code product is necessary to ensure this goal is achieved before the code is released to end-users.
Modified code that works differently or introduces bugs defeats the purpose of the code refactoring exercise.
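One way to check for identical behavior is to run the original and the refactored code side by side on many generated inputs and record any disagreement. The sketch below assumes both versions are still available to call. The median functions, the input ranges, and the fixed seed are illustrative choices, and a clean run raises confidence rather than proving equivalence.

```python
import random


def median_original(values):
    ordered = sorted(values)
    count = len(ordered)
    if count % 2 == 1:
        return ordered[count // 2]
    else:
        return (ordered[count // 2 - 1] + ordered[count // 2]) / 2


def median_refactored(values):
    ordered = sorted(values)
    mid, is_odd = divmod(len(ordered), 2)
    return ordered[mid] if is_odd else (ordered[mid - 1] + ordered[mid]) / 2


def find_divergences(original, refactored, trials=500, seed=0):
    """Return every generated input on which the two versions disagree."""
    rng = random.Random(seed)  # fixed seed so a failure can be reproduced
    divergences = []
    for _ in range(trials):
        data = [rng.randint(-100, 100) for _ in range(rng.randint(1, 20))]
        if original(data) != refactored(data):
            divergences.append(data)
    return divergences


divergences = find_divergences(median_original, median_refactored)
```

An empty `divergences` list means no behavioral difference was found on the sampled inputs. Any entry in the list is a concrete input that can be turned into a regression test.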
The Bottom Line
Refactoring code offers the chance to make existing code more efficient, maintainable, and responsive.
However, the decision to take this action is a complex one.
The exercise requires resources to be diverted away from the development of new functionality.
Often, the process of re-engineering code can turn into a counterproductive drain on resources as developers dive down proverbial rabbit holes in search of perfect performance.
With its prioritized and iterative approach, the agile methodology is best suited for the software refactoring process.
Implemented correctly, code refactoring can be undertaken in parallel with product development.
Scheduling can be tailored to utilize programming resources that would otherwise be idle during periods with low development activity.