Class size

Imagine that you need to maintain two applications. Both are about 20,000 lines of code. Now imagine that one has about 10 classes, and the other has 200 classes. Which one would you rather work with?

I’ve had discussions about whether you should favor many classes over fewer. When you only take into account the amount of functionality delivered through those classes, it doesn’t matter. The same functionality will be in the codebase, whether there are few classes or many. Since creating a new class takes effort (not much, but still), it’s easier to have a few big ones. One could also argue that a few big classes keep related functionality together in the same files.

The amount of functionality in the system isn’t the only metric, though. How easy it is to add functionality, fix defects, and write unit tests should be taken into account as well.

Big classes usually have lots of private methods. So, how are you going to write unit-tests for them? Are you going to use reflection to make those methods accessible? Are you going to write extensive setup code to reach those methods? Or are you going to extract classes containing those methods, and make them publicly accessible?
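As a rough sketch of the last option, assume a big class that used to calculate VAT in a private method. The class names, the method names and the 21% rate below are made up for illustration; the point is only that the extracted class exposes the logic through a public method that a test can call directly.

```java
// Hypothetical example: all names and the VAT rate are assumptions.

// Extracted from a larger class, where this logic used to be a private method.
final class VatCalculator {
    private final int ratePercent;

    VatCalculator(int ratePercent) {
        this.ratePercent = ratePercent;
    }

    // Returns the VAT in cents, truncated to whole cents.
    public long vatFor(long netAmountInCents) {
        return netAmountInCents * ratePercent / 100;
    }
}

// The original class keeps its public behaviour and delegates the calculation.
final class InvoiceService {
    private final VatCalculator vatCalculator;

    InvoiceService(VatCalculator vatCalculator) {
        this.vatCalculator = vatCalculator;
    }

    public long totalInCents(long netAmountInCents) {
        return netAmountInCents + vatCalculator.vatFor(netAmountInCents);
    }
}
```

A unit test can now create a `VatCalculator` on its own, without reflection and without the setup that the bigger class needs.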

How are you going to change the functionality? How are you going to fix defects? In a big class, it’s usually hard to keep track of what’s going on. Because of this, you spend more time figuring out what the code is doing, and what it should actually do. The clearer the intention of your code, the less time you need to spend on getting to know what it’s doing.

Personally, I prefer lots of small classes. But how do we get there? When you’re presented with a legacy project, it requires a lot of refactoring. But beware, don’t just go out and refactor. If there are no issues, and the required functionality doesn’t change, that part of the codebase is just fine. On the other hand, when you start a new project, it’s a bit easier.

One of the first things I’d recommend is to read up on the SOLID principles. SOLID stands for Single Responsibility, Open/Closed, Liskov Substitution, Interface Segregation and Dependency Inversion. Knowing and applying these principles will help you create a well-factored system. You probably won’t be able to apply these principles all of the time, but it definitely helps to know about them.

Put some tests in place, and make sure these tests are of the highest quality. The more and better tests you have, the safer your refactorings will be. As an added bonus, you gain knowledge of and insight into the system you’re working on. As you progress with fixing defects and implementing new functionality, the amount of code under test will increase, and you will be able to develop and refactor faster.

Practice Test Driven Development. Write a test, make it pass, and refactor to optimise readability. Make sure you do the last step, because TDD won’t work otherwise. TDD will help you create a clear system with very high test coverage, and that coverage will be of high quality.

Use as few if-statements and switch/cases as possible. Using as few conditionals as possible makes the codebase easier to work with, because it pushes you towards a more object-oriented design. You could use an inheritance structure, or a table- or map-based approach. There may be other patterns, if you’re creative enough to discover them.
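To give an idea of the map-based approach, here is a minimal sketch. The `OperationTable` class and its operation names are invented for this example, and `Map.of` assumes Java 9 or later. Each operation that would otherwise be a case in a switch becomes an entry in a table.

```java
import java.util.Map;
import java.util.function.IntBinaryOperator;

// Hypothetical example: the class and the operation names are assumptions.
final class OperationTable {

    // Each operation is an entry in the table instead of a case in a switch.
    private static final Map<String, IntBinaryOperator> OPERATIONS = Map.of(
            "add", (left, right) -> left + right,
            "subtract", (left, right) -> left - right,
            "multiply", (left, right) -> left * right);

    public static int apply(String operation, int left, int right) {
        IntBinaryOperator op = OPERATIONS.get(operation);
        // The only conditional left: guarding against an unknown key.
        if (op == null) {
            throw new IllegalArgumentException("Unknown operation: " + operation);
        }
        return op.applyAsInt(left, right);
    }
}
```

Adding an operation now means adding one entry to the table; the `apply` method itself doesn’t change.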

Technical Debt

Technical debt is acquired when you take shortcuts while developing your software. It helps you get the changes in place faster, but it results in code that is harder to understand.

Ward Cunningham coined this term while he was working in a financial institution to explain why they were refactoring. His boss at the time was a financial guy, and this was financial software, so a financial metaphor was the best way to explain this principle.

When you want to buy a car, and you don’t have the money, there are basically two things you can do. You can wait and save until you do have the money to buy the car, or you can borrow it. Translating this to writing code, you can implement the feature correctly, with clean and clear design. This way, when it needs to change, it is easy to understand and changing it will be faster. Or, you can do it the quick and dirty way. This is faster, for now. But when the time comes to change the code, it will take more time.

The two components of financial debt also apply to technical debt. There is interest, and there is principal. We pay interest when we need to change dirty code, and we pay down the principal when we clean that code up.

Martin Fowler divides technical debt into four categories, using two axes (Reckless vs. Prudent, and Deliberate vs. Inadvertent):
Deliberate, Reckless: “Design is boring and time consuming. This works, so who cares?”
Deliberate, Prudent: “We really should do this, but the deadline is approaching fast. We’ll note our shortcuts and refactor after the deadline.”
Inadvertent, Reckless: “What’s wrong with an EntityManager in the web layer?”
Inadvertent, Prudent: “This seemed like a good idea at the time. Now we know how we should have done it.”

Here are a few things to help get technical debt under control.

Naming

Choose good names for your functions and variables, and rename to better names when your understanding of the code changes.
You are not just telling the computer what to do. You are also writing a document for future developers. You are creating a language in which to communicate your thoughts about the problem at hand. This language is a mixture of reserved words of the programming language (for, if), terms that are used in the domain (account, customer), and terms that are common in our industry, such as design patterns (action, command, listener). The easy part is getting the computer to do what we want it to do. The hard part is telling our future selves what we are thinking.

Shorter methods

“Rule 1 of methods: they should be short. Rule 2 of methods: they should be shorter than that!”
“Functions should do one thing. They should do it well. They should do it only.”
Longer methods tend to do more than they should. Splitting methods into smaller methods makes them more readable, manageable, and reusable. Another advantage of smaller methods is that you get lots of them, and that makes it easier to group them in the correct classes.
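A small sketch of what this can look like after splitting. The `OrderSummary` class and its methods are invented for this example, and it assumes prices are non-negative amounts in cents. The public method reads as a summary of the steps, and each step lives in its own short method.

```java
import java.util.List;
import java.util.Locale;

// Hypothetical example: the class and method names are assumptions.
final class OrderSummary {

    // Reads like a table of contents: each step is delegated to a short method.
    public static String summarize(List<Integer> pricesInCents) {
        int totalInCents = totalOf(pricesInCents);
        return describe(pricesInCents.size(), totalInCents);
    }

    private static int totalOf(List<Integer> pricesInCents) {
        return pricesInCents.stream().mapToInt(Integer::intValue).sum();
    }

    private static String describe(int itemCount, int totalInCents) {
        return itemCount + " items, total " + formatAmount(totalInCents);
    }

    // Formats cents as a decimal amount, for example 1349 becomes "13.49".
    private static String formatAmount(int cents) {
        return String.format(Locale.ROOT, "%d.%02d", cents / 100, cents % 100);
    }
}
```

Once the steps are separate methods, it becomes easier to see that something like `formatAmount` may belong in a class of its own.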

Unit tests

While the production source code is the primary document we deliver, there is another document: unit tests. When written correctly, unit tests are quite helpful in further explaining the code. They will tell how objects are created and used. This is one of the reasons Test Driven Development is useful. Unit tests are also quite useful for spotting design flaws. For example, when you find an uncovered private method, and it’s just too hard to reach it through the public interface of the class, chances are that the class has too many responsibilities and should be split into multiple classes.

Remove duplications

With shorter methods comes a more fragmented codebase: functionality is split into more fragments. This usually means that duplications become more visible. The duplications are already in the code, but sometimes they are hidden. Duplication is generally considered bad, because it usually means that you need to apply the same change to multiple pieces of code. And that’s easy to forget. According to Kent Beck, eliminating duplications is a powerful way of getting a good design.

Single Responsibility Principle

What does the class do? What reasons does it have to change? How many responsibilities does it have? A class should have only one role, only one responsibility. For example, it should only validate input, or format some text. Some ways of finding the responsibilities of a class include grouping of method names and drawing the relations of the class members.
Responsibilities can live on the interface level, and on the implementation level. Classes can either delegate the responsibility to other classes, or they can implement the responsibility themselves. When a class implements too many responsibilities, it’s easier to move just the implementations of those responsibilities to other classes, and keep the original methods. These methods then act as a gateway to the new classes. Over time, we can modify the system so it only uses the new classes, and then we can remove the gateway methods.
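The gateway idea could look roughly like the sketch below. All class and method names are hypothetical, and the email check is deliberately simplistic. The original `CustomerForm` keeps its methods, but the implementations have moved to two smaller classes.

```java
// Hypothetical example: all names are assumptions for illustration.

// New class holding the validation responsibility.
final class EmailValidator {
    // Simplistic check: an '@' that is neither the first nor the last character.
    public boolean isValid(String email) {
        if (email == null) {
            return false;
        }
        int at = email.indexOf('@');
        return at > 0 && at < email.length() - 1;
    }
}

// New class holding the formatting responsibility.
final class NameFormatter {
    public String format(String firstName, String lastName) {
        return lastName.trim() + ", " + firstName.trim();
    }
}

// The original class keeps its methods, but they only forward to the new classes.
final class CustomerForm {
    private final EmailValidator emailValidator = new EmailValidator();
    private final NameFormatter nameFormatter = new NameFormatter();

    // Gateway method: existing callers keep working while they migrate.
    public boolean isValidEmail(String email) {
        return emailValidator.isValid(email);
    }

    // Gateway method: can be removed once nobody calls it anymore.
    public String formatName(String firstName, String lastName) {
        return nameFormatter.format(firstName, lastName);
    }
}
```

Existing callers of `CustomerForm` don’t notice the change, while new code can use `EmailValidator` and `NameFormatter` directly.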