1. 程式人生 > >Unit Testing Private Methods

Unit Testing Private Methods

How can you unit test private methods? If you google this question, you find several different suggestions: test them indirectly, extract them into their own class and make them public there, or use reflection to test them. All these solutions have flaws. My preference is to simply remove the private modifier and make the method package private. Why is this a good idea? I will get to that after I discus the problems with the other methods.

First of all, not all methods need to be unit tested. Some methods are so simple that there is no point in unit-testing them. For example:

private void startLowBalanceTimer(int timeout) {
  if (lowBalanceTimer != null) {
    lowBalanceTimer.start(timeout);
  }
}

The use of the lowBalanceTimer

, including its timeout, needs to be tested, but this is better done in functional testing. It will quickly become apparent if the method above doesn’t work as expected. The methods that need to be unit tested are the ones that contain algorithmic logic of some kind. The article Selective Unit Testing – Costs and Benefits
makes a distinction between algorithms and coordinators that I think is very useful. The best use of unit testing is for algorithms, but not for coordinators. Furthermore, these two types of functionality should not be mixed in the same method.

Indirect Testing?

Proponents of indirect testing of private methods argue that you should only test the public methods of a class. If the public methods make use of private methods, then these private methods automatically get tested when the observable results of the public methods are checked. But this logic is flawed. We can use the same logic to argue that no unit testing whatsoever is needed. We just test the complete program. All methods will be indirectly tested when we check the results of the complete program. Clearly this is a bad idea. Either the private method does something interesting, and then it should be unit tested in its own right, or it doesn’t do anything interesting, and then it doesn’t need to be unit tested at all.

Furthermore, this doesn’t work well if you practice Test Driven Development (TDD). If you take a bottom up approach, and develop and test the building blocks before putting them together, you often don’t have the public method ready when you are developing the helper methods. Thus you don’t get the benefits of testing while developing.

A Separate Class?

Another argument goes like this: if you have private methods that do something interesting, then it is a sign that that functionality should be extracted out into its own class. In the new class, the methods become public methods (since the main functionality of the new class is to do what the private methods did).

Sometimes this is a valid argument, especially if the extracted functionality will be useful in more than one case. However, quite often this is not the case. The functionality in the private methods is often very specific supporting functionality of the original class, and is not useful in other cases. While it certainly could be extracted, it often makes more sense to keep it close to the functionality in the original class.

If it later turns out that the same functionality is needed somewhere else, then it is a good time to extract the functionality into its own class, but not before that.

Reflection?

By using reflection, you can access the private methods of a class while still keeping them private. In my mind, this is the worst of the suggested ways. Since the method names are treated as strings, it doesn’t handle refactorings well. It is also too complicated, especially given how easily the methods can be tested if the private modifier is dropped.

Solution: Package Private

By removing the private modifier, the method is visible in the package of the class, and nowhere else. This means that the method can now be unit tested (provided that the same package is used in the test class). But aren’t we exposing too much of the internals of the class?

The answer is no. The public interface of the class is still only the public methods in the class. Unless the calling class is in the same package, it can’t use the package private methods. Furthermore, you most likely have direct control over all classes in the same package. By direct control I mean that you can check out the code, make modifications to it, and check it back in again. So it is pointless to hide methods by making them private. Whoever feels the need to call one of the private methods from another class in the same package could just change the access modifier from private to package private or public and be done.

The idea of the original class is that its interface is the public methods. The package private methods are only helper methods, and not meant to be used from outside the class. For classes in the same package, we already rely on them to use it correctly (it can’t be enforced anyway). For classes outside the package, either by another team, or from external use, the only access is via the public methods, so the encapsulation is not broken where it matters.

Example

At work, we use a traffic generation tool. One of its features is that you can generate test traffic with a given rate and duration, for example 200 SMS per second for 1 hour. Both the rate and duration are parameters that can be set in the traffic generation script. However, it is often interesting to have some variation in the load. For example, it can be good to test how the system handles a load that starts at 200 SMS per second, then decreases to 10 SMS per second, and then increases again.

At our most recent hackathon at work, I added the capability to add “scaling values” to a script to achieve this. The scaling values are a number of integer values between 0 and 100 that indicate what percentage of the nominal rate should be used. So if the values [100, 80, 20, 100] are given, then the rate will vary linearly between 100% and 80% for the first third of the duration. The rate then varies linearly between 80% and 20% for the middle third, and between 20% and 100% for the last third.

There are three advantages with this format. First, it is not necessary to give absolute values for the rate. If the rate is 200 SMS per second, 50% of that rate is 100 SMS per seconds, but if the rate is 300 SMS per second, the same scaling value of 50 would instead give 150 SMS per second. Second, the scaling values are relative to the duration of the script. The same scaling values can be used both for a 10 minute script and a 1 hour script. Third, it is possible to get a more detailed shape of the rate by increasing the number of scaling values. For example, the values [100, 90, 70, 40, 20, 10, 0, 0, 10, 20, 40, 70, 90, 100] give a more detailed shape of the rate than the previous example did.

To find the scaled rate at any given time in the script, the class ScalingValue provides the following public method:

public double getCurrentRate(long now, double maxRate)

Given the time, the nominal rate, and the scaling values (provided when ScalingValue was instantiated), the current rate is returned. Internally, there are several helper methods. One internal method, getNextIndex finds which two scaling values the current time stamp falls between. Another method, getScalingFactor, takes two scaling values and the time stamp and calculates the scaling factor at that point. For example, if the scaling value at time 1000 is 60, and the scaling value at time 2000 is 80, then the scaling value at time 1500 is 70, and at time 1750 it is 75, and so on.

Clearly there is quite a bit of logic needed to get this to work. In cases like this, I prefer to divide the tasks up into small methods, and unit test each method individually. It helps a lot to have well-tested parts before they are combined to create the complete solution.

In this case, all the internal methods are package private, so they can be unit tested easily. Should they be tested indirectly, by only testing the public method getCurrentRate? It is possible, but that is making it unnecessarily difficult. The helper methods are much easier to both write and test as separate parts.

Should there be a separate class that provides getNextIndex and getScalingFactor as its own public methods? Again, no. They are the guts of the ScalingValue implementation, and are not useful to any other class. But aren’t we unnecessarily exposing the internals of ScalingValue? Once again, no. The class that uses ScalingValue is in the same package as ScalingValue. So nothing is gained by making methods in ScalingValue private. If a developer wants to use methods other than getCurrentRate (and all helper methods were private), there is nothing that stops him/her from changing ScalingValue to make the helper methods not be private anymore. You have to trust the other developer’s judgment.

Conclusion

Unit testing methods that contain algorithmic code is important. This can be achieved with a minimum of problems by just dropping the private access modifier. There are no downsides to doing this, and you get well-factored, coherent and tested code. Great, isn’t it?

EDIT: Discussing different coding techniques without referring to code can often lead you wrong. So I have added ScalingValue.java and TestScalingValue.java to make the discussion more concrete (I should have done this from the beginning, my mistake).

EDIT 2: I have been thinking about this case a lot lately, and I have to admit that I find the arguments below by eecolor and Arho Huttunen quite persuasive. When I apply their logic to my example class ScalingValue, I can make getScalingFactor and getNextIndex private, and still cover them well in the tests. So I will follow their advice in the future (keeping the methods private, and only test through the public interface). I will however leave this blog post as it is, since the discussion of it in the comments made me change my mind. Thanks eecolor and Arho for contributing!