Clean code matters. I’ve said this before and I’ll say it again, but how do you manage to maintain clean code when the people responsible for developing the tools we use for clean code constantly move the goalposts and in some instances, shift them so far out of the ballpark you need your own private access to the Hubble Telescope to even have a chance of spotting them. Hitting them on the other hand is a whole different ball game.

I speak today of PHPUnit. Up until today I have been happily using the old 3.7 stable release but today, as part of future-proofing research experiment, I upgraded from 3.7 to the latest 4.0.14 stable pear package.

This immediately caused problems on some legacy applications which were easily resolved by hacking at Zend Framework until the exceptions went away, but in running the tests I noticed something unusual.

Normally when PHPUnit runs, it has 5 separate states to choose from:

  • . Pass
  • F Fail
  • E Error
  • S Skipped
  • I Incomplete

These are the standard states for any unit testing framework regardless of the language, universal and instantly recognisable. We all know them, we all know what they mean. With version 4.0 however, PHPUnit has decided to introduce a new state and honestly, I have no idea what to make of it yet.

The new state they have introduced is R and running this on virtually any code will immediately mark almost all of your tests in this state.

So what is this new state? Well, the state introduced is called “Risky”, and Sebastian Bergman, in his infinite wisdom has so far failed to document this in either the 4.0 stable or the 4.1 beta documentation.

How many of you use phpunit without any flags given? How many of you ever use –tap to give better information of what is going on? Well, if you don’t, and you want to know what “Risky code” actually means, now is the time to start.

So what is risky code?

Risky code is code which is included in tests but contains code not covered using @covers. The exact definition given in the output is “RISKY This test executed code that is not listed as code to be covered or used”.

What does this mean

Initially and immediately this means that the 70 – 80% test coverage you thought you had is probably now down to at best 10 – 15%. This is most likely fine for your personal websites or projects but for business this spells bad news. Why? Because when you’re dealing with incredibly large code stacks, suddenly you can no longer trust the quality of your code. Suddenly, your codebase has 80% less coverage than it had before and this is disasterous.

Why is this bad?

When code is marked as risky, it’s code coverage is discounted. Even if a test covers paths through the code, it will not be counted in the final result.

Consider for a moment the following code and associated unit tests:

And the test class

Now, just running PHPUnit on its own will indicate that everything is OK, all tests pass, no warnings, no errors, no nothing:

And with tap:

Now, the minute we turn on code coverage, watch what happens:

Note the difference?

If you run with –tap, you get a pretty good idea what is going on with your tests:

At this point, nothing has changed, it’s a pretty standard class and a pretty standard way of writing tests but now our tests that would have pretty good coverage with PHPUnit < 4.0 now has no coverage with PHPUnit 4.0.

So how do we overcome this?

Well, the first thing to do is list every method the test covers:

Now, lets run phpunit with –tap –coverage-text

What we see here is that although we’ve now covered every method used in our class, test code coverage is still counting towards our final line count.

To overcome this, you need to mark the test as covering itself:

Now running PHPUnit will give you cleaner output:

Conclusion

On one hand I really want to praise Mr Bergmann for this particular feature. It’s quite brilliant in that it ensures that you understand exactly what methods are being hit by your code. If you don’t want to list coverage for every method, mock it away, otherwise list the coverage.

The bit I’m really distasteful of is having to list the tests in the coverage. Seriously a test does not cover it’s self. It can’t. If you want to test a test, you need to write another test to do that.

For enterprise level code, this really does help ensure complete coverage but at an enormous cost. I can’t see businesses moving to this any time soon because the technical debt it introduces to the codebase rises from e^n to e^(n^y)

I can understand exactly why this feature has been introduced but it is in no way viable. The only way to make this acceptable to business is to make it configurable. Either on, off or have a level of acceptable risk.

Simply put, when a business drives quality by lines of code covered by tests, this is a great feature but when balanced against project cost utilising legacy code, this is completely inviable.

Regardless of the benefits this feature brings, and as much as I want to use it, it is clearly too strict for general use. If this was turned on, quality metrics with this in place would fall through the floor. If this was enabled in an environment where code quality resulted in a failed build (see Sonar Build Breaker plugin) there would be developer revolt.

Leave a Comment