The Bug Hunt in FOSS - part ll: fixes in openfortivpn

22. maj 2018 - Avtor Tadej Borovšak

22. maj 2018
Avtor Tadej Borovšak

Ever wondered what it takes to get bugs fixed in one of the FOSS packages? You have come to the right place. Today, we are going to show you how we helped to fix a bug in the openfortivpn program.

How did we find the bug

This post is a sequel to the finding bugs post. Quick recap for those in a hurry: we replaced the VPN at XLAB, openfortivpn did not like our wildcard SSL certificate, which lead us to the bug in the build system of openfortivpn.

Now that we are all acquainted with the problem we were trying to solve, let us move on with the story.

Fixing the bug

We started fixing the bug by properly reporting it to the upstream. In this day of age, this usually means opening an new issue on GitHub, which is exactly what we did. The resulting bug report can be seen here.

Since we already diagnosed the bug pretty thoroughly, finding a fix for the problem was relatively easy. Our initial fix for the bug in question was to replace the PKG_CHECK_MODULES call in the configure.ac file with two AC_CHECK_LIB calls. We created a pull request with the fix and the maintainer merged it really quickly.

And this is the end of the story. Or is it?

Fixing the bug again

It turned out that our fix breaks the compatibility with OSX. Gosh darn it. Maintainer reverted our commit and we went back to the drawing board.

Next solution for the problem at hand that we came up with was to require recent enough version of OpenSSL. This would allow us to remove the function detection call in configure.ac file and simplify the validation function, since X509_check_host function is guaranteed to be present in recent OpenSSL libraries.

The fact that the older releases of OpenSSL (< 1.0.2) are not supported by upstream developers anymore, made this solution really promising. But when the maintainer implemented the fix, the tests failed, because Travis CI uses some really old Ubuntu images to run tests on.

So we refactored the test configuration and made it configurable enough to allow testing against a selected OpenSSL version. This finally made all tests pass and we were back on track.

This time around, the maintainer was a bit less eager to merge the fix into the master, which proved to be a wise decision.

Fixing the bug yet again

What went wrong this time? It turned out that some still supported GNU/Linux distributions use a heavily patched OpenSSL library that is older than our threshold of 1.0.2. And since this would mean that they cannot receive any updates from this fix on, the plan has been abandoned.

And this is how we arrive at the final ugly hack. This solution will not win any awards for the most elegant piece of code, but it works.

Parting thoughts

Are things more complicated as one expects? You bet they are. And although the journey to the final solution seems to contain a never ending stream of obstacles that prevent us from reaching the desired goal, the feeling of getting things fixed at the end makes it all worth it.

Another thing that usually does not get emphasized enough in such situations is the non-technical aspect of the bug-fixing process. The final solution that we used in this case is without a doubt the ugliest of them all as far as technical debt goes. But locking quite a few people out just because they use a bit less shiny and older operating system would be far worse for the project.

So, next time you come across an ugly piece of code, try running git blame first and do a bit of research why things got into such a bad state. The results might surprise you. You may even learn a thing or two about maintaining backwards compatibility, which is often forgotten about in the world of programming, where things change at a breakneck pace.

Of course, if the ugly piece of code is not checked into SCM, feel free to flame them to your heart’s delight.

Oh, and by the way, you can also talk to us. Click your way to twitter and/or Reddit and let us have it ;)

Cheers!