3/27/2013

Continuous Delivery - Part 3 - Feature Toggles

Filed under: — Aviran Mordo

Previous chapter:The Road To Continuous Delivery - Part 2 - Visibility

UPDATE: We released PETRI our 3′rd generation experiment system as an open source project available on Github

One of the key elements in Continuous Delivery is the fact that you stop working with feature branches in your VCS repository; everybody works on the MASTER branch. During our transition to Continuous Deployment we switched from SVN to Git, which handles code merges much better, and has some other advantages over SVN; however SVN and basically every other VCS will work just fine.

For people who are just getting to know this methodology it sounds a bit crazy because they think developers cannot check-in their code until it’s completed and all the tests pass. But this is definitely not the case. Working in Continuous Deployment we tell developers to check-in their code as often as possible, at least once a day. So how can this work? Developers cannot finish their task in one day? Well there are few strategies to support this mode of development.

Feature toggles
Telling your developers they must check-in their code at least once a day will get you the reaction of something like “But my code is not finished yet, I cannot check it in”. The way to overcome this “problem” is with feature toggles.

Feature Toggle is a technique in software development that attempts to provide an alternative to maintaining multiple source code branches, called feature branches.
Continuous release and continuous deployment enables you to have quick feedback about your coding. This requires you to integrate your changes as early as possible. Feature branches introduce a by-pass to this process. Feature toggles brings you back to the track, but the execution paths of your feature is still “dead” and “untested”, if a toggle is “off”. But the effort is low to enable the new execution paths just by setting a toggle to “on”.

So what is really a feature toggle?
Feature toggle is basically an “if” statement in your code that is part of your standard code flow. If the toggle is “on” (the “if” statement == true) then the code is executed, and if the toggle is off then the code is not executed.
Every new feature you add to your system has to be wrapped in a feature toggle. This way developers can check-in unfinished code, as long as it compiles, that will never get executed until you change the toggle to “on”. If you design your code correctly you will see that in most cases you will only have ONE spot in your code for a specific feature toggle “if” statement.
(more…)

3/23/2013

The Road To Continuous Delivery - Part 2 - Visibility

Filed under: — Aviran Mordo

Previous chapter: The road to continuous delivery - Part 1

Production visibility

A key point for a successful continuous delivery is to make the production matrix available to the developers. At the heart of continuous delivery methodology is to empower the developer and make the developers responsible for deployment and successful operations of the production environment. In order for the developers to do that you should make all the information about the applications running in production easily available.

Although we give our developers root (sudo) access for the production servers, we do not want our developers to look at the logs in order to understand how the application behaves in production and to solve problems when they occur. Instead we developed a framework that every application at Wix is built on, which takes care of this concern.

Every application built with our framework automatically exposes a web dashboard that shows the application state and statistics. The dashboard shows the following (partial list):
• Server configuration
• All the RPC endpoints
• Resource Pools statistics
• Self test status (will be explained in future post)
• The list of artifacts (dependencies) and their version deployed with this version
• Feature toggles and their values
• Recent log entries (can be filtered by severity)
• A/B tests
• And most importantly we collect statistics about methods (timings, exceptions, number of calls and historical graphs).

Wix Dashboard

We use code instrumentation to automatically expose statistics on every controller and service end-point. Also developers can annotate methods they feel is important to monitor. For every method we can see the historical performance data, exception counters and also the last 10 exceptions for each method.

We have 2 categories for exceptions: Business exceptions and System exceptions.
Business exception is everything that has to do with application business logic. You will always have these kinds of exceptions like validation exceptions. The important thing to monitor on this kind of exception is to watch for sudden increase of these exceptions, especially after deployment.

The other type of exception is System exception. System exception is something like: “Cannot get JDBC connection”, or “HTTP connection timeout”. A perfect system should have zero System exceptions.
(more…)

3/16/2013

The Road To Continuous Delivery - Part 1

Filed under: — Aviran Mordo

The following series of posts are coming from my experience as the head of back-end engineering at Wix.com. I will try to tell the story of Wix and how we see and practice continuous delivery, hoping it will help you make the switch too.

So you decided that your development process is too slow and thinking to go to continuous delivery methodology instead of the “not so agile” Scrum. I assume you did some research, talked to a couple of companies and attended some lectures about the subject and want to practice continuous deployment too, but many companies asking me how to start and what to do?
In this series of articles I will try to describe some strategies to make the switch to Continuous delivery (CD).

Continuous Delivery is the last step in a long process. If you are just starting you should not expect that you can do this within a few weeks or even within few months. It might take you almost a year to actually make several deployments a day.
One important thing to know, it takes full commitment from the management. Real CD is going to change the whole development methodologies and affect everyone in the R&D.

Phase 1 – Test Driven Development
In order to do a successful CD you need to change the development methodology to be Test Driven Development. There are many books and online resources about how to do TDD. I will not write about it here but I will share our experience and the things we did in order to do TDD. One of the best books I recommend is “Growing Object Oriented Guided by tests

A key concept of CD is that everything should be tested automatically. Like most companies we had a manual QA department which was one of the reasons the release process is taking so long. With every new version of the product regression tests takes longer.

Usually when you’ll suggest moving to TDD and eventually to CI/CD the QA department will start having concerns that they are going to be expandable and be fired, but we did not do such thing. What we did is that we sent our entire QA department to learn Java. Up to that point our QA personnel were not developers and did not know how to write code. Our initial thought was that the QA department is going to write tests, but not Unit tests, they are going to write Integration and End to End Tests.

Since we had a lot of legacy code that was not tested at all, the best way to test it, is by integration tests because IT is similar to what manual QA is doing, testing the system from outside. We needed the man power to help the developers so training the QA personal was a good choice.

Now as for the development department, we started to teach the developers how to write tests. Of course the first tests we wrote were pretty bad ones but as time passes, like any skill, knowing how to write good test is also a skill, so it improves in time.
In order to succeed in moving to CD it is critical to get support from the management because before you see results there is a lot of investments to be done and development velocity is going to sink even further as you start training and building the infrastructure to support CD.
(more…)

Powered by WordPress