<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Digital Dim Sum &#187; visualisation</title>
	<atom:link href="http://www.digitaldimsum.co.uk/category/computing/visualisation/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.digitaldimsum.co.uk</link>
	<description>Bite sized info snacks for the digital generation</description>
	<lastBuildDate>Thu, 03 Feb 2011 22:16:25 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.1.4</generator>
		<item>
		<title>Dealing with creaky legacy platforms</title>
		<link>http://www.digitaldimsum.co.uk/2011/02/03/dealing-with-creaky-legacy-platforms/</link>
		<comments>http://www.digitaldimsum.co.uk/2011/02/03/dealing-with-creaky-legacy-platforms/#comments</comments>
		<pubDate>Thu, 03 Feb 2011 20:10:44 +0000</pubDate>
		<dc:creator>jonny</dc:creator>
				<category><![CDATA[Agile]]></category>
		<category><![CDATA[coding]]></category>
		<category><![CDATA[computing]]></category>
		<category><![CDATA[visualisation]]></category>

		<guid isPermaLink="false">http://www.digitaldimsum.co.uk/?p=79</guid>
		<description><![CDATA[<p>The following article, written by my colleague Matt Simons and me, <a href="http://www.cutter.com/offers/legacymod.html">was published</a> in the December 2010 issue of the <a href="http://www.cutter.com/itjournal.html">Cutter IT Journal</a> and is reproduced here with kind permission. It was also the subject of a <a href="http://www.thoughtworks.com/tackling-legacy-technology">talk we delivered in Santa Clara</a> in September.</p>
<h2>The landscape is changing</h2>
<p>Since the dawn of the software era, systems have generally followed a lifecycle of develop/operate/replace. For the type of systems our company, ThoughtWorks, specializes in (typically built over the past 10-15 years), organizations expect as much as 5-10 years between significant investments in modernization. And some of the oldest core systems have now reached 40+ years — far longer than the average life-span of most companies today!</p>
<p>IT assets are relatively long-lived largely because modernization often represents a significant investment that doesn’t deliver new business value in a form that is very visible to managers or customers. Therefore organizations put off that investment until the case for change becomes overwhelming. Instead, they extend and modify their increasingly creaky platforms by adding features and making updates to (more or less) meet business needs.</p>
<p>For decades, this tension between investing in modernization versus making incremental enhancements has played out across technology-enabled businesses. Every year some companies take the plunge and modernize a core system or two, while others opt to put yet another layer of lipstick on the pig.<br />
<!--more--><br />
We see this pattern being disrupted as the demands being placed on legacy systems undergo a fundamental shift. Previously, system changes were driven by business requests for incremental features, and IT had to deal with a major new technology platform or architecture only every five years or so. Today, the viable lifespan of any business model is shrinking, driving demand for wholesale feature changes on a nearly continuous basis. For many of these new features, the benefits of leveraging one of the ever-expanding varieties of new architectures and platforms are significant. For example, your current infrastructure was probably not designed to enable you to connect your supply chain directly to an e-commerce channel or provide customer self-service via mobile devices. Sure, you may be able to force that square peg into your round hole, but the chances of an elegant, extensible solution are slim.</p>
<p>The “lifecycle model” of software systems is becoming irrelevant. The companies that will excel in the future will be the ones that learn to incrementally modernize and then continuously evolve their core technology assets to thrive in an ever-more volatile business and technology environment.</p>
<h2>The first step is admitting you have a problem</h2>
<p>Organizations that are constrained by creaky platforms are often slow to identify this as the root cause of their trouble. Instead, as release cycles grow longer and delivery quality declines, fingers get pointed at IT or at product/service vendors who are getting bogged down trying to work around the underlying problems caused by mounting technical debt in a toxic systems environment.</p>
<p>Figure 1 illustrates some key indicators of an underlying creaky platform, grouped according to whether they are felt more acutely by business or IT stakeholders and by whether they are intuitive or measurable. As you look through the factors, keep in mind that none of these should be considered “normal” or something that “comes with the territory” in IT. In fact, there are organizations out there operating in complex and volatile enterprise environments that do not experience any of these problems.</p>
<p><img src="http://www.digitaldimsum.co.uk/wp-content/uploads/2011/02/signs-of-a-creaky-platform.png" title="Signs of a creaky platform" width="504" height="378" class="alignnone size-full wp-image-98" /><br />
<strong>Figure 1 — Signs of a creaky platform.</strong></p>
<p>We have often found that the intuitive factors manifest themselves in advance of the measurable factors. These can therefore be considered leading indicators that, should they appear, signal you to dig a bit deeper into your underlying technology infrastructure.</p>
<p>When evaluating a complex situation, organizations typically respond well to logical, fact-based arguments supported by quantitative data. So, as you consider whether you need to make a case to modernize, you should apply some measurements to areas of intuitive pain. There are different techniques for quantifying business and IT pain.</p>
<h3>Quantifying business pain</h3>
<p>Many of the business pain items in Figure 1 are straightforward to measure and don’t require special techniques. Simple counts of bugs, feature backlogs, and release frequencies help quantify the business impact of an underlying creaky platform. These items are especially effective when presented as trends over time versus point-in-time readouts.</p>
<p>However, one common type of business pain, cumbersome business processes, benefits from a more structured quantitative approach. There are many ways to do this, but one of the best is a technique from lean thinking called value stream analysis. This approach analyzes customer value–producing processes and identifies areas of waste. Measuring waste provides a powerful quantification of the business impact of cumbersome processes.</p>
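<p>As a rough illustration of how waste might be quantified, the sketch below adds up value-adding time and waiting time for each step of a process and reports the share of total lead time that actually produces customer value. The step names, the hours, and the decision to treat all waiting time as waste are assumptions made for the example, not figures from a real engagement.</p>

```python
from dataclasses import dataclass


@dataclass
class Step:
    name: str
    value_add_hours: float  # time spent on work the customer would pay for
    wait_hours: float       # queueing, hand-offs, rework: treated here as waste


def summarize(steps):
    """Total the value-adding and waiting time across a process."""
    value_add = sum(s.value_add_hours for s in steps)
    waste = sum(s.wait_hours for s in steps)
    lead_time = value_add + waste
    efficiency = value_add / lead_time if lead_time else 0.0
    return {
        "lead_time_hours": lead_time,
        "waste_hours": waste,
        "efficiency": efficiency,
    }


# Hypothetical order-handling process used only to exercise the function.
steps = [
    Step("capture order", 0.5, 4.0),
    Step("re-key into billing system", 0.25, 16.0),
    Step("approve and ship", 1.25, 8.0),
]
summary = summarize(steps)
```

<p>Tracked release after release, a figure like the efficiency ratio gives the trend-over-time view recommended above.</p>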
<h3>Quantifying IT pain</h3>
<p><a href="http://www.martinfowler.com/bliki/TechnicalDebt.html">Technical debt</a> is the phrase used to summarize the pain caused by the cumulative effect of all the tactical, short-term, “band-aid” solutions we use on IT projects. Just as in life, where it often makes sense to take on debt, at times it makes sense to go into technical debt in order to reach certain delivery milestones. Organizations that take a conscious decision to increase technical debt are usually sophisticated enough to plan to pay it down over time once the short-term objective has been reached. The problem is that technical debt is often built up unconsciously and, like a runaway credit card account, <a href="http://www.alphaitjournal.com/2008/07/cross-technical-balance-sheet-part-i.html">reaches a point where it becomes very difficult to bring the balance down</a>.</p>
<p>Organizations often unwittingly amass technical debt because the major metrics managers have access to and are measured against are scope and budget-based and rarely include intrinsic quality metrics. The challenge is finding ways to balance these velocity-based metrics with quality metrics that can highlight the hidden cost of the continual tradeoffs that are made.</p>
<p>More objective measures of software quality do exist and can be used to track and control technical debt. Individually, none gives the whole picture, but together they begin to tell a cogent story. These metrics fall into three broad categories:</p>
<ul>
<li>Static code analysis (<a href="http://javancss.codehaus.org/">complexity</a>, <a href="http://pmd.sourceforge.net/cpd.html">duplication</a>, <a href="http://www.headwaysoftware.com/">cohesion, tangling</a>, etc.),</li>
<li>Rule violations/programming errors (which can be identified using <a href="http://checkstyle.sourceforge.net/">Checkstyle</a>, <a href="http://findbugs.sourceforge.net/">FindBugs</a>, <a href="http://pmd.sourceforge.net/">PMD</a>, etc.)</li>
<li>Test quality (<a href="http://emma.sourceforge.net/">coverage</a>)</li>
</ul>
<p>Later in this article, we highlight some techniques for using metrics to drive remediation efforts, but when attempting to quantify technical debt, we have found it helpful to compare metrics against a set of well-known applications. Benchmarking a bundle of metrics against a few open source and internal projects of varying, but known, quality provides a clear comparative view of an application’s quality.</p>
<h2>Sounding the call to action</h2>
<p>Having decided that a system needs to be modernized, you need to get money to do it. With many priorities competing for funding, spending on legacy systems can be a tough sell. In our experience, timing is everything, and specific situations or events can open the door to modernization:</p>
<ul>
<li><strong>New leader</strong>. Often new executives look to make their mark with a major initiative, and businesses give them leeway to do so. New people are also not vested in relationships and decisions made before their time, giving them more freedom to consider alternatives.</li>
<li><strong>New rules</strong>. New regulations or standards create non-negotiable drivers to update legacy systems. Depending on the scale of change, this may present an opportunity to make a case to deliver the change on a modernized platform as opposed to modifying the existing one.</li>
<li><strong>Business crisis</strong>. Most businesses respond assertively to threats and crises. One of our customers enjoyed years of a near monopoly, despite a core creaky platform that caused them to consistently underdeliver against their product roadmap. The impetus for modernizing that platform came when a major customer left for a competitor because it had lost faith in the roadmap.</li>
<li><strong>Opportunities lost</strong>. Losing prospective new customers because of the application’s defects and apparent age is a powerful motivator for modernizing a creaky platform.</li>
<li><strong>New strategy</strong>. Aligning modernization with a key business strategy is wise. For example, we worked with an organization that ran an ad-driven community Web site. When business leaders decided to sell that Web site as a reskinnable platform to multiple customers, the development team made a successful argument to invest in modernizing the site.</li>
<li><strong>Technology breakthrough</strong>. Sometimes a new development in technology changes the economics of remediation or creates a new source of return on that investment. Thinking through the applications of new technology to your remediation efforts is worth the time.</li>
</ul>
<h2>IT-business collaboration is critical</h2>
<p>Too many organizations make a fundamental error by approaching modernization as an IT-only problem. In so doing, they miss an opportunity to create new business value. They also tend to prioritize the work from a technical perspective, which often results in a quite different approach and solution than one created collaboratively with business stakeholders.</p>
<p>The perils of the IT-only approach were brought home to us during a consulting engagement in which the IT department of an investment bank asked us to validate their modernization roadmap. The roadmap was a plan to replace 29 systems over almost five years, resulting in a cutting-edge IT infrastructure. One of the first things we did was to ask key business stakeholders if the roadmap was aligned with their priorities. We were shocked to discover they weren’t even aware of the initiative. They were very concerned that by proceeding without business input, IT was likely to just rebuild all the redundant and inappropriate systems the business was struggling with, warts and all.</p>
<p>This is an extreme case, but it happens more frequently than you might expect. Fortunately, this story has a happy ending. We were able to broker a conversation between IT and the business that resulted in a major rationalization of the application portfolio and delivered a leaner, better-performing system much more quickly than the initial roadmap. The most successful modernization efforts are jointly planned and executed to deliver against IT and business priorities, incrementally evolving toward a better state for all stakeholders.</p>
<h2>Deciding how to proceed</h2>
<p>Once you’ve got your funding, you are faced with a decision about how to proceed. The two primary dimensions to consider are refactoring versus rewriting and “big bang” versus incremental. A rewrite can recreate exact feature parity with the existing application, just implemented in a new technology, or it can include redesigning the functionality. Despite the added complexity in testing, we strongly recommend taking the opportunity to identify what functionality is still really needed by the business. The keys to success in this scenario are:</p>
<ul>
<li>Working in tight coordination with the business and end users to gain constant verification of fitness for purpose</li>
<li>Working in very small increments that can be fully validated and vetted</li>
<li>Keeping the existing application in place to retain all existing functionality in other areas</li>
</ul>
<h3>Refactoring vs. Rewriting</h3>
<p>Our general advice is, where possible, to look first at incremental refactoring. Good development practices should always include a refactoring phase when each new feature is added to maintain a simple, elegant, and well-factored design.<br />
We find refactoring is best performed incrementally. Executing a large-scale refactoring exercise in isolation from the main code line (i.e., on a separate branch) should be considered dangerous. The key to a successful refactoring effort is doing it hand in hand with your normal project or production support team, integrating as you go.</p>
<p>If an application has deteriorated to such an extent that refactoring efforts are too big or painful to countenance, then you are faced with a total rewrite. Again there is a choice between big bang and incremental approaches.</p>
<h3>Big Bang vs. Incremental</h3>
<p>Replacing an application in a big bang is rarely our recommended strategy. Attempting to create feature parity with the legacy application extends timelines to the point that requirements are likely to have changed significantly between design and final delivery. Without feedback from live usage, it is likely the new version won’t meet all the business needs. The risk of the final cutover is also large, since the new application has yet to be battle-tested in production and a full data migration will be required.</p>
<p>Our preferred approach is a phased, incremental strategy. Though this may seem counterintuitive given the extra effort required to work around the existing application, our experience has shown that this minor cost is heavily outweighed by the reduced risk of the migration, the fitness for purpose of the resulting application, and the decreased disruption posed by the overall process.</p>
<p>Many business justifications for replacing an application include claims that the new application will be more “extensible.” There is a false premise that extensibility comes from up-front design activities that define modules, extension points, XML configuration, and the like. Our firm belief is that the best way to end up with an extensible platform is to extend it as you go. If you build your application incrementally, there is a good chance that you will make it extensible, particularly if you put in place the practices and patterns required to extend an application continuously, such as automated testing and simple modular design. Incremental approaches tend to increase the likelihood that your application will support ongoing extension.<br />
The real challenge, then, is effectively performing an incremental rewrite of an application. We have some recommended methods and advice on approach and coordination.</p>
<h2>Using metrics and visualization to drive remediation</h2>
<p>Technical debt, <a href="http://www.m3p.co.uk/blog/2010/07/23/bad-code-isnt-technical-debt-its-an-unhedged-call-option/">just like its financial cousin</a>, has the nasty habit of compounding. If you don’t pay it down regularly, then the ultimate recourse is declaring bankruptcy and reaching for the rewrite. The problem with the code metrics tools we mentioned earlier is that they tend to provide too much information to drive actionable remediation decisions. I remember attempting to run Checkstyle across a Fortune 500 client’s code base, and the program core-dumped before completing! In contrast, correlation and visualization are two particularly useful techniques for obtaining a holistic overview of the health of a system and also directing remediation activities.</p>
<h3>Correlation</h3>
<p>When one client was struggling to make an impact on their technical debt, we helped them by correlating multiple metrics to direct their remediation activities on the highest-priority problem areas. Our premise was that if an area was complex but rarely touched, then it was less dangerous than one that was under heavy development. Likewise, a complex area covered with good automated testing is less critical than a similar one with no test coverage. Following this thinking, we created an aggregated risk metric that correlates complexity, test coverage, and volatility. Volatility was defined as a function of source control commit activity on the area of code — frequent activity indicated high volatility. This definition allowed us to pinpoint a small set of high-risk areas to address first; in a haystack of millions of lines of code, it called out a few very specific places to begin refactoring and rationalizing.</p>
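<p>The article does not give the exact formula behind the aggregated risk metric, so the following is only a minimal sketch of the idea. It assumes each area of code has a complexity figure, a test coverage fraction, and a commit count for some period, normalizes complexity and commits against the largest values seen, and multiplies them with the untested fraction. The multiplication, the field names, and the sample numbers are all assumptions made for illustration.</p>

```python
from dataclasses import dataclass


@dataclass
class AreaMetrics:
    name: str
    complexity: float  # e.g. summed cyclomatic complexity for the area
    coverage: float    # fraction of lines covered by tests, 0.0 to 1.0
    commits: int       # commits touching the area in the chosen period


def risk_scores(areas):
    """Combine complexity, volatility and missing coverage into one score."""
    max_complexity = max(a.complexity for a in areas) or 1
    max_commits = max(a.commits for a in areas) or 1
    scores = {}
    for a in areas:
        complexity = a.complexity / max_complexity  # 0..1, relative to worst area
        volatility = a.commits / max_commits        # 0..1, relative to busiest area
        untested = 1.0 - a.coverage                 # 0..1, share with no tests
        scores[a.name] = complexity * volatility * untested
    return scores


def riskiest(areas, top=3):
    """Return the names of the highest-scoring areas, worst first."""
    scores = risk_scores(areas)
    return sorted(scores, key=scores.get, reverse=True)[:top]


# Hypothetical figures used only to exercise the functions.
areas = [
    AreaMetrics("billing", 900, 0.10, 120),
    AreaMetrics("reporting", 1000, 0.05, 3),
    AreaMetrics("orders", 400, 0.80, 150),
    AreaMetrics("utils", 50, 0.90, 10),
]
```

<p>With these sample numbers, the complex but rarely touched reporting area scores far lower than billing, which is complex, poorly tested, and under heavy development. That is the prioritization the premise above calls for.</p>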
<h3>Visualization</h3>
<p><a href="http://erik.doernenburg.com/2008/11/how-toxic-is-your-code/">Toxicity</a>, another aggregated metric, has been playing a prominent role in our “system health checks” at ThoughtWorks. Toxicity charts stack multiple static analysis metrics for classes, methods, or components within an application, providing a combined “toxicity” score for each area of the code base (see Figure 2). This gives our clients guidance on where to start looking to fix problems.</p>
<p><img src="http://www.digitaldimsum.co.uk/wp-content/uploads/2011/02/toxicity.png" alt="" title="toxicity chart" width="353" height="224" class="alignnone size-full wp-image-99" /><br />
<strong>Figure 2 &#8211; Toxicity chart</strong></p>
<p>Using visualization in this way allows us to avoid drowning in a sea of data. The human visual cortex is much more efficient at complex pattern recognition than most programs we could write, so it makes sense to leverage that capability.</p>
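<p>To make the scoring behind a chart like Figure 2 more concrete, here is a small sketch of one way a toxicity score could be computed. It assumes each metric has a threshold and that a class only contributes when it exceeds that threshold, in proportion to how far it overshoots. The metric names, thresholds, and class figures are illustrative assumptions and are not taken from the tooling behind the chart.</p>

```python
# Hypothetical thresholds; real projects would tune these.
THRESHOLDS = {
    "method_length": 30,
    "cyclomatic_complexity": 10,
    "parameter_count": 6,
}


def toxicity(metrics, thresholds=THRESHOLDS):
    """Return per-metric contributions and their total for one class."""
    parts = {}
    for name, limit in thresholds.items():
        value = metrics.get(name, 0)
        # Only values above the threshold count, scaled by the overshoot.
        parts[name] = value / limit if value > limit else 0.0
    return parts, sum(parts.values())


# Made-up classes used only to exercise the function.
classes = {
    "OrderManager": {"method_length": 90, "cyclomatic_complexity": 25, "parameter_count": 4},
    "Money": {"method_length": 12, "cyclomatic_complexity": 3, "parameter_count": 2},
}
ranked = sorted(classes, key=lambda c: toxicity(classes[c])[1], reverse=True)
```

<p>The per-metric contributions are what would be stacked into a single bar, and the ranked list gives the left-to-right order of the bars.</p>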
<p>The basic stacked bar chart used for toxicity is a good start, but if you want to correlate multiple variables, then tree maps are a powerful tool in that their combination of size, location, and color allows you to overlay more complex information onto a single image (see Figure 3). The nested nature of the visualization maps well onto the hierarchical nature of most code bases; color and size are then used to aggregate other metrics such as lines of code, complexity, or coverage. Again, this provides a single-shot overview of health as well as forensic information on where to look for the smoking gun.</p>
<p><img src="http://www.digitaldimsum.co.uk/wp-content/uploads/2011/02/treemap.png" alt="" title="treemap" width="353" height="265" class="alignnone size-full wp-image-100" /><br />
<strong>Figure 3 &#8211; Treemap of code complexity</strong></p>
<p><a href="http://www.panopticode.org/gallery/index.html">Tree maps</a> show how the code is organized into packages and classes and visualize their relative sizes. The size of the various rectangles represents the size of the class files and the encompassing packages. Coloring is then used to layer on an extra metric of interest (in this case, complexity).</p>
<p>Taking this technique one step further, a three-dimensional visualization called a “<a href="http://www.inf.usi.ch/phd/wettel/codecity.html">code city</a>” gives a real feel for the personality of the different neighborhoods in a code base (see Figure 4). This view of an application as a city supports the analogy that you need to pair program when the code base is so dangerous that you’re afraid to go in alone. A code city visualization is similar to a tree map (see above) but uses the third dimension to overlay an extra metric to correlate. An ideal combination is correlating complexity to test coverage. So, if the visualization maps lines of code to area, complexity to height, and test coverage to color, then a neighborhood containing large, tall, red buildings would represent an area of the code base that contains large, highly complex, and untested classes. Clearly this would be an area in which you would want to proceed with extreme caution.</p>
<p><img src="http://www.digitaldimsum.co.uk/wp-content/uploads/2011/02/code-city.png" alt="" title="code-city" width="350" height="232" class="alignnone size-full wp-image-101" /><br />
<strong>Figure 4 &#8211; CodeCity visualization</strong></p>
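<p>The mapping described for the code city (lines of code to area, complexity to height, test coverage to color) can be sketched as a simple function. This is not how the CodeCity tool itself is implemented; the function name, the square footprint, and the red-to-green color scale are assumptions chosen to keep the example short.</p>

```python
import math


def building(loc, complexity, coverage):
    """Map one class's metrics to hypothetical building attributes."""
    side = math.sqrt(loc)   # square footprint whose area equals lines of code
    height = complexity     # taller means more complex
    # Colour runs from red (untested) to green (fully covered).
    red = round(255 * (1.0 - coverage))
    green = round(255 * coverage)
    return {"side": side, "height": height, "rgb": (red, green, 0)}


# A large, complex, untested class: a big, tall, red building.
danger = building(loc=400, complexity=35, coverage=0.0)
```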
<p>As helpful as they are, automated metrics are only part of the story in avoiding technical debt. Code has to communicate effectively with both computer and human audiences. Automated functional and unit tests can tell you how well the code communicates with computers by verifying the expected behavior. However, automated metrics can only hint at how well the code communicates with humans. Human involvement is ultimately needed in the evaluation of any code base. We prefer doing this in real time through pair programming, though code reviews and other techniques can provide similar benefits. Remember: metrics should be the beginning, not the end, of the conversation.</p>
<h2>How to replace your legacy application</h2>
<h3>Team Considerations</h3>
<p>Before embarking on replacing an application, it is worth taking stock of your existing organization. <a href="http://en.wikipedia.org/wiki/Conway's_Law">Conway’s Law</a> states that the architecture of an application will come to mirror the communication patterns of the organization that created it. This can be summarized as “Dysfunctional organizations tend to create dysfunctional applications.” To paraphrase Einstein, you can’t fix a problem from within the same mindset that created it, so it is often worth investigating whether restructuring your organization or team would prevent the new application from displaying all the same structural dysfunctions as the original. In what could be termed an “inverse Conway maneuver,” you may want to begin by breaking down silos that constrain the team’s ability to collaborate effectively. Of course there are many situations where this may not be realistic, but remember that Conway’s Law talks of the “communication structures” of an organization rather than reporting structures. There are often opportunities to improve communication pathways in lightweight ways without having to grapple with thornier organizational issues.</p>
<p>A set of recurring themes emerges from teams that have successfully executed incremental application rewrites:</p>
<ul>
<li>Working on the main code line (or trunk) is vital to avoiding painful merges or missing important improvements occurring in the underlying application.</li>
<li>Having the team (at least partially) populated with people who have lived with the pain of the existing application and have a deep understanding of the subtleties of the business and technology domain is key in shaping the new product to meet the business’s needs.</li>
<li>Creating a small, colocated team is also recommended, as is having a clear and focused charter of the business need that each phase of the project is delivering.</li>
<li>Practicing and executing data migrations is best done from the outset of the project while it is still a tractable problem.</li>
</ul>
<h3>Technical Approach</h3>
<p>A favored approach for incrementally replacing a live application is the so-called “<a href="http://www.martinfowler.com/bliki/StranglerApplication.html">strangler application</a>” named after the family of tropical strangler figs. These plants grow quickly around an existing tree, using its existing structure for support and shape. Over time they thicken and fuse, completely surrounding and replacing the original tree, leaving a new version standing in its place as the old one withers and dies. The <a href="http://skizz.biz/blog/an-agile-approach-to-a-legacy-system/">strangler application</a> uses the same approach of creating a thin wrapper around an existing application, then gradually peeling or slicing off and replacing functionality. Features are gradually migrated from the legacy to the new application until nothing of value remains in the old, enabling a graceful retirement.</p>
<p>Common patterns include:</p>
<ul>
<li><strong>Intercepting requests</strong> at the front of an application, then redirecting certain ones to the legacy application and others to the new application</li>
<li>Sharing a single <strong>integration database</strong> or regularly trawling the old database to populate the new one</li>
<li>Peeling back the application vertically <strong>tier by tier</strong> (maybe replacing the database, presentation layer, or business logic first)</li>
<li><strong>Intercepting events</strong></li>
<li>Some combination of the above</li>
</ul>
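<p>The first pattern in the list above, intercepting requests, can be sketched as a thin routing layer. The example assumes requests are identified by URL path and that a feature counts as migrated once its path prefix is added to a list. The prefixes and the two handler functions are hypothetical stand-ins for the real applications.</p>

```python
# Hypothetical feature paths that have already moved to the new application.
MIGRATED_PREFIXES = ("/orders", "/invoices")


def legacy_app(path):
    """Stand-in for forwarding the request to the legacy application."""
    return f"legacy handled {path}"


def new_app(path):
    """Stand-in for forwarding the request to the new application."""
    return f"new handled {path}"


def route(path, migrated=MIGRATED_PREFIXES):
    """Send migrated features to the new application, everything else to legacy."""
    if path.startswith(migrated):
        return new_app(path)
    return legacy_app(path)
```

<p>Migrating another feature then means adding one prefix, and retiring the legacy application means that nothing falls through to it any more.</p>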
<p><img src="http://www.digitaldimsum.co.uk/wp-content/uploads/2011/02/strangler-pics.png" alt="" title="Strangler application steps" width="500" height="750" class="alignnone size-full wp-image-115" /><br />
<strong>Figure 5 &#8211; A basic &#8220;strangler application&#8221; sequence</strong></p>
<p>Most of these patterns involve creating an extra piece of infrastructure, such as an indirection layer or a polling system, which may appear to be wasted effort. Interestingly, we regularly find that the strategies required to enable a gradual migration often prove to be valuable long-term architectural buffering devices that both provide resilience during system upgrades or outages and offer the seams needed for future enhancements. An example of this was when we provided a major ISP a mechanism to replace their application stack incrementally by separating the system that captures new client orders from the system that processes them. Later on this separation proved invaluable in enabling them to continue taking orders even while their main order processing system was down for maintenance.</p>
<h2>Breaking the cycle of pain</h2>
<p>The recommendations here provide suggestions for how to modernize, but that is only half the battle. To really elevate your enterprise to the next level, you need to break the modernization-degradation cycle permanently. Fortunately, the tools and approaches that help you incrementally modernize are exactly the same ones that can eliminate the need to ever do it again:</p>
<ul>
<li>By using sophisticated <strong>automated metrics</strong> to continuously identify degrading areas of your system, you can strike a better balance between incremental remediation and adding new features going forward.</li>
<li>By practicing an incremental, <strong>evolutionary approach to architecture</strong> (called a “strangler” in the context of remediation efforts, but just “evolutionary architecture” outside that context), you can avoid the need to embark on big-bang replacements.</li>
<li>By continuing to <strong>add tests</strong> at the same rate as you add new functionality, you can create a hygienic technical environment that gives you the confidence to make significant architecture or technology changes when a new business requirement necessitates them.</li>
</ul>
<p>Organizations that continue to think “system lifecycle” will, in the long run, lose ground to those that think “system evolution.” If you’re about to invest in a modernization effort, why not do so in a way that positions you for a fundamentally different future?</p>
]]></description>
		<wfw:commentRss>http://www.digitaldimsum.co.uk/2011/02/03/dealing-with-creaky-legacy-platforms/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Build Transformation across an Organization</title>
		<link>http://www.digitaldimsum.co.uk/2008/04/17/build-transformation-across-an-organization/</link>
		<comments>http://www.digitaldimsum.co.uk/2008/04/17/build-transformation-across-an-organization/#comments</comments>
		<pubDate>Thu, 17 Apr 2008 06:55:07 +0000</pubDate>
		<dc:creator>jonny</dc:creator>
				<category><![CDATA[build]]></category>
		<category><![CDATA[computing]]></category>
		<category><![CDATA[consulting]]></category>
		<category><![CDATA[visualisation]]></category>

		<guid isPermaLink="false">http://www.digitaldimsum.co.uk/?p=34</guid>
		<description><![CDATA[<p>My most recent project was helping a major online retailer to mature their build process as part of a wider effort to improve their IT effectiveness through the injection of development best practices.</p>
<p>When we came onboard, manual intervention was needed for any of their builds or deployments to work, so it was rare for more than a couple of builds or deployments to be completed successfully in a day. Now we often have up to 1,000 builds running every day &#8211; what’s more, the majority of them now pass!</p>
<p>This article looks at a few of the techniques we’ve had to put in place to enable this transformation and what we’ve learnt along the way.</p>
<p><!--more--></p>
<h3>Dividing up builds &#8211; separation of responsibilities</h3>
<p>Initially we had two versions of the build &#8211; one that would run on commit and a second that would only run at night. The two versions did pretty much the same things, except the nightly builds would deploy to a shared testing environment. Deployments only happened at night to avoid disruption to the multiple teams working in this shared environment (another smell to be dealt with later). The on-commit build did compilation, unit testing and packaging &#8211; the nightly build did the same and then handled environment preparation and deployment.<br />
There were a few problems with this:</p>
<ul>
<li>The on-commit builds ran slowly because they were doing a lot of work</li>
<li>The nightly builds were very brittle since responsibility for different parts of the build (compilation, packaging, deployment etc) was split among multiple teams &#8211; a failure at any step would mean QA didn&#8217;t have a new build to test that day</li>
<li>It wasn&#8217;t <a title="Don't Repeat Yourself" href="http://en.wikipedia.org/wiki/Don't_repeat_yourself">DRY</a></li>
</ul>
<p>Our solution was to divide the monolithic build up into several smaller steps and tie the steps together into a &#8220;build pipeline&#8221;, with each step triggered by a successfully completed &#8220;upstream&#8221; build. Our pipeline was divided into quick, assemble, package, deployment and regression builds. “Downstream” builds would pick up the artifacts prepared at the previous step in the build.</p>
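<p>The chaining of builds can be illustrated with a minimal sketch in which each stage is a function returning success or failure, and a stage only runs if the one upstream of it passed. The stage names come from the pipeline described here; the runner functions are placeholders, and the real pipeline was built from separate Cruise builds rather than a single loop like this.</p>

```python
STAGES = ["quick", "assemble", "package", "deployment", "regression"]


def run_pipeline(stage_runners, stages=STAGES):
    """Run stages in order; a stage runs only if its upstream stage passed."""
    results = {}
    for stage in stages:
        passed = stage_runners[stage]()
        results[stage] = "passed" if passed else "failed"
        if not passed:
            break  # downstream stages are never triggered
    return results


# Placeholder runners: every stage passes except packaging.
runners = {name: (lambda: True) for name in STAGES}
runners["package"] = lambda: False
outcome = run_pipeline(runners)
```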
<p>This led to various benefits:</p>
<ul>
<li>Each build ran more quickly</li>
<li>We didn&#8217;t repeat steps</li>
<li>It was easier to divide up responsibilities for keeping the different builds green:
<ul>
<li>Developers :: <em>quick</em> build (compilation + unit tests)</li>
<li>Build team :: <em>assemble</em> build (jar, war, ear creation etc)</li>
<li>Deployment team :: <em>package + deployment</em> (preparing RPMs / configuration / deploying)</li>
<li>QA team :: <em>regression suites</em></li>
</ul>
</li>
<li>Because we used the &#8220;last known good&#8221; upstream artifact, deployments weren&#8217;t blocked by a last minute problem upstream in the pipeline.</li>
<li>We were now in a position to split the builds up over more boxes to achieve greater throughput through parallelism</li>
</ul>
<p>This concept of dividing the responsibilities (at least for first and second line build support) had a major impact on improving success rates and response times to failures through an increased sense of ownership of the process.</p>
<h3>Tying the builds together &#8211; HTTP publishing</h3>
<p>In an ideal world the input for a downstream build should be the output from an upstream build &#8211; the appearance of a new artifact from one build would trigger the next build down the chain. In our situation that wasn’t quite possible (the server topology and optimal dividing lines between builds, from a speed point of view, didn’t favor communicating via artifacts) so we took a different approach.</p>
<p>Our solution was for builds to communicate with each other via small properties files, published locally and accessed over HTTP. On successful completion each build publishes a properties file (to the local Cruise web-server) containing the build number and SVN revision of the build. We built a custom Cruise publisher to handle this. We also built a custom Cruise HTTP modification-set to watch for upstream builds and an HTTP label-incrementer to keep all the build numbers in the pipeline in sync. The same build number is used through each of the steps of the build, making it easier to check whether changes have flowed through to the QA environment – more on this later.</p>
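<p>As a rough illustration of the properties-file handshake, here is a minimal Python sketch. It is not our Cruise publisher or modification-set: the key names <code>build.number</code> and <code>svn.revision</code> and the helper names are assumptions made up for the example, and the HTTP fetch itself is left out.</p>

```python
"""Sketch: publish and read a tiny build-properties file."""


def render_build_properties(build_number: int, svn_revision: int) -> str:
    # Key names are hypothetical; the article only says the file holds
    # the build number and the SVN revision of the build.
    return f"build.number={build_number}\nsvn.revision={svn_revision}\n"


def parse_build_properties(text: str) -> dict[str, str]:
    """Parse simple key=value lines, skipping blanks and # comments."""
    properties = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        properties[key.strip()] = value.strip()
    return properties


def should_trigger(upstream: dict[str, str], last_seen_build: int) -> bool:
    """A downstream build runs only when upstream published a newer build."""
    return int(upstream["build.number"]) > last_seen_build


# In practice the text would be fetched over HTTP from the upstream server.
published = render_build_properties(build_number=123, svn_revision=4567)
upstream = parse_build_properties(published)
```

<p>Because the downstream build reuses the upstream build number rather than counting for itself, every step of the pipeline ends up labelled with the same number.</p>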
<h3>Watching the river flow &#8211; visualizing the build pipeline</h3>
<p>Once we had divided up the builds and added further applications, we ended up with close to 100 builds that needed to be managed and monitored. Since they were spread across multiple boxes to increase throughput, this wasn&#8217;t as easy as just watching one large Cruise page. So we built a centralized page for aggregating the build results.<br />
<img class="alignnone size-full wp-image-35" title="Dashboard" src="http://www.digitaldimsum.co.uk/wp-content/uploads/2008/04/dashboard.gif" alt="" width="433" height="191" /></p>
<p>The dashboard page was made up of a few separate components:</p>
<ul>
<li>Current build status grid</li>
<li>Graph of passing vs. failing builds over time</li>
<li>Graph of build times per type / step in the pipeline</li>
<li>Quick list of which builds are currently broken</li>
<li>Some basic code metrics</li>
</ul>
<p>We also provided similar drill-down pages, filtered just to show data on a single application or step in the pipeline. These then link through to the actual Cruise pages for log files, failure reasons and test results.</p>
<h3>Current build status grid</h3>
<p>The current build status grid was probably the greatest breakthrough. The basic idea was to show the current status of all our builds in a simply structured manner. Over a few iterations we evolved a grid structure (a bit like battleships!). The columns in the grid represented steps in the build pipeline and the rows the different applications we were building. The columns were repeated for each active branch. Cells were green if the build was good, red if it was bad and yellow if the last build was good but more than 24 hours old (stale). Mouseovers gave more details about the selected build, such as when it ran and what its build number was.</p>
<p><img class="alignnone size-full wp-image-37" title="status-grid" src="http://www.digitaldimsum.co.uk/wp-content/uploads/2008/04/status-grid.gif" alt="" width="434" height="308" /></p>
<p>From this view informative  patterns were clearly distinguishable with a cursory glance. It became  obvious if problems were localized or spread across a specific application  or step in the pipeline.</p>
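<p>The colouring rule for a single cell is simple enough to sketch in a few lines of Python. This is an illustration rather than the dashboard's actual code; the function name is made up, and treating an old failing build as still red is my assumption.</p>

```python
from datetime import datetime, timedelta

# A good build older than this is shown as stale.
STALE_AFTER = timedelta(hours=24)


def cell_colour(last_build_passed: bool, finished_at: datetime, now: datetime) -> str:
    """Return the grid colour for one application/step cell."""
    if not last_build_passed:
        # Assumption: a failing build stays red however old it is.
        return "red"
    if now - finished_at > STALE_AFTER:
        return "yellow"
    return "green"
```

<p>Each cell in the grid is then just this function applied to the latest build for one application, step and branch.</p>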
<p>Initially we emailed a snapshot  of the grid first thing in the morning and at the end of the day to  management and the teams. This had the wonderful effect of halting all  the annoying questions and sparking the interesting ones. Instead of  people asking whether a build was successful or what build number was  last deployed to QA we got questions like:</p>
<ul>
<li>Why are all the assembly builds broken?</li>
<li>Why is all of application X broken?</li>
<li>Why hasn’t application Y built for 2 days?</li>
<li>Why is there a build on the prod support branch?</li>
<li>Wow – everything’s green – shall we go to the pub?</li>
</ul>
<p>This definitely shone the spotlight in a few uncomfortable places, but the visible improvements this led to made it worthwhile. It became an invaluable tool for educating people about how the pipeline works. Once people understood and were reliant on the data, we replaced the emails with a dynamic, self-service application hosting the data.</p>
<h3>Graphing build metrics over time</h3>
<p>We also had great success with some of our time-based visualizations. Simply graphing the times of each step in the build pipeline over the last month quickly highlighted problems as they developed. In one case we realized that all the packaging builds were getting geometrically slower and slower – a quick bit of research showed that each build was adding new artifacts to SVN and the ensuing checkouts were suffering. Fixing this problematic practice had a massive positive impact on the total throughput of the pipeline.</p>
<p>The historic view of successful  vs. failing builds over time was also a particularly useful tool while  we were focusing on improving the stability of our deployments to QA  / UAT. This gave us a simple report on how we were doing – 50% success  last week, 75% this week – woo hoo!</p>
<h3>Sparklines</h3>
<p>We found a simple yet effective  way of communicating a large amount of information about how a build  has been performing by using “sparklines”.</p>
<p><img class="alignnone size-full wp-image-36" title="sparklines" src="http://www.digitaldimsum.co.uk/wp-content/uploads/2008/04/sparklines.gif" alt="" width="434" height="242" /></p>
<p>For a given build these graphs  communicate the number, duration and success of a large quantity of  builds. You can quickly get an idea of how things are trending. Again  mouseovers provide more details, context and drill down.</p>
<h3>Under the hood</h3>
<p>Initially the grid was generated and emailed by a scheduled round-robin script hitting all the build servers. But this was slow, required us to configure all the build server addresses in one place, wasn’t fault tolerant and stored no build history. So we moved the build telemetry dashboard to a simple Rails webapp employing CSS wizardry for the visualizations. The data was populated via a basic RESTful API. Each build had a custom HTTP publisher (again, we built this) that fired off a simple set of information about the build in a POST request:</p>
<ul>
<li>Project name</li>
<li>Build label</li>
<li>Date / time</li>
<li>Build duration</li>
<li>Build successful</li>
<li>Hostname of the Cruise server</li>
</ul>
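<p>To give a feel for what such a report might look like on the wire, here is a hedged Python sketch. Our real publisher was a Cruise plugin; the dashboard URL, the form field names and the function below are all invented for illustration.</p>

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_report_request(dashboard_url: str, project: str, label: str,
                         finished_at: str, duration_seconds: int,
                         successful: bool, hostname: str) -> Request:
    """Prepare (but do not send) a form-encoded POST describing one build."""
    # Field names are hypothetical; they mirror the list in the article.
    fields = {
        "project": project,
        "label": label,
        "finished_at": finished_at,
        "duration_seconds": str(duration_seconds),
        "successful": str(successful).lower(),
        "hostname": hostname,
    }
    body = urlencode(fields).encode("utf-8")
    return Request(
        dashboard_url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


request = build_report_request(
    "http://dashboard.example/builds",  # placeholder URL
    project="trunk-bigapp-quick",
    label="trunk-bigapp-quick.123",
    finished_at="2008-04-17T09:00:00",
    duration_seconds=95,
    successful=True,
    hostname="cruise01",
)
# Sending it would be: urllib.request.urlopen(request)
```

<p>Having each build server push its own results means the dashboard never needs a list of build servers, which was one of the weaknesses of the original round-robin script.</p>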
<h3>Build naming conventions</h3>
<p>The one extra key to making  this all work was standardizing the format of the Cruise project name  so we could parse out the application, branch and pipeline step without  adding any extra complicated meta-data. We chose “&lt;branch&gt;-&lt;application&gt;-&lt;step&gt;”  as our format, e.g. “trunk-bigapp-quick” or “3.20-littleapp-assemble”.</p>
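<p>Parsing that format takes only a few lines. The Python sketch below is illustrative (the dashboard itself was a Rails app) and assumes that branch and step names never contain a hyphen, so any extra hyphens are taken to belong to the application name.</p>

```python
from typing import NamedTuple


class BuildName(NamedTuple):
    branch: str
    application: str
    step: str


def parse_project_name(name: str) -> BuildName:
    """Split '<branch>-<application>-<step>' into its three parts."""
    # First hyphen ends the branch, last hyphen starts the step.
    branch, _, rest = name.partition("-")
    application, _, step = rest.rpartition("-")
    if not (branch and application and step):
        raise ValueError(f"not a <branch>-<application>-<step> name: {name!r}")
    return BuildName(branch, application, step)
```

<p>For example, <code>parse_project_name("3.20-littleapp-assemble")</code> gives branch <code>3.20</code>, application <code>littleapp</code> and step <code>assemble</code>.</p>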
<h3>Where’s my build?</h3>
<p>Using our custom label incrementer meant that each build in the pipeline for a given SVN revision would have the same build number (e.g. “trunk-bigapp-quick.123”, “trunk-bigapp-assemble.123”, “trunk-bigapp-deploy.123”). This made it much easier for developers and QA to work out what builds and features had made it through to the QA environments. We built on this by providing a customized version of the current status grid to answer exactly that question – where’s my build?</p>
<p><img class="alignnone size-full wp-image-38" title="wheres my build?" src="http://www.digitaldimsum.co.uk/wp-content/uploads/2008/04/wheres-my-build.gif" alt="" width="434" height="157" /></p>
<p>The grid displays the latest  build number at each step – clearly showing how far changes have progressed  and also highlighting how a broken step breaks the pipe. This has become  a vital self-service tool for the dev, QA and deployment teams, and  has saved the build team from a lot of very dull questions.</p>
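<p>The lookup behind a "where's my build?" view can be sketched roughly as follows. This Python example handles a single application on a single branch; the step order and function names are assumptions for illustration, not the dashboard's code.</p>

```python
# Hypothetical step order, taken from the example labels in the article.
PIPELINE_STEPS = ["quick", "assemble", "deploy"]


def latest_builds(labels: list[str]) -> dict[str, int]:
    """Map each step to the highest successful build number seen for it."""
    latest: dict[str, int] = {}
    for label in labels:
        # "trunk-bigapp-quick.123" -> project "trunk-bigapp-quick", number "123"
        project, _, number = label.rpartition(".")
        step = project.rpartition("-")[2]
        latest[step] = max(latest.get(step, 0), int(number))
    return latest


def furthest_step(build_number: int, latest: dict[str, int],
                  steps: list[str] = PIPELINE_STEPS):
    """Return the last step the given build has reached, or None."""
    reached = None
    for step in steps:
        if latest.get(step, 0) < build_number:
            break  # a step that lags behind blocks everything after it
        reached = step
    return reached


successful_labels = [
    "trunk-bigapp-quick.124",
    "trunk-bigapp-quick.125",
    "trunk-bigapp-assemble.124",
    "trunk-bigapp-deploy.123",
]
latest = latest_builds(successful_labels)
```

<p>This only works because the build number is shared across steps; with independent counters per step there would be nothing to compare.</p>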
<p>A growing awareness of the importance of the pipeline has allowed us to devote efforts towards optimizing the total throughput. All the good lean principles definitely apply at this point – looking at optimizing the whole system and working out where we’re queuing or waiting too much nearly always delivers the best results.</p>
<h3>And the rest …</h3>
<p>These tools freed us up to  use the build team as a beach-head for spreading general coding, development  and automation best practices throughout the organization. Some of these  included:</p>
<ul>
<li>Untangling spaghetti dependencies with Ivy</li>
<li>Spawning multiple virtual machines for regression testing</li>
<li>Automating Cruise configuration and version updates</li>
<li>A whole host of black-belt Ant Fu</li>
<li>Correlating multiple code metrics to drive code quality improvements</li>
</ul>
<p>But those are stories for another  occasion …</p>
<p>[This article was originally written for an internal ThoughtWorks innovation newsletter - I've made some minor edits for this blog]</p>
]]></description>
		<wfw:commentRss>http://www.digitaldimsum.co.uk/2008/04/17/build-transformation-across-an-organization/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Blocks of Code</title>
		<link>http://www.digitaldimsum.co.uk/2007/05/16/blocks-of-code/</link>
		<comments>http://www.digitaldimsum.co.uk/2007/05/16/blocks-of-code/#comments</comments>
		<pubDate>Wed, 16 May 2007 20:11:07 +0000</pubDate>
		<dc:creator>jonny</dc:creator>
				<category><![CDATA[coding]]></category>
		<category><![CDATA[visualisation]]></category>

		<guid isPermaLink="false">http://www.digitaldimsum.co.uk/2007/05/16/blocks-of-code/</guid>
		<description><![CDATA[<p>Coding really has become child&#8217;s play. This <a href="http://news.bbc.co.uk/2/hi/technology/6647011.stm" title="Free tool offers 'easy' coding" target="_blank">recent BBC news article</a> points to some of the ways kids are being introduced to programming.</p>
<p>The good people at MIT have put together <a href="http://scratch.mit.edu/" title="Scratch @ MIT" target="_blank">Scratch</a>, a visual tool allowing kids to do drag-n-drop coding. The <a href="http://llk.media.mit.edu/projects/scratch/help/" title="Scratch help screens" target="_blank">help screens</a> give a good idea of how it works. Basically, differently shaped blocks are put together (Lego style) to form programs: a loop looks like a capital C and holds all the nested statements; booleans are pointy-ended and only fit in pointy slots &#8211; likewise numbers are round and only fit in round slots &#8230; a nice simple introduction to strongly typed languages. It actually fits pretty closely with how I visualise blocks of code, so it looks like a great way to introduce children to the coding mind-set. The welcoming colourful blocks are non-threatening and simple to understand.</p>
<p><!--more-->If that all seems too point-n-clicky then there are also <a href="http://hacketyhack.net/" title="HacketyHack" target="_blank">efforts to get kids using Ruby</a>. It&#8217;s a simple hosted and sandboxed service giving children the chance to write real programs &#8211; like a blog in just 6 lines.</p>
<p>This block caught my attention:</p>
<blockquote><p><strong>WHITHER ART THOU, BASIC??</strong></p>
<p>In the 1980s, a language called BASIC swept the countryside.  It was a language <strong>beginners          could use</strong> to make their computer speak, play music.  You could easily draw a big smiley face          or a panda or whatever you like!</p></blockquote>
<p>I&#8217;m interested in their choice of Ruby as the simple &#8220;gateway&#8221; language (the first one&#8217;s free, tell your mates), since there have been various discussions about whether Ruby is too complex or powerful for many inexperienced QA and dev teams to be *trusted* with.</p>
<p>It&#8217;ll be interesting to see how these kinds of tools are adopted. Either way I think some people at my current client could use a more pointy-clicky programming language &#8230;</p>
]]></description>
		<wfw:commentRss>http://www.digitaldimsum.co.uk/2007/05/16/blocks-of-code/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Show don&#8217;t tell: Consulting with GraphViz</title>
		<link>http://www.digitaldimsum.co.uk/2007/05/06/show-dont-tell-consulting-with-graphviz/</link>
		<comments>http://www.digitaldimsum.co.uk/2007/05/06/show-dont-tell-consulting-with-graphviz/#comments</comments>
		<pubDate>Sun, 06 May 2007 18:12:18 +0000</pubDate>
		<dc:creator>jonny</dc:creator>
				<category><![CDATA[build]]></category>
		<category><![CDATA[consulting]]></category>
		<category><![CDATA[visualisation]]></category>

		<guid isPermaLink="false">http://www.digitaldimsum.co.uk/2007/05/06/show-dont-tell-visual-consulting-with-graphviz/</guid>
		<description><![CDATA[<p>I&#8217;ve often found that it&#8217;s much more effective to <em>show</em> clients what their problems are, rather than just <em>telling</em> them. Recently I&#8217;ve ended up using <a href="http://www.graphviz.org/" title="GraphViz" target="_blank">GraphViz</a> as a great tool for highlighting complexity that needs to be addressed.</p>
<p>At the client I&#8217;m currently working for, the complexity of the build scripts was getting out of hand. I wanted to goad the customer into prioritising some simplification work, so I turned to GraphViz to depict how complex the build was. The build we&#8217;re using is a large, centralised Ant script that builds about 10 different applications. It manages everything through the process of compile, test, package and deploy.</p>
<p>I found the <a href="http://ant2dot.sourceforge.net/" title="ant2dot.xsl" target="_blank">handy ant2dot.xsl tool</a> that uses XSL to transform an Ant build file into a DOT format graph representing the flow and dependencies between the various build targets.</p>
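<p>If you want to see the idea without reaching for XSL, the same kind of transformation can be sketched in a few lines of Python. To be clear, this is not ant2dot.xsl and it does far less: it only reads each target's <code>name</code> and <code>depends</code> attributes and ignores things like <code>antcall</code>. The sample build file and function name are made up for the example.</p>

```python
import xml.etree.ElementTree as ET


def ant_targets_to_dot(build_xml: str) -> str:
    """Emit a DOT digraph with an edge from each target to its dependencies."""
    project = ET.fromstring(build_xml)
    lines = ["digraph build {"]
    for target in project.iter("target"):
        name = target.get("name")
        lines.append(f'  "{name}";')
        # Ant lists dependencies as a comma-separated attribute.
        depends = target.get("depends", "")
        for dependency in (d.strip() for d in depends.split(",")):
            if dependency:
                lines.append(f'  "{name}" -> "{dependency}";')
    lines.append("}")
    return "\n".join(lines)


# A tiny, invented build file standing in for the real one.
SAMPLE_BUILD = """<project name="demo" default="package">
  <target name="compile"/>
  <target name="test" depends="compile"/>
  <target name="package" depends="compile, test"/>
</project>"""

dot_source = ant_targets_to_dot(SAMPLE_BUILD)
```

<p>Saving the output to a file such as <code>build.dot</code> and running <code>dot -Tpng build.dot -o build.png</code> renders the picture.</p>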
<p><!--more--></p>
<h3>Before</h3>
<p>The picture that emerged was pretty shocking:</p>
<p style="text-align: center"><img src="http://www.digitaldimsum.co.uk/wp-content/uploads/2007/05/build-scripts-before.png" alt="Before: build script target dependencies" /></p>
<p>Even printed out on the largest paper the client&#8217;s printer could handle, you still couldn&#8217;t read the target names. Images of the <a href="http://www.venganza.org/" title="Flying Spaghetti Monster" target="_blank">Flying Spaghetti Monster</a> kept jumping into my mind. But the tactic worked, and when I showed the client the picture in our Iteration Planning Meeting they were quick to prioritise some work to simplify the build process.</p>
<h3>After</h3>
<p>After a month or so of gradual re-structuring and simplification I repeated the visualisation exercise to see how we were doing. The results were pretty striking:</p>
<p style="text-align: center"><img src="http://www.digitaldimsum.co.uk/wp-content/uploads/2007/05/build-scripts-after.png" alt="After: build script target dependencies" /></p>
<p>Needless to say the pictures told more than a thousand words could: the client was happy, we had a neater picture to help developers understand how the build scripts worked AND the builds were simpler.</p>
]]></description>
		<wfw:commentRss>http://www.digitaldimsum.co.uk/2007/05/06/show-dont-tell-consulting-with-graphviz/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>

