If “Buy Before Build” is the answer then what was the question?

If you made a list of the “Top Five All Time Favourite” principles, then I’m sure “Buy Before Build” would be on the list. It just seems like one of those obvious statements. Why wouldn’t you buy off-the-shelf – proven – software to reduce delivery risk, outsource (non-core) software development, and gain incremental improvements through upgrades?

Buy before build

That’s all very good in theory, but it only works well if the off-the-shelf software provides a good functional fit in terms of business processes, data and organisation, as well as a good technical fit in terms of non-functional requirements and existing technology environment.

The Buy-Before-Build Anti-pattern

This anti-pattern exists when an organisation deploys off-the-shelf software with a poor fit. The result is a pile of custom developed code to change the off-the-shelf software (as illustrated below).

Buy before build – the reality?

Why would this happen? A poor fit exists when:

  • The (assumed) business processes built into the off-the-shelf software require significant business process re-engineering. A system implementation project can accommodate smaller changes, but larger changes are unlikely without a dedicated effort. And that’s of course assuming it is in the organisation’s interest to align business processes. Off-the-shelf implies commoditisation, and no one would like to commoditise their business advantage.
  • Integration with existing IT systems will be difficult and expensive where the off-the-shelf software implicitly assumes itself to be the data and/or process master. Even if the new system has a perfect functional fit, the custom code required to ensure good synchronisation between two masters is the perfect recipe for a cost blow-out.
  • Operational management of an off-the-shelf software system designed for business-hours-only operation may prove impossible within a 24 by 7 environment. Gaps are likely to occur due to long-ish maintenance windows, lack of true redundancy and insufficient configuration management.

Buy-Before-Build is the answer if the off-the-shelf software provides a “good enough” functional, operational and technical fit – and if future upgrades remain feasible without too much complexity. If it doesn’t, then the organisation might have bigger problems to tackle (technical debt?) or it’s trying to standardise its competitive advantage – in those cases, build-before-buy might be the better option (at least in the short-term).

Waterfalls don’t exist…

“We use the waterfall methodology”. You can literally see the scorn on the developer’s face. No one wants to use the waterfall methodology. Except if you are a project manager, of course. They secretly prefer the waterfall, because it shows progression. All these agile iterations, rework, and daily stand-ups are difficult to fit into the weekly status report – when is it all going to finish?

But here’s the thing. There is no waterfall methodology.

I’m not the first to claim that there never was a waterfall model. Some say Winston Royce described the original waterfall method, but he never used the word waterfall, nor did he describe a waterfall-like project management approach. It does look like one on page two in his original paper, but the rest of the paper describes all of the recommended iterations and feedback loops. Not very waterfall-like. I’ve never been part of a project which used a strict waterfall model. Yes, there were stages which we completed, but there were always iterative reviews, change management and several releases of the software.

And the other thing: Agile methodologies are as much a waterfall as traditional methods (as described by Royce).

All agile methods have a start, a middle and an end. Whether you describe the middle as spirals, iterations or re-factoring activities, they are about bringing the project towards its completion. It is true that agile methods are the opposite to the waterfall model, but that is also true for all other software engineering development methods. It is just not possible to develop software in a strict, waterfall fashion. Software development is a creative activity, not a construction activity. We can of course pretend and just iterate away, but that is a risky proposition. As I have written before, it is really not about agile versus traditional methods.

What really matters is the time it takes to make the right set of decisions.

If you can do it fast then you are agile. If not then… well, you are slower. Agile methodologies are good at making decisions fast. But for larger projects, they lack the structure and overall decision authority. This, I believe, introduces the risk of people being forced to make decisions that aren’t really theirs to make, at the wrong time and without the appropriate context. There is nothing agile about incorrect design decisions; think mud.

Not that traditional methods are much better. They often get in the way of decision-making. So will too many stakeholders, design by committee or a lack of authority delegation.  Process and methodology don’t make decisions, people do. Architecture, process and methodologies help to structure and organise the required work. Agile methodologies, prototypes and general feedback help to evaluate the design at certain points.

But true project agility is only achieved through efficient decision-making; decisions made by authorised people at the right time within an appropriate context. This is why architectural decisions are so important to document and manage.

Good, well-documented architectural decisions make you agile – even if you like waterfalls.

Why do we have balls of mud?

IT complexity is often described using metaphors such as “the big ball of mud” or the “software hairball syndrome” to illustrate a haphazardly structured software system. What intrigues me about these systems is their stubborn ability to morph into existence and their persistence. Almost all the companies that I’ve worked with over the years have at least one of these systems, and many have several systems intertwined. And they are clearly problematic to maintain and change.

While searching for answers, four approaches repeatedly turned up in different disguises:

  • Delegate – typically to the Enterprise Architecture team, so when it doesn’t work, we can blame them (win either way).
  • Let’s move it to the cloud – it is the same shape, so it should fit (Mud as a Service?).
  • New tool – where is that mud-removing feature?
  • New method – the IT department is incompetent, so they’ll need a bit of extra help.

Leaving the tongue-in-cheek behind, all the approaches seemed to me like going on a diet. Sure, if there is a lot of complexity then we need to remove it. But diets are rarely much fun, which makes it hard to motivate people and easy to stop regardless of achievements. In fact they often make the problem worse, because they don’t stop you from regaining weight. Diets also don’t address the reason behind the excess weight – we need to look at our habits to find (one of) the reasons.

People don’t create complexity on purpose or out of incompetence. Complexity is rarely in people’s own interest. And if we were incompetent then large corporations shouldn’t be able to function; let alone reasonably well. So why do we end up with complexity despite our best efforts?

While I cannot claim to have the answer, I do think the problem is closely related to how an organisation makes design decisions.

Decision Context

Conway observed that the organisation’s communication structure is mirrored in its software systems. That is, the organisation’s cumulative set of design decisions, agreed through its communication structure, is what forms the systems. Software architecture is about guiding and constraining design decisions, which architects facilitate through the available communication structures.

But there is obviously more to it than just structure – what we communicate (the decision) is also important.

Tyree and Akerman’s article, “Architecture Decisions: Demystifying Architecture”, describes the elements (“what”) of an architectural decision well. Their definition stresses the importance of capturing the decision context such as the associated requirements, constraints, related decisions, and rationale (labelled “context D” in the diagram below); as well as the consequences (also referred to as implications). We tend to think of consequences as something relatively localised, but architectural decisions often have a much wider impact in four different dimensions: project, system, domain, and enterprise. The same decision will have different consequences depending on its context (“C”):

  1. the project; the context for making and implementing decisions
  2. the system; that is, the system being changed or another system impacted by the change
  3. the domain; a horizontal aspect such as security, operations, or a single, enterprise wide business function
  4. the enterprise; the synergy of all domains

The third part is decision priority. Some decisions cannot be made before others, while others are closely interlinked. We structure decisions into decision dependency trees (explicitly or implicitly). In a pure architect-only world, we’ll expand and prune the decision tree as we work our way through it, while carefully evaluating the available options. In reality, there are many influences on the decision priority including schedule, cost, available information and the organisation itself (labelled as context P).

Decision Context

The above diagram seeks to illustrate the three parts: Priority, Decision, and Consequence with their respective contexts. For smaller teams, context C, D, and P are just “the context”, but this is rarely the case within larger organisations. What is the result of this less than perfect context overlap (between P, D, and C)?
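The three parts (priority, decision, consequence) can be sketched as a small data structure. This is only an illustration under my own assumptions, not Tyree and Akerman’s template: the names `Decision`, `Dimension`, `depends_on` and `decision_order`, and the two example decisions, are hypothetical. The ordering relies on Python’s standard `graphlib` module (Python 3.9 or later).

```python
from dataclasses import dataclass, field
from enum import Enum
from graphlib import TopologicalSorter


class Dimension(Enum):
    """The four dimensions in which a decision has consequences (context C)."""
    PROJECT = "project"
    SYSTEM = "system"
    DOMAIN = "domain"
    ENTERPRISE = "enterprise"


@dataclass
class Decision:
    """One architectural decision with its context D and its consequences."""
    name: str
    rationale: str                                          # part of context D
    constraints: list[str] = field(default_factory=list)    # part of context D
    depends_on: list[str] = field(default_factory=list)     # decisions that must be made first
    consequences: dict[Dimension, str] = field(default_factory=dict)

    def unexamined_dimensions(self) -> list[Dimension]:
        """Dimensions for which no consequence has been recorded yet."""
        return [d for d in Dimension if d not in self.consequences]


def decision_order(decisions: list[Decision]) -> list[str]:
    """Return decision names so that each decision follows its dependencies."""
    graph = {d.name: set(d.depends_on) for d in decisions}
    return list(TopologicalSorter(graph).static_order())


# Hypothetical example: the application server choice depends on the language choice.
decisions = [
    Decision("use-java", rationale="existing team skills",
             consequences={Dimension.PROJECT: "no retraining needed"}),
    Decision("pick-app-server", rationale="must run Java", depends_on=["use-java"]),
]
print(decision_order(decisions))
print(decisions[0].unexamined_dimensions())
```

A schedule-driven project effectively reorders or skips entries in `decision_order`, and a non-empty `unexamined_dimensions()` flags a decision whose wider impact nobody has looked at yet.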

Context Misalignment Creates Complexity

Most architects have experienced the scenario, where project schedule drives the decision priorities. This forces the need to take shortcuts through the decision tree, and we risk making less than ideal decisions, either because we don’t have time to find the required information or have insufficient time to perform proper analysis. Another key factor is the organisational structure itself. Bass and others found that projects run into problems, when the chosen architecture required subsequent communication needs which the organisation structure couldn’t support – the anti-pattern of Conway’s Law if you like. What’s interesting about Bass’ findings is that we give priority to decisions that we can solve with a given organisational structure. We do this rather than focus on changing the organisational structure to support the actual, required decision structure.

Making decisions without consideration for their multi-dimensional impacts will also cause problems. While we tend to blame the project manager for forcing us to consider the schedule, cost and available resources, architectural decisions made without appropriate consideration for all four dimensions also add to the complexity – for example:

  • A functional focus without a view to operational domain (project vs domain) will add to the system support and maintenance cost
  • Accepting system capabilities without questioning the functional integrity (system vs domain) will compromise domain level capabilities
  • Driving a strong domain strategy risks compromising core systems and the enterprise itself (domain vs systems vs enterprise), and;
  • Ivory tower syndrome (enterprise vs the rest)

While IT complexity might be one of those impossible problems to solve, I believe that it goes a long way to simply acknowledge and attempt to manage the decision context.

Is “architecture” the best metaphor?

We often rely on metaphors and analogies to explain software, its structure and function. We cannot see software (except for its user interface), so we use metaphors to illustrate. It is true that we can print out the source code and look at it, but such static representation can mislead as far as the software’s true behaviour (we call these defects).

Metaphors can also mislead. A single metaphor can rarely illustrate all aspects, leaving us with an incomplete illustration. Integrating several metaphors is tricky, as it can easily confuse. But without several, a single metaphor will be over-interpreted and extended to cover the gap inappropriately. Consider the word ‘architecture’ as a software metaphor.

We use the word ‘architecture’ as a metaphor to illustrate and explain the purpose of ‘software architecture’ (and feel free to replace ‘software’ with ‘enterprise’, ‘solution’, ‘technical’, etc.). Most people have an intuitive understanding of ‘architecture’ as representing structure, importance, and quality. More generally, the word ‘architecture’ is defined as the “formation or construction of a unifying or coherent form or structure” and can refer to “both the process and the product of planning, designing, and constructing buildings and other physical structures” (Wikipedia).

Software consists of structures including the one found in its source code, although it is not the only structure of concern. Coherence is a desired property of software that we seek to maintain through the formation and construction of the structure. Without coherence, the software is unlikely to work properly and be maintainable. The building architecture metaphor is frequently used to define ‘software architecture’ and sometimes made explicit – see Clements et al., Spewak, The Open Group, or Perry and Wolf as well as Garlan and Shaw.

There is one problem. Software is not a building.

Philippe Kruchten argues in his paper, “The Nature of Software”, that software has very low manufacturing costs. In a strict engineering sense, software manufacturing cost is only the cost of creating the distribution media. The act of programming is similar to creating a cast in mechanical engineering.

This changes the purpose of ‘software architecture’ dramatically from ‘building architecture’ – in other words, the metaphor breaks down.

Software architecture is about guiding and constraining subsequent design decisions and not the subsequent construction (aka manufacturing). Architectural documentation, patterns, and styles guide the design process, while architectural decisions such as “we must use Java” constrain subsequent design decisions (as we can no longer use C# and probably not Microsoft application servers). Architectural design is about defining a suitable Design Space to support subsequent design decisions (which in themselves can also be considered an architecture; enterprise vs solution, solution vs technology etc.).

It occurred to me that many aspects of creating a useful architecture relate as much to cartography (map making) as to building architecture. According to Wikipedia, the fundamental problems of traditional cartography cover:

  • Map editing: Set the map’s agenda and select traits of the object to be mapped. We must tailor our architecture according to our stakeholder concerns and choose appropriate views.
  • Map projections: Represent the terrain of the mapped object on flat media. We must choose the appropriate software element representations (such as those described in UML or ArchiMate) and how they can be mapped between different views.
  • Generalisation: Eliminate characteristics of the mapped object that are not relevant to the map’s purpose to reduce the complexity of the characteristics that will be mapped. If there ever was a single critical activity for software architects, then this is it – generalisation to reduce complexity.
  • Map design: Orchestrate the elements of the map to best convey its message to its audience. If you cannot communicate your architecture then there is no point.

Maps express structure, often with several layers of information. Their intent is to guide a decision process. Cartography works well as a metaphor for software architecture. On the other hand, the (building) architecture metaphor does not encapsulate many of the challenges we face when designing software systems, while implying activities typically associated with IT management methodologies – a consequence of building architecture’s intent to guide construction and not design.

But I guess “software cartographer” does not have the same ring to it – as “software architect” does – even if, as a metaphor, it provides a better representation.


The Elusive Knowledge

“If only they had written a better design document” complained the (to remain unknown) architect in frustration after a series of “he said”, “she said”, and “no, that’s not what I meant”. Software development is tricky business and more so the larger the system. Large systems mean more people with smaller pieces of responsibility (relative to the whole), necessitating more communication to ensure a cohesive system. And our faulty memory makes it wise to document the design.

But what stops us from writing better documentation? What do we mean by “better”?

Architects write documentation to communicate what we know about the software. Software is difficult to observe, see or feel unlike the way we can with buildings. Yes, we can see its user interface, but that’s like attempting to assess the shape, size, location, direction and speed of an iceberg based on an aerial photo. Software architecture is, at its core, a knowledge management problem.

Knowledge is the awareness, understanding and experience with the collected information, where information conveys data and their relationships (Zins, 2007). Within a project context, each of the three (knowledge, information and data) is either documented or undocumented, and either implicit or explicit – as illustrated in figure 1 and described by Kruchten, Lago, and Vliet (2006).

Figure 1 – The Relationship Matrix of Knowledge

The implicit, undocumented knowledge includes what we don’t know that we don’t know. But it also includes people’s experience or intuition (gut feel), as well as the “of course” knowledge. We use checklists and methodologies to cover us against what we don’t know, and we hire good people to gain the best experience. However, the “of course” knowledge is difficult to quantify, as we might be unaware of it. “Of course” is a risk to the project; what is obvious to you is not obvious to the team. If projects are large enough, then people will also build up a certain amount of implicit knowledge about the project, its deliverables, and the involved teams and systems.

The implicit, but documented knowledge is the unattainable – either because we lost or cannot find the manual, because it is incomprehensible (source code), or because it is inaccessible due to contractual or intellectual property limitations. Old systems or third-party developed software will contribute to this category.

The explicit, yet undocumented knowledge covers our observations, the water cooler discussions, and general awareness which we can articulate. This knowledge will quickly become implicit knowledge if it is not documented. And similar to the implicit knowledge, people working as part of large projects will build up this type of knowledge about the project and associated systems.

The last category is what we deliver – explicit, documented knowledge about the new system. However, the problem is to get all of the relevant knowledge into this category.
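The four categories above form a simple two-by-two matrix, which can be sketched in a few lines. This is a minimal illustration under my own assumptions: the names `Knowledge`, `classify` and `at_risk`, and the inventory entries, are hypothetical and not taken from Kruchten, Lago, and Vliet.

```python
from enum import Enum


class Knowledge(Enum):
    """The four quadrants of the knowledge matrix."""
    IMPLICIT_UNDOCUMENTED = "experience, intuition and 'of course' knowledge"
    IMPLICIT_DOCUMENTED = "written down, but unattainable"
    EXPLICIT_UNDOCUMENTED = "can be articulated, but not yet written down"
    EXPLICIT_DOCUMENTED = "what we deliver"


def classify(explicit: bool, documented: bool) -> Knowledge:
    """Map the two axes of the matrix onto one of the four quadrants."""
    if explicit:
        return Knowledge.EXPLICIT_DOCUMENTED if documented else Knowledge.EXPLICIT_UNDOCUMENTED
    return Knowledge.IMPLICIT_DOCUMENTED if documented else Knowledge.IMPLICIT_UNDOCUMENTED


def at_risk(inventory: dict[str, Knowledge]) -> list[str]:
    """Names of knowledge items that are not yet explicit and documented."""
    return sorted(name for name, category in inventory.items()
                  if category is not Knowledge.EXPLICIT_DOCUMENTED)


# Hypothetical inventory for a project.
inventory = {
    "interface specification": classify(explicit=True, documented=True),
    "vendor source code": classify(explicit=False, documented=True),
    "water-cooler agreement on retry policy": classify(explicit=True, documented=False),
}
print(at_risk(inventory))
```

Anything returned by `at_risk` is knowledge that still sits outside the delivered documentation.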

Can we do that? No.

Nonaka (1991) identified four knowledge creating processes in his seminal HBR article “The Knowledge-Creating Company”. Rather than just walk you through the four processes, figure 2 attempts to illustrate them.

Figure 2: The Knowledge Creation Process

The first part involves observations to gather tacit data. If we observe data (aka analyse) then we may find information. This we may be able to articulate into explicit information, which we can then organise into documents with text, charts and diagrams. We can then take the explicit information and apply it within a context to solve a problem or support a decision (the tacit knowledge). And observations about tacit knowledge form a critical part of architectural evaluations such as ATAM.

The problem is that these knowledge creating processes (illustrated as arrows) don’t retain information or knowledge perfectly – some gets left behind. And each of them creates new implicit knowledge – typically in the form of experience – which can be difficult to capture accurately. Knowledge is elusive by nature and therefore challenging to capture perfectly.

To create better documentation, we need to “deliver” all three types of sharable knowledge:

  1. Tacit/implicit – that is mostly in people’s heads. We may be able to express it and thereby make it explicit; or we may not. The portion we can articulate, we deliver through presentations, workshops and other forms of meetings.
  2. Documented – it exists in a written form (email, notes, whiteboard, or word processing document). While these are important during data gathering and analysis to create information, we need to create a formal home either as an appendix or separate support documentation.
  3. Formal – it is not only documented, but also structured in some systematic way using a modelling language (e.g. UML, ArchiMate, SysML, etc.).

Better architectural documentation must address all three types of knowledge, their role in the creation process, and their relationships.


The Social Enterprise – what problem are we trying to solve?

Social Computing along with Cloud Computing is one of the hot IT buzz words – i.e., the Social Cloud must then be the ultimate in buzz word compliance. This is in fact what Andrew McAfee from MIT’s Management school and Mike Gotta from Cisco are discussing.

Andrew presents his Enterprise 2.0 the Indian Way in a recent blog post. He describes a project done internally at Tata Consultancy Services, where they built a social collaboration tool to rate and share the broad collection of project derived knowledge. It sounds deceptively simple, but on the other hand, I have seen the results from a number of similar projects deploying a very structured, formal approach to knowledge sharing – and none of those worked very well – so why not? The real trick at TCS didn’t seem to be so much about the tool, but what motivated the TCS consultants to engage. You could call it a bottom up approach to the Social Enterprise.

The opposite example is presented by Mike Gotta in his presentation: Build an Architecture of Participation. I have to warn you, it is heavy on models, slides etc. Although he is discussing the same thing, it is probably more of what you’d call a top down approach to the Social Enterprise.

IT confuses (again)

I occasionally read Nick Malik’s blog, Inside Architecture, and his latest post about ‘Business Capability’ reminded me of IT people’s general ability to take a perfectly understandable word, such as capability, and turn it into something confusing. This is not a criticism of Nick or Paul Harmon who wrote the article, Capabilities and Processes, that prompted Nick to write – but merely used as an example to illustrate my point.

Now, IT’s definition of ‘Business Capability’ is ‘what a business does at its core’, and its description (e.g., model) captures ‘what the business does (or needs to do) in order to fulfil its objectives and responsibilities’. The idea is to focus on ‘what’ an organisation needs to do, rather than the actual ‘how’. A conceptual view, if you like. And so the discussion continues in search of the ‘what’ and what it really is.

I think the confusion around ‘Business Capability’ stems from the fact, that a noun can refer to an entity, a quality, a state, an action, or a concept.

Architects as facilitators

Arthur Wright, a software architect from Credit Suisse, wrote an interesting article in an issue of the IEEE Software magazine, called: Lessons Learned: Architects Are Facilitators, Too! He describes a number of divergent behaviours causing the architecture to fragment through unauthorised interfaces, ill-considered technologies and protest designs. The article is an ‘anti-pattern’ to Conway’s Law. The form and structure of an architecture is often – when you deal with a certain level of complexity – more closely related to the (human) organisational communication patterns and structure than a direct realisation of the (wishful) thinking of an architect – competent or not….

As Wright points out, technical skills are important, but if you cannot convince people to collaborate and follow your ‘architectural vision’ then those skills really aren’t of much use. Your skills as an architect need to go ‘beyond the tools of the trade’, including knowing how to ‘visualise your architecture’ and being prepared to have the ‘software is not a building’ conversation. Wright also provides a useful list of SWOT and cause-‘n’-effect analysis techniques.

If you are in a reading mood then I’d suggest reading some of my previous blog posts – all related to this topic: