HTML5 for Mobile – Will One WebView Ever Rule Them All?

HTML5 is no longer a new technology. But despite having been around a few years, its potential has never been fully realized, especially on mobile platforms. Since its introduction something has always been missing, preventing HTML5 from ever being fully embraced as a serious platform for mobile app development.  


First it was the lack of implemented standards – I remember checking with each new browser version release, eagerly hoping for a score above 300. Then came the era of “performance nits”: developers wrestled long lists to get a smooth scroll and did backflips to get transitions that felt “native”. Fast-forward to today (mid-2015): performance is less of an issue but still requires very close attention – one sloppy CSS rule or DOM modification and the veil is lifted.

And JavaScript memory management has always been a sand trap. You might think JavaScript will handle things smoothly with dynamic memory allocation and garbage collection. Not on mobile devices. If you’re developing for devices where resources are limited, the JS GarbageCollector becomes your enemy, sucking performance from your app any time it pleases and leaving you with no control over when it springs to life and blights your seamless animation. And although you can say “it’s not me – it’s GarbageCollector”, no one will listen: you’ve just gotta learn to deal with it. Here are a few tricks JS developers can perform to mitigate this issue:

All in all a bit long and unless you’re a JS developer you may not want to sit through it. Let me just cite the narrator’s definition of JS GarbageCollector: “waste monster which keeps your applications from going too fast … a blind beast filled with hate and rage.” I can only grin and applaud that characterization. So if you are a JS developer watch this video carefully. It contains the best description of JS’s GarbageCollector algorithm I’ve ever come across together with an effective approach for taking control of this “beast”.
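One widely used way to keep the collector quiet (not necessarily the approach the video takes) is object pooling: allocate objects up front and recycle them instead of creating fresh ones inside an animation loop. The sketch below is a minimal illustration; `Particle` and `ParticlePool` are hypothetical names invented for this example.

```typescript
// Minimal object pool: reuse objects instead of allocating new ones every
// frame, so less short-lived garbage is produced for the collector.
// Particle and ParticlePool are hypothetical names for illustration.

interface Particle {
  x: number;
  y: number;
  active: boolean;
}

class ParticlePool {
  private readonly free: Particle[] = [];
  private created = 0;

  // Pre-allocate up front, ideally while a loading screen is showing.
  constructor(initialSize: number) {
    for (let i = 0; i < initialSize; i++) {
      this.free.push(this.create());
    }
  }

  private create(): Particle {
    this.created++;
    return { x: 0, y: 0, active: false };
  }

  // Hand out a recycled object; allocate only when the pool is empty.
  acquire(x: number, y: number): Particle {
    const p = this.free.pop() || this.create();
    p.x = x;
    p.y = y;
    p.active = true;
    return p;
  }

  // Return an object to the pool instead of dropping the reference.
  release(p: Particle): void {
    p.active = false;
    this.free.push(p);
  }

  // Number of objects ever allocated by this pool.
  totalAllocated(): number {
    return this.created;
  }
}

const particlePool = new ParticlePool(2);
const firstParticle = particlePool.acquire(10, 20);
particlePool.release(firstParticle);
const secondParticle = particlePool.acquire(30, 40); // recycles the released object
console.log(firstParticle === secondParticle, particlePool.totalAllocated());
```

The trade-off is that the pool holds on to memory permanently, so the initial size should match the number of objects actually alive at once.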

Imagine trying to keep memory under control with a few junior JS developers on your team: throw in a few folks with HTML-level coding skills and no clue about memory heaps, stacks and other low level stuff and the beast roars bigger than ever.

And returning for a moment to the era of “performance nits”, speaking from experience I can confidently say that the only way to get 60fps from an HTML5 mobile app is to avoid HTML and CSS entirely, relying instead on a single hardware-accelerated canvas which runs WebGL underneath. Yes, you’ll have to implement the UI yourself but the resulting app will be fast. And there are several good libraries which can help.

But this post is not about WebGL libraries since the state of the art puts “performance nits” in the rearview mirror. Instead there’s a new threat building around mobile HTML5 – WebView fragmentation.

WebView for HTML5 is a Common Language Runtime – that means it’s supposed to be “common” on different platforms while allowing us to run JavaScript, HTML and CSS. That’s the theory. Practice is another matter. Taking a closer look at what’s out there now:

  • Android 2.2 to Android 2.3.7 – WebKit version 533.1
  • Android 3.2.1 – WebKit version 534.13
  • Android 4.0.1 to Android 4.3 – WebKit version 534.30
  • Android 4.4.x – WebKit version 537.36
  • Android 5.0.x – WebKit version 537.36, upgradable separately from Android version!

Today, developers can expect Android 5 devices that, despite their common base OS, all run different WebView versions. Compounding the problem is the fact that some Android device vendors are “tweaking” the default WebView, leaving developers to sort out “device specific issues”. And unfortunately there are many of these.

Now let’s look at the Android version market shares:

[Figure: Android version market shares]

Data taken from here

As for Apple – with iOS 8 they introduced the fast WKWebView in place of the regular UIWebView but, as of this writing, about 17% of Apple mobile devices were still running iOS 7.

[Figure: iOS fragmentation]

And that’s just looking at Android and Apple. Matters get even more complicated once you consider platforms like Tizen, Blackberry, Amazon Kindle, and Windows Mobile, which has its very own WebView implementation.

With so many WebViews in the field the reality is that each has specific defects that developers need to be aware of to then apply corresponding workarounds. And as noted here, nobody’s going to fix previous versions of WebView even when bugs are really severe. And if big bugs get glossed over you can imagine how much attention minor malfunctions get…
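To apply a workaround only where it is needed, an app first has to find out which WebView it is running in. The sketch below reads the WebKit build number from the user agent string. The helper names, the sample user agent strings and the 537.36 cut-off are all assumptions made for illustration, and vendor-tweaked WebViews may report misleading strings, so feature detection is usually the safer first choice.

```typescript
// Read the WebKit build number a WebView reports in its user agent string,
// so version-specific workarounds can be switched on.
// parseWebKitVersion and needsLegacyWorkaround are hypothetical helpers.

interface WebKitVersion {
  major: number;
  minor: number;
}

function parseWebKitVersion(userAgent: string): WebKitVersion | null {
  const match = /AppleWebKit\/(\d+)\.(\d+)/.exec(userAgent);
  if (match === null) {
    return null;
  }
  return { major: parseInt(match[1], 10), minor: parseInt(match[2], 10) };
}

function isOlderThan(v: WebKitVersion, major: number, minor: number): boolean {
  return v.major < major || (v.major === major && v.minor < minor);
}

// Assumption for illustration: the workaround applies to every build
// older than 537.36; unknown engines are treated conservatively.
function needsLegacyWorkaround(userAgent: string): boolean {
  const v = parseWebKitVersion(userAgent);
  return v === null || isOlderThan(v, 537, 36);
}

// Sample strings in the usual Android format, for illustration only.
const gingerbreadUa =
  "Mozilla/5.0 (Linux; U; Android 2.3.6; en-us; Nexus S Build/GRK39F) " +
  "AppleWebKit/533.1 (KHTML, like Gecko) Version/4.0 Mobile Safari/533.1";
const kitkatUa =
  "Mozilla/5.0 (Linux; Android 4.4.2; Nexus 5 Build/KOT49H) " +
  "AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/30.0.0.0 Mobile Safari/537.36";

console.log(needsLegacyWorkaround(gingerbreadUa), needsLegacyWorkaround(kitkatUa));
```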

More than ever before, HTML5 developers have to deal with a bewildering array of issues: WebKit version specific, OS/platform specific, device specific and, in addition to all this, weird memory management. As a lead engineer who’s been working on a big HTML5-based project for about three years I can confidently say that navigating this labyrinth while building automated test systems to maintain control over product quality absorbs significant time and resources. Time and resources that are comparable to those required to create two native applications…

So welcome to the era of WebView fragmentation. With each new WebView version in the field, added to manufacturer-specific “tweaks”, HTML5 supporters are losing their last and best argument in favor of “hybrid HTML5” vs. native. “Build once run everywhere” is becoming a form of wishful thinking. 

Business Continuity Planning: Turning Crisis into Opportunity

For the past three years, we’ve been operating our premier development center in Kharkiv, Ukraine. With world class universities and abundant talent, Kharkiv has a vibrant technical community of young coders, developers and system architects, talent that attracted us to Ukraine in the first place. Over the past year however, Ukraine has unfortunately experienced war and substantial loss of life in the central and southeastern parts of the country, about 200 km (125 miles) from our office in Kharkiv, but thankfully with few disturbances in Kharkiv itself.


We made the choice to operate in East Ukraine, and we have to be accountable to our clients and our employees for that decision. As the conflict in the Donbas began, we realized a business continuity plan was warranted. In March 2014, we began to test our plan and had some interesting experiences as a result. We’re continuing to watch the situation and plan accordingly. Obviously, responding to a crisis is not an experience we would have chosen, but it’s nonetheless made us a better organization and left us much more flexible and prepared for the future.

To design our continuity plan, we looked carefully at how our organization can move staff or transfer mission-critical business responsibilities to different physical locations. We observed that moving a development project staffed entirely in a single location to a new location is very hard to do. Team members are also people with family ties and relationships and while a move to a distant location may be appealing for some, it’s rarely the case for every member of a team. Additionally, Waverley and client investments in the team’s skills and project-specific knowledge as well as “tribal knowledge” gained from a team’s long-term engagement are critical to the success of the deliverable.

One just can’t wind down a team in one location and restart it in another with all new people. However we can and have taken steps that benefit everyone: our clients, our developers and their families.

The process of learning how to make this transition started by chance at the conclusion of 2013, before initial unrest in southeastern Ukraine developed. At that time, a group of my top developers came to me with a request to work a few months in Montenegro to find relief from the cold Kharkiv winter. It was an opportunity to reward our team with working in a remote location that happened to be near the Mediterranean.


The developers knew that they would have to work successfully in a distributed environment, so the motivation to think and plan for a great outcome came from the bottom up rather than the top down. It all worked fantastically well. We measured no loss of productivity. In fact, the Montenegro-located teams showed a marginal increase in productivity.

People working from Montenegro and their counterparts in Kharkiv built the ability to work in a distributed environment. They learned how to communicate, plan and count on one another. So when tensions in Ukraine started to become apparent, Waverley already had this successful experiment in its back pocket, and we knew we could grow a new location quickly and with confidence in our ability to perform. Without the prior experience in Montenegro, temporarily relocating a significant number of people on short notice would have been near impossible. Our staff was very grateful to have the option to spend part of the 2013/14 winter in a warm place. With this one experiment, we were able to demonstrate concern for their well being and at least one means to respond in case the Donbas crisis spread.

As a separate effort, we accelerated our plans to build a development center in a second Ukrainian city. In mid 2014, we opened a small office in Lviv, a vibrant city in West Ukraine, far from the Russian border and pro-Russian southeastern provinces. We have developers who have relocated to Lviv from Kharkiv and we are actively recruiting new positions in Lviv. Staffing a new office with existing people presents major advantages over a new office with all-new staff: existing staff bring with them the culture and internal know-how that allows Waverley to perform so well.

We’ve taken our plans even further by actively working on multi-country teams between Ukraine and our Vietnam location. Integrating both locations into one Waverley has been important to growth and opportunities for both Waverley and our clients. Our teams in Ukraine interview staff for positions in Vietnam, and wherever possible, they work together on client projects. This experience enhances our ability to move responsibility for deliverables as needed and provides an important “shadow” capability on a global scale.

During the last year, we’ve come to realize that the opportunity to plan and execute continuity plans has many benefits. The ability to accept a difficult situation and look for opportunities for improvement is core to our culture. We know we can never plan for all eventualities, but adaptability, flexibility, and performance, in our code and in our operations, directly influence how we go about the day-to-day business of building great software for our customers. We will continue to respond to events and work hard to continue to be a dependable partner who can be trusted to deliver.

Motivation, Performance and Career Growth in Scrum/Agile

Teams using Scrum on a regular basis can get bored. This is especially true of higher-performing teams. The smoother the process, the easier it is for team members to start feeling they’re working in a factory: demos and retros become mechanical, all failures are in the rear-view mirror, all disputes are resolved, top-management is happy and the team is held up as a model for less mature teams. You’d imagine such a team would live happily ever after but…

Experience has shown that a team that has attained such a high-performing state requires extra attention from its project manager. Yes, let’s assume this team has a manager looking after each member’s career path and keeping them motivated to do a great job. And at some point this manager starts to notice signs of boredom: the discussion of the project lacks enthusiasm, creativity is thin on the ground, and challenges are met with a shrug. This is the right time to do something differently.

Certain professions observe the tradition of hanging diplomas and certificates on an office wall for clients and patients to see. These symbols, some might call them trophies, indicate competence, success in one’s training or work, excellence, professionalism.



Consider a similar approach to designate that someone did great during a Sprint. This can be a “star badge” (for example) given to an engineer during the Sprint’s retrospective meeting by the rest of the team. It can be for anything the team wants to appreciate about that person. 

Scrum talks about team commitment, team performance and team responsibility on deliverables. And this is all good when things are going fine and no problems occur. On the other hand when a team fails a Sprint, it’s often hard to pinpoint a weak link. Team members will tell the project manager that all of them are “good guys” working hard against incomplete requirements, unexpected difficulties in deployment etc., and if possible they will blame anyone outside of the team for the failure: managers, the product owner (especially if the product owner is customer-side). It’s a pretty rare case that a team will admit failing a Sprint delivery due to their own overestimation of their competence, wasted time during the Sprint, lack of discipline, miscommunication and/or not escalating issues early enough to other team members or the Scrum Master.  

And here starts the person-by-person performance analysis, work review etc. in order to try to separate well-performing team members from, em, ballast. A little humor goes a long way towards lightening the mood here. During sprint retrospective meetings the Team can decide which members (if any) deserve a “Zombie” badge for their meh performance, lack of feedback, blocking work of other team members, etc.


Here’s another reason this personalized, trophy-based approach is useful. Teams often consist of developers of unequal experience and seniority and line managers need to do performance reviews for each engineer as a member of one or more Scrum teams. While these reviews constitute important feedback in each engineer’s career plan you’ll be hard-pressed, as a line manager, to get negative feedback from the other members of that engineer’s Scrum team. Handed out at regular intervals, the trophies are fun but also serious: something simple and agile allowing the tracking of each team member’s successes and failures without taking lots of time from PMs, Scrum Masters (or whoever is doing this job in your company).

Here is the simplest evaluation system: have a table that’s shared with all the team members, then assign an excellence mark in front of each team member name using the following rules:

  • Assign 3 if the team member performed extra work or took something to a higher level on-time and with good quality.
  • Assign 2 if the team member did good work and accomplished all the tasks assigned on time and with good quality.
  • Assign 1 if the quality was not so great, leaving some bugs for future Sprints.
  • Assign 0 if tasks were left incomplete without a valid reason.

Then, in a brief discussion with Team Members grant the Stars and Zombie Badges if any are deserved =)

The idea is to use a developer’s total score to rank his/her performance over an extended period of time. Calculations are simple: people with top scores and more Star badges are the first to get salary raises, bonuses, days off, go to a conference at company cost… anything you use in your company as positive motivation. On the other hand if for some reason your project budget gets cut you know which of the current team members you can release without remorse. And your choice will be supported by these simple metrics.
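The bookkeeping behind such a table is small enough to automate. The sketch below adds up each member’s per-Sprint marks and badges and ranks the team by total score. The record shape, the sample names and the rule of breaking ties by Star badges are assumptions made for this example, not part of the system described above.

```typescript
// Tally per-Sprint excellence marks (0 to 3) and badges per team member,
// then rank by total score, breaking ties by number of Star badges.
// The record shape and the tie-break rule are assumptions.

type Mark = 0 | 1 | 2 | 3;

interface SprintRecord {
  member: string;
  mark: Mark;
  star?: boolean;   // Star badge granted in this Sprint's retrospective
  zombie?: boolean; // Zombie badge granted in this Sprint's retrospective
}

interface MemberSummary {
  member: string;
  total: number;
  stars: number;
  zombies: number;
}

function summarize(records: SprintRecord[]): MemberSummary[] {
  const byMember: { [member: string]: MemberSummary } = {};
  for (const r of records) {
    const s =
      byMember[r.member] ||
      (byMember[r.member] = { member: r.member, total: 0, stars: 0, zombies: 0 });
    s.total += r.mark;
    if (r.star) s.stars++;
    if (r.zombie) s.zombies++;
  }
  return Object.keys(byMember)
    .map((key) => byMember[key])
    .sort((x, y) => y.total - x.total || y.stars - x.stars);
}

// Two Sprints of made-up data.
const ranking = summarize([
  { member: "Ann", mark: 3, star: true },
  { member: "Bob", mark: 2 },
  { member: "Cat", mark: 1, zombie: true },
  { member: "Ann", mark: 2 },
  { member: "Bob", mark: 3 },
  { member: "Cat", mark: 0 },
]);

for (const row of ranking) {
  console.log(row.member, row.total, row.stars, row.zombies);
}
```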

Finally, I have already described the importance of having FUN while working. Add some gamification to your Retrospective meetings and I promise you won’t get bored during Planning or during the Sprints =)


C-THRU: Transparency and Simultaneity in App UI Design

A recent app UI design effort had us face what has become a familiar problem: optimizing the use of screen real estate on mobile devices. Given the limited screen size typical of most mobile devices, here’s the dilemma:

1) increasing app sophistication often leads designers to want to display more UI elements

2) any elements displayed must be sized to allow easy operation with one’s fingers

Compounding this problem is the drive of most designers to “simplify”, a bias towards minimalism artfully represented in Apple’s “a thousand no’s for every yes.” 


In this particular case we were designing a consumer-facing native iPad app called Kinoke. Part social network, part private journal, part photo/video album, Kinoke’s proposition is to get users to reflect on, comment, and share personal letters, photos and videos in a totally private, non-commercial way (invitation only, no ads, no collecting and re-selling user data). Given that the demographic includes older users, the UI had an even greater need to be simple and uncluttered. 

The heart of the app is where users comment on a particular item, say an old family photo. We call this the Comments screen. We were designing this screen at the pixel level, and we hit this impasse to do with balancing simplicity, complexity, and accessibility.

The Comments screen needed to be simple so users would not give up trying to use it the first time they tried, and so that they would return to it often and with pleasure. It was also going to represent something complex, a display in which the old photo and related comments could coexist without one compromising the other. And users had to be able to quickly shift their focus from the old photo to the comments and back again … both had to be instantly accessible.

We considered devoting about half the screen to the old photo, placing user comments adjacent. But this forced both the photo and the comments to be smaller than we wanted. We considered flipping or panning the view to alternate between comments and photo. But this was going to put a lot of distracting movement on top of what we knew would be a moment of concentrated contemplation. Having elements fly in and out of view is great sometimes, but maybe not when you’re trying to put words to a memory or feeling, especially if you’re 70.

We knew we had to get the Comments screen right, because that’s where the users create value in the app. We wanted no movement, we wanted the photo and the comments to each be as large as possible, we wanted near-simultaneity. All that led us to a control that would let users modulate a two-tiered space using transparency. We call it C-THRU.  

Here’s how it works. User comments, whether typed, audio or video, are displayed on a dark background through which a full-screen version of the item being commented on is just visible: in the background but very faint. Near the left edge of the screen sits a circular button labeled C-THRU, and this button does exactly that. When touched and held, C-THRU rapidly cranks the transparency of the dark comment layer up to about 90%. The underlying item becomes clearly visible, but this effect lasts only as long as the C-THRU button is pressed. As soon as C-THRU is released, the dark layer returns, the item fades back behind it, and we’re back to typing or speaking or videoing our comment as before.

One user described C-THRU as “putting on x-ray glasses.” We feel it lets users remain fully immersed in a task while giving them the chance to observe two things seamlessly. 

And C-THRU really paid dividends when we took on the task of porting the Kinoke iPad app to iPhone a few weeks ago. With even less screen real estate on tap, the C-THRU button’s effect of quietly transitioning between the two tiers of Kinoke’s Comments screen seems indispensable.

My First Trip to Vietnam

Amongst Waverley’s multiple development centers, the largest are in Kharkiv, Ukraine and Ho Chi Minh City, Vietnam. Engineers at both offices work on similar projects (at times they collaborate on the same projects) and have been getting positive feedback from our clients. As the lead QA Engineer in our Kharkiv location I’ve developed a good working relationship with our QA team in Vietnam but didn’t know any of them personally. On top of that I didn’t have any first-hand experience of Asia. I discussed this with two of our executives: Matt Brown (CEO) and Patti Gosselin (COO). Soon after that meeting I ordered tickets to Ho Chi Minh City and started planning my trip. 

I landed in Ho Chi Minh City at midday. A project manager from our Vietnam office met me at the airport. He was so friendly and happy to see me that I decided to visit the Waverley office first and check in to my hotel later in the day.

It was mid-September so while still warm in Ukraine it’s nothing like the 77-86F I was met with in Ho Chi Minh City. But I’d checked the weather and I expected it to be hot. What I didn’t expect was a perfect taxi service. I’ve never encountered better taxis than in Vietnam. The drivers are instantly recognizable in their green uniforms and just need to hear the address or see a business card. As in Kharkiv most people in Ho Chi Minh don’t speak English but those “guys in green” do. If you don’t see one nearby you just call a taxi service (Vinasun is the best one) and ask for a car.

The heat out on the streets contrasted nicely with the temperature inside the office. Good air-conditioning was common in Ho Chi Minh – seems like air conditioners are everywhere. In fact, regarding work places, work stations, equipment in the office – everything felt similar to offices in Europe and the US. I’m not sure whether this applies to all offices in Vietnam or only the Waverley office but staff report to work at 8-9 AM and go home around 5-6 PM (in Kharkiv most of our staff arrive and leave two to three hours later, to synch with clients in the US). During the working day the folks in our Ho Chi Minh office have coffee breaks and a one-hour lunch. What I appreciated in their working process: synchronization. At any time you can find a technical specialist in the office; no need to call or chat via Skype to ask a question. Also people in the office prefer to have lunch all together: it’s like a small team-building exercise every day. And what was unusual for me: people sleep in the office if they don’t want to go for lunch or finish lunch early. So during the lunch break it’s possible to have a meal and sleep a bit to refresh brain and body.

During my visit I asked to have a one-on-one meeting with each QA team member to get a sense of strong and weak areas of their knowledge. After completing all meetings I concluded that the team is very motivated to work in IT and well educated: almost everyone has a technical background, they understand the testing process and generate proper reports, and all are willing to learn more and grow into true technical specialists. I did at times have difficulties understanding their English pronunciation. Often Vietnamese speakers omit final consonants and medial sounds, confuse sounds etc. but I think it’s just a question of time and practice on both sides. The more one communicates with people from Asia the better understanding of their English one has. And any problems are really limited to pronunciation. No problem with their writing and they also understood my spoken English well.

For me on my first visit, Vietnam felt like an unusual country. The food is different, and there are a lot of scooters and motorbikes and innumerable very small shops – not like in Europe or the US where we can buy everything we need in a supermarket. People are very polite, friendly and ready to help at any moment. What I really appreciated and what stayed with me was the sense that anything a Vietnamese person does, he or she does out of consideration for the welfare of the family, rather than for themselves alone.

I am definitely interested in visiting Vietnam one more time to shake hands with the people I met, to meet new people and get a feel for Vietnamese culture one more time. And I hope to do it soon!  

Future of JS – As Discussed in Barcelona

This past May the JavaScript community gathered at the FutureJS conference in Barcelona, Spain. After a few years of semi-stagnation JavaScript has been seeing renewed interest amongst developers. With browsers approaching the status of operating systems for the Web, JS as the principal browser language is getting a lot more attention.

FutureJS addressed contemporary issues of JS development. Speakers included Jeremy Ashkenas – creator of CoffeeScript language, Reginald Braithwaite – author of the book JavaScript Allongé, Patrick Dubroy – Google Chrome engineer, guys from Facebook and others.

I was keen to get a “vision of JS’s future” directly from the people who call the shots. And it’s always helpful to step away from everyday work and broaden one’s understanding of the current state of JS and Web technologies.

The event was well-executed with balanced time for talks and coffee breaks. Evening meetups in hip bars and, during the day, long lunches (yes, it’s Spain and they take their siestas seriously) provided ample opportunities for informal communication amongst participants, organizers and speakers, including a great wrap-up party in one of the best night clubs in Barcelona (Razzmatazz).

Overall there were fifteen talks over two conference days (video archive is here). Some were dedicated to the JavaScript language itself: its history and possible future evolution, including the newest features of the just-emerging ES6 standard. Summing up the highlights for me: 

Reginald Braithwaite on Functional Programming and OOP

Reginald’s talk emphasized JS’s inherent minimalism. The language doesn’t contain ready functional constructs or concepts and doesn’t force us to use a functional approach, but has enough tools for developers to code in a functional style if they prefer (functions as first-class entities). The same with OOP – JS doesn’t support all the concepts of classical OOP, but Reginald showed how we can emulate them using objects and prototypal inheritance.

Reginald also explained the idea of creating modular programs based on functions, thereby making code more reliable and reusable. According to this approach a program consists of two groups of functions: ones that implement business logic, the main building blocks of an application; and service functions (composers, transformers) – general-purpose routines applied to the business logic blocks or to other service functions (they are going to be the same for different applications). For this approach to be successfully implemented the business logic functions have to be properly isolated and encapsulated.
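A small sketch may make the two groups concrete. It is not taken from the talk: the order and tax functions stand in for business logic, the 20% tax rate is made up, and `pipe` and `mapWith` are illustrative service functions that know nothing about orders.

```typescript
// Business-logic functions: specific to one (hypothetical) application.
interface Order {
  price: number;
  quantity: number;
}

const orderTotal = (o: Order): number => o.price * o.quantity;
const addTax = (amount: number): number => amount * 1.2; // assumed 20% rate
const roundCents = (amount: number): number => Math.round(amount * 100) / 100;

// Service functions: general-purpose routines that would look the same
// in any other application.

// Compose two functions left to right.
function pipe<A, B, C>(f: (a: A) => B, g: (b: B) => C): (a: A) => C {
  return (a) => g(f(a));
}

// Turn a function on single items into a function on arrays.
function mapWith<A, B>(f: (a: A) => B): (items: A[]) => B[] {
  return (items) => items.map((item) => f(item));
}

// The application is assembled by applying service functions to the
// isolated business-logic blocks.
const invoiceLine = pipe(pipe(orderTotal, addTax), roundCents);
const invoiceLines = mapWith(invoiceLine);

const invoiceAmounts = invoiceLines([
  { price: 10, quantity: 3 },
  { price: 5, quantity: 2 },
]);
console.log(invoiceAmounts);
```

Because each business-logic function takes plain values and returns plain values, any of them can be tested or swapped without touching the combinators.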

Jeremy Ashkenas on Using JS in Commercial Projects

Jeremy, author of the CoffeeScript language, the Backbone.js JavaScript framework, and the Underscore.js JavaScript library, described lessons learned using JavaScript in big commercial projects. He listed the main evils of JS: incorrect polyfill implementations, prototypes hell, different types of functions, scoping of var and others. Jeremy also explained how CoffeeScript allows developers to avoid all those issues. And CoffeeScript being very similar to JS (in syntax and main concepts) makes it a relatively easy step towards more reliable, intuitive and comfortable programming (the caveat being the additional step of compilation).

Patrick Dubroy on ES6

Patrick toured us through ES6’s new features (specs can be found here) with an emphasis on how to use all these goodies right now, while most browsers don’t fully support them. While such features as new API methods can be easily polyfilled, the new language syntax and constructs require more cunning approaches. Enter the compiler Traceur. It takes code containing new ES6 features and transforms it to ES5 (or even ES3) compatible code. Patrick also demonstrated, through examples, exactly how transformations from ES6 to ES5 are done, from elementary ones like the => (lambda) operator to more complex stuff like generators.
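To give a feel for the elementary case, here is a hand-written approximation of what happens to an arrow function. It is not Traceur’s actual output, and the class names are invented for the example. The class syntax is kept in both versions so the sketch stays short; only the arrow function is rewritten by hand.

```typescript
// ES6 style: the arrow function has no `this` of its own and uses the
// `this` of the enclosing method.
class TallyArrow {
  total = 0;
  addAll(values: number[]): number {
    values.forEach((v) => {
      this.total += v;
    });
    return this.total;
  }
}

// ES5 style: an ordinary function expression gets its own `this`, so the
// outer `this` has to be captured in a variable first.
class TallyDesugared {
  total = 0;
  addAll(values: number[]): number {
    var self = this; // captured so the callback can reach the instance
    values.forEach(function (v) {
      self.total += v;
    });
    return this.total;
  }
}

// When `this` is not involved the rewrite is purely syntactic.
const squareArrow = (x: number) => x * x;
var squareEs5 = function (x: number) {
  return x * x;
};

console.log(
  new TallyArrow().addAll([1, 2, 3]),
  new TallyDesugared().addAll([1, 2, 3]),
  squareArrow(4),
  squareEs5(4)
);
```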

Jaume Sanchez on the new Web Audio API 

Jaume explained the API’s main idea and constituent concepts. Web Audio enables the mixing, processing, and filtering tasks that are found in modern desktop audio production applications. The model of Audio Nodes – audio processing nodes connected into the processing net (or graph) – is the key concept of the Web Audio API.

Martin Naumann on Web Components 

Martin described the use of Web components to build modular Web applications. The coolest thing about Web components is that developers no longer need specialized frameworks and tools (like Angular directives) or components built with other languages and technologies (for instance, Java applets) to create reusable, well-isolated, reliable widgets for Web applications. Standardized technologies like Shadow DOM and custom HTML elements can be used instead.

Pete Hunt on the Virtual DOM 

Pete presented the virtual DOM as an alternative approach to organizing data binding in situations where current approaches were not ideal from a performance perspective. The classical implementation of data binding is based on the key/value collections observation (Ember, Knockout). The main competitor of this approach is dirty checking (Angular). The virtual DOM bumps performance while working effectively with the data binding update history: the current state of bound UI elements is determined as a collection of changesets applied to their initial state.
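The changeset idea can be sketched in a few lines. In the toy version below the UI is described by plain objects, and `diff` compares the previous description with the next one to produce the list of changes a renderer would have to apply to the real DOM. `VNode`, `Patch`, `h` and `diff` are hypothetical names; this is a deliberately naive comparison by position, not the algorithm any particular library uses.

```typescript
// A toy virtual DOM: plain objects describe the UI, and diff() turns two
// descriptions into a changeset. All names here are hypothetical.

interface VNode {
  tag: string;
  text: string;
  children: VNode[];
}

type Patch =
  | { kind: "replace"; path: number[]; node: VNode }
  | { kind: "setText"; path: number[]; text: string }
  | { kind: "append"; path: number[]; node: VNode }
  | { kind: "removeLast"; path: number[] };

// Helper for building descriptions.
function h(tag: string, text: string, children: VNode[] = []): VNode {
  return { tag: tag, text: text, children: children };
}

// `path` is the list of child indexes leading from the root to the node.
function diff(oldNode: VNode, newNode: VNode, path: number[] = []): Patch[] {
  // A different element type means the whole subtree is replaced.
  if (oldNode.tag !== newNode.tag) {
    return [{ kind: "replace", path: path, node: newNode }];
  }
  const patches: Patch[] = [];
  if (oldNode.text !== newNode.text) {
    patches.push({ kind: "setText", path: path, text: newNode.text });
  }
  // Children present in both trees are compared position by position.
  const shared = Math.min(oldNode.children.length, newNode.children.length);
  for (let i = 0; i < shared; i++) {
    for (const p of diff(oldNode.children[i], newNode.children[i], path.concat(i))) {
      patches.push(p);
    }
  }
  // Extra new children are appended; extra old children are removed.
  for (let i = shared; i < newNode.children.length; i++) {
    patches.push({ kind: "append", path: path, node: newNode.children[i] });
  }
  for (let i = shared; i < oldNode.children.length; i++) {
    patches.push({ kind: "removeLast", path: path });
  }
  return patches;
}

const beforeTree = h("ul", "", [h("li", "one"), h("li", "two")]);
const afterTree = h("ul", "", [h("li", "one"), h("li", "2"), h("li", "three")]);
const changeset = diff(beforeTree, afterTree);
console.log(JSON.stringify(changeset));
```

The performance argument rests on the changeset being small: only the patches touch the real DOM, while the comparison itself runs on cheap plain objects.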

Matthew Podwysocki on Event-based Programming

In Matthew’s memorable talk on reactive JavaScript programming he explained the idea of streaming and event-based programming using FRP (functional reactive programming) and RxJS. He described the main principles of reactive programming: observables and observers, query operations and schedulers. Through vivid examples Matthew demonstrated how Rx (Reactive Extensions) works in practice. The API for reactive programming in JavaScript is offered in the RxJS library. The idea of reactive programming is not new (I first came across it two years ago when I worked with MS .NET technologies), but there are interesting trends in its development, like using it in conjunction with JS generators (ES6).
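The observable/observer relationship and the query operations can be illustrated without the library. The sketch below is a tiny hand-rolled stream: `EventStream` and its methods are hypothetical and are not the RxJS API, and schedulers are left out entirely.

```typescript
// A tiny push-based stream with observers and two query operators.
// EventStream is a hypothetical class, not the RxJS API.

type Listener<T> = (value: T) => void;

class EventStream<T> {
  private listeners: Listener<T>[] = [];

  // Register an observer; the returned function unsubscribes it.
  subscribe(listener: Listener<T>): () => void {
    this.listeners.push(listener);
    return () => {
      this.listeners = this.listeners.filter((l) => l !== listener);
    };
  }

  // Push a value to every current observer.
  emit(value: T): void {
    for (const l of this.listeners) {
      l(value);
    }
  }

  // Query operator: a new stream carrying transformed values.
  map<U>(f: (value: T) => U): EventStream<U> {
    const out = new EventStream<U>();
    this.subscribe((v) => out.emit(f(v)));
    return out;
  }

  // Query operator: a new stream carrying only matching values.
  filter(pred: (value: T) => boolean): EventStream<T> {
    const out = new EventStream<T>();
    this.subscribe((v) => {
      if (pred(v)) {
        out.emit(v);
      }
    });
    return out;
  }
}

// Treat clicks as a collection that can be queried over time.
const clickStream = new EventStream<{ x: number; y: number }>();
const receivedValues: number[] = [];
const stopListening = clickStream
  .filter((c) => c.x > 100)
  .map((c) => c.x * 2)
  .subscribe((v) => {
    receivedValues.push(v);
  });

clickStream.emit({ x: 50, y: 0 });  // filtered out
clickStream.emit({ x: 150, y: 0 }); // delivered as 300
stopListening();
clickStream.emit({ x: 200, y: 0 }); // no observer left to receive it
console.log(receivedValues);
```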

All in all, attending the FutureJS conference gave me a clearer understanding of which aspects of JS and Web programming I should learn in depth and start using in real life. High on my list is functional programming implemented in languages other than JavaScript – Haskell, Scala, Clojure.

I came away feeling that the future of JS is filled with promise, excited to get back to work!


Thoughts on Tradition, Appreciation and Teamwork

I believe work should be fun. And I think I can prove it. Some of you may have heard of the trend towards gamification in day-to-day project management. Following on that trend, here are some battle-tested tips that you can use as-is or modify as you like, Creative Commons, I wouldn’t get offended, promise:

  1. Make sure your team members have something to be proud of regularly. This can be tasks completed, a challenging problem solved, a three-pointer shot straight into the wastebasket =) Seriously, without getting too far away from useful wins, appreciate all of those.
  2. Shout out loud when you succeed! Remind the team to not be shy when a tough bug gets squashed or when the client is delighted! As a manager, make an example of yourself: do the wave after a successful demo, bust a move after holding out against scope changes mid-sprint, invent a traditional “winner dance” for your team or encourage a unique expression of happiness for each. Whatever you do, do it together and make it visible! I can shout “I’m the master of the world, boo-ha-ha” or just stand up and moonwalk.
  3. Most important, when someone celebrates his or her win, the rest of the team should applaud. Clap loudly for your team-mate’s moment of glory =) Yes, it is a Moment of Glory, nothing less, so give this feeling to your team members, they deserve it. 

That’s it! =) Repeat each time someone does a great job =)

 A few more comments…

When you get a similarly gorgeous appreciation system working, you will also need something “opposite”: a way to acknowledge dumb mistakes or failed teamwork without humiliating anyone or belaboring the mistake. Allow the team to laugh together while still acknowledging that a mistake was made. Buy a “stupid” hat, or a Blondie doll (people with light hair, I do understand that you are as smart as people with other hair colors, even those who dye their hair), or a gold medal for the least clever solution … Turn your imagination on! Try to remember what word your team uses for such “hits” and give it a material symbol.

This “trophy” can serve as an award passed from one person to another =) But don’t forget #3: applaud it as well.

And finally, here is a list of questions to bear in mind:

  • If someone doesn’t get to do a “winner dance” for a long time, what can I do as PM?
  • Is there any correlation between the number of winner dances and team performance?
  • How can these methods aid in team-building?
  • Do team members know what went well and what failed?
  • Will it add more fun to our work?

Building Scalable Systems

With this article I want to shed more light on a vital aspect of any computer system: scalability. Why is scalability important? The answer is very simple: it gives the business that is built on or supported by the system the freedom to grow. An unscalable system is like a tree with very weak roots; as the load on it grows, it will eventually fall.

Before diving further into the topic let’s define the term “scalability” for a computing information system. 

I personally like this definition: scalability refers to a system’s ability to handle proportionally more load as more resources are added. Scalability of a system’s “information-exchange” infrastructure thus refers to the ability to take advantage of underlying hardware and networking resources, as well as the ability to support larger systems as more physical resources are added.

Here I need to mention that there are two types of scalability: vertical and horizontal. Vertical scalability means the ability to increase the capacity of an existing computing unit’s hardware. This approach is limited and quickly becomes unacceptably expensive.

Horizontal scalability, by contrast, refers to a system’s ability to engage additional hardware computing units interconnected by a network.

But here is the catch: systems built using classic object-oriented methodologies and design approaches, which work superbly for local processing, begin to break down in distributed or decentralized environments.

Why? Because a distributed computing environment brings a whole new class of challenges to the scene. 

Distributed systems must deal with partial failures, arising from failure of independent components and/or communication links (in general the failure of a component is indistinguishable from the failure of its connecting communication links). In such systems, there is no single point of resource allocation, resource consumption, synchronization, or failure recovery. Unlike local processes, a distributed system may simply not be in a consistent state after a failure. In the “fallacies of distributed computing” [Van Den Hoogen 2004], summarized below, the author captures the key assumptions that break down (but are nonetheless still often made by architects) when building distributed systems.

  • The network is reliable. 
  • Latency is zero. 
  • Bandwidth is infinite. 
  • The network is secure. 
  • Topology doesn’t change. 
  • There is one administrator. 
  • Transport cost is zero. 
  • The network is homogeneous (it’s doubtful that anyone today could believe this)

I prefer to treat this list not as a set of fallacies but as challenges a software architect has to meet to create a horizontally-scalable system. As an architect who has had a chance to work with large-scale systems, I can attest that if one attacks those challenges directly and adds code that resolves the issues one by one, the result is a heap of wiring code which has nothing to do with the business idea. And that code can easily become more complex than the system itself! Implementing communication transactions, zipping/encoding/decoding data, tracking state machines, supporting asynchronous communication, handling network failures, creating and maintaining environment configuration and update scripts, and so on… all this stuff evokes despondency when it comes to maintainability.

So – is there any good solution to make a system easily scalable?

Luckily, yes. In three words: data-oriented programming.

The main idea of data-oriented programming is exposing the data structure as the universal API between system parts and then defining the roles of those parts as “data producer” and “data consumer”. Now, in order to make such a system scalable we just need to decouple data producers from data consumers in location, space, platform, and multiplicity. Here the trusty old “publish/subscribe” pattern comes in handy.

Here’s how it generally works: a data producer declares the intent to produce data of a certain type (let’s call it Topic-X) by creating a data writer for it; a data consumer registers interest in a topic by creating a data reader for it. The data bus in the middle manages these declarations and automatically routes messages from the publisher to all subscribers interested in Topic-X.
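The mechanics can be sketched in a few lines of Python. This is only an in-process toy under stated assumptions, not a real middleware: the `DataBus` class and its `create_reader` / `create_writer` methods are hypothetical names chosen to mirror the wording above, and a real data bus would also handle networking, discovery and delivery guarantees.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, List


class DataBus:
    """Hypothetical in-process data bus that routes samples by topic name."""

    def __init__(self) -> None:
        # topic name -> callbacks of all readers registered for that topic
        self._readers: DefaultDict[str, List[Callable[[Any], None]]] = defaultdict(list)

    def create_reader(self, topic: str, on_data: Callable[[Any], None]) -> None:
        # A consumer registers interest in a topic.
        self._readers[topic].append(on_data)

    def create_writer(self, topic: str) -> Callable[[Any], None]:
        # A producer declares its intent to publish on a topic and gets back
        # a write function. It never learns who (if anyone) is listening.
        def write(sample: Any) -> None:
            for on_data in list(self._readers[topic]):
                on_data(sample)
        return write


bus = DataBus()
audit_log: List[Any] = []
billing_log: List[Any] = []
other_log: List[Any] = []

bus.create_reader("Topic-X", audit_log.append)
bus.create_reader("Topic-X", billing_log.append)
bus.create_reader("Topic-Y", other_log.append)  # not interested in Topic-X

write_x = bus.create_writer("Topic-X")
write_x({"order_id": 1, "amount": 42})

print(audit_log, billing_log, other_log)
```

Both Topic-X readers receive the sample and the Topic-Y reader receives nothing, while the writer holds no reference to any of them.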

It’s time to draw a picture to illustrate how the classic client-server architecture would look had it been designed as a data-centric system.

As you can see all system components are isolated and have no knowledge of each other. They only know the data structure or “topic” they can consume or produce.

Now imagine that the number of clients wanting to consume information from our system has grown so much that the system can no longer serve all requests in time. Let’s try to scale this system horizontally.

In the figure above you can see that I have increased the number of business logic processor units. This is easily done because the system doesn’t care which computing unit will do the job and doesn’t even need to know that the units actually exist. Each system unit just waits for the data it can consume or publishes the data it has declared. I’ve also decoupled client input from client output, spreading the burden across different servers. Since only the number of clients that want to consume information from our system has increased, we add more servers to handle read requests. And to avoid bottlenecks on the DB access side, I’ve decoupled DB writes from DB reads and allocated more computing power to the ‘read’ side. Of course, in reality things are more complex, but this figure shows the basic principles of system scaling.
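The idea that the producer does not care how many processing units exist can be sketched with a shared queue and a configurable number of workers. This is a toy, assuming threads in one Python process stand in for separate servers; `run_pool`, the `unit-N` names and the squaring “business logic” are all hypothetical. In a real system the units would be separate processes or machines, so the sketch shows the structure rather than an actual speed-up.

```python
import queue
import threading
from typing import Iterable, List, Tuple


def run_pool(requests: Iterable[int], worker_count: int) -> List[Tuple[str, int]]:
    """Process requests with `worker_count` interchangeable units."""
    inbox: "queue.Queue" = queue.Queue()
    results: List[Tuple[str, int]] = []
    lock = threading.Lock()

    def worker(name: str) -> None:
        while True:
            item = inbox.get()
            if item is None:          # sentinel: no more work for this unit
                return
            outcome = item * item     # stand-in for real business logic
            with lock:
                results.append((name, outcome))

    threads = [
        threading.Thread(target=worker, args=(f"unit-{i}",))
        for i in range(worker_count)
    ]
    for thread in threads:
        thread.start()

    # The producer only puts data on the queue; it does not address a unit.
    for request in requests:
        inbox.put(request)
    for _ in threads:
        inbox.put(None)               # one sentinel per unit
    for thread in threads:
        thread.join()
    return results


results = run_pool(range(10), worker_count=3)
print(sorted(value for _, value in results))
```

Changing `worker_count` is the only thing needed to add or remove units; the producing side of the code stays the same.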

There are several more important benefits of the data-oriented approach:
1) It’s easy to make the system more reliable by adding redundant processing power. If one of the business process units fails, nothing critical will happen because other units of the same type continue to handle requests.
2) The system becomes more flexible – new functionality can be added on the fly by adding new data producers/consumers.
3) Maintainability goes to a whole new level since components are very well isolated from one another.
4) It’s easy to work on the system in parallel.

You might say, “That’s all well and good, but what should I do with my existing system?”

Fortunately, we can isolate all this data-centric publish/subscribe magic in a middleware layer that will handle all communications. And there is a wide variety of such solutions.

What you need to do is define a system data model (most probably its entities will be very similar to the DB model you already have) and then create data readers/writers for each system component which will publish or consume data to/from the middleware.

In my opinion, the most prominent and promising messaging solutions that support the publish/subscribe model are:

1) for web-based solutions

2) (or any other DDS implementation) for TCP/IP or in-memory real-time peer-to-peer communication. No brokers or servers in the middle. Instead leverage TCP/IP and IP multicast for real peer-to-peer message transportation.

But you are encouraged to conduct your own research. 

Practical hint: keep your messages small. Don’t try to push megabytes through your data bus in a single message. The data bus is a vital component, and big messages can turn it into a bottleneck, causing the whole system to struggle. If you need to transfer a significant amount of data from one system component to another, the data producer should store the data elsewhere and publish a link to it, so that the data consumer can fetch it directly.
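A rough sketch of that hint, assuming a plain dictionary stands in for whatever shared storage the system already has (a file server, an object store and so on). The names `blob_store`, `publish_large` and `consume` are hypothetical, and the `publish` argument is any callable that puts a message on the bus.

```python
import hashlib
import json
from typing import Callable, Dict

# Hypothetical stand-in for shared storage that both sides can reach.
blob_store: Dict[str, bytes] = {}


def publish_large(payload: bytes, publish: Callable[[dict], None]) -> dict:
    """Store the payload out of band and publish only a small reference."""
    key = hashlib.sha256(payload).hexdigest()
    blob_store[key] = payload
    message = {"blob_key": key, "size": len(payload)}
    publish(message)
    return message


def consume(message: dict) -> bytes:
    """Resolve the reference carried by the message into the actual data."""
    return blob_store[message["blob_key"]]


sent = []                              # stand-in for the data bus
payload = b"x" * 5_000_000             # roughly 5 MB of data
publish_large(payload, sent.append)

print(len(json.dumps(sent[0])), "bytes travelled over the bus")
print(len(consume(sent[0])), "bytes fetched directly by the consumer")
```

The bus only ever carries a message of about a hundred bytes, no matter how large the payload is.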

Happy data-oriented programming! 

User Manual for Distributed Software Development Part 2

Continued from Part 1: How, in the day-to-day, does a distributed team share a codebase in a way that does not have members block each other?

This is the time to ask your in-house project leader, “Is it possible to split system development into independent chunks that could be implemented in parallel?”

If the answer is anything but “yes”, it’s a cause for concern. “No” likely means that system components are very dependent on each other, which makes the system tightly-coupled. And a tightly-coupled system is an unscalable, hard-to-maintain, inflexible system.

One of the key factors driving this grim reality is that Object-Oriented Programming is, by nature, tightly-coupled. To address this problem, the software system architect (project leader) must first of all employ loosely-coupled design techniques to achieve system scalability, maintainability, flexibility and testability. Once that is done, incremental and independent development follows naturally.

The main point is this: a properly architected system consists of fairly separate and independent modules or classes that have little to no knowledge of each other. Given such an architecture, it becomes easy to split the work by components and avoid interference amongst team members.
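As a small illustration of modules that know almost nothing about each other, here is a sketch in Python. Everything in it is hypothetical (`OrderSource`, `ReportService`, `InMemoryOrderSource`): the only thing the two sides share is a tiny interface, so one team member can build the reporting logic while another builds the real data access behind it.

```python
from typing import List, Protocol


class OrderSource(Protocol):
    """The only contract shared between the two modules (hypothetical)."""

    def fetch_totals(self) -> List[float]:
        ...


class ReportService:
    """Depends on the OrderSource contract, not on any concrete class."""

    def __init__(self, source: OrderSource) -> None:
        self._source = source

    def revenue(self) -> float:
        return sum(self._source.fetch_totals())


class InMemoryOrderSource:
    """A stand-in implementation, e.g. for tests or early development."""

    def __init__(self, totals: List[float]) -> None:
        self._totals = list(totals)

    def fetch_totals(self) -> List[float]:
        return list(self._totals)


report = ReportService(InMemoryOrderSource([10.0, 15.5, 4.5]))
total = report.revenue()
print(total)
```

A database-backed `OrderSource` can be swapped in later without touching `ReportService`, which is what lets the work be split between team members.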

An optimized distributed team development process can be boiled down to the following 5 points:

1) Define the task, then describe, discuss and estimate it

2) Define team (project) roles and agree on formal communication paths

3) Balance implementation efforts of one portion of the team with code reviews from the other

4) Demonstrate (ongoing) results to the project stakeholders

5) Retrospect and review: what went well, what went wrong, and what can be improved

And there are many smaller, but still important points that will enhance the remote team’s output:

* The trusted engineer is invested in the remote team’s success

* Both sides understand and appreciate a transparent and tailorable development process

* The trusted engineer provides feedback to the remote team regularly

* Use technology to improve collaboration (screen sharing, video conferencing, etc.)

* Leaders of both teams meet in person to align their vision on project goals, create an achievement roadmap, and, ideally, build the project backlog together.

If you decide to use an “external muscle” to strengthen your product development, don’t forget to ask the remote team for their “user manual” and development process before things get going. Then make the investment to move your system towards loosely-coupled design principles and practices. If these things are done right, the “trust gap” will be bridged very soon, typically in 5 to 10 sprints. And it will result in a pleasant sensation as you lie down to sleep each night, knowing that your project keeps growing and moving in the right direction while you are sleeping.

User Manual for Distributed Software Development Part 1

Having worked as an offshore software development team leader for ten years, I’ve often seen the same situation arise when engaging with new clients, and it’s no different at Waverley. It goes like this: a company (the client) decides to hire an outsourcing company to help its internal team with product implementation. As business terms are ironed out, the client’s internal team checks the technology knowledge of the offshore team, and if everything seems all right they start working together.

Almost immediately the problem of trust arises. In the first stage of building the relationship there is no trust in the new offshore team. This is absolutely normal; it is a given, a matter of human nature. To fill this “trust gap”, the client often names a trusted engineer as the intermediary between the company and the offshore team. Typically, this technical person is busy enough with tasks that pre-date the engagement of the offshore team, and has little idea how to manage a remote team or how to set up a productive distributed development process. Moreover, these “management” activities are just boring for an engineer (having been an engineer myself I understand this perfectly). Now add a 7-12 hour time difference between the client team and the offshore team and you have a perfect recipe for disaster.

The question is how does one make the “Business owner <-> Trusted engineer <-> Remote team” model work effectively?

The short answer is: with the trusted engineer you have to introduce an Agile development process and the entire team needs to embrace loosely-coupled system design.

 Now to make a short answer longer…

When we buy something complex it typically comes with a user manual which explains how to use and troubleshoot it. And when you hire a remote team you are buying something complex. So you should check not just business terms, technical parameters and qualifications but also ask to see the offshore team’s “user manual”. Any remote team that’s been on the market for more than a couple of years has its “client interaction patterns”. Understanding those patterns is a very good starting point for building a new relationship. The converse is also true!

Here are a few questions you might ask the remote team leader:

1) What will you do to build my confidence that you are going in the right direction and building the thing I need?

2) How can I know the current status of the project at any given time?

3) How can I know what you are working on right now?

4) By what procedure will we manage system changes if (when) we decide to make them?

I’m not going to write another Scrum handbook! But from my experience on the offshore side of the equation I can say that having a “Vision & Scope” document, a product (user story) backlog, sprint planning meetings, a sprint backlog, daily standups, and demo and retrospective meetings helps a lot to make the development process transparent and predictable.

So the first thing to do with a remote team is align around a transparent and tailorable development process. This is a must – without a development process things will fall apart very soon.

Now imagine you have that user manual: you’ve agreed on a development process, you’ve created a “Vision & Scope” document where you’ve captured your goals and the metrics for judging which goals have been achieved, and you and your offshore team have started moving toward those goals.

Here a second problem arises: working on the same project requires a lot of communication amongst members of distributed teams. While there are strategies for organizing this communication there is also the question of how to work in a way that doesn’t require permanent communication. How, in the day-to-day, does a distributed team share a codebase in a way that does not have members block each other?

Loosely-coupled design to the rescue! (continued in Part 2)