Friday, December 5, 2008

Hard-coding, soft-coding and configuration: thoughts about system flexibility

Throughout my career I have encountered, time and time again, a basic predicament regarding the configuration and customization of software products to the needs and requirements of their customers.

Every employer I’ve worked for had its own approach and philosophy toward the issue. Rather than discovering the “right way” of doing things, I’ve noticed the set of considerations and the logic behind each approach.

The focus of this discussion is on applications that deal with business data, the handling of which determines the character of that particular business or aspect of the business.

In this domain we tend to see our products structured as follows: we lay down the foundations of the application as a rigid, hard-to-change baseline and then layer it with supposedly easy-to-change pieces of software, where we allow changes and customizations per customer.

The considerations to keep in mind for the customization layers are:

· Managing the customized code and versioning it (if at all). This works well when customizations are done “at home”, and is extremely difficult to impossible when they are performed on site.

· Whether customizations are independent of the build process or require one, since a build process is an expensive overhead.

· The ease of delivering updates to the customer without disrupting the customizations and adjustments already in place there.

· The speed and effort it takes to get a baseline system to be fully compatible with customer requirements.

· The level of expertise required from the staff customizing the system. This is important when you employ field engineers, who usually do not have the same background as your R&D.

· System flexibility and deviation from baseline behavior. Maximizing this aspect widens the relevance of the product, thus increasing sales.

I’ve seen several approaches that I would like to address here.

Code base replication

This is when, for each customer that has a contract for the particular solution, a development team is allocated and the entire code base is replicated, usually from the closest customer project available.

While the newly assembled team works on the new branch, the two code bases drift apart.

There is absolutely nothing good that I can say about this practice, apart from the fact that it works.

The companies I saw that use or used this as their way of customization allocate a horde of developers to each customization task.

No architecture is too sacred to disrupt. Just make it work. And it does.

This approach is one of brute force, very costly in terms of manpower, but for agreements that are not fixed-price (like in the old days before 2001), this is the best way to get things to work as fast as possible.

Coding behavior in XML

Coding behavior in property files, XML or any other textual format is a great way of separating strategy from baseline functionality.

Adopting this approach yields nice, clean code that holds all the logic on the one hand and, on the other, pure (configuration) data, stacked neatly in external files, describing choices in behavior.

A disadvantage of this form of separation is that the application’s baseline code actually has to include all the possible behaviors. When a new feature is added, it gets coded into the baseline and switched on in a property file.

Another aspect that may count as a drawback is the maintenance of the external property files. In all the projects I’ve worked on, it didn’t take long for the number of external values to run into the hundreds. It sometimes gets to the point where the development environments are so loaded with configuration that development and testing are bogged down as a result.

A workaround for this thorn is to require the baseline code to be property resistant, meaning that in the absence of a property the system falls back to default functionality without crashing. In my experience, this is not a simple task.
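As a minimal sketch of what “property resistant” could look like, the snippet below falls back to a built-in default whenever a property is missing or malformed. The property names, the default values and the `get_int` helper are all hypothetical and chosen only for illustration.

```python
# Hypothetical defaults; the keys are illustrative, not taken from a real product.
DEFAULTS = {
    "poll.interval.seconds": "30",
    "converter.batch.size": "100",
}


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and '#' comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def get_int(props, key):
    """Return an integer property, using the default when the value is
    missing or cannot be parsed, instead of crashing."""
    try:
        return int(props.get(key, DEFAULTS[key]))
    except ValueError:
        return int(DEFAULTS[key])


customer = parse_properties("# customer overrides\npoll.interval.seconds = 5\n")
empty = parse_properties("")

print(get_int(customer, "poll.interval.seconds"))  # overridden by the customer
print(get_int(empty, "poll.interval.seconds"))     # falls back to the default
```

Even in this toy form, every property needs a sensible default and a tolerant reader, which hints at why the effort grows quickly once the values number in the hundreds.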

In some odd cases, I’ve seen elaborate structures of properties that by themselves define a limited coding language. This, to my mind, is a miss: the product does not benefit from the maturity of a well-formed, readily available scripting language, the maintenance of the home-grown scripting is very expensive, and so is the time one has to spend learning this unique language.

Extension points

The use of extension points comes in a variety of flavors. Some solutions I’ve seen may in fact be categorized as closest to code base replication; others use advanced scripting, as I will discuss later on.

Building extension points into your application’s base means that once you have determined precisely the path of a generic flow of events, you expose your data structure for tweaks and adjustments by the extending implementations.
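To make the idea concrete, here is a small sketch of a baseline flow that exposes two named extension points. The class name, the point names (`after_receive`, `before_save`) and the hook signature are assumptions made for this example, not a description of any particular product.

```python
class Baseline:
    """A fixed flow of events with two named extension points (hypothetical)."""

    def __init__(self):
        self._hooks = {"after_receive": [], "before_save": []}
        self.saved = []  # stands in for the persistence layer

    def register(self, point, hook):
        """Attach a customization to one of the exposed extension points."""
        if point not in self._hooks:
            raise KeyError(f"unknown extension point: {point}")
        self._hooks[point].append(hook)

    def _fire(self, point, record):
        # Each hook receives the record and returns the (possibly adjusted) record.
        for hook in self._hooks[point]:
            record = hook(record)
        return record

    def process(self, record):
        record = self._fire("after_receive", dict(record))
        record["status"] = "validated"  # baseline behavior, same for everyone
        record = self._fire("before_save", record)
        self.saved.append(record)
        return record


app = Baseline()
# A customer-specific tweak: default the currency when it is missing.
app.register("before_save", lambda r: {**r, "currency": r.get("currency", "USD")})
result = app.process({"amount": 10})
print(result)
```

The customization only sees what the baseline chooses to pass to the hook, which is exactly why deciding where the points go, and what data they expose, matters so much.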

Determining these exit points in the baseline flow is far from trivial. It’s a task that requires a superb understanding of the business domain.

When different departments are responsible for determining the extension points and for the actual customization, as is the case in most companies, discrepancies in expectations may result in limited access on the one hand and criminal abuse on the other.

In situations where scripting is not used to implement the extension points, customization is still strongly linked to build cycles and deliveries, and one may view this as just a nicer, more polite way of reenacting code base replication.

Another issue may arise if the baseline application categorically performs its baseline processing regardless of whether a customization exists. The reason I’m even mentioning this is that I’ve seen this strategy in action. The performance was appalling.

Embedded scripting

Embedding a scripting engine into the knees and elbows of the application is, to my mind, an excellent way of achieving system flexibility and fast customization time, with little or no need to involve build cycles.

It does, however, require a higher degree of experience and skill from the system engineers. Yet the skills required are industry standard and not proprietary to a particular solution.

I adore this solution; it combines almost all the good aspects of all the approaches, while minimizing the less comfortable ones.

The application’s behavior can be determined externally, and the code is precise and straightforward.

In terms of performance, some of the currently available scripting engines are surprisingly efficient.

By choosing your scripting engine you can determine whether the scripted elements have limited or full access to the application internals, such as the difference between JavaScript’s Rhino and Groovy.

Case study:

Let’s think about a solution for a conversion module. This module is needed to convert a structure that is particular to every customer, and the task at hand is to have it mapped into the baseline data structure.

This of course is a very simple example, but I feel it can illustrate most of the differences between the approaches listed previously.

Drawing the outline of our module, we can point out the following elements:

· A listening service, polling or waiting for new data to arrive in the system

· A converting adapter, where the customized transformation should take place

· A persistence module or gateway to the system’s data structure

Using code base replication:

With code base replication, the entire module could be coded as a single chunk of code. When replicated, this code would undergo changes to match the data structure at hand. The implementation is very simple and straightforward.

One may, of course, partition it somewhat into generic services and more application-specific code, but for the matter at hand, it’s the same.

Using external properties

Having external property files define this kind of transformation is a very tricky implementation, especially when the conversion involves more than just mapping fields to fields. The baseline code has to cover all the conversion utilities, plus a very smart piece of code that knows how to apply the transformation described in a property file.
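A rough sketch of such a property-driven converter is shown below. The mapping syntax (`target = source|conversion`), the field names and the set of conversions are invented for the example; the point is that every conversion any customer might need has to live in the baseline.

```python
from datetime import datetime
from decimal import Decimal

# Hypothetical customer mapping, as it might appear in a property file:
#   target_field = source_field[|conversion]
MAPPING_TEXT = """
customer_name = CUST_NM
amount = AMT|decimal
order_date = ORD_DT|date
"""

# The baseline has to ship every conversion up front.
CONVERSIONS = {
    "text": str,
    "decimal": Decimal,
    "date": lambda s: datetime.strptime(s, "%d/%m/%Y").date(),
}


def load_mapping(text):
    """Turn the property text into {target: (source, conversion_name)}."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        target, spec = (part.strip() for part in line.split("=", 1))
        source, _, conversion = spec.partition("|")
        mapping[target] = (source.strip(), conversion.strip() or "text")
    return mapping


def convert(record, mapping):
    """Apply the described transformation to one customer record."""
    return {
        target: CONVERSIONS[conversion](record[source])
        for target, (source, conversion) in mapping.items()
    }


mapping = load_mapping(MAPPING_TEXT)
converted = convert(
    {"CUST_NM": "Acme", "AMT": "12.50", "ORD_DT": "05/12/2008"}, mapping
)
print(converted)
```

Field-to-field mapping with a fixed set of conversions fits this style reasonably well. As soon as a customer needs to split a field, combine two, or apply a conditional rule, the mapping syntax starts growing into a small language of its own.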

Using extension points

Extension points in this case may be implemented either as a flavor of code base replication or by invoking embedded scripts.

The advantage in this case is that the points of customization are very clear and easy to point out.

Embedding scripts

Embedding scripts to handle the transformation and conversion is a smashing solution. It has the benefit of a straightforward implementation since, after all, it’s code. To top that, the conversion is completely external to the baseline code and is independent of any delivery.
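The following sketch illustrates the idea using Python’s own `exec` as a stand-in for an embedded scripting engine. The script content, the `convert` entry point and the field names are assumptions made for the example. Limiting the built-ins only narrows what the script can reach; it is not a real security sandbox.

```python
# Customer-specific script kept outside the baseline. It is inlined as a
# string here; in practice it would be read from a file at run time.
CUSTOMER_SCRIPT = """
def convert(record):
    first, last = record["FULL_NAME"].split(" ", 1)
    return {
        "first_name": first,
        "last_name": last,
        "amount": float(record["AMT"]) / 100,  # customer sends cents
    }
"""


def load_converter(script_text):
    """Run the script in its own namespace and return its convert() function."""
    # Expose only a few built-ins to the script (limited access, not a sandbox).
    namespace = {"__builtins__": {"float": float, "len": len, "str": str}}
    exec(script_text, namespace)
    return namespace["convert"]


# Baseline code: it knows only that a convert(record) function exists.
convert = load_converter(CUSTOMER_SCRIPT)
converted = convert({"FULL_NAME": "Ada Lovelace", "AMT": "1250"})
print(converted)
```

Changing the conversion for this customer means editing the script text only; the baseline that loads and calls it stays untouched and needs no new build.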