The Quick and the Ordered

A Dilemma

Let's start with a couple of questions:

  1. Is it better to be quick or dead?  That’s a bit of a no-brainer really; most of us would prefer to live.
  2. Is it better to have order or chaos?  Most of us would choose order.

For many organisations today it seems as if those questions are linked, and that the real choice is between speed and order.

Both options have consequences:

  • Choose order and you risk going the way of the dinosaur.
  • Choose speed and you risk drowning in chaos.

Neither is particularly attractive.

So how have we come to this point?

Digital Transformation

As the economy becomes essentially digital, businesses must deliver a differentiated, personalised experience to their customers.

Fail and you will not survive.

If your offer isn’t differentiated, customers won’t notice you.

If your offer isn’t personalised, then you’re not building a relationship with the customer. At best the interaction is transactional and “non-sticky”. At worst, it’s spam.

Differentiation

One of the most interesting things about digital transformation is how it changes an organisation’s relationship with software.

To differentiate yourself, to stand out from the competition, you need to create digital assets (typically software) that are unique to you.

This means that organisations that were traditionally consumers of software, now become producers of software.

But given the fast-moving competitive landscape, that software needs to be delivered at speed.

The software also needs to have high performance and scalability built in.

Cloud Native

These factors have been behind many organisations adopting a “Cloud Native” style of development, building on methodologies such as Agile and DevOps.

It is worth remembering that cloud native deployments can be on premises, in private or hybrid clouds as well as on public clouds.

Cloud native uses approaches such as microservices, API-first and CI/CD to create and deploy loosely coupled systems.

The aims are productivity and performance, to “allow engineers to make high-impact changes frequently and predictably with minimal toil”. The fundamental idea behind cloud native is that a system built from independent, loosely coupled components can be developed more quickly and be more scalable and resilient.

Each component will be developed by an independent team who can make changes and deploy more quickly than a larger team working on a tightly coupled system.  Since deployments are more frequent, the changes between each release will be smaller, reducing the risk of failure and time to repair.

Such a system can be scaled more easily by running more instances of individual components (fine-grained scalability).

Such a system is more resilient, since failure of one component instance will not affect the rest of the system, and the failed instance can be easily replaced.

Organisations that have adopted this approach have seen significant improvements in development productivity, application performance and reliability.

Data Fragmentation

Unfortunately, there are no silver bullets.  In this case there is a data fragmentation grenade to beware of.

To achieve the loose coupling between components that makes cloud native work, each component needs to have its own independent data store.

Let’s just think about that: to be independent, each instance of each component, such as a microservice, must have its own data store.  If you deploy a new instance, you deploy the whole stack for that microservice, including the data store.

Such a design has significant consequences for data management:

  • The number of data stores for an organisation will increase (dramatically).
  • The data a microservice instance holds is queried via the API for that instance, as sketched below.
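
Here is a minimal sketch of such an instance, written in Java with hypothetical names and data. A plain in-memory map stands in for the embedded data store; the point is that the store belongs to this one instance and is reached only through the instance's API.

    import com.sun.net.httpserver.HttpServer;
    import java.io.OutputStream;
    import java.net.InetSocketAddress;
    import java.nio.charset.StandardCharsets;
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    public class CustomerService {
        // The instance-level store: created with the instance, gone when it dies.
        private static final Map<String, String> store =
                new ConcurrentHashMap<>(Map.of(
                        "cust-42", "{\"name\":\"Alice\",\"tier\":\"gold\"}"));

        public static void main(String[] args) throws Exception {
            HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);
            // Other services query this data via the API, never the store directly.
            server.createContext("/customers/", exchange -> {
                String id = exchange.getRequestURI().getPath()
                        .substring("/customers/".length());
                byte[] body = store.getOrDefault(id, "{}")
                        .getBytes(StandardCharsets.UTF_8);
                exchange.getResponseHeaders().add("Content-Type", "application/json");
                exchange.sendResponseHeaders(200, body.length);
                try (OutputStream os = exchange.getResponseBody()) {
                    os.write(body);
                }
            });
            server.start();
        }
    }

Deploying a new instance means deploying this whole stack again, map and all; two instances never share a store.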

When I first described this approach to a colleague with many years of experience in data management, he nearly fell out of his chair.

He said: “You can’t just deploy a new database for a new service instance. I can see how you deploy a new instance of the application, but not the database!”

But I said “data store” not “database”.  And it is crucial to consider what type of store we should use and when.

A cloud native development team typically:

  1. Works with a data store that will only be used by a single microservice instance.
  2. Needs to develop as quickly as possible.
  3. Likely has a changing or partially understood data model and/or “unstructured” (semi-structured) data.

For them, a data store that is local to the microservice, has schema flexibility and can store the data in a developer-friendly format is very attractive.

The downside is that your data is fragmented across these multiple stores, which is less attractive for others who want to consume that data.

Personalisation

For years organisations have struggled to harness the data that they have and to derive value from it.  They have long wanted to have a “single source of truth” or “single view of the customer”.

Now, to succeed in the digital era, our ability to use our data and personalise our offer is key to building relationships with our customers and increasing their loyalty.

As we discussed above, our data has now been fragmented, just when we need to unify it to drive effective personalisation.

Squaring the Circle – Personalisation and Differentiation

So how can we reconcile the requirements to deliver:

  • Personalisation (needs unified data that is easy to consume)
  • Differentiation (needs loosely coupled components, which in turn need independent data stores)

I think the key is to think about the lifecycle of data and to recognise that it needs to change its shape and location over that lifecycle.

Think about a DNA data packet, such as a person or an ant or a crocodile.  That’s what we, and all living things, essentially are – data stores for our genes.

The DNA remains the same, but the animal changes over the course of its life, in terms of form, function and location.

So consider the lifecycle of the data.  

For differentiation, we need a minimal subset of the data.  It needs to be held close to the processing, ideally co-located.  It needs to be in a form that is easily consumed at runtime, ideally the same format the code will use.

For personalisation, we need access to all the data we have for the customer, and the ability to correlate and enrich it.  Now the data needs to be in a more central location so it can be accessed by multiple systems.  It needs to be held in a neutral format that can be read by all systems, and it makes sense to move the processing to the data, rather than the other way round.
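
As an illustration of moving the processing to the data, here is a sketch (connection details and schema are hypothetical) that pushes an aggregation down into a central relational store via plain JDBC, instead of dragging every row back to the application:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;

    public class SpendByCustomer {
        public static void main(String[] args) throws Exception {
            // Hypothetical connection details and schema; assumes a JDBC
            // driver for the central store is on the classpath.
            try (Connection conn = DriverManager.getConnection(
                         "jdbc:oracle:thin:@//db.example.com:1521/sales", "app", "secret");
                 // The SUM and GROUP BY run inside the central store:
                 // the processing has moved to the data.
                 PreparedStatement stmt = conn.prepareStatement(
                         "SELECT customer_id, SUM(amount) FROM orders GROUP BY customer_id")) {
                try (ResultSet rs = stmt.executeQuery()) {
                    while (rs.next()) {
                        System.out.println(rs.getString(1) + " -> " + rs.getBigDecimal(2));
                    }
                }
            }
        }
    }

Only the aggregated results cross the network, not the raw order history.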

So, with those considerations in mind, if I were creating a microservice architecture, at a minimum I would have the following data storage tiers:

  1. Instance level stores that each hold the data for a single instance of the microservice, in a “runtime format” (e.g. JSON).  Each store would exist for the duration of the instance.
  2. Service level stores that hold the data for a service that needs to exist separately from individual instances. This would be aggregated from / supplied to individual instances and should live as long as the service. The data should be held in a runtime format.
  3. Domain level stores that hold data aggregated from and shared with the microservices for a specific domain.  This type of store would exist for the lifespan of the domain, and the data would be held in a common format for the domain.
  4. Organisation level stores that hold data in a domain and application neutral format so that it can be consumed across different domains.

The movement of data between the different tiers should be automated.

The timescale for data movement should be configurable for individual data items, ranging from (near) real-time (e.g. “write-through”) to scheduled tasks (“batch”).

The movement could be uni-, bi-, or multi-directional.
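
As a sketch of what the write-through end of that range might look like (the DomainStore interface and all names here are hypothetical, not any particular product's API):

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Hypothetical interface for the domain-level tier.
    interface DomainStore {
        void upsert(String key, String json);
    }

    // A service-level store that keeps data locally in a runtime (JSON)
    // format and automatically pushes each write to the domain tier.
    class ServiceLevelStore {
        private final Map<String, String> local = new ConcurrentHashMap<>();
        private final DomainStore domainTier;

        ServiceLevelStore(DomainStore domainTier) {
            this.domainTier = domainTier;
        }

        // Write-through: the local write and the propagation happen together.
        // A batch variant would append to a queue and drain it on a schedule.
        void put(String key, String json) {
            local.put(key, json);
            domainTier.upsert(key, json);
        }

        String get(String key) {
            return local.get(key);
        }
    }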

In this way we can deliver our digital offer with:

  • Differentiation
  • Personalisation
  • Speed to Market
  • Performance
  • Scalability
  • Reliability

Good News

One of my favourite quotes is from Yogi Berra: "In theory, there is no difference between theory and practice.  In practice, there is."

The good news is that this post is written from my own experience, and that of colleagues at Oracle.

We've all been working on distributed systems, with challenging requirements around scalability and performance.  For my part much of this has been in financial services, online gaming and stock trading applications.  Some of my colleagues have been building solutions for industries such as healthcare and travel, or domains such as IoT (Internet of Things).

So this isn't just a theoretical approach.  It's what has worked for us.

Toolkit

As I said before, there are no silver bullets, but my colleagues and I have some tried and tested tools that we have used to build these systems.

  • For instance or service level stores:
    • Oracle Coherence
    • Oracle NoSQL DB
    • Oracle TimesTen
    • Berkeley DB
  • For domain / organisation level stores:
    • Oracle DB
    • MySQL

These are obviously Oracle tools, but then I work for Oracle, these are the tools I have used, and I can only speak from my own experience.  Others with different backgrounds or experience might suggest other tools.
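
To give a flavour, here is a minimal sketch of an instance or service level store using Coherence's NamedCache API (the cache name and data are hypothetical, and it assumes a recent Coherence release with the generic API):

    import com.tangosol.net.CacheFactory;
    import com.tangosol.net.NamedCache;

    public class CustomerCacheExample {
        public static void main(String[] args) {
            // Obtain (or create) a named cache from the Coherence cluster.
            NamedCache<String, String> customers = CacheFactory.getCache("customers");

            // Hold the customer profile in a runtime (JSON) format,
            // close to the service that uses it.
            customers.put("cust-42", "{\"name\":\"Alice\",\"tier\":\"gold\"}");
            System.out.println(customers.get("cust-42"));

            CacheFactory.shutdown();
        }
    }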

I plan to go into more detail on how to use these tools in future.
