Why must every Kafka message be versioned?

Apache Kafka is an open-source distributed event streaming platform used by thousands of companies for high-performance data pipelines, streaming analytics, data integration, and mission-critical applications.

It is no secret that Apache Kafka has recently become a standard for communication between microservices.

Becoming a standard entails the need for proper operation and usage.

One key to proper usage is to provide every message sent to Kafka with version information. Here is why.

What exactly is the version information?

It is any information about the version of the data being sent. Its goal is to let a Kafka consumer split the whole flow of incoming messages into two streams by that version: old messages and new messages.

I suggest using plain strings as the version in every message sent to Kafka. Whether or not you use “version.subversion” notation is up to you.

Using “version.subversion” (or “major.minor”) notation gives you the possibility to distinguish updates that break the protocol from those that do not.

Moreover, strings are more flexible than integers, numbers are awkward to handle correctly (consider how monetary values are handled), and objects are overly complicated. Strings are the minimal and sufficient format for version information.

What exactly is the version information intended for?

Building an application is a continuous process. Every programmer evolves an application to meet changing business needs. So an application that once started as a single Kafka producer, writing messages into only one Kafka topic, may someday grow to contain several producers and even several consumers.

Therefore, the points of versioning are:

  1. Every message must be versioned
  2. Version each topic independently – do not reuse the same version counter for topic A and topic B

What version is the first one?

Just use “1.0” for every new topic. For minor changes, increment the minor part – “1.1”. For big changes to the code, increment the major part – “2.0”.

When in doubt, follow established software versioning approaches.
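The “major.minor” convention above can be sketched in a few lines. This is a minimal illustration in Python; the function names are my own, not part of any Kafka API. Note that comparing raw strings lexicographically would be wrong (“1.10” sorts before “1.2”), which is why the parts are parsed into integers first.

```python
def parse_version(version: str) -> tuple:
    """Split a "major.minor" version string into integer parts."""
    major, minor = version.split(".")
    return (int(major), int(minor))

def is_breaking_change(old: str, new: str) -> bool:
    """A bump of the major part signals a protocol-breaking change."""
    return parse_version(new)[0] > parse_version(old)[0]
```

So `is_breaking_change("1.0", "1.1")` is false (a compatible update), while `is_breaking_change("1.1", "2.0")` is true.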

How to integrate version into Kafka messages?

… sending objects …

When sending objects to Kafka (e.g., JSON), the version can be placed in an additional top-level field of the object.

{
  "id": 12345,
  "name": "Product #1",
  "version": "1.0"
}

Or you may wrap the message in a version envelope like this.

{
  "payload": {
    "id": 12345,
    "name": "Product #1"
  },
  "version": "1.0"
}

Which option you choose depends only on your needs and technical constraints.
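The envelope variant can be sketched as a pair of helper functions. This is a hypothetical illustration in Python using only the standard library; `wrap` and `unwrap` are names of my choosing, not Kafka APIs.

```python
import json

def wrap(payload: dict, version: str) -> str:
    """Wrap the business payload in a version envelope before producing."""
    return json.dumps({"payload": payload, "version": version})

def unwrap(raw: str):
    """On the consumer side, extract the payload and its version."""
    message = json.loads(raw)
    return message["payload"], message["version"]
```

The consumer first looks at the version and only then decides whether to touch the payload at all.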

… sending plain data …

If you send plain data to a Kafka topic, you can use Kafka message headers to store the version.
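Kafka record headers are key/value pairs whose values are raw bytes. The sketch below models them as `(str, bytes)` tuples, the shape used by the kafka-python client; the header name `message-version` is my own choice, and no broker is needed to illustrate the idea.

```python
from typing import List, Optional, Tuple

def version_header(version: str) -> Tuple[str, bytes]:
    """Build a header carrying the message version (header name is our choice)."""
    return ("message-version", version.encode("utf-8"))

def read_version(headers: List[Tuple[str, bytes]]) -> Optional[str]:
    """Find the version header on the consumer side, if present."""
    for key, value in headers:
        if key == "message-version":
            return value.decode("utf-8")
    return None
```

With kafka-python, for example, such a header would be attached at produce time via the `headers` argument of `KafkaProducer.send(...)` (assuming a client version that supports headers).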

How exactly does the version help?

Evolving microservices

At present, most enterprise applications are developed to run in containers. Enterprises do many things to improve the availability, reliability, and fault tolerance of their applications.

There are two rules of thumb for fault-tolerant, reliable production services that relate to the theme of this article:

  1. tolerate evolving data
  2. never run an application as a single instance

The first rule means the application must handle evolving data gracefully.

Running several instances of an application helps spread the load more or less equally across all running instances.

At present, the overwhelming majority of enterprise applications are hosted in cloud infrastructure, in clusters with deployment automated by Kubernetes or AWS. Applications are updated in steps like:

  1. Start the new version of the application
  2. Move the load from the old application instances to the new ones
  3. Shut down all instances of the old version

Keeping all these points in mind, you must teach your application not to process data that is not meant for it.

Updating the version of the messages produced to Kafka lets the consumer side distinguish them.

Consider the whole process of updating both producers and consumers.

pic. 1 – simple producer-to-consumer messaging system

Assume the producer is updated first, bumping the version of the messages it produces.

pic. 2 – updating producer: old and new instances are running both

The new producer starts as a new instance and runs in parallel with the old producer instance – both produce messages into the same topic, differing only in message version. The consumer reads both old and new messages but processes only the old ones, since its check is set to version == "1.0".
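That consumer-side version check can be sketched like this. It is a minimal illustration; `handle` and `EXPECTED_VERSION` are hypothetical names, and the real business logic is elided.

```python
EXPECTED_VERSION = "1.0"  # the version this consumer instance understands

def handle(message: dict) -> bool:
    """Process a message only if its version matches; skip others.

    Returns True when the message was processed, False when skipped.
    """
    if message.get("version") != EXPECTED_VERSION:
        return False  # an alien (newer or older) message: ignore it
    # ... real business logic would go here ...
    return True
```

The crucial detail, as the next section shows, is what happens to the offsets of the messages this filter skips.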

The next step is updating the consumer.

pic. 3 – updating consumer: two instances of consumer are running

The next step is shutting down the old instances of the producer and the consumer.

Everything looks fine at first glance – but what if I told you that you lose data when you update your applications this way?

Do not lose your data!

Take a look at pic. 3. Do you see the gap between this step and the previous one depicted in pic. 2?

When the new producer A-2 starts sending messages versioned “2.0”, the old consumer A reads them along with the messages versioned “1.0”. Yes, the consumer filters them and processes only the “1.0” messages, but it still commits the offset of every message it reads! Therefore the new consumer A-2 will never re-read the “2.0” messages: Kafka will not deliver them again, because from its point of view Consumer Group 1 has already read them all successfully. This is the worst thing that can happen.
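The loss can be reproduced with a toy model: a single partition as a list of records, and one committed offset shared by the consumer group. This is a simulation of Kafka's offset-commit semantics, not real client code.

```python
topic = [
    {"offset": 0, "version": "1.0"},
    {"offset": 1, "version": "2.0"},  # produced by the new producer A-2
    {"offset": 2, "version": "1.0"},
]

committed = 0  # next offset to read, shared by the whole consumer group

# Old consumer A: filters on "1.0" but commits every offset it reads.
processed_by_old = []
for record in topic[committed:]:
    if record["version"] == "1.0":
        processed_by_old.append(record["offset"])
    committed = record["offset"] + 1  # committed regardless of filtering!

# New consumer A-2 joins the SAME group and resumes from the committed offset.
processed_by_new = [r["offset"] for r in topic[committed:] if r["version"] == "2.0"]
```

After the old consumer runs, `committed` already points past the end of the partition, so `processed_by_new` is empty: the “2.0” message at offset 1 was skipped by the old consumer yet will never be delivered to the new one.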

How not to lose your data?

Manual deployment

The first thing you can do is shut down old producer A before starting new producer A-2. But this is not the preferred way, because it breaks the very idea of automated deployment.

Use a different consumer group

The reasonable way is to create a new consumer group 2 and start the new consumer A-2 in that group. This way Kafka will deliver all messages, old and new, to both Consumer Group 1 and Consumer Group 2 until the old producer is stopped. After the old producer stops, the old consumer runs out of old messages to read, because no one produces them anymore. Only at that point should the old consumer be stopped.

pic.4 – second consumer group
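Extending the same toy model shows why a second group fixes the loss: each consumer group tracks its own committed offset independently, so commits by group 1 do not advance group 2's position. Again, this is a simulation, not real client code.

```python
topic = [
    {"offset": 0, "version": "1.0"},
    {"offset": 1, "version": "2.0"},
    {"offset": 2, "version": "1.0"},
]

# Each consumer group tracks its own committed offset independently.
committed = {"group-1": 0, "group-2": 0}

def consume(group: str, wanted_version: str) -> list:
    """Read from the group's own position; commits stay within that group."""
    processed = []
    for record in topic[committed[group]:]:
        if record["version"] == wanted_version:
            processed.append(record["offset"])
        committed[group] = record["offset"] + 1
    return processed

old = consume("group-1", "1.0")  # old consumer in Consumer Group 1
new = consume("group-2", "2.0")  # new consumer A-2 in Consumer Group 2
```

Now the new consumer does receive the “2.0” message at offset 1, because group 2 started from its own offset rather than inheriting group 1's commits.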

Think in advance

All kinds of cases happen in an enterprise – and the bigger the enterprise, the more interesting the cases. Sometimes a case is urgent: the enterprise starts losing money or reputation, and the issue must be solved as soon as possible.

The version is a property that helps you at the production stage: it lets you separate data flows just by updating a single field.
