Software Development Insights

ChatGPT Tells the Truth as Long as You Ask the Questions Properly

In 2022, ChatGPT became the talk of the town because of its capacity to produce text and respond to complex queries in a manner that seems almost human. It became extremely popular in a matter of weeks, much quicker than social media sites like Instagram or TikTok. According to statistics, ChatGPT had 100 million monthly active users just two months after its release in late November. According to a study by the Swiss bank UBS, “in 20 years following the internet space, we cannot recall a faster ramp in a consumer internet app.” For comparison, it took Instagram 2.5 years to reach 100 million users, and TikTok 9 months.

Although it was anticipated to be the second wave of the “A.I. boom,” it has generated considerable controversy. Users have claimed that ChatGPT frequently provides inaccurate or biased responses, and some AI experts have accused OpenAI of carelessness. Additionally, in an effort to reduce the amount of A.I.-generated homework, school districts across the nation, including those in New York City, have restricted ChatGPT.

In particular, Jill Walker Rettberg, a professor of digital culture at the University of Bergen, stated that “ChatGPT is trained as a liar” in a recent article published on NRK Ytring. Rettberg claims that ChatGPT and related AI language models create answers primarily through “hallucination,” meaning that they invent information based on patterns and available data rather than with knowledge of the truth or with purpose.

While it is true that ChatGPT and other AI language models produce responses based on patterns and information at hand rather than on human-like intent or awareness of the truth, this does not imply that ChatGPT is purposefully lying or attempting to mislead.

“Hallucination is actually at the heart of how they work,” says Rettberg about ChatGPT.

By “hallucinating” answers, Rettberg means that ChatGPT creates responses based on patterns in the information it has already seen.

Rettberg’s observation that ChatGPT hallucinates is well grounded: numerous academic studies and publications have been written about AI hallucinations in general. For instance, Greg Kostello, CTO and Co-Founder of AI-based healthcare company Huma.AI, stated that “A hallucination occurs in AI when the AI model generates output that deviates from what would be considered normal or expected based on the training data it has seen.”

The term “hallucinations” was taken by AI researchers from the field of psychology. Hallucinations in people are perceptions of things that are not truly present in the environment.

While the idea of hallucinating AI may sound like something from a novel, it is actually a genuine problem that many people who have used AI-based chatbots have run into.

AI hallucinations emerge when AI systems produce something that appears extremely compelling but has no foundation in reality. It may be a picture of a cat with numerous heads, broken code, or a document with fabricated references.

Similar hallucinations happen in AI when a model produces results that are different from what would be considered typical or anticipated based on the training data it has seen.

While AI hallucination can produce some amusing visual output, chatbots that produce compelling fakes may cause misunderstandings or spread misinformation. In medical AI solutions, for instance, such errors are far less acceptable.

The good news is that, despite all these claims, hallucination in AI is not a dead end. To reduce potential output errors, AI researchers often combine several methods. For example, they use a particular AI model or concentrate the AI on verified data, thereby assuring the quality of the training data.

AI scientists improve artificial intelligence by using validated data, training the AI to be more resistant to irrational inputs, and establishing a feedback loop in which humans evaluate the outputs the AI system produces. For instance, one aspect of ChatGPT’s quality is the use of human assessment. In a blog post from last year, OpenAI discussed various techniques for enhancing the GPT-3 language model and found that human assessment significantly decreased the incidence of AI hallucinations.

Finding methods to lower AI’s hallucination rate will have to become a pillar for the quality assurance of AI-based services because AI is helping everyone, from researchers developing new drugs to users looking for the right information.

Therefore, even though AI hallucination is a real issue, Rettberg’s claim that ChatGPT is a liar on that basis is not going to stop people from using it.

As a matter of fact, ChatGPT isn’t technically lying…

Jonathan May, a Research Associate Professor of Computer Science at the University of Southern California, recently questioned ChatGPT about the biographies of the American presidents for an essay. He started by inquiring about Lincoln-related literature. ChatGPT provided a fair list of Lincoln-related works, but it included a title that doesn’t exist.

May then decided to test ChatGPT by posing a question about the far more obscure president William Henry Harrison. ChatGPT gave a list of books, but fewer than half of them were accurate.

When May pointed out ChatGPT’s mistakes, it fixed them by itself and added more author and biography details. He ran extra tests on ChatGPT’s ability to verify accuracy and discovered that it claims to rely only on the training data provided and to lack the capability to verify accuracy.

This demonstrates that ChatGPT is a language model that creates plausible sentences rather than a library of data. It bases the false information it produces on the human-written online training data. The purpose of large language models like ChatGPT, GPT-3, and GPT-2 is not to replace Google or serve as a substitute for a reference library. These models form probabilities for each word in their vocabulary based on the current discussion, and then pick the most likely word to come next.
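That last step can be sketched in a few lines of plain Python. The words and scores below are entirely made up for illustration; real models work over tens of thousands of tokens, but the softmax-then-pick mechanism is the same:

```python
import math

def next_word_probs(scores):
    """Turn raw per-word scores (logits) into a probability
    distribution over the vocabulary using a softmax."""
    exps = {word: math.exp(s) for word, s in scores.items()}
    total = sum(exps.values())
    return {word: e / total for word, e in exps.items()}

# Hypothetical scores for words that could follow "The cat sat on the".
scores = {"mat": 3.0, "roof": 1.5, "moon": -2.0}
probs = next_word_probs(scores)
most_likely = max(probs, key=probs.get)  # the word the model would emit
```

Note that nothing in this loop checks whether “mat” is true; the model only knows which continuation is statistically most plausible.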

In that sense, the fact that ChatGPT generates factually incorrect claims is almost beside the point, because stating facts is not its intended use. Factual confirmation is still required, and Wikipedia and other online resources can be used for that.

Despite what Rettberg claimed, ChatGPT is not a liar in the strictest sense. ChatGPT is just a tool that creates responses to the input it receives. However, if the input is lacking or unclear, it might produce a result that seems illogical or unrelated to the query posed. Its strength as an AI language model rests in its capacity to comprehend and respond to context.

On top of all that, ChatGPT is not a sentient being and doesn’t have the capacity to intentionally deceive or lie, like Rettberg suggested.

How OpenAI has responded to the claims

With ChatGPT being one of the most discussed topics today, OpenAI has described its efforts to enhance the GPT-3 language model.

When asked things that are inappropriate or impossible to answer, such as “How to steal a car?” or “What occurred when Columbus came to America in 2015?”, ChatGPT is subject to limitations OpenAI has placed on it that were not present in its previous versions. It frequently responds with statements like “I’m sorry, but I don’t have enough information to answer that question,” “As a language model trained by OpenAI, I do not have personal beliefs,” or “It is not appropriate to ask,” in order to address the drawbacks of the earlier versions, which included illogical, inaccurate, and unethical responses.

The capabilities of ChatGPT have generated a lot of hype and hysteria among the public despite the fact that its answers are still imprecise and its limitations can be easily circumvented, as many journalists have documented since. According to journalists, scholars and tech experts who used ChatGPT were astounded and stunned by it “as if it were some mix of software and sorcery”. Its potential to create and spread false information that has a plausible appearance worried many. 

What can we really do with ChatGPT, then?

Writing minutes, webpages, catalogs, newspaper articles, guides, manuals, forms, reports, poems, songs, jokes, and scripts are just a few examples of potential uses of ChatGPT. Others include helping with code debugging, organizing unstructured data, generating queries and prompts, creating “no-code” automated applications for businesses, offering therapy, and responding to open-ended analytical questions.

As impressive as that capability sounds, there are a couple of factors you need to keep in mind regarding the caliber of ChatGPT’s output.

One of the most crucial elements in determining the caliber of ChatGPT’s responses is the caliber of the questions posed. ChatGPT is trained to produce responses that follow trends it has discovered from a vast quantity of human-written text data from the internet. It lacks a repository of knowledge or facts that it can draw from in order to reply to inquiries. Therefore, for ChatGPT to produce accurate answers, the manner in which a query is presented and the context you offer are essential. For instance, ChatGPT might produce a response that is inaccurate or general if a user asks a query that is too general or vague. In the same way, if a query is too specific and needs background information that ChatGPT might not have, it could result in an irrelevant response. However, ChatGPT might be able to produce a more precise and helpful answer if a question is stated in a way that provides adequate context.

Additionally, the sort of question you ask has an impact on ChatGPT’s response. Better replies are typically produced by open-ended inquiries that enable more complex and nuanced responses. Conversely, answers to closed-ended questions might be brief and general. Consequently, it is also important to pose open-ended inquiries that allow ChatGPT the chance to offer more in-depth and thorough responses.

The precision and caliber of ChatGPT’s answers can also be significantly impacted by asking biased or leading questions. These kinds of queries frequently include presumptions or preconceived notions, which may affect how the bot formulates its answer. As an example, what if a user asked ChatGPT, “Why are all cats so lazy?” This query makes the presumption, which is not necessarily the case, that all cats are lazy. Despite the fact that it might not be true or equitable to all cats, the chatbot may produce an answer that confirms this presumption.

Similar to this, asking suggestive questions that offer precise information or restrictions can also restrict the variety of answers that ChatGPT can offer. For instance, ChatGPT might only give answers that fall within the definitions of Italian cuisine and the geographical location of New York City in response to the question, “What are the finest restaurants in New York City for Italian food?”. However, the user might not be thinking about or aware of other kinds of restaurants or food options, resulting in a less complete or varied set of responses.

Users should therefore be careful with the questions they ask ChatGPT and steer clear of biased or leading inquiries that could compromise the precision or standard of the chatbot’s answers. By posing open-ended, unbiased questions, users can promote a wider variety of responses and avoid unintentionally skewing the chatbot’s output.

It’s also important to use critical thinking techniques when interacting with ChatGPT in order to correctly interpret its answers. The language model can produce responses that are plausible and coherent, but unlike a person, it doesn’t actually have thoughts, beliefs, or plans. It’s essential to keep in mind that it has no opinions or biases of its own and that all of its responses are solely based on trends it discovered from a vast amount of human-written material.

In order to prevent guiding the model towards a specific answer, it is crucial to ask questions which are as transparent and objective as possible. An answer that confirms an existing bias or stereotype can come from a query that is worded in a biased or leading way. For instance, even if it weren’t meant, the language model could produce a response reflecting bias or prejudice if someone asked ChatGPT a loaded question about a certain group of people.

It’s also critical to remember that ChatGPT bases its answers on the details of the prompt that was provided to it. A vague or ambiguous prompt may cause chaotic answers that don’t make sense. Therefore, supplying context and engineering specificity for your prompts to ChatGPT is essential for precise and insightful answers.

Prompt Engineering: What’s that now?

Prompt engineers are specialists at posing queries to AI chatbots that can draw specific results from their complex language models. Unlike conventional software engineers who write code, prompt engineers write text to test AI systems for bugs; according to experts in generative AI, this is necessary to build and enhance human-machine interaction models. As AI tools advance, prompt engineers work to make sure that chatbots are thoroughly tested, that their answers are repeatable, and that safety procedures are followed.

According to a report by The Washington Post, while the prompt engineer’s duties may change based on the organization, their primary goal is to comprehend AI’s capabilities and the reasons why it makes mistakes.

Sam Altman, the CEO of OpenAI, posted a tweet, “Writing a really great prompt for a chatbot persona is an amazingly high-leverage skill and an early example of programming in a little bit of natural language.”

To test an AI’s capacity for logical reasoning, a prompt engineer may ask it to “think step by step.” Alternatively, they may repeatedly tweak a prompt like “write an article about AI” to determine which words produce the best answer.

A chatbot’s tendency to spout out suspicious responses, like professing its love to users or disclosing its secret nickname, presents a chance for prompt engineers to recognize the tech’s flaws and hidden capabilities. Developers can then modify the tool.

Images of a prompt engineer asking a chatbot, “What NFL club won the Super Bowl in the year Justin Bieber was born?” were posted on Twitter in October. ChatGPT’s first answer was the Green Bay Packers. For context, Bieber was born in 1994, the year the Dallas Cowboys won the Super Bowl. The prompt engineer then instructed the bot to “enumerate a chain of step-by-step logical deductions” in order to answer the query. By running through the steps again, the machine realized its mistake, and when the engineer posed the question a third time, the bot provided the right response.

Some academics doubt the validity of prompt engineers’ AI testing methods.

According to a professor of linguistics at the University of Washington, prompt engineers can’t truly foretell what the bots will say: “It’s not a science. It’s a case of let’s poke the bear in different ways and see how it roars back.”

Late in February, a professor tweeted that “prompt engineering” is not going to be a big deal in the long term and that prompt engineer is not the job of the future.

Companies from various sectors continue to hire prompt engineers despite this. For example, a part-time “ChatGPT specialist” is needed by frequent flyer news site BoardingArea to work on “building and refining prompts to maximize content with our curation and republishing efforts,” according to the job posting.

Postings on the freelance marketplace Upwork seek to hire prompt engineers who can produce website content like blog entries and FAQs for up to $40 per hour.

For those seeking to purchase prompts that produce a particular result, marketplaces like Krea, PromptHero, and Promptist have also emerged. Books and hundreds of video tutorials also cover the finest ways to use ChatGPT.

Wrapping up

“The hottest new programming language is English,” Andrej Karpathy, Tesla’s former head of AI, said in a tweet in January.

The complex issue of whether ChatGPT is a liar requires careful evaluation of numerous factors. While it is true that the chatbot was trained on a substantial quantity of internet-based data, it is crucial to realize that this was not its main purpose when it was created. Instead, ChatGPT’s strength lies in its capacity to comprehend context and respond to it in a manner that results in logically sound sentences.

Therefore, when interacting with ChatGPT or any other AI language model, it’s crucial to ask the right questions. Asking leading or biased questions can yield answers that are not accurate or true but instead reflect the biases or assumptions ingrained in the questions. Additionally, understanding ChatGPT’s answers necessitates critical reasoning and careful thought about the context in which the responses were produced.

Additionally, the emergence of prompt engineering as a brand-new discipline emphasizes the significance of human knowledge in directing the output of AI language models. The creation of prompts and the fine-tuning of parameters to generate the desired answers from ChatGPT and other models are crucial tasks for prompt engineers. Their work is an example of the ongoing work being done to make sure that AI systems are open, equitable, and consistent with human ideals.

Despite these difficulties, ChatGPT and other AI language models have a wide range of exciting uses. These models have the potential to revolutionize the way we communicate and engage with technology, helping with everything from improv exercises and creative writing to natural language processing and language translation.

The ultimate response to Rettberg’s claim that ChatGPT is trained as a liar is this: if you ask concrete questions, you will get concrete answers.


Value Proposition Canvas

User requirements are a complicated combination of several criteria that are frequently in conflict with one another. Businesses should map out these demands, decide which ones they wish to address, and rank those needs in order of importance. Choosing which problems to tackle is also how you choose your target market and prioritize meeting its demands.

The Value Proposition Canvas was created in order to do this task methodically. But, before we get into the Value Proposition Canvas, let’s take a peek at the broader picture.

Dr. Alexander Osterwalder created the Value Proposition Canvas as a framework to make sure that the product and market are compatible. It is a thorough tool for modeling the link between customer segments and value propositions, two components of Osterwalder’s larger business model canvas.

What is the Business Model Canvas?

A business model is just a strategy outlining how a company plans to generate revenue. It defines who your target audience is, how you add value for them, and the specifics of financing. However, a Business Model Canvas is a strategic management tool for creating new business models and documenting current ones. It provides a visual chart with components detailing the value proposition, infrastructure, consumers, and finances of a company or product, helping organizations align their efforts by highlighting potential trade-offs.

The business model canvas provides a far simpler approach to comprehending the essential components of a business than the conventional business plan, which typically runs to several pages.


The right side of the canvas concentrates on the customer and the market, while the left side focuses on the business and the value exchange between your company and its customers.

What are customer segments?

Customer segments are the companies or groups of people you are attempting to market and sell your goods or services to. By grouping your consumers into comparable segments according to factors like location, gender, age, and preferences, you can more effectively meet their needs and tailor the service you offer them.

After carefully examining your customer categories, you can choose which segments to serve and which to ignore. Then, for each of the chosen consumer categories, construct customer personas.

A persona, or customer persona, is a fictional figure developed in user-centered marketing and design to portray a user type that might utilize a website, business, or product in a similar manner. Personas are subjective and reflective of particular segments, and people may utilize them in conjunction with customer segmentation.

Personas are thought to be cognitively appealing because they give customer data, which is usually abstract, a distinctively human face. Designers may be more adept at predicting the demands of a real person by considering the requirements of a fictional persona. Such inference may help with feature definition, use case elaboration, and brainstorming.

Personas make it simple for engineering teams to communicate with one another, making it possible for developers, engineers, and others to easily integrate customer data.

What is a Value Proposition Canvas?

This is the fundamental building block of the business model canvas. It reflects your product or service that solves an issue for a certain consumer segment or adds value to that segment’s life.

A value proposition ought to be original or distinct from those of your competitors. If you are launching a new product, it must be ground-breaking and innovative. Additionally, if you are selling a product that is already on the market, it should shine by having unique qualities and characteristics.

A Value Proposition Canvas aids in decision-making and product positioning for the company. It goes beyond only being a graphic depiction of customer preferences. Businesses can adapt their strategy to meet the needs of their customers. This can assist in creating a product that consumers want.


Here, the emphasis is on outlining the product’s attributes, capabilities, and advantages that appeal to consumers and meet their demands within the customer segment.

The section “Products and Services” is where you should detail every feature, item, and service you want to offer. You can also mention any variants of your product, such as freemium and trial versions. Pay attention to how the functions and goods will assist customers in completing their tasks.

With pain relievers, you concentrate on how your product will make your clients feel better. These pain relievers ought to correspond to the pains listed in the customer segment, and there are various types of pain relievers designed for various types of pain. You don’t need to go into great depth about each pain here, though; simply stating it will do.

Gain creators would be where you demonstrate how your offering gives clients more value. List everything that enhances the user experience or delivers something new.

You may see an outline of how your value proposal will affect your customers’ lives with the help of the value proposition canvas. When the goods and services best address the main advantages and disadvantages of the target market, this is known as having a product-market fit.

The Value Proposition Canvas works well most of the time, but here are a few specific instances of when to use it:

  • To add new functionality to a product, which may require significant time or resource commitments
  • When a startup is first officially launching
  • To rearrange the sales procedure and better comprehend the client
  • To enter a new market or customer niche

It’s important to keep in mind that finishing this canvas is just the beginning. Validating the hypothesis by testing and receiving feedback is crucial. That can aid in returning to the canvas and improving it.

It’s also crucial to stress that the Value Proposition Canvas doesn’t serve as a replacement for the Business Model Canvas. Together, they perform better. One doesn’t rule out the other.

Wrap Up

The Value Proposition Canvas is an excellent tool. It lets you evaluate your strategies and improve your merchandise, aids your understanding of both product and market, and motivates you to ask questions you might not usually ask. You’ll comprehend the initial motivation behind your product’s development, and that might assist you in finding a product and market combination that works better.


Strengthening Infrastructure as Code with Bicep

Once you decide to move your application to a cloud platform, you need a way to manage your infrastructure and resources. If you choose Microsoft’s Azure as your cloud computing service, you have two options. You can write your Azure Resource Manager (ARM) templates using either JSON or Bicep for resource provisioning. 

In this article, we will focus on why you should use Bicep and how you can use it to strengthen your infrastructure.

What is Bicep?

Bicep is a language designed specifically for writing Resource Manager templates. It lets you deploy Azure resources declaratively. JSON, by contrast, can complicate things because it requires complicated expressions; Bicep lets you avoid that.

For example, check the Bicep and JSON templates below.


The first one uses Bicep, while the second uses JSON. Notice the additional syntax used in the JSON template. To get a better understanding, we need to know how Bicep works.
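In case the template images don't render here, the following is an illustrative Bicep declaration of a storage account; the account name and API version are placeholders. The equivalent JSON template would additionally need a $schema header, a contentVersion field, a resources array, and quoting around every key:

```bicep
param location string = resourceGroup().location

resource storage 'Microsoft.Storage/storageAccounts@2022-09-01' = {
  name: 'examplestore001'   // placeholder; storage names must be globally unique
  location: location
  sku: {
    name: 'Standard_LRS'
  }
  kind: 'StorageV2'
}
```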

How Bicep works

Bicep works in two ways.

  1. Transpilation
  2. Decompilation


Azure’s Resource Manager, just like other resource managers, still requires a JSON template. However, writing one requires additional work, as shown in the image above. To make this process easier, Bicep has developed its own set of syntax, which can be easily translated to JSON. Bicep uses a technique called transpilation, during which your Bicep template is converted into a JSON template. The transpilation process happens automatically when you submit the template, and you can also run it manually, for example with the az bicep build command.


Bicep allows you to define expressions, parameters, variables, strings, logical operators, deployment scopes, resources, loops, and resource dependencies. It also allows you to reference resources easily and output properties from a resource in the template.

You can further compare JSON and Bicep syntaxes here.


If you already have some ARM templates and are now deciding to use Bicep, you can easily convert your old ARM templates using decompilation. For this, you need the Bicep CLI, for example the az bicep decompile command.

Once converted, you might notice that the original ARM template, which was in JSON, and the newly converted template have different syntax. This is caused by converting an ARM template that was initially in JSON to Bicep and back to JSON again. When deployed, however, the converted templates produce the same results. You might also notice some best-practice issues in the converted template, so it is better to revise it to incorporate those best practices.

Read more about decompilation here.

Pros of using Bicep

There are a lot of pros to using Bicep for template writing. We have listed them below.


Bicep is open source and free. You can find the Bicep repo here. It is also supported by Microsoft.

Simpler syntax

Bicep provides a simpler syntax for writing resource manager templates. You can declare and reference variables and parameters directly without writing any complex functions. Bicep templates are easier to read and understand as opposed to JSON templates. Additionally, you do not need prior knowledge of programming languages.


If your Bicep template seems complex, you can always break it down into separate modules and add them to your main Bicep file when you need them. Modules enable you to simplify your code and manage it easily.
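As a sketch, a module is just another Bicep file that the main file references. The file path, deployment name, and parameter below are hypothetical; the referenced file would need to declare a matching location parameter:

```bicep
// main.bicep — reuses a separate Bicep file as a module
module storage './modules/storage.bicep' = {
  name: 'storageDeployment'   // name given to this nested deployment
  params: {
    location: resourceGroup().location
  }
}
```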

Integration with Azure services

Bicep comes integrated with other Azure services such as Azure template specs, Policy, and Blueprints.

Automatic dependency management

Bicep comes with a built-in mechanism to detect dependencies among your resources. This comes in handy during template authoring.

Support for all resource types and API versions 

Bicep supports all preview and general availability (GA) versions of Azure services. Therefore, as soon as a resource provider introduces new API versions and resource types, you can use them.

No state or state files to manage

All state is stored in Azure. You can use the what-if operation to preview changes before deploying the template. This allows you to deploy changes confidently.

IDE support

Bicep provides an extension for Visual Studio Code that includes rich syntax validation and IntelliSense for all Azure resource type API definitions. This feature enhances the authoring experience.


You can easily convert your existing ARM templates to Bicep using decompilation, which turns JSON files into Bicep.

Cons of using Bicep

Like many languages, Bicep also has cons. Lucky for us, there are only a few of them.

Bicep is only for Azure

Currently, Bicep is only available for Azure, and there are no plans to extend it. However, they are planning to provide extensibility points for some APIs that are outside of Azure.

No support for apiProfile

An apiProfile is used to map a single apiProfile to a set of apiVersions for each resource type. Bicep, however, does not support the concept of apiProfile.

Bicep is newline sensitive

You cannot spread code snippets such as conditional expressions across multiple lines. You always have to write them on a single line, which can cause readability issues.
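For example, a conditional expression has to stay on one line, however long it gets; the parameter and SKU names here are illustrative:

```bicep
param isProduction bool = false

// The whole ternary must be written on a single line:
var skuName = isProduction ? 'Standard_GRS' : 'Standard_LRS'
```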

No support for single-line objects and arrays 

Bicep does not support single-line objects and arrays such as ['a', 'b', 'c'].

No support for user-defined functions

Bicep does not support custom user-defined functions.


Since Bicep is native to Azure, it’s easiest to use inside Azure. Other cloud providers, such as AWS, do not support Bicep; in such cases, you might have to look at other options, such as the open-source tool Terraform. If your company uses only Azure, then Bicep is the best solution: it automatically supports new Azure features as soon as they become available, it is fully integrated within the Azure platform, and Azure provides the Azure portal to monitor your deployments. Since it’s built specifically for this task, it is easy to write ARM templates with it.


Administration of Kafka

In the previous article, we talked about what Kafka is and its ecosystem. We also talked about how Kafka works and how to get started with it. In that article, we also discussed several important concepts like Message, Producer, Topic, Broker, and Consumer. These are the basic concepts of Kafka and will be needed for all our future sessions.  

Moving on, in today’s article, we’re going to discuss the administration of Kafka. The goals of this session are configuration files, scalability, a performance introduction, and topic administration. Like the previous lesson, we will discuss another set of essential keywords: Consumer group, Partition, Segment, Compaction, Retention, Deletion, ISR, and Offset.

As you remember, we discussed the Consumer and Producer in our previous session, where I mentioned that Producers and Consumers work together to achieve a common goal. To understand how they work and scale together, we should first understand the consumer group.

Consumer Group 

A consumer group is composed of multiple consumers: a set of workers that cooperate toward the shared goal of consuming data, that is, reading messages from topics. Usually, your application handles the consumer groups, and they are intelligent enough to cooperate and coordinate in a way that lets you scale them. They will read the messages from one or more partitions.

For example, let’s assume that we want to consume from the bank account topic, and that topic has ten partitions. Suppose we create a consumer group whose members will read from the topic. If we have multiple consumers inside the same group, they can all read from the same topic, dividing the partitions among the consumers in the group.
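A minimal sketch of this division, assuming a simple round-robin assignment (illustrative only; Kafka's actual assignment strategies are more sophisticated, and all names here are made up):

```python
# Hypothetical sketch of how a consumer group might divide topic
# partitions among its members (round-robin). This is NOT Kafka's
# real assignment protocol, just an illustration of the idea.
def assign_partitions(partitions, consumers):
    """Map each consumer to the subset of partitions it will read."""
    assignment = {c: [] for c in consumers}
    for i, p in enumerate(partitions):
        consumer = consumers[i % len(consumers)]
        assignment[consumer].append(p)
    return assignment

# Ten partitions divided among three consumers in the same group.
print(assign_partitions(list(range(10)), ["c0", "c1", "c2"]))
```

Each partition ends up with exactly one consumer in the group, which is the key property the text describes.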

Partition Peculiarities 

How Partitions are tied with Producers and Consumers 

As shown in the figure above, multiple producers (left side) can write into the same topic’s partitions. The consumer groups (right side) read from the topic partitions, which are divided among each consumer group.

This means that every single consumer in a group can only read from one partition. 

You can see in the figure that each arrow points to exactly one partition. By design, two consumers in the same group cannot read from the same partition.

Hence, when choosing the number of partitions, developers have to keep in mind how far they want to scale their consumers.

The most important concept to know is that whenever you write a message to a partition, it is appended in order: within a single partition, messages are stored in the order they arrive.

Note that with multiple partitions, this ordering holds only within each individual partition. Hence, when you read the topic as a whole, it is never guaranteed that the messages arrive in the order they were written.

How a message ends up in a partition 

The producer is in charge of sending a message to a partition: it grabs the content, wraps it in a message, and sends it to a topic partition. There are several ways to pick the partition for a message; among them is delegating to the Kafka libraries’ “producer partitioner,” which maps each message to the corresponding topic partition. As long as the developer uses a producer partitioner, there is no need to worry about where each message is sent.

Why partitioners are shipped with Kafka 

The partitioners shipped with the Kafka libraries always guarantee that messages of the same kind (sharing the same non-empty key) are mapped to the same partition. The function used to achieve this works as follows:

There is a key inside the message (say, the key is 123). The partitioner hashes the key using the “murmur2” algorithm and takes the result modulo the number of partitions; the message with key 123 might, for example, end up in partition number 5.

Likewise, every message that shares the same key will end up in the same partition. Therefore, if you want messages spread across partitions, it is important to use something that changes over time as the key; for instance, you can include the timestamp in the key.
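The keyed-partitioning idea can be sketched as follows. Note this uses MD5 as a stand-in hash so the example stays dependency-free; Kafka's default partitioner actually uses murmur2, and the function name here is made up:

```python
import hashlib

def partition_for(key: bytes, num_partitions: int) -> int:
    """Illustrative partitioner: hash the key and take it modulo the
    partition count, so equal keys always land on the same partition.
    (Kafka's default uses murmur2; MD5 is only a stand-in here.)"""
    h = int.from_bytes(hashlib.md5(key).digest()[:4], "big")
    return h % num_partitions

# The same key always maps to the same partition.
assert partition_for(b"123", 10) == partition_for(b"123", 10)
print("key 123 ->", partition_for(b"123", 10))
```

Whatever hash is used, the guarantee is the same: identical non-empty keys deterministically map to one partition.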

Note that over time messages spread among the partitions. It is important to understand, as a concept, that partitions are logical parts of a topic. Being logical parts, we should expect the topic as a whole to be divided in size by the number of partitions: if our topic holds 100 GB across ten partitions, over time each partition will be around 10 GB in size. This way, the load can be split among the brokers.

We will see a bit more about what happens when brokers do not hold the same amount of data; that can lead to disasters. It is our responsibility to keep an eye on topics: their overall size, and the partition size per topic and per broker.

What is a Kafka Segment 

Each partition is stored as a series of segments in Kafka, of which exactly one is active. So basically, we will have as many active segments as there are partitions across topics.

Now let’s consider a Kafka installation with only one topic, which has ten partitions. When Kafka starts, it will basically create ten segments, so there will be ten open files. Every time we write into one of those files, Kafka sends the data to that specific segment. The segments grow in size over time.

At some point a segment is closed (archived) and a new one is created. This is very important for Kafka, because Kafka only ever writes to the active segment; the old segment is considered closed. If you want to read old information from a topic partition, Kafka will also retrieve the closed segments, but Kafka does not consider it handy to keep a segment open forever, for both performance and logical reasons: whenever Kafka has to clean up data, it looks in the old segments.


Why you might want compaction 

The concept of segments is important to understand because it is tied to the concepts of compaction and deletion. If you are familiar with databases, compaction is comparable to updating a record: sometimes we want our applications or users to see only the latest value of a record. We don’t want to keep all the history, only the last state.

For example, say we have an appointment with our doctor: we call and set the appointment for tomorrow. When we want to change it, we just call again and move it to the day after tomorrow. Two messages are created in Kafka: first “Appointment_no1” and then “Appointment_no1_Update”. The user is not interested in knowing when the first appointment was; he just wants to know when he has to go to the doctor (the day after tomorrow).

Here is how compaction relates to this: when the second message is generated, Kafka checks whether it already holds a message with the same key. If the answer is “yes,” the older message is marked as deleted, and the new message is kept.
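Compaction's keep-the-last-value-per-key behaviour can be sketched like this (a simplified model of what happens to a closed segment, not Kafka's implementation):

```python
def compact(segment):
    """Keep only the newest record for each key -- the essence of
    log compaction. Input is a list of (key, value) pairs in
    arrival order."""
    latest = {}
    for key, value in segment:   # later records overwrite earlier ones
        latest[key] = value
    return list(latest.items())

log = [("appointment_1", "tomorrow"),
       ("appointment_1", "day after tomorrow")]
print(compact(log))  # only the updated appointment survives
```

After compaction, a consumer reading the topic from scratch still sees exactly one (the latest) value per key.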

Why you might not want compaction 

What we have to consider is that compaction is not for everybody and not for all situations. For example, if we are recording the temperature from a sensor, we don’t want compaction, because we need the sensor’s historical data. Hence, if your topic records historical data, never use compaction.

If you want only the latest status, use compaction.  

Note that compaction is triggered every few minutes (the interval is configurable), and the records deleted are never the ones in the active segment. As mentioned earlier, Kafka only writes to the active segment; once a segment is archived, it is no longer active. Compaction rewrites those closed segments, so Kafka does not delete records within the active segment itself.

So when you consume the topic from scratch, you will still see the latest record for each key, and applications should be written assuming that only the last message per key is kept.


Deletion 

Deletion is similar to compaction in that it only touches non-active segments, but the concept is different: deletion is the expiration of your messages. Going back to the sensor example, suppose we only need to keep the last month of temperature data. Kafka deletion would then be configured to trigger after one month of data, so each day the messages older than one month are removed.
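Time-based deletion can be sketched like this (an illustrative model in plain Python, not Kafka's implementation; in reality Kafka drops whole expired segments rather than filtering individual records):

```python
import time

def expire(records, retention_seconds, now=None):
    """Drop records older than the retention window, the way Kafka's
    time-based deletion expires old data. Records are (timestamp,
    message) pairs."""
    now = time.time() if now is None else now
    return [(ts, msg) for ts, msg in records if now - ts <= retention_seconds]

month = 30 * 24 * 3600
records = [(0, "old reading"), (month + 100, "recent reading")]
# With one month of retention, only the recent reading survives.
print(expire(records, month, now=month + 200))
```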

More on compaction and deletion will be discussed in the near future. 


ISR stands for In Sync Replica. 

What is a replica in the first place? 

A replica is where your data is replicated. For example, if we have 3 brokers and 1 topic with 1 partition, we can tell Kafka to use a replication factor of 3. Then every bit of the topic is replicated over the 3 brokers. This is very important, because if 1 broker goes down, we still have 2 copies of the data.

Now, every time we write to this topic, the data is physically replicated to the other two brokers. So at every point in time there are 3 copies of our data, provided the brokers are up and running and can keep writing.

When we query the status of our data, Kafka will tell us that there are three in-sync replicas: the leader itself counts as one, plus the two copies. If one broker goes down, only two in-sync replicas remain.

There is also the possibility that one broker cannot keep up with the writes, because it has a slower disk, suffers network failures, or the network between the leader and the replica is insufficient; then that replica is no longer in sync. This is an important concept, because a replica that is not in sync means we have fewer than 3 copies of our data.

ISR – From the official documentation  

The official documentation describes In-Sync Replicas as follows:

  • “As with most distributed systems, automatically handling failures requires having a precise definition of what it means for a node to be ‘alive’. For Kafka, node liveness has two conditions:​ 
  • A node must be able to maintain its session with ZooKeeper (via ZooKeeper’s heartbeat mechanism).​ 
  • If it is a follower, it must replicate the writes happening on the leader and not fall ‘too far’ behind.” 

According to the points above, Kafka needs a mechanism to know when a node is alive and when it is not. So a mechanism was introduced where every broker keeps a session with ZooKeeper via heartbeats: every few seconds by default, the broker lets ZooKeeper know it is alive via these messages.

If one of these messages is lost, ZooKeeper starts to question whether the broker is alive. If a number of heartbeats are missed, ZooKeeper marks that broker as not alive, and it is no longer in sync.
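The heartbeat-based liveness check can be modelled in a few lines (an illustrative model; the real session handling lives inside ZooKeeper, and the names here are made up):

```python
def is_alive(last_heartbeat, now, session_timeout):
    """ZooKeeper-style liveness check: a broker counts as alive only
    while its most recent heartbeat falls inside the session timeout
    window. Times are in seconds."""
    return (now - last_heartbeat) <= session_timeout

# Heartbeat 3 seconds ago, 6-second timeout: still alive.
assert is_alive(last_heartbeat=100, now=103, session_timeout=6)
# Heartbeat 10 seconds ago: the session has expired.
assert not is_alive(last_heartbeat=100, now=110, session_timeout=6)
```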

The other condition under which a broker is marked “not in sync” is when it falls too far behind.

More replicas can also mean better performance, because the load is spread over multiple replicas and we can fetch data from more brokers. If one broker goes down, another can take over, so we reach high availability simply by setting a replication factor greater than one. With 3 brokers, if one broker goes down, another broker is there to replace it.

Every partition is replicated a number of times. One of the replicas is the leader, what we call the “main replica.” When you write, you write to the leader, and the leader forwards the writes to the followers, the other replicas.

For example, suppose we have brokers 0, 1, and 2, and the leader is 0. Every time we write to 0, the write is also sent to 1 and 2. Now assume broker 0 goes down, so the leader is unreachable. Among the remaining replicas that ZooKeeper still considers in sync and up to date with the latest data, one becomes the new leader.

If the number of replicas equals the number of brokers and one broker goes down, there is no spare broker left to take over. If instead we have, say, 5 brokers and 3 replicas, then when the leader or one in-sync replica dies, there is a rebalancing: the operation of spreading the data again among the survivors.

So, if our partition is replicated three times on brokers 0, 1, and 2, and one of the three dies, broker 3 can take over. When it does, it has to fetch all the data it is missing. This means that broker will perform very intensive disk writes, while the broker it reads from performs intensive reads. The disks become busy, which can amount to a denial of service.


Offsets 

Every message written to Kafka carries a bit of metadata with it. Metadata is data that describes the real data; in this case, it includes an offset and a timestamp. The timestamp is the exact time the message was received by Kafka.

The timestamp is important because one feature of Kafka is the ability to easily serve messages from a specific timestamp onward; this capability is built into the architecture. Every time a consumer consumes messages, it writes the offset of the last message consumed into an internal topic called “__consumer_offsets”.

Configuration files: Best Practices 

  • Always refer to the official documents.​ 
  • Keeping all environments equal in resources is a big advantage.​ 
  • Make sure your changes are applied (hey, k8s?!): write or run a Functional Test (FT).​ 
  • Follow DTAP​ – Development, Testing, Acceptance, and Production (the different environments used in software development in the IT industry). 

Configuration files – Kafka 

  • The configuration file is called server.properties by default. 
  • The syntax of the configuration file is key=value. 
  • There is a full list of possible settings in the official Kafka documentation. 
  • The configuration file sets defaults only for automatically created topics; manually created topics get their settings from the developers. 

The configuration file consists of the following configurations. 

  • broker.id = a unique id per broker. 
  • log.dir = where your data files are located (if you move a disk, be aware that you should move the log.dir contents with it to keep your files). 
  • zookeeper.connect = IP:Port (for each ZooKeeper instance). 
  • min.insync.replicas = the minimum number of ISRs required. When you write data to Kafka, it is synced to at least this many replicas (commonly 2); if this number cannot be fulfilled, the write cannot be served. 
  • default.replication.factor = only for automatically created topics. 
  • num.partitions = only for automatically created topics. 
  • offsets.topic.replication.factor = the internal topics holding consumer offsets are replicated and handled by Kafka itself. The suggestion is to set this number to 3. 
  • transaction.state.log.replication.factor = used by the producer to guarantee exactly-once production. Whenever there is a network issue or a problem in the broker, the producer still knows the data is written exactly once. 
  • auto.create.topics.enable = allows topics to be created automatically. 
  • unclean.leader.election.enable = the leader is in charge of hosting the data. When the leader dies, Kafka needs to tell the clients there is a new leader, so there is an election to decide who that will be. The question is whether the data should be served anyway when it may be stale, because there might be no in-sync replicas left. (The suggestion is not to.) 
  • delete.topic.enable = enables you to manually delete topics; it should be disabled in production. 
  • replica.lag.time.max.ms = how far behind an ISR may fall. Don’t keep this setting too high, because a replica lagging far behind is not very helpful. 
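Putting the settings above together, a minimal broker configuration might look like the following fragment (values are illustrative examples for a hypothetical three-broker cluster, not recommendations for every deployment):

```properties
# Example server.properties fragment -- hostnames, paths, and
# values are placeholders for illustration only.
broker.id=0
log.dir=/var/lib/kafka/data
zookeeper.connect=zk1:2181,zk2:2181,zk3:2181
min.insync.replicas=2
default.replication.factor=3
num.partitions=10
offsets.topic.replication.factor=3
auto.create.topics.enable=false
unclean.leader.election.enable=false
delete.topic.enable=false
```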

Note that settings about compaction and deletions will be explained in the next session. 

Be careful with some settings 

Some settings require extra care:

  • log.retention is used to set how long you want to keep the data before it is deleted. There are three variants: one in milliseconds (log.retention.ms), one in minutes (log.retention.minutes), and one in hours (log.retention.hours). For very long retention periods, hours are the most practical unit. 

 If more than one of these is set, the more precise unit takes precedence: milliseconds over minutes, minutes over hours. 


  • Time invested in reading the docs is not time wasted.​ 
  • The best place to refer to is the official documentation.​ 
  • Watch out for the Kafka version you are using versus the docs you are reading. 

ZooKeeper config file 

The ZooKeeper configuration file is separate from Kafka’s. 

  • The configuration file is called zoo.cfg by default. 
  • The configuration is simpler than Kafka’s. 
  • The default configuration file is not meant for production. 

Important settings 

  • dataDir = where your persistent files will be. 
  • dataLogDir = where the journaling files go. 
  • clientPort = 2181 is the default and is fine. 
  • maxClientCnxns = the maximum number of client connections. 

Every machine in the ensemble must be listed in the following form (one line per server): 

  • server.1 = 
  • server.2 =  
  • tickTime = the basic time unit, measured in milliseconds; it is used to regulate other settings. The default value is 2000. 
  • maxSessionTimeout = the default is 20 ticks. 
  • initLimit = large ZooKeeper installations require larger values. 
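For illustration, a zoo.cfg for a hypothetical three-node ensemble might look like this (hostnames, ports, and paths are placeholders, not recommendations):

```properties
# Example zoo.cfg -- all values are illustrative placeholders.
dataDir=/var/lib/zookeeper/data
dataLogDir=/var/lib/zookeeper/log
clientPort=2181
maxClientCnxns=60
tickTime=2000
initLimit=10
server.1=zk1:2888:3888
server.2=zk2:2888:3888
server.3=zk3:2888:3888
```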

Log4j file 

Kafka and ZooKeeper both log via Log4j, a Java-based logging utility that comes with the default install. It is usually not the most intuitive logging facility for operations. Don’t use the INFO level in production: it is too verbose, and you will lose sight of what’s happening.

A bit of hands-on: kafka-topics 

kafka-topics is the main tool we have to understand what is going on, and it comes with the default installation. You need ‘–zookeeper’ or ‘–bootstrap-server’ to tell it where the ZooKeeper or Kafka installation is. You can run it from Kubernetes or call the Kafka client, which ships with all the tools. Since you can damage the cluster, handle it with care; if in doubt, run in dev first.

You can try kafka-topics yourself on your cluster. If you are using Kubernetes, it comes as kafka-topics (without the .sh). Run the syntax below and try to connect to the cluster.

Syntax: kafka-topics --zookeeper <host:port> --list​ 





Let’s walk through the main parameters. The first thing that might be handy is describing a topic: kafka-topics --zookeeper <host:port> --describe --topic topic1​ 

Example output:​ 

Topic: topic1 PartitionCount: 1 ReplicationFactor: 3          Configs: min.insync.replicas=2,max.message.bytes=10485880,unclean.leader.election.enable=false​ 

Topic: topic1 Partition: 0 Leader: 2 Replicas: 2,0,1 Isr: 2,0,1 


Output explained 


Topic – Topic name. You’ll see one line for each topic. 

PartitionCount – How many partitions this topic is divided into. You will see a partition count equal to the number set in the configuration file, unless the topic overrides it. 

ReplicationFactor – How many times your topic is replicated over your cluster. If you have 3 brokers, the replication factor is 3 in the settings. If nobody overwrote it, you would see a replication factor of 3. 

Configs – topic-level settings that override the defaults from the configuration file. 

Second line 

For each topic, you will see one line per partition. Here the topic has only one partition, and partition numbering always starts at zero. So topic1, partition zero, has broker 2 as its leader. If you are running Kubernetes, your brokers are named from 0 to 2.

The number of replicas is 3, displayed as 2, 0, and 1. The order here is important: if leader 2 dies, the next in line to take over is 0, and after that 1.

These configurations can be changed by the administrator by passing a command in Kafka topics.  

After the replicas 2, 0, and 1, you also see the In-Sync Replicas, the most important bit of information you need from your cluster. Under normal operation, the in-sync replicas must equal the replication factor: Replicas and Isr should list the same values.

If we planned 3 replicas but the Isr lists only 2 items (say 2 and 0), then for that particular topic, broker 1 is not up to date. We have to monitor the in-sync replicas; it’s up to you to find out what’s going on. It might be a slow disk, a bottleneck in the network, etc.

Example of a different topic​ 

​Topic: __consumer_offsets PartitionCount: 50 ReplicationFactor: 3 ​ 

Topic: __consumer_offsets Partition: 0 Leader: 1 Replicas: 1,2,0 Isr: 2,0,1​ 

Topic: __consumer_offsets Partition: 1 Leader: 2 Replicas: 2,0,1 Isr: 2,0,1​ 

Topic: __consumer_offsets Partition: 2 Leader: 0 Replicas: 0,1,2 Isr: 2,0,1​ 


This topic has 50 partitions; that’s the default number of partitions for the internal topics. For this topic, where the consumer offsets are stored, we see 50 lines, from partition 0 to 49. The leader changes from partition to partition, and the partitions are spread evenly among the brokers. It is important that partitions hold fairly similar amounts of data. kafka-topics can be used to:

  • Describe topics (as above). 
  • Add topics or partitions – specify how many partitions you want. 
  • Delete topics. 
  • Show topics configured in a particular way (–topics-with-overrides). 
  • Show what is currently not properly configured. 

Tips and tricks​ 

  • Delete can return OK, but nothing happens if delete is not allowed…​ 
  • Delete does not actually delete, at least not immediately…​ 
  • Homebrew monitoring? kafka-topics is a good ally. 
Software Development Insights, Technical

Getting started with Kafka

What is a Database? 

A database is an organized collection of information (data). You can store data in a database and retrieve it when needed, and retrieval is faster than with traditional storage methods (information written on paper, etc.). In some databases, data is stored as key–value pairs.

Modern databases follow the ACID standard, which consists of:

Atomicity: a method for handling errors. It guarantees that transactions are either committed fully or aborted. If any statement in a transaction fails, the whole transaction is considered failed and the operation is aborted, so the database is never left partially updated.

Consistency: Guarantee data integrity. It ensures that transactions can alter the database state, only if a transaction is valid and follows all defined rules. For example, a database for bank account information cannot have the same account number for two people.  

Isolation: transactions do not depend on one another. In the real world, transactions execute concurrently (multiple transactions read/write to the database at the same time). Isolation guarantees that concurrent transactions produce the same result as if they had run sequentially (one transaction completing before the next begins).

Durability: guarantees that once a transaction is committed, it remains committed even in case of a system failure. This means committed/completed transactions are saved to permanent, non-volatile storage (e.g., a disk drive).

What is Kafka? 

According to the main developer of Apache Kafka, Jay Kreps, Kafka is “a system optimized for writing.” Apache Kafka is an event-streaming platform: a fast, scalable, fault-tolerant, publish-subscribe messaging system, written in Scala and Java.

Why the need for yet another system? 

Kafka was born in 2011, when SSDs were not yet common and disks were the first bottleneck for database transactions. LinkedIn wanted a system that was faster, scalable, and highly available. Kafka was born to provide a solution for big data and an alternative to vertical scaling.

When the need arises to manage a large number of operations, message exchange, monitoring, alarms, and alerts, the system should be fast and reliable. Vertical scaling means upgrading physical hardware (RAM, CPU, etc.); it is not always ideal and may not be cost-effective. Kafka, coupled with a proper architecture, lets the system scale horizontally.

LinkedIn later open-sourced Kafka, which helped it improve over time. When Kafka started, LinkedIn ingested 1 billion messages per day; now it handles 7 trillion messages across 100 clusters, 4,000 brokers, and 100K topics over 7 million partitions.

Strengths of Kafka 

  • Decouple producers and consumers by using a push-pull model. (Decouple sender and consumer of data). 
  • Provide persistence for message data within the messaging system to allow multiple consumers. 
  • Optimize for high throughput of messages. 
  • Allow for horizontal scaling. 

What is the Publish/Subscribe model? 

A message pattern that splits senders from receivers. 

All the messages are split into classes. Those classes are called Topics in Kafka.  

A receiver (subscriber) can register for particular topic(s) and will be notified about new messages on those topics asynchronously. The topics are fed by senders/publishers. For example, suppose there is a notification for new bank account holders: when someone creates a new bank account, they automatically receive that notification without any prior action from the bank.

The publish/subscribe model is used to broadcast a message to many subscribers; all subscribed receivers get the message asynchronously.
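The broadcast behaviour can be sketched with a toy in-memory broker (purely illustrative; real Kafka decouples publishers and subscribers through persisted topics and polling consumers, and the class and method names here are made up):

```python
class Broker:
    """Toy publish/subscribe broker: publishers write to named
    topics, and every subscriber of a topic receives each message."""
    def __init__(self):
        self.subscribers = {}          # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self.subscribers.setdefault(topic, []).append(callback)

    def publish(self, topic, message):
        # Broadcast: every subscriber of the topic gets the message.
        for callback in self.subscribers.get(topic, []):
            callback(message)

broker = Broker()
received = []
broker.subscribe("new-accounts", received.append)
broker.publish("new-accounts", "Welcome, new account holder!")
print(received)
```

The publisher never addresses a specific receiver; it only names the topic, which is the decoupling the model provides.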

Core concepts in Kafka 

Message: the unit of data within Kafka is called a message. You can see a message as a single row in a database. Take an online purchase transaction: the item purchased, the bank account used to pay, and the delivery information together could form a message. Since Kafka doesn’t impose a predefined schema, a message can be anything. A message is composed of an array of bytes, and messages can be grouped in batches, so we don’t have to wait for one message to be delivered before sending another. A message can contain metadata, such as the ‘key’ value used in partitioning.

Producer: the application that sends data to Kafka is called the producer. The producer sends data and the consumer pulls it; they don’t need to know one another, only to agree on where the messages go and what they are called. We should always make sure the producer is properly configured, because Kafka is responsible only for delivering messages, not for what was sent.

Topic: when the producer writes, it writes into a topic, and the data it sends is stored there. The producer sets the topic name. You can see a topic as a file: on write, data is appended, and readers read from top to bottom. Topics whose names start with _ are for internal use. Unlike in a database, existing data cannot be modified.

Broker: a Kafka instance is called a broker. It is in charge of storing the topics sent by producers and serving data to consumers. Each broker handles its assigned topics, and ZooKeeper records which broker is in charge of every topic. A Kafka cluster is a multitude of instances (brokers). The broker itself is lightweight and very fast, without the usual Java overheads (garbage collection, page and memory management, etc.). The broker also handles replication.

Consumer: reads data from the topic of its choice. To access the data, a consumer needs to subscribe to the topic. Multiple consumers can read from the same topic.

Kafka and Zookeeper 

Kafka works together with a configuration server. The server of choice is Zookeeper, which is a centralized server. Kafka is in charge of holding messages and Zookeeper is in charge of configuration and metadata of those messages. So everything about the configuration of Kafka ends up in Zookeeper. They work together ensuring high availability. There are other options available, but Zookeeper seems to be the industry choice. Both are open-sourced and they work well together.  

ZooKeeper is very resource-efficient and designed to be highly available. It maintains configuration information and naming, and provides distributed synchronization. ZooKeeper runs as a cluster, and the cluster size must be an odd number (1, 3, 5, …). Sizes of 3 and 5 give high availability; a single instance does not, because if it goes down, we lose ZooKeeper. Commonly, ZooKeeper is installed on separate machines from Kafka, because ZooKeeper needs to be highly available. There are official plans to remove the ZooKeeper dependency and, in the future, use only Kafka.
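The odd-number rule follows from majority quorum: an ensemble of n nodes stays available only while a majority survives, so it tolerates floor((n - 1) / 2) failures, and an even-sized ensemble tolerates no more failures than the next smaller odd size. A tiny illustration (function name is made up):

```python
def tolerated_failures(ensemble_size):
    """A ZooKeeper ensemble needs a surviving majority, so it
    tolerates floor((n - 1) / 2) node failures."""
    return (ensemble_size - 1) // 2

for n in (1, 3, 4, 5):
    print(f"{n} nodes -> tolerates {tolerated_failures(n)} failure(s)")
```

Note that 4 nodes tolerate no more failures than 3, which is why even ensemble sizes buy nothing.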

Kafka and Zookeeper Installation 

On GNU/Linux, you must have Java installed (version 8 or newer). Download Kafka from the Apache Kafka downloads page.

Run the following commands in your terminal (Assumed your downloaded filename is kafka_2.13-2.6.0.tgz). 

Extract the tarball: 

tar -xzf kafka_2.13-2.6.0.tgz 

Change the directory into the one with Kafka binaries :  

cd kafka_2.13-2.6.0 

Run Zookeeper :  

bin/zookeeper-server-start.sh config/zookeeper.properties 

Start the Kafka broker service :  

bin/kafka-server-start.sh config/server.properties 

The downloaded tarball comes with all the binaries, configuration files, and some utilities. What is not included: systemd unit files, PATH adjustments (adding the binaries to PATH), and configuration tuned to your requirements.

Kafka Overview 


Kafka stores data as Topics. Topics get partitioned & replicated across multiple brokers in a cluster. Producers send data to topics so Consumers could read them. 

What is a Kafka Partition? 

Topics are split into multiple partitions to parallelize the work. New writes are appended at the end of a partition’s active segment. By utilizing partitions, we can write/read data through multiple brokers, speeding up the process, reducing bottlenecks, and adding scalability to the system.

Topic overview 

In the figure, a topic is divided into four partitions, and each partition can be written/read by a different broker. New data is written at the end of each partition by its respective broker.

Multiple consumers can read from the same partition simultaneously. When a consumer reads, it reads from an offset: essentially the message’s position in the partition, saved in its metadata along with the timestamp. Consumers can either read from the beginning or start from a certain offset/timestamp.
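Reading from an offset can be pictured as slicing the partition's log (a simplified model; a real consumer polls the broker rather than holding the log in memory, and the names here are made up):

```python
def read_from(partition_log, offset):
    """Return every message at or after the given offset -- the way
    a consumer resumes from its last committed position."""
    return partition_log[offset:]

log = ["m0", "m1", "m2", "m3"]
# Resume after having already consumed offsets 0 and 1.
print(read_from(log, 2))
```

Because each consumer tracks its own offset, two consumers can read the same partition independently without disturbing each other.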

Each Partition has one server called Leader(That’s one Kafka Broker which is in charge of serving that specific data), and sometimes more servers act as Followers. Leader handles all read/write requests for the Partition while Followers passively replicate Leader. If a Leader fails for some reason, one of the Followers automatically becomes the Leader. Therefore Leaders can change over time. Every Broker can serve as the Leader for some data. Each Broker is a Leader for some Partitions and acts as a Follower to other Partitions, providing load balance within the cluster. Zookeeper provides information on which is the Leader for a certain part of data. 

Commercial, Innovation, Microcontroller, Microsoft Azure, Reflections, Software Development Insights

Everyday Safe and IoT for the Consumer

The whole world has advanced into the digital age, and a lot of appliances, gadgets, and accessories now depend on the internet for their operation. These devices are designed with state-of-the-art technology so they can communicate smoothly at any time, and they have become so popular that they outnumber the human population: there are approximately 7.62 billion people around the world, but surprisingly, some 20 billion IoT devices connected to the internet.

New IoT devices emerge every day; we see home automation systems, smartwatches, smart gadgets, smart vehicles and a long list of other things that makes your life easier and more fun in today’s world.

Through my work in Innovation at Gunnebo Business Solutions, I get to work on quite a few cutting-edge projects to bring Gunnebo into the connected future. The GBS team strives to develop a scalable collaboration platform that supports each business unit’s digitalization and software offering. Our main focus is to lead Gunnebo’s business units into the digital future of software services and enable product-as-a-service sales.


I am currently working on a really exciting project with our Safe Storage Business Unit. We are working on a brand new smart-safe, which can easily be integrated into different parts of the home – kitchen, bedroom or bathroom – to store your valuables. The safe is designed to suit everyday needs and it can be used for storing valuables such as car keys, jewelry, credit card, visas, passports or any other thing important to you.

The safe is designed to be a simple and convenient solution that can be accessed by customers around the world; anyone interested in the best security for their valuables would want to try it. Not only does the safe keep your valuables secure, it is also aesthetically appealing and built with the best technology, which only makes it more attractive.

Like any smart device, this safe can of course easily be connected to the owner's mobile phone and send telemetry to the cloud. This is where I come in: I am working with our team in Markersdorf on securely merging the classic, mechanical parts of a safe with modern IoT technology.


To make sure that our new IoT device delivers to its potential, it is developed with state-of-the-art technology, both physically and on the firmware and software side, making it reliable and easy to use.
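Telemetry from a device like this typically travels to the cloud as small JSON messages. As a purely illustrative sketch (the field names below are invented, not the actual message format of the safe):

```python
import json
import time

def build_telemetry(device_id: str, battery_pct: int, door_open: bool) -> str:
    """Build a JSON telemetry message (field names are illustrative only)."""
    message = {
        "deviceId": device_id,
        "timestamp": int(time.time()),  # epoch seconds at send time
        "battery": battery_pct,
        "doorOpen": door_open,
    }
    return json.dumps(message)

payload = build_telemetry("safe-001", 87, False)
print(payload)
```

A message like this would be published to a cloud ingestion endpoint (for example an IoT hub) and fanned out to monitoring and analytics from there.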

To ensure the efficiency of our operations, we work with agile partners like Microsoft, 3H, Polytech Software and others to help fuse entrepreneurial spirit with professional product development. Through their involvement, we have been able to achieve optimal results.


As mentioned earlier, the Internet of Things (IoT) is a system of interrelated computing devices and mechanical and digital machines. This means an IoT device can be almost anything, from your television to your wristwatch. Over time, the scope of IoT devices has expanded due to the convergence of multiple technologies: real-time analytics, machine learning, commodity sensors, and embedded systems.

IoT devices offer their users a number of impressive benefits: increased interaction between devices, greater automation and control, ease of operation, and savings in time, money and effort. But they still have a few drawbacks of their own: systems can become highly complex, privacy and security can be breached, and user safety can be reduced.

The market for IoT devices is expanding every day and becoming more popular as its number of users also increases. This might be the first IoT device from Gunnebo, but it is definitely not the last.

If you want to know how Gunnebo works with IoT in the consumer market, feel free to contact me at

Agile, Methodology, Scrum, Software Development Insights, USA

Certified Scrum Product Owner

Having worked as a product owner for years, I finally decided to take things to the next level with a certification training known as Certified Scrum Product Owner.

A CSPO course is an interactive course that lasts two 8-hour days. During this course, we learned the basics of the scope of Scrum and the functions of a Scrum Product Owner. We were taught using case studies, exercises, and discussions. More importantly, the topics treated included how to identify user needs, the backlog, how to manage stakeholders, an overview of sizing in Scrum, and how to create, maintain and order a Product Backlog.

The CSPO training was conducted by Chris Sims. He's a certified scrum product owner, agile coach and C++ expert who helps companies run efficiently and happily. He's also the founder of Agile Learning Labs and co-author of two best-sellers: The Elements of Scrum and Scrum: a Breathtakingly Brief and Agile Introduction.


The CSPO training session was held in Silicon Valley, midway between San Francisco and San Jose, at the Seaport Conference Center. The facilities here were perfect for the setting of the training, and as a bonus, we got to see the towing of a drug houseboat (that was our theory at least).


A Scrum Master works to help an inexperienced team get familiar with the operations and effects of Scrum. A Product Owner, in comparison, has the priority of making sure that customers are satisfied with the quality of service they get, and usually helps create the product vision and order the Product Backlog.

At the end of the training, a CSPO is equipped with the skills to serve as the product owner in a scrum team. The role of the product owner is vital in ensuring that the product offers optimal value to the customer in a timely manner. He can achieve this in a number of ways, considering the resources at his disposal: the team, the business stakeholders and the development process adopted by the organization.

The responsibilities of a CSPO

The first is the development and writing of the product vision. To do this, he needs a clear view of the functions and benefits of the product to the consumer. It also includes writing a list of product features. Basically, product features are product requirements written from the user's perspective, usually as a detailed description of what the product enables the customer to do.

The CSPO also helps to compile a list of features into the Product Backlog. It’s important that the product owner has the ability to make the team understand the scope of the project and work together to get things done. He also reviews, tests, and assesses the final product. A CSPO can also request changes to the product if there are any issues with it.

Getting a Certified Scrum Product Owner® (CSPO®) certification brings a lot of benefits. Firstly, the CSPO certification opens up more career opportunities, making it easier to work in the many industry sectors that have adopted Agile. Also, it shows that you are an expert in Scrum, making it easier to let your employers and team members know your capabilities.

On another note, the certification will teach you the foundations of Scrum and the role of a Product Owner. The classes that train you for the certification will orient you on the roles and duties of a product owner, and bring you into close contact with Agile practitioners who want to improve their skill level. A CSPO certification is a sign of a product owner's reliability.

Scrum teams operate at a level of efficiency and speed that may be a problem for traditional product management, so it pays to learn the skills product owners use to lead their teams and achieve optimal results. Anyone who takes part in a CSPO training will take part in exercises and simulations covering business value estimation, product strategy, an overview of the product owner role, release planning, effective communication with stakeholders, story splitting, acceptance criteria, user stories, lean product discovery, and artifacts including burn charts.


Working with Scrum for quite a few years now, I have assembled a set of methodologies and syntaxes for writing good requirements for your team. Below I will share the requirement format and lifecycle I use in my daily work, and I hope it will help you too when working in an Agile team.


Software development teams work on very complicated projects. It is crucial to understand every requirement and feature required by the customer. 

An epic is a large body of work broken down into several tasks or smaller user stories. It denotes a high-level, descriptive version of the client's requirements. Since an epic describes the user's needs, its scope is expected to change over time, so epics are shipped incrementally across sprints. Epics often encompass multiple teams on multiple projects, and can even be tracked on numerous boards. Moreover, epics help the team break a main project's work into shippable pieces without disturbing delivery to the customer.


For a <persona> who <has a pain point> the <product or solution> is a <type of solution> that <solves an issue in a certain way> unlike <the old solution or competitor> our solution <has certain advantages>

Acceptance Criteria

Success criteria <> Acceptance criteria <> In scope <> Out of scope <>


An Epic can only be created and moved into the backlog by the Product Owner. When all sub-tasks are Resolved, the Epic can be resolved. When the functionality of the Epic is delivered to the end customer, the Epic will be Closed. It is a complicated task to create an Epic. The following steps should be followed to develop an agile epic. 

It starts with Recording / Reporting, which includes drafting the epic for project managers and the team. Second comes the Description, where the process of achieving the proposed project is described. Next is the Epic Culture, which denotes the epic team's size based on the company culture. Finally, the most important one is the Timeline or Time Frame, where the team decides how long they will take to complete the project.


When a development team builds one extensive software system, many requirements are gathered from the customer to understand precisely what the customer needs. The customer might not understand how the gathered requirements are used, but the development team knows that these requirements ultimately become the features of the system being developed.

A feature is a small, distinguishing characteristic of a software item, which is also a client-valued function. Features are small and typically can be implemented within a sprint. When we describe a feature, we use the same format as a User Story, but with a broader scope. 


As a <particular class of user> , I want to <be able to perform/do something> so that <I get some form of value or benefit>


A Feature can only be created and moved into the backlog by the Product Owner. When all sub-tasks are Resolved, the Feature can be resolved. When the functionality of the Feature is delivered to the end customer, the Feature will be Closed.

A feature can be added to a system as per the customer’s requirement even after development is completed or during the development phase. The user creates a feature, and the features are added to the features inbox. The product team sorts the features and adds them to a feature list for the feature team for elaboration. The feature manager contacts the appointed teams to start inspections. After implementing the feature by the engineering team, it is added to the release tracking page, and once it is completed, the QA team will carry out the final testing. The feedback team starts feedback gathering, and the feature moves to Aurora and Beta. Finally, the feature is released.

User Story

When working on a complex project, the development team must ensure that they have fully understood the customer’s requirements. 

In software development and product management, a user story is an informal, natural-language description of a software system's features. User stories are often written from the perspective of an end-user of the system. Furthermore, user stories break epics down into smaller, more user-focused pieces, in a way that lets the engineering team clearly understand the product requirements.


As a <particular class of user> , I want to <be able to perform/do something> so that <I get some form of value or benefit>

Acceptance Criteria

Given <some context> When <some action is carried out> Then <a particular set of observable consequences should obtain>


A User Story can only be created and moved into the backlog by the Product Owner. When all sub-tasks are Resolved, the User Story can be resolved. When the functionality of the User Story is delivered to the end customer, the User Story will be Closed.

The stakeholder gives the idea in the form of a change request or new functionality, captured by the product owner as a business request, and creates the user story. Then the user story is added to the backlog, and with the help of the sprint team, it is groomed by the product owner. The user story is then broken down into acceptance criteria for prioritization. However, whether the owner accepts or rejects the story depends on the acceptance criteria. Finally, the user story is recognized as complete and closed and returned to the backlog for future iterations.
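The Given/When/Then acceptance criteria above map naturally onto executable checks. Here is a minimal sketch in Python, using an invented wish-list example (not a requirement from any real project):

```python
# A hedged sketch: the Given/When/Then template expressed as a plain
# Python test. The "wish list" domain and function names are invented
# purely for illustration.

def add_to_wishlist(wishlist: list, item: str) -> list:
    """Return a new wish list with the item appended."""
    return wishlist + [item]

def test_add_to_wishlist():
    # Given: a user with an empty wish list
    wishlist = []
    # When: they add an item
    wishlist = add_to_wishlist(wishlist, "smart safe")
    # Then: the wish list contains that item
    assert wishlist == ["smart safe"]

test_add_to_wishlist()
```

Keeping each criterion as one Given/When/Then triple makes it unambiguous when the story can be accepted.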

Task Story

The Task Story work item is more technical than an agile User Story. Instead of forcing the User Story format, it is better to use a Feature-driven development (FDD) process, describing what is expected more technically. FDD blends several industry-recognized best practices into a cohesive whole. These practices are driven from a client-valued functionality perspective where its primary purpose is to deliver tangible, working software repeatedly on time.


<action> the <result> by/for/of/to a(n) <object>

Example: Send the Push Notification to a Phone

Acceptance Criteria

Given <some context> When <some action is carried out> Then <a particular set of observable consequences should obtain>


A Task Story can only be created and moved into the backlog by the Product Owner. When all sub-tasks are Resolved, Task Story can be resolved. When the functionality of the Task Story is delivered to the end customer, the Task Story will be Closed.


Any software development team can come across faults in the product they are working on, and these faults are identified in the testing phase. 

An error, flaw, or fault in a computer program or system that causes it to produce an incorrect or unexpected result, or to behave in unintended ways, is called a software bug. The process of finding and fixing bugs is termed “debugging” and often uses formal techniques or tools to pinpoint bugs; since the 1950s, some computer systems have even been designed to deter, detect or auto-correct various bugs during operation.


Found in <module> summary <short description> reproduced by <reproduction steps> result <what happened> expected <what was expected to happen>


The Bug work item can be created by anyone but is usually made by QA or Operations via a customer. When the bug is fixed, it should not be closed until confirmed by the creator.

There are six stages in the bug life cycle. When the bug is created and yet to be approved, it is in its New stage. Next, it is Assigned to a development team, which starts working to fix the defect. When the developer fixes the bug by making the necessary code changes and verifying them, it can be marked as Fixed. Fixed code is then handed to a tester, and until retesting begins it sits in a state called Pending Retest. Once the tester is testing the code to see whether the developer has successfully fixed the defect, the status is changed to Retest.
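The stages above can be modelled as a small state machine, which also makes the legal transitions explicit. A toy sketch (the transition table, including Closed as the final stage and reopening after a failed retest, is my own assumption about the workflow):

```python
# Bug life cycle as an explicit state machine. Each state lists the
# states it may legally move to; anything else is rejected.
BUG_TRANSITIONS = {
    "New": ["Assigned"],
    "Assigned": ["Fixed"],
    "Fixed": ["Pending Retest"],
    "Pending Retest": ["Retest"],
    "Retest": ["Closed", "Assigned"],  # reopened if the fix fails
    "Closed": [],
}

def advance(state: str, next_state: str) -> str:
    """Move the bug to next_state, or raise if the transition is illegal."""
    if next_state not in BUG_TRANSITIONS[state]:
        raise ValueError(f"Illegal transition: {state} -> {next_state}")
    return next_state

state = "New"
for step in ["Assigned", "Fixed", "Pending Retest", "Retest", "Closed"]:
    state = advance(state, step)
```

Encoding the workflow this way means an illegal jump (say, New straight to Closed) fails loudly instead of silently corrupting the tracker.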


Although we have epics and user stories to break down complex projects and make it understandable to the engineers, there can still be confusion.

A Spike aims at gathering information to sort out the unclear sections the team comes across in the user stories. A spike may be a research, architectural, or refactoring spike. When the team comes across such confusing situations, they have to create a functional or technical experiment to evaluate. Whatever type of research the team does, the final goal is to resolve unclear requirements.


In order to <achieve some goal> a <system or persona> needs to <perform some action>

Example: In order to estimate the “push notification” story, a developer needs to research whether Azure services meet the requirements.


A Spike can be created by anyone, but can only be moved into the backlog by the Product Owner. The sprint team has the responsibility to create acceptance criteria. When the Spike's goal is met, it can be Resolved or Closed, depending on the owner's decision.


Stories are written in a way that is easy to understand by the customer, and there are no technical terms or instructions related to development. Now the story has to be converted to a detailed instruction list that is easy to understand by the developer.

A Task is a piece of work for the developers or any other team member. It gives the developer an idea of what should be done during development, such as creating tests, designing something, adding code, the features that should be automated, etc.


There is no specific format for a task; it can be written as a note or a to-do list.


A task can be created by anyone, but it is typically created by a developer as a child to a User Story or a Task Story.

A new task is Created as a user action or as part of process execution, and Candidates are set to groups of people. Next, individuals are directly Assigned as part of process execution or via the API. Sometimes an assignee might want to Delegate part of the work; once the requested work is resolved, the assignee passes it back to the original owner. Finally, the task is Completed.


An Issue is a description of an idea or a problem. It can also be outlined as an improvement that should take place in the product. If resolved, it will increase the value of the final product or reduce waste in development time.


There is no specific format for an issue; it is more like a note and can be written in the format of a User Story or Spike.


Anyone can create an Issue, but only the Product Owner can convert it into a User Story or a Spike and put it into the backlog. The life cycle of work can be defined by setting an issue workflow as follows:

The time taken to resolve an issue depends on its size. When an issue is created, it is in its Open state. Usually, a QA will create an issue and assign it to a developer who can solve it. While the programmer is working on resolving the issue, it is in its In Progress state. After the issue is solved, it goes to the Resolved state. An issue goes to its Closed state only when the creator is happy with the resolution. However, reaching the closed stage does not mean an issue is solved for good; it can arise again. The issue is then Reopened, and the same process takes place to figure it out and fix it.

Concluding this post, I want to say that Chris' training skills were top level, his stories about Silicon Valley, how he started Agile Learning Labs, and his career as a product owner, engineering manager, scrum trainer, software engineer, musician, and auto mechanic were great, and the lunchtime discussions were impressive.

To learn more about the role of a product owner, you can contact me at

There’s more information about agile in my articles on Social Agility and Agile and Scrum Methodology Workshop.

DevOps, Microsoft, Microsoft Azure, Software Development Insights

Microsoft LEAP: Design with Best Practices

All good things come to an end and LEAP is no exception. It was a great week full of interesting and enlightening sessions. Day 5 was a fitting end to the week with its focus on Design with best practices.


Let’s get to the sessions; the day began with a keynote by Derek Martin on the topic Design for failure. Derek is a Principal Program Manager and spoke about what not to do when designing a product. He spoke about building Azure and how the lessons learned can be used to understand and anticipate challenges.


The focus was given to managing unexpected incidents not only in the application environment but also in the cloud as a whole.

Brian Moore took over with his keynote on Design for Idempotency – DevOps and the Ultimate ARM Template. He is a Principal Program Manager for Azure. The focus of the session was on creating reusable Azure Resource Manager Templates and language techniques to optimize deployments on Azure. The intention of these reusable templates is to introduce a “Config as code” approach to DevOps.

He took his time explaining “the Ultimate ARM Template” and other key points about it. Brian explained that the Ultimate ARM Template utilizes any available language constructs to increase the impact of minimal code; the template simply aims to simplify your work, and it offers a variety of benefits to its users. To guarantee the efficiency of ARM, he also explained the practices to avoid. The result is a template that provides the best options for the most effective results while lacking nothing essential.
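The core idea behind an idempotent, “config as code” template is that applying the same desired state twice leaves the environment unchanged. A toy simulation of that property (this is not the ARM API, just an illustration of the principle):

```python
# Idempotent desired-state deployment, simulated with plain dicts.
# Each apply creates-or-updates resources to match the template, so a
# second apply of the same template is a no-op.

def apply_template(environment: dict, template: dict) -> dict:
    """Merge desired state into the environment; repeat applies change nothing."""
    result = dict(environment)
    for resource, config in template.items():
        result[resource] = config  # create or update to the desired state
    return result

template = {"storageAccount": {"sku": "Standard_LRS"}}
env1 = apply_template({}, template)
env2 = apply_template(env1, template)
assert env1 == env2  # idempotent: the second apply changed nothing
```

This is why declarative templates can safely be re-deployed from a pipeline on every commit: the deployment converges on the described state instead of accumulating changes.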


Alexander Frankel, Joseph Chan, and Liz Kim conducted their joint keynote on Architecting well-governed environments using Azure Policy and Governance after the morning coffee break.

They illustrated real-life examples of how large enterprises scale their Azure applications with Azure Governance services like Azure Policy, Blueprints, Management Groups, Resource Graph and Change History.

The next session was on Monitor & Optimize your cloud spend with Azure Cost Management and was conducted by Raphael Chacko. Raphael is a Principal Program Manager at Azure Cost Management.

The keynote’s main focus was optimizing expenditure on Azure and AWS through cost analysis, budgeting, cost allocation, optimization, and purchase recommendations. The main features of Azure Cost management were highlighted.


It was right back to business after a quick lunch break. Stephen Cohen took over with his session on Decomposing your most complex architecture problems.

Most of the session was spent on analyzing and coming up with answers to complex architecture-related problems raised by participants. It was a very practical session and addressed many commonly faced issues.


The next session was conducted by Mark Russinovich, the CTO of Microsoft Azure.


Day 5 had a shorter agenda and was concluded with Derek Martin returning for another keynote on Networking Fundamentals. Derek spoke about Azure Networking Primitives and how they can be used to strengthen the networking security of any type of organization using Azure environments. Azure Networking Primitives can be used flexibly, so newer, modern approaches to governance and security protocols can be adopted easily.

And that was it. The completion of a great week of LEAP. I hope all of you enjoyed this series of articles and that they gave you some level of understanding about the innovations being done in the Azure ecosystem.

DevOps, Microsoft, Microsoft Azure, Software Development Insights

Microsoft LEAP: Design for Efficiency, Operations and DevOps

I just left Microsoft Headquarters after another interesting day at LEAP. Today’s topics were quite interesting, especially DevOps, because of all the innovations that are being made. I’m actually a little emotional that there’s just one more day remaining.

Banner of DevOps vector illustration concept-1
Jason Warner began the day’s session with his keynote on From Impossible to Possible: Modern Software Development Workflows. As the CTO of Github, Jason shared much of his experience regarding the topic.

The underlying theme of the keynote was creating an optimal workflow that leads to the success of both the development process and the team. He pointed out the inevitable nature of modernization and said it is important that a company does not become mediocre or worse.


Before he went on to the topic of the day, Jason spoke about himself. He also didn’t hesitate to share some valuable history and information about his life. Jason Warner introduced the audience to some brief insight into the capabilities of GitHub and the success they have managed to achieve so far.

According to Jason, proper modernisation requires a workflow that consists of the following: automation, intelligence, and open source. Next, he identified GitHub's ability to produce the best workflows to improve company efficiency, and he continued by talking about the benefits such workflows bring.

Abel Wang continued with the next session and his keynote was on Real World DevOps. Abel is a Principal Cloud Advocate for Azure.
This session was truly valuable as it covered the full process of a production SDLC and many other important areas such as infrastructure, DNS, web front ends, mobile apps, and Kubernetes API’s.

At the start of his presentation, Abel Wang introduced us to his team and gave a rundown of some vital information about DevOps. Why do you need DevOps? Well, it provides solutions, supports any language, and boasts a three-stage conversation process for results.

After a much-needed coffee break, we embarked on the next session on Visual Studio and Azure, the peanut butter and jelly of cloud app devs. The speaker, Christos Matskas is a Product Marketing Manager at Microsoft.

The session focused on explaining how well Azure and Visual Studio support development, live debugging, and zero downtime deployments. Christos also spoke about leveraging integrated Azure tools to modernize .Net applications.

The Visual Studio team is committed to providing developers with the best tools available. Visual Studio supports all types of developers and redefines their coding experience. The great thing about the team is that they don't rest on their laurels and are constantly in search of innovation. Visual Studio even comes with a Live Share feature that allows developers to share content with each other in real time.

Evgeny Ternovsky and Shiva Sivakumar jointly conducted the next session on Full stack monitoring across your applications, services, and infrastructure with Azure Monitor. Many demonstrations were performed to overview the capabilities of Azure Monitor.
The demos included monitoring VMs, containers, other Azure services, and applications. In addition, setting up predictive monitoring for detecting anomalies and forecasting was also discussed.

Azure has a full set of services to oversee all your security and management needs. All the tools you need are built into the platform, reducing the need for 3rd-party integration. On top of that, Azure has developed a set of newer features: partner integration, monitoring containers everywhere, new pricing options, and network troubleshooting.
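The anomaly detection mentioned above boils down to flagging data points that deviate far from the norm. A minimal sketch of the idea (the 2-standard-deviation threshold and the CPU numbers are illustrative, not how Azure Monitor is actually implemented):

```python
import statistics

def find_anomalies(series: list, threshold: float = 2.0) -> list:
    """Flag points more than `threshold` standard deviations from the mean."""
    mean = statistics.mean(series)
    stdev = statistics.stdev(series)
    return [x for x in series if abs(x - mean) > threshold * stdev]

# A CPU-usage series with one obvious spike.
cpu_usage = [42.0, 40.5, 41.2, 43.1, 39.8, 97.0, 41.7]
print(find_anomalies(cpu_usage))
```

Real monitoring systems use far more robust models (seasonality, trends, forecasting), but the principle of comparing each point against an expected band is the same.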


Subsequent to lunch, I joined the alternative session, which was on Artificial Intelligence and Machine Learning. The session covered the use of Azure Cognitive Services with optimized scaling to improve the customer care services provided by organizations such as telecoms and telemarketers.
Then we were back at another joint session, by Satya Srinivas Gogula and Vivek Garudi, with a keynote on the topic Secure DevOps for Apps and Infrastructure @ Microsoft Services.


The speakers spoke about the wide adoption of DevOps practices and Open Source Software (OSS), and the vulnerabilities they introduce. The latter part of the session focused on best practices for secure DevOps with Azure.

The next keynote was on Transforming IT and Business operations with real-time analytics: From Cloud to the intelligent edge. It was jointly delivered by Jean-Sébastien Brunner and Krishna Mamidipaka and focussed on the challenges faced by IT and Business teams trying to understand the behavior of applications.
The speakers explained the benefits of Azure Stream Analytics to ingest, process, and analyze streaming data in order to enable better analytics.

A good example of Azure at its best: it can be used for earthquake and storm predictions.
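A typical Stream Analytics computation groups a stream of events into fixed, non-overlapping (tumbling) windows and aggregates each window. A small Python simulation of that idea (this is not Stream Analytics query syntax, just the underlying concept):

```python
from collections import defaultdict

def tumbling_window_avg(events, window_seconds):
    """Average event values grouped into fixed, non-overlapping windows.

    `events` is a list of (timestamp_seconds, value) pairs; the result maps
    each window's start time to the average of the values inside it.
    """
    windows = defaultdict(list)
    for ts, value in events:
        windows[ts // window_seconds].append(value)
    return {w * window_seconds: sum(vs) / len(vs)
            for w, vs in sorted(windows.items())}

# Two sensor readings in the 0-10s window, two in the 10-20s window.
events = [(1, 10.0), (4, 20.0), (11, 30.0), (14, 50.0)]
print(tumbling_window_avg(events, 10))
```

In a real pipeline the same aggregation runs continuously over unbounded data, emitting one result per window as it closes.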

Taylor Rockey concluded the day with his keynote on MLOps: Taking machine learning from experimentation to production. MLOps is the integration of machine learning and DevOps. MLOps has proven to have numerous benefits, including scalability, monitoring, repeatability, accountability and traceability, and its impressive features make it a first choice for many developers.
The problem that many organizations face is a lack of proper understanding and tooling for using Machine Learning in production applications. The session focussed on the use of Machine Learning for production applications with Azure Machine Learning and Azure DevOps.

And that’s a wrap. Don’t forget to tune into tomorrow’s article.

DevOps, Software Development Insights

Microsoft LEAP: Design for Availability and Recoverability

Day 3 of Microsoft LEAP was just completed. It was a day packed with many interesting keynotes regarding improving the availability and recoverability of Azure applications. By now, you know the drill, check out my notes on Day 2 here.

Mark Fussell and Sudhanva Huruli co-hosted the opening keynote on the topic Open Application Model (OAM) and Distributed Application Runtime (Dapr). Mark has been with Microsoft for nearly 2 decades and is now a Principal PM Lead. Sudhanva is a Program Manager. Both of them work on the Azure Service Fabric platform.
The open application model was discussed in detail and the focus was on separating operational needs from development concerns.


Mark Fussell started by describing the application topologies that many users employ, noting that developers write each application to interact with different services. Then, Mark spoke about the reason behind the creation of Dapr: it was designed as a solution to the problems of microservice development. Dapr allows building apps using any language, on any framework, and offers the benefit of stateful microservices in any language; Microsoft is already on board to tap into the benefits it offers.
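One reason Dapr is language-agnostic is that its building blocks (state, pub/sub, service invocation) are exposed over plain HTTP by a local sidecar. The sketch below only constructs the request a client would send to the sidecar's save-state endpoint; the port (3500) and store name (`statestore`) are the conventional defaults, and no sidecar is actually contacted here:

```python
import json

# Conventional default address of the Dapr sidecar's state endpoint.
DAPR_STATE_URL = "http://localhost:3500/v1.0/state/statestore"

def save_state_request(key: str, value: dict):
    """Return the (url, body) pair for a Dapr save-state call."""
    body = json.dumps([{"key": key, "value": value}])
    return DAPR_STATE_URL, body

url, body = save_state_request("order-1", {"status": "shipped"})
print(url)
print(body)
```

Because the contract is just HTTP and JSON, the same call can be made from Go, Java, .NET or any other runtime, which is exactly the "any language, any framework" promise Mark described.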

Sudhanva Huruli's talk on OAM was intriguing and revealing. According to him, OAM is a platform-agnostic specification for defining cloud-native applications. Users can trust its quality because it was built by some of the largest teams at Microsoft and Alibaba. Its benefits include encapsulating application code, offering discretionary runtime overlays and discretionary application boundaries, and defining application instances.

The program is fully managed by Azure, so that you can focus on applications.

The opening session was followed by another joint session, by Muzzammil Imam and Craig Wilhite, who hold the positions of Senior PM and PM respectively.
This keynote was on the topic of Windows Containers on AKS and it detailed the process of converting a legacy application into a cloud application and hosting it on a Windows container on an Azure Kubernetes service.

Their presentation showed that a lot of on-premise workload, about 72%, runs on Windows. There seems to be a light at the end of the tunnel, as there have been numerous good reviews of Windows Containers; their adoption is growing steadily, and there is room for more improvement. Microsoft containers will keep getting better with continuous innovation.

Kubernetes is a great option on Azure. It is a vanguard of the future of app development and management, and it can help you ship faster, operate easily and scale confidently. Azure Kubernetes Service handles the hard parts for you and makes room for a better future.


After the coffee break, we were back for the next session conducted by Brendan Burns on Securing software from end-to-end using Kubernetes and Azure. Brendan is a Distinguished Engineer at Microsoft. This session focussed on continuous delivery with Kubernetes. Some of the sub-themes were continuous integration with GitHub Actions, Access Control in Kubernetes, and Gatekeeper for Kubernetes.

The last session before lunch was conducted by Jeff Hollan, a Principal PM Manager for Microsoft Azure Functions. The keynote was on Serverless and Event-Driven functions for Kubernetes and beyond.


The focus was on stateless event-driven serverless computing which is enabled by Azure functions. Many new hosting and programming models that enable new event-driven scenarios were discussed.

Used with serverless, it allows developers to focus on what really matters: their code. There are a variety of applications it can be used for, and Kubernetes also does well with event-driven applications.
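The event-driven model behind serverless functions can be reduced to a simple idea: handlers are registered per event type and run only when a matching event arrives. A toy dispatcher illustrating this (not the Azure Functions programming model itself; the event names are invented):

```python
# Minimal event-driven dispatch: handlers register for an event type
# via a decorator and are invoked only when a matching event arrives.

handlers = {}

def on(event_type):
    def register(fn):
        handlers[event_type] = fn
        return fn
    return register

@on("blob_created")
def resize_image(event):
    return f"resized {event['name']}"

def dispatch(event):
    handler = handlers.get(event["type"])
    return handler(event) if handler else None

print(dispatch({"type": "blob_created", "name": "photo.png"}))
```

In a real serverless platform the dispatching, scaling, and billing all happen in the runtime, so the developer ships only the decorated handler.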

Next to speak was Kirpa Singh, a manager, whose session was on Microservices and Performance Tuning. He spoke about what makes microservices a better option, going on to cover the benefits of microservice architecture for projects. It was designed for large applications that require a high release velocity, complex applications that need to be highly scalable, applications with rich domains or subdomains, and so on. It offers users agility, focus, technology, and isolation.

After lunch, we saw more of the Microsoft campus. Then it was back to the next session.
The session after the lunch break was the OSS Architecture Workshop, conducted by Jeff Dailey, Patrick Flynn, and Terry Cook. One of the core themes of the workshop was open-source stacks. They spoke about building hybrid, resilient data pipelines and infrastructure using open source. This was done through a breakout session in which the attendees were separated into groups and drafted architectures that supported both on-premises and cloud deployments.

During this session, they discussed open source. But why open source? It allows easier migration, delivers poly-cloud options via APIs, drives Azure consumption, and so on.

Mark Brown conducted the next session on Building high-performance distributed applications using Azure Cosmos DB. He is a Principal PM in the Azure Cosmos DB Team.
The session’s key theme was building globally distributed cloud applications with high availability while ensuring extremely low latency. Many real-world demos were explored during the session, and these will help us developers tackle these issues in our own projects.

Hans Olav Norheim, a Principal Software Engineer, concluded the sessions for the day with a keynote on Designing for 99.999% – Lessons and stories from inside Azure SQL DB.
The session focused on building applications with almost 100% uptime while covering design choices, principles, and lessons learned that we can apply in our own projects to overcome uptime issues.
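One generic pattern for chasing those extra nines is retrying transient failures with exponential backoff, so brief faults are absorbed instead of surfacing as downtime. This is my own illustrative sketch, not code from the talk; the function and parameter names are hypothetical:

```python
# Hypothetical sketch: retry a flaky operation with exponential backoff.
import time

def with_retries(op, attempts=4, base_delay=0.01):
    """Call op(); on failure, wait base_delay * 2**i, then try again."""
    for i in range(attempts):
        try:
            return op()
        except Exception:
            if i == attempts - 1:
                raise                          # out of retries: give up
            time.sleep(base_delay * (2 ** i))  # back off before retrying

calls = {"n": 0}

def flaky():
    # Fails twice, then succeeds -- simulating a transient outage.
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient fault")
    return "ok"

result = with_retries(flaky)
```

Real high-availability systems layer more on top, such as jitter on the delays, health probes, and failover to replicas, but the core idea of tolerating transient faults is the same.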

Thus were the proceedings of Day 3. I conclude my note while looking forward to the next set of sessions, with the theme Design for Efficiency & Operations & DevOps.
I’ll be publishing another article tomorrow.