Software Development Insights

ChatGPT Tells the Truth as Long as You Ask the Questions Properly

In 2022, ChatGPT became the talk of the town because of its capacity to produce text and respond to complex queries in a manner that seems almost human. It became hugely popular in a matter of weeks, much faster than social media platforms like Instagram or TikTok. According to estimates, ChatGPT reached 100 million monthly active users just two months after its release in late November 2022. According to a study by the Swiss bank UBS, “in 20 years following the internet space, we cannot recall a faster ramp in a consumer internet app.” For comparison, it took Instagram 2.5 years and TikTok 9 months to reach 100 million users.

Although it was anticipated to be the second wave of the “AI boom,” it has generated considerable controversy. Users have reported that ChatGPT frequently provides inaccurate or biased responses. Some AI experts have accused OpenAI of carelessness. Additionally, in an effort to reduce the amount of AI-generated homework, school districts across the nation, including New York City’s, have restricted ChatGPT.

In particular, Jill Walker Rettberg, a professor of digital culture at the University of Bergen, stated that “ChatGPT is trained as a liar” in a recent article published on NRK Ytring. Rettberg claims that ChatGPT and related AI language models create answers primarily through “hallucination,” which means that they invent information based on patterns and available data rather than with knowledge of the truth or with purpose.

While it is true that ChatGPT and other AI language models produce responses based on patterns and information at hand rather than on human-like intent or awareness of the truth, this does not imply that ChatGPT is purposefully lying or attempting to mislead.

“Hallucination is actually at the heart of how they work”, says Rettberg about ChatGPT

By “hallucinating,” Rettberg means that ChatGPT creates responses based on patterns and information that are already known.

Rettberg’s claim that ChatGPT exhibits hallucinations is open to debate, but AI hallucinations in general are well documented in numerous academic studies and publications. For instance, Greg Kostello, CTO and co-founder of the AI-based healthcare company Huma.AI, stated that “A hallucination occurs in AI when the AI model generates output that deviates from what would be considered normal or expected based on the training data it has seen.”

The term “hallucinations” was taken by AI researchers from the field of psychology. Hallucinations in people are perceptions of things that are not truly present in the environment.

While the idea of hallucinating AI may sound like something from a novel, it is actually a genuine problem that many people who have used AI-based chatbots have run into.

AI hallucinations emerge when AI systems produce something that appears extremely compelling but has no foundation in reality. It may be a picture of a cat with numerous heads, broken code, or a document with fabricated references.

In a similar way, AI hallucinations happen when a model produces results that differ from what would be considered typical or expected based on the training data it has seen.

While AI hallucination can produce some amusing visual output, chatbots that produce compelling fakes may cause misunderstandings or spread misinformation. In medical AI solutions, for instance, such errors are even less acceptable.

The good news is that, despite all these claims, the issues that cause hallucinations in AI are not a dead end. To reduce potential output errors, AI researchers often combine several methods. For example, they may choose a particular AI model, or focus the AI on verified data to ensure the quality of the training data.

AI scientists improve artificial intelligence by using validated data, training the AI to be more resistant to irrational inputs, and establishing a feedback loop in which humans evaluate the outputs produced by the AI system. For instance, one factor in ChatGPT’s quality is the use of human assessment. In a blog post from last year, OpenAI discussed various techniques for enhancing the GPT-3 language model and found that human assessment significantly decreased the incidence of AI hallucinations.

Finding methods to lower AI’s hallucination rate will have to become a pillar for the quality assurance of AI-based services because AI is helping everyone, from researchers developing new drugs to users looking for the right information.

Therefore, even though AI hallucination may be a minor issue overall, Rettberg’s claim that ChatGPT is a liar based on that reality is not going to stop people from using it.

As a matter of fact, ChatGPT isn’t technically lying…

Jonathan May, a Research Associate Professor of Computer Science at the University of Southern California, recently questioned ChatGPT about the biographies of the American presidents for an essay. He started by asking about books on Lincoln. ChatGPT provided a fair list of Lincoln-related works, but it included a title that doesn’t exist.

May then decided to test ChatGPT with a question about the lesser-known president William Henry Harrison. ChatGPT gave a list of books, but fewer than half of them were accurate.

When May pointed out ChatGPT’s mistakes, it corrected them itself and added more author and biography details. He ran further tests on ChatGPT’s ability to verify accuracy and found that it claims to rely only on the training data provided and to lack the capability to verify accuracy.

This demonstrates that ChatGPT is a language model that creates plausible sentences rather than a library of data. It bases the false information it produces on the human-written online training data. The purpose of large language models like ChatGPT, GPT-3, and GPT-2 is not to replace Google or serve as a substitute for a reference library. Based on the conversation so far, these models assign a probability to each word in their vocabulary and then pick a likely next word.
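To make the idea of picking a likely next word more concrete, here is a minimal sketch in Python. It is a toy illustration, not how ChatGPT is actually implemented: the candidate words and their scores are invented for this example, and a real model works with a vocabulary of many thousands of tokens. It does show why a plausible but wrong continuation can come out.

```python
import math
import random

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {word: math.exp(s - top) for word, s in scores.items()}
    total = sum(exps.values())
    return {word: e / total for word, e in exps.items()}

# Hypothetical scores for the word following "Abraham Lincoln was born in".
# These numbers are invented for illustration only.
scores = {"Kentucky": 4.0, "Illinois": 3.2, "1809": 2.5, "Paris": 0.1}

probs = softmax(scores)

# Greedy choice: always take the single most probable word.
most_likely = max(probs, key=probs.get)

# Sampling: pick a word at random, weighted by probability.
# A plausible but wrong word such as "Illinois" can be drawn this way.
rng = random.Random(0)
sampled = rng.choices(list(probs), weights=list(probs.values()), k=1)[0]

print("most likely:", most_likely)
print("sampled:", sampled)
```

Nothing in this procedure checks whether the chosen word is true. It only checks whether the word is likely, which is the point May's experiment illustrates.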

In any case, the fact that ChatGPT generates factually incorrect claims is beside the point, because providing facts is not its intended use. Factual confirmation is still required. Wikipedia and other online resources can be used for that.

Despite what Rettberg claimed, ChatGPT is not a liar in the strictest sense. ChatGPT is just a tool that creates responses to the input it is given. However, if the input is lacking or unclear, it might produce a result that seems illogical or unrelated to the query posed. Its strength as an AI language model rests in its capacity to comprehend and respond to context.

On top of all that, ChatGPT is not a sentient being and doesn’t have the capacity to intentionally deceive or lie, as Rettberg suggested.

OpenAI leverages ChatGPT due to claims

With ChatGPT being one of the most discussed topics today, OpenAI has made claims about enhancing the GPT-3 language model.

For questions that are inappropriate or impossible to answer, such as “How to steal a car?” or “What occurred when Columbus came to America in 2015?”, OpenAI has placed limitations on ChatGPT that were not present in its previous versions. To address the drawbacks of the earlier versions, which included illogical, inaccurate, and unethical responses, ChatGPT frequently responds with statements like “I’m sorry, but I don’t have enough information to answer that question,” “As a language model trained by OpenAI, I do not have personal beliefs,” or “It is not appropriate to ask.”

The capabilities of ChatGPT have generated a lot of hype and hysteria among the public despite the fact that its answers are still imprecise and its limitations can be easily circumvented, as many journalists have documented since. According to journalists, scholars and tech experts who used ChatGPT were astounded and stunned by it “as if it were some mix of software and sorcery”. Its potential to create and spread false information that has a plausible appearance worried many. 

What can we really do with ChatGPT, then?

Writing minutes, webpages, catalogs, newspaper articles, guides, manuals, forms, reports, poems, songs, jokes, and scripts are just a few examples of potential uses of ChatGPT. Others include helping with code debugging, organizing unstructured data, generating queries and prompts, creating “no-code” automated applications for businesses, offering therapy, and responding to open-ended analytical questions.

As great as its ability to produce such content may sound, there are a couple of factors you need to consider when judging the quality of ChatGPT’s output.

One of the most crucial elements in determining the caliber of ChatGPT’s responses is the caliber of the questions posed. ChatGPT is trained to produce responses that follow trends it has discovered from a vast quantity of human-written text data from the internet. It lacks a repository of knowledge or facts that it can draw from in order to reply to inquiries. Therefore, for ChatGPT to produce accurate answers, the manner in which a query is presented and the context you offer are essential. For instance, ChatGPT might produce a response that is inaccurate or general if a user asks a query that is too general or vague. In the same way, if a query is too specific and needs background information that ChatGPT might not have, it could result in an irrelevant response. However, ChatGPT might be able to produce a more precise and helpful answer if a question is stated in a way that provides adequate context.

Additionally, the sort of question you ask has an impact on ChatGPT’s response. Better replies are typically produced by open-ended inquiries that enable more complex and nuanced responses. Conversely, answers to closed-ended questions might be brief and general. Consequently, it is also important to pose open-ended inquiries that allow ChatGPT the chance to offer more in-depth and thorough responses.

The precision and caliber of ChatGPT’s answers can also be significantly affected by biased or leading questions. These kinds of queries frequently include presumptions or preconceived notions, which may affect how the bot formulates its answer. For example, suppose a user asked ChatGPT, “Why are all cats so lazy?” This query presumes that all cats are lazy, which is not necessarily the case. Even though the presumption might not be true or fair to all cats, the chatbot may produce an answer that confirms it.

Similar to this, asking suggestive questions that offer precise information or restrictions can also restrict the variety of answers that ChatGPT can offer. For instance, ChatGPT might only give answers that fall within the definitions of Italian cuisine and the geographical location of New York City in response to the question, “What are the finest restaurants in New York City for Italian food?”. However, the user might not be thinking about or aware of other kinds of restaurants or food options, resulting in a less complete or varied set of responses.

Users should therefore be careful with the questions they ask ChatGPT and steer clear of biased or leading inquiries that could compromise the precision or standard of the chatbot’s answers. By posing open-ended, unbiased questions, users can promote a wider variety of responses and avoid unintentionally influencing the chatbot’s output.

It’s also important to use critical thinking techniques when interacting with ChatGPT in order to correctly interpret its answers. The language model can produce responses that are plausible and coherent, but unlike a person, it doesn’t actually have thoughts, beliefs, or plans. It’s essential to keep in mind that it has no opinions or biases of its own and that all of its responses are solely based on trends it discovered from a vast amount of human-written material.

To avoid guiding the model towards a specific answer, it is crucial to ask questions that are as transparent and objective as possible. A query that is worded in a biased or leading way can produce an answer that confirms an existing bias or stereotype. For instance, if someone asked ChatGPT a loaded question about a certain group of people, the language model could produce a response reflecting bias or prejudice, even if that was not the intent.

It’s also critical to remember that ChatGPT bases its answers on the details of the prompt that was provided to it. A vague or ambiguous prompt may lead to muddled answers that don’t make sense. Therefore, supplying context and making your prompts specific is essential for precise and insightful answers.
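As a rough illustration of what supplying context and specificity can look like in practice, here is a small Python sketch. The `build_prompt` helper and its `Context:` / `Question:` / `Constraint:` labels are hypothetical and invented for this example. They are not part of any OpenAI API, and the model does not require this exact layout.

```python
def build_prompt(question, context=None, constraints=None):
    """Assemble a prompt from a question plus optional context and constraints.

    Hypothetical helper for illustration; the labels are an assumption,
    not a format that ChatGPT requires.
    """
    parts = []
    if context:
        parts.append(f"Context: {context}")
    parts.append(f"Question: {question}")
    for rule in constraints or []:
        parts.append(f"Constraint: {rule}")
    return "\n".join(parts)

# A vague prompt: the model has to guess which Harrison and what you want.
vague = build_prompt("Tell me about Harrison.")

# A specific prompt: same topic, but with context and an explicit constraint.
specific = build_prompt(
    "Which biographies of William Henry Harrison would you recommend?",
    context="I am writing an essay about American presidents.",
    constraints=["Say so if you are unsure whether a book exists."],
)

print(vague)
print("---")
print(specific)
```

The second prompt does not guarantee a correct answer, but it narrows what the model is being asked to do, and the answer still needs to be verified.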

Prompt Engineering: What’s that now?

Prompt engineers are specialists at posing queries to AI chatbots that can generate specific results from their complex language models. Prompt engineers, as opposed to conventional computing engineers who code, write text to test AI systems for bugs; according to experts in generative AI, this is necessary to build and enhance human-machine interaction models. As AI tools advance, prompt engineers work to make sure that chatbots are thoroughly tested, that their answers are repeatable, and that safety procedures are followed.

According to a report by The Washington Post, while the prompt engineer’s duties may change based on the organization, their primary goal is to comprehend AI’s capabilities and the reasons why it makes mistakes.

Sam Altman, the CEO of OpenAI, posted a tweet, “Writing a really great prompt for a chatbot persona is an amazingly high-leverage skill and an early example of programming in a little bit of natural language.”

To test an AI’s capacity for logical reasoning, a prompt engineer may ask it to “think step by step.” Alternatively, they may repeatedly change a prompt like “write an article about AI” to determine which words produce the best answer.
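Trying out different wordings of the same prompt can be done systematically. The Python sketch below only generates the variants. The tone and length phrases are assumptions chosen for this example, and judging which variant produces the best answer is still left to a human reader.

```python
from itertools import product

base_task = "Write an article about AI"

# Hypothetical modifiers to try out; an empty string means "no modifier".
tones = ["", "in a neutral tone", "for a non-technical audience"]
lengths = ["", "in under 300 words"]

variants = []
for tone, length in product(tones, lengths):
    # Skip empty modifiers so the sentence has no stray spaces.
    pieces = [base_task, tone, length]
    variants.append(" ".join(p for p in pieces if p) + ".")

for variant in variants:
    print(variant)
```

Each variant can then be sent to the chatbot and the answers compared side by side.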

A chatbot’s tendency to produce suspicious responses, like professing its love to users or disclosing its secret nickname, gives prompt engineers a chance to recognize the technology’s flaws and hidden capabilities. Developers can then modify the tool.

In October, images were posted on Twitter of a prompt engineer asking a chatbot, “What NFL club won the Super Bowl in the year Justin Bieber was born?” ChatGPT’s first answer was the Green Bay Packers. For context, Bieber was born in 1994, the year the Dallas Cowboys won the Super Bowl. The prompt engineer then instructed the bot to “enumerate a chain of step-by-step logical deductions” in order to answer the query. By running through the steps, the machine recognized its mistake. When the engineer posed the question a third time, the bot gave the right answer.
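The Super Bowl example can be sketched in a few lines of Python. The `with_reasoning` helper is hypothetical, and the instruction text is adapted from the wording quoted in the example above. The `steps` list is written out by hand from the facts stated in this article and is not real model output.

```python
# Instruction adapted from the wording used by the prompt engineer.
STEP_BY_STEP = (
    "Enumerate a chain of step-by-step logical deductions "
    "before giving the final answer."
)

def with_reasoning(question):
    """Append the step-by-step instruction to a question (hypothetical helper)."""
    return f"{question}\n{STEP_BY_STEP}"

question = "What NFL club won the Super Bowl in the year Justin Bieber was born?"
prompt = with_reasoning(question)

# The chain of deductions the prompt asks for, written out by hand.
steps = [
    ("Year Justin Bieber was born", "1994"),
    ("Super Bowl winner in 1994", "Dallas Cowboys"),
]
final_answer = steps[-1][1]

print(prompt)
for description, result in steps:
    print(f"{description}: {result}")
print("Answer:", final_answer)
```

Splitting the question into two smaller lookups is what makes the mistake visible: each intermediate step can be checked on its own.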

Some academics doubt the validity of prompt engineers’ AI testing methods.

According to a professor of linguistics at the University of Washington, prompt engineers can’t truly predict what the bots will say: “It’s not a science. It’s a case of ‘let’s poke the bear in different ways and see how it roars back.’”

Late in February, a school professor tweeted that “prompt engineering” is not going to be a major deal in the long term and that prompt engineer is not the job of the future.

Companies from various sectors continue to hire prompt engineers despite this. For example, a part-time “ChatGPT specialist” is needed by frequent flyer news site BoardingArea to work on “building and refining prompts to maximize content with our curation and republishing efforts,” according to the job posting.

Postings on the freelance marketplace Upwork seek to hire prompt engineers who can produce website content like blog entries and FAQs for up to $40 per hour.

For those seeking to purchase prompts that produce a particular result, marketplaces like Krea, PromptHero, and Promptist have also emerged. The best ways to use ChatGPT are also covered in a book and hundreds of video tutorials.

Wrapping up

“The hottest new programming language is English,” Andrej Karpathy, Tesla’s former head of AI, said in a tweet in January.

The complex issue of whether ChatGPT is a liar requires careful evaluation of numerous factors. While it is true that the chatbot was trained on a substantial quantity of internet-based data, it is crucial to realize that serving as a source of facts was not its main purpose when it was created. Instead, ChatGPT’s strength lies in its capacity to comprehend context and respond to it in a manner that results in logically sound sentences.

Therefore, when interacting with ChatGPT or any other AI language model, it’s crucial to ask the right questions. Leading or biased questions can yield answers that are not accurate or true but instead reflect the biases or assumptions built into the questions. Additionally, understanding ChatGPT’s answers requires critical reasoning and careful thought about the context in which the responses were produced.

Additionally, the emergence of prompt engineering as a brand-new discipline emphasizes the significance of human knowledge in directing the output of AI language models. The creation of prompts and the fine-tuning of parameters to generate the desired answers from ChatGPT and other models are crucial tasks for prompt engineers. Their work is an example of the ongoing work being done to make sure that AI systems are open, equitable, and consistent with human ideals.

Despite these difficulties, ChatGPT and other AI language models have a wide range of exciting uses. These models have the potential to revolutionize the way we communicate and engage with technology, helping with everything from improv exercises and creative writing to natural language processing and language translation.

The ultimate response to Rettberg’s claim that ChatGPT is trained as a liar is that if you ask concrete questions, you will get concrete answers.

2 thoughts on “ChatGPT Tells the Truth as Long as You Ask the Questions Properly”

  1. Great take on this topic, Bjorn. I think that most of the people that ride the hype train are not aware that this AI revolution phase is on learning what it does and why! So the hallucinations may be good news because i think these are based on the limitations of the model, these limitations that are forcing the model to exclude some topics, ideas and areas that may be influenced by sources with not so credible information. We will see how the regulations against AI will change the core concepts behind these models soon!

    1. Thanx for the feedback Oleg 👍 Are hallucinations the weakness or the strength of the model is the most important question. Hallucinations adds randomness and might even propose new innovative ideas. However, this was not the main idea behind my article – hallucinations can easily be removed by telling the model not to hallucinate by adding constraints. For example, if you ask GPT-4 to explain why smoking is healthy, it will give you some good arguments (helps you relax, reduce weight, socialising etc). However, if you limit the prompt to focus on scientific research and weighting pros vs. cons of smoking, the response will be totally different. So, conclusion is asking a good question will give you a good answer.
