November 30, 2022 was a very important day for the world of artifiial intelligence: ChatGPT was made available to the public. But what is ChatGPT? It is a prototype chatbot developed by OpenAI suitable for anyone who wants to familiarize themselves with Artificial Intelligence (without the need to set parameters or having to get hands-on with code) and based on the GTP3.5 models or better known as InstructGPT. These models, unlike previous ones, are based on deep learning and optimized through human reinforcement. By making access to ChatGPT public, OpenAI hopes to:
In this article we will explain how ChatGPT was built, its advantages and limitations, and some examples of its use.
How ChatGPT is built
To create ChatGPT, OpenAI trained a model using the Reinforcement Learning from Human Feedback (RLHF) technique, as had already been done in the Instruct models, but with some differences in the configuration of the data collection. Specifically, the initial model was trained using supervised fine-tuning: a pool of human supervisors (trainers) provided real conversations in which they interpreted both parties, the user and the AI assistant. These users also had access to suggestions written by the model to help them write their responses. Finally, they mixed this new dialogue dataset with the InstructGPT dataset, appropriately transformed into a dialogue format.
To create a reward model for reinforcement learning, trainers were asked to classify different alternative responses to messages that had been used in conversations with the chatbot. Using this classification and Proximal Policy Optimization, the generation model was refined. This whole process was run several times to fine-tune the result.
ChatGPT turns out to be significantly better than predecessors in that it provides:
- longer answers: ChatGPT is significantly more verbose than both text-davinci-002 and also text-davinci-003
- more relevant answers: ChatGPT is significantly better than text-davinci-002 at understanding the requests made to it
- more confident answers: ChatGPT has been optimized to reduce “hallucinations” i.e., the creation of invented content
Also significantly improved is the production of code, which not only can be copied directly with one click, but is also explained, commented, and, if specifically requested by the user, even debugged and simplified.
All without having to set anything: model, temperature, top_p, max_tokens etc. Everything is hidden and runs automatically.
How to use ChatGPT
After entering your credentials you will be redirected to the ChatGPT dashboard.
If you already use GPT-3 you will have no problem. Once you reach the page you will find a much more simplified dashboard than what you are used to.
But if you are a novice, don’t worry anyway. Using ChatGPT3 is very simple and intuitive. In fact, once you reach the dashboard you will find only a command line and no parameters to set.
As you can see, on the ChatGPT homepage itself there are some prompts to begin your interaction. You will simply enter any prompt in the command line and press ENTER.
You can have fun creating your own chats, you can for example:
- ask to write a recipe given some bizarre ingredients
- write an SEO optimized article
ChatGPT will respond automatically, using the information at its disposal to provide consistent and realistic, and most importantly, original responses. Without the need to set model parameters.
Advantages of ChatGPT
One of the main advantages of ChatGPT is its ability to “learn” from the conversation it has with users. In this way, the system is able to adapt to different interaction styles and offer increasingly relevant and personalized responses.
In addition, you can keep track of the history of interactions (find your conversations in the left sidebar).
It is a great way to try and test new prompts, familiarize yourself with GPT-3, and help through your own interaction to the optimization of the model.
Also, a not insignificant aspect. It is absolutely free. It has token limits and you cannot customize the various model parameters. But it is absolutely a great tool to familiarize yourself with this new artificial intelligence.
The dialogue format allows ChatGPT to answer follow-up questions, admit its errors, challenge incorrect premises and reject inappropriate requests.
Finally, it is usable in any language. Depending on the language of the text provided, ChatGPT’s response will be possibly consistent. So, in some cases, there is not even a need to translate the text!!!
Like any AI model, ChatGPT has its limitations.
The first aspect concerns the setting of the model variables, although a more technical and less user friendly aspect, the ability to measure the level of creativity through temperature or top_p seemed to us an oversimplification. Also, from what we noticed:
- The model sometimes produces incorrect or nonsensical responses, which can be difficult to correct. This may be due to the lack of a “source of truth” during training by reinforcement.
- The model may be sensitive to small changes in the input sentence or repeated prompts, leading to inconsistent responses.
- The model is often verbose and may repeat certain phrases or concepts too often, which may be caused by bias in the training data or over-optimization problems.
- The model, while trained in conversation, does not ask clarification questions when the user’s request is ambiguous, but instead tries to guess what the user means.
In addition, although OpenAI has made efforts to have the model reject inappropriate requests, the model can still respond to malicious instructions or exhibit bias behavior. The moderation API is used to flag or block certain types of unsafe content, but can give false positives or false negatives.
Finally, the high number of users is causing OpenAI’s servers to shake. So it is common to observe errors, connection problems, and temporary service interruptions. One cannot complain, as it is (still) a completely free service.
What you can do
The possibilities for use are varied and are limited more to our imagination or needs than anything else. Let’s look at some examples.
Generating Python code
With ChatGPT it is possible to request to generate code in our preferred language. For example, suppose we submit the following request:
The answer we might get is as follows:
circumference = 2 * pi * r
circumference = 2 * math.pi * r
# Function test
print(compute_circumference(2)) # Output: 12.566370614359172
print(compute_circumference(5)) # Output: 31.41592653589793
As you can see, the reported code, which you can also copy directly from the chatbot, is not only correct but also commented. Therefore, writing bolierplates or simple tutorials on some programming topics may be much easier and faster. Of course, we recommend that you check that what is generated is correct. In case it is not, you can still provide feedback to ChatGPT to improve the results provided.
One can ask ChatGPT to also generate Twitter threads on a topic of our choice. For example, asking it to generate it on the topic “database modeling,” the result we would get might be as follows.
Despite being a virtual assistant and not a human being, ChatGPT can also interact as a psychological support. Suppose we are going through a difficult time. By submitting our thoughts to the chatbot we might get the following response:
Suppose we have created our new company and are running out of ideas for a name. We then submit the following request:
The answer will be as follows:
As you can see, not only do we get a supply of 10 possible candidates, but also some recommendations in the choice we are going to make!!!
In addition to suggesting names for things, it can also be useful to get some gift ideas for relatives and friends. Asking what gifts can be given to an 8-year-old child, the answer will be as follows:
Again, the ideas proposed are not out of context. Obviously, we do not have a detailed list of products we could purchase but it gives us an idea from which to start our research.