GPT-4o: Everything you need to know

OpenAI is one of the biggest AI companies in the world, and it has been shaping some of the most advanced artificial intelligence of our time. Models like GPT-3.5, GPT-4, and GPT-4 Turbo have redefined what AI can do and cemented the company as one of the top competitors to the likes of Google. The latest and greatest AI model from OpenAI is called GPT-4o, and it's the company's most powerful model to date. So what is GPT-4o, and how can it benefit you in your AI journey?

This is what we’re here to answer. We’re going to go through what this AI model is, what it can do, what it cannot do, and other information you may be wondering about. We will answer the important questions and let you know whether you should use this over other models such as Gemini, Claude, Meta AI, etc.

We will dive into the most important questions that you may have. However, since this is about artificial intelligence, there are certain aspects that we can’t dive too much into, as it will make this article much too long. This includes diving into some of the science and intricate details about the model.

Lastly, this article will be updated every time GPT-4o gets a new addition. So, feel free to check back every now and then to see what’s new with GPT-4o. Without further ado, let’s dive in.

What is GPT-4o?

If you’ve been following the development of OpenAI’s models, then you may have gotten wind of its rather unconventional naming scheme. GPT-4o doesn’t sound like much, but it is the most powerful AI model from OpenAI to date. It is the successor to GPT-4 Turbo. So, if you’re using OpenAI’s most advanced AI tools, then you are most likely using GPT-4o.

How do I access the new model?

There are a few ways. Firstly, you will be able to access GPT-4o the same way you access ChatGPT regularly. You can go straight to the ChatGPT website or use the dedicated mobile app.

When OpenAI announced GPT-4o, the company also announced another way that you can access the model: a new macOS desktop application. This essentially turns ChatGPT into a chat assistant on your computer. You can summon it with a simple keyboard shortcut and interact with a floating text bar that appears. Along with that, you’re able to input images, add screenshots, and take pictures with your device’s native camera for input. At the time of writing, we are still waiting for a voice feature to come out for the application, and we’re not sure when that’s going to land.

As for Windows users, at the time of writing this, there is no Windows application. However, OpenAI is currently working on bringing a Windows application that will do much the same thing. The company plans to launch this sometime later in 2024, so Windows users will have to stay tuned.

Another way to access GPT-4o is through Microsoft Copilot. As you may know, Microsoft invested heavily in OpenAI, and the company uses its AI technology to power Copilot. As such, some of Copilot’s most advanced features are most likely powered by GPT-4o. The company recently announced the new Copilot-powered PCs, and we’re certain that some of the heavily integrated AI technology is powered by GPT-4o. So, if you’re all for Microsoft’s Copilot and how it can improve the Windows experience, then you are most likely using GPT-4o.

Do I have to sign up for it?

No. If you already have an OpenAI account, you simply have to go to the ChatGPT website, click the drop-down menu at the top of the screen, and select the model you want to use. If GPT-4o is available in your region, then it will be available to select.

However, if you do not have an OpenAI account, then you will want to sign up for one in order to use the new model. Also, signing up for an OpenAI account will give you access to other features that account holders can use to gain a more personalized experience. You’ll also have a chat history to see a backlog of your conversations.

Does the “O” in GPT-4o mean anything?

Yes, the “O” stands for “Omni”. We’re sure that OpenAI sees this as an all-in-one model that can satisfy most of your needs.

Is GPT-4o multimodal?

Yes, it is. Using GPT-4o, you are able to input classic text-based prompts. It will power ChatGPT just like the other models. GPT-4o can also understand speech. Using the voice feature, you are able to speak to the model as you would any digital assistant.

Not only that, but GPT-4o can also understand visual input. It has a vision feature that lets it use a camera viewfinder to interpret the world around you, much like Google Lens or Humane’s AI Pin. It will also have the ability to see what is on your computer screen and give you information based on what it sees.

You will be able to ask GPT-4o questions about what’s on your screen, such as text, images, web pages, etc. As of late May 2024, this feature is not available. This article will be updated when it becomes available.

How do I access the vision feature?

One of the most exciting features that OpenAI announced along with GPT-4o was improvements to the vision feature. This allows the model to see what is currently on your screen and answer questions about what it sees. Not only that, but the vision feature is also coming to the mobile version of ChatGPT.

The company showed off the ability for ChatGPT to see a live preview of the world through your camera’s viewfinder. It will be able to answer questions about what it observes.

During the announcement, it was able to identify math problems written on a piece of paper and help the person through them. It was even able to look at a person’s face and tell what emotion they were feeling. This is similar to Google’s Project Astra, which the company announced just one day after OpenAI’s vision demo. So, there are definitely going to be some comparisons between both of these features.

Is there an upgrade to the voice feature?

The voice feature got a pretty notable upgrade. GPT-4o was designed to be a much more efficient and faster model than GPT-4 Turbo, and this is felt mostly in the voice feature. When OpenAI showed off the new voice feature, we saw that users got responses much faster. Responses arrive quickly enough to approximate a real-time conversation, as if a person were responding to you instantly.

The response still took a second or two to come, but it was still an improvement. The voice that you hear in the response is also much improved. However, as of late May 2024, the real-time voice has been suspended. There’s currently ongoing tension between OpenAI and Scarlett Johansson. The new voice that was unveiled is shockingly similar to Scarlett Johansson’s voice, and she expressed her distaste for it. As such, the company is currently changing direction.

What is the context window for GPT-4o?

When it comes to the context window, GPT-4o is still pretty far behind the rest of the pack. Currently, it has a 128,000-token context window, the same as GPT-4 Turbo. While that is a major improvement over GPT-4’s 8,192-token limit, it’s still miles behind what we’re getting from Gemini 1.5 Pro, which can reach up to 1 million tokens. Google is even testing an experimental 2 million-token limit for Gemini 1.5 Pro. So, OpenAI still has a lot of catching up to do.

How much does the GPT-4o API cost per million tokens?

While GPT-4o shares GPT-4 Turbo’s context window, it doesn’t share its price per million tokens. GPT-4o has an input cost of $5 per million tokens and an output cost of $15 per million tokens. That’s half of what you pay with GPT-4 Turbo, which has an input cost of $10 per million tokens and an output cost of $30 per million tokens.
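To make those prices concrete, here is a minimal sketch of a cost estimator using the per-million-token figures quoted above ($5 input / $15 output for GPT-4o, $10 / $30 for GPT-4 Turbo). Note that API prices change over time, so treat these numbers as a snapshot from the time of writing; the function and price table are illustrative, not an official OpenAI tool.

```python
# Per-million-token prices as quoted in this article (snapshot; subject to change).
PRICES_PER_MILLION = {
    "gpt-4o": {"input": 5.00, "output": 15.00},
    "gpt-4-turbo": {"input": 10.00, "output": 30.00},
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in dollars for a single request."""
    prices = PRICES_PER_MILLION[model]
    cost = (input_tokens / 1_000_000) * prices["input"] \
         + (output_tokens / 1_000_000) * prices["output"]
    return round(cost, 6)

# A request with 2,000 input tokens and 500 output tokens costs half as much
# on GPT-4o as it does on GPT-4 Turbo:
print(estimate_cost("gpt-4o", 2_000, 500))       # 0.0175
print(estimate_cost("gpt-4-turbo", 2_000, 500))  # 0.035
```

As the example shows, GPT-4o's pricing works out to exactly half of GPT-4 Turbo's for any mix of input and output tokens.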

Does GPT-4o output images?

No. OpenAI’s main image generation platform is still DALL-E. However, it does support image input.

How many languages does GPT-4o support?

GPT-4o is available in over 50 languages.

What is the knowledge cutoff date for GPT-4o?

This is one area where GPT-4 Turbo has its successor beat. GPT-4 Turbo has a knowledge cutoff of December 2023, meaning it doesn’t know about anything that happened after that date. GPT-4o, on the other hand, cuts off in October 2023. So, that’s two months of data it lacks compared to its predecessor. If you ask GPT-4o about anything past October 2023, including anything that happened in 2024, it won’t know about it. That’s something to keep in mind.

Can GPT-4o be a translator?

Yes. One of the main features showcased when GPT-4o was unveiled was the translation feature. It’s able to translate numerous languages in real-time. Not only is it able to translate different languages, but it also responds in a very human way. Rather than translating the speech word for word, it will give you a very human-sounding summary of what the other person said.

If an Italian person asks “Where is the nearest Starbucks?”, GPT-4o will not translate that word for word. Instead, it will give a very human-sounding translation like “He wants to know where the nearest Starbucks is,” delivered the way a person would say it.

Is GPT-4o available for free users?

Yes, but there is a major caveat. Free users can use the capabilities of GPT-4o like browsing the web, analyzing and extracting insights from data, uploading images and files in prompts, and using GPTs. What’s the caveat? Well, you can only use these a limited number of times within a three-hour time span. After that, you will be reverted back to GPT-3.5.

OpenAI will notify you once you reach your limit, and it will tell you what time your limit will reset.

Are ChatGPT Plus users also limited?

Unfortunately, yes. If you are paying $20/month to access GPT-4o, you will be able to send up to 80 messages every three hours. Once you reach that limit, you will be knocked back to a less powerful model. Once three hours are up, your limit will reset.

How do I get a higher message limit?

At this point, there does not seem to be a way to increase your limit. However, if you are in a ChatGPT Team workspace, then you should have access to roughly twice as many messages.

Do my unused messages roll over?

No, they do not. If you only use 60 of your messages, and 3 hours pass, the remaining 20 messages will not be added to your refreshed limit. You will start back at 80 messages.
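The behavior described above amounts to a fixed-window quota with no rollover. OpenAI hasn’t published its actual implementation, so the following is purely an illustrative sketch of that scheme: a set number of messages per window, with any unused allowance discarded when the window resets.

```python
import time

class FixedWindowQuota:
    """Illustrative fixed-window message quota (not OpenAI's actual code):
    `limit` messages per window, and unused messages do NOT roll over."""

    def __init__(self, limit: int = 80, window_seconds: int = 3 * 3600):
        self.limit = limit
        self.window_seconds = window_seconds
        self.window_start = time.monotonic()
        self.used = 0

    def try_send(self) -> bool:
        now = time.monotonic()
        if now - self.window_start >= self.window_seconds:
            # Window elapsed: the count resets to zero. Any unused
            # messages from the previous window are simply discarded.
            self.window_start = now
            self.used = 0
        if self.used < self.limit:
            self.used += 1
            return True
        return False  # limit reached; wait for the window to reset

quota = FixedWindowQuota(limit=80)
for _ in range(80):
    assert quota.try_send()
print(quota.try_send())  # False: the 81st message in the window is blocked
```

The key design point is that the counter resets to the full limit at each window boundary rather than accumulating, which is exactly why 20 unused messages don’t carry over into the next three-hour window.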

Is GPT-4o better than Gemini 1.5 Pro?

That’s a pretty tough question to answer, but GPT-4o has a lot going for it. While Gemini 1.5 Pro has a much larger context window, it appears that GPT-4o is much better at understanding and reasoning. A company made a comparison between the two models where it asked both models certain logic questions along with asking them to interpret images. In all, there were eight questions asked. Gemini 1.5 Pro did not beat GPT-4o on any of them.

GPT-4o beat Gemini outright on six of the eight questions. On the remaining two, the models tied: both got one of them right and both got one of them wrong. So, in terms of reasoning and problem-solving, it appears that GPT-4o is quite far ahead of Gemini.

GPT-4o going forward

At this point, we are still waiting for a few features to land on the new model. These include some of the voice and vision features, so if you are waiting for those, you’re going to have to be patient.

Other than that, we expect a typical slew of improvements like better reasoning, faster processing, etc. to come out over the coming months. We’re not sure if this is going to be the next step before GPT-5. However, that remains to be seen.
