Google Gemini, explained

Artificial intelligence has become the wonder technology of this year. But since it comes in many different flavors from many different companies, it can be really confusing. Not only do you have the ChatGPT bot created by OpenAI, but you also have the big three (Google, Apple, and Microsoft) preparing their own versions.

Google’s latest attempt is called Gemini, and it’s no less confusing than the others.

When I first started researching Gemini, I Googled “Google Gemini versions.” In addition to the search results, I got an AI-generated summary that started:

“Google Gemini has three versions: Ultra, Pro and Nano. Ultra is the largest model designed for complex tasks, Pro is the best model for scaling across a wide range of tasks, and Nano is the most efficient model for on-device tasks.”

OK, fair enough. But that’s not the full story.

Gemini is the third sign of the zodiac, associated with the twins Castor and Pollux from Greek mythology.

Okay, sorry. I couldn’t resist. Gemini is a chatbot created by Google that replaced its previous chatbot, Bard. It is based on a large language model (or LLM), also called Gemini, developed by DeepMind, part of Google.

As for how many versions of Gemini there are: how much time do you have? Seriously though, we’ll limit ourselves to the Gemini types you might encounter, as the number of iterations seems endless.

Originally, when it launched in December 2023, Gemini offered three different versions (called models): Nano as a lightweight Android version, Pro for everyday use, and Ultra for heavy-duty business/enterprise use.

Then, on May 14 at its I/O 2024 event, Google unveiled Gemini 1.5 Pro, which the company called a “mid-size multimodal model.” According to Google, the new Pro version is about as powerful as the previous Ultra version and is aimed at improving existing applications and creating new ones for everyday use.

“Multimodal” means it can accept prompts in several different communication modes: text, images, audio, and video.

That’s not quite all, though. There is also Gemini 1.5 Flash, a faster version of Gemini that developers can use in specific applications. In other words, if you’re not a developer, this isn’t something you’ll work with directly.

So, to recap, developers can now work with four Gemini models: Ultra, Pro, Flash, and Nano. (We’ll tell you how to play with them in a moment.)

You may also have heard talk of tokens. That’s what you get for watching an event that matters more to developers than to regular people like us. But it’s really not that difficult.

Tokens are the pieces of words used to train artificial intelligence models like Gemini. The more tokens an AI model supports, the more information you can give the AI, and the better it will understand what you need and what it can give you.
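To make the idea of a token a little more concrete, here is a rough Python sketch. It is not Gemini’s actual tokenizer (real models use learned vocabularies of word pieces); it simply splits text into words and punctuation marks and checks the count against a limit. The function names and the limit of 8 are made up for illustration.

```python
import re

def rough_tokenize(text):
    # Hypothetical stand-in for a real tokenizer: each run of word
    # characters, or each single punctuation mark, counts as one token.
    return re.findall(r"\w+|[^\w\s]", text)

def fits_in_context(text, max_tokens):
    # True if the prompt's rough token count is within the given limit.
    return len(rough_tokenize(text)) <= max_tokens

prompt = "Find all the photos of my grandmother."
tokens = rough_tokenize(prompt)
print(tokens)
print(len(tokens), fits_in_context(prompt, max_tokens=8))
```

A model that supports more tokens is, in this picture, one with a larger `max_tokens`, so longer prompts (or whole documents and videos) still fit.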

As for what you can do with Gemini: if you’re a developer, you can use it to enhance existing applications or build new ones. Otherwise, Google is adding it to many of its existing apps and creating new ones.

As an example, let’s start with Google Photos. A new feature expected this summer called Ask Photos will allow you to search using more complex queries. Instead of just finding all the photos of your grandmother, for example, you should be able to ask it to “Find all the photos of my grandmother over the years that show her working on carpentry projects.”

There is also an existing Lens app that uses both text and photos to help you identify and examine things. Lens will now be able to find information through videos as well. Google demonstrated this by recording a video of a turntable misbehaving and using it to find out why the tonearm wasn’t making contact with the record.

You know that sidebar in Google Docs, Sheets, Slides, Drive, and Gmail? The one where you can now access various other Google apps? Well, it will be taken over by Gemini, which will be used to unify, or at least connect, various Google applications so that you can, for example, easily refer to a Google Doc in an email or vice versa. The feature should be available to subscribers next month.

Even basic Google search has been affected: AI Overviews now appear at the top of search results, giving you an AI-generated summary of what Google thinks you’re looking for. (Although this was met with a lot of backlash, and many users wanted to get rid of it.)

Google has many other AI projects in the works. Some of the current ones include:

Project Astra, which is basically Google Assistant with the added ability to see (via your phone’s camera) and respond to spoken language. It’s still in its early days, so you probably won’t see it for a while.

LearnLM, which will help students find answers to their questions using educational sources; according to the company, it has already been built into some products and is being introduced to teachers.

Veo, a “generative AI video model.” Generative in that it will generate the 1080p videos you ask it to create. Want a video of a cat in a nightgown and top hat jumping over the moon? Veo is what you’ll want to use. Well, whenever you can: like Project Astra, it’s still in testing and won’t be available to the general public for some time.

You can start working with the Gemini 1.0 chatbot right now. However, if you want to play with Gemini 1.5 Pro, which is faster and offers more features, you need to purchase a Gemini Advanced subscription, which will cost $20 per month after a two-month trial period. (Gemini Advanced is part of a Google One subscription, so you’ll also get 2TB of storage and other Google One benefits.)

If you run a business using Google Workspace and want to try more advanced levels of AI (also starting at $20/month), more information is available from Google.

Just the usual caveats. As with all AI applications, Gemini’s answers can be unreliable; in other words, downright wrong. This technology is definitely in its early stages of development, so while it can be a useful tool, you should also check any data you obtain. It has gotten to the point where the misinformation generated by AI engines has earned its own name: hallucinations, as though the AI were creating its own reality. So buyer beware.

That said, it looks like artificial intelligence will be with us for a long time. It’s not a bad idea to get some hands-on practice with these tools to become familiar with them and how they work. In addition to ChatGPT and Gemini, the lineup will include Microsoft’s upcoming Copilot Plus PCs, which will feature built-in AI-enabled hardware, not to mention Apple’s just-announced suite of upcoming features called Apple Intelligence. So depending on your favorite operating system, not to mention your level of curiosity, you can experiment with different AI chatbots, improved apps, and other features.