Google Gemini AI Enhances Bard Chatbot with Photo and Video Skills

Google's Bard AI chatbot gains multimedia capabilities through the Gemini model, empowering it with video, audio, and photo understanding.

Gemini: Adding Multimedia Capabilities to Bard AI

Google has introduced Gemini, a new model that brings native understanding of video, audio, and photos to its Bard AI chatbot.

Gemini marks a significant departure for AI: it aims to move beyond text-based chat toward processing richer, multimedia information.

The new model has been rolled out in dozens of countries as part of Bard's Gemini update, where it provides text-based chat abilities for complex tasks.

Gemini's Three Versions

Google has released three versions of Gemini tailored for different levels of computing power.

Gemini Nano, designed for mobile phones, will power the new features of Google's Pixel 8 phones.

Gemini Pro, optimized for fast responses, will run in Google's data centers and power a new version of Bard.

Gemini Ultra, limited to a test group for now, will arrive in Bard Advanced, an upcoming version of the chatbot, in early 2024.

Advancements in Generative AI

Google's Gemini update highlights the rapid pace of advancement in generative AI, a field in which Google competes with OpenAI's ChatGPT.

Gemini is Google's third major AI model revision, and the company plans to bring the technology to products such as Search, Chrome, Google Docs, and Gmail.

Google envisions Gemini as an AI model that feels more like a helpful collaborator than a piece of smart software.

The Challenges of AI Models

While AI models like Gemini continue to improve, they still struggle to respond accurately.

AI models are trained on vast amounts of data, yet they can produce responses that sound plausible but are incorrect.

Google advises users to double-check Bard's responses, as they may contain inaccurate information.