Google Gemini AI Enhances Bard with Multimedia Abilities

Google has introduced Gemini, a new AI model that brings video, audio, and photo understanding capabilities to its Bard AI chatbot. Gemini is expected to be integrated into Google Workspace tools in early 2024.

Gemini Enhances AI Abilities with Multimedia

Google has introduced Gemini, an AI model intended to enhance its Bard AI chatbot. Gemini adds video, audio, and photo understanding to the chatbot and is meant to improve its performance on complex tasks such as summarizing documents, reasoning, planning, and writing programming code.

Gemini was initially released in English for users in many countries, and Google plans to expand it to other languages in the near future. Users with Google Pixel 8 phones will be among the first to experience Gemini's new AI abilities, and integration into Gmail and other Google Workspace tools is expected in early 2024.

Gemini Focuses on Multimedia Interpretation

Gemini represents a significant step forward in AI models by incorporating multimedia interpretation capabilities. While text-based chat is important, humans interact with the world through rich information, including speech, imagery, and more. Gemini is designed to bridge the gap between traditional text-based AI models and the more expansive ways in which humans process and understand information in the world.

By training Gemini on text, programming code, images, audio, and video simultaneously, Google aims to improve the model's ability to handle multimedia inputs efficiently. Gemini's abilities include correctly identifying the next shape in a series, identifying connections between photos, converting bar charts into labeled tables, and processing handwritten physics problems.

Gemini's Availability and Future Plans

Gemini is available in three versions, each tailored to a different level of computing power. Gemini Nano runs on mobile phones and powers new features on Google Pixel 8 phones. Gemini Pro is designed for fast responses and runs in Google's data centers. Gemini Ultra is limited to a test group for now and will be available in Bard Advanced, a new version of the chatbot that Google plans to release in early 2024.

Google is actively courting developers to incorporate Gemini into their own software and applications. By offering discounted prices and providing integration options through its AI Studio web interface and Vertex AI, Google aims to encourage developers to explore Gemini's capabilities. The company also plans to integrate Gemini into its own services such as Gmail, Google Docs, Meet, and other parts of Google Workspace.