Google Unveils Gemini GenAI Model for Tackling Highly Complex Tasks


Google has raised the stakes in generative AI by unveiling Gemini, its latest and most versatile model, which the company says delivers top-tier performance on a range of benchmark tests. The first version, Gemini 1.0, comes in three sizes: Ultra, Pro, and Nano. Google's AI chatbot Bard will use a fine-tuned version of Gemini Pro, offering enhanced capabilities in reasoning, planning, comprehension, and more. The upgraded Bard will be available in English in more than 170 countries and territories, and “we plan to expand to different modalities and support new languages and locations in the near future”, said Google.
The company has also brought Gemini to the Pixel 8 Pro, where it powers new features such as Summarize in the Recorder app and the initial rollout of Smart Reply in Gboard. Smart Reply starts with WhatsApp, with other messaging apps to follow next year. Over the next few months, Gemini will be incorporated into more Google products and services, including Search, Ads, Chrome, and Duet AI.
“These are the first models of the Gemini era and the first realization of the vision we had when we formed Google DeepMind earlier this year”, said Alphabet and Google CEO Sundar Pichai. Gemini is the result of large-scale collaborative efforts by teams across Google, including colleagues at Google Research. “It was built from the ground up to be multimodal, which means it can generalize and seamlessly understand, operate across and combine different types of information including text, code, audio, image and video”, said Demis Hassabis, CEO and Co-Founder of Google DeepMind.
While Gemini Ultra is the largest and most capable model for highly complex tasks, Gemini Pro is the model for scaling across a wide range of tasks and Gemini Nano is for on-device tasks. “With a score of 90 percent, Gemini Ultra is the first model to outperform human experts on MMLU (massive multitask language understanding), which uses a combination of 57 subjects such as math, physics, history, law, medicine and ethics for testing both world knowledge and problem-solving abilities”, said Google.
Gemini Ultra exceeds current state-of-the-art results on 30 of the 32 academic benchmarks widely used in large language model (LLM) research and development. Its capabilities range from natural image, audio, and video understanding to mathematical reasoning. Google says Gemini 1.0's advanced multimodal reasoning abilities help it make sense of complex written and visual information. “Our first version of Gemini can understand, explain and generate high-quality code in the world’s most popular programming languages, like Python, Java, C++, and Go”, said the company.
Gemini can also be used as the engine for more advanced coding systems. The company trained Gemini 1.0 at scale on its AI-optimised infrastructure using Google’s in-house designed Tensor Processing Units (TPUs) v4 and v5e. “Today, we’re announcing the most powerful, efficient and scalable TPU system to date, Cloud TPU v5p, designed for training cutting-edge AI models”, said Google.
Source: IANS