Google Unveils Gemma 3n: Open AI Model for Phones and Laptops

- At its annual Google I/O conference, Google unveiled Gemma 3n, a compact, memory-efficient AI model in the Gemma 3 series designed to run on smartphones, laptops, and tablets.
- Available in 5B and 8B parameter versions, Gemma 3n uses Per-Layer Embeddings (PLE) to drastically reduce RAM usage, running in as little as 2-3GB of RAM and making it suitable for low-memory devices.
- The model supports audio, visual, and text input, has a 32K token context window, and includes memory-saving features such as PLE caching and conditional parameter loading. It's available in preview via Google AI Studio with commercial licensing for developers.
Google announced Gemma 3n, a new member of its open Gemma 3 model series, during its annual Google I/O conference. According to the company, the model is optimized to run on common devices such as smartphones, laptops, and tablets. Gemma 3n shares its architecture with the next generation of Gemini Nano, the lightweight AI model that already powers some on-device AI features on Android, such as voice recorder summaries on Pixel phones.
Gemma 3n model: Specifications
According to Google, Gemma 3n uses a novel technique called Per-Layer Embeddings (PLE), which lets the model consume far less RAM than other models of comparable size.
Although the model has 5 billion and 8 billion parameters (5B and 8B), this new memory optimization brings its RAM usage closer to that of a 2B or 4B model. In practical terms, this means Gemma 3n can run with just 2GB to 3GB of RAM, making it viable for a much wider range of devices.
Gemma 3n model: Key capabilities
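As a rough, back-of-the-envelope illustration of why an effective 2B-to-4B footprint lands in the quoted 2-3GB range, consider weight memory alone. The byte-per-parameter figure below is an assumption (8-bit quantized weights), not a number Google has published:

```python
# Back-of-the-envelope estimate of RAM needed to hold model weights.
# Assumption: 8-bit (1 byte per parameter) quantization; real deployments vary.

def weight_memory_gb(num_params: float, bytes_per_param: float = 1.0) -> float:
    """Approximate gigabytes of RAM needed to keep the weights resident."""
    return num_params * bytes_per_param / 1e9

# Raw parameter count vs. the smaller effective footprint PLE aims for.
full_5b = weight_memory_gb(5e9)       # ~5 GB if every parameter stayed in RAM
effective_2b = weight_memory_gb(2e9)  # ~2 GB: a 2B-model-like footprint
effective_3b = weight_memory_gb(3e9)  # ~3 GB: upper end of the quoted range

print(f"{full_5b:.1f} GB raw vs. {effective_2b:.1f}-{effective_3b:.1f} GB effective")
```

The point is only the order of magnitude: keeping a 2B-to-3B-parameter-equivalent set of weights resident at one byte each matches the 2-3GB figure in Google's description.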
Audio input: The model can take in sound-based data, allowing applications such as speech recognition, language translation, and audio analysis.
Multimodal input: With support for visual, text, and audio inputs, the model can handle complex tasks that combine multiple forms of data.
Wide language support: Google stated that the model is trained on more than 140 languages.
32K token context window: Gemma 3n accepts input sequences of up to 32,000 tokens, letting it process large amounts of data at once, which is convenient for summarizing lengthy documents or carrying out multi-step reasoning.
PLE caching: The model's embeddings can be cached temporarily in fast local storage (such as the device's SSD), which saves memory during repeated access.
Conditional parameter loading: If one task does not need audio or visual features, the model can avoid loading those components, conserving memory and accelerating execution.
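The idea behind conditional parameter loading can be sketched in a few lines. Everything below is hypothetical (the module names and sizes are illustrative, not real Gemma 3n internals); it only shows the pattern: modality-specific weight groups are loaded lazily, and only for the modalities a request actually uses.

```python
# Hypothetical sketch of conditional parameter loading: weight groups for
# each modality are loaded on demand, so a text-only request never pays
# the memory cost of the audio or vision components.

MODULE_SIZES_MB = {  # illustrative sizes, not real Gemma 3n numbers
    "text": 1800,
    "vision": 600,
    "audio": 450,
}

class ConditionalLoader:
    def __init__(self) -> None:
        self.loaded: set[str] = set()

    def _load(self, modality: str) -> None:
        # A real runtime would read/mmap the weight shards from disk here.
        self.loaded.add(modality)

    def prepare(self, modalities: set[str]) -> int:
        """Load only the requested modalities; return resident weight MB."""
        for m in modalities:
            if m not in self.loaded:
                self._load(m)
        return sum(MODULE_SIZES_MB[m] for m in self.loaded)

loader = ConditionalLoader()
text_only = loader.prepare({"text"})             # audio/vision never loaded
with_vision = loader.prepare({"text", "vision"})  # vision added on demand
print(text_only, with_vision)
```

Under these made-up sizes, a text-only session keeps 1800 MB resident instead of 2850 MB, which is the kind of saving the feature is described as providing.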
Gemma 3n model: Availability
As a member of the Gemma open model family, Gemma 3n comes with open access to its weights and commercial licensing, enabling developers to fine-tune, customize, and deploy it in a range of applications. Gemma 3n is now live as a preview in Google AI Studio.