(Bloomberg) — As confident as artificial intelligence assistants can sound in chat responses, if you ask them to generate an image containing several text phrases, chances are the resulting imagery will contain some typos or distorted fonts.
Some models have gotten better at it over time, but they’re not consistently reliable — which has limited their potential as a design tool for professionals.
On Thursday, Alphabet Inc.’s Google announced a new image-generation and editing model that it says addresses the issue. It’s hoping to persuade consumers and advertisers alike to use its latest tools for accurately generating complex graphics and diagrams.
The new image model, Nano Banana Pro, can produce better visuals with more precise and legible text in multiple languages, Google said in a blog post. Those improvements were made possible by Gemini 3, the latest version of the company’s AI model released on Tuesday, which the company says represents a “massive jump” in reasoning and coding ability. The update was met with a warm reception from investors, who sent Alphabet shares to a record high on Wednesday.
Thursday’s announcement marks the search giant’s latest attempt to monetize its AI technology. Google said users of its free Gemini product around the world will be able to use the new Nano Banana Pro model up to a usage quota, after which they will revert to an older model. Subscribers to paid AI plans will have a higher limit. The model is also integrated with some popular design tools, including Canva, Figma and Adobe Inc.’s Firefly and Photoshop.
A Google spokesperson said the Nano Banana Pro model is better at planning text placement, font characteristics and the text’s spatial relationship to other image elements, all before rendering the final image. For example, the technology can help recast the text of a recipe as an illustrated flow chart, or visualize real-time information like weather or sports, the company said in the blog post.
For brands that want to incorporate their own designs when brainstorming new marketing campaigns, the model can take in up to 14 reference images from users and arrange them in new scenarios they describe in the text prompt, while retaining the characteristics of the input materials, the company said.
Users can further refine the image by specifying in the prompt any preferred camera angles, depth of field, color grading and aspect ratios, as if they were capturing the image with a camera.
As part of Thursday’s announcements, Google also said users can upload an image to the Gemini app and ask if it was generated by Google AI. It plans to expand that capability soon to include audio and video, it added. Google currently embeds an imperceptible digital watermark in all media created with its AI tools, as well as a visible one in images created by free or Pro tier users. That visible watermark is removed for people who subscribe to the most expensive Ultra plan.

