Google recently introduced Whisk, an AI tool that creates images using other images as prompts instead of relying solely on lengthy text descriptions. Users specify the subject, scene, and style of an AI-generated image by providing reference pictures.
How Whisk Works
Whisk offers a more intuitive way to generate images by using visual inputs. Users can upload images to serve as inspiration for the subject, scene, or style of the desired output. Additionally, multiple images can be provided for each category to refine the results. While text prompts are optional, users can include them if they wish to add extra details.
For those without specific images, Whisk features a dice icon that randomly selects AI-generated visuals to guide the creation process. Any optional text goes into a text box at the end of this setup.
Once the inputs are finalized, Whisk generates a set of images along with corresponding text prompts. Users can favorite or download any result that meets their expectations. Alternatively, they can refine a result by entering additional text or editing the generated prompt.
Google emphasizes in its blog post that Whisk is intended for “rapid visual exploration” rather than precise, pixel-perfect editing. The tool is built to encourage experimentation, acknowledging that it might occasionally produce results that don’t align with user expectations. To address this, Whisk allows adjustments to the underlying prompts, giving users more control over the creative process.
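The workflow described above (reference images per category, optional text, and an editable prompt) can be pictured with a small sketch. This is only an illustration under stated assumptions: Whisk's internals are not described here, so the names `ReferenceSet`, `compose_prompt`, and `refine` are hypothetical, and each reference image is stood in for by a short text caption.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical model of the workflow described above; NOT Whisk's actual API.
# Assumption: each reference image is represented by a short text caption.


@dataclass
class ReferenceSet:
    subject: List[str] = field(default_factory=list)
    scene: List[str] = field(default_factory=list)
    style: List[str] = field(default_factory=list)
    extra_text: str = ""  # optional free-text details


def compose_prompt(refs: ReferenceSet) -> str:
    """Combine the per-category captions into one editable text prompt."""
    parts = []
    if refs.subject:
        parts.append("Subject: " + "; ".join(refs.subject))
    if refs.scene:
        parts.append("Scene: " + "; ".join(refs.scene))
    if refs.style:
        parts.append("Style: " + "; ".join(refs.style))
    if refs.extra_text.strip():
        parts.append("Details: " + refs.extra_text.strip())
    return ". ".join(parts)


def refine(prompt: str, addition: str) -> str:
    """Append a user's follow-up text to an existing prompt."""
    addition = addition.strip()
    return f"{prompt}. {addition}" if addition else prompt


refs = ReferenceSet(
    subject=["a plush toy fox"],
    scene=["a snowy forest at dusk"],
    style=["watercolor illustration"],
)
prompt = compose_prompt(refs)
print(prompt)
print(refine(prompt, "The fox wears a red scarf"))
```

Keeping the prompt as plain text mirrors the point made above: when a result misses, the user can edit the underlying prompt directly rather than start over with new images.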
The iterative nature of Whisk makes it engaging for users who enjoy tweaking their creations. In a brief hands-on session, the tool proved both entertaining and flexible, even if the images sometimes leaned toward the unconventional. While the few seconds required to generate each image might feel like a minor inconvenience, the ability to iterate on results keeps the experience enjoyable.
At the core of Whisk is the latest iteration of Google’s Imagen 3 image-generation model, announced alongside the tool. Imagen 3 builds on previous versions with improvements to image quality and coherence, giving users a solid base for creative experimentation.
In addition to Whisk, Google unveiled Veo 2, the next-generation version of its video-generation model. Veo 2 is designed with a deeper understanding of cinematographic language, which is meant to give its output a more polished and professional look. It is also meant to reduce a common failure of earlier AI models: distorted features such as extra fingers.
Veo 2 will initially be available through Google’s VideoFX, which users can access by joining the Google Labs waitlist. The company plans to expand its application to YouTube Shorts and other platforms by next year, signaling a broader push to integrate AI-generated video content into its ecosystem.
With Whisk and Veo 2, Google is broadening its portfolio of AI-powered creative tools. Whisk’s approach of using images as prompts lowers the barrier to entry for users who find text-based prompts cumbersome or unintuitive. Meanwhile, Veo 2 aims to improve video generation by pairing its grasp of cinematographic language with fewer distortions.
Together, these tools showcase Google’s commitment to advancing AI technology in creative fields. Whisk is ideal for rapid exploration and concept visualization, while Veo 2 offers a glimpse into the future of AI-driven video production. Both tools highlight the company’s efforts to make artificial intelligence more accessible and effective across a range of creative applications.
As these technologies continue to evolve, they promise to redefine how individuals and professionals approach visual content creation. Whether it’s generating images with minimal input or crafting compelling video narratives, Google’s AI innovations are paving the way for new possibilities in digital creativity.