Google generates a cold-handed AI twin of the real world

“At Google, AI isn’t a tomorrow thing, it’s a today thing” ran the voiceover at the head of the kickoff for Google’s Cloud Next conference in Las Vegas earlier this week. If it seems like that came around extremely quickly since last year’s event, it did; this edition swapped the previous August slot for April, meaning that it’s been only eight months since the company laid out its broad vision for the Google Cloud business. But don’t think that means there’s somehow less to talk about: apparently there have been around 1,000 “product advances” within the Google Cloud portfolio since that late summer event in 2023.

Image: Google Cloud CEO Thomas Kurian holds up an Axion processor on stage at Cloud Next 2024

The Google Cloud business of course encompasses a great deal more than I – or indeed all of us at Deep Analysis combined – actually track. There was a reminder that there’s nothing a tech leader likes more than holding up a bit of silicon, as Google Cloud’s CEO Thomas Kurian got to parade one of the soon-to-arrive Axion processors on stage. Amongst the early scene setting, he also updated us that the business is now on a $36bn revenue run rate (as we recently reminded ourselves, that’s set against an overall revenue number of a touch over $300bn, just for a bit of context).

Everything is now twinned with Gemini

Focusing in on our portion of interest, there was a reminder that one of the constants in tracking Google Cloud’s business is trying to maintain a mental note of its constantly shifting product naming structure. What wasn’t news for Cloud Next was Gemini, the successor to the LaMDA and PaLM AI model families – that was formally announced at the tail end of 2023, having been trailed heavily throughout the months prior.

What was new(er) was that pretty much all the other generative AI derived products have dropped their individual branding in favor of also being called – at least in part – Gemini too. So we might have already noticed that Bard is no longer Bard, but the Duet branding has also been dropped. When I come to update that section of the “Workplace AI Market Analysis”, I’ll only be needing the big red pen by the looks of it.

In one sense, it makes everybody’s life easier as there’s only one name to recall. In another, tracking specific functionality and tying that to a specific product (or a SKU if you’re trying to manage what-on-earth you’re buying) has simultaneously got much harder. As has working out which products might have been dropped entirely, which is one of the other very Google habits we’ve got used to since the company arrived at the doors of the enterprise.

From Assistants to Agents

The biggest noticeable shift, however, was a semantic one: from assistants to agents. What this means in practice is moving from the ability to answer questions to being able to manage complete tasks (something we’ve also seen in the form of the “Large Action Models” that Rabbit has baked into its physical device). In Google’s case, it laid out six broad sets of use cases for agents – Customer, Employee, Creative, Data, Code and Security – and within the lengthy keynote showed off heavily controlled demos for each.

Image: Google’s visual explanation of how its AI Agents sit amongst its platform, models and infrastructure.

Much of the current functionality of these will be familiar from their previous guises – Gemini Code Assist, for example, is the rebadged Duet AI for Developers (and itself sits within Gemini for Google Cloud, perhaps proving that calling everything Gemini doesn’t make life any easier). Google was keen to share the productivity benefits here from its own internal use: 40% faster completion time for common developer tasks, and 55% less time spent writing new code.

Workspace veers toward Marketing?

Google Workspace provided the content backbone for much of what was discussed and shown for those Customer, Employee and Creative agents. For each, the Gemini agent is naturally dependent upon a direction (in the form of a prompt) that is detailed enough for it to perform a complete action. The additional content and context that underpins RAG is, in these cases, to come from the material within the Google Docs, Sheets, Slides and Forms managed within Workspace.
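To make the RAG dependency concrete, here is a minimal sketch of the pattern: retrieve the workspace documents most relevant to a prompt, then prepend them as grounding context. The document names, contents and the crude word-overlap scoring are my own illustrative assumptions – a real system would use embeddings and Google’s own retrieval – not anything Google showed.

```python
def score(query: str, doc: str) -> int:
    """Crude relevance score: count of words shared between query and document."""
    return len(set(query.lower().split()) & set(doc.lower().split()))

def build_prompt(query: str, docs: dict[str, str], top_k: int = 2) -> str:
    """Attach the top_k most relevant documents as grounding context for the agent."""
    ranked = sorted(docs.items(), key=lambda kv: score(query, kv[1]), reverse=True)
    context = "\n\n".join(f"[{name}]\n{text}" for name, text in ranked[:top_k])
    return f"Context:\n{context}\n\nTask: {query}"

# Hypothetical Workspace documents standing in for Docs/Sheets content.
docs = {
    "brief.doc": "Q3 campaign brief: launch the new espresso machine in Europe",
    "minutes.doc": "Meeting minutes: budget review and hiring plan",
}
prompt = build_prompt("draft a campaign storyboard for the espresso machine launch", docs)
print(prompt)
```

The point of the sketch is the shape of the flow: the prompt alone is too thin to act on, so retrieved material supplies the missing context before the model is ever called.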

Within a keynote demo, this manifested itself as generating marketing materials via a Creative agent. The agent was able to ascertain from existing marketing briefs the basis for a visual campaign storyboard, create the supporting imagery (with multilingual captions), and finally generate an entire podcast script with suggested personas for the presenters – which it could output as SSML and pass to a voice model via a text-to-speech API to produce the finished podcast in full. Perhaps it’s this sort of campaign generation agent action that fuelled the recent HubSpot acquisition rumors (on which – as someone who covered the digital marketing market for a long while – I naturally have feelings)?
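The script-to-SSML step of that pipeline is the most mechanical part, and a rough sketch shows what it involves: render a two-presenter script as an SSML document that a text-to-speech endpoint could consume. The script lines and pause length here are invented for illustration, and multi-voice handling varies by TTS provider – this is not the demo’s actual implementation.

```python
from xml.sax.saxutils import escape

# Hypothetical two-persona podcast script, as an agent might draft it.
script = [
    ("Host", "Welcome back to the show."),
    ("Guest", "Thanks, great to be here."),
]

def to_ssml(lines: list[tuple[str, str]], pause_ms: int = 400) -> str:
    """Render (speaker, line) pairs as one SSML document,
    inserting a short break between speaking turns."""
    parts = [f'<p>{escape(text)}</p><break time="{pause_ms}ms"/>'
             for _speaker, text in lines]
    return "<speak>" + "".join(parts) + "</speak>"

ssml = to_ssml(script)
# The resulting string would then be submitted to a TTS API,
# typically one request per speaker so each persona gets its own voice.
print(ssml)
```

Escaping the text matters even in a toy version: a stray ampersand in a script line would otherwise produce invalid SSML and a failed synthesis request.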

Image: An example prompt from an AI Agent demonstration at Cloud Next that was used to generate a podcast.

There was also the addition of what Google suggested would become the fifth member of the Workspace family: Vids. Although a fair way from general availability just yet – it doesn’t arrive in Workspace Labs until June of this year – it will provide the ability to generate video content. From a storyboard that a Gemini agent can help create, it can pull together an organization’s own content, stock materials and AI-created materials to produce a finished video, with generated voiceover as required.

Generating a cold-handed version of reality

As technically competent and polished as these agent demonstrations undoubtedly were, they represent a version of reality that leaves me cold. I’m not just saying that as an embittered podcast host watching AI produce an output far more professional than anything he has managed so far, but rather as a human worker watching a demonstration of machines going through the motions of work without really understanding what that work actually is. I realize that the notion of the “uncanny valley” is conceptually familiar to most of us by now, and manifestations of it like “This Person Does Not Exist” have been around for a few years at this point.

But surely, an AI agent might wonder, isn’t this exactly what you asked us to do? To remove these bits of drudgery from your working lives? To make work better? Yet, they don’t really do any of that in ways which are generally meaningful to the overwhelming bulk of workers, who don’t commission video projects or write code. These might not be petty concerns, but they are certainly niche concerns. They represent what someone entirely isolated from everyday life might think work in the real world looks like, but to those of us in that real world it feels like that uncanny cold hand placed over ours, presupposing what it believes our lives need for enrichment. 

Without meaning to, Google is generating a view of the world of work that underlines how little it understands of the not-insignificant chunk of it that exists outside its white walls.
