
Considering the host of AI features Apple promised at last year’s WWDC, here we are one year later, and the company seems even further behind the competition than it did last June.
To call the Apple Intelligence rollout a blunder would be an understatement. Still, I’m hopeful that in a few weeks we’ll see a few new features (or maybe just concepts of a plan) that actually make it feel like the company can catch up:
#1: Google’s NotebookLM
With support for custom instructions and, more recently, multiple languages, Audio Overviews have been a massive part of my study routine for dense and technical machine learning papers.
Still, every time I use it, I can’t help but think how fantastic it would be if Safari had a native Audio Overview-like feature. (Or if Apple Notes had a broader NotebookLM-like feature, for that matter.)
From automatic daily audio digests of Read Later links to an instant roundup of whatever I happen to be reading in Safari, Apple has plenty of options for bringing AI to its browser beyond the Summarization tool.
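To make the idea concrete, here’s a rough Swift sketch of the “read me a digest” half of that wish, built on Apple’s real AVSpeechSynthesizer API. The Read Later items and their summaries are hard-coded placeholders (Apple doesn’t expose a public Reading List API), and plain text-to-speech is obviously a far cry from NotebookLM’s two-host Audio Overviews:

```swift
import AVFoundation

// Placeholder data standing in for Read Later items; Apple doesn't offer a
// public API for Safari's Reading List, so assume the summaries already exist.
let readLaterSummaries = [
    "Attention Is All You Need introduces the Transformer architecture...",
    "A survey of retrieval-augmented generation and its trade-offs..."
]

// Stitch the summaries into one script and speak it with the system voice.
let script = "Here is your daily digest. "
    + readLaterSummaries.joined(separator: " Next article. ")

let utterance = AVSpeechUtterance(string: script)
utterance.rate = AVSpeechUtteranceDefaultSpeechRate
utterance.voice = AVSpeechSynthesisVoice(language: "en-US")

// In a real app you'd keep the synthesizer alive while it speaks.
let synthesizer = AVSpeechSynthesizer()
synthesizer.speak(utterance)
```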
#2: Anthropic’s MCP
Late last year, Anthropic announced the Model Context Protocol (MCP): an open standard that lets LLMs securely and seamlessly interact with external tools, APIs, and platforms through a unified interface.
One demo shows Claude interacting directly with Blender, building a 3D scene from a single user prompt.
In the last few months, MCP has been adopted by OpenAI, Zapier, Google DeepMind, Replit, Microsoft, Block, and many other companies and platforms. It has a real chance of becoming as much of a standard for LLM-platform interaction as HTTPS is for the web or SQL is for databases.
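Under the hood, the interface is remarkably thin: MCP rides on JSON-RPC 2.0, and a client asking a server to run a tool is just a small structured message. Here’s a simplified Swift sketch of what such a “tools/call” request looks like; the blender.create_scene tool name and its argument are invented for illustration, and the real protocol carries richer metadata (schemas, capabilities, results):

```swift
import Foundation

// Sketch of an MCP-style "tools/call" request. MCP runs over JSON-RPC 2.0;
// the field names below follow that shape, but this is a simplified
// illustration, not a client implementation.
struct ToolCallRequest: Encodable {
    let jsonrpc = "2.0"
    let id: Int
    let method: String              // e.g. "tools/call"
    let params: Params

    struct Params: Encodable {
        let name: String                    // which tool the model wants to invoke
        let arguments: [String: String]     // simplified: real arguments are arbitrary JSON
    }
}

// Hypothetical tool name and arguments, echoing the Blender demo above.
let request = ToolCallRequest(
    id: 1,
    method: "tools/call",
    params: .init(name: "blender.create_scene",
                  arguments: ["prompt": "a low-poly mountain lake at sunset"])
)

let encoder = JSONEncoder()
encoder.outputFormatting = .prettyPrinted
print(String(data: try! encoder.encode(request), encoding: .utf8)!)
```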
Considering the frameworks Apple already has in place with App Intents, SiriKit, and Shortcuts, we’d be much more likely to see the company announce its own MCP-like protocol next month than to see it adopt MCP proper.
Regardless, users would benefit hugely from being able to ask an LLM (even Siri!) to create an entire Keynote presentation from a Pages document, for instance. Likewise, users who rely on accessibility tools would unlock possibilities that still feel like science fiction today.
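The raw material is arguably already there. Apple’s App Intents framework, which powers Siri and Shortcuts actions today, lets apps describe their capabilities in exactly the structured way an LLM (or an MCP-style bridge) could call into. Here’s a minimal, hypothetical sketch; the “create Keynote from document” intent and its behavior are invented for illustration:

```swift
import AppIntents

// Hypothetical intent: the name, parameters, and behavior are invented for
// illustration. App Intents itself is the real framework behind Siri and
// Shortcuts actions.
struct CreateKeynoteFromDocumentIntent: AppIntent {
    static var title: LocalizedStringResource = "Create Keynote from Document"

    @Parameter(title: "Document")
    var document: IntentFile            // the Pages file to convert

    @Parameter(title: "Theme", default: "White")
    var theme: String

    func perform() async throws -> some IntentResult & ReturnsValue<IntentFile> {
        // A real implementation would read `document`, build the slides, and
        // return the generated .key file; here we just hand back a placeholder.
        let deck = IntentFile(data: Data(), filename: "Presentation.key")
        return .result(value: deck)
    }
}
```

Expose enough actions like this across the system, and “turn this Pages document into a Keynote deck” stops being science fiction.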
#3: OpenAI’s Screen Sharing
Currently, Apple offers Visual Intelligence, which lets users click and hold the Camera Control “to do things like look up details about a restaurant or business; have text translated, summarized, or read aloud; identify plants and animals; and more.”
However, it still lacks a crucial feature that ChatGPT users have been enjoying since last year: video and screen sharing.
While applying AI to a photo can be helpful, it already feels like a prehistoric workflow compared to simply opening the camera, or sharing your screen, and talking with ChatGPT about whatever you’re looking at. Ask ChatGPT to help with a food allergy question while you flip through a menu, and you’ll see what I mean.
Roundup
For all the talk about AI over the last couple of years, the truth is that the wider audience (think grandma) still has trouble going beyond a few mindless ChatGPT prompts and figuring out how any of it actually fits into their lives.
The key to making AI useful is to deliver its benefits in the context where the user already is, rather than having them jump back and forth between apps. Granted, many (if not all) of these features may involve server-side processing of sensitive data, but hey, that’s the job.
As long as users know what is going on, they can decide whether to use a feature. However, passing on the development of these features due to privacy challenges is no longer an option.
How about you? Which AI features from other apps and services would you like to see Apple adopt natively on the Mac and the iPhone? Let us know in the comments.