Key Takeaways:
Artificial intelligence researchers at Apple on Friday released a paper on the company’s upcoming ReALM – Reference Resolution As Language Modeling – system, claiming that it can “substantially outperform” Microsoft-backed OpenAI’s popular large language model (LLM) GPT-4 at understanding and handling prompts that depend on context.
Apple Says Its ReALM Better Than GPT-4 At Understanding Contextual Prompts
Reference resolution is the linguistic problem of working out what a particular expression is actually referring to – a task AI models often struggle with.
The meaning of reference words like “they” or “that” in natural language is usually obvious to humans, who infer it from context, but an AI program like ChatGPT can struggle to pin down exactly what a user means.
This is a complex issue for computer programs, in part because they can’t interpret what is shown on a screen the way humans do. However, Apple may have found a solution to this lingering issue using its LLM.
Apple Takes a Different Route to the One Taken by Existing LLMs
Users tend to reference contextual information, such as background tasks or on-display data, when interacting with voice assistants like Apple’s Siri.
Traditional parsing methods, like the ones used by GPT-4, rely heavily on large models and reference material such as images to respond. Apple, however, appears to have streamlined the approach by converting any given context into text.
By converting on-screen content into text, ReALM can skip the need for heavyweight image-recognition components, making the models smaller and more efficient.
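Apple’s paper does not spell out the exact text encoding it uses, but the general idea can be illustrated with a minimal sketch in which each on-screen element is flattened into a tagged line of plain text that a language model can read directly in its prompt. The UIElement structure and the tag format below are hypothetical, not drawn from the paper.

```python
from dataclasses import dataclass

@dataclass
class UIElement:
    kind: str   # e.g. "heading", "body", "phone", "button" (hypothetical labels)
    text: str
    top: int    # rough vertical position, used only to order elements

def screen_to_text(elements: list[UIElement]) -> str:
    """Flatten on-screen elements into a plain-text context block.

    Elements are sorted top-to-bottom so the text roughly preserves the
    visual reading order, and each one gets a simple tag so a model can
    refer back to it by index instead of needing to "see" the screen.
    """
    ordered = sorted(elements, key=lambda e: e.top)
    return "\n".join(f"[{i}|{e.kind}] {e.text}" for i, e in enumerate(ordered))

# Example: a business web page reduced to text
screen = [
    UIElement("heading", "Rossi's Hardware", top=0),
    UIElement("body", "Open Mon-Sat, 9am-6pm", top=40),
    UIElement("phone", "(555) 010-4477", top=80),
]
print(screen_to_text(screen))
# [0|heading] Rossi's Hardware
# [1|body] Open Mon-Sat, 9am-6pm
# [2|phone] (555) 010-4477
```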
Apple has also tried to sidestep the issues with AI hallucinations by constraining decoding or applying simple post-processing methods.
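One simple, hypothetical form such post-processing could take is snapping whatever the model generates onto the closest valid candidate, so the final answer can never name an entity that does not exist in the current context. The sketch below illustrates that idea and is not Apple’s implementation.

```python
import difflib

def snap_to_candidates(model_output: str, candidates: list[str]) -> str:
    """Post-process a free-text model answer onto a valid candidate.

    If the output already names a known candidate, keep it; otherwise
    fall back to the closest fuzzy match, so the resolved reference is
    always drawn from entities that actually exist in the context.
    """
    if model_output in candidates:
        return model_output
    matches = difflib.get_close_matches(model_output, candidates, n=1, cutoff=0.0)
    return matches[0]

valid_ids = ["[0|heading]", "[1|body]", "[2|phone]"]
print(snap_to_candidates("[2|phone number]", valid_ids))  # -> "[2|phone]"
```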
For instance, if a user is scrolling through a business’s website and decides to call it, simply saying “call the business” requires the AI model to work out, from context, what they mean.
ReALM would see that a phone number is listed on the page, label it as the business’s number, and place the call without further prompting.
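How that resolution might look can be sketched with a toy resolver that scans the text-encoded screen for a phone number and treats it as the referent of “the business”; the regular expression and heuristic below are purely illustrative, not Apple’s method.

```python
import re

# Rough North American phone-number pattern, for illustration only
PHONE_RE = re.compile(r"\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}")

def resolve_call_target(screen_text: str) -> str | None:
    """Toy resolver: treat the first phone number found in the
    text-encoded screen as the referent of "call the business"."""
    match = PHONE_RE.search(screen_text)
    return match.group(0) if match else None

page = "[0|heading] Rossi's Hardware\n[2|phone] (555) 010-4477"
print(resolve_call_target(page))  # -> "(555) 010-4477"
```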
Apple’s researchers wrote in the paper that they want to use ReALM to understand and identify three kinds of entities – onscreen, conversational, and background entities.
Onscreen entities are items displayed on the user’s screen. Conversational entities are those relevant to the ongoing conversation: if a user asks the LLM “What workouts am I supposed to do today?”, the chatbot should recall from previous conversations that the user is on a 3-day workout schedule and answer with the day’s workout.
Background entities are activities occurring in the background that don’t necessarily fall into the other two categories.
For instance, if a user is listening to a podcast while doing something else on the phone and asks about a specific portion of what was said, the LLM should understand the reference when it is made.
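The three categories map naturally onto a small data structure; the enum and example entities below are only an illustration of the taxonomy the paper describes, not code from it.

```python
from dataclasses import dataclass
from enum import Enum

class EntityKind(Enum):
    ONSCREEN = "onscreen"              # visible on the user's display
    CONVERSATIONAL = "conversational"  # mentioned earlier in the dialogue
    BACKGROUND = "background"          # e.g. a podcast playing behind the app

@dataclass
class Entity:
    kind: EntityKind
    description: str

# Hypothetical context a resolver might work over
context = [
    Entity(EntityKind.ONSCREEN, "phone number (555) 010-4477"),
    Entity(EntityKind.CONVERSATIONAL, "user's 3-day workout schedule"),
    Entity(EntityKind.BACKGROUND, "podcast episode currently playing"),
]
```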
Apple’s Smallest Model Matches the Performance of GPT-4, OpenAI’s Most Advanced Model
In its benchmarks, Apple found that its models outperformed GPT-3.5 and GPT-4 in several key areas, with the smallest ReALM matching the performance of OpenAI’s most advanced LLM and the larger ReALM variants “substantially” outperforming it.
Note that when testing against the text-only GPT-3.5, the researchers’ input consisted strictly of text prompts, whereas GPT-4 was also given a screenshot to perform the task.
Apple is preparing to unveil a comprehensive AI strategy at June’s Worldwide Developers Conference (WWDC).
Reports suggest that the company will rely on smaller models for on-device computation, preserving privacy and security, while licensing other companies’ LLMs for the heavier off-device processing that tends to raise ethical conundrums.
Apple is also expected to announce iOS 18, the next operating system for the iPhone, at WWDC, and the release is heavily rumored to include a ReALM model.