Meta's open-source ImageBind AI aims to mimic human perception

ImageBind could eventually lead to leaps forward in accessibility and creating mixed reality environments.

Contributing Reporter

May 9, 2023 at 2:15 PM·3 min read

Meta is open-sourcing an AI tool called ImageBind that predicts connections between data similar to how humans perceive or imagine an environment. While image generators like Midjourney, Stable Diffusion and DALL-E 2 pair words with images, allowing you to generate visual scenes based only on a text description, ImageBind casts a broader net. It can link six types of data: text, images and video, audio, 3D measurements (depth), temperature data (thermal) and motion data (from inertial measurement units). It does this without first having to train on every possible combination of them. It’s an early stage of a framework that could eventually generate complex environments from an input as simple as a text prompt, image or audio recording (or some combination of the three).

You could view ImageBind as moving machine learning closer to human learning. For example, if you’re standing in a stimulating environment like a busy city street, your brain (largely unconsciously) absorbs the sights, sounds and other sensory experiences to infer information about passing cars and pedestrians, tall buildings, weather and much more. Humans and other animals evolved to process this data for our genetic advantage: survival and passing on our DNA. (The more aware you are of your surroundings, the more you can avoid danger and adapt to your environment for better survival and prosperity.) As computers get closer to mimicking animals’ multi-sensory connections, they can use those links to generate fully realized scenes based only on limited chunks of data.

So, while you can use Midjourney to prompt “a basset hound wearing a Gandalf outfit while balancing on a beach ball” and get a relatively realistic photo of this bizarre scene, a multimodal AI tool like ImageBind may eventually create a video of the dog with corresponding sounds, including a detailed suburban living room, the room’s temperature and the precise locations of the dog and anyone else in the scene. “This creates distinctive opportunities to create animations out of static images by combining them with audio prompts,” Meta researchers said today in a developer-focused blog post. “For example, a creator could couple an image with an alarm clock and a rooster crowing, and use a crowing audio prompt to segment the rooster or the sound of an alarm to segment the clock and animate both into a video sequence.”

[Figure: Meta’s graphs showing ImageBind’s accuracy outperforming single-mode models. (Meta)]

As for what else one could do with this new toy, it points clearly to one of Meta’s core ambitions: VR, mixed reality and the metaverse. For example, imagine a future headset that can construct fully realized 3D scenes (with sound, movement, etc.) on the fly. Or, virtual game developers could perhaps eventually use it to take much of the legwork out of their design process. Similarly, content creators could make immersive videos with realistic soundscapes and movement based on only text, image or audio input. It’s also easy to imagine a tool like ImageBind opening new doors in the accessibility space, generating real-time multimedia descriptions to help people with vision or hearing disabilities better perceive their immediate environments.

“In typical AI systems, there is a specific embedding (that is, vectors of numbers that can represent data and their relationships in machine learning) for each respective modality,” said Meta. “ImageBind shows that it’s possible to create a joint embedding space across multiple modalities without needing to train on data with every different combination of modalities. This is important because it’s not feasible for researchers to create datasets with samples that contain, for example, audio data and thermal data from a busy city street, or depth data and a text description of a seaside cliff.”
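To make the idea of a joint embedding space more concrete, here is a minimal sketch in Python. It is not ImageBind's code or API: the three-dimensional vectors, the file names and the `retrieve_audio` helper are all hypothetical, hand-picked for illustration. The only point it shows is that once text and audio are represented as vectors in the same space, a text prompt can be matched to a sound by plain cosine similarity, even though no text-audio pairs appear anywhere in the example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical embeddings. In a real system each modality has its own
# encoder; here we simply assume the encoders already map related
# content to nearby points in one shared space.
TEXT_EMBEDDINGS = {
    "a dog barking": [0.9, 0.1, 0.0],
    "ocean waves": [0.1, 0.9, 0.1],
}

AUDIO_EMBEDDINGS = {
    "bark.wav": [0.8, 0.2, 0.1],
    "surf.wav": [0.0, 0.95, 0.2],
    "traffic.wav": [0.1, 0.1, 0.9],
}


def nearest(query_vec, candidates):
    """Return the name of the candidate most similar to query_vec."""
    return max(candidates, key=lambda name: cosine(query_vec, candidates[name]))


def retrieve_audio(text_prompt):
    """Cross-modal lookup: text in, best-matching audio clip out."""
    return nearest(TEXT_EMBEDDINGS[text_prompt], AUDIO_EMBEDDINGS)


if __name__ == "__main__":
    for prompt in TEXT_EMBEDDINGS:
        print(prompt, "->", retrieve_audio(prompt))
```

In this toy setup the lookup never needs a dataset that pairs text directly with audio; it only needs both kinds of data to land in the same space, which is the property Meta describes above.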

Meta views the tech as eventually expanding beyond its current six “senses,” so to speak. “While we explored six modalities in our current research, we believe that introducing new modalities that link as many senses as possible — like touch, speech, smell, and brain fMRI signals — will enable richer human-centric AI models.” Developers interested in exploring this new sandbox can start by diving into Meta’s open-source code.
