Ghost Autonomy is looking to test the use of multi-modal large language models (MLLMs) in autonomous driving.
MLLMs may help robotaxis interpret complex traffic situations, which usually require the intervention of human remote driving monitors.
Robotaxis now face greater scrutiny over how heavily they rely on human remote safety monitors, who step in to resolve traffic issues when the vehicles request help.
The year started with an unexpected amount of optimism in the field of autonomous vehicles, offering a long-sought light at the end of the tunnel for autonomous-tech startups, the perpetually unprofitable ride-hailing industry, and their investors.
But it is ending on a sour note, as robotaxis face renewed pushback, citing safety concerns and potential job losses, in the very cities where they have managed to launch operations.
The question that the recent controversies have inevitably raised is whether SAE Level 4 systems are good enough to be set loose in major American cities on a large scale.
Officials in San Francisco have cited several occasions on which robotaxis being tested or already in operation have blocked the path of approaching emergency vehicles, disregarded taped-off police and emergency scenes, or simply frozen indecisively in intersections.
What these shortcomings have in common is that Level 4 systems still struggle with complex traffic situations: construction zones, emergency vehicles, imperfect road markings, hand signals from other drivers, and verbal commands from police.
To bridge this gap, developer Ghost Autonomy now wants to introduce multi-modal large language models (MLLMs) to autonomous driving, effectively coupling the latest AI to Level 4 systems.
The company has revealed a $5 million investment from the OpenAI Startup Fund, which it will use to apply MLLMs to complex scene understanding in robotaxis: the events currently considered too complicated or rare for the driving software to handle on its own.
"Solving complex urban driving scenarios in a scalable way has long been the holy grail for this industry—LLMs provide a breakthrough that will finally enable everyday consumer vehicles to reason about and navigate through the toughest scenarios," said John Hayes, founder and CEO of Ghost Autonomy.
However, it remains to be seen whether MLLMs by themselves will be ideally suited to interpret visual data from video feeds in addition to radar, lidar, and audio in a way that could help robotaxis more accurately process complex traffic situations.
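One way to picture the challenge is how such a system might bundle its disparate sensor streams into a single model query. The sketch below is purely illustrative and assumes a hypothetical interface; current MLLMs chiefly accept images and text, so non-visual sensors would likely be summarized as text rather than passed as raw point clouds. None of these names come from Ghost Autonomy's actual stack.

```python
# Hypothetical sketch: packaging multi-modal sensor data into one MLLM
# query for scene interpretation. All names are illustrative assumptions.
from dataclasses import dataclass


@dataclass
class SceneQuery:
    """One bundled request: camera frames plus text summaries of other sensors."""
    camera_frames: list   # e.g. encoded JPEG bytes from each camera
    sensor_summary: str   # lidar/radar/audio rendered as text for the model
    question: str         # the scene-understanding question to answer


def build_scene_query(frames, lidar_objects, siren_detected):
    # Summarize non-visual sensors as text, since image+text is what
    # today's multi-modal models are built to consume.
    summary = (
        f"lidar tracks: {len(lidar_objects)} objects; "
        f"siren audio detected: {siren_detected}"
    )
    return SceneQuery(
        camera_frames=frames,
        sensor_summary=summary,
        question="Is the lane ahead blocked by a construction zone or emergency scene?",
    )


query = build_scene_query(frames=[b"<jpeg>"], lidar_objects=[1, 2, 3], siren_detected=True)
print(query.sensor_summary)
```

The open question the paragraph above raises is exactly this translation step: whether flattening radar, lidar, and audio into text loses the fidelity a safety-critical decision requires.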
Until now, this type of AI has been associated mostly with generating term papers convincing enough to pass as human-written.
Now it promises to do quite a bit for Level 4 autonomy, at a time when the technology's limitations are being exposed as deployments scale, and when profitability remains a distant hope.
"While LLMs have already proven valuable for offline tasks like data labeling and simulation, we are excited to apply these powerful models directly to the driving task to realize their full potential," Hayes adds.
Even if MLLMs and other types of AI are successful, the end state might not be eliminating the need for human remote intervention entirely. Rather, the goal may be gradually reducing robotaxis' reliance on it, a reality that robotaxi companies rarely mention.
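The "reduce rather than eliminate" goal implies a triage policy: let the onboard model handle scenes it is confident about, and escalate the rest to a remote monitor. A minimal sketch of that idea, with an assumed confidence threshold and invented function names (nothing here reflects any company's real system):

```python
# Hypothetical sketch of MLLM-first triage: the vehicle acts on the
# model's answer only when confidence is high, otherwise it requests a
# human remote monitor. Threshold and names are illustrative assumptions.
ESCALATION_THRESHOLD = 0.85  # assumed tunable safety parameter


def resolve_scene(mllm_label: str, mllm_confidence: float) -> str:
    """Return who handles the scene: the vehicle itself or a remote monitor."""
    if mllm_confidence >= ESCALATION_THRESHOLD:
        return f"autonomous: proceed per '{mllm_label}'"
    return "escalate: request human remote monitor"


print(resolve_scene("construction zone, merge left", 0.92))
print(resolve_scene("ambiguous hand signal from officer", 0.40))
```

Under a policy like this, better models shrink the fraction of scenes that cross the escalation threshold, which is reduction of remote intervention, not elimination.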
There is also concern that Level 4 systems may not outgrow the need for remote human monitors in the coming years, whether for safety or technical reasons. The technology will certainly create more jobs of this type along the way, even though such roles will remain small in number relative to the size of robotaxi fleets.
In some ways, the robotaxi industry is now racing against time to produce near-human intelligence in its software before business pressures catch up with the tremendous sums that have been poured into the technology.
Will robotaxis become mainstream this decade, or will they fall short of replacing human ride-hailing drivers? Let us know what you think.