[ I've worked with some of the very large warehousing companies mentioned, but not recently ]
For restaurant use, picking packaged units from a pallet or storage area to bring to "ready-use" shelves is NOT very time-consuming; most places don't have a dedicated employee for it. Unpacking to line-use containers and refilling the common open supplies would be a big savings, but a robot would have to be more flexible, and take up less space when not active, than current solutions before it's anywhere near competitive with humans on practical grounds, regardless of cost.
For smaller places, employees are not fractionally employable, and tasks are not very well standardized, so once you need people around, you may as well have them do most of the work.
The big question is when multimodal LLMs and "AI" get good enough to put into automation that is general-purpose enough to handle exceptions well. Putting something back after it falls, minor self-maintenance, dealing with incorrect shipments, refilling a busy ingredient bin, etc. are all things that humans are currently required for, and they make up a large part of employment costs for smaller businesses.
So, the reason I was talking about time savings is because I was trying to make a point about automation in general, but for picker robots in restaurants specifically, the bigger advantage might be space savings from taller and denser shelves in expensive areas like NYC.
This is an important consideration. If things can be stored more densely and still be quickly accessible, that's a huge improvement. Inventory accuracy can be improved quite a bit with automation as well, allowing a business to store less not-yet-needed stuff.
Another counter-force, though, is that small businesses (and the smarter large ones) are VERY nervous about the fragility of complex systems that don't have simple human fallback mechanisms. JIT inventory means downtime if a supplier misses a delivery, and hyperdense storage means a LOT more human effort (or downtime if we haven't prepared for it and have the humans on-call) when the internet's out or a staff member broke the robot's arm trying to teach it to dance or whatnot.
I think $5k is still high. Fundamentally these things aren't any more complex than a Roomba, and those sell for $100-$1000. If you want to add a cheap robot arm to it, maybe double the price.
"But those don't have the accuracy and reliability required for a commercial environment"
Yeah, AI fixes this.
A Kiva robot can lift and carry >1000 lbs. Your Roomba can't.
That arm doesn't have the strength or reach for useful commercial applications. It uses cheap planetary gears that don't have enough angular precision for most robotic arm applications and aren't specified for a long lifetime.
Warehouse automation has been very successful. Here's a typical modern system. As you can see, a tall narrow robot rides on a single rail at the top and bottom. The linked example is up to 42m tall. Items are stored on top of pallets, and the robot has a telescoping fork, which might be able to handle 2-deep pallets to improve space efficiency.
Stores are much less automated than warehouses. When you go to a Walmart or Aldi or Ikea, they don't usually have robots in the back - let alone smaller stores. There are now many companies selling automation systems for smaller items and smaller spaces. That's called micro-fulfillment, hereafter "MF".
There are many different configurations being developed and marketed, which indicates that people haven't yet figured out the best approach. Here are some approaches I'm aware of.
Kiva/Amazon
Here's a teardown from 2016. That Kiva design has some problems:
Geek+ RoboShuttle
AutoStore
Alert Innovation
Zikoo
Brightpick Autopicker
Brightpick Dispatcher
EXOTEC
Dematic Multishuttle
So, how has MF been going?
My understanding is, most retailers have been taking a hesitant approach. They've mostly been waiting for someone else to show economic success with some system, or doing a small pilot program and seeing how it works out.
So, how have these pilot programs been working out economically? My understanding is, they've successfully reduced labor requirements for picking specific items by 1/2 to 2/3, but are more expensive than the conventional approach of carting items out to shelves and letting customers shop for themselves. (You have to be careful looking at sales material for MF. Brochures will sometimes, for example, compare labor savings to the purchase cost, but MF systems can also have substantial maintenance costs and per-item software licensing fees.)
Here's a study which concluded that the combination of customers shopping online, MF item picking, and pickup at the store costs $1.52 per item, vs $0.96 for conventional shopping (both for the entire logistics chain), but that MF systems are cheaper than other approaches to online orders. Yes, companies are getting "free labor" from customers taking their items, but shopping online takes time too, and being able to physically inspect products has some advantages.
So, the MF systems might be better if they didn't take up the space used for the normal restocking process, or if most orders were made online.
Grabbing items is a common task, and analysis of MF systems is a useful starting point for estimating the costs of other possible applications for small robots. Some adjustment needs to be done for robot complexity required, robot utilization rates, task completion speeds, and so on, but that's easier than starting from first principles.
Suppose there's a restaurant where many ingredients are stored for chefs. With modern Transformer systems, it's possible for a chef to verbally request an item, and for neural networks to convert that audio to text, figure out what item and task were being requested, and direct a robot to grab a tote with the right item. Extrapolating from MF system costs, such automation would probably be $0.20 to $1 per (storage + fetch), depending on utilization rate, travel distance, etc. Supposing workers cost $30/hour, that would have to save 24 to 120 seconds worth of labor per (storage + fetch) to be potentially worthwhile.
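The break-even labor time is just the cost per (storage + fetch) divided by the wage per second. A minimal sketch of that arithmetic, using the $30/hour wage and the $0.20 to $1 range from above (the function name is only for illustration):

```python
def break_even_seconds(cost_per_fetch: float, wage_per_hour: float) -> float:
    """Seconds of labor one (storage + fetch) must save to pay for itself."""
    seconds_per_hour = 3600
    return cost_per_fetch / wage_per_hour * seconds_per_hour


WAGE_PER_HOUR = 30.0  # assumed worker cost, from the text

for cost in (0.20, 1.00):  # estimated cost range per (storage + fetch)
    seconds = break_even_seconds(cost, WAGE_PER_HOUR)
    print(f"${cost:.2f} per fetch -> {seconds:.0f} s of labor to break even")
```

Plugging in a different wage or cost estimate shows how sensitive the conclusion is: at a higher wage, a fetch needs to save proportionally less time to break even.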
Some restaurants are now using automated carts to carry meals to customers. These are obviously mechanically simpler than picker robots, but they do have to navigate a more unpredictable environment than a warehouse. Compared to an ingredient picker, those also probably have a higher utilization rate.
In factories, robots are often doing something that humans can't do as quickly or at all, and often operate continuously. When you start looking at replacing humans in less-controlled environments, the tasks are always things that humans can do well enough, and they're done less continuously than on an assembly line.
Dishwashers and washing machines are unused most of the time, and they're not particularly expensive per use. The problem is that picker robots can be 100x as expensive to buy, and also have higher operational costs. That's comparable in cost to a car, but cars aren't a great comparison in general: they're abnormally cheap for their complexity and power output compared to other machines, due to a trillion dollars a year of them being made. A single robotic arm can cost more than a car, too.
If robotic pickers were better-designed and mass-produced, could the cost be brought down substantially? Yes, I think so, but I think they'd still be thousands of dollars. Supposing a picker robot and 2 years of operation could be done for $5k, and serve 10 picks/hour for 2000 hours/year at a restaurant, that'd be $0.25/pick if the $5k is spread over one year of picks (20,000), or about $0.125/pick if it's spread over the full 2 years (40,000).
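The per-pick figure is total cost divided by total picks, so it depends on how many years of picks the $5k is spread over. A rough sketch (the function name is hypothetical, and the inputs are the suppositions above, not measured data): one year of picks gives $0.25/pick, and the full two years gives $0.125/pick.

```python
def cost_per_pick(total_cost: float, picks_per_hour: float,
                  hours_per_year: float, years: float) -> float:
    """Total cost divided by the number of picks served over the period."""
    total_picks = picks_per_hour * hours_per_year * years
    return total_cost / total_picks


# Supposed figures from the text: $5k, 10 picks/hour, 2000 hours/year.
for years in (1, 2):
    per_pick = cost_per_pick(5000, 10, 2000, years)
    print(f"amortized over {years} year(s): ${per_pick:.3f}/pick")
```

Either way the result lands near the low end of the $0.20 to $1 range extrapolated from MF systems, and it scales inversely with utilization: halving the picks per hour doubles the cost per pick.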