We were promised robots everywhere — fully autonomous robots that will drive our cars end-to-end, clean our dishes, drive our freight, make our food, pipette and do our lab work, write our legal documents, mow the lawn, balance our books and even clean our houses.
And yet instead of Terminator or WALL-E or HAL 9000 or R2-D2, all we got is Facebook serving us ads we don’t want to click on, Netflix recommending us another movie that we probably shouldn’t stay up to watch, and iRobot’s Roomba.
So what went wrong? Where are all the robots?
This is the question I’ve been trying to investigate while building my own robotics company (a currently stealth company named Chef Robotics in the food robotics space) as well as investing in many robotics/AI companies through my venture capital fund Prototype Capital. Here’s what I’ve learned.
Where are we now?
First and foremost, robots aren’t anything new. Industrial six degrees of freedom (read as six motors serially attached to each other) robot arms were actually developed around 1973 and there are hundreds of thousands of them out there — it’s just that up to this point, almost all of these robots have been in the extremely controlled environment of factory automation doing the same thing over and over again millions of times. And we’ve formed many multibillion dollar companies through these factory automation robots including FANUC, KUKA, ABB and Foxconn (yes they make their own robots). Go to any automotive manufacturing plant and you’ll see hundreds (or in Tesla’s case, thousands). They work insanely well and can pick up massive payloads — a full car — and have precision sometimes up to a millimeter.
More generally, the world of industrial automation is extremely mature and there are hundreds of “systems integrators” who you can go to and say, “I want an automation machine that does this one extremely narrow use case millions of times. Build me a system to do it.” This is how Coca-Cola gets their bottle fillers, Black & Decker makes their drills, Procter & Gamble makes your shampoo, and more generally how we manufacture most products today. These systems integrators may charge you $1M and make you wait a year for the machine, but almost any kind of system is possible in this world. The problem with these systems is that they mostly are what’s known as “hard automation” in that they’re mainly mechatronic systems and will work inordinately well if the inputs into the system are exactly what they were designed and programmed to handle; but as soon as you put a two-liter Coca-Cola bottle into a bottling machine designed for half-liter bottles, the system doesn’t know what to do and will fail.
The other major world where we see lots of production robots (excluding purely software AI agents like recommender systems, spam finders for email, object recognition systems for your photos app, chat bots and voice assistants) is surgical robots. One of the major players in this space is a company called Intuitive Surgical ($66B market cap), which has built and already deployed around 5,000 teleoperated robots. Note that these robots are indeed “remotely controlled” by a physician and are mostly not autonomous. But considering that upwards of 40% of deaths in a hospital are correlated with a mistake that a physician makes, patients are paying extra for these robotic surgeries and hospitals are buying them in droves; major players like Verb Surgical, Johnson & Johnson, Auris Health, and Mako Robotics are following this trend.
What you’ll notice about both factory automation and surgical robots is that they’re in extremely controlled environments. In the case of factory robots, the robots aren’t really “thinking” but rather doing the same thing over and over again. And in the case of surgical robots, almost all the perception, thinking and control is being done by a human operator. But as soon as you make the factory automation robots think for themselves or have the surgical robot make decisions without human supervision, the systems break down.
So why don’t we see more robots today?
The distinction to be made is that we don’t see robots today in the day-to-day world we live in, that is, in uncontrolled environments. Why don’t we see robots in the day-to-day world? What’s the one major thing that is preventing us from reaching our dystopian robotic future? Is it a hardware issue? A software issue? An intelligence issue? An economics issue? A human interaction issue?
In order to answer that question, it’s important to understand what a robot actually means. In the literature, a robot is an agent that does four things:
- Sense: The agent perceives the world using some sort of sensor — say a camera, LIDAR, radar, IMU, temperature sensor, photoresistor or pressure sensor.
- Think: Based on the sensor data, the agent makes a decision. This is where “machine learning” comes in.
- Act: Based on the decision, the agent actuates and changes the physical world around it.
- Communicate: The agent communicates to others around it. (This was only recently added to the model.)
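To make the four-part model above concrete, here is a minimal sketch of a sense/think/act/communicate loop. Everything in it is a hypothetical illustration rather than anything from a real robotics stack: the `Robot` class, the `make_thermostat_robot` helper, the canned temperature readings and the 20-degree threshold are all assumptions chosen to keep the example tiny.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Robot:
    """Toy agent wired together from the four stages in the list above."""
    sense: Callable[[], float]           # returns one sensor reading
    think: Callable[[float], str]        # maps a reading to a decision
    act: Callable[[str], None]           # carries out the decision
    communicate: Callable[[str], None]   # reports the decision to others

    def step(self) -> str:
        """Run one pass through sense -> think -> act -> communicate."""
        reading = self.sense()
        decision = self.think(reading)
        self.act(decision)
        self.communicate(decision)
        return decision


def make_thermostat_robot(readings: List[float], log: List[str]) -> Robot:
    """Hypothetical heater controller fed canned temperature readings (Celsius)."""
    stream = iter(readings)
    return Robot(
        sense=lambda: next(stream),
        think=lambda temp_c: "heat_on" if temp_c < 20.0 else "heat_off",
        act=lambda decision: log.append(f"actuator:{decision}"),
        communicate=lambda decision: log.append(f"status:{decision}"),
    )


log: List[str] = []
robot = make_thermostat_robot([18.5, 21.0], log)
decisions = [robot.step(), robot.step()]
print(decisions)
```

In this toy the "think" stage is a single comparison. In a real robot that one function is where perception models and planning would sit, and as the rest of this piece argues, it is the stage that tends to be hardest.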
In the last 50 years, we’ve made exponential advances in each of these realms:
- Sensing: The prices of cameras and other sensors like LIDAR, IMU, radar and GPS are going exponentially lower.
- Think: Cloud computing platforms like Amazon Web Services and Google Cloud Platform have made building software insanely cheap and allow you to pay for just what you use. GPUs like NVIDIA’s have been repurposed from gaming graphics cards to be able to run parallel processes that are ideal for machine learning applications (and now we have cloud-hosted GPUs). Algorithms like deep neural networks have built on the age-old perceptron to be able to do things like recognize objects, understand natural language and even create new content.
- Act: This is probably the realm that’s the most mature. If we divide the robotics world on the highest level into manipulation (interacting with the world like we do with our hands) and mobile robots (walking/moving around), then the automotive industry has solved most problems in mobile robot hardware and industrial automation has solved many of the problems in manipulating objects (assuming a given pose of the object). We’re extremely adept at making hardware and we have the basic hardware necessary to build robots that can do basically anything.
- Communicate: Through the internet and mobile revolutions of the 2000s and 2010s, we’ve made enormous strides in the world of user interaction. So much so that today if we find a company doesn’t have a simple UI/UX, we instantly don’t take it seriously. Defunct companies like Jibo, Anki and Rethink Robotics made serious contributions in this field.
In other words, purely from a technical perspective (we’ll come to economics and human interaction later), it doesn’t seem like sensing and acting are the major bottlenecks. We have really great and cheap sensors and we have great actuation technology (thanks mainly to industrial automation).
So the problem is mainly in “think.” Specifically, according to Vijay Kumar, Dean of Engineering at the University of Pennsylvania and Founder of the GRASP robotics lab, the reason we don’t see robots in our day-to-day world is that “the physical world is continuous while computation, and therefore sensing and control, are discrete, and the world is extremely highly dimensional and stochastic.” In other words, just because a manipulator can pick up a tea cup does not mean it can pick up a wine glass. Currently the paradigm for think that most companies have adopted is based on the idea of machine learning (and more specifically deep learning), where the basic premise is that instead of writing a “program” as in classical computing that takes in some input and spits out an output based on it, why don’t we give an agent a bunch of inputs and outputs in the form of training data and have it come up with the program? Just as we learned in algebra that the equation for a line is y = mx + b, the basic idea is that if we give the machine learning algorithm many examples of x and y, it can find m and b (except on much more complex equations). This approach works well enough to get you most of the way there.
But in the insanely unpredictable world we live in, the idea of providing hordes of training data along the lines of “if you see this, do this” doesn’t work; simply said, there will never be enough training data to predict every single case out there. We don’t know what we don’t know, and unless we have training data for every single instance that has ever happened to an agent in the past and that will ever happen to an agent in the future, this deep learning-based model cannot bring us to full autonomy. (How can you predict something that you don’t even know is possible?) Humans as intelligent beings can actually think; deep learning-based agents aren’t thinking. They’re pattern matching, and if the current state the agent is in doesn’t match one of the patterns that’s already been given to it, the robot fails (or in the case of autonomous vehicles, crashes).
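The y = mx + b analogy can be shown in a few lines. The sketch below recovers m and b from example (x, y) pairs using ordinary least squares, which is a far simpler technique than deep learning but follows the same "learn the program from data" idea. The function name `fit_line` and the sample data are made up for illustration.

```python
from typing import List, Tuple


def fit_line(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Ordinary least squares fit of y = m*x + b; returns (m, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the covariance of x and y divided by the variance of x.
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    m = cov_xy / var_x
    # The fitted line always passes through the point (mean_x, mean_y).
    b = mean_y - m * mean_x
    return m, b


# Training data generated from a "hidden" rule y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]

m, b = fit_line(xs, ys)
print(f"learned m={m:.2f}, b={b:.2f}")
```

The fit only describes the range of x it was shown. Asking it about inputs far outside the training data is extrapolation, which is a small-scale version of the corner-case problem that real-world robots run into.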
What can we do to make more robots that work?
So perhaps deep neural nets are not the way we get to 100% autonomous systems (which is why companies like OpenAI are investing into reinforcement learning algorithms that mimic a Pavlovian reward/pain-based approach to learning). But in the meantime for startups, what if the question of how to build a fully autonomous agent is the wrong question to ask?
A company that exemplifies this idea of not pursuing 100% autonomy is Ripcord, a Hayward, California-based startup that does autonomous digitization of paper. Today corporations have thousands of reams of paper that they’d love to digitize (“no human went to college to become a staple remover,” says CEO Alex Fielding), and so they send them to Ripcord, where the reams are fed into robot cells that pick and place each sheet, scan them and then restack them. Chatting with Alex in the factory, one of the things that struck me was that he never mentioned the idea of “automating humans.” Rather, his pitch was that Ripcord makes a human 40x more efficient. I saw this firsthand: one human oversees four robotic work cells at Alex’s facility. In one example, the robot was working extremely fast through sheets of paper when it perceived a sheet that confused it. Just then, the human overseeing the system received a clear notification on a screen with the problem. The human fixed the problem within 10 seconds, and the robot sprang back to life for the next sheets.
So what if the question for how to build a successful robotics company is not “How do we build agents to automate humans?” but rather “How do we build agents to make humans 40x more efficient while also using their intelligence to handle all the edge cases?” While artificial intelligence develops, this seems to be the formula for building successful companies in the meantime.
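One way to picture this "human handles the edge cases" pattern is a confidence threshold: the robot processes whatever it is sure about and hands the rest to an operator. The sketch below is a hypothetical illustration only. It is not how Ripcord or any other company actually implements this, and the names `process_items`, `toy_classifier` and the 0.9 threshold are assumptions.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Outcome:
    item: str
    handled_by: str  # "robot" or "human"


def process_items(
    items: List[str],
    classify: Callable[[str], Tuple[str, float]],
    ask_human: Callable[[str], None],
    min_confidence: float = 0.9,
) -> List[Outcome]:
    """Let the robot handle confident cases and escalate the rest to a person."""
    outcomes = []
    for item in items:
        _label, confidence = classify(item)
        if confidence >= min_confidence:
            outcomes.append(Outcome(item, "robot"))
        else:
            ask_human(item)  # e.g. show a notification on the operator's screen
            outcomes.append(Outcome(item, "human"))
    return outcomes


def toy_classifier(item: str) -> Tuple[str, float]:
    """Stand-in perception model: anything stapled confuses it."""
    if "stapled" in item:
        return ("unknown", 0.4)
    return ("clean_sheet", 0.99)


escalations: List[str] = []
items = ["page-1", "page-2", "stapled-page-3", "page-4"]
outcomes = process_items(items, toy_classifier, escalations.append)
robot_share = sum(o.handled_by == "robot" for o in outcomes) / len(outcomes)
print(f"robot handled {robot_share:.0%}, escalated: {escalations}")
```

The design choice worth noticing is that the threshold, not the model, sets how much the human does. A team can ship with a low robot share and raise it over time as the model improves.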
Another company that exemplifies this is Kiwi Robotics. Based in Berkeley, California, Kiwi makes food delivery mobile robots. But as CEO Felipe Chávez told me, “We are not an AI company; we are a delivery company.” When Felipe founded Kiwi, he didn’t invest in a ton of expensive machine learning engineers; rather, after building the hardware prototype, he built low-latency software to be able to teleoperate Kiwi. The idea was that humans would initially do 100% of the decision-making for Kiwi, and the team would slowly build algorithms to shrink that share toward full autonomy. Today Kiwi has a team of dozens of teleoperators in Colombia (where Felipe was born) and has made over 100,000 deliveries. A single human can oversee multiple robots; the robot makes almost all the decisions and the humans just course-correct. On the other hand, many competitors who are investing in full autonomy are struggling to make even 1,000 deliveries. [Full disclosure: I am an investor in Kiwi Robotics through my fund Prototype Capital.]
In both of these cases, one of the most important factors is not the machine learning algorithms but rather the human-machine interface. Is that what contemporary robotics companies are missing? According to Keenan Wyrobek, the Founder of blood drone delivery company Zipline and an early robotics pioneer, “while I get the ‘cut labor’ pitch works well to … business owners in the US market, I have seen countless robotics startups fail with this mindset. Make sure your design and eng[ineering] team focus on making all the users of your system more productive … I don’t care how good your robot is, it still has users (people who set up, reconfigure, troubleshoot, maintain, etc). And if those users are not at the center of your design process your robots will not work well enough to ever see a[n] ROI.”
Further, according to Amar Hanspal, CEO of Bright Machines and former Co-CEO of Autodesk, “The common factor between both is that robotic companies start with the technology first (it is too hard and somewhat exciting, so it becomes an end goal in itself) rather than the problem they are trying to solve. The key is … to define a problem you’re trying to solve and then build a great UX around it. Robotics is a means to an end, not the end itself.”
What else can we do to see more robots in our day-to-day world?
So far we’ve seen that one of the major reasons robots for the everyday world haven’t lived up to their promise is that the world is extremely stochastic and artificial intelligence based on deep learning models simply isn’t good enough to deal with every corner case. So perhaps instead of a labor savings model, robotics companies should adopt the “human augmentation” model. Take Apple and Airbnb’s playbook of a human-centered, design-first mentality (rather than an engineering-first one) and invest in amazing user experience.
Here are a few other things we can do to bring robots to the forefront:
The first is to sell the product before building it. In the software world of Silicon Valley, “The Lean Startup” by Eric Ries has popularized the idea of “launch fast and iterate fast till you get to product market fit.” For software startups, this works insanely well. But with hardware and robotics, what ends up happening is that engineering talent-heavy startups focus initially not on sales but rather on engineering and they build, build, build. Then they go to customers to sell, customers say, “This doesn’t exactly meet our goals,” the companies don’t have enough runway to iterate and then they die. This has happened over and over. It seems like for software startups, the lean startup approach works since you can launch most of the time for free (thanks to the cloud), iterate once in the field, deployments are fast and you have five or six shots on goal before you run out of money in your seed round. But in the world of hardware, you have upfront hardware costs, deployments are slow, iteration cycles are slow and you only have one or two shots on goal.
To be clear, we are extremely adept at hardware; it’s just that software-centric Silicon Valley isn’t (with notable exceptions being Apple and Tesla). Perhaps one of the reasons is a lack of selling before building. Case in point: Boeing didn’t approach Juan Trippe, the legendary founder of Pan Am Airlines, and say, “Here’s a Boeing 747. Do you like it? No? Let me go back and build a new version … Do you like it now?” (i.e., iteration a la “The Lean Startup”). Instead, Boeing asked Pan Am to give them an upfront order for dozens of units with all the features specified so that Boeing could build it right the first time. In other words, Boeing sold its product before building it. Systems integrators ask for orders and cash before building anything. So do most hardware companies and suppliers to military branches. Maybe robotics companies can take a page from Bill Gates’s playbook and sell MS-DOS to IBM before writing MS-DOS.
One of the benefits of selling before building is that you can do a sanity check on unit economics. Robotics is one of those fields where not only is there technical risk but also unit economics risk. Many companies have historically found that even if they can find a great idea in a constrained environment, build the tech, raise venture capital and build great human machine collaboration, their economics don’t make sense and once again they fail. By selling before building, you have to analyze your customer’s economics as well as your own and make sure it makes sense. If you try to sell your product before building and nobody wants it, it’s an extremely low-risk way of figuring out that your customers probably won’t buy it and that you may want to move onto the next idea.
More generally on economics, we need to shift from upfront cash models to robotics-as-a-service models. A lot of the customers who will be buying robotic applications have extremely low margins and cannot afford to pay $100,000+ upfront for a system (even if the payback period is a year or two). Adding fuel to the fire is that the activation energy ends up being too much to change something when they “already have something that works.” Thus they reject the product (and then the startup dies). We can take a page from the solar cell/photovoltaic cell industry here; solar cell economics make a ton of sense for a lot of homeowners and yet for a very long time in the 2000s, we saw very few solar cells. Why? The upfront cost was too much for most Americans even though the economics make sense within a few years. The tipping point was not technical but rather financial, with companies like SolarCity, Sunrun, SunPower and others innovating on a model where the customer pays almost $0 upfront but then signs a power purchase agreement (PPA) under which they pay per kilowatt-hour that the cells generate. The same was the innovation of cloud computing: rather than buying a bunch of servers locally to run Oracle and SAP, companies like Salesforce came up with a “pay for what you use” model. To be successful, robotics companies need to do financial engineering so that customers have to pay very little upfront and only pay for what they use (each hour worked, each sheet of paper scanned, each dish cleaned, each mile driven, each kilo of freight shipped).
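The arithmetic behind that argument is simple enough to sketch. The $100,000 upfront price comes from the paragraph above; every other number here (annual savings, the per-hour rate, hours used per year) is a made-up assumption for illustration, not data from any real customer.

```python
def payback_years(upfront_cost: float, annual_savings: float) -> float:
    """Years until cumulative savings cover the upfront purchase price."""
    return upfront_cost / annual_savings


def raas_annual_cost(rate_per_unit: float, units_per_year: float) -> float:
    """Yearly bill under a pay-per-use (robotics-as-a-service) contract."""
    return rate_per_unit * units_per_year


# Hypothetical customer: the robot saves $60,000 a year in operating costs.
upfront_cost = 100_000.0
annual_savings = 60_000.0

years = payback_years(upfront_cost, annual_savings)

# Hypothetical service pricing: $10 per robot-hour, 4,000 hours a year.
raas_cost = raas_annual_cost(rate_per_unit=10.0, units_per_year=4_000)

# First-year cash position under each model.
first_year_net_upfront = annual_savings - upfront_cost
first_year_net_raas = annual_savings - raas_cost

print(f"payback period: {years:.2f} years")
print(f"year-1 net, upfront purchase: {first_year_net_upfront:,.0f}")
print(f"year-1 net, pay-per-use:      {first_year_net_raas:,.0f}")
```

Under these assumed numbers the purchase pays for itself in under two years, yet the buyer ends the first year $40,000 in the red, while the pay-per-use customer is cash-positive from the start. That gap is the activation energy a low-margin customer has to overcome.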
Another one of the benefits of selling before building is that you can consistently test in the field even though you’re building hardware too. Traditionally this “iteration after deployment” is the benefit of software (compared to Apple who often starts hardware development for some of their Macs five to seven years ahead of launch). Since you already have a customer, they have a vested interest in making the product work. One strategy we’ve seen be extremely successful is providing some advisor equity to your early customers so that they’re further incentivized to work with you to make the product economically and technically work for them.
But not everything has to be software either. These days, most Silicon Valley VCs cringe when they see robotics companies that are “hardware heavy.” “We’ll invest if you take a more software approach,” they say, and so today we see robotics companies trying to use almost 100% off-the-shelf hardware and focus almost all their efforts on software. That makes sense in certain applications, but the fact of the matter is that hardware fails a lot less than software, and hardware has been around for millennia and we’re really good at it compared to the relatively nascent computing era. In a lot of cases, hardware can solve the problem a lot better than software. Take for example bin picking; today there are dozens of startups that have raised hundreds of millions of dollars from major VCs building generic deep learning-based and reinforcement learning-based systems to be able to pick and place generic objects out of a bin. On the other hand, at PACK Expo in Las Vegas, I was able to see a company called Soft Robotics. They have taken a mostly hardware-based approach to bin picking with a novel gripper that, without any computer vision, can pick up and place objects using great control (much more consistently than almost all computer vision-based startups). Sure, building a software and training data moat matters, but why solve the problem in a more complex way when there’s a simpler and more robust solution? We shouldn’t run from hardware; we just need to rethink how to do hardware.
More generally, Silicon Valley VCs have created a mentality that if a company cannot be worth a billion dollars, it’s not worth doing or investing in. So robotics founders try to build technology that can serve every possible customer in the hopes of raising venture capital; and although they appease VCs, they end up building a product that doesn’t make any one customer extremely happy. The best companies at the beginning had extremely small markets. In our highly dimensional world, trying to build an insanely generic robotics company on day one is a mistake. Rather, at the beginning it’s important to focus on one (or maybe two) customer(s) maniacally. Once you solve that customer’s problem, you’ll find that other customers probably want something similar. Robotics will probably not scale as fast as consumer or even enterprise software companies at the beginning. But this is not unheard of. Before Intel and the personal computer era, computing worked very similarly to how automation systems integrators work today: you went to an engineering firm for a specific computer that could do one thing (say, calculate the trajectory of your missiles), you paid them $1M, you waited six months and you got your computer the size of a room. Just as computing was slow and nonscalable at the beginning, so too will be robotics. That’s okay and there are still billions of dollars in returns to reap.
Finally, perhaps the way to go to build a successful robotics company is indeed to sell vertical B2B solutions (i.e., the “hole in the wall,” not a drill) instead of making consumer-facing B2C companies. The promise of the latter was simple: If existing customers don’t see the technology working for them or the economics making sense, why don’t we both develop the technology and be our own customer? After all, our tech is better so we can make our own profit, and plus we can control the environment, so it should be technically easier too. It was the same pitch as innovative high-frequency trading firms that decided to build their own hedge funds instead of selling their technology to other hedge funds. So we saw B2C robotic restaurants, end-to-end legal firms that were building AI to automate themselves and consumer-facing coffee shops. The problem was twofold: One, most B2C businesses like restaurants fail and most startups fail, but trying to do both is just too much, especially for a startup with limited runway; and two, a lot of these brands didn’t work out not because the tech didn’t work but rather because the consumer brand wasn’t strong enough. The kind of team it takes to build a hard technical product is very different from the kind of team it takes to build a consumer brand and, oftentimes, even if the tech worked, the brand wasn’t strong enough, so customers came once to take a picture but retention wasn’t good enough to make the economics work. The same is true for education-based and “toy” robotics. While these are “cool,” we have yet to see an example of a company that used this model to build a lasting business, since it seems like they’re more “nice to have” than “need to have.” (So when an economic downturn like the one we’re in happens, nobody wants the product anymore.)
There also has recently been a trend toward platforms to empower robotics companies to make it easier for them to succeed just like AWS made it easier for modern internet companies to succeed. Again this sounds great on the surface but the difference is that before AWS, there was a flourishing set of software companies who were building great businesses and who had cash to pay AWS for a better product. But today, there simply aren’t enough robotics companies who have enough revenue to make these B2B companies make sense. It still seems we need the “killer application” of the iPhone before the platform of the App Store makes sense.
Areas ripe for disruption
In other words, we have a long way to go in terms of seeing robots in our day-to-day world since there are so many places robotics companies can go wrong. Here are the kinds of robots that I think we’ll see more of in the day-to-day world in the short term (next two to four years):
More autonomous factory automation. For factory automation, the customers already exist. If we can build better technology that makes these systems more autonomous, we’ll see a lot more customers who want this.
Semi-autonomous and teleoperated companies. Similar to the surgical robots, Tesla Autopilot and Kiwi, we’ll see a lot more companies whose goal is partial autonomy and augmenting humans rather than replacing them.
Manipulation-based robots in factory-like settings. In 2015, mainly because of Google’s investment in self-driving cars, VCs invested hundreds of millions into autonomous vehicles with the premise that “driving is driving is driving.” If we can solve driving for one car and in one city, it can probably scale pretty well. Today, we’re in a bit of a winter in autonomous vehicles and very few companies seem to have an idea of what to do next (mainly because the world is so random and deep learning may not be enough). On the other hand, manipulation was left behind and today seems to be making a comeback as we’re seeing engineers leaving autonomous vehicle companies and seeking something new that could actually be in production sooner. Manipulation applications tend to be in extremely controlled environments and we’ll probably see more of these (such as Bright Machines’ microfactories and AMP Robotics’ recycling sorting robots).
In the same vein, today there’s a trend of “moving toward the cloud.” Before the first Industrial Revolution, we used to make textiles in our homes. But then we realized that we could centralize production of textiles at factories and take advantage of economies of scale. As a result, today we see very few people making textiles at home. Applying this to today, if you imagine a world in which almost everything moves to the “cloud” and you send your household chores to someone else to do them using a central robotic facility (cooking, dishwashing, clothes washing, clothes folding, etc.), there’s a massive opportunity to apply robots that affect the everyday person but are in a setting where robots work best (factories).
Perhaps the only thing we’ll do in our homes then is cleaning, and thus there is and always will be a massive opportunity for cleaning robots, from systems that clean homes and mow lawns to ones that clean indoor malls, plow snow and serve other B2B applications.
Robotics still holds immense promise and it’s certainly doable. Selling before building, ensuring the unit economics work early with low-risk bets, testing the system often in the field, providing early customers advisor equity to align incentives, building a product to solve a problem for a particular customer well rather than building something generic, thinking about robots as a combination of great hardware and great software rather than software alone and pursuing vertical B2B applications can help. But in a broader sense, rather than hitting every nail with the same software mentality hammer, it may be time to think from scratch.