I had one of those weeks. On Monday, I was bug bashing a feature. On Tuesday, I was building email sequences. On Wednesday, I was cleaning up landing page copy. By Thursday, I was looking off into the distance, thinking about how, at some point between next week and the next decade, there will be an AI co-pilot helping out with all of this.
It’s still crazy to think about: every aspect of my job described above could look different because of LLMs and generative models. The promise is an inflection in productivity, which, as I’ve discussed before, means smaller teams, more impact.
It seems like every week we learn a little more about where the value will accrue. Building a wrapper on top of OpenAI has no moat. There will be many LLMs (not just OpenAI’s, Bard, etc.) competing for niche use cases. The chat UX is promising for some things, but may not be the ultimate game changer for everything. AI agents are super cool and LangChain is sweet. These models are really freaking good at creative functions (generating copy, images, ideas, brainstorms) and at analyzing insane amounts of text (legal, search 2.0, research). But they’re sometimes bad at things like addition and subtraction. Hallucinations are tricky to solve. They make small mistakes every now and then.
While the Twitter-verse gushes its excitement over all of the above, I’m not seeing many people who can provide a reasonable answer to when my AI co-pilots will actually join my team and drive value. There are plenty of technical challenges (both hardware and software) that create a wide confidence interval, but the one I think is not talked about enough is trust.
A long, long time ago (2017-ish) at Lyft, the company went big on autonomous driving. After some incredible technical advancements the prior year, it was time to start really betting on it. In addition to acquisitions and venture capital raises, long-term company north stars were set along the lines of “X% of rides will be autonomous by 2020.”
You all know the story. Autonomous gets snagged on the “last mile” of technical challenges. Regulation turns out to be a lot dicier than we thought. 2020 comes and goes, and autonomous is demoted back to a crazy-cool upcoming technology.
Luckily, GenAI is very different from the rollout of autonomous cars, but a big learning from the self-driving journey is that we hold technology to a really high bar. Like, absolute perfection. Road crashes are the leading cause of death for young people in the United States. You would think that if we could reduce those crashes by more than 10%, we would act on it as soon as possible, saving tens of thousands of lives each year. Autonomous promised safer roads, but because it is not perfect, we don’t trust it enough to roll it out at scale. It has remained a “cool” technology instead of getting promoted to a “useful” one. We give humans room for error, but not technology. The team at Cruise took out a full-page ad in The New York Times earlier this week, arguing that it was (finally) time to fully embrace self-driving cars as the safer option.
I’m curious whether this same dynamic will drag out the timeline for GenAI co-pilots to actually make an impact. Does my AI co-pilot need to be perfect for me to integrate it into my work? I need to trust that it will help me find the bug in a feature without missing anything, that my email sequences will execute correctly, and that my landing page copy is better than what I could spin up myself. These models will do very well in the creative space, where there is no such thing as perfection. But when it comes to data analysis or workflow execution, I get nervous about whether we’ll see them get past being a “cool” technology with crazy potential in the next few years.
Luckily, unlike autonomous, we can incrementally roll out generative analysis and workflows, automating the simplest and most straightforward experiences first. Co-pilots are starting as “QA assistants” for your code rather than writing your entire microservice. This ability to release incrementally and improve quickly is the great advantage of software over hardware, and it’s why the rate of change is so much faster. But it doesn’t change the fact that anyone trying to automate a human process needs to keep the bar of quality really high and aim for perfection. Trial users are easy to come by while the AI hype is at its peak, but as we hit the inevitable trough of disillusionment, we may see design partners and curiosity-driven usage decline. Keep pushing, though, because I really need these co-pilots.
Happy weekend,
Raman at Rhetoric