For the past year, I've been leading a redesign of the Prime Video mobile apps, rethinking the entire experience for millions of customers across Android and iOS. It's the kind of project that fails more often than it succeeds. Even when it doesn't cause outright harm, it tends to underwhelm, delivering far less positive impact than expected.
I've seen this play out as a customer too. Digg's redesign in 2010 effectively killed the company, driving much of its user base to Reddit. Reddit's own redesign was so divisive that old.reddit.com is still actively used years later, and I stopped using the site for years myself. The patterns are pretty consistent.
Here's what I've learned about why redesigns fail, and the decisions we made to avoid those traps.
The big-bang trap
The most common way a redesign fails is the big-bang launch. You spend 12–18 months building the "new experience" in parallel, flip the switch, and ship it all at once. It feels clean. It feels decisive. And it almost always goes badly.
The problems are predictable:
- You're making hundreds of decisions based on assumptions that are 12+ months old by the time they reach customers
- There's no feedback loop during development — you're flying blind until launch day
- When something goes wrong (and it will), you can't tell which of the hundred changes caused it
- The team is exhausted from a long build cycle and immediately has to firefight
- Customers who hate the change have no gradual path to adjust — it's all or nothing
The instinct to bundle everything together comes from a good place. You want the experience to feel cohesive. You want the "wow" moment. But cohesion doesn't require simultaneity. You can ship pieces that build toward a coherent whole without shipping them all at once.
What we did instead: find the Goldilocks zone
Early on, we made a deliberate decision to release the redesign incrementally, validate each piece through experiments, and only move forward when data showed it worked.
The key risk with this approach is obvious: you create a Frankenstein experience where one part of the app looks completely different from the rest. And if the increments are too small, users don't even perceive the change — and the impact isn't measurable in an A/B test.
This is where finding the Goldilocks zone matters. You need chunks of the experience that are large enough to be perceptible and measurable, but small enough to ship and learn from quickly. Each chunk should be a bet you have reasonable confidence will drive positive impact.
For example, we tested new 2×3 artwork and autoplaying video on the home page. Ideally, we wanted this across all surfaces — search, detail pages, everything. But we made an intentional decision to constrain scope so we could ship and learn sooner. That constraint paid for itself: we got real signal weeks earlier than we would have otherwise, and used it to inform how we rolled out the same patterns to other surfaces.
Parallelize, don't serialize
The other trap redesigns fall into is serialization — finishing one thing completely before starting the next. It feels orderly, but it's incredibly slow when you have a large team.
We took the opposite approach: maximize parallelization across a small number of workstreams at a time. Instead of one team working through a sequential backlog, multiple teams worked on different parts of the redesign simultaneously, each with clear ownership and independent launch criteria.
This only works if the pieces are genuinely decoupled. If Team A's work depends on Team B finishing first, parallelization just creates waiting. So we invested heavily upfront in identifying dependencies and restructuring the work to minimize them. Sometimes that meant doing things in a less architecturally elegant order. Worth it.
One thing that's easy to miss: when you're running parallel experiments, you need to think about the combinatorics. If you're experimenting on the home page and the detail page simultaneously, there are four possible combinations a customer might see (old home + old detail, new home + new detail, old home + new detail, new home + old detail). All four need to be acceptable experiences. This takes deliberate design work upfront, but it's what makes parallelization safe.
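To make the combinatorics concrete, here's a minimal sketch of how you might enumerate the experiences that two independent experiments can produce, so each one gets reviewed as a real experience rather than discovered in the wild. The Kotlin types and names (Surface, Variant, ExperienceCombo) are illustrative only, not anything from the actual Prime Video codebase.

```kotlin
// Minimal sketch: two independent experiments, each with a control and a
// treatment arm, produce 2 x 2 = 4 experiences a customer might see.
enum class Variant { CONTROL, TREATMENT }

enum class Surface { HOME, DETAIL }

// One concrete experience a customer could land in.
data class ExperienceCombo(val assignments: Map<Surface, Variant>)

// Enumerate every combination so each can be design-reviewed and QA'd.
fun allCombos(): List<ExperienceCombo> {
    val variants = Variant.values().toList()
    return variants.flatMap { home ->
        variants.map { detail ->
            ExperienceCombo(mapOf(Surface.HOME to home, Surface.DETAIL to detail))
        }
    }
}

fun main() {
    allCombos().forEachIndexed { i, combo ->
        println("Combination ${i + 1}: ${combo.assignments}")
    }
    // Prints four combinations; with n parallel experiments it's 2^n,
    // which is one reason to keep the number of concurrent experiments small.
}
```

Writing the combinations down like this is mostly a forcing function: reviewing each mixed experience becomes an explicit task, and the exponential growth is a quiet argument for limiting how many experiments run at once.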
Experiment deliberately
We didn't just ship features — we experimented with them. But we were intentional about which changes warranted a full A/B test versus which ones we could ship with confidence while monitoring metrics.
Larger, riskier changes got formal experiments. Smaller, higher-confidence changes got a judgment call and post-launch monitoring. Trying to A/B test everything would have been slower than the big-bang approach we were trying to avoid.
This sounds obvious, but it has a non-obvious implication: you have to design your features to be experimentable. That means building things so they can be toggled on and off, their impact can be measured in isolation, and a negative result doesn't block everything else.
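As a rough illustration of what "experimentable" means in practice, here's a sketch of a screen gated behind a flag, with its metrics tagged by variant so impact can be read in isolation. The FeatureFlags and Metrics interfaces and the flag name are hypothetical stand-ins, not Prime Video's actual infrastructure.

```kotlin
// Hypothetical sketch of an "experimentable" feature: toggled by a flag,
// measured in isolation, and safe to turn off without touching anything else.
interface FeatureFlags { fun isEnabled(flag: String): Boolean }            // assumed flag service
interface Metrics { fun track(event: String, dims: Map<String, String>) }  // assumed metrics client

class HomeScreenPresenter(
    private val flags: FeatureFlags,
    private val metrics: Metrics
) {
    fun render(): String {
        val redesign = flags.isEnabled("home_redesign") // illustrative flag name
        val variant = if (redesign) "treatment" else "control"

        // Tag every event with the variant so this experiment's impact can be
        // measured on its own, independent of other in-flight workstreams.
        metrics.track("home_rendered", mapOf("variant" to variant))

        // A negative result is handled by flipping the flag off; the legacy
        // path stays intact until the new one has proven itself.
        return if (redesign) renderRedesignedHome() else renderLegacyHome()
    }

    private fun renderRedesignedHome() = "home/v2"
    private fun renderLegacyHome() = "home/v1"
}
```

Viewed this way, the toggle isn't an afterthought: it's what lets a failed experiment be rolled back without blocking the rest of the redesign.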
The payoff is that you spend almost no time debating whether a change is good — the data tells you. And even when an experiment fails and you still want to launch the feature, you have data to understand the implications, make an informed call, and build a follow-up plan. That's a fundamentally different conversation than "we think this is probably fine."
The real cost of this approach
I'd be dishonest if I said incremental delivery is strictly better. It has real costs.
The hardest part of a large redesign isn't the technology or the design — it's the coordination. With incremental delivery, people are constantly context-switching between modes. One workstream is in design, another is mid-build, a third is analyzing experiment results. In a big-bang redesign, the whole team moves through phases together. In our approach, you're in all phases simultaneously. We set this expectation upfront so people weren't surprised by it, but it's genuinely harder.
The second cost is emotional. When an experiment fails on something the team spent weeks or months building, it stings. The instinct is to feel like the work was wasted. This is where leadership matters most — reinforcing that learning is its own outcome. A failed experiment isn't a failed project; it's a project that saved you from shipping something that would have hurt customers. And because you're fighting on multiple fronts, a setback on one doesn't stop progress on the others.
Long redesigns are marathons. If you burn people out in the first six months, you won't have the stamina to finish. Ship early wins to build momentum. Celebrate launches, not just results. Give the team visible proof that the work matters before asking them to keep going.
The best redesigns don't feel like redesigns
They feel like the product gradually getting better — like it's finally becoming what it always should have been. That only happens when you ship continuously, learn constantly, and resist the temptation to hold everything back for one big reveal.