Three kinds of work, three kinds of success, and why confusing them is costly
Early in my career I worked at Bell Labs, where one could work on any intellectually challenging problem, whether or not it had immediate relevance to the mothership AT&T. Later I worked at startups where the pressure to sell and ship was the only thing that mattered. At IBM Research, where I was a researcher and then a manager, the relationship between research and development was marred by constant tension. At ICSI, a purely academic institution where I served as Director, work perceived as close to a product was unwelcome to most of the theoretical researchers, and the agenda was driven entirely by research grants. At Google I led a large team of R&D engineers in an environment where research and engineering were institutionally separated but collaborated closely. In one of my most recent positions, at an enterprise AI company, I led a large AI engineering team that occasionally did interesting research, though the company rarely knew what to do with it.
My long journey as a researcher, engineer, and leader through the most technological aspects of AI taught me something that even highly technical people often miss: research, R&D, and product engineering are not simply the same activity performed by the same people at different speeds and under different levels of pressure. A good friend of mine, and an amazing technologist, once joked that engineering runs on quarters of three months, while research runs on quarters of a century. In reality, they are fundamentally different jobs, with different goals, different success criteria, and people with different motivations and experience.
The confusion is understandable. From the outside, they all look like smart people working on hard technical problems. But conflating them has real consequences. Organizations that do not understand the difference tend to underfund the work that matters most, overpromise on the work that is hardest to measure, and, most importantly, chase away talented people who feel they do not fit within the company agenda.
Before going further, one clarification. None of what follows is an argument that one kind of work is more valuable than another. A great product engineer who ships reliable software at scale creates enormous value. So does a researcher who spends three years on a problem that may never ship. The mistake is not in choosing one over the other. It is in confusing them, or expecting one to behave like the other.
The technology portfolio lens
A useful starting point is to place these categories along two axes. The first is impact in case of success, which is related to return on investment. The second is probability of success, which is tied to epistemic risk: uncertainty about whether the knowledge needed to solve a problem even exists yet. Together, these two axes define a space that innovation portfolio researchers have mapped into four quadrants, a framework developed by Robert G. Cooper and colleagues for R&D portfolio management. The quadrants have names that are worth knowing: Pearls (high probability of success, high impact), Oysters (low probability of success, high impact if achieved), Bread and Butter (high probability of success, low impact), and White Elephants (low probability of success, low impact). If you wonder about the term “white elephants”, it comes from the legend of the sacred animals, found in parts of Asia, that a king would give to courtiers he wished to ruin: too expensive to maintain, too sacred to put to work.
The goal of any healthy technology portfolio is to cultivate Pearls, pursue Oysters selectively, keep Bread and Butter work from crowding everything else out, and eliminate White Elephants as quickly as possible. Mapping the three categories of technical work onto this space gives us a useful picture.
Product engineering lives in the Bread and Butter quadrant. The time horizon is short, typically one to six months, epistemic risk is low, and the goal is to take existing technology and turn it into something reliable and useful. Examples are everywhere: fine-tuning a pre-trained model for a specific domain, improving speech recognition accuracy on a known test set, reducing inference latency in a production pipeline. Success is measurable: does it work, does it scale, does it ship on time? Are customers happy?
R&D occupies the Pearl quadrant. The knowledge base exists, the technical direction is reasonably clear, and the goal is to push current systems significantly further. Retrieval-augmented generation after the original concept was proven, reinforcement learning from human feedback once OpenAI demonstrated the approach, multimodal extensions of transformer architectures once the scaling properties were understood: these are Pearl-type efforts. High return, achievable with sustained engineering and research effort, risk coming from execution rather than from fundamental unknowns. The frontier labs competing within the transformer paradigm, on cost, openness, or fine-tuning capability, are running Pearl-type programs. The architectural uncertainty is resolved. The question is how well and how fast.
Research lives in the Oyster quadrant. The horizon is long, often several years, epistemic risk is high, and there is no guarantee of practical output. You can see the potential, but finding the pearl requires years of work, wrong turns and backtracks, and constant awareness of what other researchers around the world are doing. That is why research cannot happen in isolation. Publishing results, attending conferences, and engaging with the broader scientific community are not optional extras. They are how researchers know where the field is, avoid duplicating work already done elsewhere, and attract the collaborators and critics that sharpen their own thinking. The early work on transformer architectures before anyone knew whether scaling would yield emergent capabilities was an Oyster. The initial bets on large language models, the first serious work on interpretability, the attempts to build world models that represent reality rather than language about reality: these are Oysters. Some will open. Many will not. The frontier labs betting that transformer-based LLMs have a fundamental ceiling, and pursuing entirely different architectures or learning regimes, are running the deepest Oysters in the current landscape. Same industry, similar talent pool as their Pearl-type competitors, completely different relationship to epistemic risk, and completely different motivations. That difference matters enormously for how you fund them, manage them, and judge their progress.
Every portfolio also accumulates White Elephants: projects with low probability of success and low return even if they succeed. In AI, these tend to be investments that made sense under a previous paradigm but were not abandoned when the paradigm shifted. Building a proprietary natural language understanding stack after transformer-based models commoditized the problem. Investing heavily in voice biometrics as a primary authentication layer just as device-based authentication made it redundant. Continuing to develop sophisticated dialogue management on finite-state machines long after transformers had made them obsolete. Or, more often than not, pursuing complex schemes that improve accuracy or speed, or reduce cost, by insignificant amounts. White Elephants are rarely born that way. Most were once Oysters. The failure is not in starting them. It is, most often, in not knowing when to stop.
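For readers who like to see a framework written down, here is a minimal Python sketch of the four quadrants. It is an illustration only, not part of Cooper's method: the `Project` type, the `classify` function, the 0-to-1 scales, and the 0.5 cutoff are all assumptions made for the example, and the scores given to the sample projects are invented.

```python
from dataclasses import dataclass
from enum import Enum


class Quadrant(Enum):
    PEARL = "Pearl"
    OYSTER = "Oyster"
    BREAD_AND_BUTTER = "Bread and Butter"
    WHITE_ELEPHANT = "White Elephant"


@dataclass(frozen=True)
class Project:
    """Hypothetical record for one project in a technology portfolio."""
    name: str
    success_probability: float  # 0.0 to 1.0, a subjective estimate
    impact: float               # 0.0 to 1.0, normalized payoff if it succeeds


def classify(project: Project, threshold: float = 0.5) -> Quadrant:
    """Place a project in a quadrant. The 0.5 cutoff is an arbitrary assumption."""
    likely = project.success_probability >= threshold
    high_impact = project.impact >= threshold
    if likely and high_impact:
        return Quadrant.PEARL
    if high_impact:
        return Quadrant.OYSTER
    if likely:
        return Quadrant.BREAD_AND_BUTTER
    return Quadrant.WHITE_ELEPHANT


# Invented scores, chosen only to land each example in the quadrant
# the text assigns it to.
portfolio = [
    Project("reduce inference latency in production", 0.9, 0.3),
    Project("multimodal extension of a proven architecture", 0.7, 0.8),
    Project("world models beyond language", 0.1, 0.9),
    Project("dialogue management on finite-state machines", 0.2, 0.1),
]

for project in portfolio:
    print(f"{project.name}: {classify(project).value}")
```

The real difficulty is not the classification but the estimates that feed it. A White Elephant is often an Oyster whose probability and impact numbers nobody went back to revise.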
The above framework helps organizations allocate resources, set expectations, and avoid asking the wrong questions of the wrong teams. But it is incomplete.
The missing axis
Potential impact and epistemic risk describe the conditions of the work. They do not describe what success looks like when the work is done. The definition of success is a third, independent dimension, and ignoring it is where organizations go wrong.
For product engineering, success is clear: a working feature, a measurable improvement, a product in the hands of users. The definition of done is not ambiguous.
For research, success is also clear, though it looks nothing like engineering success. A paper accepted at a peer-reviewed conference, and cited by many. A new model architecture. A finding that changes how the field thinks about a problem. Knowledge is the output. Whether it ships is often irrelevant to the pure researcher, even though seeing a product built on the results can be a gratifying thought.
The middle tier, R&D, is where the definition of success becomes genuinely difficult. And that difficulty is not accidental. R&D, when it works, is the conversion of research output into something an engineering team can actually build on. It is, in the language I use in my “From Spark to System” series, the Translation phase of innovation: the process by which an invention becomes an innovation.
Translation requires a rare combination of skills. The translator must understand the research deeply enough to know what it actually proves, and what it only suggests. And they must understand engineering deeply enough to know what is buildable, at what cost, and on what timeline. Most researchers never develop the engineering side of that equation. Most organizations never recognize that the gap exists.
What organizations get wrong
The most common failure I have seen is collapsing the three categories into one budget line called “R&D.” This produces two pathologies.
The first is that research gets evaluated on engineering metrics. Publication timelines get pressured by product roadmaps. Researchers are asked to justify their work in terms of near-term product impact. More often than not, attending a conference is seen as taking an expensive vacation. The result is that genuine research stops happening, though it may still be called research.
The second pathology is the opposite: engineering work gets labeled research to attract talent or justify budget. Nothing new gets built, but everything sounds impressive.
The rarest failure, but the most damaging, is simply having no concept of research at all: treating every technical problem as an engineering problem, and measuring every team by delivery velocity. I have seen this too.
The invisible research problem
There is another type of failure, subtler than the others and perhaps more common than people realize. I have worked at companies that had genuine research talent. Real publications. Real ideas that were ahead of their time. People who understood what the field did not yet know and were actively working on it. But those companies never claimed that identity. Research was something that happened in certain rooms, carried by certain individuals, largely invisible to the strategy, the sales deck, and the board presentation.
This is not the same as having no research. The work existed. In some cases it was excellent. But because it was never positioned as a strategic asset, the company captured almost none of the value it could have. It did not shape hiring. It did not inform product direction in any systematic way. It did not attract the kind of talent or partnerships that a genuine research identity makes possible. The research happened almost by accident, sustained by individual commitment rather than institutional recognition.
The contrast with Bell Labs is instructive. Bell Labs was not just a place where research happened. Research was the identity of the institution. It shaped everything: how the organization was structured, how success was defined, what kind of people it attracted, and how it was perceived externally. That institutional commitment to research as an identity, not just an activity, is what made the translation from research to world-changing technology possible over and over again.
Recognizing that you have research talent is not enough. You have to decide what to do with it.
Why it matters
Understanding the difference between these three kinds of work is not an academic exercise. It shapes how you hire, how you measure progress, and ultimately how much of what you build will still matter in ten years.
The quadrant framework gives you the two-dimensional map. Adding the third dimension, the nature of success, gives you something more useful: a way to ask not just how risky a project is, or how long it will take, but what kind of value it is actually trying to create. That question, asked honestly and answered clearly, is what separates organizations that innovate from organizations that only talk about it.

