How can we measure the productivity gains of AI?


There is a lot of talk about the productivity gains that artificial intelligence is said to be bringing to businesses. We read all kinds of figures, some of which are impressive, many of which are disappointing, and all of which are commented on and give rise to endless and heated debates. I have noticed that one question is never asked: how are these gains measured?

In short:

  • Discussions about AI-related productivity gains often lack a baseline measurement, making any assessment of progress uncertain.
  • Initial observations are rarely made because this step is complex and costly, and without them the effects of AI cannot be measured accurately.
  • The assessment of gains focuses on individuals rather than collective processes, which limits its relevance to the overall performance of the business.
  • New tasks induced by AI (such as verification or control) are rarely taken into account, distorting calculations of actual productivity.
  • Individual perceptions often replace objective measures, despite the biases they entail, leaving businesses without reliable data to assess the real impact of AI.

Measuring a situation or measuring progress

Productivity, as such, is a static measure: it expresses, at a given moment, the ratio between what is produced and the resources that are mobilized. It is a snapshot. But when it comes to talking about progress, gains, or improvement, it is a matter of comparing two snapshots.

Measuring progress requires a starting point and an end point. Without this starting point, it is impossible to say that progress has been made, whether in terms of productivity or in any other area. Yet most discussions and studies of AI-related productivity lack this initial measurement.
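To make the point concrete, here is a minimal sketch in Python of productivity as a ratio and of a gain as the comparison of two snapshots. The function names and the figures are assumptions chosen for illustration; they do not come from any real study.

```python
from typing import Optional


def productivity(output_units: float, hours_spent: float) -> float:
    """Snapshot of productivity: output produced per hour of resources."""
    if hours_spent <= 0:
        raise ValueError("hours_spent must be positive")
    return output_units / hours_spent


def relative_gain(before: Optional[float], after: float) -> Optional[float]:
    """Relative gain between two snapshots, or None when there is no baseline."""
    if before is None:
        return None  # no starting point: the gain is undefined, not zero
    if before <= 0:
        raise ValueError("baseline productivity must be positive")
    return (after - before) / before


# Hypothetical figures, for illustration only.
baseline = productivity(output_units=40, hours_spent=8)  # 5.0 units per hour
with_ai = productivity(output_units=50, hours_spent=8)   # 6.25 units per hour

print(relative_gain(baseline, with_ai))  # 0.25
print(relative_gain(None, with_ai))      # None
```

The second call is the situation most businesses are in: the "after" snapshot exists, but with no "before" the function has nothing to compare it with.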

The big omission: initial observation

How can this baseline measurement be established? Given that most of the discussion revolves around individual productivity gains, the method is as appealing as it is empirical, and it is difficult to do better than what Taylor did in his day: follow an employee throughout their day, note the tasks performed, time their duration, and thus obtain a benchmark before introducing AI.

But in reality, no one does this. It’s too intrusive, too expensive, too complex, and we’re often in such a hurry to deploy the technology that we don’t want to waste time on this laborious step. As a result, we know neither how long a task took before, nor anything about the whole constellation of peripheral micro-tasks that punctuate the day: adjustments, coordination, interruptions, that famous “work about work”, which simply means that employees spend an incredible amount of time compensating, out of sight, for organizational dysfunctions and tool problems (Work about work: when the reality of work consists of making things that don’t work work). This is time that nobody wants to see: ask managers what their team does and the answer will remain superficial, based on what they think the work is in terms of expected deliverables, without, in most cases, the slightest idea of what the team really does.

Measuring people or activities?

It would be simpler if we measured a process or activity from start to finish, as a workflow, without worrying about the intermediate steps, and I hope we will achieve this with agentic AI. But today, most pilots involve generative AI with a focus on individual employee augmentation, with all the limitations that this entails. However, few people seem to be aware of this, unless they prefer to turn a blind eye (AI in the workplace: going beyond augmentation to actually transform).

For the business, what should really matter is progress in an activity or process, not in an individual. But, as confirmed by an expert on the subject who works with many businesses, finding a few employees who show a significant gain is enough to satisfy businesses today, regardless of whether this gain ultimately has any impact on the performance of a given process or activity. We have known for a long time that when a result depends on the work of several people, the performance of one does not predict the performance of the group, but this is a concept that has struggled to leave the factories and enter the offices (Local optimum vs. global optimum and the theory of constraints: why your productivity gains sometimes serve no purpose).

But if the intuition of a business like Moderna is correct, then one day we will have to resolve to measure flows rather than individual activities, and when that day comes, it is likely to give many people a headache (Thinking of work as a flow: appealing, but is it realistic?).

After AI: a truncated vision

Unsurprisingly, after the introduction of AI, these same limitations remain. We can see that a core task is accelerated, but this is not enough to establish a proper overall gain. This is because AI also introduces new induced tasks: prompting, verifying, correcting, and checking. This time is never taken into account in the calculations, even though it can largely offset or even cancel out the initial gain.
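The arithmetic can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `TaskTiming` class, its fields, and every duration below are hypothetical, not measurements.

```python
from dataclasses import dataclass


@dataclass
class TaskTiming:
    """Minutes spent on one unit of work (hypothetical values)."""
    core: float               # the core task itself
    prompting: float = 0.0    # induced: writing and refining prompts
    verifying: float = 0.0    # induced: checking the AI output
    correcting: float = 0.0   # induced: fixing what the check found

    def total(self) -> float:
        """Full time for the task, induced work included."""
        return self.core + self.prompting + self.verifying + self.correcting


def net_gain_minutes(before: TaskTiming, after: TaskTiming) -> float:
    """Positive means time saved; negative means the task now takes longer."""
    return before.total() - after.total()


before = TaskTiming(core=60.0)
after = TaskTiming(core=30.0, prompting=5.0, verifying=15.0, correcting=12.0)

apparent_gain = before.core - after.core   # what usually gets reported
net_gain = net_gain_minutes(before, after)  # what actually happened

print(apparent_gain)  # 30.0
print(net_gain)       # -2.0
```

With these made-up numbers, the core task is twice as fast, yet the whole task takes two minutes longer once the induced work is counted.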

And once again, even if an operation can be performed more quickly on an individual level, there is no guarantee that the net gain will be positive on the scale of an activity involving several participants, even if one person’s task is only a final validation. What is presented as a local acceleration may ultimately result in a neutral or negative overall result.
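A deliberately simplified model shows why. The sketch below assumes a strictly sequential process with no buffers, in which the slowest step caps what the whole activity can deliver; the step names and daily capacities are hypothetical.

```python
from typing import Mapping


def throughput_per_day(step_capacity: Mapping[str, float]) -> float:
    """Units per day a sequential process can deliver: the slowest step's capacity."""
    if not step_capacity:
        raise ValueError("at least one step is required")
    return min(step_capacity.values())


# Hypothetical daily capacities (units each step can handle per day).
before = {"drafting": 10.0, "review": 6.0, "final_validation": 8.0}

# AI doubles the drafting capacity, nothing else changes.
after = dict(before, drafting=20.0)

# Same acceleration, but reviewers now spend longer checking AI-written drafts.
after_with_heavier_review = dict(after, review=5.0)

print(throughput_per_day(before))                     # 6.0
print(throughput_per_day(after))                      # 6.0
print(throughput_per_day(after_with_heavier_review))  # 5.0
```

In this toy case, doubling one person's speed leaves the activity's output unchanged, and a small extra burden on the step that was already the constraint makes the overall result worse.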

Perception replaces measurement

But everything I have just said would only make sense if we had reliable figures, which is rarely the case.

In reality, most of the time, businesses don’t measure, they ask questions. Productivity “gains” are self-reported by employees (of course, since businesses didn’t think to measure them, didn’t know how, didn’t want to, or couldn’t), but this perception is influenced by several biases.

There is the novelty effect, which gives the impression of working faster simply because the tool is perceived as modern, and the usual Hawthorne effect, which tells us that being observed changes behavior and creates an illusion of performance. But there are also employees who, excited by the magic of AI, will forget the new tasks required by the tools, and those who simply won’t dare tell their employer that they are not convinced that the tool in which the business is investing heavily will ultimately bring much benefit (especially if they know the price…).

It is often said that “perception is reality”, but in fact this only applies to employees. The business, on the other hand, needs reliable figures to know what is going on, and in this case it is moving forward in the dark.

I know the subject is complicated, but given what is at stake in terms of (potential) gains and (very real) expenses, I think the subject requires more rigor than simply being satisfied with finding employees who say it’s better than before. As Deming said, “In God we trust, all others must bring data.”

And I’m still talking about gains, not ROI, which is another matter entirely.

The trap of the collective

I’m not going to go over the fact yet again that an organization is not a juxtaposition of individuals but a system of interdependencies in which what benefits some does not always benefit the business, but if we add to that the bias of perception, the transition from “I think I’m going faster” to “we are certain that we are doing better” is highly risky.

At the risk of repeating myself, I think this is a subject on which many people are seriously mistaken, and I would be happy to see much more, if not vigilance, then at least awareness of this limitation.

Bottom Line

Ultimately, and at least for now, talking about AI-related productivity gains often amounts to telling a story rather than establishing a fact. Without baseline measurements, without comprehensive follow-up measurements, and without taking collective interdependencies into account, the figures put forward are based more on perception than on reality.

We can estimate, feel, and tell stories, but to prove anything, we would need to return to a level of rigorous observation that almost no one applies, because announcing productivity gains without a starting point is a bit like claiming to run faster without ever having timed yourself before.

To answer your questions

Why is it difficult to measure AI-related productivity gains in the workplace?

The problem stems mainly from the lack of baseline measurements. Without initial observations, it is impossible to compare before and after. Businesses therefore rely on employee impressions in the absence of hard data. This fast but imprecise approach means that the announced gains are often based on individual perceptions rather than on an objective and rigorous assessment.

What are the risks of focusing solely on individual gains rather than collective gains?

An employee may be faster thanks to AI, but if the other steps in the process don’t follow suit, the gain remains local and useless for the whole. This “local optimum” logic can even hurt overall performance. What really matters to the business are end-to-end flows and processes, not just individuals.

How does AI generate new tasks that impact actual productivity?

AI speeds up certain tasks, but it also creates new ones: writing prompts, checking results, correcting, and monitoring. These additional steps take time and can offset or even negate the initial gain. A task completed more quickly does not necessarily guarantee an improvement at the collective process level.

Why are employee statements not enough to measure the benefits of AI?

Feelings are skewed by the novelty effect or the desire to please management. Some people overestimate the tool, while others don’t dare say that it doesn’t add much value. These perceptions, however sincere they may be, too often replace objective measurements. For a business, this is like moving forward without reliable data.

What needs to change to better assess the impact of AI on productivity?

A rigorous observation process should be put in place: defining a starting point, analyzing entire processes, and integrating new tasks that arise. By shifting from an individual-centered vision to a collective and systemic approach, businesses could finally measure the real impact of AI with solid figures.

Image credit: Image generated by artificial intelligence via ChatGPT (OpenAI)

Bertrand DUPERRIN
https://www.duperrin.com/english
Head of People and Business Delivery @Emakina / Former consulting director / Crossroads of people, business and technology / Speaker / Compulsive traveler