Field notes · 001

What UAE businesses get wrong about AI in 2026

We spend more time talking clients out of AI than into it. Here's why — and what we'd do instead.

Ghaith Al Abtah · 2026 · 8 min read · ai · practice · uae

A composite of the calls we’ve taken this quarter: a Dubai retailer phoned us. They’d spent eighteen weeks on an AI strategy with a regional integrator. They had a slide deck, an architecture diagram, a vendor short-list, and a budget. They didn’t have anything in production. They wanted to know if we’d take a look.

We did. The diagram was fine. The vendors were the right ones. The architecture was reasonable. The reason nothing had shipped was that nobody on the call could finish the sentence “if this works, the business will be able to…”. They had bought AI. They had not bought a solution to anything.

This is the most common mistake we see in 2026, and it isn’t going away. AI is the only technology category most boards have ever been asked to budget for without a clear use case attached. People who would never approve a database migration without knowing what data and which queries are signing six-figure cheques for “AI capability.” The result is what you’d expect: lots of pilots, very few shipped systems, and a slow loss of credibility for the technology when the conversation turns to outcomes.

What follows is the short list of patterns we see in UAE and GCC engagements right now. We are paid to disagree with our clients when they’re wrong, and these are the disagreements we have most often. None of them are about the technology being bad. They are about the people buying it not yet knowing what to ask for.


01 · You’re buying “AI”, not solving a problem

The phrase “AI strategy” is doing a lot of work. It sounds like a plan, but it isn’t one. A plan names what should be different on a specific date, for a specific user, measured in a specific way. “AI strategy” names a tool category and hopes that’s enough.

When a client tells us they need an AI strategy, our first question is the same every time: “if this works, what changes for someone in your business?” We will sit with that question for an entire call if we have to. Not “we’ll have AI”, but “the support team will close 30% more tickets without adding people,” or “our procurement team will spend two days a month on supplier validation instead of two weeks.” All three are AI problems if you squint. Only the last two tell you what to build.

The vendors won’t help you with this question. Their job is to sell tools. Your job is to know what you’re trying to do.


02 · Pilots that never ship

The pilot is the most expensive thing in enterprise AI right now, and it usually doesn’t even produce learning. Most of what gets called a pilot is, in effect, a demo with a longer timeline: no production data flowing in, no production users on the other end, no plan for what would happen if it worked.

A pilot that can never become production isn’t a pilot — it’s a presentation. The defining feature of a real pilot is that you could promote it to production tomorrow if it worked. You wouldn’t, because pilots are for learning, but you could. If you couldn’t, you’re not learning anything that survives contact with the real system.

Our position is simpler than it sounds: ship something tiny, end-to-end, in production, on day twenty. Smaller than you think. One workflow, one type of user, one measurable outcome. Then iterate. Most “pilots” we see could be replaced with three weeks of focused work on a smaller scope.


03 · The Arabic problem

This one is regional and it doesn’t get enough air time.

Most of the model output your team has tested has been in English. Most of your end users — at least the ones outside the office — will interact with it in Arabic. These are not the same problem. Modern frontier models are good at Arabic in ways they weren’t two years ago, but “good” is doing a lot of work in that sentence. The model that’s flawless in English on your test set may produce dialectal mismatches, religious-context errors, or RTL formatting failures on your real users’ inputs.

We’ve seen this go badly more than once. A customer-facing assistant that’s coherent in MSA but tone-deaf in Emirati or Levantine. A document summariser that handles English contracts beautifully and silently drops sentences from Arabic ones. A search system that ranks Arabic results last because the underlying embeddings were trained on a heavily English-weighted corpus.

If you’re deploying generative AI to Arabic-speaking users in 2026, you need an Arabic evaluation set. Not as a phase-two follow-on. As part of the first deployment, with enough test cases to separate real failures from noise, and an honest answer when the model isn’t ready.
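One way to make an Arabic evaluation set actionable is to score it per language and dialect instead of as one blended number, so a strong English result cannot hide a weak Emirati one. The Python sketch below is a minimal illustration under stated assumptions, not a description of any particular tool: the slice labels (`en`, `ar-msa`, `ar-emirati`), the function names, the sample results, and the 10-point gap threshold are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def pass_rate_by_slice(results: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Return the share of accepted outputs for each slice label."""
    passed: Dict[str, int] = defaultdict(int)
    total: Dict[str, int] = defaultdict(int)
    for label, ok in results:
        total[label] += 1
        if ok:
            passed[label] += 1
    return {label: passed[label] / total[label] for label in total}


def lagging_slices(rates: Dict[str, float],
                   reference: str = "en",
                   max_gap: float = 0.10) -> List[str]:
    """List slices whose pass rate trails the reference slice by more than max_gap."""
    ref = rates[reference]
    return sorted(label for label, rate in rates.items()
                  if label != reference and ref - rate > max_gap)


# Hypothetical graded results: (slice label, did a reviewer accept the output?)
sample = [
    ("en", True), ("en", True), ("en", True), ("en", True),
    ("ar-msa", True), ("ar-msa", True), ("ar-msa", True), ("ar-msa", True),
    ("ar-emirati", True), ("ar-emirati", False),
    ("ar-emirati", False), ("ar-emirati", False),
]

rates = pass_rate_by_slice(sample)
lagging = lagging_slices(rates)
print(rates)
print("Slices that are not ready:", lagging)
```

In this made-up sample the blended pass rate is 75%, which looks tolerable, while the per-slice view shows the Emirati slice passing only one case in four. A real set would need far more than four cases per slice before the gap means anything.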


04 · Ignoring the boring middle

Customer-facing chatbots get most of the budget and most of the attention. They are also, almost without exception, the worst place to start with AI.

The boring middle — internal automation, internal copilots, structured-document processing, supplier-spec extraction, contract triage, expense classification — is where AI pays back fastest, with the lowest risk, and where the data you need actually exists in your business. Nobody writes a press release when a procurement team’s onboarding goes from three days to ninety minutes. But that’s the work that compounds.

We tell clients to start there for three reasons. The mistakes are cheaper — an internal user can flag a bad output before it reaches a customer. The data is cleaner — internal documents are usually more structured than user inputs. And the savings are real — you can measure them against payroll, not against an estimated conversion uplift that nobody can audit.

If you have a customer-facing AI deployment plan, ask yourself what your internal team would do with the same budget. The answer is almost always more useful.


05 · Trust without evals

The single most predictable thing about a system that has never been evaluated is that it will surprise you. Usually badly.

When we look at AI systems that are running in UAE production environments, the most common failure mode is not that the model is bad. It’s that nobody on the team can answer, in writing, “how do we know it’s still working?” There is no evaluation set. There is no regression alert. There is no documented threshold below which the system gets pulled. There is, instead, faith.

Evals, the boring discipline of writing down what good output looks like, scoring real outputs against it on a schedule, and acting on the score, are what turn AI from a demo into a system. Most companies skip this step because it’s slow and nobody writes case studies about it. We don’t skip it. We sign off on AI deployments only after we’ve written the eval framework, run it, and put it on a schedule.
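The three missing pieces named above (an evaluation set, a regression alert, a documented pull threshold) fit in very little code. The Python sketch below is a minimal illustration, not our production framework: `EvalCase`, `score_system`, `decide`, the toy system, the exact-match grader, and the 0.05 and 0.80 thresholds are all assumptions made up for the example.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class EvalCase:
    prompt: str    # input sent to the system under test
    expected: str  # written-down description of a good answer


def score_system(cases: List[EvalCase],
                 system: Callable[[str], str],
                 is_good: Callable[[str, str], bool]) -> float:
    """Run every case through the system and return the fraction judged good."""
    if not cases:
        raise ValueError("the evaluation set is empty")
    passed = sum(1 for case in cases
                 if is_good(system(case.prompt), case.expected))
    return passed / len(cases)


def decide(score: float, baseline: float,
           alert_drop: float = 0.05, pull_below: float = 0.80) -> str:
    """Turn a score into an action, using thresholds agreed before launch."""
    if score < pull_below:
        return "pull"    # below the documented floor: take the system offline
    if baseline - score > alert_drop:
        return "alert"   # regression against the last accepted score
    return "ok"


# Toy stand-ins so the sketch runs; a real system and grader replace these.
def toy_system(prompt: str) -> str:
    return prompt.upper()


def exact_match(output: str, expected: str) -> bool:
    return output == expected


cases = [EvalCase("a", "A"), EvalCase("b", "B"),
         EvalCase("c", "C"), EvalCase("d", "x")]

score = score_system(cases, toy_system, exact_match)
action = decide(score, baseline=0.95)
print(score, action)
```

The design choice that matters is that the thresholds and the resulting action are fixed before launch, so the answer to a falling score is never negotiated after the fact. Running the check on a schedule can be as simple as a nightly job that calls `decide` and notifies whoever owns the system.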

If a vendor or a consultant won’t tell you how they’d measure the model’s continued performance after launch, walk away from the meeting. The model is the easy part. The system around it is what you’re paying for.


What we do differently

The thread through all of this is the same: most AI mistakes in 2026 aren’t technical mistakes. They are mistakes of scope, of measurement, and of saying “yes” when the honest answer is “not yet.”

When we say yes, we scope smaller than feels comfortable, ship faster than expected, evaluate from day one, and build in a stop-using button. The work is less glamorous than what the integrators sell, and it produces fewer screenshots. But it stays in production, which is the only outcome that pays back.

If you’re being asked “what’s our AI strategy?” and you’re not sure how to answer, the right next call probably isn’t with a vendor. It’s a thirty-minute conversation about what would actually change in your business if the technology worked. Once you know that, the rest is engineering.

We are happy to be the people on the other end of that call. We are also happy to tell you that you don’t need us, and the most useful thing we do in that conversation, on the days it goes that way, is name the firm you do need.