Federal Procurement Can't Buy AI the Way It Buys Software

Federal acquisition is extraordinarily good at one thing: buying a product against a specification. Define what you need, compete it, evaluate proposals against the spec, award to the best value, hold the vendor to the terms. That machine has procured federal IT for decades and it works. It is now being asked to buy something that does not fit its core assumption — a capability that changes after it is purchased. AI is not a product with a fixed specification; it is a system whose behavior evolves with every model update and improves with every cycle of feedback. The acquisition system can buy the AI of the day you wrote the spec. It struggles to buy the AI you will actually be running a year later, and that gap is quietly sorting agencies into those that can field AI and those that cannot.

A product has a spec; a capability has a trajectory

The mismatch starts with what is being bought. A traditional software acquisition specifies features: the system shall do X, Y, and Z. The vendor builds to the spec, the agency tests against it, and conformance is the measure of success. The spec is the contract.

An AI capability does not have a stable feature set in that sense. What it can do today is a snapshot of a trajectory; the same system will be more capable in six months because the model improved, the retrieval got tuned, and the feedback loop did its work. If the agency writes a fixed-feature spec and holds the vendor to it, it has frozen the capability at its least-capable moment and contractually prevented the improvement that was the entire point of buying AI. The acquisition succeeds on paper and fails on mission.

"Write a fixed-feature spec for AI and you've frozen the capability at its least-capable moment — and contractually forbidden the improvement that was the reason to buy it."

The fixed-specification trap

The trap has a predictable shape. An agency, working within the acquisition system it knows, specifies an AI capability the way it would specify software: enumerate the functions, set acceptance criteria against them, evaluate conformance. Several failure modes follow, and they are structural rather than the fault of any contracting officer.

The spec ages out before award. Federal acquisition timelines run long. An AI capability specified at the start of a procurement can be a generation behind by award. The agency buys, on day one, a system already eclipsed by what is available on the market the day the contract is signed.
Conformance testing measures the wrong thing. Testing an AI system against a fixed acceptance script measures whether it does the enumerated tasks, not whether it performs the mission well — and those diverge, because the mission value is in handling the cases the script didn't anticipate.
Improvement becomes a contract dispute. When the vendor updates the model and behavior changes, a fixed-spec contract treats the change as a deviation to be managed rather than the improvement it usually is. The contract structure fights the technology.
Lock-in by omission. A spec that doesn't address model portability, data ownership, or exit terms quietly locks the agency in, because those terms are hardest to negotiate after award when leverage is gone.

What to specify instead of features

The answer is not to abandon specification — federal acquisition requires it and should. The answer is to specify the right things: the properties that must hold regardless of how the capability evolves, rather than the features as they exist today.

Outcomes, not functions. Specify the mission outcomes the capability must deliver and the performance thresholds it must meet, and let the vendor's evolving system meet them however it best can. The agency buys results, not a frozen feature list.
Governance properties. Specify the controls that must always hold — access scoping, auditability, human oversight for consequential actions, data handling. These are the properties that protect the mission, and they should be invariant across model updates.
Evaluation rights. Specify the agency's right to continuously evaluate the system's behavior, and the vendor's obligation to disclose material model changes. This replaces one-time acceptance testing with ongoing assurance, which is what a changing capability requires.
Portability and exit. Specify data ownership, output portability, and exit terms up front, while there is competitive leverage. The agency's ability to leave is its strongest protection against a capability that disappoints.
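One way to picture the difference between a feature list and properties that must hold is to write the contract terms as checks that run against every model release. The sketch below is illustrative only: the class names, metric names, and control names are hypothetical and do not come from any acquisition regulation or vendor API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class OutcomeThreshold:
    """A mission outcome with the minimum performance the contract requires."""
    metric: str      # hypothetical metric name, e.g. "triage_accuracy"
    minimum: float   # contractual floor for that metric


@dataclass
class ReleaseReport:
    """What the vendor discloses for one model release (assumed shape)."""
    model_version: str
    metrics: dict[str, float] = field(default_factory=dict)
    controls_verified: set[str] = field(default_factory=set)


def check_release(report: ReleaseReport,
                  thresholds: list[OutcomeThreshold],
                  required_controls: set[str]) -> list[str]:
    """Return findings for a release; an empty list means the terms still hold."""
    findings = []
    # Outcome terms: each required metric must be reported and meet its floor.
    for t in thresholds:
        value = report.metrics.get(t.metric)
        if value is None:
            findings.append(f"missing metric: {t.metric}")
        elif value < t.minimum:
            findings.append(f"{t.metric} below threshold: {value} < {t.minimum}")
    # Governance terms: every required control must be verified on every release.
    for control in sorted(required_controls - report.controls_verified):
        findings.append(f"control not verified: {control}")
    return findings


thresholds = [OutcomeThreshold("triage_accuracy", 0.90)]
required_controls = {"access_scoping", "audit_log", "human_review"}

report = ReleaseReport(
    model_version="v2",
    metrics={"triage_accuracy": 0.93},
    controls_verified={"access_scoping", "audit_log"},
)
print(check_release(report, thresholds, required_controls))
```

Nothing in the check refers to how the system produces its results, so the vendor can change the model freely as long as the thresholds and controls continue to hold.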

The evaluation problem the framework wasn't built for

Specifying outcomes raises a hard question the acquisition system has not fully solved: how do you evaluate a probabilistic system fairly during source selection? Traditional evaluation rewards demonstrable conformance. An AI capability cannot demonstrate deterministic conformance, because it is probabilistic by design — the same input may yield different outputs, and 'correct' is often a matter of degree.

Agencies getting this right are shifting evaluation toward how the system behaves on representative mission tasks, how it handles ambiguity and edge cases, how transparent its reasoning and its failures are, and how strong the governance wrapper around it is. That is a harder evaluation to run than checking a conformance matrix, and it requires evaluators who understand the technology well enough to judge it. The agencies that build that evaluation capability can buy AI competitively and defensibly. The agencies that fall back on conformance scoring will keep awarding to whoever demonstrates the spec best on the day of the demo — which is not the same as whoever will serve the mission best over the life of the contract.
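As a rough illustration of what scoring a probabilistic system can look like, the sketch below runs each representative task several times and compares a conservative estimate of the pass rate against a threshold, rather than checking a single demo run. It is a minimal sketch under stated assumptions: the function names, the grader callback, and the number of runs are all hypothetical, and the Wilson score interval is one common statistical choice among several, not a method the article prescribes.

```python
import math
from typing import Callable, Sequence


def wilson_lower_bound(successes: int, trials: int, z: float = 1.96) -> float:
    """Lower end of the Wilson score interval for an observed pass rate."""
    if trials == 0:
        return 0.0
    p = successes / trials
    denom = 1 + z * z / trials
    centre = p + z * z / (2 * trials)
    margin = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials))
    return (centre - margin) / denom


def evaluate(system: Callable[[str], str],
             tasks: Sequence[str],
             grader: Callable[[str, str], bool],
             runs_per_task: int = 20) -> tuple[int, int]:
    """Run every task several times, since the same input may yield different outputs."""
    successes = trials = 0
    for task in tasks:
        for _ in range(runs_per_task):
            trials += 1
            if grader(task, system(task)):
                successes += 1
    return successes, trials


def meets_threshold(successes: int, trials: int, threshold: float) -> bool:
    """Pass only if the conservative estimate clears the required rate."""
    return wilson_lower_bound(successes, trials) >= threshold


# Stand-in system and grader, for illustration only.
def demo_system(task: str) -> str:
    return task.upper()


def demo_grader(task: str, output: str) -> bool:
    return output == task.upper()


s, n = evaluate(demo_system, ["route claim", "flag anomaly", "draft reply"],
                demo_grader, runs_per_task=5)
print(s, n, meets_threshold(s, n, threshold=0.90))
```

The design point is that a small number of flawless runs is weak evidence. With only 15 trials, even a perfect score does not clear a 90 percent threshold under this test, which is the statistical version of not awarding on the strength of the demo.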

Acquisition skill as a fielding advantage

The uncomfortable conclusion is that an agency's ability to field AI is becoming a function of its acquisition skill as much as its technical skill. Two agencies with identical missions and identical budgets will diverge sharply based on whether their acquisition approach can buy a capability that changes. The one that specifies outcomes and governance, secures evaluation and exit rights, and builds the muscle to evaluate probabilistic systems will field AI that improves on contract. The one that specifies features and tests conformance will field AI that was already obsolete at award and is contractually prevented from getting better. The acquisition system can buy AI. It just cannot buy it the way it buys software — and the agencies that internalize that distinction first will be the ones with working AI while their peers are still writing the spec.^[2]
