Federal acquisition is extraordinarily good at one thing: buying a product against a specification. Define what you need, compete it, evaluate proposals against the spec, award to the best value, hold the vendor to the terms. That machine has procured federal IT for decades and it works. It is now being asked to buy something that does not fit its core assumption — a capability that changes after it is purchased. AI is not a product with a fixed specification; it is a system whose behavior evolves with every model update and improves with every cycle of feedback. The acquisition system can buy the AI of the day you wrote the spec. It struggles to buy the AI you will actually be running a year later, and that gap is quietly sorting agencies into those that can field AI and those that cannot.

A product has a spec; a capability has a trajectory

The mismatch starts with what is being bought. A traditional software acquisition specifies features: the system shall do X, Y, and Z. The vendor builds to the spec, the agency tests against it, and conformance is the measure of success. The spec is the contract.

An AI capability does not have a stable feature set in that sense. What it can do today is a snapshot of a trajectory; the same system will be more capable in six months because the model improved, the retrieval got tuned, and the feedback loop did its work. If the agency writes a fixed-feature spec and holds the vendor to it, it has frozen the capability at its least-capable moment and contractually prevented the improvement that was the entire point of buying AI. The acquisition succeeds on paper and fails on mission.

The fixed-specification trap

The trap has a predictable shape. An agency, working within the acquisition system it knows, specifies an AI capability the way it would specify software: enumerate the functions, set acceptance criteria against them, evaluate conformance. Several failure modes follow, and they are structural rather than the fault of any contracting officer.

What to specify instead of features

The answer is not to abandon specification — federal acquisition requires it and should. The answer is to specify the right things: the properties that must hold regardless of how the capability evolves, rather than the features as they exist today.
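One way to picture the difference is to write the requirement as a set of thresholds that every delivered version must meet, instead of a list of features. The sketch below is only an illustration under assumed names: `OutcomeRequirement`, `unmet_requirements`, and the metrics `task_accuracy` and `audit_log_coverage` are hypothetical and do not come from any acquisition regulation or contract template.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OutcomeRequirement:
    """A property that must hold for every delivered version (hypothetical)."""
    metric: str      # name of the measured property
    minimum: float   # lowest acceptable measured value


def unmet_requirements(requirements, measured):
    """Return the metrics that fall below their minimum.

    A metric missing from `measured` counts as unmet, so a new version
    cannot pass simply by not reporting a property.
    """
    return [
        r.metric
        for r in requirements
        if measured.get(r.metric, float("-inf")) < r.minimum
    ]


# Hypothetical requirements: they say nothing about features, only about
# properties that must survive every model update.
requirements = [
    OutcomeRequirement("task_accuracy", 0.90),
    OutcomeRequirement("audit_log_coverage", 1.0),
]

# Two illustrative versions of the same capability.
version_1 = {"task_accuracy": 0.92, "audit_log_coverage": 1.0}
version_2 = {"task_accuracy": 0.95}  # more accurate, but stopped reporting audit coverage

print(unmet_requirements(requirements, version_1))  # []
print(unmet_requirements(requirements, version_2))  # ['audit_log_coverage']
```

The point of the design is that the second version can be more capable than the first and still fail acceptance, because the contract tracks the properties that matter rather than the feature list on the day of award.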

The evaluation problem the framework wasn't built for

Specifying outcomes raises a hard question the acquisition system has not fully solved: how do you evaluate a probabilistic system fairly during source selection? Traditional evaluation rewards demonstrable conformance. An AI capability cannot demonstrate deterministic conformance, because it is probabilistic by design: the same input may yield different outputs, and "correct" is often a matter of degree.

Agencies getting this right are shifting evaluation toward how the system behaves on representative mission tasks, how it handles ambiguity and edge cases, how transparent its reasoning and its failures are, and how strong the governance wrapper around it is. That is a harder evaluation to run than checking a conformance matrix, and it requires evaluators who understand the technology well enough to judge it. The agencies that build that evaluation capability can buy AI competitively and defensibly. The agencies that fall back on conformance scoring will keep awarding to whoever demonstrates the spec best on the day of the demo — which is not the same as whoever will serve the mission best over the life of the contract.
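A minimal sketch of what scoring behavior on representative tasks could look like, as opposed to checking a conformance matrix: run each task several times, record how often the output is acceptable, and report the pass rate with an uncertainty range instead of a single pass/fail. Everything here is an assumption for illustration. The `evaluate` function, the task format, and the choice of a Wilson score interval are not drawn from any agency's actual source-selection procedure.

```python
import math
from typing import Callable


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    denom = 1 + z**2 / trials
    centre = (p + z**2 / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


def evaluate(
    system: Callable[[str], str],
    tasks: list[tuple[str, Callable[[str], bool]]],
    runs_per_task: int = 20,
) -> dict:
    """Run every task repeatedly and summarise how often the output is acceptable.

    `system` stands in for the capability under evaluation; each task pairs a
    prompt with a judge function that decides whether one output is acceptable.
    """
    successes = 0
    trials = 0
    for prompt, is_acceptable in tasks:
        for _ in range(runs_per_task):
            trials += 1
            if is_acceptable(system(prompt)):
                successes += 1
    low, high = wilson_interval(successes, trials)
    return {
        "pass_rate": successes / trials,
        "ci_low": low,
        "ci_high": high,
        "trials": trials,
    }
```

Repeating each task matters because a single run of a probabilistic system says little; the interval makes explicit how much a small demo can and cannot show. In practice the judge functions are the hard part, since they encode what the evaluators consider acceptable on mission tasks.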

Acquisition skill as a fielding advantage

The uncomfortable conclusion is that an agency's ability to field AI is becoming a function of its acquisition skill as much as its technical skill. Two agencies with identical missions and identical budgets will diverge sharply based on whether their acquisition approach can buy a capability that changes. The one that specifies outcomes and governance, secures evaluation and exit rights, and builds the muscle to evaluate probabilistic systems will field AI that improves on contract. The one that specifies features and tests conformance will field AI that was already obsolete at award and is contractually prevented from getting better. The acquisition system can buy AI. It just cannot buy it the way it buys software — and the agencies that internalize that distinction first will be the ones with working AI while their peers are still writing the spec.[2]