Intelligence · Point of view

AI bid models that survive a board meeting

What separates a deployable estimator from an academic one. Five tests we run on every bid model before we let it touch a real tender.

21 Mar 2026 · 5 min read · By A. Khoury, Lead, AI Systems

Why most bid models fail in production

Almost every bid model we have audited was built by people who optimised for accuracy on a held-out test set. That is the wrong objective. A contractor's bid model has to survive a board meeting in which a CFO, a chief estimator, and a non-technical chairman are all looking for reasons not to trust it. Held-out accuracy will not save you in that room. Five other things will.

5 · Pre-deployment tests every bid model passes — or it does not ship
−18% · Median bid-cycle time after a passing model goes live
0 · Tenders where the model overrides the chief estimator without flagging

The five tests

1 · Defensibility

For every prediction the model makes, can the chief estimator point at the three to five line items driving the number, in plain language? If the answer is "the model said so," the model does not ship. Defensibility is not a nice-to-have; it is what lets the human in the chair sign the bid.
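As a minimal sketch of what "point at the drivers" can mean in practice: for a model whose bid decomposes additively over line items (a linear model is the simplest case), the top contributors can be ranked directly. The `top_drivers` function, the weights, and the take-off figures below are all illustrative assumptions, not the model described in this article.

```python
# Hypothetical sketch: for an additive bid model, decompose each
# prediction into per-line-item contributions so the chief estimator
# can see the three to five drivers behind any number.

def top_drivers(weights, line_items, k=3):
    """Return the k line items contributing most to the predicted bid."""
    contributions = {
        name: weights[name] * qty for name, qty in line_items.items()
    }
    # Rank by absolute contribution, largest first.
    ranked = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return ranked[:k]

# Illustrative unit rates and take-off quantities (invented for this sketch).
weights = {"steel_t": 1450.0, "concrete_m3": 210.0, "labour_hr": 62.0, "cranage_day": 900.0}
take_off = {"steel_t": 120, "concrete_m3": 800, "labour_hr": 5000, "cranage_day": 30}

for name, value in top_drivers(weights, take_off):
    print(f"{name}: {value:,.0f}")
```

The output reads as plain language an estimator can check against their own intuition: labour hours, then steel tonnage, then concrete volume, with the amounts attached.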

2 · Out-of-distribution behaviour

Bid models are trained on what the company has built. The next tender is, by construction, partly new. The test is not whether the model is accurate on familiar work — it almost always is. The test is whether it knows when to stop talking. We require an explicit confidence band that widens credibly on unfamiliar geometry, materials, or contract types.
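One way to make the band widen credibly, sketched below under strong simplifying assumptions: tie the half-width of the interval to the distance between the new tender's (normalised) features and the nearest historical job. The `confidence_band` function and its parameters are invented for illustration; real implementations would use a calibrated uncertainty method rather than raw nearest-neighbour distance.

```python
# Hypothetical sketch: widen the confidence band with distance from
# the nearest historical job. Features are assumed pre-normalised to
# comparable scales; base_pct and scale are illustrative constants.
import math

def confidence_band(pred, new_feats, history, base_pct=0.05, scale=0.5):
    """Return (low, high) where the half-width grows with novelty."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    nearest = min(dist(new_feats, h) for h in history)
    half_width = pred * (base_pct + scale * nearest)
    return pred - half_width, pred + half_width
```

On familiar work the band stays near the base percentage; on a tender far from anything the company has built, it widens enough to tell the estimator the model is guessing.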

3 · Stability under small input changes

If the take-off changes by 2% on one trade, the bid number should change by something close to 2% on that trade. We have audited models where a 2% input change moved the headline number by 11%, in the wrong direction, because of a non-monotonic interaction the team had not noticed. That model would have lost the tender or, worse, won it.
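This test is mechanical enough to automate. A sketch of the pre-deployment check, assuming the model is callable on a take-off dictionary (the `stability_check` function and its tolerance are assumptions for illustration): bump one trade by 2%, and fail the model if the headline number moves the wrong way or by more than a small multiple of the bump.

```python
# Hypothetical sketch: pre-deployment stability check. Perturb one
# trade's quantity by `bump` and require the headline bid to move in
# the same direction, by no more than `tolerance` times the bump.

def stability_check(model, take_off, trade, bump=0.02, tolerance=2.0):
    base = model(take_off)
    perturbed = dict(take_off)
    perturbed[trade] *= 1 + bump
    rel_change = (model(perturbed) - base) / base
    same_direction = rel_change >= 0        # a bigger take-off must not shrink the bid
    bounded = abs(rel_change) <= tolerance * bump
    return same_direction and bounded
```

The 11%-in-the-wrong-direction model described above would fail this check on the first run, before it ever touched a tender.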

4 · A clean override path

The chief estimator must be able to override any line, with a one-line justification, and have the override propagate through the bid without breaking the audit trail. If the only way to override the model is to abandon it, the team will abandon it on the first contentious bid. Then you have a very expensive spreadsheet.
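A sketch of the minimum record an override path needs, with invented names (`Override`, `apply_overrides`): the estimator's value replaces the model's line, the justification travels with it, and the model's original number stays in the trail.

```python
# Hypothetical sketch: an override replaces one line, carries a
# one-line justification, and preserves the model's value in the
# audit trail instead of overwriting history.
from dataclasses import dataclass

@dataclass(frozen=True)
class Override:
    line: str
    model_value: float
    estimator_value: float
    justification: str

def apply_overrides(model_lines, overrides):
    """Return the final bid lines plus an audit trail of every change."""
    final = dict(model_lines)  # copy: the model's output is never mutated
    trail = []
    for ov in overrides:
        final[ov.line] = ov.estimator_value
        trail.append(ov)
    return final, trail
```

Because the trail keeps both numbers and the justification, the backtest in the next section can later ask which overrides were right, which is what turns overrides from a workaround into training signal.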

5 · A backtest the board can read

Not a confusion matrix. A list of the last twenty bids: what we won, what we lost, what the model would have priced, what we actually priced, and what the project actually cost. On one A4 page. If the model has no story to tell on that page, the board will not approve it for the next tender. They will be right.
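The page itself is just a five-column table. A sketch with invented names (`backtest_page`) and invented figures, showing the layout: one row per bid, outcome, the model's price, the submitted price, and the outturn cost (zero here stands in for "no outturn", since lost bids are never built).

```python
# Hypothetical sketch: the one-page backtest as a plain-text table --
# recent bids, result, model price, submitted price, outturn cost.

def backtest_page(rows):
    header = f"{'Bid':<12}{'Result':<8}{'Model':>12}{'Priced':>12}{'Outturn':>12}"
    lines = [header]
    for bid, result, model, priced, outturn in rows:
        lines.append(f"{bid:<12}{result:<8}{model:>12,.0f}{priced:>12,.0f}{outturn:>12,.0f}")
    return "\n".join(lines)

rows = [
    ("Depot A", "won", 4_120_000, 4_000_000, 4_310_000),   # model warned low margin
    ("Bridge 7", "lost", 9_800_000, 9_200_000, 0),         # priced below the model
]
print(backtest_page(rows))
```

Twenty such rows fit comfortably on one A4 page, and every column is a question a board member already knows how to ask.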

An estimator does not need a model that is always right. They need a model that is wrong in predictable ways and tells them when it might be. — Internal training memo
Figure 1 — Backtest layout, 20-bid review for board approval

What this looks like in deployment

Models that pass these five tests do not look impressive on paper. They are smaller, slower to retrain, and almost always less accurate on a held-out test set than the model the data-science team would have shipped on its own. They also win more tenders, because the chief estimator actually uses them, and the board actually trusts the number.

If you are running a bid-model project that has stalled in pilot, this is usually why. We are happy to walk through the diagnostic on a briefing call.

Adam Khoury
Lead, AI Systems