An RFP goes out to six qualified suppliers. Three respond. The evaluation committee scores each response, picks a winner, and six months later the selected supplier is missing deadlines, exceeding costs, and delivering work that does not match the proposal. The post-mortem blames the supplier. But the selection system selected them. The outcome was determined the moment the scoring weights were set.

Data from the 2025 Deloitte CPO Survey puts a number on this: organizations that weight cost above 30% in RFP evaluation experience 40% more project failures, 35% more change orders, and 28% longer project timelines compared to those weighting cost at 15–25%. A 15% shift in price weighting changes the winning supplier in one of every three RFPs. The selection criteria are not neutral. They determine the outcome before the first proposal arrives.

65–75%: RFPs that fail to deliver expected outcomes
40%: more project failures when cost weighting exceeds 30%
43%: RFP evaluation criteria that are generic and non-specific

How the failure unfolds: four stages from RFP to wrong supplier

Stage 1: The criteria are set to reward what is easy to measure. Cost is precise. Technical capability is ambiguous. Under pressure to produce an "objective" evaluation, the RFP team increases the cost weighting. Arphie research analyzing 1,200 procurement evaluations across 14 countries found that without a defined scoring rubric, evaluator scores for the same proposal varied by an average of 1.8 points on a 5-point scale. Cost becomes the tiebreaker for every ambiguous qualitative score — and since the qualitative scores are inherently ambiguous, cost decides almost everything.

Stage 2: Performance claims receive equal weight to performance evidence. "Yes, we can integrate with your ERP system" gets the same score as a case study with documented integration benchmarks and reference calls. A content analysis of 98 real-world US design-build highway RFPs, published on ResearchGate, found that 43% of the 540 individual criteria were generic and 53% used vague constructed scales with no explicit point definitions. When criteria are vague, proposal-writing skill, not delivery capability, determines the score.

Stage 3: Cognitive bias tilts the evaluation. A controlled experiment by Dekel and Schurr using real procurement officials, published in the Review of Law & Economics, found that exposing evaluators to bid prices while assessing qualitative components systematically biases scores in favor of the lower bidder. The evaluator does not intend to favor the cheaper supplier. The brain does it automatically once price is visible.

Stage 4: No feedback loop corrects the error. The supplier selected under broken criteria underperforms. The post-mortem identifies supplier failure. The criteria remain unchanged. The next RFP uses the same weighting. Arphie data shows 75% of companies either lack proper supplier KPIs or lack the capability to track them. The system that selected the wrong supplier learns nothing from the failure because it never connects the selection criteria to the outcome.

Most RFP evaluation systems are perfectly designed to select vendors who look good on a spreadsheet but underperform in reality. The criteria reward proposal-writing skill over delivery capability. The scoring process amplifies bias. The feedback loop does not exist.

Root cause 1: cost weighting that overrides everything else

Gartner research cited by Arphie found poor vendor selection accounts for 46% of project failures. The Deloitte data quantifies the cost-weighting threshold: above 30%, failures spike. Yet the typical RFP allocates 35–50% to cost because it is the only criterion that feels objective.

The fix is not to ignore price. It is to cap cost weighting at 20–30% and let technical capability, past performance, and implementation approach carry the remaining 70–80%. The recommended distribution from procurement research: technical/solution fit at 40–50%, past performance and experience at 25–35%, and price at 20–30%.
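
To make the effect of the weights concrete, here is a minimal sketch of a weighted-sum evaluation. The supplier names, criterion names, and 1-5 scores are hypothetical, and price is assumed to be scored so that a cheaper bid earns a higher score. The two weight sets stand in for a cost-heavy RFP and one with cost capped inside the recommended range.

```python
def weighted_score(scores, weights):
    """Return the weighted total of 1-5 criterion scores; weights must sum to 1."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1.0")
    return sum(scores[name] * weight for name, weight in weights.items())


def rank(suppliers, weights):
    """Return supplier names ordered from highest to lowest weighted score."""
    return sorted(
        suppliers,
        key=lambda name: weighted_score(suppliers[name], weights),
        reverse=True,
    )


# Hypothetical scores: Supplier A is stronger but more expensive.
suppliers = {
    "Supplier A": {"technical": 4.5, "past_performance": 4.0, "price": 2.5},
    "Supplier B": {"technical": 3.0, "past_performance": 3.0, "price": 5.0},
}

cost_heavy = {"technical": 0.30, "past_performance": 0.20, "price": 0.50}
capped = {"technical": 0.45, "past_performance": 0.30, "price": 0.25}

print(rank(suppliers, cost_heavy))  # ['Supplier B', 'Supplier A']
print(rank(suppliers, capped))      # ['Supplier A', 'Supplier B']
```

With identical proposals and identical scores, the winner changes only because the weights changed, which is the point of the argument above.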


Root cause 2: generic criteria that cannot discriminate between suppliers

"Demonstrates relevant experience" is not an evaluation criterion. It is a prompt for a vendor to write a paragraph. A real criterion defines what relevant means, how it is scored, and what distinguishes a 3 from a 5. The highway RFP study finding that 43% of criteria were generic means nearly half the evaluation matrix was incapable of discriminating between a supplier with 5 years of direct experience and one with 2 years of adjacent experience.

The analysis of 1,200 procurement evaluations cited above showed that the 1.8-point variance on a 5-point scale disappears when evaluators use a defined rubric: explicit descriptions of what scores 1 through 5 mean for each criterion, agreed before the proposals arrive. A rubric converts "demonstrates relevant experience" into "5 = documented completion of 3+ projects of similar scope and scale in the same industry; 3 = documented completion of 1–2 projects; 1 = no directly comparable project documentation."
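
One way to enforce this is to store the rubric as data and refuse to issue the RFP while any criterion lacks written score definitions. The sketch below is illustrative only: the `Criterion` class, the `undefined_levels` helper, and the 0.30 weight are assumptions, while the level descriptions are the ones from the example rubric above.

```python
from dataclasses import dataclass, field


@dataclass
class Criterion:
    name: str
    weight: float
    # Maps a score level (1-5) to the evidence required to earn it.
    levels: dict = field(default_factory=dict)


def undefined_levels(criterion, required=(1, 3, 5)):
    """Return the required score levels that have no written definition."""
    return [lvl for lvl in required if not criterion.levels.get(lvl, "").strip()]


relevant_experience = Criterion(
    name="Relevant experience",
    weight=0.30,  # hypothetical weight
    levels={
        5: "Documented completion of 3+ projects of similar scope and scale in the same industry",
        3: "Documented completion of 1-2 projects",
        1: "No directly comparable project documentation",
    },
)
generic = Criterion(name="Demonstrates relevant experience", weight=0.30)

print(undefined_levels(relevant_experience))  # []
print(undefined_levels(generic))              # [1, 3, 5]
```

A criterion that returns a non-empty list is still a prompt for a paragraph, not something evaluators can score consistently.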


Root cause 3: evaluator inconsistency amplified by process design

Thirty-seven percent of RFPs show lack of consensus among evaluators, per Arphie data. This is not an evaluator quality problem. It is a process design problem. Three evaluators reading the same proposal through three different mental models will produce three different scores. The fix is not better evaluators. It is calibration before scoring: a session where evaluators score one sample proposal together and reconcile their differences before scoring independently. Automated systems achieve 91% consistency in applying criteria versus 60–70% for manual review without calibration, but even manual processes improve dramatically with a calibration session.
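
A calibration session needs an agenda, and the score spread per criterion provides one. The following sketch assumes three hypothetical evaluators scoring a single sample proposal and uses an assumed threshold of 1.0 point to decide which criteria need discussion.

```python
def score_spread(scores_by_evaluator):
    """Return, per criterion, the gap between the highest and lowest evaluator score."""
    criteria = next(iter(scores_by_evaluator.values())).keys()
    return {
        c: max(s[c] for s in scores_by_evaluator.values())
        - min(s[c] for s in scores_by_evaluator.values())
        for c in criteria
    }


def needs_discussion(scores_by_evaluator, threshold=1.0):
    """List criteria whose spread exceeds the threshold, widest gap first."""
    spread = score_spread(scores_by_evaluator)
    flagged = [c for c, gap in spread.items() if gap > threshold]
    return sorted(flagged, key=lambda c: spread[c], reverse=True)


# Hypothetical scores for one sample proposal.
sample = {
    "evaluator_1": {"technical": 4, "past_performance": 3, "implementation": 2},
    "evaluator_2": {"technical": 4, "past_performance": 5, "implementation": 4},
    "evaluator_3": {"technical": 3, "past_performance": 2, "implementation": 3},
}

print(needs_discussion(sample))  # ['past_performance', 'implementation']
```

The flagged criteria are where the evaluators' mental models diverge, so they are the ones to reconcile before independent scoring begins.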


Root cause 4: the missing feedback loop

Seventy-five percent of companies cannot connect RFP selection criteria to post-award supplier performance because they lack the tracking infrastructure. The supplier that scored highest on "technical capability" during evaluation underperforms on implementation, and the evaluation matrix learns nothing. The same criteria select the same type of supplier next time.

The fix: define KPIs for every weighted criterion during RFP design, not after award. If "technical capability" is worth 40% of the score, define how you will measure technical capability at 6 months and 12 months post-award. Compare actual performance to evaluation scores. Update the criteria and weights based on which evaluation dimensions actually predicted supplier success. This single change — closing the feedback loop — transforms RFP evaluation from a one-time guess into an improving system.
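
Checking which evaluation dimensions predicted supplier success can start as a simple correlation between each criterion's evaluation score and a post-award KPI. This is a rough sketch under stated assumptions: the award history is invented, `kpi_12m` is a hypothetical 0-1 performance measure at 12 months, and four awards are far too few for a real conclusion.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # no variation, so no measurable relationship
    return cov / sqrt(var_x * var_y)


def predictive_validity(history):
    """Correlate each criterion's evaluation score with the 12-month KPI."""
    kpis = [award["kpi_12m"] for award in history]
    criteria = history[0]["eval"].keys()
    return {c: pearson([a["eval"][c] for a in history], kpis) for c in criteria}


# Hypothetical history: one entry per past award.
history = [
    {"eval": {"technical": 4.5, "price": 3.0}, "kpi_12m": 0.90},
    {"eval": {"technical": 3.0, "price": 5.0}, "kpi_12m": 0.55},
    {"eval": {"technical": 4.0, "price": 2.5}, "kpi_12m": 0.85},
    {"eval": {"technical": 2.5, "price": 4.5}, "kpi_12m": 0.50},
]

for criterion, r in predictive_validity(history).items():
    print(f"{criterion}: r = {r:.2f}")
```

In this invented data the technical score tracks later performance and the price score runs against it. With real history, a criterion whose correlation stays near zero is a candidate for a sharper rubric or a lower weight.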

Cost weighting capped at 20–30%
Cap cost at 20-30% of total score. Technical fit (40-50%) and past performance (25-35%) carry the majority. In the Deloitte data, cost weighting above 30% is associated with 40% more project failures.
Defined scoring rubrics
Every criterion has explicit 1-5 definitions agreed before proposals arrive. Eliminates the 1.8-point evaluator variance and makes generic criteria specific.
Blind price evaluation
Score qualitative components before revealing price. Dekel & Schurr found that price visibility biases qualitative scores toward cheaper bidders.
Post-award KPI feedback
Define KPIs for each weighted criterion during RFP design. Compare actual performance to evaluation scores at 6 and 12 months. Update criteria based on predictive validity.

What this means in practice

Audit your last three RFPs for cost-weighting drift. Calculate the actual percentage of total score allocated to cost. If any exceeds 30%, that RFP was structurally biased toward selecting the cheapest supplier regardless of the evaluation narrative. Re-score one of those RFPs with cost capped at 25% and compare the ranking. Timeframe: two weeks.
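
The audit arithmetic is small enough to script. The sketch below computes the share of total points given to cost and, if it exceeds the cap, holds cost at 25% and scales the other criteria up proportionally. The point allocation is hypothetical, and proportional rescaling is only one reasonable way to redistribute the freed weight.

```python
def cost_share(points, cost_key="price"):
    """Fraction of total available points allocated to the cost criterion."""
    return points[cost_key] / sum(points.values())


def cap_cost(points, cap=0.25, cost_key="price"):
    """Return normalized weights with cost held at `cap` when it exceeds it.

    The remaining criteria keep their relative proportions (an assumption).
    """
    total = sum(points.values())
    weights = {k: v / total for k, v in points.items()}
    if weights[cost_key] <= cap:
        return weights
    scale = (1.0 - cap) / (1.0 - weights[cost_key])
    return {k: (cap if k == cost_key else w * scale) for k, w in weights.items()}


# Hypothetical point allocation from a past RFP.
past_rfp_points = {"technical": 30, "past_performance": 25, "price": 45}

print(cost_share(past_rfp_points))  # 0.45
for name, weight in cap_cost(past_rfp_points).items():
    print(f"{name}: {weight:.3f}")
```

Re-scoring then means recomputing each supplier's weighted total with the capped weights and checking whether the ranking changes.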

Convert generic criteria to rubrics. For each evaluation criterion in your current RFP template, write explicit 1-through-5 definitions. Not "assess capability" — "5 = documented delivery of similar scope in same industry with reference verification; 3 = documented delivery of similar scope in adjacent industry; 1 = no documented comparable delivery." If a criterion cannot be rubric-ized, it is not a criterion. It is padding. Timeframe: one month.

Separate price evaluation from qualitative scoring. Score all qualitative criteria first, without knowledge of price. Lock those scores. Then open the pricing section. This eliminates the Dekel-Schurr bias where price visibility unconsciously shifts qualitative scores toward cheaper bidders. Timeframe: implement on the next RFP.
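
If evaluations are recorded in a tool rather than on paper, the score-lock-reveal sequence can be enforced in code. This is a minimal sketch with a hypothetical `BlindEvaluation` class, not a description of any particular procurement system.

```python
class BlindEvaluation:
    """Holds one proposal's scores and withholds price until scoring is locked."""

    def __init__(self, price):
        self._price = price
        self._qualitative = {}
        self._locked = False

    def score(self, criterion, value):
        """Record a qualitative score; rejected once scores are locked."""
        if self._locked:
            raise RuntimeError("qualitative scores are locked")
        self._qualitative[criterion] = value

    def lock(self):
        """Freeze the qualitative scores and return a copy of them."""
        self._locked = True
        return dict(self._qualitative)

    def reveal_price(self):
        """Return the bid price, but only after qualitative scores are locked."""
        if not self._locked:
            raise RuntimeError("lock qualitative scores before opening pricing")
        return self._price


evaluation = BlindEvaluation(price=120_000)  # hypothetical bid
evaluation.score("technical", 4)
evaluation.score("past_performance", 3)
locked_scores = evaluation.lock()
print(locked_scores, evaluation.reveal_price())
```

The guard in `reveal_price` makes the ordering a property of the process instead of relying on evaluators to avoid looking at the pricing section.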

Build the post-award feedback mechanism. For your next RFP, add a column to the evaluation matrix: "How will we measure this at 12 months?" Define the KPI, the data source, and the review date before the RFP goes out. Schedule the 12-month review as a calendar event when the contract is signed. Timeframe: implement on the next RFP, review in 12 months.

Run an evaluator calibration session. Before the next evaluation, have all evaluators independently score one anonymized proposal. Compare scores. Discuss the widest gaps until the group converges on shared definitions of what each score level means. This one-hour session targets the largest source of scoring noise. Timeframe: before the next RFP evaluation.


FAQ

Why do RFPs so often select the wrong supplier?

Four interconnected causes: criteria over-weight cost (above 30% correlates with 40% more project failures), criteria are generic and non-measurable (43% of real-world criteria are generic), scoring is inconsistent (37% of RFPs show evaluator disagreement), and there is no post-award KPI feedback loop. The system selects proposal-writing skill over delivery capability.

What is the optimal cost weighting in RFP evaluation?

The 2025 Deloitte CPO Survey shows cost weighting between 15-25% produces the best outcomes. Above 30%, organizations experience 40% more project failures, 35% more change orders, and 28% longer timelines. Technical/solution fit at 40-50% and past performance at 25-35% should carry the majority of weight.

How can procurement teams fix broken RFP evaluation criteria?

Five fixes: lock criteria and weights before issuing the RFP, use defined scoring rubrics with explicit 1-5 definitions per criterion, evaluate price separately from qualitative components to reduce cognitive bias, reduce generic criteria to below 10% of total weight, and build a post-award KPI feedback loop that updates criteria based on actual supplier performance at 6 and 12 months.
