Stop Trusting Identity Verification Accuracy Claims. Start Testing Them

Shufti June 4, 2026 6 minute read

01 TL;DR
02 What does "99% accuracy" actually mean?
03 What independent testing actually found
04 What a trustworthy accuracy claim looks like
05 Four questions to ask before signing a contract
06 How Shufti approaches accuracy

TL;DR

In the 2025 DHS RIVR benchmark, 6 of 7 document validation systems fell short of the threshold for fraud acceptance, false rejection, or both.
Vendor accuracy claims mean little without disclosure of the document mix, region, lighting conditions, and attack class.
Independent testing by accredited labs, such as conformance testing under ISO/IEC 30107-3, is what separates real accuracy from marketing copy.
The right question isn’t “what’s your accuracy?” It’s “tested under what conditions, on which documents, by whom?”

Every identity verification vendor will tell you they’re accurate. Most cite a figure above 99%. Some go further: “industry-leading,” “AI-powered accuracy that outperforms human review,” “the most accurate system available.” The claims are everywhere, and they sound the same.

In February 2026, the U.S. Department of Homeland Security published results from its Remote Identity Validation Rally (RIVR), one of the most rigorous independent benchmarks ever run on commercial IDV systems. The findings were blunt: 6 out of 7 identity document validation systems tested fell short of the performance threshold for fraud acceptance, false rejection, or both. Only one system met the security target.

If your vendor selection relied on accuracy figures from a sales presentation, the RIVR results are worth sitting with.

What does “99% accuracy” actually mean?

Accuracy in identity verification is not a single number. It is a function of the conditions under which a system was tested, and most vendors do not disclose those conditions.

The variables vendors quietly omit

When a vendor quotes 99% accuracy, the first question is: 99% on what? A model trained primarily on US driver’s licences and EU passports returns very different results when it encounters a Thai national ID, an Indonesian KTP, or an Emirati residence permit. Document type mix is the single most influential variable in any identity verification accuracy benchmark, and it is rarely disclosed upfront.

Other variables that shift accuracy figures significantly: lighting conditions during image capture, camera hardware (a flagship smartphone versus a mid-range device), whether the test set included fraudulent documents or only genuine ones, and the demographic spread of the test population. A number stripped of these parameters is not a number you can act on.

Product validation accuracy vs. operational accuracy and why they diverge

There is a meaningful distinction between two accuracy measures that vendors report, often without flagging the difference. Product validation accuracy reflects how a model performs in controlled, balanced conditions, typically on the dataset the vendor used to build and test its system. Operational accuracy is how the same system performs on your real users, with your document mix, in your target markets.

The gap between the two is predictable and, in high-diversity markets, substantial. A system achieving 99% in a controlled lab environment can score significantly lower when processing a high volume of non-Latin documents or images captured in adverse lighting, without a single line of code changing. The vendor’s marketing copy stays the same. Your conversion rate does not.
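The divergence can be illustrated with simple arithmetic. The sketch below weights per-document-type accuracy by each type's share of traffic. Every figure, document label, and the `blended_accuracy` helper is hypothetical and chosen only to show the mechanism; none of it is a measured result from any vendor or benchmark.

```python
def blended_accuracy(per_type_accuracy, traffic_mix):
    """Weight each document type's accuracy by its share of traffic."""
    total_share = sum(traffic_mix.values())
    if abs(total_share - 1.0) > 1e-9:
        raise ValueError("traffic shares must sum to 1")
    return sum(per_type_accuracy[doc] * share
               for doc, share in traffic_mix.items())


# Hypothetical per-type accuracies (illustrative only, not measured data).
per_type_accuracy = {
    "us_drivers_licence": 0.995,
    "eu_passport": 0.990,
    "indonesian_ktp": 0.900,
    "thai_national_id": 0.880,
}

# Hypothetical lab mix: only the documents the model was tuned on.
lab_mix = {
    "us_drivers_licence": 0.5,
    "eu_passport": 0.5,
    "indonesian_ktp": 0.0,
    "thai_national_id": 0.0,
}

# Hypothetical field mix for a business operating mostly in Southeast Asia.
field_mix = {
    "us_drivers_licence": 0.1,
    "eu_passport": 0.1,
    "indonesian_ktp": 0.5,
    "thai_national_id": 0.3,
}

lab = blended_accuracy(per_type_accuracy, lab_mix)
field = blended_accuracy(per_type_accuracy, field_mix)
print(f"lab mix: {lab:.2%}, field mix: {field:.2%}")
```

The model is identical in both runs. Only the document mix changes, which is why a headline figure without its mix says little about a specific deployment.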

What independent testing actually found

Independent benchmarking closes the gap between what vendors claim and what their systems deliver. The DHS RIVR program, run by the Department of Homeland Security’s Science and Technology Directorate in partnership with the National Institute of Standards and Technology (NIST) and the Maryland Test Facility, is the most authoritative public benchmark currently available for commercial IDV technology.

The document validation gap

The Track 2 results, published in February 2026, evaluated 7 anonymised identity document validation systems against genuine and fraudulent documents from Maryland and California, with fraudulent samples supplied by the DHS Homeland Security Investigations forensic laboratory. The performance distribution was stark.

6 of 7 systems fell short on false acceptance rate (FAR), false rejection rate (FRR), or both. One system posted a false rejection rate above 97%, rejecting nearly every genuine document it saw. That number reads as secure until you realise it signals a system that cannot function as an onboarding tool. Another posted a false acceptance rate above 13%, passing a troubling proportion of fraudulent documents as genuine. Only one of the seven met the security threshold as measured by false acceptance rate.
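For readers who want to run this kind of check on their own pilot data, here is a minimal sketch of how FAR and FRR are computed from labelled trials. The `Trial` type, the sample counts, and the `MAX_FAR` and `MAX_FRR` limits are assumptions made for illustration; they are not the RIVR data or the RIVR thresholds.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Trial:
    is_genuine: bool  # ground truth: was the document genuine?
    accepted: bool    # system decision: was the document accepted?


def error_rates(trials: Iterable[Trial]) -> Tuple[float, float]:
    """Return (FAR, FRR) for a set of labelled trials."""
    trials = list(trials)
    genuine = [t for t in trials if t.is_genuine]
    fraudulent = [t for t in trials if not t.is_genuine]
    if not genuine or not fraudulent:
        raise ValueError("need both genuine and fraudulent samples")
    # FAR: share of fraudulent documents that were accepted.
    far = sum(t.accepted for t in fraudulent) / len(fraudulent)
    # FRR: share of genuine documents that were rejected.
    frr = sum(not t.accepted for t in genuine) / len(genuine)
    return far, frr


# Placeholder limits for illustration only (not the RIVR thresholds).
MAX_FAR = 0.10
MAX_FRR = 0.10


def meets_both(far: float, frr: float) -> bool:
    return far <= MAX_FAR and frr <= MAX_FRR


# Illustrative system that rejects almost everything: looks secure, unusable.
over_strict = ([Trial(True, i < 3) for i in range(100)]
               + [Trial(False, False) for _ in range(100)])

# Illustrative system that accepts too much fraud.
over_lenient = ([Trial(True, True) for _ in range(100)]
                + [Trial(False, i < 13) for i in range(100)])

strict_far, strict_frr = error_rates(over_strict)
lenient_far, lenient_frr = error_rates(over_lenient)
print(strict_far, strict_frr, meets_both(strict_far, strict_frr))
print(lenient_far, lenient_frr, meets_both(lenient_far, lenient_frr))
```

Reporting the two rates separately matters: a single "accuracy" figure can hide a system that achieves a low FAR only by rejecting most legitimate users.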

These are not prototype systems. These are commercial identity document validation technologies that, before February 2026, were presumably telling prospective buyers they were accurate.

The selfie-to-ID gap

Track 1 of RIVR evaluated 16 systems on selfie-to-document matching, the biometric step that confirms the person presenting the ID is the person in the photo. Results here were more encouraging but still pointed: only 5 of 16 systems met all of DHS’s performance goals. That is a pass rate of 31% among commercial vendors, for one of the most foundational tasks an IDV system performs.

The takeaway is not that identity verification cannot be done accurately. It is that accuracy at the level vendors claim, across the conditions buyers need, is rarer than the market implies.

What a trustworthy accuracy claim looks like

A trustworthy accuracy claim is specific, independently verified, and attached to a disclosed methodology. That narrows the field considerably.

The standard that now separates real from marketed

For biometric liveness and presentation attack detection, the benchmark that matters is ISO/IEC 30107-3, the international standard for testing whether a face verification system can distinguish a live person from a spoof. iBeta Quality Assurance, accredited by NIST’s National Voluntary Laboratory Accreditation Program (NVLAP), administers conformance testing against this standard.

In June 2025, iBeta introduced Level 3 conformance specifically in response to the rise of AI-generated attacks, including deepfakes and face-swap techniques. Level 3 tests against expert attackers operating with no budget constraints and weeks to attempt a breach, the conditions that most closely replicate a motivated adversary. Very few vendors globally hold Level 3 conformance under this standard, and Shufti is one of them.

For document validation, DHS RIVR is the most rigorous public reference currently available. A vendor that has participated in RIVR and can disclose its system’s results, or identify itself among the Track 1 systems that met all performance goals, is demonstrating a standard of transparency that vendor-commissioned testing cannot replicate.

Four questions to ask before signing a contract

Before accepting any accuracy figure, press the vendor on four points.

What was the document mix in the test set?

If it skewed toward US and EU documents, the number does not represent a globally diverse user base.

Was testing conducted by an accredited independent third party?

Internal benchmarks have structural conflicts of interest; named, accredited labs are the reference points that matter.

What attack classes were included?

A liveness system tested only against printed photos and basic video replay has not been evaluated against the attacks that are relevant today.

Can you share the test methodology, not just the headline figure?

A vendor who can answer that question with specificity is a vendor whose accuracy claim is worth considering.
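The four questions above can be kept as a simple checklist while comparing vendors. The following is a rough sketch under stated assumptions: the `AccuracyClaim` structure, its field names, the attack-class labels, and the lab name are all hypothetical and not part of any standard or vendor API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set

# Hypothetical labels for the basic spoofs named in the third question.
BASIC_ATTACKS = {"printed_photo", "video_replay"}


@dataclass
class AccuracyClaim:
    headline_accuracy: float
    document_mix: Optional[Dict[str, float]] = None  # share per document type
    accredited_lab: Optional[str] = None             # named independent lab
    attack_classes: Set[str] = field(default_factory=set)
    methodology_shared: bool = False


def open_questions(claim: AccuracyClaim) -> List[str]:
    """List which of the four questions the claim leaves unanswered."""
    gaps = []
    if not claim.document_mix:
        gaps.append("document mix not disclosed")
    if not claim.accredited_lab:
        gaps.append("no accredited independent lab named")
    if not claim.attack_classes or claim.attack_classes <= BASIC_ATTACKS:
        gaps.append("attack classes missing or limited to basic spoofs")
    if not claim.methodology_shared:
        gaps.append("methodology not shared")
    return gaps


# A headline figure with nothing behind it leaves all four questions open.
bare = AccuracyClaim(headline_accuracy=0.99)

# A claim with disclosed conditions (all values are placeholders).
detailed = AccuracyClaim(
    headline_accuracy=0.97,
    document_mix={"eu_passport": 0.4, "indonesian_ktp": 0.6},
    accredited_lab="Example Accredited Lab",
    attack_classes={"printed_photo", "video_replay", "deepfake"},
    methodology_shared=True,
)

print(open_questions(bare))
print(open_questions(detailed))
```

Note that the lower headline figure is the more useful one here, because it arrives with the conditions needed to interpret it.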

How Shufti approaches accuracy

The accuracy gap hits hardest in the markets that matter most. Most vendors weren’t built for Vietnam, Indonesia, Brazil, South Asia, or the Gulf. They trained on Western IDs and bolted on everything else, which means their lab figures reflect conditions that do not map to those documents in the field.

Shufti’s document verification was trained natively on 10,000+ document types across 240+ countries and jurisdictions. The model was built from the ground up to handle that range, not patched to accommodate it after the fact. On the biometric side, Shufti holds iBeta Level 3 conformance under ISO/IEC 30107-3, the conformance level iBeta introduced in June 2025 for AI-driven attack defence. Expert attackers, no budget constraints, weeks of attempts. That is the bar the conformance was earned against, and it is a claim you can test.

See how Shufti’s identity verification holds up on your actual document mix.

Frequently Asked Questions

What is the DHS RIVR and why does it matter for IDV buyers?

The Remote Identity Validation Rally (RIVR) is an independent benchmark run by the U.S. Department of Homeland Security S&T, NIST, and the Maryland Test Facility. It tests commercial IDV systems against real and fraudulent documents, giving buyers a vendor-neutral view of accuracy that no sales presentation can replicate.

What does ISO/IEC 30107-3 Level 3 conformance actually test?

Level 3, introduced by iBeta in June 2025, evaluates biometric liveness systems against expert attackers with no budget constraints over weeks of testing. It covers advanced AI-generated attacks including deepfakes and face-swap techniques, the attack classes most relevant to current fraud patterns. Very few vendors globally hold this conformance level.

How do I test an identity verification vendor's accuracy before committing?

Ask for the full methodology behind any accuracy figure: what documents were tested, what attacks were included, whether an accredited independent lab conducted the evaluation, and whether the vendor has results from a public benchmark like DHS RIVR or iBeta. Vendors who can answer these questions in detail are the ones whose numbers are worth trusting.
