Doddington's Zoo and Fingerprint System Security - Noblis

Doddington's Zoo and Fingerprint System Security

Ron Sutton 

BearingPoint, Inc 

April 19, 2007

This Presentation is BearingPoint Proprietary Information © 2007 BearingPoint, LLC.

Overview 

– MINEX04 results imply low expectations for FMR (~1%) using interoperable minutiae templates at a reasonable FNMR.

– Biometric practitioners believe that biometrics are stronger than these numbers imply, but no analytical methods are available to prove this.

We need a better approach to evaluating biometric algorithm performance.


What is FMR? 

– FMR is the False Match Rate 

– Typically computed by cross-comparing biometric samples from distinct individuals and analyzing the results

$$\mathrm{FMR} = \frac{\sum_{i=1}^{N} \#\mathrm{FalseMatches}_i}{N(N-1)}$$

– This represents average performance over an entire population

– But are populations really uniform with respect to FMR?
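As a minimal sketch of the cross-comparison described above, assume a hypothetical square matrix `scores` in which `scores[i][j]` is the score obtained when subject i is compared against subject j, and assume a non-mate comparison counts as a false match when its score meets or exceeds the threshold. The function name and the toy data are illustrative only and do not come from the presentation.

```python
def false_match_rate(scores, threshold):
    """Average FMR over all ordered non-mate pairs: sum_i FM_i / (N * (N - 1))."""
    n = len(scores)
    false_matches = 0
    for i in range(n):
        for j in range(n):
            # Skip the diagonal: those are mate comparisons, not non-mates.
            if i != j and scores[i][j] >= threshold:
                false_matches += 1
    return false_matches / (n * (n - 1))


# Toy 3-subject matrix (made-up values); the diagonal holds mate scores.
toy_scores = [
    [90, 5, 22],
    [7, 85, 3],
    [25, 4, 95],
]
toy_fmr = false_match_rate(toy_scores, threshold=20)  # 2 false matches out of 6 pairs
```

Because the result is a single population-wide average, it says nothing about whether the two false matches in the toy data are spread evenly or concentrated in one pair of subjects.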

What does FMR really tell us? 

FMR represents the average probability of a false match of any randomly selected individual against any other. But this is not how credentialing systems work.

Applying FMR to Credentialing Systems 

FMR does not tell us the probability of a false match of any randomly selected individual against a specific other.

Doddington’s Zoo 

– Doddington et al. say:

─ “In our model, sheep dominate the population and systems perform nominally well for them.”

─ “Goats tend to adversely affect the performance of systems by accounting for a disproportionate share of the missed detections. The goat population can be an especially important problem for entry control systems, where it is important that all users be reliably accepted.”

─ “Lambs, in our model, are those speakers who are particularly easy to imitate.”

─ “Wolves, in our model, are those speakers who are particularly successful at imitating other speakers.”

Doddington’s Zoo & Fingerprints 

– Sheep – persons whose biometric performs well but who are not necessarily more likely to match non-mates

FNMR sheep < FNMR MINEX

FMR sheep ≈ FMR MINEX

– Goats – persons whose biometric performs poorly but who are not necessarily more likely to match non-mates

FNMR goats >> FNMR MINEX

FMR goats ≈ FMR MINEX

– Lambs & Wolves – persons whose biometric is more likely than those of others to match or be matched by non-mates

FNMR lambs/wolves FMR MINEX 



Zoo Population Sizes 

– What percentage of the transportation worker population falls into each ‘species’ of Doddington’s Zoo?

In NISTIR 7271, “The Myth of Goats: How many people have fingerprints that are hard to match?” (13 September 2005), researchers Austin Hicklin, Craig Watson and Brad Ulery conclude:

“The definition of a Goat, or person whose fingerprints are intrinsically hard to match, varies. However, results clearly show that the proportion of Goats is very small, regardless of the definition. None of the 6,000 subjects had fingers that were always hard to match (with single-finger mate scores worse than a threshold corresponding to a verification False Accept Rate of 1%); less than 0.05% of the subjects had fingers that were usually hard to match; less than 0.3% of the subjects had fingers that were hard to match even a quarter of the time.

Many individuals were particularly easy to match: for 77% to 81% of subjects, every fingerprint comparison had mate scores better than a threshold corresponding to a verification False Accept Rate of 10^-6 (0.0001%).”

(Emphasis added by presentation author)

Applying the “Zoo” to System Design 

– If the ‘zoo’ concept really applies to fingerprints, then the ROC curves for various subpopulations are not identical. For example, some hypothetical algorithm may perform as shown below.

Sheep and Goats – 90% of population: FNMR = 0.001 at FMR of 0.000001

Wolves and Lambs – 10% of population: FNMR = 0.2 at FMR of 0.001

Applying the “Zoo” to System Design 

If Sheep/Goat and Wolf/Lamb subpopulations are disjoint and subpopulation membership can be determined at time of credential issuance, it can be recorded on the credential and used to establish a variable matching threshold.

─ Credentials marked as Sheep or Goat would have lower matching threshold

─ Credentials marked as Wolf/Lamb would have higher matching threshold

─ Credentials marked as ‘Indeterminate’ would have higher matching threshold

This approach produces a lower system FNMR at a given FMR. 

$$\mathrm{FMR}_{System} = \frac{N_{S/G} \cdot \mathrm{FMR}_{S/G} + N_{W/L} \cdot \mathrm{FMR}_{W/L}}{N_{S/G} + N_{W/L}}$$
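As a rough worked example of the population-weighted system FMR, the sketch below plugs in the hypothetical operating points from the previous slide (90% Sheep/Goats at an FMR of 0.000001, 10% Wolves/Lambs at an FMR of 0.001). The population size of 10,000 and the function name are assumptions made purely for illustration.

```python
def system_fmr(n_sg, fmr_sg, n_wl, fmr_wl):
    """Population-weighted FMR across the Sheep/Goat and Wolf/Lamb groups."""
    return (n_sg * fmr_sg + n_wl * fmr_wl) / (n_sg + n_wl)


# Hypothetical population of 10,000 split 90% / 10% as on the previous slide.
blended_fmr = system_fmr(n_sg=9000, fmr_sg=0.000001, n_wl=1000, fmr_wl=0.001)
# blended_fmr is about 0.0001009, i.e. roughly 0.01%
```

With these numbers the Wolf/Lamb term supplies almost all of the blended rate, which suggests why a higher threshold for that group alone could move the system figure.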

What more do we need to know? 

– Anecdotal evidence supports Zoo application to fingerprints

– Population topology is unknown

─ Are sheep/goats and wolves/lambs disjoint sets?

─ Is there a single wolf/lamb subpopulation, or many?

─ Are there any persons who tend to match everyone more frequently?

– Hard data will allow more definitive statements about true strength of biometric security solutions and provide architects with additional tools to meet system requirements

– Reanalysis of MINEX results could produce data needed

– Applicable to TWIC, RT, HSPD-12, others

Analysis of NIST IAD Biometric Scores Set 

Dataset properties 

– 6000 subjects 

– Left & Right index fingers 

– Each subject is matched against himself and all others 

– Scores for each compare included in dataset 

Problems with this dataset 

– Not enough data to assess ‘Goat’ population 

– No multiple sample compares 

– No images or image quality scores 

– Algorithm is not particularly strong 



Performance data – Left Index only 

Mate   Mate    Non-Mate  Non-Mate  Non-Mates with                 False Matches at
Score  Scores  Score     Scores    scores below    FNMR   FMR     this threshold
10     72      10        1893647   31748647        5.3%   17.06%  6,139,000
11     50      11        1298575   33047222        6.5%   11.79%  4,245,353
12     60      12        893047    33940269        7.4%   8.19%   2,946,778
13     54      13        615514    34555783        8.4%   5.71%   2,053,731
14     53      14        426652    34982435        9.3%   4.00%   1,438,217
15     53      15        297757    35280192        10.1%  2.81%   1,011,565
16     47      16        208379    35488571        11.0%  1.98%   713,808
17     56      17        146547    35635118        11.8%  1.40%   505,429
18     61      18        103996    35739114        12.7%  1.00%   358,882
19     40      19        73938     35813052        13.8%  0.71%   254,886
20     57      20        52221     35865273        14.4%  0.50%   180,948
21     64      21        37412     35902685        15.4%  0.36%   128,727
22     45      22        27006     35929691        16.4%  0.25%   91,315
23     53      23        18836     35948527        17.2%  0.18%   64,309
24     46      24        13399     35961926        18.1%  0.13%   45,473
25     56      25        9597      35971523        18.8%  0.09%   32,074
26     62      26        6823      35978346        19.8%  0.06%   22,477
27     41      27        4764      35983110        20.8%  0.04%   15,654
28     58      28        3318      35986428        21.5%  0.03%   10,890
29     56      29        2352      35988780        22.5%  0.02%   7,572
30     65      30        1599      35990379        23.4%  0.01%   5,220
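The FMR and false-match columns in this table appear to be related by simple arithmetic. The sketch below assumes the denominator is the N(N − 1) = 6000 × 5999 ordered non-mate comparisons implied by the FMR formula and the 6000-subject dataset; the variable names are illustrative. Raising the threshold by one score removes the non-mate comparisons that scored exactly at the old threshold.

```python
N_SUBJECTS = 6000
NON_MATE_COMPARISONS = N_SUBJECTS * (N_SUBJECTS - 1)  # 35,994,000

# First rows of the table: number of non-mate comparisons at each score.
non_mate_counts = {10: 1_893_647, 11: 1_298_575, 12: 893_047}
false_matches = {10: 6_139_000}  # false matches at threshold 10, from the table

# Step the threshold up: comparisons scoring exactly at the old threshold drop out.
for score in (10, 11):
    false_matches[score + 1] = false_matches[score] - non_mate_counts[score]

fmr_percent = {
    t: round(100 * fm / NON_MATE_COMPARISONS, 2) for t, fm in false_matches.items()
}
# false_matches -> {10: 6139000, 11: 4245353, 12: 2946778}
# fmr_percent   -> {10: 17.06, 11: 11.79, 12: 8.19}
```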

Score Frequencies 

[Figure: Mate & Non-Mate Score Frequencies. Frequency plotted against Score (0 to 400) for two series, Non-Mate Scores and Mate Scores; one frequency axis runs from 0 to 2,000,000 and the other from 0 to 80.]

Analysis Methodology 

Based on score frequency analysis, a base threshold of 20.0 was selected. This yields overall performance statistics of:

FNMR = 14.4%

FMR = 0.5%

This threshold was used to count the number of times each subject matched, or was matched by, another subject.
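A minimal sketch of the counting step, assuming the same hypothetical score-matrix layout as before: `scores[i][j]` is the score when subject i (in the role of the person presenting a finger) is compared against subject j (in the role of the credential holder). The orientation, the function name, and the toy data are assumptions, not details taken from the dataset.

```python
def count_false_matches(scores, threshold):
    """Per-subject counts of non-mate matches at or above the threshold."""
    n = len(scores)
    matched_others = [0] * n      # times subject i matched someone else
    matched_by_others = [0] * n   # times subject j was matched by someone else
    for i in range(n):
        for j in range(n):
            if i != j and scores[i][j] >= threshold:
                matched_others[i] += 1
                matched_by_others[j] += 1
    return matched_others, matched_by_others


# Toy 3-subject matrix (made-up values); the diagonal holds mate scores.
toy_scores = [
    [90, 5, 22],
    [30, 85, 3],
    [25, 4, 95],
]
toy_matched_others, toy_matched_by_others = count_false_matches(toy_scores, 20.0)
# toy_matched_others    -> [1, 1, 1]
# toy_matched_by_others -> [2, 0, 1]
```

Keeping the two directions separate is what allows a subject to be scored independently as a candidate Wolf and as a candidate Lamb.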

Lamb & Wolf Analysis 

Rather than FNMR & FMR, this analysis examined performance characteristics that could be measured and controlled for in a credentialing system.

– Lambs – subjects who were matched by others at a rate 4x the mean FMR for the overall population (characterized by False Match By Others Rate, FMBOR)

– Wolves – subjects who matched others at a rate 4x the mean FMR for the overall population (characterized by False Match Of Others Rate, FMOOR)

– The analysis identified 318 Wolves and 314 Lambs.

– These groups had 80 members in common.
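The classification rule can be sketched as follows, assuming per-subject counts of false matches in each direction are already available and reading "a rate 4x the mean FMR" as "at least 4 times the mean" (that reading, the function name, and the toy counts are assumptions).

```python
def classify_wolves_and_lambs(matched_others, matched_by_others, multiple=4.0):
    """Return (wolves, lambs) as sets of subject indices."""
    n = len(matched_others)
    per_subject = n - 1  # each subject takes part in n - 1 non-mate comparisons per role
    fmoor = [count / per_subject for count in matched_others]     # False Match Of Others Rate
    fmbor = [count / per_subject for count in matched_by_others]  # False Match By Others Rate
    mean_fmr = sum(matched_others) / (n * (n - 1))
    cutoff = multiple * mean_fmr
    wolves = {i for i, rate in enumerate(fmoor) if rate >= cutoff}
    lambs = {i for i, rate in enumerate(fmbor) if rate >= cutoff}
    return wolves, lambs


# Toy counts for 10 subjects: subject 0 matches three others, subject 1 is matched three times.
toy_matched_others = [3, 0, 0, 0, 1, 1, 0, 0, 0, 0]
toy_matched_by_others = [0, 3, 1, 1, 0, 0, 0, 0, 0, 0]
toy_wolves, toy_lambs = classify_wolves_and_lambs(toy_matched_others, toy_matched_by_others)
toy_overlap = toy_wolves & toy_lambs
# toy_wolves -> {0}, toy_lambs -> {1}, toy_overlap -> set()
```

On the real dataset the corresponding set intersection would be the 80 subjects reported as common to both groups, roughly a quarter of either group (80/318 and 80/314).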

Using FMBOR and FMOOR in a system 

In a credentialing system, the only points at which decisions can be made are at card personalization and at the point of use.

When anyone attempts to match a Lamb, credential consumers can detect a ‘Lamb’ indicator on the credential and raise the matching threshold. Lambs can be identified empirically in a credentialing system.

Wolves are a bit more difficult; since they tend to match others we would need to identify them at the point of use based on image characteristics. Whether this is possible is not yet known.

Application of higher thresholds 

Once Wolf and Lamb groups were identified, data was reanalyzed using higher thresholds when the person in the role of a credential holder was a Lamb or the person attempting to use the credential was a Wolf. Other subjects were considered to be Sheep, and the original threshold was used for them. Although the same threshold was used for Lambs and Wolves in this analysis, preliminary data suggests that if different thresholds are used, the Lamb threshold should be used for a person who is both a Lamb and a Wolf.

                   Uniform threshold   Raised Wolf/Lamb threshold
Base Threshold     20                  20
# Sheep            5368                5368
# Wolves           318                 318
# Lambs            314                 314
Sheep Threshold    20                  20
Wolf Threshold     20                  25
Lamb Threshold     20                  25
Sheep FNMR         15.44%              15.44%
Wolf FNMR          2.20%               3.14%
Lamb FNMR          5.41%               7.32%
Sheep FMR          0.36%               0.36%
Wolf FMOOR         1.40%               0.29%
Lamb FMBOR         2.73%               0.63%
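The threshold rule used in this reanalysis can be sketched as a small decision function. The defaults of 20.0 and 25.0 are the values from the analysis; the function names and the idea of passing two boolean flags are illustrative assumptions, and nothing here addresses how a Wolf would actually be detected at the point of use.

```python
def matching_threshold(holder_is_lamb, presenter_is_wolf, base=20.0, raised=25.0):
    """Pick the raised threshold if the holder is a Lamb or the presenter is a Wolf."""
    if holder_is_lamb or presenter_is_wolf:
        return raised
    return base


def accept(score, holder_is_lamb, presenter_is_wolf):
    """Accept the comparison when the score meets the applicable threshold."""
    return score >= matching_threshold(holder_is_lamb, presenter_is_wolf)


# A score of 22 passes for a Sheep credential but not for a Lamb credential.
sheep_result = accept(22, holder_is_lamb=False, presenter_is_wolf=False)  # True
lamb_result = accept(22, holder_is_lamb=True, presenter_is_wolf=False)    # False
```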

Varying the Thresholds 

[Figure: three panels titled "Lamb FNMR & FMBOR vs Threshold", "Sheep FNMR & FMR vs Threshold" and "Wolf FNMR & FMOOR vs Threshold". Each panel plots two series (Lamb FNMR and Lamb FMBOR; Sheep FNMR and Sheep FMR; Wolf FNMR and Wolf FMOOR) against Threshold (10 to 30) on a vertical axis from 0.00% to 20.00%.]

Impact on Overall System FMR 

All matched against threshold of 20.0

Matched   Sheep    Wolves   Lambs    Totals
Sheep     80,149   7,520    29,936   117,605
Wolves    30,896   4,116    18,367   53,379
Lambs     13,787   1,812    7,895    23,494

Wolves & Lambs matched against threshold of 25.0

Matched   Sheep    Wolves   Lambs    Totals
Sheep     80,051   7,518    5,682    93,251
Wolves    6,023    920      5,496    12,439
Lambs     2,375    382      2,097    4,854

194,478 FMs using uniform threshold of 20.0

110,544 FMs when FMBOR and FMOOR used to apply a higher threshold

Quick estimate of system FMR ceiling = 0.31%, versus 0.5%
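The quick estimate can be reproduced with a short calculation. This sketch assumes the denominator is the 6000 × 5999 ordered non-mate comparisons in the dataset; the variable and function names are illustrative. The result is a ceiling because, as the Caveats note, false matches by subjects who are both Lamb and Wolf are counted twice.

```python
N_SUBJECTS = 6000
NON_MATE_COMPARISONS = N_SUBJECTS * (N_SUBJECTS - 1)  # 35,994,000

# False-match counts copied from the two tables (columns: Sheep, Wolves, Lambs).
uniform_20 = {
    "Sheep": [80_149, 7_520, 29_936],
    "Wolves": [30_896, 4_116, 18_367],
    "Lambs": [13_787, 1_812, 7_895],
}
raised_25 = {
    "Sheep": [80_051, 7_518, 5_682],
    "Wolves": [6_023, 920, 5_496],
    "Lambs": [2_375, 382, 2_097],
}


def total_false_matches(table):
    """Sum every cell of a false-match table."""
    return sum(sum(row) for row in table.values())


def fmr_ceiling_percent(table):
    """Upper-bound FMR in percent; Lamb-and-Wolf subjects are double counted."""
    return 100 * total_false_matches(table) / NON_MATE_COMPARISONS


ceiling_raised = round(fmr_ceiling_percent(raised_25), 2)  # 0.31
```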

Observations 

– Wolves and Lambs are not the same group, though they do overlap (by ~25% in this dataset)

– Wolves and Lambs had significantly lower FNMRs than Sheep at the same threshold

– If a person is both a Wolf and a Lamb, more advantage is obtained by treating the person as a Lamb

– It appears that this may be a fruitful approach to improving the security of credential-based access control systems

Caveats 

– Parameter selections were arbitrary; further analysis may reveal greater benefits at other settings

– Analysis was based on a single dataset using a single matching algorithm; the effect discerned may not be present in the real world

– The relatively weak discriminative power of the algorithm used to generate the scores tends to disguise the quantitative possibilities of this approach

– Note that system FMR calculation is complicated by the fact that FMs by subjects who are both Lamb and Wolf are counted twice

Current Work 

– Defining what a ‘subpopulation’ means in rigorous mathematical terms

– Determining whether Lamb & Wolf populations have further subdivisions

– Determining whether there are ‘Wolves at Large’; these are persons who tend to match random individuals in the Sheep population

Next Steps 

– Acquire dataset that lends itself better to broader analysis

─ Images would allow evaluation with a variety of algorithms and analysis of pattern class correlation with subpopulations

─ Multiple samples from each subject would allow goat analysis and improve confidence in other subpopulation identification

– Determine whether effect is present in other populations and with other matching algorithms

– Determine whether subpopulation membership varies with matching algorithm

– Seek common image characteristics that could be used to algorithmically identify subpopulation membership

– Quantify benefits, if any, with matching algorithms that might be deployed in the real world

Want to help? 

– Seeking significant sets of fingerprint image data

─ Multiple samples from each finger

─ Multiple samples from each individual (ideally, 10 prints)

─ Largest populations possible

─ Prefer high quality data for concept evaluation

─ Mixed quality data for real-world performance analysis

