Hardware Trojans in FPGA Device IP: Solutions Through Evolutionary Computation
by
Zachary Collins
A Thesis In Partial Satisfaction
of the Requirements for the Degree of
Master of Science
in
Computer Engineering
in the
Graduate Division
of the
University of Cincinnati College of Engineering and Applied Science
Committee in Charge:
Professor Rashmi Jha, Chair
Professor Anca Ralescu
Dr. David Kapp
Spring 2019
Abstract
Evolutionary Removal of Hardware Trojans in FPGA IP
by
Zachary Collins
Master of Science in Computer Engineering
University of Cincinnati
College of Engineering and Applied Science
Prof. Rashmi Jha, Chair
As the hardware supply chain continues to globalize, hardware designers are becoming
more concerned about the security of their designs. Because FPGAs are accessible to nearly
anybody, and few organizations design and manufacture them, the FPGA supply chain is
particularly complex. In recent years, hardware designers and the academic community
have begun considering the possibility of hardware Trojans, malicious modifications to
hardware circuitry, in FPGA IP - the code used to program an FPGA device. As more and
more hardware designers incorporate somewhat unverified IP purchased from 3rd-party
hardware vendors into their designs, the need for a comprehensive model of FPGA security
increases.
Many design strategies for detecting and tolerating Trojans in FPGA devices have been
proposed. Many of these strategies focus on catching Trojans at test-time. This is undesir-
able, as Trojans often employ complex techniques to hide themselves during testing. Some
Trojan tolerance systems have been suggested, but no system exists that will allow FPGA
systems to completely mitigate the effects of, or remove, all types of Trojans from FPGA
IP.
In order to develop a comprehensive system for FPGA protection, it is important to
understand what types of threats might exist in FPGA IP. Trojans can have a variety of
different activation mechanisms and effects, and it is important to thoroughly understand
all of them in order to ensure security of FPGA designs. There exist some taxonomies, or
classifications, of the Trojans that may exist in FPGA hardware, rather than in the IP. Some
of these taxonomies acknowledge the existence of Trojans in FPGA IP, but none provide a
classification of the threats Trojans in IP can pose.
This work presents a comprehensive taxonomy of the payloads (effects) of Trojans in
FPGA IP. This taxonomy is built on the taxonomies of hardware Trojans, and other work
in the field of FPGA IP viruses and Trojans. The goal of this taxonomy is to concisely
categorize and summarize the different threats hardware designers face when integrating
3rd-party IP into their designs, and to provide an analysis of existing mitigation strategies
and their effectiveness against the various types of Trojans. This work also examines which
Trojans that may exist in FPGA IP are relatively unaffected by existing Trojan detection
and tolerance schemes. It is important for hardware designers to be able to design systems
that can tolerate any type of Trojan, not just a small subset of them.
Finally, this work presents a novel Trojan tolerance strategy using genetic program-
ming, a type of biologically-inspired computation. Genetic programming, inspired by ge-
netic crossover and mutation in biological organisms, can be used to modify software and
guide it toward a better solution by iteratively improving on the design using a variety of
biological operations. Because 3rd-party IP is often delivered as hardware design language
(HDL) code, genetic programming is uniquely adept at removing Trojans when they are
detected in the code. Results show that genetic programming can be used to remove a va-
riety of Trojans from FPGA IP. This Trojan tolerance scheme can be used to repair FPGAs
at run-time, without human intervention. This effect is desirable because many FPGAs are
deployed in aerospace and other uptime-sensitive fields, where having to bring a device
down may endanger lives or incur large monetary costs.
understand, v.: To reach a point, in your investigation of some subject, at which you cease
to examine what is really present, and operate on the basis of your own internal model
instead.
/usr/bin/fortune
Acknowledgements
I would first like to thank my thesis advisor, Professor Rashmi Jha, for supporting me
and providing direction in the time I’ve spent working on my research and writing. I would
similarly like to thank Professor Anca Ralescu and Dr. David Kapp for their valuable
advice and feedback on my work, and for serving on my thesis committee.
I would like to thank my labmates, particularly Michael Santacroce, for watching my
work’s progress and providing suggestions for improvement week by week, and Michael
for supporting me through writing this thesis.
Finally, I would like to express my sincere gratitude to my parents, and to Laura Tebben,
for providing me their encouragement throughout my years of study and the time I have
spent on this work. Finishing my thesis would never have been possible without them.
TABLE OF CONTENTS
Acknowledgments
List of Tables
List of Figures
Chapter 1: Introduction
Chapter 2: Hardware Trojans
2.1 Taxonomies
2.2 Taxonomy of Trojans in FPGA IP
2.2.1 Trojans that Cause Malfunction
2.2.2 Trojans that Prevent FPGA Operation
2.2.3 Trojans that Inject Faults
2.2.4 Trojans that Cause Side Effects
2.2.5 Trojans that Leak Information
2.2.6 Trojans that Waste FPGA Resources
2.2.7 Trojans that Introduce Vulnerabilities
2.3 Existing Trojan Mitigation Strategies and FPGA IP
2.3.1 Trojan Detection Techniques
2.3.2 Trojan Tolerance Techniques
Chapter 3: Evolutionary Algorithms and Evolvable Hardware
3.1 Evolutionary and Genetic Programming
3.2 Evolvable Hardware In FPGAs
3.3 Genetic Programming-based Evolvable Hardware
3.3.1 Background and Justification
3.3.2 Past Work
3.3.3 Preliminary Results
3.4 Trust-Oriented Applications of Evolvable Hardware
3.4.1 Applications to Trojans in FPGA IP
3.4.2 Applications to Trojans in FPGA Hardware
Chapter 4: Genetic Programming-based Evolvable Hardware for FPGA Security
4.1 System Design
4.2 Experimental Approach
4.3 Results
Chapter 5: Conclusion
References
LIST OF TABLES
3.1 Fault Tolerance in Evolved Hardware - Canham et al.
3.2 Fault Tolerance Systems - Larchev et al.
LIST OF FIGURES
1.1 FPGA Supply Chain
2.1 Taxonomy Proposed by Wang et al. [8]
2.2 Taxonomy Proposed by Chakraborty et al. [10]
2.3 Taxonomy Proposed by Mal-Sarkar et al. [1]
2.4 Taxonomy of Trojans in FPGA IP
2.5 Triple Modular Redundant FPGA System
3.1 Sample Abstract Syntax Tree
3.2 Mutation in an AST
3.3 Crossover in an AST
3.4 Intrinsic Evolvable Hardware System
3.5 Extrinsic Evolvable Hardware System
3.6 Reduced Verilog BNF [38]
3.7 AST of 8 to 1 MUX from one run
3.8 Alternate AST of 8 to 1 MUX from same run
3.9 AST of 8 to 1 MUX from a second run
3.10 AST of 8 to 1 MUX from a third run
4.1 GENPEFS System Design
4.2 AST of correct 4 to 1 MUX
4.3 4 to 1 MUX Number of Generations To Remove Trojans
4.4 AST of correct 8 to 1 MUX
4.5 8 to 1 MUX Number of Generations To Remove Trojans
4.6 AES Encryption Module Number of Fitness Evaluations To Remove One Trojan
4.7 AES Encryption Module Number of Fitness Evaluations To Remove Two Trojans
CHAPTER 1
INTRODUCTION
Recent years have seen increasing globalization of the semiconductor supply chain. Hardware
designers are becoming more concerned about the security of their designs as more
of the hardware design and manufacturing process moves overseas. FPGAs have a particularly
long supply chain - FPGA vendor, foundry, PCB vendor, 3rd-party IP vendors, the
end hardware designer, and the end user. (Here, IP refers to the code programmed onto the
FPGA. FPGA code is typically written in Verilog or VHDL, and 3rd-party IP is delivered to
hardware designers in one of these languages.) This work focuses on security concerns that the
hardware designer, the person writing code to be programmed onto the FPGA, might face
- particularly security concerns from integrating 3rd-party IP into FPGA designs.
Figure 1.1: FPGA Supply Chain
Concerns about security in FPGA designs have led hardware designers and academics
to increasingly consider the possibility of Trojans at any point in the hardware supply chain.
Hardware and FPGA design security have existed for some time, but the integration of 3rd-
party IP into designs is a recent phenomenon. This is primarily because hardware designs
are becoming increasingly complex. It is often not feasible for hardware designers to create
a design end-to-end in-house. Instead, hardware designers look to 3rd-party hardware IP
vendors, who sell commercial-off-the-shelf components for hardware designs.
While a lot of focus has been put into understanding hardware Trojans in silicon and in
circuit boards, the idea of Trojans existing in FPGA IP is somewhat novel. Recently, work
has acknowledged the existence of Trojans in FPGA IP, but there is no comprehensive
discussion and analysis of what threats hardware designers might face when they integrate
3rd-party IP into their designs [1]. In order for hardware designers to better understand the
threats they face when integrating 3rd-party IP into designs, it would be useful to have a
comprehensive classification and analysis of hardware Trojans in FPGA IP.
In addition to a need for understanding what threats are present in FPGA IP, there is
also a need for more comprehensive FPGA Trojan mitigation strategies. There are many
Trojan detection and tolerance strategies in existence, but no strategy is able to overcome
every type of Trojan. In general, Trojan tolerance is more desirable than Trojan detection.
Test-time Trojan detection methods are useful, but Trojans often hide themselves during
testing and surface later, meaning the only way to operate an FPGA without fear of Trojans
is to develop Trojan tolerant designs.
Most Trojan tolerance strategies use redundancy at some level in the system in order to
duplicate results and hopefully choose the correct one. One such strategy is Triple Modular
Redundancy, one of the most common Trojan tolerance strategies [2]. Redundancy-based
strategies always have some sort of cost - area in the design, power consumption, and in
the case of IP the cost of purchasing multiple copies of the IP from different hardware
vendors [1]. Further, these strategies are unable to mitigate some Trojans, like those that
leak information. If a Trojan leaks information through a side channel, duplicating the IP
will not help [3]. The system might compare results and choose the most common one in
order to pick the correct result, but information will still be leaked each time the infected
IP is used [4]. Clearly, there is a need for a more comprehensive tolerance system.
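The limitation described above can be illustrated with a minimal Python sketch. The three IP
functions, the side_channel list, and the 8-bit increment they compute are hypothetical
stand-ins chosen for illustration, not taken from any cited system. Majority voting masks the
injected fault, but the leaked secret still escapes whenever the infected copy runs.

```python
from collections import Counter


def clean_ip(x, secret, side_channel):
    """Trojan-free copy: returns the intended result and leaks nothing."""
    return (x + 1) % 256


def faulty_ip(x, secret, side_channel):
    """Hypothetical infected copy that injects a fault into the result."""
    return (x + 2) % 256


def leaky_ip(x, secret, side_channel):
    """Hypothetical infected copy: correct result, but it leaks the secret."""
    side_channel.append(secret)
    return (x + 1) % 256


def tmr_vote(copies, x, secret, side_channel):
    """Run every copy and return the most common output (majority vote)."""
    outputs = [ip(x, secret, side_channel) for ip in copies]
    value, _count = Counter(outputs).most_common(1)[0]
    return value


side_channel = []  # models whatever an attacker can observe
masked = tmr_vote([clean_ip, clean_ip, faulty_ip], 41, 0xAB, side_channel)
leaked = tmr_vote([clean_ip, clean_ip, leaky_ip], 41, 0xAB, side_channel)
```

Both votes return the intended value, yet after the second call side_channel holds the secret:
the voter only compares outputs and has no view of what else a copy did while computing them.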
Evolvable hardware - a hardware design strategy inspired by biological processes - has
been used in the past to repair faults in hardware [5]. FPGAs are particularly suited to
evolvable hardware design strategies due to their reconfigurability. Similarly, genetic pro-
gramming can be used to repair bugs in software [6]. Because 3rd-party IP is delivered
as HDL code, genetic programming might be uniquely fit for repairing bugs - or Trojans -
in FPGA IP. Despite some difficulties in employing evolvable hardware strategies in mod-
ern FPGAs, it might be possible to repair Trojans located in FPGA IP using a genetic
programming-inspired evolvable hardware technique.
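As a rough illustration of the idea, the following Python sketch evolves a small expression
tree back toward a reference behavior. It is a simplified, mutation-only loop with elitism and
not the system developed in this work: the tuple-based tree encoding, the mutate operator, the
inserted XOR Trojan, and all parameter values are assumptions made for the example, and
crossover is omitted.

```python
import random
from itertools import product

# Expression trees: a terminal is a signal name, an operator node is a tuple.
MUX = ("or", ("and", "a", ("not", "s")), ("and", "b", "s"))
# Hypothetical Trojan: flips the output whenever a and b are both 1.
INFECTED = ("xor", MUX, ("and", "a", "b"))


def evaluate(tree, env):
    """Evaluate a tree on one assignment of 0/1 values to signal names."""
    if isinstance(tree, str):
        return env[tree]
    op, *args = tree
    vals = [evaluate(arg, env) for arg in args]
    if op == "not":
        return 1 - vals[0]
    if op == "and":
        return vals[0] & vals[1]
    if op == "or":
        return vals[0] | vals[1]
    return vals[0] ^ vals[1]  # "xor"


def fitness(tree):
    """Count the test vectors on which the tree matches a 2-to-1 mux."""
    score = 0
    for a, b, s in product((0, 1), repeat=3):
        expected = b if s else a
        score += evaluate(tree, {"a": a, "b": b, "s": s}) == expected
    return score


def mutate(tree, rng):
    """Replace at most one operator node with one of its own operands."""
    if isinstance(tree, str):
        return tree
    op, *args = tree
    if rng.random() < 0.2:
        return rng.choice(args)
    i = rng.randrange(len(args))
    args[i] = mutate(args[i], rng)
    return (op, *args)


def evolve(seed_tree, generations=30, offspring=20, seed=0):
    """Keep the best tree seen; accept a child only if it is strictly fitter."""
    rng = random.Random(seed)
    best = seed_tree
    for _ in range(generations):
        if fitness(best) == 8:
            break
        children = [mutate(best, rng) for _ in range(offspring)]
        candidate = max(children, key=fitness)
        if fitness(candidate) > fitness(best):
            best = candidate
    return best


repaired = evolve(INFECTED)
```

In this toy setting the only strictly improving single mutation is the one that deletes the
XOR node, so a successful run returns the original multiplexer tree with no other changes. A
practical system would operate on HDL syntax trees and would need a far richer set of
operators and test vectors.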
Removing Trojans from FPGA IP using genetic programming would allow automatic
repair of IP Trojans without any human intervention. A system using such a strategy would
be able to minimize downtime of the hardware, and hopefully deal with the types of Trojans
that current Trojan tolerance schemes cannot handle. Minimizing downtime is particularly
important in uptime-sensitive applications like aerospace, or more recently, autonomous
vehicles. Downtime in those types of applications can endanger lives, or incur massive
costs to the developer and end user.
The rest of this work analyzes past taxonomies of hardware Trojans and their applicability
to Trojans in FPGA IP. It then presents the first taxonomy of Trojans in FPGA IP,
based on past work in creating hardware Trojans in IP. Then, it examines the effectiveness
of a variety of existing mitigation strategies, and examines the need for a new, more com-
prehensive system. It then examines how evolutionary and genetic algorithms have been
used for hardware design and fault tolerance. It proposes a novel genetic programming
(GP)-based approach to employing evolvable hardware. Then, it explores the effectiveness
of this method in removing Trojans from FPGA IP, by inserting Trojans into a variety of
circuits and attempting to remove them using this GP-based evolvable hardware strategy.
Finally, it considers future work in developing this GP-based evolvable hardware
strategy and how it might be applied to Trojans in FPGA hardware. This work demonstrates
successful removal of various hardware Trojans from different circuit designs. Importantly,
it finds that the only successful modification to a large circuit using genetic programming
results in removal of the Trojan and no other changes.
CHAPTER 2
HARDWARE TROJANS
Hardware Trojans are malicious modifications made to circuitry in order to produce some
undesirable effect, usually without the hardware designer’s knowledge. In recent years, the
hardware supply chain has become increasingly globalized. Few semiconductor businesses
own their own foundries, and many of those 3rd-party foundries are located outside of the
US. As hardware designers outsource more and more of their work, the threat of hardware
Trojans has increased. Often, hardware designers do not design their products end-to-end.
Many hardware designers make use of an outside foundry, or use externally designed pe-
ripheral components. In the case of FPGAs, hardware designers produce only the IP to
be run on the FPGA, while all hardware is purchased from a vendor. More recently, how-
ever, hardware vendors are offering commercial-off-the-shelf IP blocks, allowing hardware
designers to outsource even more of their work. Ultimately, any hardware or software com-
ponent that is outsourced is a potential attack vector, and should be considered untrusted.
Hardware Trojans in FPGAs can exist in many different forms. They can be located in
the FPGA silicon, on the FPGA PCB, or even in the IP used to program the FPGA. These
Trojans can be characterized by a variety of their parameters - payload, trigger mechanism,
physical size, power consumption, etc. Previous attempts have been made to comprehen-
sively characterize Trojans based on many of these parameters. This chapter explores these
characterizations, discusses Trojan mitigation strategies, and proposes a taxonomy of Tro-
jans that can be found specifically in FPGA IP, something that has not yet been done.
2.1 Taxonomies
Few taxonomies have been proposed to categorize hardware Trojans. These taxonomies
often categorize Trojans using their payload, their activation mechanism, and their physical
characteristics. This section explores the taxonomies that have been proposed in the past,
and considers their applications to Trojans in FPGA IP.
An early, non-comprehensive taxonomy of hardware Trojans noted that they may
have varying activation triggers and payloads [7]. A more complete taxonomy of hardware
Trojans was first discussed by Wang et al. in [8], and then expanded upon by Tehranipoor
et al. in [9]. Wang et al. proposed the taxonomy seen in Figure 2.1.
Figure 2.1: Taxonomy Proposed by Wang et al. [8]
This taxonomy categorizes Trojans by three different types of characteristics - their physical
characteristics, their activation characteristics, and their action characteristics.
Physical characteristics describe how the Trojan is placed in FPGA hardware. These
characteristics might not be worthwhile in the scope of FPGA IP Trojans, but it is best
to consider them here anyway. Physical characteristics include the distribution, structure,
size, and type of the Trojan. Distribution describes how and where the Trojan is placed
on the FPGA hardware - is there one Trojan in each LUT, are the Trojans on select LUTs,
and so on. It describes the Trojan’s physical location on the chip. Structure refers to the
Trojan’s internal logic and routing, rather than where the Trojan is located on the chip.
Size categorizes Trojans based on their physical size on the chip. Type makes a distinction
between two main types of Trojans - functional and parametric. Functional Trojans produce
errors or undesirable effects in logic by addition or deletion of gates, while parametric
Trojans modify functionality of existing wires [9].
Activation characteristics explain how a Trojan is triggered. Trojans are either inter-
nally or externally activated. Externally activated Trojans are activated when a sensor or
communication device reads an external signal or environmental condition telling the Tro-
jan to activate. Internally activated Trojans are activated either combinationally (some rare
condition) or sequentially (after a certain amount of time).
Action characteristics explain what a Trojan actually does when it is activated. This is
perhaps the most worthwhile classification of Trojans, as it can be used to develop different
Trojan tolerance strategies. This taxonomy breaks action characteristics into three cate-
gories: Trojans that transmit information, Trojans that modify specification, and Trojans
that modify function. Trojans that transmit information aim to leak information from the
hardware system to somewhere else. This information can be internal to the system, like
an encryption key, or can be external information read using sensors in the hardware sys-
tem. Trojans that modify specification change nonfunctional requirements of the hardware
system, like clock speed. Finally, Trojans that modify function change the actual logic of
the system. Modified logic can be as simple as an incorrect result for an operation, or as
complex as taking over the FPGA system and using it for some other task.
This classification provides an interesting first attempt at categorizing hardware Tro-
jans. Later taxonomies provide more elaboration on the action characteristics, and mostly
ignore physical characteristics of the Trojans. Physical characteristics can mostly be ig-
nored as they are only useful in detecting the Trojan using destructive or side-channel anal-
ysis, both of which are difficult to perform and somewhat ineffective at detecting Trojans
[4, 10, 3].
The next classification to consider is that by Chakraborty et al. in [10]. This proposed
taxonomy of hardware Trojans omits classification by physical properties and focuses only
on triggers and payloads. A reproduction of this taxonomy is seen in Figure 2.2.
Figure 2.2: Taxonomy Proposed by Chakraborty et al. [10]
This taxonomy organizes Trojans by their trigger and their payload. Payloads can be
either digital or analog payloads to the circuit, or payloads that don’t have an effect on the
logic (other). Digital payloads change the logic in the circuit at a node, or modify memory
content in the hardware system. Analog payloads modify the circuit electrically, but often
end up producing a logical effect. Bridging analog Trojans tie a node in the circuit to Vdd or
Gnd. Delay Trojans modify the delay between nodes in a circuit. Activity Trojans generate
excess activity in the circuit in order to shorten its lifespan. Finally, Trojans may participate
in more software-based attacks like leaking information from the circuit or participating in
a denial of service attack to shut down important system functions [10].
Trojan triggers are organized into digital and analog triggers. Analog-triggered Trojans
are activated using sensors, or by monitoring device electrical activity. Digitally-triggered
Trojans are activated either combinationally or sequentially. Combinationally activated
Trojans are activated when some rare value occurs at a node in a circuit. Sequentially
activated Trojans are activated by a synchronous counter, asynchronous counter, hybrid
counter, or a rare sequence at a node in the circuit.
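The two digital trigger styles can be sketched in Python as follows. This is a behavioral
model only: the rare value 0xDEAD, the threshold of 1000 cycles, and the function and class
names are hypothetical and chosen purely for illustration.

```python
def combinational_trigger(node_value, rare_value=0xDEAD):
    """Fires only while the monitored node holds one specific rare value."""
    return node_value == rare_value


class SequentialTrigger:
    """Fires once a synchronous counter has seen `threshold` clock cycles."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.count = 0

    def clock(self):
        """Advance one clock cycle and report whether the Trojan is active."""
        self.count += 1
        return self.count >= self.threshold


# A short functional test that sweeps 256 node values never hits the rare value.
fired_in_test = any(combinational_trigger(v) for v in range(256))

# The counter-based trigger stays silent until its threshold cycle.
timer = SequentialTrigger(threshold=1000)
fired_per_cycle = [timer.clock() for _ in range(1000)]
```

Neither trigger fires during a brief test run, which is why Trojans of this kind tend to
survive testing unless the test deliberately targets the trigger condition.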
This taxonomy is useful particularly because of the expanded payload taxonomy. Payloads
are the most important aspect of a Trojan to understand. Understanding triggers
helps hardware designers produce designs to avoid triggering Trojans, or to intentionally
trigger them during testing. However, the most effective form of Trojan mitigation is a
Trojan-tolerant circuit design, and comprehensively understanding what payloads Trojans
may bring helps hardware designers create designs that are able to deal with Trojans left
undetected during testing.
The final and most comprehensive taxonomy is that proposed by Mal-Sarkar et al. in [1]
and [4]. This taxonomy for the first time suggests that Trojans may also exist in FPGA IP,
though it does not elaborate further. Equally important, this taxonomy focuses exclusively
on Trojans that may exist in FPGA devices, rather than in hardware devices in general.
Figure 2.3 shows a reproduction of this taxonomy.
This taxonomy has approximately the same content as that proposed by Chakraborty
et al. in [10], but organizes it somewhat better. Trojans in FPGA hardware can be either
conditionally triggered or always on. Conditionally triggered Trojans can be triggered by
an environmental factor or some logic in the circuit. Trojan payloads can cause malfunction
or leak secret information. Secret information can be either FPGA IP or data inside the
FPGA. Malfunctions can be logical, like a logical error, or parametric, like modifying the
clock frequency of the circuit.
Figure 2.3: Taxonomy Proposed by Mal-Sarkar et al. [1]
This taxonomy is important because it is focused specifically on FPGAs. Earlier Trojan
taxonomies that did not have an FPGA focus ignored the possibility of leaking IP. Addi-
tionally, this work for the first time mentioned Trojans in FPGA IP as a possibility, but did
not include a detailed taxonomy of what types of Trojans might exist in IP.
These taxonomies give hardware designers an idea of what types of Trojans might exist
in hardware designs. Unfortunately, there is no comprehensive taxonomy of hardware
Trojans in FPGA IP. Creating such a taxonomy might help hardware designers understand
what kinds of threats they face when integrating 3rd-party IP into their FPGA systems.
2.2 Taxonomy of Trojans in FPGA IP
So that hardware designers may better understand the threats Trojans in FPGA IP pose to
hardware designs, it would be useful to have a single comprehensive taxonomy of hardware
Trojan effects in FPGA IP. Although some attempts at categorizing Trojans that can be found in
FPGA hardware have been made [9, 10, 1, 4], to date there exists no comprehensive
categorization of Trojans that can be found in FPGA IP. The categorization presented in
Figure 2.4 attempts to categorize the payloads, or effects, of Trojans that may be found in
FPGA IP. It is based on the earlier taxonomies of hardware Trojans and on FPGA Trojans
and viruses examined in other work. This categorization focuses only on the payloads of
FPGA IP Trojans because the categorization based on their triggers will likely be similar
to Trojans in FPGA hardware, and has not yet been explored enough to be mature. Mal-
Sarkar et al. suggest in [4] that categorizing Trojans based on their size and distribution
may not lead to interesting results or help hardware designers, so those categorizations
are also omitted. It is worthwhile to consider that a single Trojan may fit into multiple
categories in the taxonomy. A Trojan that hijacks an FPGA may also send the FPGA’s
previous configuration data to an attacker. A Trojan that injects faults may do so in order to
leak information, such as in fault-based cryptanalysis [11, 12, 13]. The work in this section
has been submitted in [14].
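One way to work with such a taxonomy programmatically is as a nested mapping, sketched below
in Python. The category names follow Sections 2.2.1 through 2.2.5. The dictionary layout, the
path_to helper, and the example Trojan are illustrative assumptions, and only categories whose
parent is stated explicitly in the text are included.

```python
# Payload categories as a tree: each key is a category, each value its children.
TAXONOMY = {
    "Cause Malfunction": {
        "Prevent FPGA Operation": {
            "Disable the FPGA": {},
            "Hijack the FPGA": {},
        },
        "Inject Faults": {
            "Internally Verifiable Faults": {},
            "Internally Unverifiable Faults": {},
        },
    },
    "Cause Side Effects": {
        "Leak Information": {
            "System-Internal Information": {},
            "System-External Information": {},
        },
        "Waste FPGA Resources": {},
    },
}


def path_to(category, tree=TAXONOMY, prefix=()):
    """Return the path from the root to a category, or None if it is absent."""
    for name, children in tree.items():
        here = prefix + (name,)
        if name == category:
            return here
        found = path_to(category, children, here)
        if found:
            return found
    return None


# A single Trojan may carry several payload labels: here, one that hijacks
# the FPGA and also sends its previous configuration data to an attacker.
hijack_and_exfiltrate = {"Hijack the FPGA", "Leak Information"}
paths = sorted(path_to(label) for label in hijack_and_exfiltrate)
```

Storing a set of labels per Trojan, rather than a single label, reflects the point made above
that one Trojan may fall into multiple categories.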
2.2.1 Trojans that Cause Malfunction
The class of Trojans existing in FPGA IP that many existing mitigation strategies aim to
tolerate is those that cause the FPGA to malfunction. The intensity of the malfunction can
range from a simple logical error to complete device failure with an electrical fault. These
Trojans can be further classified into Trojans that Prevent FPGA Operation and Trojans
that Inject Faults.
Figure 2.4: Taxonomy of Trojans in FPGA IP
2.2.2 Trojans that Prevent FPGA Operation
Some previously explored hardware Trojans aim to disrupt the operation of an FPGA.
This type of attack was first introduced by Hadzic as the SALT Trojan discussed in [15].
Whereas Trojans that electrically damage an FPGA are probably not possible to produce
in an HDL, it is somewhat easy for Trojan designers to produce Trojans that disable or
even reprogram the FPGA. When a hardware designer integrates a 3rd-party IP block into
a design on an FPGA, the 3rd-party IP block may have access to a multitude of advanced
FPGA reprogramming features. The 3rd-party IP block may be able to write information
to the FPGA’s configuration SRAM. Trojans that Prevent FPGA Operation can be further
divided into two subgroups, Trojans that Disable the FPGA and Trojans that Hijack the
FPGA.
Trojans that Disable the FPGA have been somewhat explored in past work. One of the
first papers on FPGA Security, the aforementioned “FPGA Viruses” [15] by Hadzic et al.,
discusses three FPGA-disabling classes of viruses. The first, a permanently damaging virus
class titled MELT, is probably not possible through FPGA IP due to protections built into
FPGA synthesis software. The other attack classes, SALT and HALT, cover attacks that
cause system malfunction without permanently damaging the FPGA. These types of attacks
are possible to implement in HDL code, especially when an attacker is aware of the FPGA
system the code will be deployed on.
Druyer et al. discuss the possibility of hijacking an FPGA system during configuration
in [16]. Dynamic FPGA reprogramming features provide an opportunity for attackers to
hijack an FPGA at any point in time. Inactive FPGA configurations are usually stored in
SRAM on the FPGA. Alternately, new configurations can be delivered to the FPGA through
a network interface. If a 3rd-party IP has access to the SRAM or a network interface, it may
be able to modify dormant FPGA configurations. This creates the opportunity for malicious
3rd-party IP blocks to reprogram the FPGA using a malicious configuration. Any FPGA
that is able to reprogram itself using a dynamic reprogramming feature, or has access to
its own configuration SRAM, may be vulnerable to a 3rd-party IP block modifying the
configuration. If a 3rd-party IP block is able to modify the configuration, there is no limit
to what it can do, including taking over the FPGA.
2.2.3 Trojans that Inject Faults
Hardware Trojans in FPGA IP may also manifest themselves as fault-injecting Trojans.
These Trojans aim to disrupt operation of the FPGA by providing an incorrect result in
some circuit operation. Fault-injecting Trojans are among the types studied
most often in developing Trojan tolerance strategies. Many conventional Trojan tolerance
strategies, such as Adapted TMR [4] and MRVO [3], aim to mitigate the effects of fault-injecting
Trojans. We can further classify fault-injecting Trojans into Trojans that Inject
Internally Verifiable Faults and Trojans that Inject Internally Unverifiable Faults [17].
Past work in FPGA Trojans has discussed fault-injecting Trojans whose faults are in-
ternally verifiable [17]. A fault-injecting Trojan is internally verifiable if the result an IP
block produces can be determined to be correct or incorrect without duplication of the system.
An example of an internally verifiable fault arises in an FPGA that produces
instructions to control a CPU. If the FPGA sends an invalid opcode to the CPU, the result
is verifiably wrong.
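The opcode example can be made concrete with a short Python sketch. The four-entry
instruction set and the function name are hypothetical. The check catches an opcode outside
the instruction set, but it cannot flag a valid opcode that was substituted for the intended
one.

```python
# Hypothetical instruction set of the CPU that the FPGA controls.
VALID_OPCODES = {0x00, 0x01, 0x02, 0x03}


def is_verifiably_wrong(opcode):
    """True when the opcode can be rejected without a redundant copy of the IP."""
    return opcode not in VALID_OPCODES


intended = 0x01
caught = is_verifiably_wrong(0xFF)  # the fault produces an invalid opcode
missed = is_verifiably_wrong(0x02)  # the fault swaps in another valid opcode
```

Here caught is True and missed is False, even though 0x02 differs from the intended opcode:
the checker only knows which opcodes are legal, not which one the design meant to send.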
In some cases, fault-injecting Trojans may be unverifiable without system duplication.
Fault-injecting Trojans produce undetectable results when their results are indistinguishable
from a correct result [17]. Incorrect results can be indistinguishable from correct results
in a variety of circumstances. For example, in [17], an FPGA provides instructions for
a computer system to run. The authors explain that in the studied system, an incorrect result
is indistinguishable from a correct result as long as the incorrect result is a valid instruction
for the system to run. Fortunately, numerous redundancy-based verification strategies have
been proposed in past work, most of them using a majority voting scheme to determine
the correct result [9, 4, 2, 3]. In the case of Trojans in 3rd-party IP blocks, designers may
choose to purchase multiple copies of the same IP from different vendors and run those IP
blocks concurrently or one at a time using partial reconfiguration [3].
Listing 1 shows a simplified example of a fault-injecting Trojan. The Verilog code for a
4-state machine is infected with a Trojan, triggered by the t_trojan signal, that causes the state
machine to reset unexpectedly. Unless the signal is activated during testing, the Trojan will
not be caught. Although fault-injecting Trojans can be dangerous, the variety of existing
mitigation strategies makes them less of a threat than other types of Trojans [2].
module fourState (
    input            clk,
    input            reset,       // active-low asynchronous reset
    input            transition,
    input            t_trojan,    // Trojan trigger
    output reg [1:0] state
);
    always @(posedge clk or negedge reset) begin
        if (!reset) begin
            state <= 2'b00;
        end else begin
            if (t_trojan) begin
                state <= 2'b00;   // Trojan payload: unexpected reset
            end else if (transition) begin
                state <= state + 1'b1;
            end
        end
    end
endmodule
Listing 1: Fault-Injecting Trojan in State Machine
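To make the test-time blind spot concrete, the following is a small behavioral model of the state machine in Listing 1. It is a sketch only: the function names are hypothetical, and the asynchronous reset is simplified to a value sampled once per clock edge.

```python
def four_state_step(state: int, reset: int, transition: int, t_trojan: int) -> int:
    """One clock edge of the fourState module (reset is active-low)."""
    if not reset:
        return 0
    if t_trojan:
        return 0  # Trojan payload: unexpected reset
    if transition:
        return (state + 1) % 4  # 2-bit counter wraps around
    return state


def run(stimulus, state=0):
    """Apply (reset, transition, t_trojan) tuples and return the visited states."""
    trace = []
    for reset, transition, t_trojan in stimulus:
        state = four_state_step(state, reset, transition, t_trojan)
        trace.append(state)
    return trace


# A test bench that never raises t_trojan sees only correct behavior.
clean_test = [(1, 1, 0)] * 5
print(run(clean_test))  # [1, 2, 3, 0, 1]

# Raising t_trojan once in the field forces the state back to 0.
field_input = [(1, 1, 0), (1, 1, 0), (1, 1, 1), (1, 1, 0)]
print(run(field_input))  # [1, 2, 0, 1]
```

The first trace is indistinguishable from that of an uninfected design, so a test suite that never drives t_trojan high would pass.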
2.2.4 Trojans that Cause Side Effects
Trojans in FPGA IP may have a payload that does not interfere with the logic of the design.
These Trojans instead cause some side effect, often leaking information about the FPGA
or wasting resources on a task not intended by the victim of the Trojan. Although these
Trojans don’t always disrupt the FPGA’s work, they may have other extremely costly con-
sequences. These Trojans can be classified into Trojans that Leak Information and Trojans
that Waste FPGA Resources.
2.2.5 Trojans that Leak Information
Trojans in FPGA IP may aim to leak information about the FPGA or its environment [1, 3,
9]. These Trojans do not disrupt normal operation of the FPGA. Instead, they expose secrets
about the FPGA - other IP implemented on the FPGA, data in the FPGA, or information
from the system’s surroundings. Trojans that leak information have been studied in the
context of Trojans in FPGA hardware, but few tolerance strategies exist. We further classify
these Trojans into Trojans that Leak System-Internal Information and Trojans that Leak
System-External Information.
Trojans that leak system-internal information discreetly send information about the
FPGA or data it is processing to a malicious actor. These Trojans can leak secret infor-
mation such as cryptographic keys [18]. Leaking cryptographic keys may allow attackers
to access even more data outside of the FPGA system. Trojans may also aim to leak infor-
mation about the FPGA’s design and configuration stored in the bitstream. Due to the high
design costs of FPGA systems, Trojans may attempt to steal information about the FPGA
configuration in order to reverse engineer and steal the system [1].
Trojans may also leak information external to the FPGA system. These Trojans take
advantage of sensors and other peripherals to gather and transmit information such as the
system’s location and surroundings. These Trojans pose a particularly large threat to
defense-related applications, where the locations of equipment must be kept secret from potential
attackers.
Listing 2 shows an example of an information-leaking Trojan. This Trojan, again
triggered by the t_trojan signal, outputs an unencrypted version of the secret when triggered.
This again may not be caught during testing if the activation signal is never triggered.
module encryption (
    input      [255:0] key,
    input      [255:0] secret,
    input              control,
    input              t_trojan,   // Trojan trigger
    input              clk,
    input              reset,      // active-low synchronous reset
    output reg [255:0] encrypted
);
    reg [255:0] tmp;

    always @(*) begin
        // ... Encryption logic ...
    end

    always @(posedge clk) begin
        if (!reset) begin
            encrypted <= 256'b0;
        end else begin
            if (t_trojan) begin
                encrypted <= secret;   // Trojan payload: leak the plaintext
            end else begin
                encrypted <= tmp;
            end
        end
    end
endmodule
Listing 2: Information Leaking Trojan in Encryption Module
Few strategies have been proposed to mitigate the effects of information-leaking Trojans.
Traditional redundancy-based approaches will not suffice, because the information a Trojan
leaks is separate from the results the IP block produces. Alanwar et al. propose
a technique called Simple Blockage (SB) to obfuscate all information before it is sent to
other parts of the system. This technique prevents malicious actors from reading any data,
even when it is leaked [3]. Additionally, it comes at a lower design and power cost than
redundancy-based approaches to security. However, hardware designers must take care to
monitor every communication port on the FPGA, or the Trojan may be able to leak infor-
mation undetected. Additionally, this strategy is not appropriate when the 3rd-party IP is
responsible for implementing a communication protocol, as encryption of any data before
it leaves the FPGA may break the implementation.
2.2.6 Trojans that Waste FPGA Resources
Trojans may also aim to impede an FPGA system by consuming an excessive amount of
system resources. These Trojans may clog up network traffic or simply consume an ex-
cessive amount of power. Trojans that Waste FPGA Resources may do so by performing
a separate, unrelated task to the benefit of an attacker, or by maliciously wasting system
resources for no benefit. Because these Trojans have very similar effects and countermea-
sures, we do not draw a distinction between them in the taxonomy.
One potential motivation for resource-wasting Trojans is for an attacker to use the
FPGA for their own benefit while avoiding detection. Chakraborty et al. discussed the
idea of a Trojan harnessing an FPGA for a denial of service attack in [10]. These Trojans
are similar in motivation to those that hijack a system, but more difficult to detect because
the FPGA may remain completely functional while the Trojan is active. In modern FPGAs
built on very small transistor processes, even measuring power consumption is not a
reliable way to detect Trojans. The concept of
FPGAs used in a botnet is very similar to the recent Mirai IoT botnet malware [19]. We
should anticipate that malicious actors may also take advantage of FPGAs in a similarly
difficult-to-detect attack.
Resource-wasting Trojans may also aim to disrupt operation of the FPGA by consuming
too many resources available to the FPGA. These Trojans may attempt to use all of the
bandwidth for some communication mechanism, as in a denial of service attack. They
may also simply increase the power consumption of an FPGA to an unsustainable level,
behaving like computer power viruses [20] or the hardware power viruses discussed in
[15].
Few Trojan tolerance strategies focus on these types of Trojans. Any system that aims
to tolerate resource-wasting Trojans must first be able to detect them. Although DoS-based
attacks are easy to detect, Trojans that don’t waste an excessive amount of resources
must be detected through some other type of system monitoring. In [3], Alanwar et al. discuss
two strategies that help disable infected IPs: Multiplexing Reconfigurable Variants’ Output
(MRVO), and Multiplexing Reconfigurable IPs’ Outputs and Cyclic Redundancy Check
Trojan Detection Schema (MCRC). Both strategies use partial
reconfiguration to swap out infected IPs at run-time. These redundancy-based approaches
help mitigate the effects of resource-wasting Trojans, though they require at least one un-
infected copy of the IP to fully eliminate them.
The genetic programming-based evolvable hardware strategy we introduce in a later
section addresses side effect-inducing Trojans particularly well. It aims to disable the ma-
licious functionality while preserving all of the intended behavior, which is not possible in
most redundancy-based Trojan tolerance approaches without a golden copy of the IP.
2.2.7 Trojans that Introduce Vulnerabilities
A type of Trojan that has not been discussed in previous taxonomies is the Trojan that
introduces other vulnerabilities into the system, but has no other ill effects. These Trojans
have been discussed in the context of software Trojans, and there is no reason they cannot
also exist in hardware [21]. Hardware designers should be particularly careful about these
types of Trojans, as they might be very difficult to detect. Producing no immediate payload
makes the Trojan’s effect undetectable until the vulnerability allows another sort of attack
on the FPGA.
Trojans that introduce other vulnerabilities into the system might also be inserted by
hardware designers rather than only by 3rd-party IP vendors. These types of Trojans can
be difficult to distinguish from design errors, so they can allow malicious actors to have
some sort of plausible deniability.
2.3 Existing Trojan Mitigation Strategies and FPGA IP
In order to understand what work is needed in the area of FPGA IP Trojan mitigation, it is
necessary to examine current Trojan tolerance schemes and their effectiveness against the
various types of IP Trojans. This section surveys a variety of Trojan detection and tolerance
strategies and assesses the effectiveness of each against the different types of
Trojans in FPGA IP.
It is important to differentiate between Trojan detection and Trojan tolerance. Trojan
detection strategies provide a way for hardware designers to detect Trojans in their systems.
Detection may take place at test-time or continuously during run-time. Trojan
tolerance strategies are run-time Trojan mitigation techniques that allow a hardware design
to function normally in the presence of one or more hardware Trojans. In this sense, Trojan
tolerance techniques might be more useful in a design than Trojan detection techniques.
Because Trojans often hide themselves during testing, a tolerance technique will give hard-
ware designers more confidence that their designs are secure [22]. Trojan tolerance tech-
niques are usually based on either design for security (DFS) or run-time monitoring [23].
2.3.1 Trojan Detection Techniques
Trojan detection techniques can be divided into destructive and non-destructive techniques.
Destructive techniques are based on taking apart an IC and examining it with a microscope
[24]. These are, of course, not useful for Trojans in FPGA IP, because they rely on examining
the hardware itself for Trojans. In place of destructive approaches, hardware designers
might consider thoroughly examining the code or netlist delivered by a 3rd-party IP vendor.
Of course, understanding the code requires a lot of engineering effort, and often that kind
of effort is not available when in-house design is forgone in favor of 3rd-party IP blocks.
Non-destructive techniques are often based on more comprehensive testing or side channel
analysis, and may be more useful for detecting Trojans in IP.
Logic testing-based Trojan detection techniques rely on detecting the Trojan through
extensive logical coverage of the device. Because the number of input combinations grows exponentially
with the number of inputs, it is very difficult to detect Trojans during testing.
Some strategies have been proposed in an attempt to make Trojan detection easier.
One such strategy is MERO, proposed by Chakraborty et al. in [25]. MERO creates
a set of tests to minimize test time while maximizing coverage in the device. Better test
coverage in less time should lead to better Trojan detection rates. MERO works by finding
low probability conditions at every node in a circuit, and creating vectors specifically for
triggering those rare conditions more than once. This strategy increases how often nodes
that are resistant to random pattern testing are triggered, hopefully revealing Trojans during
test-time [22].
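The rare-condition idea behind this kind of test generation can be sketched on a toy circuit. The code below is not the MERO algorithm from [25]; it only illustrates how low-probability node values can be identified and which vectors trigger them. The circuit, the threshold, and all names are assumptions made for illustration, and exhaustive enumeration is practical only because the example has four inputs.

```python
from itertools import product

# Hypothetical 4-input circuit; each internal node is a function of the inputs.
NODES = {
    "n1": lambda a, b, c, d: a & b,
    "n2": lambda a, b, c, d: (a & b) & (c & d),   # 1 only when all inputs are 1
    "n3": lambda a, b, c, d: a ^ d,
}


def signal_probabilities(nodes, n_inputs=4):
    """Probability that each node evaluates to 1 over all input vectors."""
    vectors = list(product((0, 1), repeat=n_inputs))
    return {
        name: sum(fn(*v) for v in vectors) / len(vectors)
        for name, fn in nodes.items()
    }


def rare_nodes(probs, threshold=0.1):
    """Nodes whose logic-1 probability falls below the threshold."""
    return sorted(name for name, p in probs.items() if p < threshold)


def vectors_triggering(nodes, name, n_inputs=4):
    """Input vectors that drive the named node to its rare value (1)."""
    return [v for v in product((0, 1), repeat=n_inputs) if nodes[name](*v)]


probs = signal_probabilities(NODES)
print(probs)                              # {'n1': 0.25, 'n2': 0.0625, 'n3': 0.5}
print(rare_nodes(probs))                  # ['n2']
print(vectors_triggering(NODES, "n2"))    # [(1, 1, 1, 1)]
```

Random pattern testing would reach the rare value of n2 in only one of every sixteen vectors on average, so a test set that deliberately includes its triggering vector exercises a condition a Trojan trigger might hide behind.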
Logic testing-based approaches to Trojan detection often use a coverage metric to deter-
mine the probability of a Trojan making it through testing undetected. Because of the huge
combinational complexity of large hardware designs, deterministically checking whether
there are any mistakes in the circuit is not possible. Instead, designers may choose to
randomly sample from known hardware Trojan triggers and Trojans and place them at dif-
ferent points in the circuit [25]. Running all tests against these Trojans develops a metric
of Trojan trigger coverage and detection coverage. Hardware designers may use this to test
Trojan resistance in circuits, but a fairly comprehensive set of possible Trojans to sample
from is necessary for adequate coverage [22].
Side channel analysis can also be used to detect Trojans in hardware systems. Side
channel analysis involves measuring electrical characteristics of the chip, like static current,
dynamic current, or power [22]. Static current analysis measures leakage current in the
device. Static CMOS gates leak some current even when they are not switching. Measuring
differences in static current may be used to differentiate between golden circuits and those
with Trojans [26]. Transient current analysis measures switching current in an attempt
to measure whether more gates than expected are switching [27]. Unfortunately, process
variations in static current, switching current, and many other device parameters make
side channel analysis techniques difficult and sometimes unreliable [22].
In [22], Bhunia et al. discuss a variety of different techniques to increase trust in 3rd-
party hardware IP. Particularly interesting are suggestions that hardware designers should
purchase multiple copies of a 3rd-party IP and programmatically compare them, as in [28].
Comparing different copies of the same IP may allow designers to find malicious modifi-
cations, working under the assumption that multiple 3rd-party vendors will not include the
exact same Trojan. This may be an expensive practice, but many Trojan tolerance schemes
rely on redundancy, so purchasing multiple copies may be necessary regardless.
Hardware designers may also take advantage of proof-carrying code to ensure security
in their designs [29]. Hardware designers may agree on a set of formal proofs that a 3rd-
party IP vendor’s delivered IP must satisfy. Designers may anticipate that any malicious
changes will break these contracts. When the IP is delivered, the IP vendor must demon-
strate that all proofs still hold. Of course, the IP vendor may design proofs to accommodate
malicious modifications, but this is still somewhat more secure than receiving 3rd-party IP
code with no formal verification [22].
Despite the shortcomings of many of these Trojan detection methods, there are no limits
to what types of Trojans most of them will detect. In that sense, these Trojan detection
methods might be more versatile than Trojan tolerance techniques that will only tolerate a
subset of all FPGA IP Trojans.
2.3.2 Trojan Tolerance Techniques
Trojan tolerance techniques aim to make hardware designs resistant to or tolerant of Trojans
that are not detected during runtime. There is a variety of design for security techniques
that aim to make hardware designs more difficult to infiltrate with a Trojan [22], but this
section will focus on run-time monitoring and other run-time techniques.
Run-time monitoring of Trojans is effective for all types of Trojans discussed in the
taxonomy, as long as the monitoring system is comprehensive enough [23]. Fault-injecting
Trojans can be monitored using an anomaly detection technique, while side effect-inducing
Trojans can be monitored by a system external to the FPGA observing activity on all
of the FPGA’s communication devices. In some cases, monitoring and checking before
output may not be enough to prevent information leakage, such as with advanced fault-
based cryptanalysis attacks [30].
Figure 2.5: Triple Modular Redundant FPGA System
Many run-time monitoring systems are based on a redundancy and majority voting
scheme, such as Triple Modular Redundancy (TMR) and its derivatives [2]. In these systems,
three redundant copies of the same IP are used in conjunction with an oracle or
majority voter. Because the three redundant copies of the IP are purchased from different
vendors, it is unlikely that the same Trojan will exist in all 3 copies. The majority voter
determines the correct output based on what output most copies of the IP provided, and
uses that as the actual output to the system. This system only protects against logical er-
rors in the circuit, where a Trojan provides an incorrect result. The IPs are still able to
leak information or perform other malicious actions. Various improvements on this type
of system have been proposed. In [4], Mal-Sarkar et al. discuss a more energy-efficient
adaptation of TMR. In [3], Alanwar et al. suggest a variety of modular redundant systems,
including some that flag infected IPs and swap them out for unused, clean IPs using partial
reconfiguration features available in FPGAs.
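A minimal software sketch of the majority voter at the heart of such systems is shown below. The function name and the choice to report disagreeing copies are assumptions made for illustration; a real voter would be implemented in hardware alongside the redundant IP blocks.

```python
from collections import Counter


def majority_vote(outputs):
    """Return (voted_value, suspect_indices) for the outputs of redundant IP copies.

    Raises ValueError when no value is produced by a strict majority.
    """
    value, count = Counter(outputs).most_common(1)[0]
    if count <= len(outputs) // 2:
        raise ValueError("no majority among redundant outputs")
    suspects = [i for i, out in enumerate(outputs) if out != value]
    return value, suspects


# The copy at index 1 disagrees, so it is outvoted and flagged.
print(majority_vote([0b1010, 0b0000, 0b1010]))  # (10, [1])
```

The list of suspect indices is the kind of information a reconfiguration-based scheme could use to decide which IP to swap out. The voter compares only functional outputs, so it offers no protection against a copy that leaks information while still producing the correct result.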
Alanwar et al. also suggest a data obfuscation method called Simple Blockage (SB) [3].
SB involves obfuscating data before sending it through any communication device using
a key that is shared between the device and whatever system is receiving the information.
This can help mitigate Trojans that aim to leak information. If all output from the FPGA is
obfuscated sufficiently, attackers will not be able to use any leaked information. Of course,
SB cannot be used when an IP implements a communication protocol, as obfuscating it
may lead to the protocol no longer working.
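The shared-key idea can be sketched in a few lines. The repeating-key XOR below is a stand-in chosen for brevity; it is not the construction used in [3] and is not cryptographically secure. The key, message, and function names are hypothetical.

```python
from itertools import cycle


def obfuscate(data: bytes, shared_key: bytes) -> bytes:
    """XOR each byte with the repeating shared key (illustration only, not secure)."""
    if not shared_key:
        raise ValueError("shared_key must not be empty")
    return bytes(b ^ k for b, k in zip(data, cycle(shared_key)))


# XOR is its own inverse, so the receiver applies the same function.
deobfuscate = obfuscate

key = b"\x5a\xc3\x17"            # hypothetical key shared by FPGA and receiver
message = b"sensor=42"
on_the_wire = obfuscate(message, key)

print(on_the_wire != message)                      # True
print(deobfuscate(on_the_wire, key) == message)    # True
```

Because every byte leaving the device is transformed, an IP block whose job is to emit a fixed wire format would no longer interoperate with its peer, which is the protocol limitation noted above.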
Few or no Trojan tolerance strategies aim to tolerate all types of side effect-inducing
Trojans. Those that leak information are not affected by redundancy, and those that waste
FPGA resources are not affected by SB. Trojans that aim to disable or hijack the FPGA
are affected by neither strategy. Even though fault-injecting Trojans may be detected using
redundancy, a redundant system incurs large design area and power costs [1], and purchas-
ing additional copies of 3rd-party IP may be prohibitively expensive or impossible [3]. It
is clear that a new Trojan tolerance strategy is needed, in order to comprehensively tolerate
(or at least mitigate the effects of) all types of Trojans.
CHAPTER 3
EVOLUTIONARY ALGORITHMS AND EVOLVABLE HARDWARE
Evolvable hardware is a hardware design strategy where a hardware designer uses evolu-
tionary or genetic algorithms to produce a circuit, instead of designing the circuit by hand.
Evolvable hardware algorithms mutate and mate hardware configurations in order to improve
on them. They maintain a population consisting of some number of individuals, each
individual some representation of a hardware design. These algorithms iteratively improve
on the hardware design and eventually produce an optimal individual in the population.
This chapter discusses evolutionary algorithms and their uses in hardware design. In
particular, it examines the applications of evolutionary algorithms to FPGA programming.
This chapter also presents an evolvable hardware strategy that excels in the face of con-
straints imposed by complications in modern FPGA technology.
3.1 Evolutionary and Genetic Programming
Evolutionary algorithms use techniques inspired by biological evolution to improve some
metric in a population. They iteratively make modifications to individuals in a population
with the goal of improving the population over time. Evolutionary algorithms consist of
four main components - a population, a fitness function, a selection function, and a mutation
function.
The population is a set of individuals that represent whatever we are trying to create or
improve upon. In evolvable hardware, each individual is some representation of a circuit design.
For example, each individual may be a netlist, an abstract syntax tree (AST) representing
a hardware description language (HDL) program, or an FPGA configuration bitstream. If we
choose to represent each individual as an AST, the algorithm is referred to as evolutionary
programming.
Figure 3.1: Sample Abstract Syntax Tree
In evolutionary (or genetic) programming, individuals are represented as abstract syntax
trees (ASTs). Figure 3.1 shows an abstract syntax tree. This AST represents the program:
Algorithm 1: Program in Sample AST
if x == 0 then
    y := y + 1
end
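One simple way to hold the program of Algorithm 1 in memory is as nested tuples. This encoding is an assumption made for illustration and may not match the exact node layout of Figure 3.1.

```python
# Each node is a tuple: (operator, child, child, ...). Leaves are names or constants.
SAMPLE_AST = (
    "if",
    ("==", "x", 0),
    (":=", "y", ("+", "y", 1)),
)


def to_source(node) -> str:
    """Render an AST node as pseudocode text."""
    if not isinstance(node, tuple):
        return str(node)
    op, *children = node
    if op == "if":
        cond, body = children
        return f"if {to_source(cond)} then {to_source(body)} end"
    left, right = children
    return f"{to_source(left)} {op} {to_source(right)}"


def count_nodes(node) -> int:
    """Count every operator and leaf in the tree."""
    if not isinstance(node, tuple):
        return 1
    return 1 + sum(count_nodes(child) for child in node[1:])


print(to_source(SAMPLE_AST))    # if x == 0 then y := y + 1 end
print(count_nodes(SAMPLE_AST))  # 9
```

Keeping the tree immutable means that mutation and crossover operators can build modified copies without disturbing the parent individuals.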
A fitness function is an algorithm that evaluates and assigns a fitness to each individual
in a population. In the case of evolvable hardware, a fitness function may synthesize and
then run a test bench on a netlist, the fitness value representing what portion of tests passed
on the netlist. In genetic programming, an AST is converted to code in whatever language
it represents, and then that code is compiled and tested.
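A pass-rate fitness function of this kind can be sketched as follows. Here Python callables stand in for synthesized netlists or compiled programs, which is an assumption made so the example is self-contained; the full-adder test vectors and candidate functions are likewise hypothetical.

```python
def fitness(candidate, test_vectors) -> float:
    """Fraction of test vectors for which the candidate gives the expected output."""
    passed = 0
    for inputs, expected in test_vectors:
        try:
            if candidate(*inputs) == expected:
                passed += 1
        except Exception:
            pass  # a crashing candidate simply fails that test
    return passed / len(test_vectors)


# Hypothetical target: a 1-bit full adder returning (sum, carry).
FULL_ADDER_TESTS = [
    ((a, b, c), ((a + b + c) & 1, (a + b + c) >> 1))
    for a in (0, 1) for b in (0, 1) for c in (0, 1)
]

correct = lambda a, b, c: (a ^ b ^ c, (a & b) | (c & (a ^ b)))
no_carry = lambda a, b, c: (a ^ b ^ c, 0)   # sum is right, carry is stuck at 0

print(fitness(correct, FULL_ADDER_TESTS))   # 1.0
print(fitness(no_carry, FULL_ADDER_TESTS))  # 0.5
```

A graded score like this gives partially correct individuals credit, which is what allows selection to favor them over completely broken ones.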
A selection function simply selects an individual from the population to be mutated and
added to the new population. Selection functions generally prefer more fit individuals but
do provide some randomness so that less fit individuals may make it to the next generation.
A common selection function is tournament selection, where a number of individuals are
compared one-on-one, with the more fit individual having a higher probability of winning
each match and being inserted into the next generation.
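Tournament selection can be sketched as a short series of one-on-one matches. The number of rounds and the win probability below are assumed values, not parameters taken from this work.

```python
import random


def tournament_select(population, fitness_of, rng, rounds=3, p_win=0.9):
    """Pick one individual via a short series of one-on-one matches.

    The fitter contestant wins each match with probability p_win, so weaker
    individuals still have some chance of being selected.
    """
    champion = rng.choice(population)
    for _ in range(rounds):
        challenger = rng.choice(population)
        fitter, weaker = sorted((champion, challenger), key=fitness_of, reverse=True)
        champion = fitter if rng.random() < p_win else weaker
    return champion


rng = random.Random(0)                     # seeded for repeatability
population = list(range(10))               # individuals 0..9; fitness equals the value
picks = [tournament_select(population, lambda x: x, rng) for _ in range(1000)]
print(sum(picks) / len(picks))             # mean fitness of the selected individuals
```

The mean fitness of the selected individuals sits above the population mean of 4.5, while low-fitness individuals are still picked occasionally.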
Finally, the mutation function is an operation applied to individuals to slightly modify
them. For example, a mutation function may randomly place an additional gate in a netlist.
In evolutionary programming, a mutation function may choose a point in an AST to replace
with another randomly generated tree.
Figure 3.2: Mutation in an AST
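Subtree mutation on a tuple-encoded AST can be sketched as below. The leaf set, operator set, and depth limit are hypothetical, and the sketch applies no grammar constraints, so it may replace any node, including the root.

```python
import random

LEAVES = ["x", "y", 0, 1]
OPERATORS = ["+", "&", "^"]


def random_tree(rng, depth=2):
    """Grow a random expression tree of at most the given depth."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(LEAVES)
    return (rng.choice(OPERATORS), random_tree(rng, depth - 1), random_tree(rng, depth - 1))


def paths(node, prefix=()):
    """Yield the path (tuple of child indices) to every node in the tree."""
    yield prefix
    if isinstance(node, tuple):
        for i, child in enumerate(node[1:], start=1):
            yield from paths(child, prefix + (i,))


def replace_at(node, path, new_subtree):
    """Return a copy of the tree with the node at `path` replaced."""
    if not path:
        return new_subtree
    i = path[0]
    return node[:i] + (replace_at(node[i], path[1:], new_subtree),) + node[i + 1:]


def mutate(tree, rng):
    """Replace one randomly chosen node with a freshly generated subtree."""
    target = rng.choice(list(paths(tree)))
    return replace_at(tree, target, random_tree(rng))


rng = random.Random(1)
parent = (":=", "y", ("+", "y", 1))
print(mutate(parent, rng))   # a mutated copy; the parent itself is unchanged
```

A grammar-aware implementation would restrict which replacement subtrees are legal at each position so that every mutant remains a valid program.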
Genetic algorithms introduce a second function used to modify individuals during each
generation. This operation is called a crossover (or mating) function. A one-point crossover
function chooses two individuals from the population and one point in each individual, and
swaps the individuals at those points. In genetic programming, a crossover function might
randomly choose one node in each AST and swap the entire subtrees at those nodes.
Figure 3.3: Crossover in an AST
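Subtree crossover can be sketched in the same tuple encoding. The helper names and the two parent expressions are hypothetical; as with the mutation sketch, no grammar constraints are enforced.

```python
import random


def paths(node, prefix=()):
    """Yield the path (tuple of child indices) to every node in the tree."""
    yield prefix
    if isinstance(node, tuple):
        for i, child in enumerate(node[1:], start=1):
            yield from paths(child, prefix + (i,))


def get_at(node, path):
    """Return the subtree found by following `path` from the root."""
    for i in path:
        node = node[i]
    return node


def replace_at(node, path, new_subtree):
    """Return a copy of the tree with the node at `path` replaced."""
    if not path:
        return new_subtree
    i = path[0]
    return node[:i] + (replace_at(node[i], path[1:], new_subtree),) + node[i + 1:]


def count_nodes(node):
    """Count every operator and leaf in the tree."""
    if not isinstance(node, tuple):
        return 1
    return 1 + sum(count_nodes(child) for child in node[1:])


def crossover(a, b, rng):
    """Swap one randomly chosen subtree of `a` with one of `b`; return both offspring."""
    pa = rng.choice(list(paths(a)))
    pb = rng.choice(list(paths(b)))
    sub_a, sub_b = get_at(a, pa), get_at(b, pb)
    return replace_at(a, pa, sub_b), replace_at(b, pb, sub_a)


rng = random.Random(2)
mother = ("+", "x", ("&", "y", 1))
father = ("^", ("+", "x", "y"), 0)
child_a, child_b = crossover(mother, father, rng)
print(child_a)
print(child_b)
```

Because the operation only exchanges material, the two offspring together contain exactly as many nodes as the two parents.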
Each iteration of the evolutionary algorithm is referred to as a generation. At each gen-
eration, every individual in the population is evaluated using the fitness function. Then, the
algorithm chooses individuals in the population to advance to the next generation using the
selection function. Each individual is mutated with some mutation probability and mated
with another individual with some crossover probability. Because the selection function
typically prefers more fit individuals, we expect the overall fitness of the population to in-
crease with each generation. However, most often the metric we care about is the fitness of
the fittest individual in the population, not the overall fitness across the entire population.
This process repeats until the algorithm reaches some stopping point: a generation limit, a
maximally fit individual, or some other metric chosen by the programmer.
In general, genetic algorithms follow this structure:
Algorithm 2: Genetic Algorithm
population := initializeRandomPopulation()
while numberOfGenerations ≤ maxGenerations do
    foreach individual in population do
        individual.fitness := evaluate(individual)
    end
    newPopulation := []
    while newPopulation.size ≤ population.size do
        individual := selectFrom(population)
        if random(0,1) ≤ crossoverProbability then
            mate := selectFrom(population)
            offspring := crossover(individual, mate)
            newPopulation.add(offspring)
        end
        if random(0,1) ≤ mutationProbability then
            offspring := mutate(individual)
            newPopulation.add(offspring)
        end
    end
    population := newPopulation
    numberOfGenerations++
end
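The structure of Algorithm 2 can be sketched as runnable code on a toy problem. The sketch below evolves fixed-length bit lists rather than circuits, and it makes several assumptions that are not part of Algorithm 2: two-way tournament selection, one-point crossover, single-bit mutation, trimming the new population to a fixed size, an early stop at a target fitness, and all parameter values.

```python
import random


def evolve(fitness, target_fitness, genome_len=16, pop_size=20, max_generations=200,
           crossover_prob=0.7, mutation_prob=0.3, seed=0):
    """Evolve bit lists toward higher fitness; return (best individual, generations used)."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(genome_len)] for _ in range(pop_size)]

    def select(scored):
        # Two-way tournament: the fitter of two random individuals is chosen.
        return max(rng.sample(scored, 2))[1]

    best = max(population, key=fitness)
    for generation in range(1, max_generations + 1):
        scored = [(fitness(ind), ind) for ind in population]
        new_population = []
        while len(new_population) < pop_size:
            individual = select(scored)
            if rng.random() <= crossover_prob:
                mate = select(scored)
                point = rng.randrange(1, genome_len)      # one-point crossover
                new_population.append(individual[:point] + mate[point:])
            if rng.random() <= mutation_prob:
                mutant = individual[:]
                mutant[rng.randrange(genome_len)] ^= 1     # flip one bit
                new_population.append(mutant)
        population = new_population[:pop_size]
        best = max(population + [best], key=fitness)       # remember the fittest seen
        if fitness(best) >= target_fitness:                # maximally fit: stop early
            return best, generation
    return best, max_generations


# Toy objective: maximize the number of 1 bits in the genome.
best, generations = evolve(sum, target_fitness=16)
print(sum(best), generations)
```

Tracking the fittest individual seen so far reflects the point made above that the metric of interest is usually the best individual rather than the average fitness of the population.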
Genetic programming algorithms are very much the same. The differences between ge-
netic programming and genetic algorithms in general lie in implementations of the fitness,
crossover, and mutation functions.
Evolutionary and genetic algorithms are useful to hardware designers because their
results are often competitive with what human designers can produce in terms of
performance, area, power, and other metrics [31]. Hardware designers use genetic algorithms
to automate hardware design, produce results in an expanded search space human designers
would be unlikely to explore [32], or to make hardware systems adaptable without human
intervention [5, 33].
3.2 Evolvable Hardware In FPGAs
Evolvable hardware is most often implemented in FPGAs due to their reconfigurability.
Although evolvable hardware strategies can be used to design hardware for ASICs, evolv-
able hardware is most useful when a piece of hardware can be dynamically reconfigured
[33].
Most evolvable FPGA models have followed an intrinsic style of evolvable hardware.
We refer to an evolvable system as intrinsic when the evolutionary system is completely
contained within the hardware and the system is tested only in hardware, not in simulation
[32, 34]. This definition can be expanded on to introduce self-contained evolvable hard-
ware, where the entire evolutionary system is contained on the same piece of hardware. In
a self-contained system, the FPGA configuration is comprised of what are referred to as
an evolutionary module and a functional module [34]. The functional module contains the
evolved functionality the FPGA is responsible for. The evolutionary module is an oracle
responsible for evaluating and evolving the functional module.
Figure 3.4: Intrinsic Evolvable Hardware System
In Figure 3.4, the evolutionary module produces a genome and programs the functional module,
while leaving itself intact. The evolutionary module evaluates the functional module’s
fitness and improves upon the genome.
Placing both modules on the same piece of hardware requires a number of restrictions
on the evolutionary algorithm. The evolutionary algorithm must not modify the portion
of the bitstream corresponding to the evolutionary module. If the portion of the bitstream
containing the evolutionary module is modified, the program may break itself. Further,
the evolutionary algorithm must not produce invalid FPGA configurations to avoid perma-
nently damaging the FPGA hardware [35]. These restrictions on the evolutionary algo-
rithm complicate the evolutionary module, limiting how much space is available for the
functional module. In general, intrinsic evolvable hardware systems severely limit what
functionality can be placed on an FPGA [34].
Extrinsic evolutionary models aim to fix some of these problems. An evolutionary
system is referred to as extrinsic when circuit configurations are simulated before imple-
menting them in hardware. An extrinsic system might only implement the fittest individual
from a population, while the rest are discarded after simulation. Extrinsic evolutionary al-
gorithms are not self-contained and separate the evolutionary hardware from the functional
hardware, for example running the evolutionary algorithm on a computer that manages the
FPGA [36]. This frees all of the FPGA hardware for the functional module, allowing the
algorithm to produce more complex hardware configurations. However, designers must
still ensure the algorithm will never produce an invalid bitstream or one that may damage
the FPGA hardware. Extrinsic evolutionary models come at the cost of additional hardware
used for generating and simulating the FPGA configuration before programming.
Figure 3.5: Extrinsic Evolvable Hardware System
In Figure 3.5 the external evolutionary system evolves a genome and uses it to program
the logic device. The evolutionary system monitors the logic device to measure fitness, and
improves upon the genome.
In modern FPGAs, the capabilities of evolutionary hardware are limited by the com-
plexity of bitstreams used to program the devices. Traditional methods of evolutionary
hardware design require manipulating FPGA bitstreams to produce different circuit config-
urations. Limiting evolutionary hardware to coherent circuits requires understanding how
the bitstream corresponds to lookup tables (LUTs) in the FPGA and limiting the evolutionary
algorithm’s search space to coherent configurations [36]. The correspondence between
bitstream and configuration in modern FPGAs is proprietary knowledge not available to
the public. Additionally, many FPGAs encrypt bitstreams, further complicating the evolu-
tionary hardware design process. Any algorithm that evolves an FPGA bitstream requires
more knowledge about FPGA bitstreams than is available to the public about modern FP-
GAs [37]. These algorithms may necessitate an unencrypted FPGA bitstream, which may
be inappropriate for applications that require a high degree of security.
3.3 Genetic Programming-based Evolvable Hardware
This section is organized into three subsections. The first provides a background and
discusses the challenges solved and difficulties presented by genetic programming-based
evolvable hardware. The second discusses past work in generating Verilog HDL code. The
third discusses preliminary results in producing simple circuits using genetic programming,
obtained from experiments performed for this thesis.
3.3.1 Background and Justification
The idea of automatically generating Verilog HDL code has been explored in the past, by
Cullen [38] and Karpuzcu [39]. In [38], Cullen demonstrates using an extrinsic evolution-
ary programming technique referred to as Evolutionary Meta Programming to generate a
Verilog program. The program generated is a bit-slice of a full adder circuit, using behav-
ioral Verilog. In [39], Karpuzcu also builds a full adder using genetic programming. The
work demonstrates successfully building an entire adder module, including inputs and out-
puts, by evolving a circuit while following the Verilog grammar specification. These two
results are encouraging, and show that genetic programming may be a worthwhile approach
to building evolvable hardware systems.
A genetic programming-based approach to evolvable hardware might help to alleviate
issues with prior types of evolutionary hardware systems. Rather than evolving the FPGA
bitstream or a netlist, an evolutionary system should use genetic programming to produce
an FPGA program in a hardware description language (HDL), like Verilog or VHDL. A comprehensive
system design for evolvable hardware using genetic programming to generate HDL
has not been explored. Such an evolutionary system will generate a correct HDL program
and synthesize it into an FPGA bitstream using the FPGA vendor’s synthesis and
place-and-route tools.
A genetic programming-based approach has many advantages over traditional evolutionary
hardware models, even extrinsic ones. It removes the need for algorithm designers
to add restrictions to the algorithm for adherence to an FPGA configuration standard.
In evolutionary hardware strategies that evolve an FPGA bitstream, it is possible to generate
a bitstream that electrically damages the FPGA, and evolutionary hardware design
frameworks have even been proposed just to mitigate that possibility [35]. A genetic
programming-based approach is guaranteed to produce valid FPGA configurations as long as the
synthesis tools behave correctly. A valid HDL program compiled by an FPGA vendor’s
toolchain should not produce any electrical errors, though logical errors are of course
possible and expected in an evolutionary system.
This approach can be expected to produce correct logic for any given circuit using a
smaller genome than by evolving the bitstream. For example, an adder can be represented in
Verilog code using 5 nodes in an abstract syntax tree (AST): one assignment, one addition,
two operands, and a destination for the assignment. In contrast, creating a 4-bit adder in a
bitstream requires configuring at least 256 lookup tables (LUTs).
However, the ability of such a system to produce efficient circuits is limited by the capabilities
of the FPGA vendor’s synthesis tools. In one of the founding papers of evolvable
hardware, Thompson demonstrates that an evolved hardware design exploited quirks in the
FPGA hardware [32]. In this work, Thompson evolved an FPGA bitstream to produce
a pattern-recognizing circuit that was able to discriminate between 1 kHz and 10 kHz
waves. The circuit used far fewer cells than a human-designed filter, but this was
due to the circuit exploiting unusual electrical properties of the cells in the FPGA. The cir-
cuit had logically disconnected components that when removed prevented the circuit from
working. Such a behavior may not be possible to replicate using FPGA synthesis tools.
3.3.2 Past Work
In [38], Cullen uses genetic programming to evolve a series of software programs, including
a full adder in Verilog, shown in Listing 3. In this work, Cullen develops an evolutionary
programming technique referred to as Evolutionary Meta Programming, in which compilation and
testing are completely separated from the evolutionary engine, as is the case in extrinsic
evolvable hardware and any genetic programming-based hardware model. An evolutionary
engine evolves a program’s genome and creates a separate process on the computer for
evaluating each program, usually with a pipe between the two processes to pass information.
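The separation between the evolutionary engine and the evaluator can be sketched as follows. This is not Cullen's implementation: the child process here is a small Python script standing in for a compile-and-simulate step, the candidate is a Python boolean expression rather than Verilog, and the names `EVALUATOR` and `evaluate_in_subprocess` are assumptions made for illustration.

```python
import subprocess
import sys

# Source of the child process (a stand-in for a compiler plus simulator).
# It reads one boolean expression over x, y, z from its stdin pipe and prints
# how many of the 8 rows of the full-adder sum output the expression gets right.
EVALUATOR = r"""
import itertools, sys
expr = sys.stdin.read().strip()
score = 0
for x, y, z in itertools.product((0, 1), repeat=3):
    got = eval(expr, {"__builtins__": {}}, {"x": x, "y": y, "z": z}) & 1
    score += int(got == (x + y + z) % 2)
print(score)
"""


def evaluate_in_subprocess(expr: str) -> int:
    """Send a candidate to a separate process over a pipe and read back its fitness."""
    result = subprocess.run(
        [sys.executable, "-c", EVALUATOR],
        input=expr,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())


print(evaluate_in_subprocess("x ^ y ^ z"))  # a correct sum expression
print(evaluate_in_subprocess("x & y"))      # an incorrect candidate scores lower
```

Because the engine only sees a fitness number coming back through the pipe, a crash or hang in a badly formed candidate stays contained in the child process.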
module chmain (x, y, z, s, c);
    input x, y, z;
    output s, c;
    wire V0;
    wire V1;
    assign c = (z ^ (~z & (z ^ (x & (x + ~y)))));
    assign s = ((z + (~c + (z + (~c + (z ^ (x & (x ^ x))))))) + (x + y));
endmodule
Listing 3: Full Adder Slice [38]
Cullen’s results are relevant because this is the style of genetic programming necessi-
tated by FPGAs. Unfortunately, Cullen does not quantify results in this paper - there is
no mention of how many generations it took to evolve this circuit. Regardless, being able
to evolve a simple circuit using evolutionary programming is an encouraging result. It is
important to note that in this paper, Cullen evolved only the assignment statements and the
two miscellaneous wire declarations. The module declaration, inputs, and outputs were
fixed. Regardless, the result is still worthwhile - module declarations are usually specified,
not something that needs to be figured out by an algorithm.
In [39], Karpuzcu uses a genetic programming technique following the grammar of
Verilog to generate parse trees instead of generating ASTs. This evolutionary programming
strategy is referred to as Grammatical Evolution [40]. A grammar of a language specifies
<S> :: <blocking-assignment-s> <blocking-assignment-cout>
<blocking-assignment-s> :: assign s = <rhs> ;
<blocking-assignment-cout> :: assign cout = <rhs> ;
<rhs> :: <binary-op> | <logical-not>
<binary-op> :: <bitwise-and> | <bitwise-or> | <bitwise-xor>
<bitwise-and> :: ( <argument> & <argument> )
<bitwise-or> :: ( <argument> | <argument> )
<bitwise-xor> :: ( <argument> ^ <argument> )
<logical-not> :: ! ( <argument> )
<argument> :: <invar> | <binary-op-out> | <logical-not-out>
<argument-out> :: <invar> | <binary-op-in> | <logical-not-in>
<binary-op-out> :: <bitwise-and-out> | <bitwise-or-out> | <bitwise-xor-out>
<bitwise-and-out> :: ( <argument-out> & <argument-out> )
<bitwise-or-out> :: ( <argument-out> | <argument-out> )
<bitwise-xor-out> :: ( <argument-out> ^ <argument-out> )
<binary-op-in> :: <bitwise-and-in> | <bitwise-or-in> | <bitwise-xor-in>
<bitwise-and-in> :: ( <invar> & <invar> )
<bitwise-or-in> :: ( <invar> | <invar> )
<bitwise-xor-in> :: ( <invar> ^ <invar> )
<logical-not-out> :: ! ( <argument-out> )
<logical-not-in> :: ! ( <invar> )
<invar> :: a | b | cin
Figure 3.6: Reduced Verilog BNF [38]
the path a compiler will take while parsing the code. It describes how each type of statement
can expand. For example, an assignment statement right hand side can expand to either a
binary operation or a logical negation (Figure 3.6). Grammatical Evolution requires the
grammar of the language to be expressed in Backus-Naur Form (BNF) [41]. Individuals
are represented using variable-length genomes that essentially pick decisions in the parse
tree of the BNF [40].
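The genome-to-program mapping used in Grammatical Evolution can be illustrated with a minimal sketch. The grammar below is a hypothetical, much smaller grammar in the spirit of the reduced Verilog BNF, not the grammar from [39]; the function name `decode`, the step limit, and the choice to consume one codon per expansion are assumptions for illustration.

```python
import re

# Hypothetical toy grammar: each nonterminal maps to its list of productions.
GRAMMAR = {
    "<rhs>": ["( <arg> & <arg> )", "( <arg> | <arg> )", "( <arg> ^ <arg> )", "! ( <arg> )"],
    "<arg>": ["<invar>", "<rhs>"],
    "<invar>": ["a", "b", "cin"],
}
NONTERMINAL = re.compile(r"<[a-z]+>")


def decode(genome, start="<rhs>", max_steps=100):
    """Map a list of integer codons to an expression.

    At each step the leftmost nonterminal is expanded, and the codon (taken
    modulo the number of productions) selects which production to use. The
    genome is reused from the start when it runs out (wrapping). Returns None
    if the expression is still incomplete after max_steps expansions.
    """
    text = start
    for step in range(max_steps):
        match = NONTERMINAL.search(text)
        if match is None:
            return text
        options = GRAMMAR[match.group(0)]
        codon = genome[step % len(genome)]
        choice = options[codon % len(options)]
        text = text[:match.start()] + choice + text[match.end():]
    return text if NONTERMINAL.search(text) is None else None


print(decode([2, 0, 0, 0, 1]))  # decodes to an XOR of two inputs
print(decode([1]))              # keeps recursing into <rhs>, so it never completes
```

Every genome that finishes decoding yields a syntactically valid expression, which is the property that makes this representation attractive for generating HDL.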
Figure 3.6 provides the reduced BNF of the Verilog grammar used by Karpuzcu in [39].
This BNF allows the code generator in the algorithm to produce two
types of statements: an assignment statement to the output variable s, and an assignment
statement to the output variable cout. Again, the module header is predefined, and the
algorithm is only responsible for generating the correct expressions for assignment to both
the sum and the carry-out variables.
Karpuzcu provides quantified results for generating the adder. The algorithm uses a
population size of 200, crossover probability of 0.5, and mutation probability of 0.1. In
2 of 35 runs, a completely correct individual is generated. Those individuals took 50369
and 19772 fitness evaluations, respectively [39]. Although the results are not particularly
consistent, it is worthwhile that a completely correct individual was generated twice, using
a different evolutionary programming strategy.
module adder(a,b,cin,s,cout);
    input a; input b; input cin;
    output s; output cout;
    assign s=(a^(b^cin));
    assign cout=(((b^a)&(a^cin))^a);
endmodule
Listing 4: Adder Evolved by Karpuzcu [39]
Listing 4 shows one adder generated in this work [39]. It is interesting to note that while
the logic generated was slightly more complicated than necessary (the cout assignment
could be simplified), it was significantly smaller than that generated by Cullen in [38], and
nearly ideal.
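The correctness of the expressions in Listing 4 can be confirmed by exhaustive comparison against integer addition. The sketch below transcribes the two assignments into Python bitwise operators; the function names are introduced here for illustration.

```python
from itertools import product


def evolved_adder(a, b, cin):
    """The two assignments from Listing 4, transcribed to Python operators."""
    s = a ^ (b ^ cin)
    cout = ((b ^ a) & (a ^ cin)) ^ a
    return s, cout


def reference_adder(a, b, cin):
    """A full adder defined by integer addition: low bit is sum, high bit is carry."""
    total = a + b + cin
    return total & 1, total >> 1


# Compare the two on all 8 input rows and collect any disagreements.
mismatches = [bits for bits in product((0, 1), repeat=3)
              if evolved_adder(*bits) != reference_adder(*bits)]
print(mismatches)
```

An empty mismatch list means the evolved cout expression, although written unusually, computes the same majority function as a conventional carry.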
These results provide confidence in the idea that evolving Verilog (or other HDL) pro-
grams is the future of evolvable hardware.
3.3.3 Preliminary Results
To test this approach to evolvable hardware, we use the DEAP genetic programming framework
[42], extended with valid Verilog operators, to produce ASTs that can
be tested in the DEAP Python environment and are then compiled and run on the Icarus Verilog
[43] simulator for fitness evaluation.
DEAP natively represents programs as a depth-first traversal of an AST, though the
program representation can be configured. For now, the genetic algorithm is only allowed
to use the basic combinational logical operators available in Verilog: AND, OR,
NOT, etc. These results use a genetic programming algorithm (as opposed to evolutionary
programming), adding a crossover function.
To demonstrate the model, a genetic programming algorithm is used to evolve an 8 to
1 MUX, an 11-input, 1-output piece of hardware. Sample code for a MUX evolution is
in fact provided in the DEAP framework documentation [44], and the problem was first
introduced by Koza in [45]. The algorithm uses a population size of 1000, random growth
mutation with a probability of 0.3, and single-point crossover with a probability of 0.8.
Additionally, a hard tree depth limit of 80 is added to prevent bloat. The initial
population is 1000 randomly generated ASTs with a depth between 3 and 5.
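A fitness function for this problem can be sketched as an exhaustive comparison against a reference multiplexer. This is a plain-Python illustration rather than the DEAP code used in this work; the bit ordering (the first select input as the least significant bit) and the names `mux_reference` and `fitness` are assumptions.

```python
from itertools import product

SELECT_BITS = 3
DATA_BITS = 2 ** SELECT_BITS  # 8 data inputs, 11 inputs in total


def mux_reference(inputs):
    """Reference 8 to 1 MUX: the first 3 inputs select one of the 8 data inputs."""
    select = inputs[:SELECT_BITS]
    data = inputs[SELECT_BITS:]
    index = sum(bit << i for i, bit in enumerate(select))  # assumes LSB-first select
    return data[index]


def fitness(candidate):
    """Number of input rows (out of 2**11) on which the candidate matches the MUX."""
    return sum(candidate(row) == mux_reference(row)
               for row in product((0, 1), repeat=SELECT_BITS + DATA_BITS))


print(fitness(mux_reference))    # a correct circuit matches every row
print(fitness(lambda row: 0))    # a constant-0 circuit matches only half of them
```

A candidate is considered completely correct only when it matches all 2048 rows, which is why partially correct individuals can still receive substantial fitness.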
Preliminary testing shows that beginning with a randomly generated AST, the algorithm
can consistently produce a correct 8 to 1 MUX within 250 generations. Results also show
that various correct ASTs generated within a single run of the algorithm are fairly similar,
but ASTs generated in different runs of the algorithm often have no nodes in common aside
from the inputs. ASTs were graphed using the NetworkX Python library [46].
Figure 3.7: AST of 8 to 1 MUX from one run
Figure 3.8: Alternate AST of 8 to 1 MUX from same run
Figure 3.9: AST of 8 to 1 MUX from a second run
Figure 3.10: AST of 8 to 1 MUX from a third run
These results give an early indication of whether genetic programming is scalable to
larger circuits. The circuits generated by Karpuzcu in [39] and Cullen in [38] are 3-input
circuits, significantly simpler than an 11-input circuit. An 8 to
1 MUX has 2048 possible input combinations, whereas a full adder has only 8. These
MUXes were evolved in approximately 250,000 evaluations, but that number may be improved
upon by tuning the algorithm over time.
Figures 3.7 through 3.10 provide four correct abstract syntax trees representing the
input-output relationship of an 8 to 1 MUX. Input bits IN0, IN1, and IN2 are the 3 select
bits, and the rest of the inputs are the 8 MUX input bits. Note the similarities in the ASTs in
Figures 3.7 and 3.8. These are two correct ASTs generated during the same evolution (i.e.
from the same seed population). Contrast these with Figures 3.9 and 3.10, both generated
from different random seeds.
These results suggest that evolutionary algorithms settle on a type of correctness and
produce homogeneity over time. This is likely due to an aggressive selection for more
correct individuals over less correct ones, and can be seen in any application of evolutionary
algorithms. Once an individual becomes significantly more correct than the rest of a
population, it might be expected that it dominates the following generations. It is questionable
whether this homogeneity is desirable. Exploring how to limit this homogeneity
by adjusting the algorithm’s parameters (or doing something radically different) may be a
worthwhile direction for future work.
3.4 Trust-Oriented Applications of Evolvable Hardware
This section explains past applications of evolvable hardware in FPGA trust and security,
and introduces the idea of removing Trojans from FPGAs or mitigating their effects using
evolvable hardware.
One of the strongest applications of evolvable hardware has been in introducing adap-
tive fault tolerance to FPGA systems. Using evolvable computing systems to introduce
fault tolerance into a system was first discussed by Mange et al. in [47]. This work sug-
gests using evolvable computing systems to monitor and repair faults at run-time. Since the
proposal, much work has been done in the field of evolvable hardware-based fault tolerance
[48, 5, 49].
Fault tolerance built through evolvable hardware was explored by Canham et al. in
[48]. In this work, Canham et al. use evolutionary algorithms to develop a fault tolerant
hardware design. This work is particularly interesting because the hardware design did not
need to be evolved after a fault. The hardware design was created using an evolutionary
algorithm, and after it was finished faults were injected. This evolved design was more
innately fault tolerant than human-designed creations.
Table 3.1: Fault Tolerance in Evolved Hardware - Canham et al.
Circuit Type      Number of Faults   Number of Failures   Failures per Fault
Standard          2443               414                  0.169
Fault Tolerant    59267              796                  0.0134
Canham et al. found that evolved circuits had more intrinsic fault tolerance. Table
3.1 shows the difference in fault tolerance between standard and evolved fault tolerant
circuits [48]. This difference is striking: the evolved circuits had more than 10 times fewer
failures per fault than the human-designed circuits (0.0134 versus 0.169).
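The failures-per-fault column of Table 3.1 and the size of the improvement follow directly from the fault and failure counts. The short calculation below uses only the numbers in the table; the variable names are introduced here.

```python
# Counts taken from Table 3.1.
standard_faults, standard_failures = 2443, 414
evolved_faults, evolved_failures = 59267, 796

standard_rate = standard_failures / standard_faults   # failures per injected fault
evolved_rate = evolved_failures / evolved_faults
improvement = standard_rate / evolved_rate             # how many times fewer failures per fault

print(round(standard_rate, 3), round(evolved_rate, 4), round(improvement, 1))
```

The ratio of the two rates comes out between 12 and 13, consistent with the order-of-magnitude difference described in the text.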
The fault tolerance introduced by evolvable hardware can be further improved by adding
adaptable evolutionary fault tolerance systems. These systems continuously monitor the
hardware and reconfigure it in an attempt to mitigate faults whenever a fault is detected.
This is a more costly practice, but may significantly improve fault tolerance in FPGA sys-
tems.
In [5], Larchev et al. present a fault tolerance system using evolvable hardware. The
purpose of this system is to repair FPGAs experiencing hardware faults while they are in
space, far from anybody who can replace the FPGA. Larchev et al. explore the ability
to generate correct circuits using evolvable hardware in the presence of multiple stuck-at
faults in the FPGA hardware. The work focuses on four circuits: a quadrature decoder, a
3-by-3 bit multiplier, a 3-by-3 bit adder, and a 4-to-7 decoder.
Table 3.2: Fault Tolerance Systems - Larchev et al.
Circuit Type         Average Initial Correctness   Average Final Correctness   Average # Fitness Evals
Quadrature Decoder   76.7%                         99.5%                       546,226
Multiplier           83.3%                         95.83%                      4,250,990
Adder                73.4%                         94.38%                      -
Decoder              77.9%                         99.2%                       -
This work demonstrates the ability for evolutionary algorithms to significantly improve
circuits in the presence of faults. Larchev et al. show results that are not perfect, but still
very promising [5].
Recent years have shown an increasing interest among both researchers and the general
public in hardware security. As hardware designers become more conscious of the security
implications of their designs, we should consider what applications evolutionary hardware
has to security and trust. In this section we present two methods to leverage our evolutionary
hardware model to improve an application’s security. Mal-Sarkar et al. proposed
a taxonomy of hardware Trojans in FPGAs [1], and we propose methods to mitigate Trojans in
FPGA IP and Trojans in FPGA hardware that cause circuit malfunction.
3.4.1 Applications to Trojans in FPGA IP
As mentioned in earlier sections, Trojans in FPGA IP are a fairly new concept, and as
such there has been little to no work done on methods to mitigate the effects of these
Trojans. Including unverified code in security-sensitive applications has not been a reality
of hardware design until fairly recently. Trojans in FPGA IP can be thought of as being
similar to malicious code in software applications. Ultimately, a Trojan can manifest itself
in two ways, as either a fault or an unwelcome side effect, introduced in the code used to
program an FPGA.
Because these faults are introduced in the code, a genetic programming-based evolvable
hardware system is well equipped to repair them. This idea is promising because of the
demonstrated ability of evolutionary algorithms to repair errors in computer software [50], which is
very similar in structure to FPGA IP [31]. Hardware designers can take advantage of a variety
of different fault detection techniques, like anomaly detection or triple modular redundancy
(in this case, redundant modules should be purchased from separate vendors) to monitor
for faults in the FPGA system. If a fault is detected, an evolutionary algorithm can be used
to remove the fault from the FPGA code and produce a more correct configuration.
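Fault detection through triple modular redundancy can be sketched as a majority voter that also reports which module disagreed. This is a simplified software model, not a hardware voter design; the function name `vote` and the convention of identifying modules by list index are assumptions.

```python
from collections import Counter


def vote(outputs):
    """Return (majority_value, indices_of_dissenting_modules).

    `outputs` holds the output of each redundant module for the same input.
    If all modules disagree there is no true majority; this sketch then simply
    returns the first value seen, which a real system would treat as an error.
    """
    majority, _count = Counter(outputs).most_common(1)[0]
    dissenters = [i for i, value in enumerate(outputs) if value != majority]
    return majority, dissenters


# Three redundant modules; the third (index 2) produces a different result.
print(vote([1, 1, 0]))
print(vote([5, 5, 5]))
```

A non-empty dissenter list is the signal that would prompt repair of the disagreeing module's code. Note that a voter of this kind only observes functional outputs, so it cannot reveal a Trojan whose payload is a side effect.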
Genetic programming’s applications to FPGA IP Trojan removal are discussed in depth
in the next section.
3.4.2 Applications to Trojans in FPGA Hardware
Trojans may also exist in FPGA hardware, i.e. in the FPGA chip [36, 1, 4, 10, 9]. As
more semiconductor manufacturing and fabrication moves overseas, hardware trojans are
becoming a greater concern to all chip designers. FPGAs are particularly vulnerable to
hardware trojans due to their easy-to-understand layouts compared to other more complex
and diverse chips, like CPUs. Hardware designers would like to know that their designs
will function correctly despite the possibility of trojans or other faults in their hardware.
One might expect a hardware trojan to manifest itself as a fault injection in an FPGA.
In this sense, a hardware trojan manifesting as a circuit malfunction is the same problem as
an accidental circuit malfunction due to a fabrication error or any other cause. Protecting
from hardware trojans can be viewed as a type of fault tolerance. Larchev et al. presented
findings on discovering and working around hardware faults using more traditional evolu-
tionary algorithms [5]. These findings show that evolutionary algorithms can be used to
produce correctly functioning FPGA circuits even in the face of hardware faults.
Hardware designers might expect to be able to use genetic programming in the same
way as these older evolutionary fault tolerance mechanisms. The results in Figures 3.7 through
3.10 show that separate evolutions of a piece of hardware (in this case, the 8 to 1 MUX)
produce a very diverse set of abstract syntax trees. If it can be found that different abstract
syntax trees correspond to significantly different FPGA configurations after running the
circuit through synthesis and place-and-route, then a genetic programming-based evolutionary
model will be able to work around hardware faults in the same manner as earlier
evolutionary fault tolerance algorithms have.
CHAPTER 4
GENETIC PROGRAMMING-BASED EVOLVABLE HARDWARE FOR FPGA
SECURITY
Despite the various Trojan tolerance systems available to hardware designers working with
FPGAs, there is not a single system that can be used to protect against any type of Trojan.
Some Trojans, such as those that leak information, are able to produce their malicious effect
even in the presence of Trojan tolerance systems. Because evolvable hardware has been
used to reconfigure FPGAs to repair hardware faults, it is worthwhile to consider whether a
similar system can be used to repair hardware Trojans in FPGA IP. Genetic programming is
particularly appropriate, because it deals directly with the HDL code delivered to hardware
designers when they purchase 3rd-party IP. Rather than just tolerating Trojans, a system
using genetic programming should be able to completely repair Trojans, removing any
conceivable threat from FPGA IP.
This chapter proposes a novel system titled GENPEFS (GENetic Programming-based
Evolvable FPGA Security) for using genetic programming to remove Trojans from FPGA
IP. The system design and results have also been submitted for publication [14]. The chapter
then analyzes the effectiveness of this approach on three hardware circuits infected with
different types of Trojans.
4.1 System Design
GENPEFS is an extrinsic evolutionary system for run-time Trojan tolerance. It uses a processor
capable of FPGA synthesis and place-and-route to evolve a hardware configuration for
an FPGA. This processor continuously monitors the FPGA for any type of Trojan. Rather
than evolving the FPGA bitstream, the GENPEFS system uses genetic programming to produce
a program in a hardware description language, like Verilog or VHDL. A similar approach
has been applied to generating Verilog code from scratch, and has produced promising results
[38, 39]. A system using genetic programming to enable evolvable hardware should
generate a correct Verilog program and synthesize it into an FPGA bitstream using the
FPGA vendor’s synthesis and place-and-route tools [36]. Removing a Trojan from IP is ultimately
a software problem, and past work has demonstrated that evolutionary and genetic
algorithms are effective at repairing software [6].
Because 3rd-party IP is often delivered as HDL code, GENPEFS is particularly useful
for removing any Trojans that are detected in IP. If a Trojan is not detected during testing, it
may be difficult to deal with its malicious effects once the FPGA is deployed. Most Trojan
tolerance schemes such as TMR are not equipped to deal with any Trojans whose payloads
are side effects. Even in a triply redundant system, infected FPGAs will still be able to leak
information. We expect GENPEFS to be particularly useful in removal of information-
leaking Trojans in FPGA applications where an FPGA cannot be immediately taken offline
for maintenance. The goal of GENPEFS is to remove Trojans from an infected IP block
at run-time while leaving the desired functionality intact. This approach provides many
advantages over traditional Trojan tolerance schemes. Rather than causing downtime by
bringing an entire FPGA offline, we prefer a solution that allows us to fix the FPGA while
it is deployed.
One difficulty in repairing Trojans in FPGA IP is the relative inability to trigger Trojans
during testing. Trojans are often intentionally hidden by their designers, and very difficult
to trigger during testing due to rare triggers. Even if a Trojan is caught at run-time, it may
be nearly impossible to recreate the conditions at test-time that triggered the Trojan, and
the Trojan may never be seen again. It would be useful to maintain a queue of candidate
FPGA configurations that have passed all tests after being evolved in the GENPEFS system.
Whenever a Trojan is detected at run-time, the system should mark the configuration as
infected, and replace it with a candidate system from the queue. It is worth considering
that smaller ASTs that consistently produce correct results might be Trojan-free, as they
have removed something from the tree that is rarely triggered.
Figure 4.1: GENPEFS System Design
Figure 4.1 shows a system diagram for an FPGA using GENPEFS. A genetic engine
continuously evolves FPGA configurations. All configurations that pass all test benches
in simulation are added to a queue of candidate configurations. A monitoring application
watches the FPGA for Trojans. Whenever a Trojan is detected by the monitoring system,
it prompts the synthesis of a candidate configuration. When the configuration is synthe-
sized, the FPGA is reprogrammed with the new configuration. The monitoring application
continues to watch the FPGA, and the process repeats as necessary.
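The monitor-and-replace loop described above can be sketched with a simple queue. This models only the bookkeeping, not synthesis or reprogramming; the class name `ConfigurationManager`, its methods, and the string placeholders for configurations are all hypothetical.

```python
from collections import deque


class ConfigurationManager:
    """Tracks the active configuration and a queue of verified candidates."""

    def __init__(self, active, candidates=()):
        self.active = active
        self.queue = deque(candidates)   # configurations that passed all test benches
        self.infected = []               # configurations on which a Trojan was detected

    def add_candidate(self, config):
        """Called by the genetic engine when a configuration passes all tests."""
        self.queue.append(config)

    def on_trojan_detected(self):
        """Called by the monitor: mark the active configuration and swap in a candidate."""
        self.infected.append(self.active)
        if not self.queue:
            raise RuntimeError("no verified candidate configuration available")
        self.active = self.queue.popleft()  # a real system would synthesize and reprogram here
        return self.active


manager = ConfigurationManager("vendor_ip", ["candidate_a", "candidate_b"])
print(manager.on_trojan_detected(), manager.infected)
```

Keeping candidates ready in advance means the expensive evolutionary search does not have to start only after a Trojan has been observed.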
A genetic programming-based approach has many advantages over traditional evolutionary
hardware models. It is guaranteed to produce valid FPGA configurations, and the
author of an evolutionary algorithm need not add restrictions to the algorithm to cause it to
produce valid configurations. We can expect such an approach to produce correct circuits
faster than evolving a bitstream due to the simplified and abstracted nature of HDLs. For
example, an adder can be represented in Verilog code using 5 nodes in an abstract syntax
tree (AST): one assignment, one addition, two operands, and a destination for the assignment.
In contrast, creating a 4-bit adder in a bitstream requires configuring at least 256
lookup tables or LUTs. However, its ability to produce efficient circuits is limited by the
capabilities of the FPGA vendor’s synthesis tools [32].
4.2 Experimental Approach
To demonstrate the effectiveness of GENPEFS, this work uses the DEAP genetic programming
framework [42] to produce valid Verilog programs using operators that are available
in Verilog. Initial fitness evaluation, which tests combinational logic, is performed within the
DEAP Python environment. When a fit individual is believed to have been created, the individual is
translated to Verilog and tested in the Icarus Verilog simulator [43]. The genetic algorithm
represents programs as an abstract syntax tree (AST) as seen in Figure 4.2. An AST built from
boolean logic operations can be thought of as being similar to a digital circuit. Of course,
ASTs may also contain integer, floating point, etc. operators like + and /, and more complex
programming structures such as if statements and loops.
Figure 4.2: AST of correct 4 to 1 MUX
To demonstrate Trojan removal, this work analyzes removing Trojans from multiplexers,
which are fairly simple circuits. First, this work uses a 4 to 1 MUX. An example of a program that
can be used to evolve a MUX in Python from scratch is provided in the DEAP documentation
and was originally explored by Koza in [45]. This work uses a population size of 1000,
random mutation with a probability of 0.3, and single-point crossover with a probability of
0.8. This work uses a two-parameter fitness function that measures program correctness as
the primary parameter and minimizes AST size as the secondary parameter. A hard AST
depth limit of 80 is imposed on the algorithm.
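A two-parameter fitness of this kind can be sketched as a lexicographic comparison: correctness is compared first, and AST size only breaks ties. The snippet below is a plain-Python illustration, not the fitness class used in the experiments; the population entries and the name `fitness_key` are made up for the example.

```python
def fitness_key(correct_rows, ast_size):
    """Sort key: more correct rows is better; among equals, a smaller AST is better."""
    return (correct_rows, -ast_size)


# Hypothetical individuals for a 4 to 1 MUX (64 input rows).
population = [
    {"name": "small_wrong", "correct": 60, "size": 9},
    {"name": "large_right", "correct": 64, "size": 41},
    {"name": "small_right", "correct": 64, "size": 17},
]

best = max(population, key=lambda ind: fitness_key(ind["correct"], ind["size"]))
print(best["name"])
```

Because size is strictly secondary, a smaller but less correct individual never outranks a fully correct one; size pressure only acts among individuals of equal correctness.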
This process is used to measure the effectiveness of the GENPEFS Trojan removal
strategy:
1. Initialize a random population of hardware circuits
2. Evolve the population until there is a correct MUX in the population
3. Randomly insert a fault-injecting Trojan into the circuit
4. Replace the entire population with this individual
5. Continue evolving the circuit until the population has a correct individual
6. Measure the number of generations after inserting a Trojan it takes to have a com-
pletely correct individual in the population
The Trojans inserted into the circuit are randomly chosen from a variety of fault-
injecting Trojans. Some of these Trojans are transient (making them harder to detect),
while some of them always produce the same incorrect result. Plots are graphed using
Matplotlib [51].
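A fault-injecting Trojan with a rare trigger can be modeled on a 4 to 1 MUX as follows. The specific trigger condition is hypothetical and is not one of the Trojans used in the experiments; it only illustrates why such a Trojan is visible on very few input rows. Transient Trojans, which depend on time or internal state, are not modeled here.

```python
from itertools import product


def mux4(s0, s1, d0, d1, d2, d3):
    """Clean 4 to 1 MUX: two select bits choose one of four data bits."""
    return (d0, d1, d2, d3)[s0 + 2 * s1]


def infected_mux4(s0, s1, d0, d1, d2, d3):
    """Same MUX with a hypothetical Trojan that flips the output on a rare input pattern."""
    trigger = s0 & s1 & d0 & d1 & d2
    return mux4(s0, s1, d0, d1, d2, d3) ^ trigger


rows = list(product((0, 1), repeat=6))
failing = [row for row in rows if mux4(*row) != infected_mux4(*row)]
print(len(rows), len(failing))
```

With this trigger the infected circuit is still correct on 62 of the 64 rows, so its correctness score stays close to that of a clean circuit and the evolutionary search starts from a nearly correct individual.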
4.3 Results
Results are shown in histograms, with runs grouped into bins of fewer than 1000, 2000, 4000,
and 8000 fitness evaluations, and more than 8000 fitness evaluations. 1000 fitness evaluations
correspond to one generation. The results show that in a 4 to 1 MUX, the algorithm successfully removed
the Trojan from the circuit in under 1000 fitness evaluations in most of the runs. All of the
remaining runs took under 8000 fitness evaluations.
Figure 4.3: 4 to 1 MUX Number of Generations To Remove Trojans
Next, the same tests are repeated using an 8 to 1 MUX. An 8 to 1 MUX has 2048
possible input combinations, as opposed to the 4 to 1 MUX’s 64. The 8 to 1 MUX also has
a significantly larger AST, as can be seen in Figure 4.4. This helps demonstrate that the
GENPEFS approach can scale to larger circuits.
Figure 4.4: AST of correct 8 to 1 MUX
Figure 4.5: 8 to 1 MUX Number of Generations To Remove Trojans
Removing a Trojan from the AST of an 8 to 1 MUX took approximately the same
number of generations as for the 4 to 1 MUX, although more runs needed more than 4000 fitness
evaluations. This is consistent with the idea that Trojans take longer to remove from
more complicated ASTs, but it is promising that the results look mostly the same.
Next, we run the same tests on an infected AES encryption module built using PyRTL
[52]. An AES encryption module is an example of an actual IP that hardware designers
may choose to purchase. It provides a much more realistic example of an IP that may be
infected. We can expect an encryption module to be a desirable IP to attack, as encryption
modules process large amounts of secret information. An infected encryption module may
allow attackers access to valuable information.
This module is infected with one Trojan that leaks information, a type of Trojan not
yet tested against GENPEFS (and one that is unaffected by many Trojan tolerance
schemes). Each time the encryption module is used, the Trojan sends the secret to be
encrypted over a serial communication port, similar to the Trojan introduced in Figure 2.
This AES encryption module is a significantly more complex IP than either MUX, so it
should be expected that removal will take somewhat longer. The goal is to remove the
Trojans that leak the key and secret, while leaving the rest of the module intact. The Trojan
is considered removed when the encryption module still passes the PyRTL test cases [52]
and the information-leaking Trojan has been removed entirely from the circuit.
In the AES core with a single Trojan (Figure 4.6), removal was actually faster than
in the 8 to 1 MUX on average. This may be due to the all-or-nothing fitness of a more
complicated core. Even when there is an error in a MUX, it will evaluate many of the inputs
correctly. In the far more complicated AES core, a small change to the logic will make
every result incorrect. This means that the selection function will choose only completely
correct individuals for the next generation, and will tend to keep fewer of the mutations
between generations. Because misleading, almost-correct results are no longer likely, the
algorithm never gets stuck for a long time.
Figure 4.6: AES Encryption Module Number of Fitness Evaluations To Remove One Trojan
Finally the same tests are retried on the same AES core, this time infected with two Tro-
jans instead of just one. This test should help determine whether the approach is scalable
to systems with more than one Trojan present. Results are in Figure 4.7.
Although Trojan removal took longer than in either of the MUXes and in the AES
core with one Trojan, still the IP was Trojan-free after 8000 fitness evaluations in all but a
few runs. No runs were able to eliminate both Trojans in under 1000 fitness evaluations, or
within one generation. This is due to the nature of single point crossover. If the two Trojans
are at different points in the abstract syntax tree - the second Trojan is not within the first
Trojan’s subtree - then it is impossible for single point crossover to remove both Trojans
from the circuit in one operation. Future work might consider adding additional mutation
and crossover operations, such as two point crossover or a tree shrinking operation, to
examine whether those operations are able to more quickly eliminate Trojans from a circuit.
Figure 4.7: AES Encryption Module Number of Fitness Evaluations To Remove Two Trojans
In the AES core, there was no difference between checking the structure of the circuit to
make sure it was intact, and performing logic testing on the circuit. In other words, the only
situation in which the algorithm produced a correct individual was when the algorithm had
removed the Trojan and made no other changes to the circuit. The algorithm was not able
to make any changes to the rest of the circuit without breaking some of its functionality.
This leads to the idea that any time a component is removed from the circuit and it still
passes test benches, the removed component was unnecessary logic, possibly a Trojan.
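The idea that a component is unnecessary if the circuit still passes its test benches without it can be sketched as a prune-and-test loop. The circuit model below (a dictionary of named outputs), the XOR test bench, and the `uart_tx` leak are all hypothetical simplifications, far smaller than the AES core.

```python
def passes_tests(circuit):
    """Hypothetical test bench: only checks the functional output y = a XOR b."""
    y = circuit["y"]
    return all(y(a, b) == (a ^ b) for a in (0, 1) for b in (0, 1))


# Hypothetical infected design: one intended output plus one information leak.
infected = {
    "y": lambda a, b: a ^ b,      # intended functionality
    "uart_tx": lambda a, b: a,    # leak: copies an input to an extra port
}


def removable_components(circuit):
    """Names of components whose removal still leaves a circuit that passes all tests."""
    removable = []
    for name in circuit:
        pruned = {k: v for k, v in circuit.items() if k != name}
        if "y" in pruned and passes_tests(pruned):
            removable.append(name)
    return removable


print(removable_components(infected))
```

Anything reported by such a check is only a candidate for being a Trojan: logic that the test benches happen not to exercise would be flagged in the same way.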
These results demonstrate the value of genetic programming in repairing Trojans in
FPGA device IP, and especially in removing information-leaking Trojans. While most Tro-
jan tolerance strategies are unable to deal with information-leaking Trojans, GENPEFS is
able to remove such a Trojan with relative ease, returning the FPGA system to normal activ-
ity without human intervention. The ability to remove Trojans from FPGA IP should give
hardware designers confidence that their designs will be returned to normal operation after
a Trojan is detected. Future work should develop a metric for determining confidence in the
removal of a Trojan from a circuit, potentially using AST diversity or number of subcircuits
removed from the design to measure whether a Trojan might have been removed.
CHAPTER 5
CONCLUSION
This work has provided a novel classification of hardware Trojans in FPGA IP, based on
past work done in creating hardware Trojans in Verilog. This classification will allow
hardware designers to consider all possible types of Trojans when designing hardware. A
comprehensive classification helps assure hardware designers that they have covered all of
their bases, so to speak.
Future work in hardware Trojan classification will involve finding or creating new types
of hardware Trojans, particularly inspired by different types of Trojans found in software.
Although the proposed taxonomy is meant to cover all current types of Trojans, more work
is needed especially in examining Trojans that introduce other vulnerabilities into the sys-
tem. In order to more completely cover those Trojans, an analysis of different types of
vulnerabilities in FPGA systems is required.
This work has also presented GENPEFS, a novel approach to Trojan tolerance built
around using genetic programming to repair HDL code. This proposed approach to FPGA
security is novel in that it attempts to remove Trojans from the FPGA rather than just
working around them. Additionally, no past work has explored using genetic programming
to continuously monitor and repair FPGAs. This idea was largely inspired by the fault-
tolerant FPGA system proposed by Larchev et al. in [5]. Rather than just using evolvable
hardware to mitigate hardware faults, GENPEFS uses it to tolerate and remove Trojans
from an FPGA system. This can be useful when dealing with Trojans that are not tolerated
by standard Trojan tolerance strategies, such as Trojans that leak information or Trojans
that waste FPGA resources.
The experimental results of GENPEFS are very promising. GENPEFS has been shown
to be effective in small circuits, and appropriately scalable to the much larger, more practical
AES core. Even in the complex AES core, two Trojans were removed within 8,000
fitness evaluations in most cases. This is fairly quick, and much faster than generating
even extremely simple circuits from scratch [39, 38]. So, it is more effective for hardware
designers to monitor and repair hardware designs purchased from 3rd-parties than it is to
attempt to generate them using evolvable hardware.
Future work in improving GENPEFS should involve examining even larger FPGA IP
blocks, like an entire design that makes use of encryption, multiple communication al-
gorithms, and many control structures. Demonstrating the usefulness of GENPEFS in
even larger systems should provide confidence in its ability to monitor complex real-world
FPGA systems.
Another direction for future work in GENPEFS is tuning the evolutionary parameters
to more quickly remove Trojans from the system. In all examined circuits, most runs
removed the Trojans within 10 generations, and often within 5. Avoiding the
outliers by tuning parameters like crossover and mutation probability, and potentially
using AST size minimization as a metric for removing Trojans, are interesting future directions.
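The idea of using AST size as a metric can be illustrated with a minimal sketch. It is not the fitness function used by GENPEFS; the tuple-based AST encoding, the names ast_size and fitness, and the size_weight value are all hypothetical choices made for illustration. The sketch shows only that, when two candidates pass the same tests, a small size penalty favors the one without extra logic.

```python
from typing import Tuple, Union

# Hypothetical AST encoding (an assumption, not the thesis's representation):
# a leaf is a signal name; an internal node is (operator, child, child, ...).
Node = Union[str, Tuple]


def ast_size(node: Node) -> int:
    """Count every operator node and leaf in the tree."""
    if isinstance(node, str):
        return 1
    return 1 + sum(ast_size(child) for child in node[1:])


def fitness(tests_passed: int, tests_total: int, node: Node,
            size_weight: float = 0.001) -> float:
    """Fraction of tests passed minus a small size penalty (higher is better).

    size_weight is an illustrative value; it would need tuning so that the
    penalty never outweighs a genuine functional improvement.
    """
    functional = tests_passed / tests_total
    return functional - size_weight * ast_size(node)


# A clean XOR, and a variant with extra trigger-gated logic that behaves
# identically on the (assumed) test vectors because the trigger never fires.
clean = ("xor", "a", "b")
infected = ("mux", ("and", "t0", "t1"),
            ("xor", "a", "b"),
            ("not", ("xor", "a", "b")))

clean_score = fitness(10, 10, clean)
infected_score = fitness(10, 10, infected)
```

Under these assumptions both candidates pass all ten tests, so the size term alone separates them and the smaller tree scores higher. Whether such a penalty actually speeds up Trojan removal is the open question raised above.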
Finally, future work should examine the usefulness of GENPEFS in mitigating hard-
ware faults and hardware Trojans. Although GENPEFS evolves the AST, it is possible
that modifying the AST while leaving functionality intact will produce a differently-routed
FPGA configuration, avoiding Trojans or faults in hardware. It would be equally interest-
ing to examine how generating logic from scratch creates diverse routings after synthesis.
It is possible that producing diverse ASTs corresponds to diverse routings, meaning even if
a Trojan affects one generated circuit, another generated circuit might be immune. Examining
how this diversity occurs and coming up with a diversity metric for generated FPGA
circuitry is an interesting direction for future work.
REFERENCES
[1] S. Mal-Sarkar, A. Krishna, A. Ghosh, and S. Bhunia, “Hardware trojan attacks in FPGA devices,” Proc. the great lakes symposium on VLSI (GLSVLSI), pp. 287–292, 2014.
[2] Y. Li and K. Skadron, “TMR : A Solution for Hardware Security Designs,” 2015.
[3] A. Alanwar, M. A. Aboelnaga, Y. Alkabani, M. W. El-Kharashi, and H. Bedour, Dynamic fpga detection and protection of hardware trojan: A comparative analysis, 2017. arXiv: 1711.01010 [cs.CR].
[4] S. Mal-Sarkar, R. Karam, S. Narasimhan, A. Ghosh, A. Krishna, and S. Bhunia, “Design and Validation for FPGA Trust under Hardware Trojan Attacks,” IEEE Transactions on Multi-Scale Computing Systems, vol. 2, no. 3, pp. 186–198, 2016.
[5] G. V. Larchev and J. D. Lohn, “Evolutionary Based Techniques for Fault Tolerant Field Programmable Gate Arrays,” 2nd IEEE International Conference on Space Mission Challenges for Information Technology (SMC-IT’06), pp. 314–321, 2006.
[6] A. Arcuri, “Evolutionary repair of faulty software,” Applied Soft Computing Journal, vol. 11, no. 4, pp. 3494–3514, 2011.
[7] F. Wolff, C. Papachristou, S. Bhunia, and R. S. Chakraborty, “Towards trojan-free trusted ics: Problem analysis and detection scheme,” in 2008 Design, Automation and Test in Europe, 2008, pp. 1362–1365.
[8] X. Wang, M. Tehranipoor, and J. Plusquellic, “Detecting malicious inclusions in secure hardware: Challenges and solutions,” in 2008 IEEE International Workshop on Hardware-Oriented Security and Trust, 2008, pp. 15–19.
[9] M. Tehranipoor and F. Koushanfar, “A survey of hardware trojan taxonomy and detection,” IEEE Design Test of Computers, vol. 27, no. 1, pp. 10–25, 2010.
[10] R. S. Chakraborty, S. Narasimhan, and S. Bhunia, “Hardware trojan: Threats and emerging solutions,” in 2009 IEEE International High Level Design Validation and Test Workshop, 2009, pp. 166–171.
[11] H. Momeni, M. Masoumi, and A. Dehghan, “A practical fault induction attack against an fpga implementation of aes cryptosystem,” in World Congress on Internet Security (WorldCIS-2013), 2013, pp. 134–138.
[12] R. Karri, K. Wu, P. Mishra, and Y. Kim, “Concurrent error detection of fault-based side-channel cryptanalysis of 128-bit symmetric block ciphers,” in Proceedings of the 38th Annual Design Automation Conference, ser. DAC ’01, Las Vegas, Nevada, USA: ACM, 2001, pp. 579–584, ISBN: 1-58113-297-2.
[13] J. Breier and W. He, “Multiple fault attack on present with a hardware trojan implementation in fpga,” in 2015 International Workshop on Secure Internet of Things (SIoT), 2015, pp. 58–64.
[14] D. K.A. R. Z. Collins R. Jha, Submitted to IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
[15] I. Hadzic, S. Udani, and J. M. Smith, “Fpga viruses,” in Proceedings of the 9th International Workshop on Field-Programmable Logic and Applications, ser. FPL ’99, London, UK, UK: Springer-Verlag, 1999, pp. 291–300, ISBN: 3-540-66457-2.
[16] R. Druyer, L. Torres, P. Benoit, P. V. Bonzom, and P. Le-Quere, “A survey on security features in modern fpgas,” in 2015 10th International Symposium on Reconfigurable Communication-centric Systems-on-Chip (ReCoSoC), 2015, pp. 1–8.
[17] L. Kafka, P. Kubalík, H. Kubatova, and E. Novak, “Fault classification for self-checking circuits implemented in fpga,” 2005.
[18] E. Netto, R Vaslin, J Crenne, P. Cotret, G. Gogniat, J.-P. Diguet, J.-L. Danger, P. Maurine, V Fischer, B. Badrignans, L Barthe, P Benoit, and L. Torres, “Security fpga analysis,” in. Jan. 2011, pp. 7–46, ISBN: 978-94-007-1338-3.
[19] M. Antonakakis, T. April, M. Bailey, M. Bernhard, E. Bursztein, J. Cochran, Z. Durumeric, J. A. Halderman, L. Invernizzi, M. Kallitsis, D. Kumar, C. Lever, Z. Ma, J. Mason, D. Menscher, C. Seaman, N. Sullivan, K. Thomas, and Y. Zhou, “Understanding the mirai botnet,” in Proceedings of the 26th USENIX Conference on Security Symposium, ser. SEC’17, Vancouver, BC, Canada: USENIX Association, 2017, pp. 1093–1110, ISBN: 978-1-931971-40-9.
[20] K. Ganesan, J. Jo, W. L. Bircher, D. Kaseridis, Z. Yu, and L. K. John, “System-level max power (sympo) - a systematic approach for escalating system-level power consumption using synthetic benchmarks,” in 2010 19th International Conference on Parallel Architectures and Compilation Techniques (PACT), 2010, pp. 19–28.
[21] A. Tripathi and U. K. Singh, “Towards standardization of vulnerability taxonomy,” in 2010 2nd International Conference on Computer Technology and Development, 2010, pp. 379–384.
[22] S. Bhunia, M. S. Hsiao, M. Banga, and S. Narasimhan, “Hardware trojan attacks: Threat analysis and countermeasures,” Proceedings of the IEEE, vol. 102, pp. 1229–1247, 2014.
[23] S. Bhunia, M. Abramovici, D. Agrawal, P. Bradley, M. S. Hsiao, J. F. Plusquellic, and M. M. Tehranipoor, “Protection against hardware trojan attacks: Towards a comprehensive solution,” IEEE Design & Test, vol. 30, pp. 6–17, 2013.
[24] I. Chipworks, Semiconductor manufacturing - reverse engineering of semiconductor components, parts and process, http://chipworks.com, Accessed: 2019-02-12.
[25] R. S. Chakraborty, F. Wolff, S. Paul, C. Papachristou, and S. Bhunia, “Mero: A statistical approach for hardware trojan detection,” in Cryptographic Hardware and Embedded Systems - CHES 2009, C. Clavier and K. Gaj, Eds., Berlin, Heidelberg: Springer Berlin Heidelberg, 2009, pp. 396–410, ISBN: 978-3-642-04138-9.
[26] J. Aarestad, D. Acharyya, R. Rad, and J. Plusquellic, “Detecting trojans through leakage current analysis using multiple supply pad iddqs,” Trans. Info. For. Sec., vol. 5, no. 4, pp. 893–904, Dec. 2010.
[27] D. Rai and J. Lach, “Performance of delay-based trojan detection techniques under parameter variations,” in Proceedings of the 2009 IEEE International Workshop on Hardware-Oriented Security and Trust, ser. HST ’09, Washington, DC, USA: IEEE Computer Society, 2009, pp. 58–65, ISBN: 978-1-4244-4805-0.
[28] T. Reece, D. Limbrick, and W. Robinson, “Design comparison to identify malicious hardware in external intellectual property,” Nov. 2011, pp. 639–646.
[29] E. Love, Y. Jin, and Y. Makris, “Proof-carrying hardware intellectual property: A pathway to trusted module acquisition,” Trans. Info. For. Sec., vol. 7, no. 1, pp. 25–40, Feb. 2012.
[30] S.-M. Yen and M. Joye, “Checking before output may not be enough against fault-based cryptanalysis,” IEEE Trans. Computers, vol. 49, pp. 967–970, 2000.
[31] J. R. Koza, “Human-competitive results produced by genetic programming,” Genetic Programming and Evolvable Machines, vol. 11, no. 3-4, pp. 251–284, 2010.
[32] A. Thompson, “An evolved circuit, intrinsic in silicon, entwined with physics,” in Evolvable Systems: From Biology to Hardware, T. Higuchi, M. Iwata, and W. Liu, Eds., Berlin, Heidelberg: Springer Berlin Heidelberg, 1997, pp. 390–405, ISBN: 978-3-540-69204-1.
[33] P. C. Haddow and A. M. Tyrrell, “Evolvable hardware challenges: Past, present and the path to a promising future,” in Inspired by Nature: Essays Presented to Julian F. Miller on the Occasion of his 60th Birthday, S. Stepney and A. Adamatzky, Eds. Cham: Springer International Publishing, 2018, pp. 3–37, ISBN: 978-3-319-67997-6.
[34] G. Barlow and M. A. Edwards, “Self-evolving hardware,” Apr. 2004.
[35] D. Levi and S. A. Guccione, “Geneticfpga: Evolving stable circuits on mainstream fpga devices,” in Proceedings of the First NASA/DoD Workshop on Evolvable Hardware, 1999, pp. 12–17.
[36] L. Sekanina, “Evolutionary hardware design,” Proceedings of SPIE - The International Society for Optical Engineering, vol. 8067, May 2011.
[37] R. Salvador, “Evolvable hardware in fpgas: Embedded tutorial,” in 2016 International Conference on Design and Technology of Integrated Systems in Nanoscale Era (DTIS), 2016, pp. 1–6.
[38] J. Cullen, “Evolutionary meta programming,” GEC ’09: Proceedings of the first ACM/SIGEVO Summit on Genetic and Evolutionary Computation, pp. 81–88, 2009.
[39] U. R. Karpuzcu, “Automatic verilog code generation through grammatical evolution,” in Proceedings of the 7th Annual Workshop on Genetic and Evolutionary Computation, ser. GECCO ’05, Washington, D.C.: ACM, 2005, pp. 394–397.
[40] M. O’Neill and C. Ryan, “Grammatical evolution,” IEEE Transactions on Evolutionary Computation, vol. 5, no. 4, pp. 349–358, 2001.
[41] C. Ryan, J. Collins, and M. O. Neill, “Grammatical evolution: Evolving programs for an arbitrary language,” in Genetic Programming, W. Banzhaf, R. Poli, M. Schoenauer, and T. C. Fogarty, Eds., Berlin, Heidelberg: Springer Berlin Heidelberg, 1998, pp. 83–96, ISBN: 978-3-540-69758-9.
[42] F.-M. De Rainville, F.-A. Fortin, M.-A. Gardner, M. Parizeau, and C. Gagné, “Deap: A python framework for evolutionary algorithms,” in Proceedings of the 14th Annual Conference Companion on Genetic and Evolutionary Computation, ser. GECCO ’12, Philadelphia, Pennsylvania, USA: ACM, 2012, pp. 85–92, ISBN: 978-1-4503-1178-6.
[43] Icarus verilog, http://iverilog.icarus.com/home, Accessed: 2019-02-12.
[44] Multiplexer 3-8 problem, https://deap.readthedocs.io/en/master/examples/gp_multiplexer.html, Accessed: 2019-02-13.
[45] J. R. Koza, Genetic programming: On the programming of computers by means of natural selection. Cambridge, MA, USA: MIT Press, 1992, ISBN: 0-262-11170-5.
[46] D. A. Schult, “Exploring network structure, dynamics, and function using networkx,” in Proceedings of the 7th Python in Science Conference (SciPy), 2008, pp. 11–15.
[47] D. Mange and M. Tomassini, Bio-inspired computing machines: Towards novel computational architectures. PPUR Presses Polytechniques, 1998.
[48] R. O. Canham and A. M. Tyrrell, “Evolved fault tolerance in evolvable hardware,” in Proceedings of the 2002 Congress on Evolutionary Computation. CEC’02 (Cat. No.02TH8600), vol. 2, 2002, 1267–1271 vol.2.
[49] L. Ionescu, A. Mazare, and G. Şerban, “Intrinsic evolvable hardware used for fault tolerance systems,” Int. J. Organ. Collect. Intell., vol. 3, no. 2, pp. 43–80, Apr. 2012.
[50] W. Weimer, S. Forrest, C. Le Goues, and T. Nguyen, “Automatic program repair with evolutionary computation,” Commun. ACM, vol. 53, no. 5, pp. 109–116, May 2010.
[51] J. D. Hunter, “Matplotlib: A 2d graphics environment,” Computing In Science & Engineering, vol. 9, no. 3, pp. 90–95, 2007.
[52] Pyrtl, https://pyrtl.readthedocs.io/en/latest/, Accessed: 2019-03-01.
- Title Page
- Acknowledgments
- Table of Contents
- List of Tables
- List of Figures
- Introduction
- Hardware Trojans
- Taxonomies
- Taxonomy of Trojans in FPGA IP
- Trojans that Cause Malfunction
- Trojans that Prevent FPGA Operation
- Trojans that Inject Faults
- Trojans that Cause Side Effects
- Trojans that Leak Information
- Trojans that Waste FPGA Resources
- Trojans that Introduce Vulnerabilities
- Existing Trojan Mitigation Strategies and FPGA IP
- Trojan Detection Techniques
- Trojan Tolerance Techniques
- Evolutionary Algorithms and Evolvable Hardware
- Evolutionary and Genetic Programming
- Evolvable Hardware In FPGAs
- Genetic Programming-based Evolvable Hardware
- Background and Justification
- Past Work
- Preliminary Results
- Trust-Oriented Applications of Evolvable Hardware
- Applications to Trojans in FPGA IP
- Applications to Trojans in FPGA Hardware
- Genetic Programming-based Evolvable Hardware for FPGA Security
- System Design
- Experimental Approach
- Results
- Conclusion
- References
sources/142/Silwal - 2013 - Asynchronous Physical Unclonable Function using FP.pdf
A Thesis
entitled
Asynchronous Physical Unclonable Function using FPGA-based Self-Timed Ring
Oscillator
by
Roshan Silwal
Submitted to the Graduate Faculty as partial fulfillment of the requirements for the
Master of Science Degree in Electrical Engineering
Dr. Mohammed Y Niamat, Committee Chair
Dr. Robert C. Green II, Committee Member
Dr. Weiqing Sun, Committee Member
Dr. Patricia R. Komuniecki, Dean
College of Graduate Studies
The University of Toledo
August 2013
Copyright 2013, Roshan Silwal
This document is copyrighted material. Under copyright law, no parts of this document
may be reproduced without the expressed permission of the author.
An Abstract of
Asynchronous Physical Unclonable Function using FPGA-based Self-Timed Ring
Oscillator
by
Roshan Silwal
Submitted to the Graduate Faculty as partial fulfillment of the requirements for the
Master of Science Degree in Electrical Engineering
The University of Toledo
August 2013
Field Programmable Gate Array (FPGA) security has emerged as a challenging
security paradigm in system design. Systems implemented on FPGAs require secure
operations and communication. There is a growing concern over the security attributes of
FPGAs regarding protecting and securing information processed within them, protecting
designs during distribution and protecting intellectual property rights. One of the
important aspects of improving the trustworthiness level of FPGAs is enhancing the
physical security of FPGAs. A Physical Unclonable Function (PUF) provides a means to
enhance physical security of Integrated Circuits (ICs) against piracy and unauthorized
access. PUFs exploit the inherent and embedded randomness that occurs during the
fabrication process of silicon devices.
This thesis presents a novel FPGA-based PUF design technique using
asynchronous logic. Significant process variations exist in IC fabrication, which makes
each IC unique in its delay characteristics. The statistical delay variation in transistors
and wires across FPGA chips is exploited through identically laid-out asynchronous ring
oscillators. The asynchronous ring oscillators generate oscillations of varying frequencies
when the oscillators are identically mapped on a semiconductor device. These varying
frequencies produced by identically mapped self-timed ring oscillators are used to
generate unique PUF response bits, which are used in device authentication and
cryptographic applications such as secret key generation and True Random Number
Generators (TRNGs). Experimental analysis shows that the asynchronous oscillators of the PUFs
generate oscillations of varying frequencies, and that the uniqueness of the PUF responses is
49.92%, which is very close to the ideal value of 50%.
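The uniqueness figure quoted above is conventionally computed as the average pairwise Hamming distance between the responses of different chips, expressed as a percentage of the response length. The following minimal sketch shows that calculation. The frequencies are made-up values, and the scheme of comparing disjoint neighboring oscillator pairs is an assumption made for illustration, not necessarily the pairing used in this thesis; the function names are likewise hypothetical.

```python
from itertools import combinations
from typing import List, Sequence


def response_bits(freqs_mhz: Sequence[float]) -> List[int]:
    """One bit per disjoint oscillator pair: 1 if the first one is faster.

    The disjoint-pair comparison is an illustrative assumption.
    """
    return [1 if freqs_mhz[i] > freqs_mhz[i + 1] else 0
            for i in range(0, len(freqs_mhz) - 1, 2)]


def hamming_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of bit positions in which two equal-length responses differ."""
    return sum(x != y for x, y in zip(a, b))


def uniqueness_percent(responses: Sequence[Sequence[int]]) -> float:
    """Mean inter-chip Hamming distance as a percentage of response length."""
    n_bits = len(responses[0])
    pairs = list(combinations(responses, 2))
    total = sum(hamming_distance(a, b) / n_bits for a, b in pairs)
    return 100.0 * total / len(pairs)


# Made-up oscillator frequencies (MHz) for three chips, eight oscillators each.
chip_freqs = [
    [198.7, 201.3, 205.1, 207.4, 202.2, 199.9, 204.0, 203.1],
    [199.2, 200.8, 206.0, 203.9, 201.5, 200.1, 202.7, 204.6],
    [197.8, 202.5, 204.9, 204.4, 200.3, 201.0, 205.2, 203.3],
]
responses = [response_bits(f) for f in chip_freqs]
uniqueness = uniqueness_percent(responses)
```

With these illustrative inputs every pair of chips differs in exactly half of its bits, so the sketch returns the ideal 50%. Real measurements would scatter around that value.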
This thesis is dedicated to my parents, my sisters and my lovely wife.
Acknowledgements
I would like to express my deep sense of gratitude to my thesis supervisor, Dr.
Mohammed Niamat, for giving me an opportunity to work with him in this research and
providing me a tremendous level of support and cooperation throughout my research
work and graduate studies.
I would also like to thank the thesis committee members, Dr. Robert C. Green II
and Dr. Weiqing Sun, for their valuable time in reviewing this thesis.
The research work in this thesis was supported in part by National Science
Foundation (NSF) grant award #203687.
Table of Contents
Abstract
Acknowledgements
Table of Contents
List of Tables
List of Figures
List of Abbreviations
List of Symbols
1 Introduction
1.1 Context and Motivation
1.2 Contributions
1.3 Thesis Outline
2 Physical Unclonable Functions
2.1 Introduction
2.2 PUF Terminologies
2.2.1 Significance of Process Variations
2.2.2 Environmental Variations
2.2.3 Challenge-Response Pairs
2.3 Sources of Noise
2.3.1 Noise due to Manufacturing Process
2.3.2 Local Noise
2.3.3 Environmental Noise
2.4 Measure of Quality
2.4.1 Uniqueness
2.4.2 Reliability
2.4.3 Resiliency
2.5 PUF Classifications
2.5.1 Non-electronic PUF, Electronic PUF and Silicon PUF
2.5.2 Strong PUF and Weak PUF
2.5.3 Intrinsic PUF and Non-intrinsic PUF
2.6 PUF Circuits
2.6.1 Delay-based PUF
2.6.1.1 Arbiter PUF
2.6.1.2 Ring Oscillator PUF
2.6.1.3 Glitch PUF
2.6.2 Memory-based PUF
2.6.2.1 SRAM PUF
2.6.2.2 Butterfly PUF
2.7 PUF Applications
3 Self-Timed Rings
3.1 Introduction
3.2 Asynchronous Circuits
3.3 Asynchronous Logic
3.3.1 Muller C-element
3.4 Self-Timed Rings
3.4.1 Self-Timed Ring Structure
3.4.2 Token and Bubble Propagation
3.4.3 Jitter in Inverter RO and Self-Timed RO
4 Asynchronous Approach to Ring Oscillator for FPGA-based PUF Design
4.1 Introduction
4.2 FPGA Architecture
4.2.1 Architecture of Spartan-II
4.3 LUT Implementation of Muller Gate
4.4 Logical Implementation of a Self-Timed Ring Oscillator
4.5 Experimental Results
4.6 Conclusion
5 STRO-PUF: Self-Timed Ring Oscillator based PUF
5.1 Introduction
5.2 Architecture of STRO-PUF
5.3 Implementation of STRO-PUF
5.4 Experimental Analysis
5.4.1 Analysis of Output Frequencies
5.4.2 Analysis of Uniqueness of STRO-PUF
5.4.3 FPGA Authentication using STRO-PUF
5.4.4 Reliability Enhancement with STRO-PUF
5.5 Conclusion
6 Conclusion
6.1 Conclusion
6.2 Future Directions
References
A Source Codes
A.1 VHDL Code for a Self-Timed Ring (STR)
A.2 VHDL Code for STRO-PUF
A.3 UCF File for Mapping STRO-PUF in a Desired Region
A.4 Uniqueness Analysis of STRO-PUF for 16-bit Response
A.5 Uniqueness Analysis of STRO-PUF for 256-bit Response
List of Tables
2.1 Different types of PUFs
4.1 LUT mapping of reset Muller gate
4.2 LUT mapping of set Muller gate
4.3 Frequency values for implemented asynchronous ring oscillators
5.1 16-bit STRO-PUF responses
5.2 256-bit STRO-PUF responses
5.3 Comparing responses with dependent bits and independent bits
5.4 Uniqueness results for FPGA-based PUFs
List of Figures
2-1 Optical PUF
2-2 An Arbiter PUF delay circuit
2-3 Ring Oscillator PUF
2-4 RO-PUF generating a single response bit
2-5 Anderson PUF
2-6 SRAM Cell
2-7 Butterfly PUF cell
2-8 Secret key generation using PUF
2-9 HRNG using PUF
3-1 Synchronous circuit
3-2 Asynchronous circuit
3-3 Abstract data-flow view of an asynchronous circuit
3-4 Standard Muller gate and its truth table
3-5 Implementations of Muller C-element
3-6 Three stage pipeline and ring
3-7 An N-stage self-timed ring
3-8 Token-bubble propagation
3-9 Burst mode propagation and evenly-spaced mode propagation
4-1 A typical FPGA architecture
4-2 Structure of a typical logic block
4-3 Spartan-II slice
4-4 A stage in STR
4-5 VHDL instantiation of reset Muller gate
4-6 LUT-based four-stage asynchronous ring oscillator
4-7 Technology schematic view of 6-stage self-timed ring oscillator
4-8 Implementation of 6-stage self-timed ring oscillator
4-9 Placement constraint used to define position of stages of self-timed ring
4-10 Simulation result of 6-stage STR oscillator with TTBBBB configuration
4-11 Simulation result of 6-stage STR oscillator with TTTTBB configuration
4-12 Real output of 6-stage STR oscillator with TTBBBB
5-1 Architecture of the proposed STRO-PUF
5-2 Six-stage asynchronous ring oscillator
5-3 Hard-macro implemented as 6-stage asynchronous ring oscillator
5-4 Layout view of an STRO-PUF implemented
5-5 Portion of an STRO-PUF in FPGA Editor
5-6 PUFs mapped on six different regions
5-7 PUF outputs in initialization mode and oscillation mode
5-8 Simulation result of STRO-PUF output frequencies
5-9 Portion of STRO-PUF output frequencies in a logic analyzer
5-10 Distribution of frequencies generated by asynchronous ring oscillator
5-11 Uniqueness Analysis for 16-bit PUF response
5-12 Uniqueness Analysis for 256-bit PUF response
5-13 FPGA authentication using STRO-PUF
5-14 Effect of temperature and voltage on oscillator frequencies
List of Abbreviations
ASIC ..........................Application Specific Integrated Circuits
BPUF..........................Butterfly PUF
CLB ............................Configurable Logic Block
CLK............................Clock
CRP ............................Challenge-Response Pair
ECC ............................Error Correcting Code
EDA ...........................Electronic Design Automation
EMI ............................Electro-Magnetic Interference
ERAI ..........................Electronics Resellers Association International
FF ...............................Flip-Flop
FPGA .........................Field Programmable Gate Array
HD ..............................Hamming Distance
HRNG ........................Hardware Random Number Generator
I/O ..............................Input / Output
IC................................Integrated Circuit
IP ................................Intellectual Property
IRO .............................Inverter Ring Oscillator
ITRS ...........................International Technology Roadmap for Semiconductors
LAB............................Logic Array Block
LC ..............................Logic Cell
LE ...............................Logic Element
LUT ............................Look-Up-Table
MUX ..........................Multiplexer
NIST ...........................National Institute of Standards and Technology
OEM ...........................Original Equipment Manufacturer
xvi
PDF ............................Probability Density Function
PMF............................Probability Mass Function
PUF ............................Physical Unclonable Function
RFID ..........................Radio Frequency Identification
RO ..............................Ring Oscillator
RO-PUF .....................Ring Oscillator based Physical Unclonable Function
RTL ............................Register Transfer Level
SR ...............................Set / Reset
SRAM ........................Static Random Access Memory
STR ............................Self-Timed Ring
STRO .........................Self-Timed Ring Oscillator
STRO-PUF .................Self-Timed Ring Oscillator based Physical Unclonable Function
TRNG .........................True Random Number Generator
UCF ............................User Constraint File
VHDL ........................VHSIC Hardware Description Language
VLSI ...........................Very Large Scale Integration
List of Symbols
ack ..............................acknowledge signal
B .................................Bubble
C .................................Muller C-element or Muller gate
F .................................Forward input of Muller gate
f ..................................frequency
MHz ...........................Mega-Hertz
N .................................Number of stages in a ring oscillator
NB ...............................Number of bubbles
ns ................................nano-seconds
NT ...............................Number of tokens
Q .................................Current output state of Muller gate
Q’ ...............................previous output state of Muller gate
R .................................Reverse input of Muller gate
R’i ...............................Response bit from chip i in different environmental conditions
R’i,y .............................y-th sample of R’i
Ri ................................Response bit from chip i
SR ...............................Set/Reset Signal
T .................................Token
TV ...............................Target value
Chapter 1
Introduction
1.1 Context and Motivation
FPGAs are increasingly used in products and systems of all kinds, where they
often form the core of the system. FPGAs are dominating a wide range of application
areas including military, defense, space, automotive and consumer electronics. This rise
in both the usage and importance of FPGAs in systems makes protecting the IP contained
in FPGAs as important as protecting the data processed by the FPGA. There has been a
growing concern over the security attributes of FPGAs regarding protecting and securing
information processed within them, protecting designs during distribution and protecting
intellectual property rights [1]. Design security is often thought of in terms of
protecting Intellectual Property (IP); however, potential losses extend beyond just the
financial. With the increasing use of programmable logic beyond commercial markets to
avionic, space and military applications, design security takes on the additional aspects of
safety and national security.
As FPGAs are being used in more applications that require security features,
attackers look for vulnerabilities and developers for defenses. Cloning, overbuilding,
reverse engineering and tampering are the major security vulnerabilities of FPGAs. These
threats can have far-reaching consequences ranging from counterfeiting to espionage, and
are faced by corporations and governments alike [2]. Cloning is making an illegal replica
of an original design without understanding the exact details of the design. The attacker
simply considers the original design as a black-box to copy the design to resell without
making an investment in the initial design effort. Cloning not only harms the revenue of
the Original Equipment Manufacturer (OEM) but also affects the OEM’s reputation
because of the poor quality of cloned products. Overbuilding is the easiest form of design
theft, which occurs when a subcontractor builds more units than have been ordered for
fabrication by an OEM. The overbuilt units produced are identical to the originals, which
makes identification difficult. Reverse engineering is making functionally equivalent
designs from an existing design by probing details of the original design. An adversary
can use this information to either develop effective countermeasures or to produce similar
equipment. In FPGAs, bitstream reversal can transform the encoded bitstream into a
functionally equivalent description of the original design. Tampering is an attempt to gain
unauthorized access to an electronic system. Tampering can either be part of a reverse
engineering program, or it can have a malicious motive.
Recently, electronic industries have been facing an increased amount of hardware
counterfeits. The increased complexity in the supply chain system of electronic
components has made counterfeit components easily available in the gray market. These
counterfeit components, when assembled into a product or a system, can not only degrade
its performance and reliability but also create safety issues. Increasing incidents have
been reported to the Electronics Resellers Association International (ERAI) since 2008.
In 2011, there were more than 1,300 counterfeit incidents reported from around the
world. This number is more than double the number reported in 2010 and 2008, and
quadruple the number reported in 2009 [3].
A Physical Unclonable Function (PUF) [4, 5] provides a means to enhance physical
security of Integrated Circuits (ICs) against piracy and unauthorized access. A PUF is
used to solve various security issues, such as chip authentication, cryptographic key
generation, software licensing, Intellectual Property (IP) protection, and detection and
prevention of IC counterfeiting.
Although a Self-Timed Ring (STR) is well studied in many contexts, there has
been limited work done in the field of hardware security and hardware cryptography. The
work in this thesis is also motivated by the fact that there is no previous work on the
FPGA-based implementation of PUFs using asynchronous logic. Self-timed rings are
considered robust to environmental variations [6, 7], and this feature of the self-timed
ring oscillator is explored to build robust PUFs that strengthen the PUF responses. The
terms ‘asynchronous ring’ and ‘self-timed ring’ are used interchangeably throughout this
thesis.
1.2 Contributions
The major contributions of the work described in this thesis are as follows:
• Introduces a Look-Up-Table (LUT) based implementation of asynchronous ring
oscillators for PUF design.
• Proposes a novel PUF design approach using self-timed ring oscillators. The
proposed PUF is named the ‘Self-Timed Ring Oscillator PUF’ (STRO-PUF).
• Performs experimental analyses on real semiconductor devices. Previous
work [8] on an asynchronous PUF was limited to electrical simulations.
1.3 Thesis Outline
This thesis is organized as follows:
Chapter 2 gives an overview of Physical Unclonable Functions (PUFs) including
PUF definitions, terminologies related to PUFs, PUF quality measures, different types of
PUFs and applications of PUFs.
Chapter 3 gives a brief introduction of asynchronous logic and asynchronous
circuits to design a Self-Timed Ring (STR), also called an asynchronous ring. It
discusses the structure of a self-timed ring oscillator using Muller C-element and the
propagation mode of oscillation in the ring.
Chapter 4 focuses on two major implementations required for the proposed PUF
design: the LUT-based implementation of the Muller C-element and the asynchronous approach
to the ring oscillator for implementing the Self-Timed Ring (STR) on FPGAs. This
chapter explains the technique for logical implementation of the self-timed ring oscillator
using an underlying FPGA architecture.
Chapter 5 discusses the architecture and the detailed implementation of the
proposed Self-Timed Ring Oscillator based PUF (STRO-PUF). The experimental
analyses are performed to validate the design for calculating PUF uniqueness and
analyzing variation in output frequencies of asynchronous ring oscillators.
Finally, Chapter 6 concludes the thesis and presents ideas for future work.
Chapter 2
Physical Unclonable Functions
2.1 Introduction
Security in Integrated Circuits (ICs) has become an important issue due to high
information security requirements. One of the important aspects of improving the
trustworthiness level of semiconductor devices and the semiconductor supply chain is
enhancing physical security. These semiconductor devices demand both computational
security and physical security. A Physical Unclonable Function (PUF) [4, 5] provides a
means to enhance physical security of Integrated Circuits (ICs) against piracy and
unauthorized access. This chapter discusses PUF definitions, terminologies related to
PUFs, PUF quality measures, different types of PUFs and applications of PUFs.
PUFs exploit the inherent delay characteristics of wires and transistors that differ
from chip to chip due to manufacturing process variations [9]. These complex physical
characteristics of ICs are used to generate unique signatures which are random,
unpredictable and difficult to reproduce. A PUF generates a set of responses when
stimulated by a set of input challenges. The challenge-response relation is defined by
complex physical properties of the material, such as process variability of semiconductor
devices.
PUFs increase physical security by generating volatile secrets in digital form
while the chip is in operation. Secret keys are essential to many security related
applications. Storing secrets in a non-volatile memory is not only expensive but can also
be an easy target for invasive attacks [1]. A PUF offers an inexpensive and secure
approach for generating secret keys. A PUF generates a unique response, or output bits
for each challenge, or input bits. This feature of PUF is used to solve various security
issues, such as chip authentication, cryptographic key generation, software licensing,
Intellectual Property (IP) protection, and detection and prevention of IC counterfeiting.
2.2 PUF Terminologies
2.2.1 Significance of Process Variations
Significant process variations exist in IC fabrication, which makes each IC unique
in its delay characteristics [10]. These variations exist die-to-die (inter-die) or within a die
(intra-die). Die-to-die parameter fluctuations resulting from lot-to-lot, wafer-to-wafer,
and a portion of the within-wafer variations affect every element on a chip equally.
Within-die parameter fluctuations consisting of both random and systematic components
produce a non-uniformity of electrical characteristics across the chip. These variations
occur during various fabrication steps. The lot-to-lot and wafer-to-wafer variations
include process temperatures and pressures, equipment properties, wafer polishing, and
wafer placement. The within-wafer variations affect both die-to-die and within-die
variations. Across a die, device delays vary due to mask variations and placement of
dopant atoms in the device channel region. Variability in device parameters, such as
effective channel length, threshold voltage and gate oxide thickness results in different
characteristics of circuit elements in a chip.
The process variation is becoming more difficult to control in modern Very Large
Scale Integration (VLSI) designs due to the continuous reduction in feature size. Process
variations in nanometer technologies are becoming more significant for cutting-edge
FPGAs. Though an FPGA has a regular fabric with replicated layout tiles, the
design-dependent systematic variation is significant in advanced technology [11]. A
manufacturer-resistant PUF can be created by exploiting statistical delay characteristics
of the PUF circuit [12].
Most of the PUF designs are based on delay variation of logic and interconnects.
The fundamental principle behind the delay based PUF is to compare a pair of identically
mapped circuit elements and measure the delay mismatch due to manufacturing process
variations. This technique demands identical implementation of two circuit elements
being compared. Identical mapping of circuit elements can be achieved by
VLSI level placement and routing techniques.
2.2.2 Environmental Variations
The delay of gates and wires depends on junction temperatures which rely on
ambient temperatures. The significant variations in the ambient temperatures can result in
major variations in delays. Therefore, the ambient temperature is one of the most
significant environmental conditions that affect the circuit operating conditions. The
impact of varying junction temperatures can be compensated for by using identical
components in PUF circuit design. The main problem caused by environmental
variation is the inconsistent result from the same design, which may pose challenges
related to robustness. The relative measure of delays can provide robustness against
environmental variations including variations in temperatures and voltages. Circuit aging
can also change delay characteristics of a circuit, but its effect is considerably smaller
than variations in supply voltage and temperatures.
2.2.3 Challenge-Response Pairs
An input to a PUF is called a challenge and the output a response. An applied
challenge and its measured response are generally called a Challenge-Response Pair
(CRP). A PUF generates a unique set of output bits, or response, for each secret input set,
or challenge. In PUF-based authentication, a CRP database is created from a particular
PUF by applying randomly chosen challenges to obtain unpredictable responses. During
verification, a challenge from the CRP database is applied to the PUF, and the response
produced by the PUF is compared with the corresponding response from the database.
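The enrollment and verification steps described above can be sketched in Python. This is only an illustration under stated assumptions: `SimulatedPUF`, `enroll` and `verify` are hypothetical names, and a hash of a hidden per-chip seed stands in for the physical delay measurement that a real PUF performs. The exact-match comparison also ignores response noise, which a practical scheme would have to tolerate.

```python
import hashlib
import random


class SimulatedPUF:
    """Toy stand-in for a physical PUF (assumption for illustration only).

    A hidden per-chip seed plays the role of manufacturing variation.
    """

    def __init__(self, chip_seed: int, response_bits: int = 16):
        self._chip_seed = chip_seed
        self.response_bits = response_bits

    def respond(self, challenge: int) -> int:
        # Hash the hidden seed with the challenge to obtain a repeatable,
        # chip-specific response. A real PUF measures silicon delays instead.
        digest = hashlib.sha256(f"{self._chip_seed}:{challenge}".encode()).digest()
        return int.from_bytes(digest, "big") % (1 << self.response_bits)


def enroll(puf, num_crps, rng):
    """Build the CRP database by applying randomly chosen challenges."""
    challenges = [rng.getrandbits(32) for _ in range(num_crps)]
    return {c: puf.respond(c) for c in challenges}


def verify(puf, crp_db, rng):
    """Apply one stored challenge and compare the fresh response with the database."""
    challenge = rng.choice(sorted(crp_db))
    return puf.respond(challenge) == crp_db[challenge]


rng = random.Random(0)
genuine = SimulatedPUF(chip_seed=1234)
clone = SimulatedPUF(chip_seed=9999)   # a different chip with the same design
db = enroll(genuine, num_crps=8, rng=rng)
genuine_ok = verify(genuine, db, rng)
```

Here the genuine chip reproduces its stored response, while a chip with a different hidden seed would have to guess a 16-bit response for whichever challenge is drawn.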
2.3 Sources of Noise
The PUF circuit can have three major sources of randomness from its
manufacturing to its usage: noise due to the manufacturing process, local noise and
environmental noise [13].
2.3.1 Noise due to Manufacturing Process
Manufacturing process noise is due to variations in silicon layers during various
steps in the manufacturing processes. This noise is specific to each IC. An ideal PUF is
built to extract the maximum information related to manufacturing process noise to
uniquely identify a circuit or device.
2.3.2 Local Noise
Local noise arises when the circuit is in operation. This noise is due to the random
thermal motion of charge carriers. Local noise should be minimized to decrease intra-chip
variation for PUF designs. However, local noise can be a good source of randomness
for random number generators.
2.3.3 Environmental Noise
Environmental variations such as temperature and power supply voltage
variations are the major causes of noise in PUF responses. This environmental noise can
disrupt the consistency in PUF responses and increase the intra-chip variations, which
reduces the robustness of PUF design.
2.4 Measure of Quality
The metrics to evaluate the basic PUF functions define the trustworthiness of the
PUF. The quality factor of a PUF is measured in terms of its uniqueness, reliability and
resiliency [9, 14].
2.4.1 Uniqueness
Uniqueness is the estimation of how uniquely a PUF can distinguish different
chips based on the generated response. The uniqueness factor is the measure of inter-chip
variation, which gives information on the number of PUF output bits that are different
between two different PUFs. The uniqueness of a PUF is estimated by the average inter-
die Hamming Distance (HD) over a group of chips. It quantifies the Hamming distance of
PUF responses that are provided with the same input challenge. It is characterized by the
Probability Mass Function (PMF) or Probability Density Function (PDF) of Hamming
distances, where PUFs have PDF or PMF curves that are centered at half the number of
response bits. For binary strings, a Hamming distance between any two strings of equal
length is the number of bits that are different in the two strings.
Let (i, j) be a pair of chips with i ≠ j and Ri (respectively, Rj) the n-bit response of
chip i (respectively, chip j). The first metric is the average inter-die Hamming distance
among a group of k chips and is defined as [14]:
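$$ HD_{inter} = \frac{2}{k(k-1)} \sum_{i=1}^{k-1} \sum_{j=i+1}^{k} \frac{HD(R_i, R_j)}{n} \times 100\% \tag{2.1} $$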
If the PUF produces uniformly distributed independent random bits, i.e. if each
binary response bit of a PUF has an equal probability of producing a ‘0’ or a ‘1’, then the
inter-chip variations should be 50% on average. Truly random bits are produced if only
the random process variation exists.
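As a worked illustration of the uniqueness metric, the sketch below averages the normalized Hamming distance over all chip pairs. It assumes the usual pairwise-average form of the inter-die HD; the function names and the three 8-bit responses are made up for the example and are not measurements from this work.

```python
from itertools import combinations


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two equal-length responses differ."""
    return bin(a ^ b).count("1")


def average_inter_die_hd(responses, n_bits):
    """Average pairwise HD over all chip pairs, as a percentage of n_bits."""
    pairs = list(combinations(responses, 2))  # k(k-1)/2 distinct chip pairs
    total = sum(hamming_distance(ri, rj) / n_bits for ri, rj in pairs)
    return 100.0 * total / len(pairs)


# Hypothetical 8-bit responses of k = 3 chips to the same challenge.
responses = [0b10110010, 0b01110100, 0b10011011]
uniqueness = average_inter_die_hd(responses, n_bits=8)
```

The three pairwise distances are 4, 3 and 7 bits, so the average is (14 / 8) / 3, or about 58.3%, which is close to but not exactly the ideal 50%.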
2.4.2 Reliability
Reliability indicates the reproducibility of the PUF outputs. Reliability gives
information on how many PUF output bits are changed when regenerated from the same
PUF with or without environmental variations. The responses for an ideal PUF are
expected to be consistent; however, factors such as variation in temperature, supply
voltage fluctuations and errors due to thermal noise affect the reproducibility of the PUF
responses. Reliability is the measure of consistency or stability of the PUF output
responses, when the responses are subjected to varying environmental conditions such as
variations in power supply voltages and temperature, and the same input challenge.
Since the responses being compared are generated from the same chip, this
variation is also called intra-chip or intra-die variation.
An n-bit reference response (Ri) is extracted from chip i at normal operating
conditions. The same n-bit response is extracted from the same PUF at a different
operating condition, giving response bits R’i. Let R’i,y be the y-th sample of R’i. Then, the
average intra-die HD over x samples for chip i is defined as [14]:
$$ HD_{intra} = \frac{1}{x} \sum_{y=1}^{x} \frac{HD(R_i, R'_{i,y})}{n} \times 100\% \tag{2.2} $$
The lower value of the average intra-die HD factor results in more reliable PUF
responses. The intra-chip variations for an ideal PUF should be 0%.
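The reliability metric can be illustrated in the same way. The sketch below compares one reference response with several re-generated samples from the same chip; the function names and bit patterns are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two equal-length responses differ."""
    return bin(a ^ b).count("1")


def average_intra_die_hd(reference, samples, n_bits):
    """Average HD between the reference response and x re-generated samples,
    expressed as a percentage of the response length."""
    total = sum(hamming_distance(reference, s) / n_bits for s in samples)
    return 100.0 * total / len(samples)


# Hypothetical 8-bit reference response and x = 4 samples taken under
# different operating conditions; two samples have a single flipped bit.
reference = 0b10110010
samples = [0b10110010, 0b10110011, 0b10100010, 0b10110010]
intra = average_intra_die_hd(reference, samples, n_bits=8)
```

Two of the four samples differ from the reference in one bit each, so the average intra-die HD is (2 / 8) / 4 = 6.25%.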
2.4.3 Resiliency
Resiliency of a PUF is the ability of the PUF to prevent an adversary from
revealing the PUF secrets. This is the measure of resiliency against attack or security.
2.5 PUF Classifications
PUFs can be categorized based on their construction properties, operation
principle and from a security point of view. Table 2.1 summarizes various PUFs under
different categories.
Table 2.1: Different types of PUFs
Category              Examples
Non-electronic PUF    Optical PUF [15], Acoustical PUF [16]
Electronic PUF        Coating PUF [17], Power Distribution PUF [18]
Delay-based PUF       Arbiter PUF [5], Ring Oscillator PUF [9], Glitch PUF [19], Anderson PUF [20]
Memory-based PUF      SRAM PUF [21], Butterfly PUF [22], Flip-flop PUF [23]
2.5.1 Non-electronic PUF, Electronic PUF and Silicon PUF
On the basis of construction and operation principles, PUFs can be categorized
into three categories: non-electronic PUFs, electronic PUFs and silicon PUFs [24].
Non-electronic PUFs refer to those with PUF-like properties whose construction
and/or operation is inherently non-electronic. Their PUF-like behavior is based on non-
electronic technologies or materials such as the random fiber-structure of a sheet of paper
or the random reflection of the scattering characteristics of an optical medium. For
example, optical PUFs based on transparent media as proposed in [15] are physical one-
way functions. Figure 2-1 shows the basic implementation of the Optical PUF. The CRP,
consisting of the laser orientation and the resulting hash, is saved in a public database for
later use.
Figure 2-1: Optical PUF [15]
In electronic PUFs, the basic operation consists of an analog measurement of an
electric or electronic quantity such as power, resistance or capacitance. An example of
an electronic PUF is the coating PUF [17], which considers the randomness of
capacitance measurements in comb-shaped sensors in the top metal layer of an IC.
Silicon PUFs [4] exhibit PUF behaviors which are embedded on a silicon chip.
Silicon PUFs are based on the hidden timing and delay information of ICs. A complex
integrated circuit can be represented as silicon based PUF, which helps in identifying and
authenticating individual ICs. Silicon PUFs can be implemented as a hardware building
block in cryptographic implementations. Silicon PUFs exploit manufacturing process
variations in integrated circuits with identical masks to uniquely characterize each IC.
Silicon PUFs are of particular interest for security solutions, and they are widely studied
as a major type of PUF. Delay-based PUFs and memory-based PUFs are considered
silicon PUFs.
2.5.2 Strong PUF and Weak PUF
The distinction between strong PUFs and weak PUFs is explained based on the
security properties of their challenge-response behavior [25]. A PUF is considered a
strong PUF if it has a large number of CRPs such that an attack based on exhaustively
measuring the CRPs only has a negligible probability of success. For a strong PUF, it is
infeasible to build an accurate model of the PUF based on observed CRPs. If the number
of CRPs is small, then it is considered a weak PUF.
2.5.3 Intrinsic PUF and Non-intrinsic PUF
Another classification, based on PUF construction properties, distinguishes intrinsic PUFs
from non-intrinsic PUFs. The intrinsic PUF was initially proposed by Guajardo et al. in
[21]. In an intrinsic PUF, evaluations are performed internally by embedded
measurement equipment, and the random instance-specific features are implicitly
introduced during the manufacturing process. All silicon PUFs based on random process
variations occurring during the manufacturing of silicon chips are intrinsic
PUFs. These silicon PUFs include both delay-based PUFs and memory-based PUFs.
The non-intrinsic PUFs are externally evaluated and their randomness features are
explicitly introduced. Optical PUFs and coating PUFs are examples of non-intrinsic PUFs.
2.6 PUF Circuits
PUFs have drawn considerable attention over the past couple of years, making
them one of the promising areas in the field of hardware security and cryptography. Various
PUF techniques have been proposed for on-chip implementation on both
Application Specific Integrated Circuits (ASICs) and FPGAs. Since this thesis is about
the FPGA-based PUF implementation, the discussion is limited to those techniques that
have been implemented on FPGAs.
2.6.1 Delay-based PUF
2.6.1.1 Arbiter PUF
Arbiter PUF is the first silicon PUF to be proposed [5]. Arbiter PUF is based on a
delay-based circuit consisting of a parallel multiplexer chain and an arbiter. Depending
on the challenge bits, the skew in propagation delay between the two paths due to process
variations is detected by an arbiter which latches out either logic ‘0’ or logic ‘1’. The two
delay paths are simultaneously excited and make the transition race against each other.
The arbiter block, which is simply a latch or a flip-flop, at the output determines which
rising edge arrives first and sets its output to ‘0’ or ‘1’ depending on the winner. If the
racing paths are symmetric or identical in layout and the arbiter is not biased to either
path, the response is equally likely to be ‘0’ or ‘1’ regardless of the challenge bits. The
output is determined only by the statistical delay variation due to process variations.
Figure 2-2 shows a silicon PUF delay circuit. The circuit has multiple-bit input
and computes a one-bit output based on the relative delay difference between two paths
with identical layout length. Arbiter PUF demands careful layout and routing for identical
mapping of the logic, which is quite difficult, especially in the case of FPGA.
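The race between the two delay paths can be sketched with a simple behavioral model. This is an assumption-laden illustration rather than the circuit of [5]: each stage is given hypothetical "straight" and "crossed" segment delays drawn from a Gaussian distribution to mimic process variation, and the arbiter outputs 1 when the edge on the top rail arrives first.

```python
import random


def make_arbiter_puf(n_stages, rng, nominal=1.0, sigma=0.05):
    """Per-stage segment delays for the straight and crossed configurations.
    The Gaussian spread is an assumed stand-in for process variation."""
    return [
        {
            "straight": (rng.gauss(nominal, sigma), rng.gauss(nominal, sigma)),
            "crossed": (rng.gauss(nominal, sigma), rng.gauss(nominal, sigma)),
        }
        for _ in range(n_stages)
    ]


def arbiter_response(puf, challenge_bits):
    """Propagate two edges through the multiplexer chain and arbitrate."""
    top, bottom = 0.0, 0.0  # accumulated arrival times on the two rails
    for stage, bit in zip(puf, challenge_bits):
        if bit == 0:
            d_top, d_bottom = stage["straight"]
            top, bottom = top + d_top, bottom + d_bottom
        else:
            # Crossed: the edge on the top rail moves to the bottom rail
            # and vice versa.
            d_to_bottom, d_to_top = stage["crossed"]
            top, bottom = bottom + d_to_top, top + d_to_bottom
    # The arbiter latches 1 if the top edge wins the race, otherwise 0.
    return 1 if top < bottom else 0


rng = random.Random(1)
puf = make_arbiter_puf(n_stages=8, rng=rng)
challenge = [rng.randint(0, 1) for _ in range(8)]
bit = arbiter_response(puf, challenge)
```

Because the delays are fixed once the simulated chip is created, the same challenge always yields the same bit, while a different chip (a different random draw) may yield the opposite one.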
Figure 2-2: An Arbiter PUF delay circuit [9]
2.6.1.2 Ring Oscillator PUF
The Ring Oscillator (RO) PUF consists of several identically mapped delay loops,
or ring oscillators, each of which oscillates with a unique frequency due to manufacturing
process variations [9]. Each input challenge selects a pair of oscillators for comparison in
order to generate a response bit. A set of input challenges is given to the PUF, which selects
a fixed sequence of oscillator pairs to generate a fixed number of response bits. The
frequency differences are determined by process variations if all the oscillators are
identically laid out, which results in an equal probability of getting ‘1’ or ‘0’ as a response
bit if random variation exists. The ease of duplicating a ring oscillator using the hard-macro
feature has made its implementation more popular in FPGAs. Figure 2-3 and Figure 2-4
illustrate the structure of the RO-PUF.
A configurable ring oscillator has been proposed in [26] to improve reliability in
an RO-PUF. The authors have shown that an RO-PUF requires careful design decisions
to avoid systematic process variations, and that the placement technique and the selection
of ring oscillator pairs significantly improve PUF uniqueness.
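The pairwise frequency comparison of an RO-PUF can be sketched as follows. The nominal frequency, the spread and the counting window are hypothetical values chosen for illustration, and the Gaussian draw merely stands in for manufacturing process variation.

```python
import random


def make_ro_frequencies(num_oscillators, rng, nominal_mhz=200.0, sigma_mhz=1.0):
    """Assumed model: identically laid-out oscillators whose frequencies
    differ only by a random, chip-specific offset."""
    return [rng.gauss(nominal_mhz, sigma_mhz) for _ in range(num_oscillators)]


def count_cycles(freq_mhz, window_us=100.0):
    """Counter value accumulated over a fixed measurement window."""
    return int(freq_mhz * window_us)


def ro_puf_response(freqs, pairs):
    """One response bit per selected oscillator pair: 1 if the first
    oscillator's counter is larger, otherwise 0."""
    return [
        1 if count_cycles(freqs[a]) > count_cycles(freqs[b]) else 0
        for a, b in pairs
    ]


rng = random.Random(42)
freqs = make_ro_frequencies(16, rng)
pairs = [(2 * i, 2 * i + 1) for i in range(8)]  # fixed sequence of disjoint pairs
response = ro_puf_response(freqs, pairs)
```

Using disjoint pairs keeps the response bits independent of each other in this model; 16 oscillators then give 8 response bits.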
Figure 2-3: Ring Oscillator PUF [9]
Figure 2-4: Basic RO-PUF generating a single response bit
2.6.1.3 Glitch PUF
In combinational logic, there is a delay between an input change and the resulting
output change, i.e. it takes some time before the output settles to its steady-state
value. During this interval the output may make unintended transitions, which are called glitches. The occurrence of
glitches is determined by the differences in delay of the different logical paths from the
inputs to an output signal.
The glitch PUF proposed in [19] exploits glitch waveforms that behave non-
linearly from delay variation between gates. It consists of an on-chip high-frequency
sampling of the glitch waveform and a quantization circuit which generates a response bit
based on the sampled data. The operation sequences of the glitch PUF are as follows:
1. Data input to a random logic
2. Acquisition of glitch waveforms at the output
3. Conversion of the waveforms into response bits
The Anderson PUF proposed in [20] generates a response bit depending on the
presence or absence of glitch. This design is targeted especially for FPGA-based
implementations. It consists of custom logical circuits implementing shift registers and
carry-chain multiplexers. Figure 2-5 shows a basic Anderson PUF. The shift registers are
implemented using a Look-Up-Table (LUT) and are initialized with bit strings that are
inverses of each other. The two LUTs generate square waves that are 180 degrees out of
phase. Due to the process variations in the LUTs and the multiplexers, the propagation
delay from the input to the output will vary from LUT to LUT. When an LUT’s outputs
are sufficiently out of phase, it produces a glitch at the output, which can be captured by a
flip-flop. The presence or absence of the glitch determines the PUF’s output bit. The Anderson
PUF has also been analyzed using neural networks and artificial intelligence [27-29].
2.6.2 Memory-based PUF
2.6.2.1 SRAM PUF
Static Random Access Memory (SRAM) is a volatile digital memory made up of cells, each
capable of storing a single bit. SRAM memories are available in almost every computing
device including FPGAs, and they can be used as an intrinsic PUF. An SRAM cell is bistable and can
be realized with two cross-coupled inverters as illustrated in Figure 2-6.
Figure 2-5: Anderson PUF (LUT shift registers initialized to AAAA and 5555)
Figure 2-6: SRAM Cell (PUF). Logical circuit (left) and six-transistor (6T) SRAM cell (right)
SRAM PUF proposed in [21] is an FPGA intrinsic PUF based on random initial
states of SRAM cells. Every cell contains a certain degree of mismatch between the two
halves of the cross-coupled circuit. The random physical mismatch in the cell, caused by
manufacturing variability, determines the power-up behavior. When the cell is powered
on, it settles into one of its two stable states. The power-on condition forces a cell to ‘0’ or
‘1’ during power-up depending on the sign of the mismatch. But, which power-up state a
cell prefers is random and not known in advance, and this random behavior can be used
as a PUF response.
2.6.2.2 Butterfly PUF
The Butterfly PUF (BPUF) is proposed in [22] to overcome the drawbacks of an
SRAM PUF. The disadvantage of intrinsic SRAM PUFs is that not all FPGAs support
uninitialized SRAM memory. In most FPGAs, all SRAM cells are hard
reset to zero directly after power-up, and hence all the randomness is lost. Also, the
SRAM PUFs require device power-up to enable the response generation.
Figure 2-7: Butterfly PUF cell
The construction of a butterfly PUF is similar to the SRAM PUF, except that the BPUF
consists of cross-coupled latches instead of cross-coupled inverters. A butterfly PUF cell is depicted in
Figure 2-7. A BPUF cell can be brought to a floating or unstable state before allowing it
to settle to one of the two possible stable states. Using the clear/preset functionality of the
latches, an unstable state can be introduced after which the circuit converges back to one
of the two stable states. The preferred stable state of a butterfly PUF cell is determined by
the physical mismatch between the latches and the cross-coupling wires.
2.7 PUF Applications
Some of the major PUF applications proposed so far are as follows:
Low-cost device authentication [9]
As the PUF output is unique and unpredictable for each IC, PUF can be used for
device identification and authentication. The PUF outputs can be stored in a database and
compared later with a re-generated signature. The set of challenge-response
pairs acts as the lock and the PUF acts as the key. When a key is presented to a lock, the lock
queries the key for the response to a particular challenge. The lock opens only when the
key returns the response recorded in the database.
Cryptographic key generation [9]
Due to the presence of noise, the PUF outputs are likely to vary slightly on every
evaluation. In order to use PUF outputs as cryptographic keys, the outputs must
undergo an error correction process and a key generation process. With the error correction
process, which consists of initialization and re-generation, a PUF can consistently produce
the same result despite significant environmental changes. During the initialization step, the PUF
output is generated and the error correcting syndrome for that output is computed and
saved for later. The syndrome is the information that allows correcting bit-flips in re-generated
PUF outputs. In the re-generation phase, the PUF uses the syndrome from the
initialization step to correct any changes in the PUF output. The key generation process
converts the PUF output into cryptographic keys.
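The initialization and re-generation phases can be illustrated with a deliberately simple code. The sketch below assumes a length-5 repetition code, whose syndrome is each bit XORed with the first bit of its block, and uses SHA-256 as a stand-in key generation step; neither choice comes from the cited work, and practical designs use stronger error correcting codes. The function names and the example response are hypothetical.

```python
import hashlib

BLOCK = 5  # assumed repetition-code block length; corrects up to 2 flips per block


def syndrome(bits):
    """Initialization: within each block, XOR every bit with the block's first bit."""
    out = []
    for i in range(0, len(bits), BLOCK):
        block = bits[i:i + BLOCK]
        out.extend(b ^ block[0] for b in block[1:])
    return out


def regenerate(noisy_bits, stored_syndrome):
    """Re-generation: use the stored syndrome to undo bit flips."""
    diff = [a ^ b for a, b in zip(syndrome(noisy_bits), stored_syndrome)]
    step = BLOCK - 1
    corrected = []
    for blk in range(len(noisy_bits) // BLOCK):
        d = diff[blk * step:(blk + 1) * step]
        w = sum(d)
        # Two error patterns are consistent with d; choose the lighter one.
        first_flipped = 1 if (BLOCK - w) < w else 0
        error = [first_flipped] + [x ^ first_flipped for x in d]
        block = noisy_bits[blk * BLOCK:(blk + 1) * BLOCK]
        corrected.extend(b ^ e for b, e in zip(block, error))
    return corrected


def derive_key(bits):
    """Stand-in key generation step: hash the corrected PUF output."""
    return hashlib.sha256(bytes(bits)).hexdigest()


enrolled = [1, 0, 1, 1, 0, 0, 0, 1, 0, 1]   # hypothetical 10-bit PUF output
helper = syndrome(enrolled)                  # saved during initialization
noisy = enrolled[:]
noisy[0] ^= 1                                # one flip in the first block
noisy[7] ^= 1                                # one flip in the second block
recovered = regenerate(noisy, helper)
```

With the stored syndrome, the re-generated output is corrected back to the enrolled one, so both evaluations hash to the same key. The sketch assumes the response length is a multiple of `BLOCK`.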
Memoryless secret key storage [9]
In current practice, secret keys are stored in a non-volatile memory for
cryptographic primitives. Managing secrets in a memory in a secure way is difficult and
expensive. Storing secrets in a non-volatile memory is also vulnerable to invasive attacks.
PUF can generate volatile secret keys for cryptographic applications. PUFs increase the
physical security by generating volatile secret keys in digital form when the chip is
operating.
Figure 2-8: Secret key generation using PUF (initialization phase and re-generation phase)
Hardware Random Number Generator (HRNG) [30]
A hardware random number generator extracts randomness directly from a complex
physical source. The HRNG accepts an incoming request for a random output and produces
an output using an iterative process for generating a challenge in order to give
unpredictable results. An unpredictable challenge is saved in local registers. Once a
suitable challenge is found, a post-processing step is applied to remove bias and extract
randomness from the bit ordering. The National Institute of Standards and Technology
(NIST) test results indicate that a PUF can be used as a reasonably good
hardware random number generator with low area overhead.
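The text does not specify which post-processing step is used. One classic technique that removes bias by looking only at the ordering of bits is the von Neumann corrector, sketched below purely as an assumed example of such a step; it is not claimed to be the method of [30].

```python
def von_neumann_extract(bits):
    """Debias a bit stream by examining non-overlapping pairs:
    01 -> 0, 10 -> 1, while 00 and 11 are discarded."""
    out = []
    for a, b in zip(bits[0::2], bits[1::2]):
        if a != b:
            out.append(a)
    return out


# Hypothetical raw bits collected from PUF responses.
raw = [0, 1, 1, 0, 1, 1, 0, 0, 1, 0]
unbiased = von_neumann_extract(raw)
```

For this input the pairs are 01, 10, 11, 00 and 10, so the output is [0, 1, 1]. The price of removing bias this way is throughput, since at least half of the raw bits are always consumed or discarded.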
Software licensing [12]
A piece of code can be made to run only on a chip that has a specific identity
defined by a PUF. This prevents the execution of pirated code.
Intellectual Property (IP) protection [21]
PUFs provide IP protection of FPGAs based on public key cryptography. The
major advantage of using a public-key based protocol is that it allows a design in which
the private key is always kept in the FPGA. As PUFs implemented on FPGAs are
intrinsic to the FPGAs, they provide better security.
Figure 2-9: HRNG using PUF (blocks: incoming request, challenge, PUF circuit, response, error correction, save value, random numbers)
PUF-based Radio Frequency IDentification (RFID) tags for anti-counterfeiting [31]
An RFID tag can be made unclonable by linking it inseparably to a PUF.
Chapter 3
Self-Timed Rings
3.1 Introduction
On-chip digital oscillators are ubiquitous in IC designs. They are a key component
in many applications, including PLLs, frequency synthesizers and clock
recovery systems. Oscillators are also an essential block for many cryptographic
applications such as on-chip TRNGs [32, 33] and PUFs [9, 14]. This chapter discusses
the Self-Timed Ring (STR), also called an asynchronous ring, as an alternative
to the standard inverter ring oscillator.
3.2 Asynchronous Circuits
Asynchronous circuits, or self-timed circuits, use handshaking between their
components to perform the necessary synchronization, communication, and
sequencing of operations. Asynchronous circuits offer several potential advantages,
including low power consumption, high operating frequency, less Electro-Magnetic
Interference (EMI), less noise, robustness to variations in supply voltage, temperature, and
fabrication process parameters, better modularity for easier reuse of components, and no
clock skew problems [6]. However, asynchronous circuits are not yet mature enough to
be widely accepted in industry, mainly because of the lack of suitable Electronic
Design Automation (EDA) tools for asynchronous designs. The acceptance of
asynchronous technology by the semiconductor industry strongly depends on the
availability of synthesis tools and the possibility of prototyping a design on standard
FPGAs.
The development of synchronous circuits currently dominates the semiconductor
design industry. However, there are major limiting factors to the synchronous, clocked
approach, including the increasing difficulty of clock distribution, increasing clock rates,
decreasing feature size, increasing power consumption, timing closure effort, and
difficulty with design reuse. Asynchronous circuits can offer a better solution to address
these issues. As the demand continues for designs with higher performance, higher
complexity, and decreased feature size, asynchronous paradigms will become more
widely used in the industry, as evidenced by the 2003 and 2007 International Technology
Roadmap for Semiconductors’ (ITRS) prediction of a likely shift from synchronous to
asynchronous design styles in order to increase circuit robustness, decrease power, and
alleviate many clock-related issues. The 2008 ITRS reports that asynchronous circuits
accounted for 11% of chip area in 2008, compared to 7% in 2007, and estimates that they
will account for 23% of chip area by 2014 and 35% of chip area by 2019 [34].
3.3 Asynchronous Logic
Logic design, in general, consists of a computation part and a storage part.
Computation takes place in a combinational or functional block, whereas storage
takes place in flip-flops, registers, or latches; the two parts may be combined or
separate. In synchronous logic, a global time reference, or clock, controls activity to
synchronize all functional blocks in a circuit or system. Asynchronous logic uses
a local handshaking protocol to communicate among different modules, or functional
blocks. Local handshaking between combinational blocks is also called asynchronous
control. Figure 3-1 and Figure 3-2 show how synchronous and asynchronous
communication control events.
An asynchronous circuit can be represented as a static data-flow structure. The
static data-flow structure is a high-level view of an asynchronous design that is
equivalent to the Register Transfer Level (RTL) in synchronous design. Data is copied
from one register to the next along the path through the circuit. The handshaking between
the registers controls the data. The data and handshake signals connecting one register to
the next can be viewed as a handshake channel, or link, as in Figure 3-3. The arrows
represent channels or links consisting of request, acknowledge and data signals. The
handshaking protocol is the basis of the following sequencing rules of asynchronous
circuits [6, 35]:
• a module starts a computation if, and only if, all the data required for the
computation are available,
• as soon as the result can be stored, the module releases its input ports,
• it places the result on the output port if, and only if, the port is available.
Figure 3-1: Synchronous circuit (blocks A to E driven by a common CLK)
Figure 3-2: Asynchronous circuit (blocks A to E linked by data and ack signals)
Figure 3-3: Abstract data-flow view of an asynchronous circuit (channel or link = data + handshake signals)
3.3.1 Muller C-element
The Muller C-element, or Muller gate, is a fundamental primitive for building
asynchronous logic and implementing the synchronization required by most handshaking
protocols. Figure 3-4 shows a Muller gate and its truth table. ‘F’ and ‘R’
represent the forward and reverse inputs respectively; ‘Q’ and ‘Q’’ represent the current
and previous output states respectively. Figure 3-5 shows transistor-level and logic-level
implementations of the Muller gate. The Muller gate copies its input value to the output
when both inputs match; otherwise it holds the previous state. A Muller gate
with an inverted reverse input copies the forward input to the output when its inputs
differ; otherwise it holds the previous state.
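The behavior of both gate variants can be written as a small behavioral model. The sketch below is an illustration in Python, not a hardware description; the function names are hypothetical, and `q_prev` stands for the previous output state Q’.

```python
def muller_c(f, r, q_prev):
    """Standard Muller gate: copy the inputs when they match, otherwise hold."""
    return f if f == r else q_prev


def muller_c_inverted_r(f, r, q_prev):
    """Muller gate with inverted reverse input: copy F when F and R differ,
    otherwise hold the previous state."""
    return f if f != r else q_prev
```

For example, `muller_c(1, 1, 0)` sets the output to 1, while `muller_c(0, 1, 1)` holds the previous 1.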
3.4 Self-Timed Rings
Rings are the backbone structures of circuits that perform iterative computations.
One can turn a pipeline into a ring by looping data from its output back around to its
input [36]. Figure 3-6 shows a three stage pipeline and the pipeline with its output
connected around to its input to form a ring. If the stages in the ring are all self-timed and
initialized with input data, then the ring will iterate under self-timed control. Self-timed
circuits use handshake protocols to control the sequencing of operations. In a self-timed
ring, events propagate between adjacent stages according to a simple
request/acknowledge handshake. These handshake signals replace the clocks of
synchronous designs.
3.4.1 Self-Timed Ring Structure
The Muller C-element, or Muller gate, is an integral part of self-timed rings. Each stage
of a self-timed ring consists of a Muller gate and an inverter [37]. A standard N-stage
self-timed ring is depicted in Figure 3-7 [38].
Figure 3-4: Standard Muller gate and its truth table (left). Muller gate with inverted
reverse input and its truth table (right).
Standard Muller gate        Muller gate with inverted reverse input
F  R  Q                     F  R  Q
0  0  0 (Reset)             0  0  Q’ (Hold)
0  1  Q’ (Hold)             0  1  0 (Reset)
1  0  Q’ (Hold)             1  0  1 (Set)
1  1  1 (Set)               1  1  Q’ (Hold)
Figure 3-5: Implementations of Muller C-element
Figure 3-6: Three stage pipeline (top) and a ring (bottom)
Figure 3-7: An N-stage self-timed ring (stages C1 to CN; stage Ci has output Qi)
3.4.2 Token and Bubble Propagation
The temporal behavior of the self-timed ring can be explained using the
token-bubble abstraction model. From a micro-pipeline point of view, a token usually
represents the presence of data in a stage, whereas a bubble represents an empty stage
ready to accept new data. A stage is said to hold a token if its output differs from the
output of the next stage, and a bubble if the two outputs are equal. If Qi and
Qi+1 represent the outputs of stage i and stage i+1 respectively, then token (T) and bubble (B)
may be represented as:
Token: if Qi ≠ Qi+1 and Bubble: if Qi = Qi+1.
The token-bubble configuration also represents the output states of each stage in a ring.
For example, for a ring with a TTBBBB configuration, the stage outputs are either
“101111” or “010000”. A token propagates from stage i to the next stage i+1 if, and only if,
the next stage i+1 contains a bubble. Similarly, a bubble propagates from stage i+1 to
the previous stage i if, and only if, the previous stage i contains a token. Figure 3-8 illustrates
propagation of tokens and bubbles in a self-timed ring. For example, with an initial ring
configuration of TTB (101 or 010), each step moves one token into the adjacent bubble
and flips a single stage output:
TTB (101) → TBT (100) → BTT (110) → TTB (010)
An STR oscillates only if the following conditions are satisfied [7, 39]:
• N ≥ 3 and N = NT + NB, where N is the number of stages in the STR, NT is the
number of tokens and NB is the number of bubbles,
• NB ≥ 1,
• NT is a positive even number.
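The token-bubble rules can be exercised with a short simulation. The sketch below is an idealized model, assuming that the forward input of stage i is the output of stage i-1, the reverse input is the output of stage i+1, and every stage has the same unit delay; the function names are hypothetical. It tracks token patterns only, since each pattern has two complementary bit encodings.

```python
def token_pattern(state):
    """Label each stage T (Qi != Qi+1) or B (Qi == Qi+1); the ring wraps around."""
    n = len(state)
    return "".join(
        "T" if state[i] != state[(i + 1) % n] else "B" for i in range(n)
    )


def step(state):
    """One unit-delay update: a stage copies its forward input when the
    forward and reverse inputs differ, otherwise it holds its output."""
    n = len(state)
    next_state = []
    for i in range(n):
        forward = state[i - 1]          # output of the previous stage
        reverse = state[(i + 1) % n]    # output of the next stage
        next_state.append(forward if forward != reverse else state[i])
    return "".join(next_state)


def trace(state, steps):
    """Return the token pattern of the initial state and of each later step."""
    patterns = [token_pattern(state)]
    for _ in range(steps):
        state = step(state)
        patterns.append(token_pattern(state))
    return patterns
```

Calling `trace("101", 3)` walks through the patterns TTB, TBT, BTT and back to TTB, and `token_pattern` gives TTBBBB for both "101111" and "010000".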
The oscillation depends on process variability and on the initial state of the ring
defined by NT and NB. An STR has two propagation modes: burst mode and
evenly-spaced mode, as shown in Figure 3-9. In burst mode, the tokens cluster together
and the cluster propagates around the ring. In evenly-spaced mode, the tokens are
distributed evenly around the ring with constant spacing.
3.4.3 Jitter in Inverter RO and Self-Timed RO
Inverter Ring Oscillators (IROs) and self-timed ring oscillators are both affected by
thermal noise [8]. Its effect is called jitter in the time domain and phase noise in the
frequency domain. Self-timed ring oscillators and inverter ring oscillators differ in the
way jitter accumulates. There are two major jitter sources in FPGAs: local Gaussian jitter
and global deterministic jitter [39, 40].
Local Gaussian jitter is the source of randomness. In an FPGA-based
implementation, where each stage of a ring oscillator is implemented in a single
Look-Up-Table (LUT), each stage is considered a source of local Gaussian
jitter. In inverter ring oscillators, the oscillation period is defined by two loops of a single
token around the ring, and the jitter accumulates over the number of stages crossed.
In asynchronous ring oscillators, by contrast, several tokens propagate around the ring and
the oscillation period is defined by the time elapsed between successive tokens. Each
token crossing a stage experiences a varying delay due to the local Gaussian
jitter contribution of that stage. So, the period jitter in STRs is mostly composed of the
jitter generated locally in the ring stage. This makes STRs more robust than inverter
ring oscillators against jitter-induced instability in PUF designs.
Figure 3-8: Token-bubble propagation
Figure 3-9: Burst mode propagation (top) and evenly-spaced mode propagation (bottom)
Global deterministic jitter is due to non-random variations in delay
characteristics caused by external environmental variations. In IROs, the global
deterministic jitter accumulates linearly throughout the ring. In STR oscillators, several
events propagate simultaneously, so deterministic jitter affects each event in the same way
rather than accumulating over the whole ring structure. This makes self-timed ring
oscillators more robust than inverter ring oscillators.
Chapter 4
Asynchronous Approach to Ring Oscillator for FPGA-
based PUF Design
4.1 Introduction
Recent advances in design and process technology have made the
Field Programmable Gate Array (FPGA) a key component in most electronic
systems. FPGAs are semiconductor devices consisting of a matrix of Configurable Logic
Blocks (CLBs) connected through programmable interconnects. FPGAs
dominate a wide range of application areas, including military, defense, space,
automotive and consumer electronics. FPGAs may emerge as a
security platform because of desirable features such as flexibility, rapid time-to-market,
and post-silicon validation of functionality. At the same time, there is growing concern
over the security attributes of FPGAs: protecting and securing the information
processed within them, protecting designs during distribution, and protecting intellectual
property rights [1].
This chapter discusses the two major implementations required for the
proposed STRO-PUF design: the LUT-based implementation of the Muller C-element and the
asynchronous approach to the ring oscillator for implementing a Self-Timed Ring (STR) on
FPGAs.
4.2 FPGA Architecture
The typical FPGA architecture consists of an array of logic blocks, Input/Output
(I/O) pads and routing channels. The array is surrounded by programmable I/O blocks,
which provide the external interface to the FPGA. The logic block is called a
Configurable Logic Block (CLB) or a Logic Array Block (LAB), depending on the vendor.
Xilinx and Altera are the two major FPGA vendors in the current market. The detailed
architecture of FPGAs differs from one vendor to another; a typical
FPGA architecture is shown in Figure 4-1.
Figure 4-1: A typical FPGA architecture (logic blocks, I/O pads, routing channels)
Logic blocks implement logic functions. They form the basic computation and
storage elements of digital logic on an FPGA. A logic block consists of Logic
Cells (LCs), which are also called Logic Elements (LEs) or slices. A typical logic cell
consists of a Look-Up-Table (LUT) and storage elements such as latches or flip-flops. Its
input signals consist of the LUT inputs and a clock input, and its output can be registered
or unregistered. The basic structure of a logic block is shown in Figure 4-2.
4.2.1 Architecture of Spartan-II
The proposed design is implemented on a Xilinx XC2S100 FPGA device. This
section gives an overview of the Spartan-II family architecture, which helps in
implementing the STR on the FPGA. The XC2S100 device has an array of 20 rows by 30
columns of CLBs, for a total of 600 CLBs and 2700 logic cells [41].
The basic building block of the Spartan-II FPGA CLB is the Logic Cell (LC). An
LC includes a 4-input function generator, carry logic, and a storage element. Each
Spartan-II FPGA CLB contains four LCs, organized in two identical slices.
A Spartan-II slice is shown in Figure 4-3. The function
generators are implemented as 4-input LUTs.
Figure 4-2: Structure of a typical logic block (inputs and CLK feed a LUT followed by a flip-flop or latch, which drives the output)
4.3 LUT Implementation of a Muller Gate
Every Look-Up-Table (LUT) implements a Boolean logic equation, which is
defined by an INIT attribute. The INIT attribute, given as an appropriate string of
hexadecimal digits, is attached to the LUT to specify its logical function [42]. The INIT
parameter of the LUT primitive defines the logical values of the LUT. This value is zero
by default, which drives the output to zero regardless of the input values. The LUT can
be loaded with custom hexadecimal values, defined by the INIT attribute, to perform a
particular logical function.
Figure 4-3: Spartan-II slice (two LUTs, each with carry and control logic)
A self-timed ring requires its initial state to be loaded with the required configuration
of tokens and bubbles, which can be defined by assigning the output of each stage to
either ‘0’ or ‘1’. A Muller gate with a set/reset feature (as shown on the left side of Figure
4-4) is used to force its output to either set or reset as desired. A Muller gate with a set
input is called a set Muller gate, and a Muller gate with a reset input is called a reset Muller
gate. The set Muller gate forces its output to ‘1’ and the reset Muller gate forces its output to
‘0’ during the initialization process.
A 4-input LUT with a general output is used in the implementation to define
STR stages. Figure 4-4 shows a single stage of a self-timed ring oscillator as
implemented in a LUT. One of the inputs is configured as a Set/Reset (SR) signal, which
is responsible for setting the stage output value to either ‘0’ or ‘1’. The remaining three
inputs are configured as forward input (F), reverse input (R) and feedback (Q’).
Figure 4-4: A stage in STR. Muller gate with set/reset option (left). LUT mapped as
Muller gate (right) for FPGA implementation (I3 = Set/Reset, I2 = F, I1 = R, I0 = Q’, O = Q)
A common technique for determining the INIT value that realizes a logical
function in a LUT is to use a truth table. The logical functions of the reset Muller gate and
the set Muller gate are mapped in Table 4.1 and Table 4.2. The hexadecimal digits of the
INIT attribute are obtained by grouping the output bits: the output states are read in
groups of four, from the bottom up, and each group is converted into a hexadecimal
character. From the tables below, the INIT attributes
obtained for the reset Muller gate and the set Muller gate are “00B2” and “FFB2” respectively.
Figure 4-5 shows the VHDL instantiation of the reset Muller gate using a 4-input LUT with
the INIT attribute.
Table 4.1: LUT mapping of reset Muller gate. INIT => x“00B2”
I3 = SR  I2 = F  I1 = R  I0 = Q’  O = Q  INIT
0        0       0       0        0
0        0       0       1        1      “0010” = 2
0        0       1       0        0
0        0       1       1        0
0        1       0       0        1
0        1       0       1        1      “1011” = B
0        1       1       0        0
0        1       1       1        1
1        0       0       0        0
1        0       0       1        0      “0000” = 0
1        0       1       0        0
1        0       1       1        0
1        1       0       0        0
1        1       0       1        0      “0000” = 0
1        1       1       0        0
1        1       1       1        0
Table 4.2: LUT mapping of set Muller gate. INIT => x“FFB2”
I3 = SR  I2 = F  I1 = R  I0 = Q’  O = Q  INIT
0        0       0       0        0
0        0       0       1        1      “0010” = 2
0        0       1       0        0
0        0       1       1        0
0        1       0       0        1
0        1       0       1        1      “1011” = B
0        1       1       0        0
0        1       1       1        1
1        0       0       0        1
1        0       0       1        1      “1111” = F
1        0       1       0        1
1        0       1       1        1
1        1       0       0        1
1        1       0       1        1      “1111” = F
1        1       1       0        1
1        1       1       1        1
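The INIT words in the two tables can be re-derived programmatically. The sketch below is an illustration in Python, assuming the pin mapping of the tables (I3 = SR, I2 = F, I1 = R, I0 = Q’) and a Muller gate with inverted reverse input; the function names are hypothetical. Bit k of INIT is the LUT output for input address k, which is the same as reading the output column from the bottom up.

```python
def stage_output(sr, f, r, q_prev, forced_value):
    """One STR stage: SR = 1 forces the output, SR = 0 behaves as a Muller
    gate with inverted reverse input (copy F when F != R, otherwise hold)."""
    if sr:
        return forced_value
    return f if f != r else q_prev


def lut4_init(forced_value):
    """Build the 16-bit INIT word; forced_value is 0 for a reset gate and
    1 for a set gate."""
    init = 0
    for address in range(16):
        i3 = (address >> 3) & 1   # SR
        i2 = (address >> 2) & 1   # F
        i1 = (address >> 1) & 1   # R
        i0 = address & 1          # Q'
        if stage_output(i3, i2, i1, i0, forced_value):
            init |= 1 << address
    return init


reset_init = format(lut4_init(0), "04X")   # reset Muller gate
set_init = format(lut4_init(1), "04X")     # set Muller gate
```

Under these assumptions `reset_init` evaluates to "00B2" and `set_init` to "FFB2", matching the INIT values in the captions of Table 4.1 and Table 4.2.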
4.4 Logical Implementation of a Self-Timed Ring Oscillator
The proposed PUF design is a logic-based design that uses asynchronous ring
oscillators instead of basic inverter ring oscillators. The design specifically targets
LUT-based FPGAs. Each stage in a ring is mapped to a LUT that performs the Muller gate
function. An asynchronous ring oscillator can be constructed by replicating the stage
described in Figure 4-4 to form a ring structure, as illustrated in Figure 3-7 in
Chapter 3. The ring should be designed to meet the oscillation conditions described in
Chapter 3. The ring stages must be initialized, in a way that satisfies the oscillation
conditions, before oscillation can occur. The number and positions of set Muller gates and
reset Muller gates define the initialization states and the token-bubble states in the ring.
Figure 4-5: VHDL instantiation of reset Muller gate.
Figure 4-6 depicts a four-stage asynchronous ring oscillator implemented using
LUTs. A common signal ‘SR’ is connected to every stage of the ring. The SR signal controls
the initialization and oscillation of the ring oscillator; in other words, SR switches the
self-timed ring oscillator between initialization mode and oscillation mode. In this
design, initialization occurs when SR = ‘1’ and oscillation occurs when
SR = ‘0’.
Placement constraints [43] are used in the code to ensure that each stage of the
ring is mapped to a separate LUT. They prevent the synthesis tool from altering the
design mapping. Figure 4-7 shows the
schematic view of the implemented 6-stage self-timed ring oscillator with 2T4B
configuration and the initial states of “101111”.
Figure 4-6: LUT-based four-stage asynchronous ring oscillator (four LUTs with inputs I3 to I0 and output O; SR is common to all stages)
Each stage of the ring is mapped to a separate LUT. Since six LUTs are
used to implement the ring oscillator, three slices are used, as shown in
Figure 4-8. The position of each stage of the self-timed ring oscillator is defined
using placement constraints, as shown in Figure 4-9.
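The 6-stage ring with initial states “101111” can be modeled at the LUT level. The sketch below is an idealized Python model, assuming the INIT words of Table 4.1 and Table 4.2, the pin mapping I3 = SR, I2 = F, I1 = R, I0 = Q’, forward inputs taken from the previous stage, reverse inputs taken from the next stage, and identical unit delays for all stages. The function names are hypothetical, and the model says nothing about real timing on the device.

```python
RESET_INIT = 0x00B2   # reset Muller gate, Table 4.1
SET_INIT = 0xFFB2     # set Muller gate, Table 4.2


def lut4(init, i3, i2, i1, i0):
    """Read one bit of a 16-bit INIT word; I3 is the most significant
    address bit."""
    address = (i3 << 3) | (i2 << 2) | (i1 << 1) | i0
    return (init >> address) & 1


def ring_step(inits, q, sr):
    """Evaluate every stage once: I3 = SR, I2 = F, I1 = R, I0 = Q'."""
    n = len(q)
    return [
        lut4(inits[i], sr, q[i - 1], q[(i + 1) % n], q[i]) for i in range(n)
    ]


def simulate(initial_state, steps):
    """Initialize with SR = 1, then run the given number of steps with SR = 0."""
    # A '1' in the initial state calls for a set gate, a '0' for a reset gate.
    inits = [SET_INIT if bit == "1" else RESET_INIT for bit in initial_state]
    q = ring_step(inits, [0] * len(initial_state), sr=1)   # initialization mode
    history = [q]
    for _ in range(steps):                                 # oscillation mode
        q = ring_step(inits, q, sr=0)
        history.append(q)
    return history


def token_count(q):
    """Number of stages whose output differs from the next stage's output."""
    return sum(q[i] != q[(i + 1) % len(q)] for i in range(len(q)))
```

In this model `simulate("101111", 12)` starts from the outputs 1, 0, 1, 1, 1, 1 and keeps exactly two tokens in the ring at every step, as expected for a 2T4B configuration.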
Figure 4-7: Technology schematic view of 6-stage self-timed ring oscillator with
2T4B configuration and the initial states of “101111”.
4.5 Experimental Results
To observe the oscillatory behavior of a self-timed ring, the design is
implemented on an XSA board with a Xilinx XC2S100 FPGA device. For experimental
analysis, the self-timed ring oscillator is implemented with different numbers of stages
and with different spatial configurations. Figure 4-10 through Figure 4-12 show
the oscillation patterns from post-place & route simulation and the real output captured
with a logic analyzer. Table 4.3 shows the frequency observed for different
configurations of self-timed ring oscillators.
Figure 4-8: Implementation of 6-stage self-timed ring oscillator shown in Xilinx
FPGA Editor (6-stage STR mapped in 3 separate slices)
Figure 4-9: Placement constraint used to define position of stages of a self-timed ring
The oscillation frequency of the ring oscillator depends on the number of events,
i.e. the number of tokens and bubbles, but not on their spatial arrangement
for a given number of tokens and bubbles. From Table 4.3, it can be
observed that the 6-stage ring oscillator gives the same frequency with a spatial
distribution of “TTBBBB” or “TBTBBB”. Also, with different initialization states,
a ring oscillator with the same number of stages can give different oscillation
frequencies. This is one of the benefits of the self-timed ring: it adds reconfigurability
to the design. Unlike a conventional inverter ring oscillator, the frequency of the
asynchronous ring oscillator does not decrease as the number of stages increases.
Figure 4-10: Simulation result of 6-stage STR oscillator with TTBBBB configuration
Figure 4-11: Simulation result of 6-stage STR oscillator with TTTTBB configuration
Figure 4-12: Real output of 6-stage STR oscillator with TTBBBB configuration
obtained from a logic analyzer
Table 4.3: Frequency values for asynchronous ring oscillators with different
configurations
No. of Stages  NT, NB  Time Period (ns)  Frequency (MHz)  Spatial Configuration
6              2T4B    10                100              TTBBBB, TBTBBB
6              4T2B    6.2               161.29           TTTTBB, TTBBTT
8              2T6B    8.3               120.48           TTBBBBBB
8              4T4B    5.9               169.49           TTTTBBBB
8              6T2B    8.3               120.48           TTTTTTBB
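The frequency column follows from the period column through f = 1/T. As a quick consistency check, the hypothetical helper below converts a period in nanoseconds to a frequency in MHz; it is an illustration only and is not part of the measurement flow.

```python
def frequency_mhz(period_ns):
    """Convert an oscillation period in nanoseconds to a frequency in MHz."""
    return 1000.0 / period_ns


# Periods listed in Table 4.3, in nanoseconds.
periods_ns = [10, 6.2, 8.3, 5.9]
frequencies = [round(frequency_mhz(t), 2) for t in periods_ns]
```

For instance, a period of 8.3 ns corresponds to about 120.48 MHz and a period of 5.9 ns to about 169.49 MHz.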
4.6 Conclusion
This chapter described the technique for the LUT-based implementation of a Muller gate
to construct a self-timed ring oscillator, or asynchronous ring oscillator. The
experimental analysis illustrates the oscillations generated by an asynchronous ring
oscillator with different configurations.
It is well known that significant process variations exist in IC fabrication,
which make each IC unique in its delay characteristics [11, 44]. The statistical delay
variation in transistors and wires across FPGA chips can be exploited through identically
laid-out asynchronous ring oscillators. The next chapter discusses the proposed FPGA-
based PUF using the self-timed ring oscillator.
Chapter 5
STRO-PUF: Self-Timed Ring Oscillator based PUF
5.1 Introduction
This chapter introduces the implementation of self-timed ring oscillators as a
novel PUF approach on FPGAs. The proposed PUF is named the ‘Self-Timed Ring
Oscillator based Physical Unclonable Function (STRO-PUF)’. Like the RO-PUF, the
self-timed ring oscillator based PUF generates oscillations of different frequencies when
identically mapped on a semiconductor device. The varying frequencies produced by
identically mapped self-timed ring oscillators can be used to generate unique PUF
response bits.
Although the self-timed ring is well studied in many contexts, very little work has
applied asynchronous logic to hardware cryptography and security
applications. In [8], the author initiated PUF
implementation using asynchronous ring oscillators to address robustness and entropy.
However, the results are limited to electrical simulation, whereas the work described in
this thesis is implemented on real silicon devices. In [39], the authors analyzed a
self-timed ring oscillator as the entropy source for a True Random Number Generator (TRNG)
implemented on an FPGA. This chapter explores the implementation of asynchronous
ring oscillators in PUF design targeting FPGA devices.
5.2 Architecture of STRO-PUF
The proposed PUF architecture is also based on a ring oscillator, but it uses a self-
timed ring oscillator instead of a conventional inverter ring oscillator. The architecture of
the proposed self-timed ring oscillator based PUF is shown in Figure 5-1. It
consists of two groups of identically laid-out self-timed ring oscillators. A Set/Reset
(SR) signal is common to all the oscillators in both groups. The SR signal
initializes the state of every ring oscillator so that oscillation can start.
Initialization is done by setting SR = ‘1’; SR is then switched back to ‘0’ so
that the rings oscillate. Each oscillator oscillates at a slightly different frequency due to
process variations. The outputs of the oscillators are fed to the multiplexer (MUX) of
the corresponding group. Inputs to the PUF are given through a challenge generator, which
selects two self-timed ring oscillators, one from each group. The frequency comparator
captures the frequency difference between these two oscillators and generates a single
output bit. The frequency comparator consists of two counters, each counting TV (target
value) periods of the signal coming from one MUX. The counter that reaches
TV first is driven by the higher frequency. For
example, if the frequencies of the STROs from group A and group B are f1 and f2
respectively, then the response bit = 1 if f1 ≥ f2; otherwise the response bit = 0. A unique
set of output responses is generated for each set of input challenges; it can be used to
identify a particular device and in various cryptographic applications.
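The counter race in the frequency comparator can be sketched as follows. This is a simplified behavioral illustration in Python, assuming ideal oscillators with constant frequencies; the function name and the default target value are hypothetical. The bit is 1 when the counter driven by the group A oscillator reaches the target value no later than the one driven by the group B oscillator.

```python
def response_bit(f1_mhz, f2_mhz, target_value=1024):
    """Race two counters to target_value periods and report the winner.

    f1_mhz drives the group A counter and f2_mhz drives the group B counter.
    """
    time_a_us = target_value / f1_mhz   # time for counter A to reach TV
    time_b_us = target_value / f2_mhz   # time for counter B to reach TV
    return 1 if time_a_us <= time_b_us else 0
```

A larger target value lengthens the measurement, which in hardware helps separate oscillators whose frequencies are close together.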
5.3 Implementation of STRO-PUF
FPGAs are considered an efficient platform for implementing cryptographic
algorithms on hardware. The implementation of PUFs on FPGAs involves significant
challenges because it is difficult for a designer to exploit full layout level design
techniques, and there is not sufficient information available about the gate level structure
of the FPGA fabric. Also, many PUF designs require careful routing symmetry, and this
is quite difficult to achieve in FPGA-based design.
A six-stage asynchronous ring oscillator is used in the
implemented PUF design. The prototype asynchronous ring oscillator,
implemented using the LUT-based approach, is shown in Figure 5-2. The details of the
LUT-based implementation of a self-timed ring oscillator were discussed in
Chapter 4. The proposed PUF design requires identical mapping of each self-timed
ring oscillator. This includes both symmetrical routing and the placement of identical
circuit instances. The FPGA Editor in the Xilinx toolset allows the user to create identical
instances using hard-macros. Figure 5-3 shows the layout of a six-stage self-timed ring
oscillator implemented as a hard-macro. The bull’s eye symbol represents the reference
point of the hard-macro.
Figure 5-1: Architecture of the proposed STRO-PUF (Group A and Group B of oscillators share SR; a challenge generator drives one MUX per group; the frequency comparator outputs the response bits)
Each group in a PUF circuit can have a number of asynchronous ring oscillators.
The number of ring oscillators in the groups determines the possible combination of input
challenges, the number of responses and the number of bits in each response. The
response generated from the PUF circuit also depends on how the comparisons are made
among the oscillators. Depending on the number of oscillators required in each group, the
self-timed ring is duplicated using the hard-macro to ensure all the oscillators are
identical.
Figure 5-2: A 6-stage asynchronous ring oscillator (stages C1 to C6 with forward inputs F1 to F6, reverse inputs R1 to R6, outputs Q1 to Q6 and a common SR signal).
Figure 5-3: Hard-macro implemented as a six-stage asynchronous ring oscillator.
Figure 5-4: Layout view of an STRO-PUF implemented with 16 pairs of identical
STR oscillators in each group.
Figure 5-5: Portion of an STRO-PUF in FPGA Editor.
Hard-macros are instantiated in the main program and the locations of the
hard-macros are defined in a User Constraint File (UCF) to map the PUF as desired.
Figure 5-4 shows the duplication of a self-timed ring oscillator instance, which is created
using hard-macros, in order to map 16 pairs of identical oscillators for implementing the
STRO-PUF. Figure 5-5 shows a portion of the implemented STRO-PUF mapped in a
region defined in the user constraint file.
5.4 Experimental Analysis
The proposed design is implemented on three different Xilinx Spartan-II boards.
PUFs are mapped onto six different regions of each device, as shown in Figure 5-6. Each
PUF is realized using 16 pairs of identically laid-out STROs, with 16 STROs in each
group. In the implemented design, a six-stage self-timed ring oscillator is
used with a two-token, four-bubble configuration (TTBBBB), which is represented by an
initial state of either ‘101111’ or ‘010000’.
Figure 5-6: PUFs mapped on six different regions of XC2S100 FPGA (20 X 30 CLBs)
The Set/Reset (SR) signal initializes the PUF states when SR = ‘1’ and generates
oscillations when SR = ‘0’. Figure 5-7 illustrates PUF output read from a logic analyzer
during initialization mode and oscillation mode.
5.4.1 Analysis of Output Frequencies
The frequencies generated by the self-timed ring oscillators of the STRO-PUFs
are read through a logic analyzer, where the varying oscillatory behavior of the STROs
can be observed. In simulation, however, the same PUF design
gives identical oscillatory behavior with the same frequency for all STROs. Figure 5-8 and
Figure 5-9 show the simulated waveform and the real output taken from the logic
analyzer. Figure 5-10 shows the frequency variations for 36 groups of asynchronous ring
oscillators, which are mapped across six different regions of all three FPGAs. The
maximum and minimum frequencies observed are 125 MHz and 16.2438 MHz,
respectively. The average frequency observed is 101.4460 MHz. The simulation
shows an identical frequency of 100 MHz for all the oscillators, which differs from
the real responses. Robust responses can be obtained by selectively comparing
oscillators that have larger frequency differences.
Figure 5-7: PUF outputs during initialization mode (SR = 1) and oscillation mode (SR = 0).
Figure 5-8: Simulation result of STRO-PUF output frequencies.
Figure 5-9: Portion of STRO-PUF output frequencies taken from a logic analyzer.
Figure 5-10: Distribution of frequencies generated by asynchronous ring across FPGA
devices (x-axis: asynchronous ring, 0 to 40; y-axis: frequency, 0 to 140; series RO1 to RO16)
5.4.2 Analysis of Uniqueness of STRO-PUF
For each challenge provided, a pair of oscillators is selected to generate a single-bit
response. For k ring oscillators, k(k-1)/2 distinct pairs can be selected to
generate k(k-1)/2 response bits. However, generating response bits from all possible pairs
reduces entropy because dependent bits are included [13]. To avoid correlation, a
simple approach is to use each oscillator only once, so that each comparison yields an
independent bit. The uniqueness can be calculated using equation 2.1.
The uniqueness analyses are performed for 16-bit PUF response and 256-bit PUF
response, which are generated based on how the comparisons are made. Table 5.1 and
Table 5.2 show 18 different PUF responses for two different comparisons. If each
oscillator is used only once to generate a response bit, the STRO-PUF, having 16 pairs of
STROs, can generate a 16-bit response. To analyze the overall signature uniqueness of
the implemented design, all the PUF responses are considered. There are six different
PUFs mapped on each of three FPGAs, which gives total of (6X3 = 18) 18 PUFs,
producing (18*(18-1)/2 = 153)153 data points. The average Hamming distance for 16-bit
responses is calculated as 7.99. Figure 5-11 illustrates the probability histogram of
responses from the PUFs, indicating an average uniqueness of 49.92%, which is very
close to the desired 50% factor.
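The pairwise comparison described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the analysis script used in this work (the MATLAB code is given in Appendix A.4): it takes the 18 hexadecimal responses listed in Table 5.1, forms all 153 pairs, and reports the average Hamming distance as a percentage of the response length. The function and variable names are assumptions introduced for the example.

```python
from itertools import combinations

# 16-bit STRO-PUF responses copied from Table 5.1 (hexadecimal).
RESPONSES_HEX = [
    "B0A2", "6FFF", "7F2A", "B8DF", "A647", "F49B", "F6E1", "06FF", "4041",
    "77A7", "82F5", "EB70", "7FFF", "41DB", "AF9D", "4062", "8590", "6EE4",
]
N_BITS = 16


def hamming_distance(a, b):
    """Number of bit positions in which two integer responses differ."""
    return bin(a ^ b).count("1")


def uniqueness_stats(responses, n_bits):
    """Summarise the Hamming distances over all unordered response pairs."""
    distances = [hamming_distance(a, b) for a, b in combinations(responses, 2)]
    average = sum(distances) / len(distances)
    return {
        "pairs": len(distances),
        "avg_hd": average,
        "min_hd": min(distances),
        "max_hd": max(distances),
        # Average inter-device distance as a percentage of the response length.
        "uniqueness_pct": 100.0 * average / n_bits,
    }


stats = uniqueness_stats([int(h, 16) for h in RESPONSES_HEX], N_BITS)
print(stats)
```

With 18 responses the sketch always evaluates 18 x 17 / 2 = 153 pairs, which matches the number of data points stated above.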
Table 5.1: 16-bit STRO-PUF responses
16-bit STRO-PUF responses
B0A2
6FFF
7F2A
B8DF
A647
F49B
F6E1
06FF
4041
77A7
82F5
EB70
7FFF
41DB
AF9D
4062
8590
6EE4
Figure 5-11: Uniqueness Analysis for 16-bit PUF response
If each oscillator in one group is compared with every oscillator in the other group,
the result is a 256-bit (16 x 16 = 256) response. The
entropy of these responses is reduced because the bits obtained include correlated
bits. For example, consider two ring oscillators ‘a’ and ‘b’ in group ‘A’ and two ring
oscillators ‘c’ and ‘d’ in group ‘B’. The possible combinations are (a, c), (a, d), (b, c) and
(b, d), generating a 4-bit response. If a>c, c>b and b>d, then it can easily be predicted that a>d.
The uniqueness for the 256-bit response is 26.28%, and its histogram is shown in
Figure 5-12.
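The dependence in the four-oscillator example can be checked by brute force. The sketch below is an illustration only (the helper names and the use of ranks in place of measured frequencies are assumptions): it enumerates every possible ordering of a, b, c and d, and collects the 4-bit responses (a>c, a>d, b>c, b>d) that can actually occur.

```python
from itertools import permutations


def response_bits(freq):
    """4-bit response from comparing group A = {a, b} with group B = {c, d}."""
    a, b, c, d = freq["a"], freq["b"], freq["c"], freq["d"]
    return (int(a > c), int(a > d), int(b > c), int(b > d))


reachable = set()
for order in permutations("abcd"):
    # Use the rank in the ordering as a stand-in for a distinct frequency.
    freq = {name: rank for rank, name in enumerate(order)}
    reachable.add(response_bits(freq))

print(len(reachable), "of", 2 ** 4, "bit patterns are reachable")
```

Only 14 of the 16 patterns are reachable: the two patterns that would require a cyclic ordering, such as a>c, c>b, b>d and d>a, can never occur. The four bits therefore carry less than four bits of entropy, which is the effect described above.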
Figure 5-12: Uniqueness Analysis for 256-bit PUF response
Table 5.2: 256-bit STRO-PUF responses
256-bit STRO-PUF responses
F062300030003000F062F062F062F062F2E27000F062F062F2E2F2E2F062F2E2
7040500070400000FFFFFFFFFFFFFFFFFFFFF040FFFFFFFFFFFFFFFFFFFFFFFF
7000700070007000FF7AFF7AFF7FFF7AFF7F7000FF7A7042FF7AFF7AFF7FFF7A
F052100070401000FFFFFADAF052F052FFFFF052F052F8DAFFFFFFFFF052FFFF
F040100030000000F042FFF7FFF7F042F9627040F040F042FFF7FDF6F042FFF7
F040700020007000F4DBF453FDFBF4DBF4DB7000F053F053FDFBFDFBF042F043
F040704030003000F060F761F761F060F7E33040F761FFE3F761FFE3F040F761
7000300010000000F048FEFFFFFFFEFFFEFF7040FFFFFEFFFEFFFEFFFFFFFFFF
7000700000000000F040F040F040F841F040F040F841F040F040F841F040F841
7040700070007000F040F440FFFFFDC3FFC77000FFEFFC41F040FFC7FFC7FFC3
F040300010000000F040FA55FED5FEFDFEF57040FEFDFED5FEF5FA55FEFDFA55
F040700020002000F860F840FEFDFFFFF8607040FC79FEFDF040F860FEFDF860
7060704070407040FFFBFFFBFFFFFFFFFFFF7040FFFFFFFFFFFFFFFFFFFFFFFF
70424000400040007142714279FA79FAFFFF70427142FFFF79FA71427042FFFF
F000000020000000FFFFFFFFFFFFFFFFFFFF0000F050FC7DFC7DFFFFF050FFFF
7040704000000000F062F062F062F062F062F040F062F062F062F062F062F062
F040300010000000F0C2FCFBFCFBFDFBF0D3F000F0C3FCFBF0C2F0DBF040F042
7000600020002000FC67FE67FFF7FC67FFFFF040FC67FC67FFF7FC65F040F440
Table 5.3 summarizes the analysis for the two comparison methods:
comparing each oscillator only once, which gives responses without dependent bits,
and comparing each oscillator in group A to every oscillator in group B, which gives
responses with dependent bits.
Table 5.3: Comparing responses with dependent bits and independent bits
Metric                Response without dependent bits    Response with dependent bits
No. of output bits    16                                 256
Uniqueness            49.92 %                            26.28 %
Average HD            7.99                               67.27
Minimum HD            1                                  10
Maximum HD            15                                 123
The uniqueness (inter-die variation) achieved with the proposed STRO-PUF is the
closest to the desired factor of 50% among the FPGA-based PUFs compared.
Table 5.4 compares the uniqueness of the implemented STRO-PUF with a 16-bit response
against previous work.
Table 5.4: Uniqueness results for FPGA-based PUFs
PUF design                        Uniqueness
STRO-PUF (proposed)               49.92 %
Configurable RO-PUF [45]          47 %
RO-PUF [9]                        46.15 %
RO-PUF [46]                       48.4 %
Configurable RO-PUF [14]          47.31 %
Anderson PUF [20]                 48.28 % (average HD of 61.8 for 128-bit output)
RO-PUF based on placement [47]    Random placement: 43.40 %; chain-like placement: 48.51 %
5.4.3 FPGA Authentication using STRO-PUF
STRO-PUFs can be used to authenticate individual ICs without costly primitives.
Figure 5-13 shows a basic PUF-based FPGA authentication process. Trusted parties
create a Challenge-Response Pair (CRP) database from an authentic FPGA for future
authentication operations. To verify authenticity, the trusted party selects a challenge
from the database, applies it to the device under test, and checks whether the returned
response matches the stored response. Each
CRP is used only once to increase security. Both the 16-bit responses and the 256-bit
responses generated from STRO-PUFs can be used in this device authentication
mechanism.
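The authentication flow above can be sketched as follows. This is a minimal illustration under stated assumptions, not part of the implemented design: the class and function names are hypothetical, the device is modelled as a plain function from challenge to response, the three challenge-response values are the illustrative entries shown in Figure 5-13, and the responses are compared for exact equality.

```python
class CRPDatabase:
    """One-time challenge-response pairs recorded from an authentic FPGA."""

    def __init__(self, pairs):
        self._pairs = dict(pairs)

    def pick_challenge(self):
        if not self._pairs:
            raise LookupError("no unused CRPs left")
        return next(iter(self._pairs))

    def verify(self, challenge, response):
        # pop() discards the pair, so each CRP is used only once.
        expected = self._pairs.pop(challenge)
        return response == expected

    def unused(self):
        return len(self._pairs)


def authenticate(database, device_puf):
    """Send one stored challenge to the device and check its response."""
    challenge = database.pick_challenge()
    return database.verify(challenge, device_puf(challenge))


# Illustrative CRP entries taken from Figure 5-13 (challenge: response).
ENROLLED = {0xFB19: 0xF22F, 0xAB43: 0x653A, 0xBBF2: 0xEC31}


def genuine_device(challenge):
    return ENROLLED[challenge]


def counterfeit_device(challenge):
    # Hypothetical clone that cannot reproduce the enrolled responses.
    return challenge ^ 0xFFFF


db = CRPDatabase(ENROLLED)
genuine_ok = authenticate(db, genuine_device)
counterfeit_ok = authenticate(db, counterfeit_device)
print(genuine_ok, counterfeit_ok, db.unused())
```

Exact matching keeps the sketch short. A practical verifier would probably accept responses within a small Hamming distance of the stored value, because the response bits can flip with temperature and voltage.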
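Figure 5-13: FPGA authentication using STRO-PUF. (Diagram: the PUF of the authentic FPGA is used to build a CRP database, with example entries FB19 : F22F, AB43 : 653A and BBF2 : EC31. After the device passes through the untrusted environment of the supply chain, the same challenge is applied to the PUF of the unknown FPGA and the two responses are checked: Response1 = Response2?)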
5.4.4 Reliability Enhancement with STRO-PUF
The frequencies of ring oscillators can change significantly with environmental conditions,
which can cause the PUF output bits to flip. The effect of temperature and voltage on
the frequencies of ring oscillators is shown in Figure 5-14. When temperature increases,
the oscillators slow down at different rates because of differences in device or physical
parameters. In Figure 5-14, at a certain initial temperature, the ring oscillator
represented by a dotted line is faster than the ring oscillator represented by a solid line.
When the temperature changes significantly, the order of these two oscillators flips. Similarly,
with significant changes in voltage, the frequencies of the oscillators change at different rates,
which can give a different result. The figure shows that ring oscillators with greater frequency
differences are much less likely to flip than ring oscillators with narrow frequency
differences. The errors caused by bit flips can therefore be significantly reduced by
comparing ring oscillators whose frequencies are far apart to generate response bits.
In the STRO-PUF, the number of tokens and bubbles can be configured at the
initialization stage. By determining the configuration of the self-timed ring oscillators
that gives the maximum frequency differences, maximum reliability can be achieved.
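One way to apply the selective comparison described above is to keep only those oscillator pairs whose frequency gap exceeds a margin. The sketch below illustrates that idea only: the function name, the margin and the frequency values (in MHz) are all hypothetical and are not measurements from this work.

```python
def robust_response_bits(group_a, group_b, min_gap):
    """Return (pair index, bit) for pairs whose frequencies are far apart.

    Pairs with a gap smaller than min_gap are skipped, because a small
    temperature or voltage change could reverse their ordering.
    """
    selected = []
    for index, (freq_a, freq_b) in enumerate(zip(group_a, group_b)):
        if abs(freq_a - freq_b) >= min_gap:
            selected.append((index, int(freq_a >= freq_b)))
    return selected


# Hypothetical oscillator frequencies in MHz, for illustration only.
GROUP_A = [101.2, 98.7, 105.0, 99.9]
GROUP_B = [100.9, 103.1, 97.5, 100.0]

bits = robust_response_bits(GROUP_A, GROUP_B, min_gap=2.0)
print(bits)
```

The trade-off is that a larger margin discards more pairs, so the response becomes shorter as it becomes more stable.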
5.5 Conclusion
This chapter described the implementation of a PUF using self-timed ring
oscillators on an FPGA. It uses a logic-based design built on the underlying FPGA architecture.
The approach can be used to implement low-cost authentication of the FPGA device and
to generate secret keys for many cryptographic applications. The frequency analysis
shows that asynchronous oscillators generate varying frequencies due to process
variations. These frequencies can be selectively compared to generate response bits. The
uniqueness of the implemented STRO-PUF for a 16-bit response is calculated as 49.92 %,
which is close to the desired 50% factor. The uniqueness and the strength of PUF responses
also depend on how the comparisons are done. With the inclusion of dependent bits, the
uniqueness factor decreases.
Figure 5-14: Effect of temperature and voltage on oscillator frequencies and PUF
response bits. (Four panels plot frequency against temperature or voltage. With a narrow frequency difference the bits flip; with a wider frequency difference the original bits are retained.)
Chapter 6
Conclusion
6.1 Conclusion
Today’s global marketplace has opened up not only new opportunities but also
new threats. In the current global marketplace, commercial products can be obtained
easily, either by legitimate means or simply by theft. Counterfeiting has become one of
the most significant threats to the free market. Physical Unclonable Functions (PUFs)
have emerged as a potential technique to fight against hardware counterfeiting. PUFs are
methods of extracting unique identity information from silicon devices or circuits based
on their physical properties for device authentication.
Since the inception of the PUF concept, various PUF techniques have been
proposed, each with its own implications. In this thesis, a novel approach to
FPGA-based PUF design using asynchronous ring oscillators has been described. It uses
the logic-based design of the underlying FPGA architecture. The frequency analysis
shows that asynchronous oscillators generate varying frequencies due to process
variations. These frequencies can be selectively compared, based on input challenges
provided to the PUF, to generate response bits. The responses generated from the
STRO-PUF can be used in device authentication and in many cryptographic applications such as
generating secret keys and TRNG. From the experimental analysis, it is observed that the
proposed PUF has a uniqueness factor of 49.92 %, which is close to the desired factor of
50%. The uniqueness achieved with the STRO-PUF is better than that of previous
FPGA-based PUF designs (Table 5.4). The experimental analyses also show that the
uniqueness of PUF responses depends on how the input challenges are given to the
PUF in order to generate response bits. The input challenge determines the selection of
oscillators and the number of response bits.
The STRO-PUF can achieve better reconfigurability without significant
hardware overhead. The initial stages of asynchronous ring oscillators in an STRO-PUF
can be configured by setting different numbers of tokens and bubbles. The reliability of
the STRO-PUF can be enhanced by selectively comparing the frequencies of asynchronous
oscillators that have wider frequency differences.
6.2 Future Directions
The work in this thesis is an initial step toward PUF design using asynchronous
logic. Some possible extensions to this work include the following:
• The proposed design can be extended to have reconfigurable features by adding
control logic to load a different token-bubble word during the initialization stage.
A self-timed ring oscillator with the same number of stages can generate different
frequencies with different token-bubble configurations. This feature is not
possible in a conventional inverter oscillator.
• Experimental analysis of the robustness of the STRO-PUF in varying environmental
conditions such as varying temperatures and varying voltages.
• Implementing the STRO-PUF in other PUF applications such as a secret key
generator and TRNG.
• Power analysis of the STRO-PUF compared to other PUF designs.
References
[1] S. Drimer, "Volatile FPGA design security–a survey," IEEE Computer Society
Annual Volume, pp. 292-297, 2008.
[2] C. Hu, "Solving Today’s Design Security Concerns," Xilinx Corporation, 2010.
[3] C. Gorman. Counterfeit Chips on the Rise [Online]. Available:
http://spectrum.ieee.org/computing/hardware/counterfeit-chips-on-the-rise
[4] B. Gassend, D. Clarke, M. Van Dijk, and S. Devadas, "Silicon physical random
functions," in Proceedings of the 9th ACM conference on Computer and
communications security, 2002, pp. 148-160.
[5] J. W. Lee, D. Lim, B. Gassend, G. E. Suh, M. Van Dijk, and S. Devadas, "A
technique to build a secret key in integrated circuits for identification and
authentication applications," in VLSI Circuits, 2004. Digest of Technical Papers.
2004 Symposium on, 2004, pp. 176-179.
[6] J. Sparsø, "Asynchronous Circuit Design: A Tutorial," Technical University of
Denmark, 2006.
[7] J. Hamon, L. Fesquet, B. Miscopein, and M. Renaudin, "Constrained
Asynchronous Ring Structures for Robust Digital Oscillators," Very Large Scale
Integration (VLSI) Systems, IEEE Transactions on, vol. 17, pp. 907-919, 2009.
[8] J. Murphy, "Asynchronous Physical Unclonable Functions–AsyncPUF,"
Multimedia Communications, Services and Security, pp. 230-241, 2012.
[9] G. E. Suh and S. Devadas, "Physical unclonable functions for device
authentication and secret key generation," in Proceedings of the 44th annual
Design Automation Conference, 2007, pp. 9-14.
[10] K. A. Bowman, S. G. Duvall, and J. D. Meindl, "Impact of die-to-die and within-
die parameter fluctuations on the maximum clock frequency distribution for
gigascale integration," Solid-State Circuits, IEEE Journal of, vol. 37, pp. 183-190,
2002.
[11] H.-Y. Wong, L. Cheng, Y. Lin, and L. He, "FPGA device and architecture
evaluation considering process variations," in Proceedings of the 2005
IEEE/ACM International conference on Computer-aided design, 2005, pp. 19-24.
[12] B. Gassend, D. Clarke, M. Van Dijk, and S. Devadas, "Controlled physical
random functions," in Computer Security Applications Conference, 2002.
Proceedings. 18th Annual, 2002, pp. 149-160.
[13] F. Bernard, V. Fischer, C. Costea, and R. Fouquet, "Implementation of ring-
oscillators-based physical unclonable functions with independent bits in the
response," International Journal of Reconfigurable Computing, vol. 2012, p. 13,
2012.
[14] A. Maiti and P. Schaumont, "Improved ring oscillator PUF: an FPGA-friendly
secure primitive," Journal of cryptology, vol. 24, pp. 375-397, 2011.
[15] R. S. Pappu, "Physical one-way functions," Massachusetts Institute of
Technology, 2001.
[16] B. Škorić, P. Tuyls, and W. Ophey, "Robust key extraction from physical
uncloneable functions," in Applied Cryptography and Network Security, 2005, pp.
407-422.
[17] P. Tuyls, G.-J. Schrijen, B. Škorić, J. van Geloven, N. Verhaegh, and R. Wolters,
"Read-proof hardware from protective coatings," in Cryptographic Hardware and
Embedded Systems-CHES 2006, ed: Springer, 2006, pp. 369-383.
[18] R. Helinski, D. Acharyya, and J. Plusquellic, "A physical unclonable function
defined using power distribution system equivalent resistance variations," in
Proceedings of the 46th Annual Design Automation Conference, 2009, pp. 676-
681.
[19] D. Suzuki and K. Shimizu, "The glitch PUF: a new delay-PUF architecture
exploiting glitch shapes," in Cryptographic Hardware and Embedded Systems,
CHES 2010, ed: Springer, 2010, pp. 366-382.
[20] J. H. Anderson, "A PUF design for secure FPGA-based embedded systems," in
Proceedings of the 2010 Asia and South Pacific Design Automation Conference,
2010, pp. 1-6.
[21] J. Guajardo, S. Kumar, G.-J. Schrijen, and P. Tuyls, "FPGA intrinsic PUFs and
their use for IP protection," Cryptographic Hardware and Embedded Systems-
CHES 2007, pp. 63-80, 2007.
[22] S. S. Kumar, J. Guajardo, R. Maes, G.-J. Schrijen, and P. Tuyls, "The butterfly
PUF protecting IP on every FPGA," in Hardware-Oriented Security and Trust,
2008. HOST 2008. IEEE International Workshop on, 2008, pp. 67-70.
[23] R. Maes, P. Tuyls, and I. Verbauwhede, "Intrinsic PUFs from flip-flops on
reconfigurable devices," in 3rd Benelux workshop on information and system
security (WISSec 2008), 2008.
[24] R. Maes and I. Verbauwhede, "Physically unclonable functions: A study on the
state of the art and future research directions," in Towards Hardware-Intrinsic
Security, ed: Springer, 2010, pp. 3-37.
[25] J. Guajardo, S. S. Kumar, G.-J. Schrijen, and P. Tuyls, "Physical unclonable
functions and public-key crypto for FPGA IP protection," in Field Programmable
Logic and Applications, 2007. FPL 2007. International Conference on, 2007, pp.
189-195.
[26] A. Maiti and P. Schaumont, "Improving the quality of a physical unclonable
function using configurable ring oscillators," in Field Programmable Logic and
Applications, 2009. FPL 2009. International Conference on, 2009, pp. 703-707.
[27] S. Pappala, M. Niamat, and W. Sun, "FPGA Based Device Specific Key
Generation Method using Physically Unclonable Functions and Neural
Networks," presented at the IEEE 55th International Midwest Symposium on
Circuits and Systems (MWSCAS), 2012.
[28] S. Pappala, M. Niamat, and W. Sun, "FPGA based trustworthy authentication
technique using Physically Unclonable Functions and artificial intelligence," in
Hardware-Oriented Security and Trust (HOST), 2012 IEEE International
Symposium on, 2012, pp. 59-62.
[29] S. Pappala, M. Niamat, and W. Sun, "FPGA based key generation technique for
anti-counterfeiting methods using Physically Unclonable Functions and artificial
intelligence," in Field Programmable Logic and Applications (FPL), 2012 22nd
International Conference on, 2012, pp. 388-393.
[30] C. W. O’Donnell, G. E. Suh, and S. Devadas, "PUF-based random number
generation," in MIT CSAIL CSG Technical Memo, vol. 481, 2004.
[31] P. Tuyls and L. Batina, "RFID-tags for Anti-Counterfeiting," in Topics in
Cryptology–CT-RSA 2006, ed: Springer, 2006, pp. 115-131.
[32] M. Baudet, D. Lubicz, J. Micolod, and A. Tassiaux, "On the security of oscillator-
based random number generators," Journal of cryptology, vol. 24, pp. 398-425,
2011.
[33] H. Bock, M. Bucci, and R. Luzzi, "An offset-compensated oscillator-based
random bit source for security applications," Cryptographic Hardware and
Embedded Systems-CHES 2004, pp. 27-83, 2004.
[34] S. C. Smith and J. Di, "Designing asynchronous circuits using NULL convention
logic (NCL)," Synthesis Lectures on Digital Circuits and Systems, vol. 4, pp. 1-
96, 2009.
[35] L. Fesquet, J. Quartana, and M. Renaudin, "Asynchronous systems on
programmable logic," Reconfigurable Communication-centric SoCs,
ReCoSoC’05, pp. 105-112, 2005.
[36] T. Williams, Latency and throughput tradeoffs in self-timed speed-independent
pipelines and rings: Computer Systems Laboratory, Stanford University, 1990.
[37] A. J. Winstanley, "Temporal Properties of Self-Timed Rings," The University of
British Columbia, 2001.
[38] A. J. Winstanley, A. Garivier, and M. R. Greenstreet, "An event spacing
experiment," in Asynchronous Circuits and Systems, 2002. Proceedings. Eighth
International Symposium on, 2002, pp. 47-56.
[39] A. Cherkaoui, V. Fischer, A. Aubert, and L. Fesquet, "Comparison of Self-Timed
Ring and Inverter Ring Oscillators as Entropy Sources in FPGAs," in Design,
Automation & Test in Europe Conference & Exhibition (DATE), 2012, 2012, pp.
1325-1330.
[40] V. Fischer, F. Bernard, N. Bochard, and M. Varchola, "Enhancing security of ring
oscillator-based trng implemented in FPGA," in Field Programmable Logic and
Applications, 2008. FPL 2008. International Conference on, 2008, pp. 245-250.
[41] Xilinx. Spartan-II FPGA Family Data Sheet [Online]. Available:
http://www.xilinx.com/support/documentation/data_sheets/ds001.pdf
[42] Xilinx. Spartan-II and Spartan-IIE Libraries Guide for HDL Designs [Online].
Available:
http://www.xilinx.com/itp/xilinx10/books/docs/spartan2_hdl/spartan2_hdl.pdf
[43] Xilinx. User Constraints Guide 10.1 [Online]. Available:
http://www.xilinx.com/itp/xilinx10/books/docs/cgd/cgd.pdf
[44] K. A. Bowman and J. D. Meindl, "Impact of within-die parameter fluctuations on
future maximum clock frequency distributions," in Custom Integrated Circuits,
2001, IEEE Conference on., 2001, pp. 229-232.
[45] Y. Haile, P. H. W. Leong, and X. Qiang, "An FPGA Chip Identification
Generator Using Configurable Ring Oscillators," Very Large Scale Integration
(VLSI) Systems, IEEE Transactions on, vol. 20, pp. 2198-2207, 2012.
[46] C. Costea, F. Bernard, V. Fischer, and R. Fouquet, "Analysis and enhancement of
ring oscillators based physical unclonable functions in FPGAs," in
Reconfigurable Computing and FPGAs (ReConFig), 2010 International
Conference on, 2010, pp. 262-267.
[47] D. Merli, F. Stumpf, and C. Eckert, "Improving the quality of ring oscillator PUFs
on FPGAs," in Proceedings of the 5th Workshop on Embedded Systems Security,
2010, p. 9.
Appendix A
Source Codes
A.1 VHDL Code for a Self-Timed Ring (STR)
-- six-stage self-timed ring oscillator
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
library UNISIM;
use UNISIM.VComponents.all;
entity str6 is
Port ( ring_out : out STD_LOGIC;
init : in STD_LOGIC);
end str6;
architecture Behavioral of str6 is
signal qout : std_logic_vector (1 to 6) := (others => '0');
attribute loc: string;
attribute loc of LUT4_inst1: label is "CLB_R1C1.S1";
attribute loc of LUT4_inst2: label is "CLB_R1C1.S1";
attribute loc of LUT4_inst3: label is "CLB_R1C1.S0";
attribute loc of LUT4_inst4: label is "CLB_R1C1.S0";
attribute loc of LUT4_inst5: label is "CLB_R1C2.S1";
attribute loc of LUT4_inst6: label is "CLB_R1C2.S1";
begin
-- SET cell
LUT4_inst1 : LUT4
generic map (
INIT => X"FFB2")
port map (
O => qout(1), -- LUT general output
I0 => qout(1), -- LUT input, fedback
I1 => qout(2), -- LUT input, reverse signal
I2 => qout(6), -- LUT input, forward
I3 => INIT -- LUT input, set/reset
);
-- End of LUT4_inst instantiation
-- RESET cell
LUT4_inst2 : LUT4
generic map (
INIT => X"00B2")
port map (
O => qout(2),
I0 => qout(2),
I1 => qout(3),
I2 => qout(1),
I3 => INIT
);
-- SET cell
LUT4_inst3 : LUT4
generic map (
INIT => X"FFB2")
port map (
O => qout(3),
I0 => qout(3),
I1 => qout(4),
I2 => qout(2),
I3 => INIT
);
-- SET cell
LUT4_inst4 : LUT4
generic map (
INIT => X"FFB2")
port map (
O => qout(4),
I0 => qout(4),
I1 => qout(5),
I2 => qout(3),
I3 => INIT
);
-- SET cell
LUT4_inst5 : LUT4
generic map (
INIT => X"FFB2")
port map (
O => qout(5),
I0 => qout(5),
I1 => qout(6), -- reverse signal from the next stage
I2 => qout(4), -- forward signal from the previous stage
I3 => INIT
);
-- SET cell
LUT4_inst6 : LUT4
generic map (
INIT => X"FFB2")
port map (
O => qout(6),
I0 => qout(6),
I1 => qout(1),
I2 => qout(5),
I3 => INIT
);
ring_out <= qout(6);
end Behavioral;
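
The behaviour of each ring stage follows from the LUT4 INIT constants in the listing above. The Python sketch below is an illustration added for clarity and is not part of the original source code: it decodes X"FFB2" and X"00B2" with the usual LUT4 convention that the INIT bit at index 8*I3 + 4*I2 + 2*I1 + I0 is the output, and compares the result with a Muller C-element whose reverse input is inverted. The input roles (I0 fed-back output, I1 reverse, I2 forward, I3 set/reset) are taken from the comments in the listing; the function names are assumptions.

```python
def lut4(init, i0, i1, i2, i3):
    """Output of a 4-input LUT: the INIT bit selected by the inputs."""
    index = (i3 << 3) | (i2 << 2) | (i1 << 1) | i0
    return (init >> index) & 1


def stage(forward, reverse, previous):
    """C-element with inverted reverse input: copy 'forward' when it
    differs from 'reverse', otherwise hold the previous output."""
    return forward if forward != reverse else previous


SET_CELL = 0xFFB2    # INIT of the SET cells in the listing
RESET_CELL = 0x00B2  # INIT of the RESET cell in the listing

mismatches = []
for i0 in (0, 1):
    for i1 in (0, 1):
        for i2 in (0, 1):
            expected = stage(forward=i2, reverse=i1, previous=i0)
            for init in (SET_CELL, RESET_CELL):
                # With init low (I3 = 0) both cell types act as the ring stage.
                if lut4(init, i0, i1, i2, 0) != expected:
                    mismatches.append((hex(init), i0, i1, i2))
            # With init high (I3 = 1) the output is forced to 1 or 0.
            if lut4(SET_CELL, i0, i1, i2, 1) != 1:
                mismatches.append(("set", i0, i1, i2))
            if lut4(RESET_CELL, i0, i1, i2, 1) != 0:
                mismatches.append(("reset", i0, i1, i2))

print("mismatches:", mismatches)
```

The empty mismatch list confirms that the two constants differ only in the value loaded while init is high, which is how the initial token and bubble pattern of the ring is set.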
A.2 VHDL Code for STRO-PUF
-- STRO-PUF; 16 STROs per group; 32 STROs per PUF
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
library UNISIM;
use UNISIM.VComponents.all;
entity str6PUF8 is
Port ( init : in STD_LOGIC;
ringout : out STD_LOGIC_VECTOR (1 to 32));
end str6PUF8;
architecture Behavioral of str6PUF8 is
component hm_str6 is
port (hm_init : in std_logic;
hm_ringout : out std_logic);
end component;
begin
-- instantiating hard-macros
puf1: hm_str6
port map( hm_init => init, hm_ringout => ringout(1));
puf2: hm_str6
port map( hm_init => init, hm_ringout => ringout(2));
puf3: hm_str6
port map( hm_init => init, hm_ringout => ringout(3));
puf4: hm_str6
port map( hm_init => init, hm_ringout => ringout(4));
puf5: hm_str6
port map( hm_init => init, hm_ringout => ringout(5));
puf6: hm_str6
port map( hm_init => init, hm_ringout => ringout(6));
puf7: hm_str6
port map( hm_init => init, hm_ringout => ringout(7));
puf8: hm_str6
port map( hm_init => init, hm_ringout => ringout(8));
puf9: hm_str6
port map( hm_init => init, hm_ringout => ringout(9));
puf10: hm_str6
port map( hm_init => init, hm_ringout => ringout(10));
puf11: hm_str6
port map( hm_init => init, hm_ringout => ringout(11));
puf12: hm_str6
port map( hm_init => init, hm_ringout => ringout(12));
puf13: hm_str6
port map( hm_init => init, hm_ringout => ringout(13));
puf14: hm_str6
port map( hm_init => init, hm_ringout => ringout(14));
puf15: hm_str6
port map( hm_init => init, hm_ringout => ringout(15));
puf16: hm_str6
port map( hm_init => init, hm_ringout => ringout(16));
puf17: hm_str6
port map( hm_init => init, hm_ringout => ringout(17));
puf18: hm_str6
port map( hm_init => init, hm_ringout => ringout(18));
puf19: hm_str6
port map( hm_init => init, hm_ringout => ringout(19));
puf20: hm_str6
port map( hm_init => init, hm_ringout => ringout(20));
puf21: hm_str6
port map( hm_init => init, hm_ringout => ringout(21));
puf22: hm_str6
port map( hm_init => init, hm_ringout => ringout(22));
puf23: hm_str6
port map( hm_init => init, hm_ringout => ringout(23));
puf24: hm_str6
port map( hm_init => init, hm_ringout => ringout(24));
puf25: hm_str6
port map( hm_init => init, hm_ringout => ringout(25));
puf26: hm_str6
port map( hm_init => init, hm_ringout => ringout(26));
puf27: hm_str6
port map( hm_init => init, hm_ringout => ringout(27));
puf28: hm_str6
port map( hm_init => init, hm_ringout => ringout(28));
puf29: hm_str6
port map( hm_init => init, hm_ringout => ringout(29));
puf30: hm_str6
port map( hm_init => init, hm_ringout => ringout(30));
puf31: hm_str6
port map( hm_init => init, hm_ringout => ringout(31));
puf32: hm_str6
port map( hm_init => init, hm_ringout => ringout(32));
end Behavioral;
A.3 UCF File for Mapping STRO-PUF in a Desired Region
#PACE: Start of Constraints generated by PACE
#PACE: Start of PACE I/O Pin Assignments
# Mapping onto Region 1
NET "init" LOC = "p59" ;
# output pins for oscillators in Group A
NET "ringout<1>" LOC = "p43" ;
NET "ringout<2>" LOC = "p48" ;
NET "ringout<3>" LOC = "p47" ;
NET "ringout<4>" LOC = "p42" ;
NET "ringout<5>" LOC = "p40" ;
NET "ringout<6>" LOC = "p29" ;
NET "ringout<7>" LOC = "p28" ;
NET "ringout<8>" LOC = "p27" ;
NET "ringout<9>" LOC = "p68" ;
NET "ringout<10>" LOC = "p44" ;
NET "ringout<11>" LOC = "p46" ;
NET "ringout<12>" LOC = "p49" ;
NET "ringout<13>" LOC = "p26" ;
NET "ringout<14>" LOC = "p23" ;
NET "ringout<15>" LOC = "p57" ;
NET "ringout<16>" LOC = "p22" ;
# output pins for oscillators in Group B
NET "ringout<17>" LOC = "p75" ;
NET "ringout<18>" LOC = "p50" ;
NET "ringout<19>" LOC = "p51" ;
NET "ringout<20>" LOC = "p60" ;
NET "ringout<21>" LOC = "p62" ;
NET "ringout<22>" LOC = "p54" ;
NET "ringout<23>" LOC = "p56" ;
NET "ringout<24>" LOC = "p63" ;
#NET "ringout<25>" LOC = "p64" ;
#NET "ringout<26>" LOC = "p65" ;
#NET "ringout<27>" LOC = "p66" ;
#NET "ringout<28>" LOC = "p76" ;
#NET "ringout<29>" LOC = "p79" ;
#NET "ringout<30>" LOC = "p80" ;
#NET "ringout<31>" LOC = "p77" ;
#NET "ringout<32>" LOC = "p67" ;
#PACE: Start of PACE Area Constraints Region 1
#Region 1 Group A
INST "puf1" LOC=CLB_R1C1.S1;
INST "puf2" LOC=CLB_R1C3.S1;
INST "puf3" LOC=CLB_R1C5.S1;
INST "puf4" LOC=CLB_R1C7.S1;
INST "puf5" LOC=CLB_R1C9.S1;
INST "puf6" LOC=CLB_R1C11.S1;
INST "puf7" LOC=CLB_R1C13.S1;
INST "puf8" LOC=CLB_R1C15.S1;
INST "puf9" LOC=CLB_R2C1.S1;
INST "puf10" LOC=CLB_R2C3.S1;
INST "puf11" LOC=CLB_R2C5.S1;
INST "puf12" LOC=CLB_R2C7.S1;
INST "puf13" LOC=CLB_R2C9.S1;
INST "puf14" LOC=CLB_R2C11.S1;
INST "puf15" LOC=CLB_R2C13.S1;
INST "puf16" LOC=CLB_R2C15.S1;
#Region 1 Group B
INST "puf17" LOC=CLB_R3C1.S1;
INST "puf18" LOC=CLB_R3C3.S1;
INST "puf19" LOC=CLB_R3C5.S1;
INST "puf20" LOC=CLB_R3C7.S1;
INST "puf21" LOC=CLB_R3C9.S1;
INST "puf22" LOC=CLB_R3C11.S1;
INST "puf23" LOC=CLB_R3C13.S1;
INST "puf24" LOC=CLB_R3C15.S1;
INST "puf25" LOC=CLB_R4C1.S1;
INST "puf26" LOC=CLB_R4C3.S1;
INST "puf27" LOC=CLB_R4C5.S1;
INST "puf28" LOC=CLB_R4C7.S1;
INST "puf29" LOC=CLB_R4C9.S1;
INST "puf30" LOC=CLB_R4C11.S1;
INST "puf31" LOC=CLB_R4C13.S1;
INST "puf32" LOC=CLB_R4C15.S1;
# End of UCF
A.4 Uniqueness Analysis of STRO-PUF for 16-bit Response
% 6 PUF /FPGA , 3 devices, 16 STROs/group, 32 STROs / PUF
% independent bits, one on one comparison
% simple comparison, comparing each oscillator only once
group_size = 16;
Npuf = 36 ;
k=1;
t=0;
rbit = (0);
for p=1:2:Npuf
k=1;
t=t+1;
for j=1:group_size
if (data(j,p) >= data(j,p+1))
rbit(t,k) = 1;
else
rbit(t,k) = 0;
end
k = k +1;
end
end
disp(rbit);
%binary to decimal
nbit1=2.^(size(rbit,2)-1:-1:0);
%decimal to hex
hexR1=dec2hex(nbit1*rbit.');
disp(hexR1); % PUF responses
%probability density function (PDF)
%calculating hamming distance between 1-1 pairs
%(generate 16-bit per comparisons from 16 pairs of STROs)
c=0;
int_hd = (0);
for i=1:(Npuf/2 - 1)
for j = i+1 : Npuf/2
c =c+1;
int_hd(c,:)= sum(abs(rbit(i,:)-rbit(j,:)));
end
end
% display combinations
disp(c);
disp(int_hd);
hd_data= int_hd; %frequency of occurrence of #bits flipped
binWidth = 1;
binCtrs = 1:1:16; %Bin centers, depends on data
n=length(hd_data);
counts = hist(hd_data,binCtrs);
prob = counts / (n * binWidth); %pmf = prob = counts / n
bar(binCtrs,prob,'hist');
min1 = min (int_hd); % minimum HD
max1 = max (int_hd); % maximum HD
fprintf ('\n minimum hamming distance = %d', min1);
fprintf ('\n maximum hamming distance = %d', max1);
s = sum(int_hd); % sum of HD
avg_hd = s/153;
fprintf ('\n average hamming distance = %f', avg_hd);
uniqueness = s/(153*16) *100; % sum_hd /(no. of combination * no. of bits in an output)
fprintf ('\n uniqueness = %f%% \n', uniqueness);
A.5 Uniqueness Analysis of STRO-PUF for 256-bit Response
% 6 PUF /device , 3 devices, 16STROs/group, 32 STROs
% inclusion of dependent bits
% comparing each oscillator in a group with every oscillator in another group
group_size = 16;
Npuf = 36 ;% no of oscillator. so no of pufs = Npuf/2
t=0;
rbit = (0);
for p = 1:2: (Npuf-1)
k=1;
t=t+1;
for i=1:(group_size)
for j=1:group_size
if (data(i,p) >= data(j,p+1))
rbit(t,k) = 1;
else
rbit(t,k) = 0;
end
k = k +1;
end
end
end
disp(rbit);
% converting binary to decimal to hex
% can’t convert more than 52 bit ; breaking 256-bit into set of 32-bit
r1 = rbit(1:(Npuf/2),1:32);
r2 = rbit(1:(Npuf/2),33:64);
r3 = rbit(1:(Npuf/2),65:96);
r4 = rbit(1:(Npuf/2),97:128);
r5 = rbit(1:(Npuf/2),129:160);
r6 = rbit(1:(Npuf/2),161:192);
r7 = rbit(1:(Npuf/2),193:224);
r8 = rbit(1:(Npuf/2),225:256);
%binary to decimal
nbit1=2.^(size(r1,2)-1:-1:0);
%decimal to hex
hexR1=dec2hex(nbit1*r1.');
nbit2=2.^(size(r2,2)-1:-1:0);
hexR2=dec2hex(nbit2*r2.');
nbit3=2.^(size(r3,2)-1:-1:0);
hexR3=dec2hex(nbit3*r3.');
nbit4=2.^(size(r4,2)-1:-1:0);
hexR4=dec2hex(nbit4*r4.');
nbit5=2.^(size(r5,2)-1:-1:0);
hexR5=dec2hex(nbit5*r5.');
nbit6=2.^(size(r6,2)-1:-1:0);
hexR6=dec2hex(nbit6*r6.');
nbit7=2.^(size(r7,2)-1:-1:0);
hexR7=dec2hex(nbit7*r7.');
nbit8=2.^(size(r8,2)-1:-1:0);
hexR8=dec2hex(nbit8*r8.');
rbitHex = [hexR1 hexR2 hexR3 hexR4 hexR5 hexR6 hexR7 hexR8];
disp(rbitHex); % PUF responses
%probability density function (PDF)
%calculating hamming distance
c=0;
int_hd = (0);
for i=1:(Npuf/2 - 1)
for j = i+1 : Npuf/2
c =c+1;
int_hd(c,:)= sum(abs(rbit(i,:)-rbit(j,:)));
end
end
% display combinations
disp(c);
disp(int_hd);
hd_data= int_hd; %frequency of occurrence of #bits flipped
binWidth = 1;
binCtrs = 1:5:250; %Bin centers, depends on data
n=length(hd_data);
counts = hist(hd_data,binCtrs);
prob = counts / (n * binWidth); %pmf = prob = counts / n
bar(binCtrs,prob,'hist');
min1 = min (int_hd); % minimum hd
max1 = max (int_hd); % maximum hd
fprintf ('\n minimum hamming distance = %d', min1);
fprintf ('\n maximum hamming distance = %d', max1);
s = sum(int_hd); % sum of hd
avg_hd = s/153; % average hamming distance
fprintf ('\n average hamming distance = %f', avg_hd);
uniqueness = s/(153*256) *100; % sum_hd /(no. of combination * no. of bits in an output)
fprintf ('\n uniqueness = %f%% \n', uniqueness);
sources/146/8551286-aam.pdf
IEEE TRANS. ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. XX, NO. YY, YEAR ZZZ 1
Thwarting Security Threats from Malicious FPGA
Tools with Novel FPGA-Oriented Moving Target
Defense (FOMTD)
Zhiming Zhang, Laurent Njilla, Member, IEEE, Charles A. Kamhoua, Senior Member, IEEE,
and Qiaoyan Yu, Senior Member, IEEE
Abstract—The increasing usage and popularity of FPGA systems raise security concerns. Existing countermeasures are mostly based on the assumption that the computer-aided-design (CAD) tools for FPGA configuration are trusted. Unfortunately, this assumption does not always hold. In this work, we investigate the potential security threats originating from untrusted CAD tools. Further, we exploit the principle of moving target defense (MTD) to propose an FPGA-oriented MTD (FOMTD) method. The three defense lines in FOMTD generate uncertainties, from the attacker’s point of view, to thwart hardware Trojan insertion attacks. The theoretical upper bound of the hardware Trojan hit rate for each defense line is provided in this work. Experimental results show that the proposed defense lines 2 and 3 reduce the Trojan hit rate by up to 40% and 91%, respectively, for the scenario in which the malicious CAD tool can insert Trojans in the occupied FPGA slices. The proposed gate replacement technique in defense line 3 further improves the attack resilience and obtains an 88% reduction in the Trojan hit rate. Compared to the static redundancy based Trojan detection method, the proposed method achieves better resilience against Trojan insertions and consumes 50% less dynamic power.
Index Terms—Moving target defense, FPGA, Xilinx, Altera, hardware Trojan, FPGA design suite, hardware security.
I. INTRODUCTION
Field Programmable Gate Arrays (FPGAs) enter a rapid
growth era due to their attractive flexibility and CMOS-
compatible fabrication process. Global Market Insights pre-
dicts that the FPGA market size is expected to reach 9.98
billion US dollars by 2022 [1]. The increasing popularity of
FPGA may drive more attackers to compromise FPGA-based
systems through various channels. The work [2] highlights that
FPGA security embraces four aspects: (1) secure operations
conducted by FPGA devices, (2) utilization of FPGAs for
system security enhancement, (3) secure bitstream delivery to
FPGA devices, and (4) exploitation of FPGA devices as an
attack surface to breach FPGA-based systems. The aspects (1)
and (2) emphasize that the programmable features of FPGAs
have been exploited to address the security challenges that
Application-Specific Integrated Circuits (ASICs) are facing.
For example, the embedded FPGA is used to perform locking
key authentication [3], [4]. However, FPGAs have their own
security vulnerabilities. The literature [3], [5]–[8] extensively
discusses the aspects (3) and (4).
Z. Zhang and Q. Yu are with the Department of Electrical and Computer Engineering, University of New Hampshire, Durham, NH, 03824 USA. e-mail: [email protected].
L. Njilla is with the Cyber Assurance Branch of Air Force Research Laboratory, Rome, NY 13441, USA. e-mail: [email protected].
C. Kamhoua is with the Network Security Branch of Army Research Laboratory, Adelphi, MD 20783, USA. e-mail: [email protected].
DISTRIBUTION A. Approved for public release: distribution unlimited. Case Number: 88ABW-2018-2036. Dated 11 May 2018.
Manuscript received April 7, 2018, revised August 10, 2018, September 14, 2018, accepted October 16, 2018.
For reasons of efficiency and economy, the supply chain
of modern FPGAs is becoming globalized. This trend potentially
increases the chance that FPGA devices or FPGA design
tools are not trustworthy. Intellectual property (IP) stealing
and tampering could happen in different data formats, such
as hardware description language (HDL) and bitstream [3],
[12]. The integrity of FPGA systems may be harmed by the
hardware Trojans induced in some stages of the FPGA design
flow [9]–[11].
This work aims to address the Trojan insertion threat from
malicious FPGA CAD tools. More specifically, we make the
following contributions in this work:
• We use two practical examples to demonstrate that a hardware Trojan can be injected during several stages
of the FPGA design flow without disturbing the original
HDL design file.
• We exploit the principle of moving target defense and propose an FPGA-Oriented Moving Target Defense
(FOMTD) countermeasure to resist the attacks from
malicious FPGA tools. To the best of our knowledge,
together with our preliminary work [10], [12], our re-
search is the first effort that assesses the feasibility of
applying the MTD concept to defeat hardware Trojan
insertion via malicious FPGA software.
• We propose three defense lines to generate three types of unpredictability, which help thwart the stealthy
design modifications induced by compromised FPGA
software. The first defense line utilizes a user constraints
file to designate a portion of the design to specific FPGA
slices. The second defense line randomly selects one of
the design replicas at runtime and uses an input gating
technique to mute the unused replicas for power saving.
The third defense line divides a design into multiple
submodules and assembles the complete design with
hot-swappable submodules at runtime, increasing the
number of design configurations on the FPGA device.
• We analyze the theoretical upper bound of hardware Trojan hit rate for each defense line, and validate the
analysis through FPGA emulation.
Digital Object Identifier: 10.1109/TVLSI.2018.2879878
1557-9999 © 2018 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
IEEE TRANS. ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. XX, NO. YY, YEAR ZZZ 2
The remainder of this work is organized as follows. Section
II discusses related work. Section III describes the attack
model used in this work. Practical attack examples on two
commercial FPGA design suites are provided in Section IV.
The FOMTD method is presented in Section V. The theoretical
security strength achieved by the three defense lines in the
FOMTD method is analyzed in Section V, as well. Extensive
evaluation of our method and relevant work is conducted in
Section VI. We conclude this work in Section VII.
II. RELATED WORK
One category of existing countermeasures against security
threats on FPGA systems focuses on IP theft issues during the FPGA
deployment phase. To avoid information leaking through hard-
ware Trojans, the MORPH architecture [13] combines multiple
levels of protection schemes, including morph operation, onion
encryption, replication, partial runtime reconfiguration and
hardware abstraction layer to mitigate the Trojans induced
in fabrication time or design time. No hardware cost and
detailed assessment are available in [13]. In the work [6], a
bitstream encryption method is implemented for the Xilinx
Virtex FPGA family. The security protocol for that encryp-
tion scheme protects the IP from being illegally copied via
restricting the access to the configuration file and key bits.
The method in [14] manipulates a state transition graph to
create a rare property and form watermarks. In the PUF-
FSM binding protection mechanism [15], the FSM in an IP
can only be activated by the correct response from the PUF
embedded in the FPGA. The MUTARCH approach in [16]
assigns each FPGA device a unique architecture to encrypt
the bitstream distinctively. Only the authorized device can
recognize the encrypted bitstream. However, those bitstream
protection methods only secure the FPGA implementation
during the bitstream generation stage. Without considering the
potential threats from the mapping and place-and-route stages in
the FPGA design flow, FPGA deployment is still vulnerable
to the threats from untrusted FPGA design suites.
Another category of defense efforts is to thwart the se-
curity threats originated from malicious FPGA devices. The
work [17] detects anomalies in the physical layer of the
FPGA by identifying the basic building block on the FPGA
die, which has different physical statistical characteristics
with neighboring blocks. In [18], a specific taxonomy of
FPGA-based hardware Trojan attacks is illustrated. That work
also presents an adapted triple modular redundancy (ATMR)
to detect hardware Trojans on FPGAs. The ATMR method
replicates the design three times and the third replica is
activated only when mismatch is found between the first two.
In the work [19], the normalized parameters (e.g., power
consumption and timing variation) are weighted and combined
as a threat detectability metric, which is compared with a
threshold to determine whether a hardware Trojan exists in
the design. The work [20] fills up the unoccupied FPGA space
with low-level dummy logics to eliminate the FPGA resource
available for hardware Trojan insertion. Those methods could
be nullified if Trojans are inserted during the process of FPGA
configuration.
There are limited works addressing the attacks from CAD
tools for FPGAs. Logic testing and side-channel analysis
have been exploited to detect the hardware Trojans inserted
through malicious FPGA design suites [21]. The Multiple
Excitation of Rare Occurrence (MERO) method [22] provides
a compact way to generate test patterns for Trojan detection.
The work [23] leverages the dependency between dynamic
current and maximum operating frequency to detect hardware
Trojans on FPGAs. Our preliminary work [12] addresses the
security challenges that occur during the FPGA deployment for
legacy systems. We apply the pin grounding scheme to the
unused FPGA I/O pins to block the communication between
FPGA Trojans and off-chip world, and further propose a
hardware MTD to thwart the Trojan insertion by malicious
CAD tools. We expand our work for legacy systems to general
FPGA applications in [10].
According to the discussions above, we conclude that most
of the existing solutions target FPGA security threats arising
either from the supply chain or from FPGA devices, not from malicious
FPGA design suites. Although the FPGA vendors [24] adopt
bit encryption, authentication, and key/register zeroization
techniques to prevent bitstreams from being tampered with, those
methods do not thwart design modification before the
bitstream is generated by the FPGA software. Our previous
work [10], [12] exploits the principle of MTD to generate
uncertainty from the attacker’s point of view, effectively mitigating
hardware Trojans and thus protecting the bitstream from
being maliciously modified. In this work, we add theoretical
analyses for our countermeasures, improve the MTD defense
strength, and perform more extensive performance and over-
head assessments.
III. ATTACK MODEL
A. Attacks from Malicious FPGA Design Suites
FPGA design software has been considered a potential
threat to FPGA security [25]. Untrusted
FPGA CAD tools can be exploited by attackers to
insert hardware Trojans [26], [27]. As shown in Fig. 1(a),
our attack model assumes that the FPGA deployment engineers,
the in-house designs, and the bitstream downloading channel
and procedure are trusted. The untrusted phase of interest in
this work is the FPGA configuration, especially the design
mapping and place-and-route stages. The attacks originate
from malicious software mounted on top of the original
FPGA design suite for SRAM FPGAs, as shown in Fig. 1(b).
The FPGA design suite may not be malicious initially, but
advanced attackers could exploit vulnerabilities of the FPGA
design suite to implant malicious software into the original suite
through software upgrading. We argue that, because the FPGA design
suite is propagated through computer networks or retailers,
the integrity of the software may be sabotaged by advanced
attackers. One motivating example for this type of attack
is as follows. An attacker knows that the military or a bank is about
to purchase an FPGA to perform some specific functions.
The source of the FPGA devices and the functional modules (in
the format of a hardware description language) are trusted after
rigorous examination. A stealthy way to compromise the
system is then through a compromised FPGA design suite.
Fig. 1. Contaminated FPGA design suite leading to a stealthy modification on the placelist for an FPGA device. (a) Software compromising stage, and (b) malicious software add-on in the supply chain of FPGA tools.
B. Three Levels of Attacks
As the malicious program is mounted on top of the original
FPGA CAD tool before FPGA users utilize the FPGA tool
and development board, it is reasonable to assume that the
attacker does not know what exact design will be mapped
to the FPGA die. Depending on the attacker’s capability, we
classify the attacks into three levels.
• L-1 attack: Based on his/her experiences, the attacker places hardware Trojans in the most popular FPGA
die area. At this level, the attacker does not have any
knowledge of the design to be configured on the FPGA.
• L-2 attack: The attacker is capable of extracting information like which FPGA slices are utilized by the current
design from the FPGA placelist (i.e., the output after
placement and routing). Although the attack at this level
does not analyze the exact function of the design, the
exploration space of L-2 attacks is significantly smaller
than that of L-1 attacks.
• L-3 attack: The malicious FPGA CAD tool searches for the design replicas used by duplication based defense
techniques, and inserts identical Trojans to each replica.
Attacks at this level are powerful, as L-3 attacks are
able to nullify the countermeasure that simply duplicates
the design. In spite of being the most challenging, L-3
attacks will cost attackers more resources to guarantee
the success of Trojan insertion attacks.
To make an effective impact on the design function, the hardware
Trojans of interest in this work are the ones altering the
look-up table (LUT) configuration for the original design. A
hardware Trojan is composed of a trigger and a payload. In
our attack model, the trigger circuit can be located in either the
occupied or unoccupied FPGA slices, but the payload circuit
must interact with the FPGA area occupied by the original
design.
[Figure 2 flow: HDL → Synthesize-XST (.ngc) → NGDBuild (Translate) (.ngd) → Map (_map.ncd) → PAR (Place & Route) (.ncd) → Bitgen (.bit). PlanAhead and the .ucf file also appear in the flow. Attack surfaces 1, 2, and 3 produce the modified files _map.ncd*, .ncd*, and .bit*, respectively.]
Fig. 2. Attack surfaces on the Xilinx FPGA design flow. The rectangles represent the output file from each step. The file with the symbol of * is an output file modified by the malicious FPGA software.
IV. DEMONSTRATION OF ATTACKS FROM MALICIOUS
FPGA SOFTWARE
In this section, we demonstrate two practical attacks through
two commercial FPGA design suites. The design suite’s built-
in tools are exploited to manually disturb the placelist.
A. Attacks on Xilinx ISE
Figure 2 depicts the design flow for a Xilinx FPGA design
suite. There are three potential attack surfaces for maliciously
implanted FPGA tools to land on. We use Xilinx ISE 14.1
as an example in the following discussion. In the
mapping step, an attacker could introduce additional I/O pins,
exchange the existing I/O pin connections, and modify the slew
rate and the voltage level of I/O pins. As the tampered
mapping output _map.ncd* is not human-readable (unless the FPGA
design suite provides a program like ncd2xdl to read back
the native circuit description file), it is not easy to notice
the modification performed by the malicious FPGA software.
More tampering on the FPGA configuration can be done
in the step of Place and Route (PAR) than in the mapping
stage because all the LUTs, flip-flops, SRAM blocks, and
interconnects are specifically designated on the FPGA die. The
attack on the bitstream generation stage is mainly for the
purpose of IP piracy, which is out of the scope of this work.
Interested readers can find extensive discussion of this issue
in the existing literature [15], [16], [28]. Our work focuses on
the first two attack surfaces shown in Fig. 2.
We successfully modified the configuration of the target
slice through the FPGA editor tool from Xilinx. Figure 3
shows the graphic interface. In the edit mode of the FPGA
editor, we changed the logic configuration after the PAR stage,
and then re-did bitstream generation. The attack can also be
performed via XDL file editing followed by the command
xdl2ncd. All attack actions here can be implemented in
malicious FPGA software implanted in the original FPGA
design suite.
Fig. 3. An example of practical attack performed through the FPGA editor tool available in the Xilinx ISE 14.1 design suite.
Fig. 4. An example of practical attack performed through Quartus Chip Planner.
B. Attacks on Altera Quartus
The Altera FPGA design suite, Quartus, leaves similar back
doors for attackers to insert hardware Trojans. The security
vulnerability of Quartus lies in the placement-and-routing
process, Fitter, which is analogous to PAR in Xilinx ISE. Attackers can,
in theory, manipulate the entire FPGA configuration if they
control Fitter or access and alter the design file that
Fitter is processing. As shown in Fig. 4, attackers can
change buffer slew rate, I/O standard or logic function of
the design via the Quartus built-in tool Chip Planner. The
malicious changes can be done after design compilation and
no re-compilation process is needed to save the changes. The
attacks performed through Chip Planner are stealthy because
they do not disturb the functional module in a format of HDL
and the constraint settings.
V. PROPOSED FPGA-ORIENTED MOVING TARGET
DEFENSE (FOMTD) METHOD
Defenders must protect every entry point from potential
security threats. In contrast, an adversary only needs to find
one way to breach the attack surface. Moreover, the attacker
may even have unlimited time to perform attacks. The main
motivation of applying the MTD concept to a system is to
reduce, if not completely eliminate, the imbalanced advantage
that an attacker could have. MTD techniques can make the
system less predictable and thus the attack surface is changed
over time [29]. The early concept of MTD was illustrated
in [30] and the application of MTD has been observed in the
domain of cyber security [31].
A. Method Overview
We exploit the principle of MTD as a means to proactively
address the security threats from malicious FPGA software.
Unlike the traditional MTD methods in the domain
of cyber security, FOMTD explores the unpredictability of a
hardware design being configured on FPGAs to deter attackers
from precisely inserting hardware Trojans. More specifically,
the key idea of FOMTD is to make the output of FPGA
placement and routing unpredictable, such that attackers who
mount a malicious program on the original FPGA design suite
cannot easily and successfully alter the original implementation.
Note that our method does not guarantee complete
prevention of all hardware intrusions. Instead, our approach
increases the difficulty of a Trojan successfully landing on one
(or more) FPGA slices occupied by the design.
The desired unpredictabilities are achieved by the three de-
fense lines provided by our method. In the domain of hardware
(i.e. FPGA), we exploit the following configuration resources
to realize the FOMTD method: (i) the availability of multiple
replicas of the intended design, (ii) random selection of one
replica for operation at runtime, (iii) random designation of
FPGA slice positions for the selected LUTs, and (iv) hot-
swappable submodules for runtime design assembling.
B. Defense Line 1 (DFL1): Slice Position Selection through
User Constraints File
1) Method description: The use of FPGA default settings
for placement and route will make the location of occupied
FPGA slices predictable, which eases the Trojan insertion
through malicious FPGA CAD tools. To address this issue,
we propose to specify some slice locations for the selected
LUT configurations. This specification can be performed by
appending commands to the user constraints file, which is
typically used to specify pin and timing constraints. Figure 5
shows the effect of the proposed defense line 1 (DFL1). As
can be seen, the entire design is mapped to a different area of
the FPGA grid thanks to the reallocation of three LUTs (black
squares in Fig. 5).
The selection of slice positions is conducted by FPGA users
at the FPGA deployment stage. As FPGA deployment happens
after the implementation of the malicious FPGA software, it
is not easy for the malicious software designers (attackers) to
ensure the injected hardware Trojans successfully alter user
designs. Here, we assume that attackers do not have access
to the user constraints file applied after the FPGA CAD tool
is delivered to the FPGA user. A blindly inserted Trojan may
not effectively impact the design on the FPGA.
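As a rough illustration of the slice position specification, the Python sketch below emits user-constraints lines that pin randomly selected LUT instances to user-chosen slices. The function name make_loc_constraints, the instance names, and the slice coordinates are hypothetical. The INST ... LOC = SLICE_XxYy; form follows the usual Xilinx UCF location-constraint syntax, and grouping four LUTs per slice is an assumption made for illustration, not a statement of how the authors wrote their constraints.

```python
import random


def make_loc_constraints(lut_instances, slice_sites, luts_per_slice=4, seed=None):
    """Return UCF lines that pin randomly chosen LUT instances to given slices.

    lut_instances: list of instance names (hypothetical names in this sketch).
    slice_sites:   list of (x, y) slice coordinates chosen by the FPGA user.
    """
    rng = random.Random(seed)
    needed = luts_per_slice * len(slice_sites)
    if needed > len(lut_instances):
        raise ValueError("not enough LUT instances for the requested slices")
    # Sample without replacement so that no LUT is constrained twice.
    chosen = rng.sample(lut_instances, needed)
    lines = []
    for idx, (x, y) in enumerate(slice_sites):
        group = chosen[idx * luts_per_slice:(idx + 1) * luts_per_slice]
        for inst in group:
            lines.append(f'INST "{inst}" LOC = SLICE_X{x}Y{y};')
    return lines


# Hypothetical instance names; a real design would take them from the netlist.
luts = [f"c6288/lut_{i}" for i in range(32)]
single_slice = make_loc_constraints(luts, [(10, 20)], seed=1)
triple_slice = make_loc_constraints(luts, [(10, 20), (11, 20), (12, 21)], seed=7)
print("\n".join(single_slice))
```

Because the coordinates and the selected LUTs are chosen by the user after the CAD tool has been delivered, a different seed or site list yields a different placement starting point each time the constraints are regenerated.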
2) Case study: We used the ISCAS benchmark circuit
c6288 as an example to show the effect of slice position
specification. In the first case, we followed the default settings
of the Xilinx ISE 14.1 to generate the placelist for c6288. In
the second case, we chose one slice position for four randomly
selected LUTs (we refer to this as the single-slice case). In
the third case, three slice locations are designated to twelve
LUTs (triple-slice case). We can observe the design placement
details in the FPGA editor. Figure 6 shows the slice occupation
results (red dots) for the three cases described above. As can
be seen, our defense line 1 indeed significantly changes the
design placement on the FPGA die.
Fig. 5. FPGA mapping modified by proposed defense line 1. Three parts in different colors represent three partitions of the intended design. Black squares are three LUT configurations. Proposed defense line 1 alters the default LUT mapping on the FPGA grid.
Fig. 6. Design placement observed from the Xilinx FPGA editor for (a) default setting, (b) single-slice selection, and (c) triple-slice selection cases.
3) Theoretical bound for defense line 1 thwarting different
Trojan attacks: The baseline here is the original design
without any protection. We assume that the intended baseline
design occupies φ slices and that the entire FPGA die is composed of Φ user-controllable slices. We define the hardware Trojan hit rate, Γ, as the probability that a randomly picked slice is indeed one of the slices utilized by the design. As long as the
Trojan payload is located in the area occupied by the original
design, we consider it as a Trojan hit. If an attacker blindly
inserts a hardware Trojan to the FPGA die (i.e., blind attack),
the Trojan hit rate is equal to Eq. (1).
$$\Gamma_{\text{baseline vs. blind attack}} = \frac{\varphi}{\Phi} \tag{1}$$
When the attacker has knowledge of the commonly used
slice area (i.e. L-1 attack), the target FPGA area will be smaller
than the entire FPGA die. The empirical number ξ is the coefficient for how much Trojan insertion space is narrowed by
the attacker based on his/her experience, and the range of ξ is between 0 and 1. Hereafter, we name ξ as the space coefficient of Trojan attacks. The function f(ξ) represents the degree of accuracy regarding whether the real design placement matches
to the attacker’s prediction. The detailed function of f(ξ) varies with the attacker’s LUT occupation guessing algorithm.
Now, the hardware Trojan hit rate for the design without any
protection against L-1 attack is calculated in Eq. (2). If f(ξ) reaches its maximum value, the entire design will be covered
in the attack space. $\Gamma_{\text{baseline vs. L-1}}$ will decrease with the increasing space coefficient of Trojan attack ξ.
$$\Gamma_{\text{baseline vs. L-1}} = \frac{f(\xi) \cdot \varphi}{\xi \cdot \Phi} \tag{2}$$
When the L-2 attacker has the knowledge of the detailed
slice utilization, each inserted hardware Trojan will certainly
impact the original design because the Trojan exploration
space is equal to the injection space. The Trojan hit rate for
L-2 attacks, $\Gamma_{\text{baseline vs. L-2}}$, is expressed in Eq. (3).
$$\Gamma_{\text{baseline vs. L-2}} = \frac{\varphi}{\varphi} = 1 \tag{3}$$
In contrast, our proposed defense line 1 (DFL1) does not
use the default FPGA mapping settings. Thus, the target FPGA
area remains as the entire FPGA die Φ. Our Trojan hit rate turns to Eq. (4). Comparing Eq. (2) and Eq. (4) we can see
that the denominator of Eq. (4) is larger than that in Eq. (2).
Hence, our defense line 1 reduces the Trojan hit rate in the
scenario of L-1 attacks. Once the attacker knows the exact slice
utilization, the proposed defense line 1 cannot thwart L-2 and
L-3 attacks and the corresponding Trojan hit rate is 1.
$$\Gamma_{\text{DFL1 vs. L-1}} = \frac{f(\xi) \cdot \varphi}{\Phi} \tag{4}$$
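To make the comparison between Eqs. (1), (2), and (4) concrete, the following Python sketch evaluates the bounds for one set of illustrative numbers. The values of φ, Φ, ξ, and f(ξ) are assumptions chosen only for arithmetic convenience and are not taken from the experiments. Since f(ξ) depends on the attacker's guessing algorithm, it is passed in as a plain number.

```python
def hit_rate_blind(phi, Phi):
    """Eq. (1): blind insertion over the whole die."""
    return phi / Phi


def hit_rate_baseline_l1(phi, Phi, xi, f_xi):
    """Eq. (2): unprotected design against an L-1 attacker."""
    return f_xi * phi / (xi * Phi)


def hit_rate_dfl1_l1(phi, Phi, f_xi):
    """Eq. (4): design protected by defense line 1 against an L-1 attacker."""
    return f_xi * phi / Phi


# Illustrative (assumed) numbers: 200 occupied slices on a 2000-slice die,
# an attacker who narrows the search to a quarter of the die (xi = 0.25)
# with prediction accuracy f(xi) = 0.8.
phi, Phi, xi, f_xi = 200, 2000, 0.25, 0.8

blind = hit_rate_blind(phi, Phi)
baseline = hit_rate_baseline_l1(phi, Phi, xi, f_xi)
protected = hit_rate_dfl1_l1(phi, Phi, f_xi)
print(f"blind={blind:.2f} baseline_L1={baseline:.2f} DFL1_L1={protected:.2f}")
```

With these assumed values the L-1 hit rate drops from 0.32 to 0.08, which reflects the larger denominator in Eq. (4).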
C. Defense Line 2 (DFL2): Pseudo-Random Replica Selection
1) Method description: FPGAs are inherently reconfigurable
and redundant. We exploit these properties to implement the
principle of MTD on FPGAs. Suppose a design is composed
of multiple parts (although design partitioning is not always
necessary). We duplicate the entire design (as a single unit) n times. Only one of the replicas will be active at a time, and the rest of
the replicas are kept inactive by using an input gating technique. The
replica selection and input gating are controlled by a pseudo-
random selector, which is not a true random number generator.
Because we only have a limited number of replicas on the
FPGA, the range of the random number is not large. A user-
defined arbitrary logic function and a set of external inputs are
good enough to pseudo-randomly choose one of the replicas.
Meanwhile, the use of user-defined arbitrary logic can prevent
attackers from searching the typical random number generator
circuit to nullify the countermeasure in advance. Figure 7
shows the concept of our defense line 2. To save power, it does not
include comparison logic to examine the consistency among
the n replicas. Because the active replica
is determined only after the FPGA configuration, an
attacker (at L-1) needs to blindly place the hardware Trojan
on the entire FPGA die to make a successful attack.
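The replica selection and input gating described above can be modeled behaviorally in a few lines of Python. This is only a sketch under stated assumptions: the XOR-folding function stands in for the user-defined arbitrary logic, and the names select_replica and gate_inputs are hypothetical. The actual selector would be a small logic circuit on the FPGA.

```python
def select_replica(ext_inputs, n):
    """Map a set of external input bits to a replica index in [0, n).

    The XOR folding below is an arbitrary, assumed logic function; any
    user-defined function of the external inputs could be used instead.
    """
    width = max(1, (n - 1).bit_length())  # bits needed to index n replicas
    acc = 0
    for i, bit in enumerate(ext_inputs):
        acc ^= (bit & 1) << (i % width)
    return acc % n


def gate_inputs(data, active, n):
    """Forward the data to the active replica and hold all others at zero."""
    return [list(data) if r == active else [0] * len(data) for r in range(n)]


active = select_replica([1, 0, 1, 1], n=3)
gated = gate_inputs([1, 1], active, n=3)
print(active, gated)
```

Holding the inputs of inactive replicas at a constant value is what keeps their switching activity, and hence their dynamic power, low.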
2) Theoretical bound for defense line 2 thwarting different
Trojan attacks: Figure 8 depicts an example of exploration
expansion by our proposed defense line 2. A complete design
(including replication) consists of multiple units and the con-
trol logic for replica selection is small enough (compared to
φ) to be ignored for the simplicity of analysis. Because of the slice position specification, the rough size of the Trojan
exploration space $S_{FOMTD}$ can be expressed by Eq. (5).
$$S_{FOMTD} = \max\left(|X_i - X_k|\right) \cdot \max\left(|Y_k - Y_j|\right) \tag{5}$$
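Eq. (5) can be read as the area of the bounding box spanned by the specified slice positions. The Python sketch below computes it under that reading, assuming that X and Y denote slice column and row coordinates. The function name and the sample coordinates are hypothetical.

```python
def exploration_space(sites):
    """Bounding-box area spanned by (x, y) slice positions, per Eq. (5).

    The largest pairwise difference |Xi - Xk| equals max(X) - min(X),
    and likewise for Y.
    """
    xs = [x for x, _ in sites]
    ys = [y for _, y in sites]
    return (max(xs) - min(xs)) * (max(ys) - min(ys))


# Assumed slice positions designated through the user constraints file.
sites = [(2, 3), (10, 20), (5, 40)]
print(exploration_space(sites))  # (10 - 2) * (40 - 3) = 296
```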
Compared to the baseline, our method achieves the theoretical
worst-case hardware Trojan hit rates for L-2 and L-3
attacks described in Eqs. (6) and (7), respectively. If L-2
attacks take place in the design, $\Gamma_{\text{baseline vs. L-2}}$ increases
to 1. In contrast, $\Gamma_{\text{DFL1\&2 vs. L-2}}$ remains low because the defense line 2 expands the Trojan exploration space. The
exact Trojan hit rate depends on the size of the design unit
for duplication, ν. Eq. (6) is greater than $\varphi / S_{FOMTD}$. Under
the condition of L-3 attacks, our Trojan hit rate will not go
beyond 1/n (theoretically, the worst case corresponds to a uniform distribution over random replica selections). In our
simulation section, we observed that our actual Trojan hit rate
never reaches this upper bound.
$$\Gamma_{\text{DFL1\&2 vs. L-2}} = \frac{\varphi}{n \cdot \nu + (\varphi - \nu)} \tag{6}$$
$$\Gamma_{\text{DFL1\&2 vs. L-3}} = \frac{\varphi}{n \cdot \varphi} = \frac{1}{n} \tag{7}$$
Fig. 7. Schematic diagram of proposed defense line 2.
Fig. 8. Hardware Trojan attack exploration space for (a) the design placement with default FPGA setting and (b) the design protected with FOMTD defense lines 1 and 2.
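A short numerical check of Eqs. (6) and (7) is sketched below in Python. The slice counts and the number of replicas are assumed values for illustration only.

```python
def hit_rate_dfl2_l2(phi, nu, n):
    """Eq. (6): phi occupied slices, a duplicated unit of nu slices, n replicas."""
    return phi / (n * nu + (phi - nu))


def hit_rate_dfl2_l3(n):
    """Eq. (7): worst case under uniform random replica selection."""
    return 1 / n


# Assumed numbers: a 200-slice design in which a 100-slice unit is replicated 3 times.
partial = hit_rate_dfl2_l2(phi=200, nu=100, n=3)
# When the whole design is the duplicated unit (nu == phi), Eq. (6) reduces to 1/n.
full = hit_rate_dfl2_l2(phi=200, nu=200, n=3)
print(partial, full, hit_rate_dfl2_l3(3))
```

The example shows why the size of the duplicated unit matters: duplicating only half of the design leaves the L-2 bound at 0.5, whereas duplicating all of it brings the bound down to 1/n.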
D. Defense Line 3 (DFL3): Runtime Design Assembling
1) Method description: Our defense line 3 is the hot-
swappable submodule assembling technique, as shown in
Fig. 9. We partition the original design into m submodules
and each submodule is duplicated n times. Only one
replica of each submodule will be assembled into a complete
design. The pseudo-random selector is utilized to determine
which replica is chosen at runtime. After a period of time,
the selection of submodule replicas will be changed without
stopping the normal operation (i.e. hot-swappable assembling).
The maximum number of design configurations is $n^m$. This
large number of configurations further increases the difficulty
for the attacker to recognize the entire design for attack.
Fig. 9. Schematic diagram of the Hot-swappable submodule assembling technique provided by proposed defense line 3.
Fig. 10. Two styles of applying defense line 3 to sequential circuits.
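The count of design configurations can be checked by enumeration. The Python sketch below lists every way of picking one of n replicas for each of m submodules and draws one assembly at random, as a hot swap would. The function names are hypothetical and the random choice stands in for the pseudo-random selector.

```python
import random
from itertools import product


def all_configurations(n, m):
    """Every assembly: one replica index (0..n-1) for each of the m submodules."""
    return list(product(range(n), repeat=m))


def pick_configuration(n, m, rng):
    """Draw one assembly at random, as done at each hot swap."""
    return tuple(rng.randrange(n) for _ in range(m))


configs = all_configurations(n=2, m=3)
print(len(configs))  # 2**3 = 8

rng = random.Random(0)
current = pick_configuration(2, 3, rng)
print(current in configs)  # True
```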
The hot-swappable assembling technique shown in Fig. 9
is directly applicable for combinational circuits. We tailor this
technique to make it suitable for sequential circuits. As shown
in Fig. 10, two styles are available for the circuit composed of
combinational logic and memory elements. In style I, we do
not duplicate the registers so that the submodule assembling
techniques for combinational and sequential circuits are the
same. In style II, the registers have replicas, too. To realize the
hot-swappable feature, we copy the content of active registers
to the hot-swap registers (HS Reg. in Fig. 10) before the
runtime submodule swapping happens. Then, we load the
value saved in HS Reg. to all register replicas to resume the
operation after runtime submodule swapping.
Additional option 1: input gating. To thwart L-3 attacks,
we could further strengthen our defense line 3 by loosening the
input gating and keeping two replicas active, such that the
two replicas can be used to examine the consistency between
their final outputs. However, the enhanced defense capability
comes with more power consumption.
Additional option 2: gate replacement on replicas. To
better defeat L-3 attacks, we enhance our defense line 3
by bringing diversity to the replicas of hot-swappable
submodules. In the work [18], implementation diversity
is introduced by using different hard macros, which are
obtained by applying different constraint conditions during
FPGA synthesis. Inspired by that work, we create hard macros
at gate level so that we have more flexibility to facilitate
the implementation of heterogeneous replicas for submodules.
Those gate-level hard macros are used to replace some gates in
one of the replicas. As a result, even if an attacker searches for
the same FPGA configuration patterns between two replicas,
the success rate of finding two identical copies for future
Trojan insertion will be extremely low.
Fig. 11. Gate replacement for the security enhancement in defense line 3.
The flowchart for the proposed gate replacement on replicas
is depicted in Fig. 11. First, we randomly choose one (or
more) type(s) of logic gates, for instance NAND (c, a, b),
in one replica. Next, we apply De Morgan’s laws to
replace the chosen gate with other types of logic gates, while
maintaining the same Boolean function. For the 2-input NAND
gate, we can replace it with OR (c, ∼a, ∼b). Note that all gate replacement is done in the Verilog description. To prevent
the FPGA synthesis tool from removing our gate replacement
during the logic optimization process, we implement the OR
(c, ∼a, ∼b) with three customized hard macros, HM_OR (ā, b̄, c), HM_NOT (a, ā) and HM_NOT (b, b̄). HM_OR and HM_NOT, defined in Verilog, work as the logic OR and
inversion operations. By using hard macros, the gates for
replacement can be mapped into one independent slice and
will not be merged with other LUT configurations. We
can conduct gate replacement for one or multiple replicas so
that the identical LUT configurations are removed. Hence,
our enhanced defense line 3 can thwart L-3 attacks.
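The functional equivalence behind the gate replacement follows from De Morgan’s laws and can be confirmed with an exhaustive truth-table check. The Python sketch below models the gates behaviorally. The function names hm_or and hm_not merely mirror the hard macro names and are not the Verilog definitions.

```python
from itertools import product


def nand(a, b):
    """Original 2-input NAND gate."""
    return 1 - (a & b)


def hm_not(a):
    """Behavioral stand-in for the inverter macro."""
    return 1 - a


def hm_or(a, b):
    """Behavioral stand-in for the OR macro."""
    return a | b


def replaced_nand(a, b):
    """Replacement structure: OR of the two inverted inputs."""
    return hm_or(hm_not(a), hm_not(b))


# Exhaustive check over all four input combinations.
equivalent = all(nand(a, b) == replaced_nand(a, b) for a, b in product((0, 1), repeat=2))
print(equivalent)  # True
```

The two structures compute the same function but use different gates, which is what removes the identical LUT patterns an L-3 attacker would search for.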
2) Theoretical bound for defense line 3 thwarting different
Trojan attacks: With the defense line 3, we can obtain $n^m$
configurations in total. Given a design, more submodules lead
to more dynamic configurations and thus more unpredictability
for Trojan insertion. The coefficient $\xi_i$ varies for each configuration and so does $f(\xi_i)$. To obtain the Trojan hit consistently, the overall Trojan hit rate for L-1 attacks is as expressed in Eq.
(8), in which sp is the number of hot-swapping configurations. As the slices for the non-duplicated portion of the design
change in each FPGA configuration, the overall Trojan hit
rate is the product of the Trojan hit rates for sp different hot-swapping configurations. The maximum value of sp is $n^m$.
$$\Gamma_{\text{DFL3 vs. L-1}} = \prod_{i=1}^{n^m} \left( \frac{f(\xi_i) \cdot \varphi}{\Phi} \right)^{sp} \tag{8}$$
With respect to L-2 attacks, the attacker knows which slices
are occupied by the design but cannot differentiate which
submodule belongs to which replica. Hence, the target slice
for Trojan insertion is not clear. The attacker has to randomly
choose φ slices out of all the occupied slices $n \cdot \nu + (\varphi - \nu)$. The corresponding Trojan hit rate for this scenario is expressed
in Eq. (9).
$$\Gamma_{\text{DFL3 vs. L-2}} = \left( \frac{\varphi}{n \cdot \nu + (\varphi - \nu)} \right)^{sp} \tag{9}$$
In L-3 attacks, the attacker has full knowledge of which
slices are configured for the design protected with the defense
line 3, but he/she could only form the complete design by
guessing which submodule replicas will be used. Without gate
replacement, the corresponding Trojan hit rate is shown in Eq.
(10), where $\sum_{i=1}^{m} x_i$ is equal to ν. The more swapping during
the runtime operation (i.e., higher sp), the lower the Trojan hit rate
the attacker could achieve.
$$\Gamma_{\text{DFL3 vs. L-3}} = \left( \frac{(m+1)! \cdot \prod_{i=1}^{m} n x_i}{\prod_{i=1}^{m} \left( n \cdot \nu + (\varphi - \nu) - i \right)} \right)^{sp} \tag{10}$$
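The effect of the swapping count sp on the bounds can be seen numerically. The Python sketch below evaluates Eq. (8) as printed and Eq. (9) for assumed values of φ, Φ, ν, n, and f(ξ_i). Eq. (10) is left out of the sketch. All numbers are illustrative and are not results from the experiments.

```python
import math


def hit_rate_dfl3_l1(f_values, phi, Phi, sp):
    """Eq. (8) as printed: product over configurations of (f(xi_i) * phi / Phi) ** sp."""
    return math.prod((f * phi / Phi) ** sp for f in f_values)


def hit_rate_dfl3_l2(phi, nu, n, sp):
    """Eq. (9): the per-configuration L-2 hit rate raised to the power sp."""
    return (phi / (n * nu + (phi - nu))) ** sp


# Assumed numbers: 200 occupied slices, a 100-slice duplicated portion, 3 replicas.
for sp in (1, 2, 4):
    print(sp, hit_rate_dfl3_l2(phi=200, nu=100, n=3, sp=sp))

# Two configurations with assumed prediction accuracies 0.8 and 0.5.
l1 = hit_rate_dfl3_l1([0.8, 0.5], phi=200, Phi=2000, sp=1)
print(l1)
```

Because each factor is below 1, every additional hot swap multiplies the bound by another factor smaller than 1, so the hit rate falls geometrically with sp.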
VI. EXPERIMENTAL RESULTS
A. Experimental Setup
In the following experiments, we used the Xilinx ISE 14.1
design suite to synthesize, place and route the netlist of
ISCAS’85 and ISCAS’89 benchmark circuits, and the Amber
23 processor core (hereafter, a23) and the communication
controller Ethernet MAC (hereafter, ethmac) downloaded from
the OpenCores website. The ISCAS circuits were configured
for a Xilinx Spartan-6 XC6SLX16 FPGA, and the large-scale
a23 and ethmac circuits were mapped to a Xilinx XC6SLX75
FPGA. The detailed slice utilization of each circuit was
analyzed by our Python script to extract the occupied FPGA
slice positions. We used MATLAB to insert hardware Trojans
blindly or purposely (depending on the experimental goal)
in the extracted placelists to mimic the Trojan injection in
the FPGA mapping and PAR stages, and then measured the
hardware Trojan hit rate. We assume that only the Trojans
having payloads in the FPGA slices occupied by the design
under protection will lead to a Trojan hit. The FPGA slice
utilization and worst-case delay were obtained from the tools
available in the Xilinx design suite.
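The hit-rate measurement described above can be sketched as a small Monte Carlo experiment. The sketch below is written in Python rather than MATLAB, and the grid size, the placement, the function name `estimate_hit_rate`, and the way the attack space is chosen are all assumptions made for illustration; it only captures the stated rule that a Trojan counts as a hit when it lands in a slice occupied by the design.

```python
import random


def estimate_hit_rate(occupied, die_slices, attack_fraction, trials=10_000, seed=0):
    """Estimate how often one blindly placed Trojan lands in an occupied slice.

    occupied        -- set of (x, y) slice positions used by the design
    die_slices      -- list of all (x, y) slice positions on the die
    attack_fraction -- share of the die the attacker explores (0 < f <= 1)
    """
    rng = random.Random(seed)
    # Assumption: the attack space is simply the leading part of the die list.
    size = max(1, int(len(die_slices) * attack_fraction))
    attack_space = die_slices[:size]
    hits = sum(rng.choice(attack_space) in occupied for _ in range(trials))
    return hits / trials


if __name__ == "__main__":
    die = [(x, y) for x in range(20) for y in range(20)]      # toy 20 x 20 die
    design = {(x, y) for x in range(5) for y in range(5)}     # toy 5 x 5 design
    for fraction in (0.25, 0.5, 1.0):
        print(fraction, estimate_hit_rate(design, die, fraction))
```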
B. Variation on FPGA Slice Utilization
Variation in slice allocation for a design is critical to ensure
the high unpredictability offered by our method. Hence, we
first examined the impact of our defense line 1 on the FPGA
slice utilization. We compared all the slices used by the
baseline design with those used by the design to which
user-specified slice designations were applied. The baseline means
the original benchmark circuits without any protection. We define
a metric, the non-similarity rate, to assess the difference in slice
locations introduced by our defense line 1. The non-similarity rate
is the ratio of the number of LUT instances placed at new
positions due to our method to the total number of slices
used in the baseline.
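The definition above can be written down directly. The following Python sketch assumes that each placement is available as a mapping from a LUT instance name to its slice position; the function name, the data layout, and the toy placements are hypothetical.

```python
def non_similarity_rate(baseline, protected):
    """Ratio of moved LUT instances to the number of slices used in the baseline.

    Both arguments map a LUT instance name to its (x, y) slice position.
    """
    # A LUT counts as moved if its slice differs (or it is missing) in the
    # protected placement.
    moved = sum(1 for name, pos in baseline.items() if protected.get(name) != pos)
    # Denominator as stated in the text: distinct slices used by the baseline.
    used_slices = len(set(baseline.values()))
    return moved / used_slices


if __name__ == "__main__":
    # Toy placements, for illustration only.
    base = {"lut_a": (0, 0), "lut_b": (0, 1), "lut_c": (1, 0), "lut_d": (1, 1)}
    prot = {"lut_a": (0, 0), "lut_b": (3, 1), "lut_c": (1, 0), "lut_d": (4, 2)}
    print(non_similarity_rate(base, prot))
```

In this toy case two of the four LUT instances move and the baseline uses four slices, so the rate is 0.5.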
IEEE TRANS. ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. XX, NO. YY, YEAR ZZZ 8
Fig. 12. Non-similarity rate achieved by proposed defense line 1. The subscripts 1s and 3s mean that the location of a single slice or of three slices is specified in the user constraints file for the FPGA implementation. On each bar, the central mark indicates the median, and the bottom and top edges of the box indicate the 25th and 75th percentiles, respectively. [Box plot; x-axis: c432, c1355, c1908, and c6288 for the 1s and 3s cases; y-axis: non-similarity rate, 0.47 to 0.53.]
TABLE I. MEDIANS OF NON-SIMILARITY RATE ON FPGA CONFIGURATION.
Circuits   c432_1s    c1355_1s   c1908_1s   c6288_1s    Std. deviation
Median     0.49167    0.50595    0.49351    0.49123     0.0070
Circuits   c432_3s    c1355_3s   c1908_3s   c6288_3s    Std. deviation
Median     0.5000     0.50595    0.49351    0.50125     0.0051
Circuits   s344_2s    s526_2s    s1488_2s   s13207_2s   Std. deviation
Median     0.48333    0.42105    0.4878     0.43367     0.0340
As shown in Fig. 12, our method achieves an average non-similarity
rate in the range of 0.49 to 0.51. This means that, on
average, about 50% of the LUT instances for each benchmark
circuit are placed at different positions on the FPGA die
due to our defense line 1. We repeated the simulation of the
non-similarity rate for sequential circuits and summarized the
median values of all non-similarity rates in Table I. As shown,
the proposed defense line 1 achieves a non-similarity rate of
approximately 0.5. Increasing the number of user-specified
slice locations slightly enlarges the variation in the
non-similarity rate (which still remains close to 0.5). Each
non-similarity rate in Fig. 12 and Table I was based on five test trials.
According to our case study, the average standard deviation
of the median values of the different non-similarity rates is in the
range of 0.0070 to 0.034, which is very small.
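As a quick cross-check, the "Std. deviation" column of Table I appears to be consistent with the sample standard deviation (n - 1 in the denominator) of the four medians in each row. The sketch below recomputes it with Python's standard library; the row labels and the function name are introduced here only for illustration, and the choice of the sample estimator is an assumption.

```python
from statistics import stdev

# Median non-similarity rates as listed in Table I.
medians = {
    "1s": [0.49167, 0.50595, 0.49351, 0.49123],
    "3s": [0.5000, 0.50595, 0.49351, 0.50125],
    "2s": [0.48333, 0.42105, 0.4878, 0.43367],
}


def table_stdevs(rows):
    """Sample standard deviation of each row of medians."""
    return {label: stdev(values) for label, values in rows.items()}


if __name__ == "__main__":
    for label, value in table_stdevs(medians).items():
        print(f"{label}: {value:.4f}")
```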
Figure 13 provides the average non-similarity rates for seven
benchmark circuits (c432, c1355, c1908, c6288, s444, s1488,
and s13207) based on five trials. The non-similarity rates are
all near 0.5, regardless of the number of FPGA slices
re-allocated by defense line 1. Based on the results above, we do
not recommend that users re-allocate more than three slices, even for
large designs.
C. Assessment on Attack Resilience
The attack resilience of the baseline and that of our method are
compared in this section. The three attack levels described in
Section III.B are considered in the following assessment.
Fig. 13. Average non-similarity rate for different number of re-allocated slices.
Fig. 14. Hardware Trojan hit rate reduction by proposed defense line 1 applied in the benchmark circuit (a) c432, (b) c1355, (c) c1908, (d) c6288, (e) s444, and (f) a23 in the scenario of L-1 attacks. [Each panel plots the hardware Trojan hit rate Γ against the space coefficient of Trojan attacks ξ, from 0% to 50%, for the baseline and the proposed design.]
Fig. 15. Hardware Trojan hit rate for (a) c432, and (b) nine benchmark circuits suffering from four hardware Trojans inserted via L-2 attacks. [Panel (a) plots the hit rate against the number of inserted hardware Trojans, 1 to 6; panel (b) plots the hit rate per benchmark circuit; both compare the baseline, DFL2, and DFL3.]
1) Hardware Trojan Hit Rate for L-1 Attacks: Recall that
attackers who execute L-1 attacks do not know the locations
of all occupied slices for the design of interest. We varied the
range of the attack exploration space from 5% to 50% of the entire
FPGA die. Figure 14 shows that the proposed defense line 1
achieves a lower hardware Trojan hit rate Γ than the baseline over a wide range of the attack exploration space. This is because
our defense line 1 makes the LUT placement unpredictable and
not targetable for L-1 attackers. The hardware Trojan hit rate
for c432, c1908, c6288, s444, and a23 first increases with
increasing ξ. This is because f(ξ) ∗ φ in Eq. (2), the number of occupied slices falling in the attack space, grows faster
than ξ ∗ Φ, the attack space. As the maximum value of f(ξ) is 1, Γbaseline starts to drop after ξ exceeds a threshold. In our case studies, the ξ thresholds for c432, c1355, c1908, c6288, s444, and a23 are 15%, 5%, 15%, 25%, 40%, and
35%, respectively. The case of c1355 has a smaller ξ threshold than the other benchmark circuits, so we do not observe
the corresponding Γbaseline increasing with ξ. The hardware Trojan hit rate of our method increases much more slowly with
increasing ξ than that of the baseline. When the attack exploration space is large enough to cover the entire design placed on
the FPGA die, the Trojan hit rate of the proposed method will
eventually approach that of the baseline.
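The rise-then-fall behavior of the baseline hit rate can be illustrated with a toy model. The sketch below assumes that Eq. (2) has the form Γ = f(ξ) ∗ φ / (ξ ∗ Φ), as the discussion above suggests, and it uses a made-up coverage function f(ξ) that saturates at 1 once ξ reaches a threshold. The function names, the shape of f, and all numeric values are hypothetical.

```python
def coverage(xi: float, xi_threshold: float) -> float:
    """Hypothetical f(xi): share of the design's slices inside the attack space."""
    return min(1.0, (xi / xi_threshold) ** 2)


def baseline_hit_rate(xi, phi, big_phi, xi_threshold):
    """Gamma = f(xi) * phi / (xi * Phi), the form assumed here for Eq. (2)."""
    return coverage(xi, xi_threshold) * phi / (xi * big_phi)


if __name__ == "__main__":
    # Illustrative values only: 50 occupied slices on a 2000-slice die.
    phi, big_phi, xi_threshold = 50, 2000, 0.15
    for pct in range(5, 55, 5):
        rate = baseline_hit_rate(pct / 100, phi, big_phi, xi_threshold)
        print(f"xi={pct:2d}%  Gamma={rate:.4f}")
```

Under these assumptions the hit rate climbs while f(ξ) is still growing and then decays roughly as 1/ξ once f(ξ) has reached 1, so the peak sits at the threshold.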
2) Hardware Trojan Hit Rate for L-2 Attack: Unlike
L-1 attacks, L-2 attacks are able to retrieve the exact locations
of the occupied slices. Consequently, the baseline design does
not have any resilience against L-2 attacks. The proposed
defense line 2 (DFL2) activates one complete design replica
according to the pseudo-random selection, and defense
line 3 (DFL3) assembles the hot-swappable submodules at
runtime. Here, we used two design replicas, and each replica
was composed of four submodules. Our simulation indicates that
DFL2 and DFL3 further increase the unpredictability of the
truly activated design copy and achieve a lower Trojan hit
rate than the baseline. As shown in Fig. 15(a), the baseline
yields a hardware Trojan hit rate of 1, which means Trojans are
always injected into the occupied slices. In contrast, our DFL2
and DFL3 significantly reduce the Trojan hit rate relative to the
baseline, especially for a small number of injected Trojans.
When more Trojans are placed in the utilized FPGA slices, our
Trojan hit rate eventually increases due to the limited number
of replicas available in the design.
Fig. 16. Hardware Trojan hit rate for (a) c432, and (b) nine benchmark circuits suffering from four hardware Trojans inserted via L-3 attacks. [Panel (a) plots the hit rate against the number of inserted hardware Trojans, 1 to 6; panel (b) plots the hit rate per benchmark circuit; both compare the baseline, DFL2, and DFL3.]
We examined the Trojan hit rate for nine benchmark circuits,
each subjected to four Trojan insertions via L-2 attacks. Each
hardware Trojan hit rate was obtained from 10,000 test cases.
The average Trojan hit rate of DFL2 (DFL3) is 71% (38%).
As shown in Fig. 15(b), DFL2 reduces the hit rate by up
to 40% relative to the baseline. The reduction in the Trojan hit rate
can be further improved to 91% with DFL3.
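One way to relate the average hit rates to a reduction figure, assuming the reduction is measured relative to the baseline hit rate of 1 reported for L-2 attacks, is the following worked calculation. The symbol R is introduced here only for illustration.

```latex
R = \frac{\Gamma_{\text{baseline}} - \Gamma_{\text{DFL}}}{\Gamma_{\text{baseline}}}, \qquad
R_{\text{DFL2}} = \frac{1 - 0.71}{1} = 0.29, \qquad
R_{\text{DFL3}} = \frac{1 - 0.38}{1} = 0.62
```

Under this reading, 29% and 62% would be the average reductions over the nine circuits, while the 40% and 91% figures describe the best per-circuit cases.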
3) Hardware Trojan Hit Rate for L-3 Attack: L-3 attacks
can recognize the multiple replicas of the design by searching
for exactly identical or approximately similar LUT configurations.
We repeated the same experiments as in Section
VI.C.2), except with a different attack level. The sequential
circuits s1488 and s13207 were implemented in style I.
As shown in Fig. 16(a), the Trojan hit rate for the design under
L-3 attacks increases with the number of Trojans.
This trend is similar to that for the L-2 attack case. However,
the average Trojan hit rate of DFL2 (DFL3) against L-3 attacks
increases to 76% (48%), which is higher than in the scenario
of L-2 attacks. As shown in Fig. 16(b), DFL2 reduces the
hit rate by up to 35% relative to the baseline. DFL3 further
improves the attack resilience by up to 72%. From Figs. 15(b)
and 16(b), we can also conclude that L-3 attacks are indeed
more powerful than L-2 attacks. This is because L-3 attacks
can search for the matched LUT configuration patterns.
Fig. 17. Comparison of the number of Trojan hits without and with gate replacement to thwart L-3 pattern searching attack. (a) Exact matching and (b) approximate matching. [Each panel plots the average number of matches on a logarithmic axis from 10^0 to 10^3 for c432, c1355, c1908, and c6288.]
Figure 17(a) shows that the average number of exactly
matched LUT configurations for each benchmark circuit is
close to 10^0 (i.e., 1). If attackers search for LUT configurations
that have a similar format but use different
input/output pins (i.e., approximate matching), the number of
matched cases increases. To address this issue, we applied
the gate replacement technique to defense line 3. As can
be seen from Fig. 17(a), our enhanced method increases
the number of exactly matching LUT configurations, so that
identical LUT configurations no longer stand for the identical
logic function of the benchmark circuit. Therefore,
when an attacker performs the L-3 attack, the Trojan hit
rate of our method can be reduced. In addition to increasing
the number of exact matching cases, our gate replacement
technique also increases the number of approximate matching
patterns, as shown in Fig. 17(b). As a result, our enhanced
DFL3 reduces the hardware Trojan hit rate. As Fig. 18
shows, the proposed gate replacement technique reduces the
Trojan hit rate for different circuits. On average, our method
decreases the Trojan hit rate by 62% and 88% for
attackers searching for exact matching and approximate
matching configurations, respectively.
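The pattern search attributed to L-3 attackers can be sketched as a pair-counting routine over a list of LUTs. The data model below is an assumption made for illustration: a LUT is reduced to its truth-table contents plus its input nets, an exact match means both agree, and an approximate match means the contents agree while the pins differ. The names `Lut`, `count_pairs`, and `count_matches` are hypothetical.

```python
from collections import Counter
from typing import NamedTuple


class Lut(NamedTuple):
    init: str     # truth-table contents, e.g. a hex INIT string
    pins: tuple   # nets connected to the LUT inputs


def count_pairs(counter):
    """Number of unordered pairs inside each group of identical keys."""
    return sum(k * (k - 1) // 2 for k in counter.values())


def count_matches(luts):
    """Return (exact, approximate) match counts over all LUT pairs.

    Assumed definitions for this sketch:
      exact       -- same INIT value and same pins
      approximate -- same INIT value but different pins
    """
    exact = count_pairs(Counter(luts))
    same_init = count_pairs(Counter(lut.init for lut in luts))
    return exact, same_init - exact


if __name__ == "__main__":
    netlist = [
        Lut("8", ("a", "b")),   # original AND gate
        Lut("8", ("a", "b")),   # its replica
        Lut("8", ("c", "d")),   # same function, different pins
        Lut("6", ("a", "b")),   # XOR gate
    ]
    print(count_matches(netlist))
```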
D. Dependent Design Factors for Trojan Hit Rate Reduction
In the proposed defense line 3, our method swaps the
replicas of submodules at runtime. We examined the impact of
the number of hot swaps on the Trojan hit rate. As depicted in
Figs. 19(a) and (b), a larger number of hot swaps used in the
design yields a lower hardware Trojan hit rate. However, as
the number of inserted hardware Trojans increases, the reduction
in Trojan hit rate achieved by hot swapping gradually decreases. This
conclusion applies to all the benchmark circuits we tested,
and remains consistent with the scenario of L-3 attacks shown
in Figs. 20(a) and (b).
Fig. 18. Comparison of hardware Trojan hit rate without and with the proposed gate replacement to thwart L-3 pattern searching attack. (a) Exact matching and (b) approximate matching. [Each panel plots the hardware Trojan hit rate for c432, c1355, c1908, and c6288.]
Fig. 19. Impact of number of hot swaps on hardware Trojan hit rate for c432 under (a) L-2 attacks, and (b) L-3 attacks. [Each panel plots the hit rate against the number of inserted hardware Trojans, 1 to 6, for no hot swaps and for 2, 4, 6, and 8 hot swaps.]
The impacts of the number of replicas n and the number of submodules m on the Trojan hit rate are shown in Figs. 21(a) and (b), respectively. Increasing n helps to reduce the hardware
Trojan hit rate, as a larger n yields more unpredictability for attackers. The reduction in the Trojan hit rate becomes
more noticeable if more hardware Trojans are injected into the
design. The impact of m on the Trojan hit rate is not as significant as the impact of n (which is also indicated by the mathematical analysis in Eq. (10)). However, the number
of submodules in the original design slightly affects the
area overhead, as shown in Table II. The overhead on the
worst-case delay varies, depending on how the submodules
are divided. In general, more submodules lead to an increase
in the delay. The results in Table II are based on DFL3
without the gating technique and with two replicas.
Fig. 20. Impact of number of hot swaps on hardware Trojan hit rate for nine benchmark circuits affected by four hardware Trojans inserted via (a) L-2 attacks, and (b) L-3 attacks. [Each panel plots the reduction on hardware Trojan hit rate (%) against the number of hot swaps, 2 to 8, for c432, c1355, c1908, c6288, s444, s1488, s13207, a23, and ethmac.]
Fig. 21. Impact of the number of (a) replicas n and (b) submodules m on hardware Trojan hit rate for c6288 affected by four hardware Trojans. [Each panel plots the hit rate against the number of inserted hardware Trojans, 1 to 6; panel (a) for n = 2, 3, 5 and panel (b) for m = 4, 6, 8, 10.]
TABLE II. IMPACT OF NUMBER OF SUB-MODULES (M) ON FPGA COST AND DELAY.
LUTs         c432    c1355   c1908   c6288    s444    s1488   s13207
m = 4        110     158     181     1118     96      259     433
m = 6        110     169     186     1139     99      264     458
m = 8        119     172     195     1152     83      269     468
m = 10       129     179     195     1151     87      273     472
Delay (ns)   c432    c1355   c1908   c6288    s444    s1488   s13207
m = 4        6.747   5.249   5.974   10.666   1.43    4.559   3.587
m = 6        6.638   5.195   6.121   10.768   1.376   4.677   3.587
m = 8        6.954   5.288   6.136   11.288   1.634   4.418   3.641
m = 10       7.034   5.344   5.92    10.755   1.635   4.472   3.589
E. Assessment on Hardware Cost, Delay and Power
The following experiments are based on the setup below:
two replicas were used for DFL2 and DFL3, each circuit was
divided into four submodules, style I was applied to DFL3,
and four hot swaps were conducted during simulation.
TABLE III. NUMBER OF FPGA LUTS UTILIZED BY DIFFERENT METHODS.
Circuits   c432   c1355   c1908   c6288   s444   s1488   s13207
Baseline   58     62      58      530     33     117     180
DFL1       58     62      59      530     33     117     181
DFL2       158    156     178     1123    67     261     429
DFL3.G     173    167     216     1157    84     296     443
DFL3.NG    110    158     181     1118    96     259     433
1) Hardware Utilization: Table III summarizes the number
of utilized LUTs for different methods. Since our DFL1 only
changes the location of designated slices, it consumes, on
average, only 0.33% more LUTs than the baseline. In
DFL2, we duplicated the design under protection once and
utilized a pseudo-random selection unit for replica selection.
The unselected replica was muted through input gating. For
small circuits, the increase in LUT utilization can
be large due to the relatively large size of the pseudo-random
selection and input gating logic. However, when the object under
protection is large, the FPGA overhead can be reduced through
optimization. The LUT overheads for the largest combinational
circuit c6288 and sequential circuit s13207 in our case studies
are 111.9% and 138.3%, respectively.
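The DFL2 mechanism described above (pseudo-random replica selection, input gating of the unselected replica, and an output multiplexer) can be illustrated with a small behavioral model. This is a Python sketch, not HDL, and it is not the authors' implementation: the 4-bit LFSR used as the selection unit, its taps and seed, and the function names are all assumptions.

```python
def lfsr_bits(seed: int = 0b1011, taps=(3, 2), width: int = 4):
    """Yield pseudo-random bits from a small Fibonacci LFSR (illustrative only)."""
    state = seed
    while True:
        bit = 0
        for t in taps:
            bit ^= (state >> t) & 1
        state = ((state << 1) | bit) & ((1 << width) - 1)
        yield state & 1


def dfl2_step(inputs, replicas, select):
    """Evaluate one cycle of the two-replica scheme.

    Inputs of unselected replicas are gated to zero, and only the output of
    the selected replica is forwarded (the output multiplexer).
    """
    outputs = []
    for index, replica in enumerate(replicas):
        gated = inputs if index == select else tuple(0 for _ in inputs)
        outputs.append(replica(*gated))
    return outputs[select]


if __name__ == "__main__":
    # Two functionally identical replicas of a toy XOR design.
    replicas = [lambda a, b: a ^ b, lambda a, b: a ^ b]
    selector = lfsr_bits()
    for cycle in range(8):
        select = next(selector)
        print(cycle, select, dfl2_step((1, 0), replicas, select))
```

In this toy model the output is the same whichever replica is selected, while the selected replica changes from cycle to cycle.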
During the hot-swapping process, our DFL3 with input
gating (i.e., DFL3.G) interleaved multiple sections of the original
design and its replicas. In addition to the primary inputs,
the input gating technique was also applied to the inputs of the
hot-swappable submodules. As a result, the LUT overheads for
c6288 and s13207 increase to 116.8% and 145%, respectively.
If we remove the input gating (i.e., the NG option), the corresponding
overheads on the utilized LUTs for the largest circuits
are reduced to 110.9% and 140.6%, respectively. Certainly,
removing the input gating will increase power consumption.
Although our DFL3 incurs LUT utilization comparable to that of
double modular redundancy (DMR), our runtime replica selection
ensures lower power consumption and provides better
unpredictability. We also examined the hardware utilization
on the large-scale circuits a23 and ethmac. Our experiments
indicated that the overheads on LUT utilization for a23 and
ethmac are 196.4% (212.4%) and 119.3% (156.3%) for DFL2
(DFL3), respectively.
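Several of the overhead figures quoted in this subsection can be recomputed from Table III, assuming the overhead is defined as (protected − baseline) / baseline. The Python sketch below does this for the DFL1 average and for the DFL2 and DFL3.NG rows of c6288 and s13207; the variable and function names are introduced here for illustration.

```python
# LUT counts copied from Table III, in the order
# c432, c1355, c1908, c6288, s444, s1488, s13207.
circuits = ["c432", "c1355", "c1908", "c6288", "s444", "s1488", "s13207"]
baseline = dict(zip(circuits, [58, 62, 58, 530, 33, 117, 180]))
dfl1 = dict(zip(circuits, [58, 62, 59, 530, 33, 117, 181]))
dfl2 = dict(zip(circuits, [158, 156, 178, 1123, 67, 261, 429]))
dfl3_ng = dict(zip(circuits, [110, 158, 181, 1118, 96, 259, 433]))


def overhead_pct(protected, base):
    """Relative increase over the baseline, in percent."""
    return 100.0 * (protected - base) / base


def mean_overhead_pct(method):
    """Average of the per-circuit overheads for one method."""
    return sum(overhead_pct(method[c], baseline[c]) for c in circuits) / len(circuits)


if __name__ == "__main__":
    print(f"DFL1 average: {mean_overhead_pct(dfl1):.2f}%")
    for name, method in (("DFL2", dfl2), ("DFL3.NG", dfl3_ng)):
        for c in ("c6288", "s13207"):
            print(f"{name} {c}: {overhead_pct(method[c], baseline[c]):.1f}%")
```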
2) Power Consumption: We synthesized the Verilog code
for the benchmark circuits in Synopsys Design Compiler.
The clock frequency was set to 100 MHz for each design. We
measured the power consumption in Design Compiler
and report it in Table IV. On average, the proposed DFL2
increases the total power by 8.86% over the
baseline. Our DFL3 with input gating provides better resilience
against advanced attacks, at the cost of 11% more total power
than the baseline. The increased power consumption is due to
the pseudo-random selection and input gating logic, as well
as the multiplexers before the final outputs.
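The two averages quoted above appear to follow from the rounded percentages printed in Table IV (107%, 104%, and so on) rather than from the raw µW values. The short Python sketch below reproduces them under that assumption; the list and function names are hypothetical.

```python
from statistics import mean

# Percentages relative to the baseline, as printed in Table IV, in the order
# c432, c1355, c1908, c6288, s444, s1488, s13207.
dfl2_pct = [107, 104, 106, 107, 108, 122, 108]
dfl3g_pct = [114, 104, 109, 107, 109, 124, 110]


def average_increase(percentages):
    """Mean increase over the 100% baseline, in percentage points."""
    return mean(p - 100 for p in percentages)


if __name__ == "__main__":
    print(f"DFL2:   {average_increase(dfl2_pct):.2f}%")
    print(f"DFL3.G: {average_increase(dfl3g_pct):.2f}%")
```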
TABLE IV. TOTAL POWER CONSUMPTION BY DIFFERENT METHODS. UNIT: µW.
Circuits   Baseline          DFL2              DFL3.G
c432       10.37 (100%)      11.05 (107%)      11.82 (114%)
c1355      48.66 (100%)      50.56 (104%)      50.56 (104%)
c1908      40.14 (100%)      42.50 (106%)      43.56 (109%)
c6288      217.41 (100%)     232.75 (107%)     233.67 (107%)
s444       20.01 (100%)      21.68 (108%)      21.85 (109%)
s1488      12.50 (100%)      15.25 (122%)      15.56 (124%)
s13207     303.33 (100%)     329.03 (108%)     335.07 (110%)
3) Worst-case Delay: We measured the worst-case delays
for different designs using the PlanAhead tool in the Xilinx ISE
14.1 design suite. As shown in Table V, the slice designation
used in the proposed DFL1 can either increase or decrease the
worst-case delay, depending on where the slice is designated. To
examine the impact of the slice designation on the worst-case
delay, we varied the number of designated slices from 1 to 3
and performed five test cases for each designation condition.
Based on our case studies, DFL1 induces a delay overhead
as large as 1.74% and 3.52% for the single-slice designation
and triple-slice designation, respectively. Given a tight timing
budget, several slice selections should be examined to find the
slice re-allocation that incurs the smallest delay overhead.
Compared to the baseline, our DFL2 increases the worst-case
delay by 4.23% to 17.19% for different
benchmark circuits. Due to the hot-swappable logic, the delay
overhead induced by DFL3 is no more than 22.02%. For the
large-scale benchmark circuits a23 (ethmac), DFL2 incurs
4.4% (6.2%) more delay than the baseline. Our DFL3 increases
the worst-case delay by 16% and 8.3% over the
baseline for a23 and ethmac, respectively.
F. Comparing FOMTD with Static Trojan Detection Method
In this section, we compare our FOMTD with static Trojan
detection methods, which are based on double or triple modular
redundancy (DMR or TMR). Even though an attacker
who performs L-2 attacks can see the utilized LUTs, it is
not guaranteed that the attacker can successfully place two
identical hardware Trojans in two design replicas. Because
Trojans inserted in the replica comparison logic cannot
be detected by DMR, the Trojan hit rate is not reduced to
zero. When we advance the attack method to L-3 attacks, our
DFL3 effectively reduces the Trojan hit rate. Together with
the runtime hot-swapping feature, the smaller number of exactly
matched LUT configurations available in the netlist of our
method helps to reduce the success rate of a Trojan
inserted by L-2 and L-3 attacks. Figure 22 shows that our
method achieves a lower Trojan hit rate than DMR. On
average, our DFL3 reduces the Trojan hit rate by 63.3% and
42.5% against L-2 and L-3 attacks, respectively. Indeed, L-3
attacks can search for identical LUT configurations, but
the number of exactly matched LUT configurations is not high
in FPGA mapping (unlike in ASIC design).
Figure 23 shows that our DFL3 can effectively reduce the
number of exact matching cases relative to DMR. This explains why
DFL3 obtains better attack resilience than DMR. Because of
the input gating, our DFL3 consumes less power than DMR.
As indicated in Fig. 24, the total power consumption for the
five benchmark circuits protected with DFL3 is less than that
for the circuits protected with DMR. On average, our method
achieves a 50% reduction in total power compared with the DMR
method.
Fig. 22. Comparison of hardware Trojan hit rate for proposed defense line 3 and DMR affected by four Trojans inserted via (a) L-2 and (b) L-3 attacks. [Each panel plots the hit rate of the proposed DFL3 and of DMR for c1355, c1908, c6288, s1488, and s13207.]
Fig. 23. Comparison of number of exact matching on LUT configuration. [The plot shows the average number of exact matches, 1 to 2.2, for the proposed DFL3 and DMR on c1355, c1908, c6288, s1488, and s13207.]
Fig. 24. Comparison of power consumption between proposed DFL3 and DMR.
Next, we applied the proposed method and adaptive TMR
(ATMR) [18] to the circuits of a practical application. We
connected the Xilinx FPGA board to a monitor through a
Video Graphics Array (VGA) cable. The function module
configured in the FPGA device was used to draw a chess
board on the screen by sending a VGA signal to the monitor.
Two hardware Trojans were inserted into the FPGA placelist by
means of L-3 attacks. Our DFL3 was applied to thwart the
L-3 attack from the untrusted FPGA software. Our method
guaranteed the correct display of the original picture shown in
Fig. 25(a). In contrast, the ATMR method did not eliminate the
effect of the two Trojans, yielding a distorted chess board, as
shown in Fig. 25(b). This is because the L-3 attack searches for
the identical LUT configurations in the ATMR design replicas
and inserts the Trojans in the identical LUTs, each belonging
to one design replica.
TABLE V. COMPARISON OF WORST-CASE DELAY. UNIT: NS.
Circuits     c432              c1355             c1908             c6288             s444              s1488             s13207
Baseline     5.659             4.677             5.241             10.181            1.43              4.105             3.328
DFL1, single-slice designation
case 1       5.713             4.677             5.500             10.128            1.314             4.051             3.328
case 2       5.603             4.677             5.458             10.358            1.322             3.997             3.274
case 3       5.549             4.677             5.322             10.18             1.43              3.947             3.274
case 4       5.711             4.622             5.257             10.013            1.322             3.979             3.274
case 5       5.657             4.679             5.278             9.905             1.376             4.049             3.328
+/- delay    -1.94%∼0.95%      -1.18%∼0          0∼0.49%           -1.65%∼1.74%      -0.81%∼0          -3.85%∼0          1.62%∼0
DFL1, triple-slice designation
case 1       5.607             4.57              5.287             10.234            1.378             4.009             3.272
case 2       5.715             4.731             5.406             9.966             1.378             3.979             3.328
case 3       5.553             4.679             5.335             9.979             1.378             4.049             3.328
case 4       5.661             4.677             5.448             10.51             1.378             4.049             3.328
case 5       5.606             4.669             5.334             9.899             1.322             4.009             3.272
+/- delay    -0.96%∼1.93%      0%∼3.52%          0∼3.05%           -3.27%∼2.7%       -4.06%∼0          -0.75%∼1%         0%∼1.17%
DFL2         6.164 (+8.92%)    5.249 (+12.23%)   5.702 (+8.80%)    10.612 (+4.23%)   1.578 (+10.35%)   4.699 (+14.47%)   3.900 (+17.19%)
DFL3         6.528 (+15.36%)   5.707 (+22.02%)   6.177 (+17.86%)   10.925 (+7.31%)   1.637 (+14.48%)   4.785 (+16.57%)   3.900 (+18.81%)
Fig. 25. FPGA output for the circuit protected with (a) proposed defense line 3, and (b) ATMR [18].
VII. CONCLUSION
Many security mechanisms for FPGA-based systems have
been investigated to protect systems from IP theft and reverse
engineering attacks on the bitstream. However, there is limited
literature available that studies the security threats originating
from untrusted FPGA CAD tools. This work fills the
gap. We demonstrate two practical attacks through the Xilinx
and Altera FPGA design suites. We further classify three
Trojan attack levels, depending on the attacker’s prior FPGA
experience and ability to manipulate the FPGA software.
To mitigate the hardware Trojans induced by malicious
FPGA tools, we propose a FOMTD method, which offers
three defense lines. Each defense line generates a different
degree of unpredictability from the malicious FPGA software
designer’s point of view. As our unpredictability is formed
after the CAD tool is delivered to FPGA users, our method
helps FPGA users thwart Trojan insertion attacks during
the FPGA configuration phase. To the best of our knowledge,
our research effort is the first work that investigates
FPGA-based moving target defense for SRAM FPGAs.
We extensively evaluated the security, hardware cost,
and performance of the proposed method. The proposed defense line 1 changes 50%
of the default LUT mapping on the FPGA device and reduces
the hardware Trojan hit rate of L-1 attacks, at the cost of
0.33% more LUT utilization compared to the baseline. When
advanced attacks occur, our defense lines maintain a low Trojan
hit rate. Defense lines 2 and 3 reduce the Trojan hit rate by
up to 40% and 91%, respectively, relative to the baseline. The gate
replacement technique in defense line 3 further reduces the Trojan
hit rate, on average, by 62% and 88% even if the attacker
searches for exact and approximate configuration matching,
respectively. The power increase due to defense line 2
and defense line 3 is 8.86% and 11%, respectively, compared
to the baseline. The delay overhead varies. According to our
case studies, the worst-case delay overhead that our defense lines
incur is 22.02%. We also compared defense line 3 to
a static Trojan detection method, DMR. Experimental results
show that our method reduces the hardware Trojan hit rate
by 63.3% and 42.5% against L-2 and L-3 attacks, respectively.
Because of the input gating and hot-swappable features in our
method, our defense line 3 consumes 50% less power than
DMR.
The limitation of the proposed defense lines is the hardware
cost and delay increase. However, considering the significant
improvement on the resilience against Trojan insertion attacks,
the overhead of our method is moderate and acceptable for
security-critical applications. In future work, we will work on
the cost minimization of the FOMTD method.
ACKNOWLEDGMENT
This work is partially supported by National Science Foundation
CAREER Award No. 1652474 and Air Force Research
Laboratory Visiting Faculty Research Program (VFRP) 2017.
REFERENCES
[1] “FPGA Market size set to exceed USD 9.98 Billion by 2022, with over 8.4from 2015 to 2022: Global Market Insights Inc..” https://goo.gl/uEmByo [Accessed 9/13/2018].
[2] “Security for volatile FPGAs.” http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-763.pdf [Accessed 9/13/2018].
[3] M. Majzoobi, F. Koushanfar, and M. Potkonjak, “FPGA-oriented Security,” Introduction to Hardware Security and Trust, pp. 1–38, Sept 2012.
[4] M. Majzoobi and F. Koushanfar, “Time-Bounded Authentication of FPGAs,” IEEE Trans. on Information Forensics and Security, vol. 6, pp. 1123–1135, Sept 2011.
[5] I. Hadzic, S. Udani, and J. M. Smith, “FPGA Viruses,” in Proc. Intl. Workshop on FPL’99, pp. 291–300, April 1999.
[6] S. Trimberger, “Trusted design in FPGAs,” in Proc. DAC’07, pp. 5–8, June 2007.
[7] S. Skorobogatov and C. Woods, “Breakthrough silicon scanning discovers backdoor in military chip,” in Proc. CHES’12, pp. 23–40, Sept 2012.
[8] P. Swierczynski, A. Moradi, D. Oswald, and C. Paar, “Physical Security Evaluation of the Bitstream Encryption Mechanism of Altera Stratix II and Stratix III FPGAs,” ACM Trans. Reconfigurable Technol. Syst., vol. 7, pp. 34:1–34:23, Dec. 2014.
[9] S. Mal-Sarkar, R. Karam, S. Narasimhan, A. Ghosh, A. Krishna, and S. Bhunia, “Design and Validation for FPGA Trust under Hardware Trojan Attacks,” IEEE Trans. on Multi-Scale Computing Syst., vol. 2, pp. 186–198, July 2016.
[10] Z. Zhang, Q. Yu, L. Njilla, and C. Kamhoua, “FPGA-oriented moving target defense against security threats from malicious FPGA tools,” in Proc. HOST’18, pp. 163–166, April 2018.
[11] R. S. Chakraborty, I. Saha, A. Palchaudhuri, and G. K. Naik, “Hardware Trojan Insertion by Direct Modification of FPGA Configuration Bitstream,” IEEE Design Test, vol. 30, pp. 45–54, April 2013.
[12] Z. Zhang, L. Njilla, C. Kamhoua, K. Kwiat, and Q. Yu, “Securing FPGA-based Obsolete Component Replacement for Legacy Systems,” in Proc. ISQED’18, pp. 401–406, March 2018.
[13] G. Bloom, B. Narahari, R. Simha, A. Namazi, and R. Levy, “FPGA SoC architecture and runtime to prevent hardware Trojans from leaking secrets,” in Proc. HOST’15, pp. 48–51, May 2015.
[14] A. L. Oliveira, “Techniques for the creation of digital watermarks in sequential circuit designs,” IEEE Trans. on Computer-Aided Design of Integrated Circuits and Syst., vol. 20, pp. 1101–1117, Sept 2001.
[15] J. Zhang, Y. Lin, Y. Lyu, and G. Qu, “A PUF-FSM Binding Scheme for FPGA IP Protection and Pay-Per-Device Licensing,” IEEE Trans. on Information Forensics and Security, vol. 10, pp. 1137–1150, June 2015.
[16] R. Karam, T. Hoque, S. Ray, M. Tehranipoor, and S. Bhunia, “MUTARCH: Architectural diversity for FPGA device and IP security,” in Proc. ASPDAC’17, pp. 611–616, Jan 2017.
[17] Y. Pino, V. Jyothi, and M. French, “Intra-die process variation aware anomaly detection in FPGAs,” in Proc. ITC’14, pp. 1–6, Oct 2014.
[18] S. Mal-Sarkar, A. Krishna, A. Ghosh, and S. Bhunia, “Hardware Trojan Attacks in FPGA Devices: Threat Analysis and Effective Countermeasures,” in Proc. GLSVLSI’14, pp. 287–292, May 2014.
[19] D. M. Shila and V. Venugopal, “Design, implementation and security analysis of Hardware Trojan Threats in FPGA,” in Proc. IEEE ICC’14, pp. 719–724, June 2014.
[20] B. Khaleghi, A. Ahari, H. Asadi, and S. Bayat-Sarmadi, “FPGA-based protection scheme against hardware Trojan horse insertion using dummy logic,” IEEE Embedded Syst. Letters, vol. 7, pp. 46–50, June 2015.
[21] S. Bhunia, M. S. Hsiao, M. Banga and S. Narasimhan, “Hardware Trojan attacks: Threat analysis and countermeasures,” Proceedings of the IEEE, vol. 102, pp. 1229–1247, Aug. 2014.
[22] R. S. Chakraborty, F. Wolff, S. Paul, C. Papachristou, and S. Bhunia, “Mero: A statistical approach for hardware Trojan detection,” in Proc. CHES’09, pp. 396–410, Aug. 2009.
[23] S. Narasimhan, D. Du, R. S. Chakraborty, S. Paul, F. G. Wolff, C. A. Papachristou, K. Roy, and S. Bhunia, “Hardware Trojan detection by multiple-parameter side-channel analysis,” IEEE Trans. on Computers, vol. 62, pp. 2183–2195, Nov 2013.
[24] R. Druyer, L. Torres, P. Benoit, P. V. Bonzom, and P. Le-Quere, “A survey on security features in modern FPGAs,” in Proc. ReCoSoC’15, pp. 1–8, June 2015.
[25] “SoC FPGA Hardware Security Requirements and Roadmap.” https://www.intel.com/content/dam/www/programmable/us/en/pdfs/education/events/northamerica/isdf/SoC-FPGA-Hardware-Security.pdf [Accessed 9/13/2018].
[26] M. Tehranipoor and F. Koushanfar, “A survey of hardware Trojan taxonomy and detection,” IEEE Design Test of Computers, vol. 27, pp. 10–25, Jan 2010.
[27] J. A. Roy, F. Koushanfar, and I. L. Markov, “Extended abstract: Circuit cad tools as a security threat,” in Proc. HOST’08, pp. 65–66, June 2008.
[28] S. M. Trimberger and J. J. Moore, “FPGA Security: Motivations, Features, and Applications,” Proceedings of the IEEE, vol. 102, pp. 1248–1265, Aug 2014.
[29] D. Last, D. Myers, M. Heffernan, M. Caiazzo, and Captain N. Paltzer, “Command and control of proactive defense,” J. of Cyber Security and Information Syst., vol. 4, no. 1, pp. 8–13, 2015.
[30] “Moving Target Defense.” https://www.dhs.gov/science-and-technology/csd-mtd [Accessed 9/13/2018].
[31] R. Zhuang, S. A. Deloach and X. Ou, “Towards a theory of moving target defense,” in Proc. the First ACM Workshop on Moving Target Defense, pp. 31–40, 2014.
Zhiming Zhang is currently pursuing the Ph.D. degree with the Department of Electrical and Computer Engineering, University of New Hampshire, Durham, New Hampshire, USA. His current research focuses on hardware security, which includes design obfuscation, side channel analysis of encryption algorithms, fault attack analysis, and emerging technologies with emphasis on hardware security and trust.
Laurent Njilla (M’05) received his Ph.D. from the Electrical and Computer Engineering Department at Florida International University, Miami, his M.S. from the University of Central Florida, Orlando, USA, and his B.S. from the Department of Computer Science, University of Yaounde, Yaounde, Cameroon. He is currently a Research Engineer at the Air Force Research Laboratory, Department of Defense. His research interests and expertise include cyber security, game theory, hardware and network security, blockchain technology, cyber threat information and advanced computer networking.
Charles A. Kamhoua (S’10-M’12-SM’14) is a researcher at the Network Security Branch of the U.S. Army Research Laboratory (ARL) in Adelphi, MD, where he is responsible for conducting and directing basic research in the area of game theory applied to cyber security. Prior to joining the Army Research Laboratory, he was a researcher at the U.S. Air Force Research Laboratory (AFRL), Rome, New York for 6 years and an educator in different academic institutions for more than 10 years. He has held visiting research positions at the University of Oxford and Harvard University. He has co-authored more than 150 peer-reviewed journal and conference papers. He has been recognized for his scholarship and leadership with numerous prestigious awards. He received a B.S. in electronics from the University of Douala (ENSET), Cameroon, in 1999, and a Ph.D. in Electrical Engineering from FIU in 2011. He is currently an advisor for the National Research Council postdoc program, a member of the FIU alumni association and ACM, and a senior member of IEEE.
Qiaoyan Yu (S’03-M’11-SM’17) received the B.S. degree from Xidian University, Xian, China in 2002, the M.S. degree from Zhejiang University, Hangzhou, China in 2005, and the Ph.D. degree in Electrical Engineering from University of Rochester, Rochester, New York, USA in 2011. Dr. Yu is currently an Associate Professor with the Department of Electrical and Computer Engineering, University of New Hampshire, Durham, New Hampshire, USA. Her research interests include hardware security and trust, embedded system security, cyber-physical systems, error control for networks-on-chip, and fault tolerance for VLSI circuits and systems. Dr. Yu is the recipient of a National Science Foundation CAREER award in 2017. She has served on the technical program committees of HOST, DAC, FDTC, Asian HOST, ISVLSI, DFT, ASP-DAC, GLSVLSI, and ISCAS. She is a member of the editorial boards of Integration, the VLSI Journal, Microelectronics Journal, and Journal of Circuits, Systems, and Computers.
sources/149/Zhang et al. - 2015 - A PUF-FSM Binding Scheme for FPGA IP Protection an.pdf
IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, VOL. 10, NO. 6, JUNE 2015 1137
A PUF-FSM Binding Scheme for FPGA IP Protection and Pay-Per-Device Licensing
Jiliang Zhang, Yaping Lin, Yongqiang Lyu, and Gang Qu, Senior Member, IEEE
Abstract— With its reprogrammability, low design cost, and increasing capacity, the field-programmable gate array (FPGA) has become a popular design platform and a target for intellectual property (IP) infringement. Currently available IP protection solutions are usually limited to protecting single FPGA configurations and require permanent secret key storage in the FPGA. In addition, they cannot provide a commercially popular pay-per-device licensing solution. In this paper, we propose a novel IP protection mechanism that restricts an IP’s execution to specific FPGA devices in order to efficiently protect IPs from being cloned, copied, or used with unauthorized integration. This mechanism can also enforce pay-per-device licensing, which enables system developers to purchase IPs from the core vendors at a low price based on usage instead of paying expensive unlimited IP license fees. In our proposed binding-based mechanism, FPGA vendors embed a physical unclonable function (PUF) customized for FPGAs into each enrolled FPGA device; IP vendors embed augmented finite-state machines (FSM) into the original IPs such that the FSM can be activated by the PUF responses from the FPGA device. We propose protocols to lock and unlock FPGA IPs, demonstrate how a PUF can be embedded onto FPGA devices, and analyze the security vulnerabilities of our PUF-FSM binding method. We implement a 128-bit delay-based PUF on 28-nm FPGAs with only 258 RAM-lookup tables and 256 flip-flops. The PUF responses are unique and reliable against environment changes. We also synthesize a variety of FSM benchmark circuits. On large benchmarks, the average timing overhead is 0.64% and the power overhead is 0.01%.
Index Terms— Binding, field-programmable gate array (FPGA), finite state machine (FSM), hardware metering, intellectual property (IP) protection, physical unclonable functions (PUFs).
I. INTRODUCTION
A. Motivations
FIELD-PROGRAMMABLE gate arrays (FPGAs) are semiconductor devices that can be reprogrammed by end-users to implement any digital system. Compared with implementation in application-specific integrated circuits (ASICs), FPGA design has the advantages of shorter time-to-market, lower non-recurring engineering costs and higher flexibility. These have made the FPGA a popular design platform for many applications such as automotive electronics, consumer electronics and aerospace equipment. In this FPGA-based design platform, third-party intellectual properties (IPs) are widely used due to both technical merits (e.g., the IPs’ proven functionality, compatibility, and performance) and non-technical concerns (e.g., time-to-market, cost, and patent enforcement). However, FPGA IPs are subject to severe piracy attacks, and the current licensing schemes are not flexible enough to precisely control authorized usage.
Manuscript received January 3, 2014; revised May 30, 2014 and November 29, 2014; accepted January 28, 2015. Date of publication February 5, 2015; date of current version April 13, 2015. This work was supported in part by the National Natural Science Foundation of China under Grant 61173038 and Grant 61228204 and in part by a scholarship from China Scholarship Council under Grant 201306130042. The associate editor coordinating the review of this manuscript and approving it for publication was Prof. Farinaz Koushanfar.
J. Zhang and Y. Lin are with the College of Information Science and Engineering, Hunan University, Changsha 410082, China (e-mail: [email protected]; [email protected]).
Y. Lyu is with the Research Institute of Information Technology, Tsinghua University, Beijing 100084, China (e-mail: [email protected]).
G. Qu is with the Department of Electrical and Computer Engineering, University of Maryland, College Park, MD 20742 USA (e-mail: [email protected]).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TIFS.2015.2400413
Firstly, from the perspective of the attack, piracy attacks such as cloning, copying, misuse and unauthorized integration are considered to be the most common security vulnerability of volatile FPGAs [1]. Un-configured FPGA devices are off-the-shelf products, and the configuration bitstreams can be obtained by eavesdropping or directly from the volatile SRAM FPGAs [1]. Such piracy not only reduces profits and market share, but also damages brand reputation and can even lead to severe early product failures and safety hazards [1], [2]. Furthermore, this is not limited to high-value single FPGA designs; third-party FPGA intellectual property (IP) cores are also vulnerable to those attacks.
Secondly, from the perspective of licensing, it is often vital to ensure that the configuration bitstreams can only be used on the licensed FPGA devices. In such a case, IP core vendors would prefer to sell their IP products through pay-per-device licensing rather than through up-front license fees that allow users to configure any FPGA device. In order to adapt the IP core business model to low/medium-volume FPGA applications [3], effective pay-per-device licensing techniques are urgently needed.
Mainstream FPGA vendors have been putting more and more effort into protecting their IPs from piracy attacks and improving licensing schemes to activate and protect the IP-based commercial flow. However, the state-of-the-art techniques still have some drawbacks. In this paper, we consider hardware IPs (HWIPs) as the soft-core (synthesized from HDL) hardware modules stored in the FPGA configuration bitstreams [11]. Our goal is to develop techniques to solve the piracy and licensing challenges. We propose to solve these problems with a binding mechanism that restricts the execution of the protected IPs to the authorized FPGA devices only.
1556-6013 © 2015 IEEE. Translations and content mining are permitted for academic research only. Personal use is also permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
B. Limitations of Prior Art
FPGA HWIP protection techniques have been well studied in academia [2]–[5] and widely used in industry [6]–[9]. However, all the existing HWIP protection techniques are based on encryption and have the following main drawbacks:
1) The commercially available encryption-based techniques can only protect the single large FPGA configurations.
2) The commercially available encryption-based techniques cannot provide a solution to the commercially popular pay-per-device licensing requirement for both single large configurations and individual IP cores.
3) The current encryption-based FPGA IP protection methods introduce security vulnerabilities (e.g., physical attacks and side channel attacks) for permanent key storage and management.
C. Our Contributions
In this paper, we propose a binding scheme that binds the HWIPs to specific FPGA devices via the interaction between physical unclonable functions (PUF) built on the FPGA devices and the FSMs in the HWIPs in order to address the limitations of existing FPGA HWIP protection techniques. We first reported this concept in [37]. In this article, 1) we provide a concrete construction and implementation of a delay-based PUF on 28-nm FPGAs as a reference design of our binding scheme; 2) we implement and verify the proposed binding scheme by synthesizing MCNC’91 circuits and large FSM benchmarks from GenFSM on FPGAs; 3) we elaborate the details of the proposed binding scheme with an illustrative example and an in-depth discussion of design flow, system integration, and security vulnerabilities. To the best of our knowledge, this is the first non-encryption-based FPGA HWIP binding method. Compared with the traditional encryption-based HWIP protection methods, our approach has the following advantages:
1) It can be used to protect both single FPGA configurations and third-party FPGA IP cores. Currently available encryption-based commercial methods can only protect the former, but not the latter.
2) It supports the pay-per-device licensing mechanism. The FPGA configuration bitstream can only be used to configure specific FPGA devices, giving IP vendors control of their IPs and allowing product developers to pay licensing fee only for the FPGA devices they are using.
3) It does not need permanent storage for secret keys in the FPGA. In our binding scheme, the secret PUF response can be ephemeral and immediately cleared after use. Therefore, it eliminates the security vulnerabilities of the permanent key management and exchange.
4) It has low hardware overhead. We implement the proposed method on Virtex-5 FPGA devices and find that the 128-bit delay-based PUF needs about 256 slices [41] and the modified FSM only introduces 0.64% timing overhead and 0.01% power overhead on average for ten large FSM designs. As a comparison, previous FPGA IP protection schemes consume 6776 LUTs for a SHA-1 core and an ECDH core [4], [5].
D. Outline of the Paper
The rest of this paper is organized as follows. Related work is surveyed in Section II. The necessary background information on PUF, FSM, and parties involved in HWIP binding is presented in Section III. The proposed binding method and its working mechanism are elaborated in Section IV. An IP locking mechanism and a reference implementation of PUF for the proposed binding method are then given in Section V and Section VI, respectively. Potential security threats and countermeasures are analyzed in Section VII. The detailed experimental results and analysis are reported in Section VIII. Finally, we conclude in Section IX.
II. RELATED WORK
A. FPGA HWIP Protection Techniques
Many intellectual property protection techniques for FPGAs have been proposed in academia and industry.
In commercial tools, bitstream encryption [6]–[9] is the most popular intellectual property protection method against direct cloning of single large FPGA configurations for high-end FPGA devices. Some recent FPGAs employ an advanced encryption standard (AES) core or a triple data encryption standard (3DES) core to support the encryption of the FPGA configuration bitstreams; some FPGAs employ a keyed-hash message authentication code (HMAC) core to enable bitstream authentication [8]. They all need an on-chip cryptographic decryption module and permanent secure key storage. Unfortunately, these solutions come with some practical limitations: they are not appropriate for resource-limited environments and, more importantly, it is well known that such a permanent key storage scheme allows attackers to attack at any time.
In the academic domain, Güneysu et al. [4] proposed a protection scheme for FPGA bitstreams, which uses a secondary secure key register and authenticated bitstream encryption and requires minor modification to the current FPGA technology. They employed a public-key-based protocol between the IP providers and the FPGA-based system developers, and a trusted third party (TTP) is used to handle key exchange and installation in the symmetric-key decryption engines. This solution is only suitable for the protection of single large FPGA configurations, and the protection of individual HWIP cores remains a challenging problem. Drimer et al. [5] presented an encryption-based method to protect multiple IPs, and Kepa et al. [13] proposed a method based on a secure reconfigurable controller to support license enforcement within the partial reconfiguration flow. More recently, Maes et al. [2] introduced a valuable “pay-per-use” licensing scheme to protect multiple FPGA IPs through the self-reconfiguring capabilities of modern FPGAs and a TTP for metering the service.
As we can see from the above, all commercial and academic FPGA configuration bitstream protection methods are encryption-based; they have three shortcomings: 1) the commercial methods are limited to the protection of single large FPGA configurations; 2) they cannot support pay-per-device licensing; 3) the previous encryption-based HWIP protection methods require permanent key storage and on-chip cryptographic decryption modules to decrypt the bitstream, which introduces some security vulnerabilities and high overhead. Our approach overcomes these limitations.
B. Metering ASIC Intellectual Properties
A number of watermarking methods for ASIC/FPGA intellectual property protection have been proposed [32]–[35]. However, watermarking techniques are passive and are only used to identify the intellectual property. In 2001, Koushanfar and Qu [38] proposed the first hardware metering method, which enables the design house to gain post-fabrication control through passive or active control of the number of produced ICs. Alkabani et al. [24] proposed an anti-overbuilding mechanism that exploits the functional description of the design and unique, unclonable IC identifiers. The locks can be embedded by modifying the hardware computational model, such as an FSM. They also presented another FSM manipulation method [25] that introduces only a few new states. These solutions are only suitable for protecting single ASIC chips. Later, they extended their scheme to actively control multiple IP cores [26] for ASIC chips. Recently, Koushanfar [27] further improved the locking structure in [24] using a multi-point function. Meanwhile, Roy et al. [20] presented another kind of metering method, based on cryptography, but their solution has a very high overhead. These metering mechanisms are designed to prevent overbuilding of ASIC devices; they are not appropriate for pay-per-device licensing of FPGA designs.
In this paper, our proposed FPGA HWIP binding technique not only addresses the main drawbacks of the traditional FPGA HWIP protection methods, but also supports a pay-per-device licensing scheme. This provides technical support for the product developers (system developers) to pay IP licensing fees only for the FPGA devices they are using. It also enables the IP vendors to freely distribute their IPs because they can ensure that the distributed IPs run only on specific FPGAs rather than on all FPGAs. This binding scheme brings a remarkable advantage for the IP-based business model: the IP owners can take full control over the use of their IP cores and protect them from unlicensed use; the FPGA-based product developers who could not afford the expensive unlimited IP license are now also able to obtain a number of single instances of the required IP cores at a much lower cost.
III. PRELIMINARIES
In this section, we introduce the general terms and concepts used throughout the paper. More specific definitions will be given as necessary.
A. Physical Unclonable Function (PUF)
A PUF provides a unique chip-dependent mapping from a set of digital inputs (challenges) to a set of digital outputs (responses) based on the unclonable properties of the underlying physical device. Although it is difficult to come up with a uniform definition for all types of PUFs, they should all satisfy the following properties [39]:
• Persistent and unpredictable. The response ($R_i$) to a challenge ($C_i$) is random and unpredictable, but should remain the same for the same challenge over multiple observations.
• Unclonable. It is impossible to obtain $R_i$ from $C_i$ without the physical presence of the PUF. In other words, given a PUF, it is infeasible for an adversary to build another PUF that provides the same responses to every possible challenge. This is assumed to be true due to the uncontrollable technology variations.
• Tamper evident. Invasive attacks to PUFs will destroy the PUFs and thus can be detected easily.
Because of those properties, PUF has become an efficient mechanism to address security and trust problems in many applications, such as binding software IPs to specific FPGAs [11], hardware/software authentication [16], FPGA IP protection [18], [43], anti-overbuilding [24]–[27] and resisting FPGA replay attacks [36].
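The persistence and uniqueness properties listed above can be illustrated with a toy software model. This is only a sketch under stated assumptions: a hidden per-device random value stands in for manufacturing variation, and a SHA-256 hash stands in for the physical challenge-response behavior. A real PUF is a physical, noisy circuit, and the names `ToyPUF` and `respond` are hypothetical, not taken from the paper.

```python
import hashlib
import secrets


class ToyPUF:
    """Software stand-in for a PUF (illustrative assumption, not a real PUF)."""

    def __init__(self):
        # Hidden per-device randomness plays the role of process variation.
        self._variation = secrets.token_bytes(32)

    def respond(self, challenge, n_bits=128):
        """Return a response bit string of n_bits (at most 256) for a challenge."""
        digest = hashlib.sha256(self._variation + challenge).digest()
        bits = "".join(f"{byte:08b}" for byte in digest)
        return bits[:n_bits]


device_a = ToyPUF()
device_b = ToyPUF()
challenge = b"\x01\x02\x03\x04"

# Persistent: the same device answers the same challenge identically.
same_device_repeatable = device_a.respond(challenge) == device_a.respond(challenge)
# Unique: another device answers differently (with overwhelming probability).
different_devices_differ = device_a.respond(challenge) != device_b.respond(challenge)
```

The model deliberately omits noise and tamper evidence, which only exist in the physical device.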
B. Finite State Machine (FSM)
FSM is a popular model for sequential systems. In this paper, we employ FSMs to bind HWIPs to the FPGAs with PUFs to restrict the HWIP’s usage so that it can only work on the enrolled FPGA devices. Similar to the FSM-based works such as [15] and [24]–[27], the method proposed in this paper is not applicable to some high-speed designs that do not have FSMs. These high-speed designs are normally small dedicated modules such as digital filters, channel equalizers, address decoders and arithmetic logic units. Fortunately, for the HWIPs in industrial designs that we target to protect, the sequential components or functions, and therefore FSMs, are ubiquitous [15].
C. Parties Involved in HWIP Binding
In order to facilitate our study, we consider the following parties involved in the binding mechanism and their respective roles:
• FPGA vendor (FV): FV designs and manufactures un-configured FPGA devices and can securely deploy PUF in the fabric of these devices.
• System developer (SD): SD integrates the third-party IPs along with their own designs to create a commercial product on an FPGA chip. The product will be synthesized into a configuration bitstream file for the FPGA chip to download using the computer aided design (CAD) tools provided by the FV.
• IP core vendor (CV): CV creates innovative logic circuits (HWIP cores) and sells them to SDs for profits. CV needs an effective technique to keep the full control over the use of the HWIP cores.
• End user (EU): EU purchases the FPGA products developed by the SD. The SD expects that EUs cannot ‘clone’ the products by copying the FPGA configuration bitstream file and run on unauthorized FPGA devices.
Fig. 1. Design flow of modifying hardware IP.
Our goal is to design a new binding mechanism so the SDs and CVs can protect their FPGA designs or IPs from piracy without introducing much inconvenience and large performance degradation to the EUs and FVs.
IV. THE PROPOSED BINDING SCHEME
Traditionally, HWIPs are written without any concern for binding to a specific FPGA. The configuration bitstream can be used to configure any FPGA device of the same type. Given a HWIP, our goal is to modify the original FSM of the HWIP to produce an augmented FSM which is functionally equivalent to the former. The modified FSM interacts with the intrinsic PUF located in the specific FPGA hardware, and it behaves exactly like the original HWIP as long as the challenges issued by the HWIP obtain correct responses through the PUF. This means that only the FPGA chips authorized by the CVs can guarantee the correct functionality. Meanwhile, as long as there are PUFs embedded in the FPGA chips from the FV, and the CV modifies their IP designs to support PUF, no more changes are needed at any party when a new HWIP is developed and needs to be deployed to a new FPGA device. The details of the binding scheme are depicted in Figs. 1, 2, and 3 and described as follows.
A. Design Flow
The design flow of modifying the HWIP together with a standard FPGA design methodology is shown in Fig. 1. First, the CV uses the high-level design description to set up the behavioral model of the FSM. Next, the original FSM is modified so that the added FSM structure (such as additional states and transitions) and the original FSM form a new augmented FSM. The standard phases of the FPGA design methodology (e.g., design synthesis, placement and routing) can then be carried out. Finally, the HWIP configuration bitstream file can be downloaded into an FPGA device to run. Although there is an inevitable verification/testing overhead due to the added features in the augmented FSM, the entire traditional design methodology is maintained, so the introduced design overhead can be controlled.
TABLE I
SYMBOLS AND ACRONYMS USED IN THE PROTOCOL
B. Description of the Protocol
For reader’s convenience, we list the symbols and acronyms used in the protocol, as shown in Table I. The proposed PUF-FSM binding protocol is described as follows.
1) FPGA Device Enrollment: The device enrollment protocol is shown in Fig. 2(a). To enable the proposed scheme, the FPGA vendor (FV) initially tests the PUF of every FPGA chip to obtain its random challenge-response pairs (CRPs) before selling it. The PUF challenges are stored in the non-volatile on-chip memory, which is automatically configured on $F^{i}_{PUF}$ immediately when the device is powered on. Note that the PUF challenges can be public and do not need to be encrypted or hidden because of the uniqueness and unpredictability of the PUF responses. In addition, the FV can also generate the $ID(F^{i}_{PUF})$, which is a public unique serial number burned in at manufacturing time (e.g., Xilinx Device DNA [19]). If the core vendor (CV) or system developer (SD) wants to buy the FPGA embedded with the PUF, $F^{i}_{PUF}$, to start the HWIP/system development, the FV will respond with the $ID(F^{i}_{PUF})$ from its database and then sell the FPGA device $F^{i}_{PUF}$ to the CV/SD.
2) Hardware IP Core Enrollment and Distribution: As Fig. 2(b) shows, before the system developer (SD) develops its product, the core vendor (CV) creates the IP with $ID(HWIP_j)$. The CV then synthesizes the $HWIP_j$ with the PUF-binding FSM into the bitstream to generate the new version $b\{HWIP_j\}_{locked}$. This process can be expressed as $b\{HWIP_j\}_{locked} = \mathrm{Lock}\; b\{HWIP_j\}$. The CV stores $ID(HWIP_j)$ and $b\{HWIP_j\}_{locked}$ in its database, and releases $ID(HWIP_j)$ for sale. When an SD needs the $HWIP_j$ to develop FPGA-based products, it requests to buy it by sending the $ID(HWIP_j)$ to the CV. The CV then looks up the database for $ID(HWIP_j)$ and sends the corresponding $b\{HWIP_j\}_{locked}$, the locked HWIP bitstream, to the SD.
3) Hardware IP Core Licensing: As Fig. 2(c) shows, when the system developer (SD) needs to unlock the purchased $b\{HWIP_j\}_{locked}$ in its FPGA-based products, it sends $ID(F^{i}_{PUF})$ and $ID(HWIP_j)$ to the core vendor (CV). The CV will send $ID(F^{i}_{PUF})$ to the FPGA vendor (FV) to obtain the corresponding CRPs and then calculate licenses based on the CRPs and the modified FSM. The computed licenses can be public. Finally, the licenses are sent to the SD to unlock $b\{HWIP_j\}$. This process can be expressed as $b\{HWIP_j\}_{unlocked} = b\{HWIP_j\}_{locked}(Licenses)$. Note that the CRPs should be securely transferred from the FV to the CV or SD.
Fig. 2. Hardware IP core binding protocol. (a) FV generates $ID(F^{i}_{PUF})$ and CRPs and then sells devices to SD and CV. (b) CV generates the locked $HWIP_j$ and distributes it to SD. (c) CV licenses $HWIP_j$ to SD. (d) SD licenses $Product_j$ to EU.
Fig. 3. Configuration process of an FPGA-based product containing multiple locked hardware IP cores.
4) Product Licensing: As Fig. 2(d) shows, if an end user (EU) would like to buy the products developed by the system developer (SD) to run on a specific FPGA device $F^{i}_{PUF}$, it should send $ID(F^{i}_{PUF})$ and $ID(Product_j)$ to the SD. The SD will send $ID(F^{i}_{PUF})$ to the FV to obtain the corresponding CRPs and then calculate licenses based on the FPGA-PUF responses and the modified FSM. Note that the licenses can be public. Finally, the licenses are sent to the EUs to unlock $b\{HWIP_j\}$. This process can be denoted as $b\{Product_j\}_{unlocked} = b\{Product_j\}_{locked}(Licenses)$.
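The enrollment and licensing steps of the protocol can be sketched in a few lines of Python. This is a simplified illustration, not the paper's implementation: the whole added FSM is collapsed into a single hypothetical "unlock word", the license is modeled as that word XOR'd with the PUF response, and fixed integers stand in for measured PUF responses. The class and method names (`FpgaVendor`, `CoreVendor`, `issue_license`, `runs_on`) are invented for this example.

```python
class FpgaVendor:
    """FV: enrolls devices and keeps ID -> PUF response records."""

    def __init__(self):
        self._crp_db = {}

    def enroll(self, device_id, measured_response):
        # The FV measures the PUF before sale and stores the result.
        self._crp_db[device_id] = measured_response

    def lookup(self, device_id):
        # In the protocol this transfer to the CV/SD must be secured.
        return self._crp_db[device_id]


class CoreVendor:
    """CV: owns a locked HWIP and issues per-device licenses."""

    def __init__(self, unlock_word):
        # Hypothetical: the input word that walks the added FSM to the reset state.
        self._unlock_word = unlock_word

    def issue_license(self, fv, device_id):
        # The license depends on the device's PUF response, so it can be public.
        return self._unlock_word ^ fv.lookup(device_id)

    def runs_on(self, puf_response, license_word):
        # Models the locked bitstream combining the on-chip response and the license.
        return (puf_response ^ license_word) == self._unlock_word


fv = FpgaVendor()
fv.enroll("FPGA-1", 0x1234)  # toy response values (assumptions)
fv.enroll("FPGA-2", 0x5678)

cv = CoreVendor(unlock_word=0xBEEF)
license_1 = cv.issue_license(fv, "FPGA-1")

licensed_ok = cv.runs_on(0x1234, license_1)  # authorized device
clone_ok = cv.runs_on(0x5678, license_1)     # same license copied to another device
```

In this model the license issued for FPGA-1 is useless on FPGA-2, which mirrors the pay-per-device property the protocol aims for.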
C. System Integration
The proposed binding scheme can support multiple HWIP cores integrated on a single FPGA design. To develop an FPGA-based product, the system developer (SD) obtains an FPGA device with the hard core PUF inside from the FPGA vendor (FV) [following Fig. 2(a)], the required third-party HWIP cores from the core vendors (CVs) [following Fig. 2(b)] and the required licenses for these cores from the CVs [following Fig. 2(c)]. For example, when there are two different HWIP cores from two different CVs, the SD can integrate them into the same FPGA device by putting the PUF challenges from the FV and the authorized licenses from the CVs in a nonvolatile memory (NVM) next to the FPGA device; our IP protection scheme will then work as shown in Fig. 3. When the system is powered on, the activation process checks the PUF-based licenses of the IPs and loads the unlocked IP cores into the reconfigurable FPGA fabric. If a purchased HWIP is copied to an unauthorized FPGA device, even the same license cannot unlock the HWIP because the unauthorized FPGA cannot generate the same PUF responses as the authorized one.
V. HWIP LOCKING MECHANISM
A. Locking the Hardware IP
In this section, we describe a prototype design of the lock mechanism proposed in the binding scheme. The lock is achieved by exploiting the PUF’s unique properties (unclonable, persistent and unpredictable). As Fig. 4 shows, we use the PUF response to control the transitions of the FSM in the HWIP. The error-corrected PUF response is used to uniquely determine the transitions of the state transition graph (STG) of the HWIP (the IP behavior); without the correct PUF response, the STG would not perform correctly. Therefore, the circuit is kept locked until the correct license (formed by the correct PUF response) unlocks it. It should be noted that the computed licenses can also be public and that different PUF responses can be used to calculate different licenses. Additionally, the FPGA vendor often computes an error correcting code (ECC) to correct any bit flips in the PUF output (response), because the PUF output is hard to keep absolutely stable due to noise and other sources of physical uncertainty.
Fig. 4. PUF response is used to uniquely control the transitions of the STG.
Fig. 5. The binding FSM structure. The original states are shown in dark and the added states are shown in white on the STG of the added FSM.
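The text does not say which error correcting code is used to stabilize the PUF response. As a stand-in, the sketch below uses repeated measurement with bitwise majority voting, which is an assumption chosen for brevity rather than the paper's ECC. The function name `majority_vote` and the sample readings are hypothetical.

```python
from collections import Counter


def majority_vote(readings):
    """Bitwise majority over an odd number of equal-length bit strings."""
    if len(readings) % 2 == 0:
        raise ValueError("use an odd number of readings to avoid ties")
    width = len(readings[0])
    if any(len(reading) != width for reading in readings):
        raise ValueError("all readings must have the same length")
    voted = []
    # zip(*readings) yields one tuple per bit position across all readings.
    for column in zip(*readings):
        voted.append(Counter(column).most_common(1)[0][0])
    return "".join(voted)


# Five noisy reads of the same 4-bit response; two reads contain one flipped bit.
noisy_readings = ["0100", "0110", "0100", "1100", "0100"]
stable_response = majority_vote(noisy_readings)
```

Majority voting only helps when each bit is wrong in fewer than half of the readings; a practical design would more likely rely on a proper code with helper data.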
As an example, consider the original STG with 6 states, S0 ∼ S5, in Fig. 5. The transition from state S0 to S1 is excited by a specific input combination. S0 is called the reset state of the original FSM and S1 is the next state of this transition. Now we introduce the method to generate a new FSM with additional structure to bind with the PUF. We add M (M is an even number) layers of states to form the added FSM. Any even-number layer consists of m states and any odd-number layer only has one state. We define a fixed power-up state Sr for the binding FSM. The first transition step starts from Sr with m transitional edges to each of the other m states. Then the second transition step goes from each of these m states to the next layer (odd layer). After the M-layer (M transition steps) transitions, the state transits to S0, which is the unlocked state (the reset state).
Assuming the input of the original STG is k bits long, we define a k-bit input sequence $\{b_1 b_2 \cdots b_k\}^{t}_{i}$ ($t \in \mathbb{N}$, $1 \le t \le 2^k$, $k \in \mathbb{N}$; $i \in \mathbb{N}$, $1 \le i \le M$), where i denotes the i-th transitional step and t denotes the specific transition in the i-th transitional step. The k-bit input is a function of an L-bit PUF response, which determines the transition path through the transitional steps. The odd-number transitional steps are determined by part of the PUF response bits, and the even-number transitions of the FSM are determined by the license and the remaining PUF response bits.
Fig. 6. An example of the lock mechanism and the generation of the licenses for sequential circuit dk16.
For an added FSM structure of M layers, every transitional step needs log2(m) bits of PUF output. Hence, the length of the required PUF bits can be formulated as Eq. (1) shows.

$$L_{PUF} = M \times \log_2(m) \qquad (1)$$

In addition, a log2(m)-bit license should also be provided in each even-layer transitional step. The length of the license bits can be computed by:

$$L_{license} = \frac{M}{2} \times \log_2(m) \qquad (2)$$

Note that $L_{PUF}$ and $L_{license}$ should be sufficiently long in practice in order to guarantee the security of this model.
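As a quick check of Eqs. (1) and (2), the following minimal Python sketch computes the two bit lengths. The function name `required_bits` is hypothetical (it does not come from the paper), and the sketch assumes that m is a power of two so that log2(m) is an integer.

```python
def required_bits(M: int, m: int) -> tuple[int, int]:
    """Return (L_PUF, L_license) for an added FSM with M layers and parameter m."""
    if M <= 0 or M % 2 != 0:
        raise ValueError("M must be a positive even number of layers")
    if m < 2 or m & (m - 1) != 0:
        raise ValueError("this sketch assumes m is a power of two")
    bits_per_step = m.bit_length() - 1      # log2(m) when m is a power of two
    l_puf = M * bits_per_step               # Eq. (1)
    l_license = (M // 2) * bits_per_step    # Eq. (2)
    return l_puf, l_license


print(required_bits(4, 4))
print(required_bits(128, 4))
```

Under these assumptions, one parameter choice that yields a 256-bit PUF response and a 128-bit license is m = 4 with M = 128.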
To illustrate the key idea of our approach, Fig. 6 gives an example of the implementation of the lock and the generation of the licenses for the benchmark circuit dk16. Consider a two-layer added FSM structure composed of two transition steps, and assume k = 2; each transitional step then consists of 4 ($2^2 = 4$) edges forming the 4 transition paths. In step 1, we use $\{b_1 b_2\}^{t}_{i}$ (t = 1, 2, 3, 4), which is designed to take 4 different values, to distinguish the 4 edges. The value of $\{b_1 b_2\}^{t}_{i}$ is decided by a 2-bit PUF output value: once the design begins to run, it moves from Sr to one of the four connected states depending on the first 2 bits of the PUF output. Since the first 2-bit PUF output value is “01”, which equals the designed $\{01\}^{2}_{1}$, the 1st step transitions from Sr to S7. In the 2nd step, the design can only transition from S7 to S10 when the two input bits equal $\{10\}^{2}_{2}$. To enable this transition, the second 2-bit PUF output “00” should be XOR’d with a 2-bit key that produces the result “10” (in this case the key should be “10”). With the calculated license and the PUF response, the FSM can transit from state Sr to s_1 (the original reset state of dk16).
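The XOR step in the dk16 example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `split_chunks` and `compute_license` are hypothetical, the PUF response is modeled as a bit string, and the input expected by each even step is assumed to be known to the party computing the license.

```python
def split_chunks(response: str, width: int) -> list[int]:
    """Split a bit string into integers of `width` bits each."""
    return [int(response[i:i + width], 2) for i in range(0, len(response), width)]


def compute_license(puf_response: str, required_inputs: list[int], width: int) -> list[int]:
    """For each even step, license chunk = PUF chunk XOR the input the FSM edge expects."""
    chunks = split_chunks(puf_response, width)
    even_chunks = chunks[1::2]   # steps 2, 4, ... (0-based positions 1, 3, ...)
    return [p ^ r for p, r in zip(even_chunks, required_inputs)]


# dk16 example from the text: PUF outputs "01" then "00"; step 2 expects input "10".
license_chunks = compute_license("0100", [0b10], width=2)
print([format(c, "02b") for c in license_chunks])
```

For the values in the text (second PUF chunk “00”, expected input “10”), the sketch reproduces the key “10”.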
B. Unlocking the Hardware IP
The PUF outputs $L_{PUF}$ bits to determine the transitions of the binding FSM. Now, an attacker with no information about the transition table of the FSM cannot find the correct
ZHANG et al.: PUF-FSM BINDING SCHEME FOR FPGA IP PROTECTION AND PAY-PER-DEVICE LICENSING 1143
Fig. 7. The structure of the delay-based PUF design.
sequence of the primary input combinations to arrive at the reset state S0. Hence, the CV is the only one who can compute the license to unlock the $b\{HWIP_j\}_{locked}$.
The unlocking process is as follows. The FV provides enrolled PUF-embedded FPGAs $F^{i}_{PUF}$; each $F^{i}_{PUF}$ provides a specific set of PUF challenges. If a $b\{HWIP_j\}_{locked}$ is illegally over-used, copied or cloned by an SD, it is locked into the fixed power-up state, Sr, when the $F^{i}_{PUF}$ powers up, because it does not have the correct PUF responses for the challenges. Hence, in order to unlock the design, the CV must use the $L_{PUF}$-bit PUF responses received from the FV, who tests the PUF responses on the provided PUF challenges, and then calculate the correct $L_{license}$-bit license for the SD. The SD can use this license to unlock the $b\{HWIP_j\}_{locked}$ correctly.
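The effect of binding a license to one device can be illustrated with a toy model. The sketch below is not the authors' implementation: it assumes a two-layer added FSM with m = 4, a made-up table `EXPECTED_INPUT` giving the single 2-bit input each intermediate state accepts, and made-up 4-bit PUF responses for two devices. The function names are hypothetical.

```python
# Hypothetical 2-layer binding FSM with m = 4: the first 2 PUF bits select one of
# four intermediate states; each intermediate state accepts exactly one 2-bit input.
EXPECTED_INPUT = {0b00: 0b11, 0b01: 0b10, 0b10: 0b00, 0b11: 0b01}


def issue_license(puf_response: int) -> int:
    """Core-vendor side: derive the 2-bit license from an enrolled 4-bit PUF response."""
    step1, step2 = puf_response >> 2, puf_response & 0b11
    return step2 ^ EXPECTED_INPUT[step1]


def powers_up_unlocked(puf_response: int, license_bits: int) -> bool:
    """Device side: True if the second step receives the input its state expects."""
    step1, step2 = puf_response >> 2, puf_response & 0b11
    return (step2 ^ license_bits) == EXPECTED_INPUT[step1]


device_a = 0b0100   # response of the licensed FPGA (illustrative value)
device_b = 0b1101   # response of a different FPGA (illustrative value)
lic_a = issue_license(device_a)
print(powers_up_unlocked(device_a, lic_a), powers_up_unlocked(device_b, lic_a))
```

With only a 2-bit license, a random device would still unlock by chance one time in four, which is why the text requires the PUF response and license to be sufficiently long in practice.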
VI. THE REFERENCE IMPLEMENTATION OF PUF
Many kinds of PUF have been proposed in the past decade [42], such as the optical PUF, SRAM PUF, arbiter PUF and ring oscillator PUF. Some of them have also been implemented on FPGAs [17], [18], [21], [22]. The proposed binding method can work with any PUF implemented on an FPGA that satisfies the properties defined in Section III.A. Which PUF to use is up to the FPGA vendor. In this study, we give a concrete implementation based on a delay-based PUF for designers to refer to. This PUF is designed specifically for FPGAs. It does not need a hard macro with fixed routing and is completely described in VHDL, with the merits of ease of use and low silicon area overhead [41].
A. A Delay-Based PUF
In this paper, we designed and implemented a delay-based PUF on 28nm FPGAs. It takes advantage of the manufacturing difference between the switching latencies of two carry-chain multiplexers on the FPGA to produce a positive pulse (glitch) at the output of the downstream multiplexer. The glitch can be used to set the output of a D flip-flop to logic-1 from the default logic-0, which forms a one-bit PUF response. The detailed structure of the PUF design is illustrated in Fig. 7. The shift register contents are pre-initialized as follows:
• Input A: 0x5555 (0101010101010101)
• Input B: 0xAAAA (1010101010101010)
Fig. 8. The new prototype implementation of a primitive PUF on Xilinx Zynq-7000 FPGA.
When the look-up-table (LUT) A and its driving multiplexer A are faster than the LUT B and multiplexer B, the output OUT would be logic-1.
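The race between the two paths can be mimicked with a purely behavioral model. The following Python sketch is a simulation under invented assumptions, not a description of the hardware: each bit cell gets two path delays drawn from a Gaussian with an assumed nominal value and spread, and a seed stands in for one physical chip. The names `puf_bit` and `simulate_response` are hypothetical.

```python
import random


def puf_bit(delay_a: float, delay_b: float) -> int:
    """Logic-1 when path A (LUT A plus multiplexer A) switches before path B."""
    return 1 if delay_a < delay_b else 0


def simulate_response(n_bits: int, device_seed: int) -> list[int]:
    """Draw one fixed pair of path delays per bit cell; the seed stands in for a chip."""
    rng = random.Random(device_seed)
    bits = []
    for _ in range(n_bits):
        delay_a = rng.gauss(1.0, 0.05)   # assumed nominal delay and spread
        delay_b = rng.gauss(1.0, 0.05)
        bits.append(puf_bit(delay_a, delay_b))
    return bits


response = simulate_response(64, device_seed=1)
print("".join(str(b) for b in response))
```

Re-running the model with the same seed gives the same response, which loosely corresponds to the persistence property; it does not model noise, temperature or aging.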
Note that the existing delay-based PUF [41] for FPGAs cannot be directly implemented on the latest Xilinx FPGAs such as Virtex-7, Kintex-7, Artix-7 and Zynq-7000, since the SLICE structure of the latest Xilinx FPGAs differs from that of previous FPGA families such as Virtex-5. In the architecture of Virtex-5 FPGAs, once a LUT in a SLICEM is configured as a shift register, the logic-0 data input of the multiplexer can be connected to a logic-0 signal to meet the design requirement. However, in the SLICEM of Zynq-7000 FPGAs, the two optional paths of the logic-0 data input of the carry chain multiplexer are used as the output or input of a shift register, which cannot meet the requirement that the logic-0 data input of the multiplexer should always be connected to a logic-0 signal.
To solve this problem, we use four SLICEs to implement one bit of the PUF signature. In the layout of the Xilinx Zynq-7000 XC7Z020 FPGA, there are two SLICEs in one CLB. A SLICE whose X coordinate is an even number is a SLICEM; the other SLICE in the same CLB is then a SLICEL. Two SLICEMs are configured as two shift registers, respectively, while their corresponding SLICELs are configured as a carry chain multiplexer. As shown in Fig. 8, four slices are used to implement a new primitive PUF. The dotted line represents the direction of data flow.
B. Reliability-Enhancing Techniques
Silicon PUFs are based on manufacturing variation, and their responses may be very sensitive to the operating environment, such as voltage and temperature, particularly for delay-based PUFs [10], [22]. It is very hard for any known PUF to maintain an absolutely stable response. Methods such as error correcting [21], [40], pattern matching [44], [45],
and temperature aware collaboration [46] have been proposed to correct bit flips in PUF responses to generate stable PUF output.
These state-of-the-art techniques have already been very successful in reducing and correcting PUF bit errors. For example, with Index-Based Syndrome (IBS) coding, error correction performed on the output responses of ring oscillator (RO) PUFs implemented on Virtex-5 FPGAs achieves an error rate below $10^{-6}$ when the temperature ranges from −55°C to 125°C under a 1.0V operating voltage with ±10% variation [40]. Maes et al. demonstrated an efficient and extremely low overhead BCH decoder, designed specifically for correcting bit flips in PUF responses, that uses merely 112 Slices on a Xilinx Spartan-6 FPGA [21]. Paral and Devadas proposed using string pattern matching to generate reliable PUF responses; both the false positive and false negative rates can be less than $10^{-9}$ [44]. Yin and Qu built a temperature-aware collaborative RO PUF in which they measure the PUF output values at different temperatures and choose the correct one based on the actual operating temperature reported by on-chip temperature sensors, which ideally guarantees no bit errors [46].
Moreover, electro-migration, hot carrier injection (HCI), negative bias temperature instability (NBTI) and time-dependent dielectric breakdown (TDDB) cause device aging, which impacts the stability of PUF signatures [47], [48]. For example, Ganta et al. [48] observe that around 4% of the RO-PUF bits are prone to instability due to aging in various operating conditions. Corresponding aging-resistant techniques [49], [50] have also been developed recently.
As we will report in Section VIII, our reference delay-based PUF does have bit errors when operating at different temperatures. However, such errors do not cause any false positives or false negatives, which means that we are able to distinguish a PUF response with bit errors from a PUF response from a different device. To keep our discussion focused on the new PUF-FSM binding scheme for FPGA IP protection and pay-per-device licensing, we do not elaborate on how to improve the reliability of the above reference delay-based PUF or on the associated cost. When highly reliable PUF responses are critical, one can always use one or more of the reliability-enhancing techniques. It is a task for the vendors and IP developers to balance the tradeoff between PUF reliability and design overhead.
C. The Integration Architecture
The delay-based PUF in this paper is designed in HDL, and hence has the merits of ease of use and high flexibility. The PUF can be implanted into an FPGA in the form of a soft-core or a hard-core. A soft-core PUF is implemented in the FPGA fabric, while a hard-core PUF is physically implemented as a structure in the silicon connected to the FPGA fabric. Both hard-cores and soft-cores have been widely adopted in the FPGA industry. An example of a hard-core is the dual-core ARM Cortex-A9 MCU used in the Xilinx Zynq-7000 System on a Programmable Chip (SOPC). On the other hand, soft-cores such as MicroBlaze, Nios II and OpenRISC are more commonly used in FPGAs.
Fig. 9. An example of a soft-core PUF implanted in a SOPC.
Fig. 9 illustrates how a soft-core PUF can be implanted into a SOPC. The PUF is mounted on PLB bus to connect to a Xilinx MicroBlaze soft-core embedded processor. In our proposed binding scheme, the PUF will only be used when there is a need to unlock the IPs, normally during the FPGA power-up process. When the FPGA is running, the PUF will not be needed anymore. Therefore, we propose to power off the PUF unit once the IPs are unlocked. This mechanism can easily be implemented with some control logic and brings several advantages. First, by shutting down the hard-core or soft-core PUF unit, it will not consume unnecessary power; second, when the PUF unit is off, there will not be any leak of timing, power, or electromagnetic emanation from the PUF unit, so it will be more resilient to potential side channel attacks.
VII. THE SECURITY ANALYSIS
The objective of the proposed PUF-FSM binding method is to protect the HWIPs from the piracy attacks such as cloning, copying, unauthorized redistribution, over-use, etc. To analyze the security of this method, we consider the following existing attacks:
• Brute force. The adversary tries to guess the correct license to unlock the $b\{HWIP_j\}_{locked}$. Because the unclonable PUF responses control the transitions of the added STG, the space of possible licenses becomes exponential, making such a brute force attack infeasible. For example, when $L_{PUF}$ = 256 bits ($L_{license}$ = 128 bits), the search space of such a brute force attack consists of all $2^{128}$ possible license values.
• PUF removal/tampering attack. The adversary tries to remove or tamper with the PUF on the FPGA, for example by replacing the PUF with an SRAM that contains PUF responses from a previously unlocked hardware IP. The license for unlocking the previous HWIP can then be used to unlock a new HWIP. There are several countermeasures against this kind of attack, such as adding obfuscated states within the FSM for PUF checking [27]. Our binding scheme can adopt these countermeasures.
• Simulating PUF. According to the intrinsic properties of the PUFs described above, it is impractical to duplicate a PUF with functional and timing characteristics identical to another PUF. Although machine learning
techniques [14] have been used to model some strong PUFs with a high prediction rate, they need a huge number of PUF CRPs during the learning phase. Therefore, this attack is not effective against weak PUFs such as the one used in this paper, the SRAM PUF [18], [29] and similar architectures.
• Tapping PUF responses. In the binding scheme, the secret PUF response is ephemeral (the response is only used to unlock HWIPs at boot time) and is cleared immediately after use; hence the scheme resists tapping of PUF responses.
• Reverse engineering the added FSM. An adversary tries to extract the STG and separate/remove the added STG from the original STG. However, STG recovery is a computationally intractable problem [15], [24], [28], and there exist effective methods that we can use in our scheme against such attacks such as creating black holes in the added FSM and merging the added FSM with the test and other FSMs [24].
• Side channel attacks. These attacks statistically analyze the timing, power consumption or electromagnetic emanation of cryptographic devices to gain knowledge about integrated secrets. Our delay-based glitch PUF architecture (see Fig. 7) uses multiple flip-flops in parallel and leaves little room for side channel attacks. For electromagnetic emanation analysis, it is practically difficult to locate each flip-flop on the die of an FPGA and to focus the EM probe mainly on the radiation of its components. Timing and power analysis attacks are unlikely because all the flip-flops are on regardless of whether the PUF bit is 0 or 1, and the PUF is only used in the IP unlocking phase. However, our approach is not completely free of side channel attacks, as it is well known that any PUF-based security mechanism is vulnerable to side channel attacks unless appropriate countermeasures are taken [21].
VIII. EXPERIMENTAL RESULTS AND ANALYSIS
We have performed a set of experiments to evaluate the effectiveness of the proposed binding method. The experiments consist of two parts: the reference implementation of the delay-based PUF on 28nm Xilinx Zynq-7000 FPGAs, and the evaluation of the PUF-bound FSM on FPGAs.
A. Design Evaluation of a Delay-Based PUF
We implemented 16 identical 64-bit PUFs at different locations on a Xilinx Zynq-7000 FPGA. We used range constraints (the RLOC_RANGE statement) supported by the Xilinx integrated development kit to place the PUF design in the designated area. Hence, the responses from these PUFs are independent. It is well known that the manufacturing variation between two chips is normally larger than the variation between different regions on the same chip. Consequently, if the PUFs located in different regions of a single FPGA produce unique outputs, we can be strongly confident that the PUF outputs from different chips will also be unique [41]. In this section, we first show the area overhead caused by the PUFs, and then discuss the uniqueness and reliability of the PUF outputs.
1) Area Overhead: The Xilinx Zynq-7000 XC7Z020 FPGA has about 53,200 LUTs, 17,400 of which can be used as storage or shift registers. In our experiments, a 128-bit PUF consumes 258 shift register LUTs (utilization: 1%) and 256 flip-flops (utilization: 1%). Hence, the area overhead of the reference PUF implementation is negligible.
2) Uniqueness: Uniqueness measures how distinct a PUF’s responses are, which determines the quality of the PUF. It is not acceptable for different PUFs to produce the same or very similar responses when fed the same challenge. We use Hamming Distance (HD) to evaluate the uniqueness of PUF responses. For a pair of n-bit PUF responses Pi and Pj (i ≠ j), their HD is the number of bit positions in which Pi and Pj differ. A PUF response is unique as long as it has a non-zero HD with the responses of other challenges. However, due to reliability concerns (see item 3 below for more details), PUF responses under different operating environments may have bit errors and thus produce the same response for different challenges when their designated responses have a very small HD. We define, for k n-bit PUF responses P1, P2, · · · , Pk, their average pairwise HD as:
$$u = \frac{2}{k(k-1)} \sum_{i=1}^{k-1} \sum_{j=i+1}^{k} \frac{HD(P_i, P_j)}{n} \times 100\% \qquad (3)$$

where

$$HD(P_i, P_j) = \sum_{m=1}^{n} \left(r_{i,m} \oplus r_{j,m}\right)$$

and $r_{i,m}$ is the m-th response bit of the n-bit response string from PUF $P_i$.
If each PUF response is unique and logic-0 and logic-1 are uniformly distributed in the responses, the expected HD between PUF responses is 50%. In our experiment, we use k = 16 and n = 64. From the (16 ∗ 15)/2 = 120 data points of pairwise HD, we have u = 49.6%, which means that, on average, any pair of PUF responses has an HD of 31.75 bits.
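The uniqueness metric of Eq. (3) is straightforward to compute. The sketch below is a minimal Python illustration with hypothetical function names; it assumes each n-bit response is stored as a Python integer, and the three 4-bit responses in the example are made up rather than taken from the experiment.

```python
from itertools import combinations


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two responses differ."""
    return bin(a ^ b).count("1")


def uniqueness_percent(responses: list[int], n: int) -> float:
    """Average pairwise HD of k n-bit responses, as a percentage of n (Eq. (3))."""
    k = len(responses)
    if k < 2:
        raise ValueError("need at least two responses")
    total_hd = sum(hamming_distance(a, b) for a, b in combinations(responses, 2))
    return 2.0 / (k * (k - 1)) * total_hd / n * 100.0


# Three made-up 4-bit responses; their pairwise HDs are 4, 2 and 2.
toy = [0b0000, 0b1111, 0b0011]
print(round(uniqueness_percent(toy, n=4), 2))
```

With k = 16 and n = 64, as in the experiment, the same function would average over the 120 pairs mentioned in the text.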
To further investigate the uniqueness of the PUF responses, we consider the frequency histogram of these 120 pairwise HDs, which is shown in Fig. 10. These HDs are concentrated around the expected value of 32 (half of 64 bits), with a maximum of 45 and a minimum of 18. This implies that for two different challenges to generate the same response, at least one of the PUF responses must have at least 9 of its 64 bits flipped; as we will see next, this is almost impossible with current PUF technologies. Consequently, we conclude that the implemented PUF achieves good response uniqueness.
3) Reliability: Reliability is used to assess the stability of PUF responses in different environments. Ideally, PUF responses challenged by the same input should remain the same in repeated multiple tests. However, PUF responses may change due to factors such as ambient temperature variation and supply voltage fluctuation since these factors
Fig. 10. PUF response uniqueness: Hamming distance distribution.
Fig. 11. PUF response variability at high vs. normal temperature.
may affect circuit delay in practice. As the most influential factor in normal usage scenarios, temperature variation plays a very important role in PUF performance because it affects circuit delay. In this study, we select temperature as the environmental factor with which to verify PUF performance. We expect the PUF to retain good uniqueness and reliability when the FPGA runs at a high temperature, which may be caused by a heavy workload, poor ventilation, a high ambient temperature and so on.
We used the Xilinx Chip-Scope to monitor and read the temperature of the FPGA chip with the PUFs embedded. Room temperature was 15°C, and the normal temperature of the FPGA chip was about 40°C. We then used an electric hair dryer to raise the FPGA chip temperature to 70°C and tested the PUF responses.
In order to verify the reliability of the PUF at high temperature, the 64-bit PUF responses were recorded when the FPGA temperature was about 70°C. The HDs between the high-temperature responses and the normal-temperature responses were then calculated for the 16 PUFs under test, as Fig. 11 shows. For the same PUF under the same challenge at high and normal temperatures, the maximal, minimal and average HDs of the responses were 9, 2 and 5.5, respectively; 81.25% of the HDs fell within the range [3, 8] (a small difference). As shown in Fig. 11, when the temperature
TABLE II
STATISTICS FOR MCNC’91 BENCHMARKS
rose from 40 °C to 70 °C, the reliability degradation was small: the average HD increased from 2.38 to 5.50, and the maximum increased from 6 to 9. From Fig. 10 and Fig. 11, one can clearly see that there is no overlap between the number of bit errors at high temperature (at most 9 bits) and the Hamming distance between two different PUF responses (at least 18 bits). This phenomenon is known as a vacuum belt: if we use any value x between 10 and 17 as a threshold, then when the number of differing bits between the PUF response at run time and the original correct PUF response is less than x, the differences are bit errors in the same PUF response; otherwise, the two responses come from different PUFs. Therefore, there will not be any false positives or false negatives. As discussed earlier in Section VI.B, when the design is highly sensitive to the operating environment and there is no vacuum belt, we can apply reliability-enhancing techniques to reduce the bit errors in the PUF response.
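The vacuum-belt decision rule can be written down directly. The Python sketch below uses hypothetical function names and an arbitrary threshold of 13 picked from the 10 to 17 range given in the text; responses are assumed to be stored as integers.

```python
def valid_thresholds(max_intra_hd: int, min_inter_hd: int) -> range:
    """Thresholds strictly between the worst-case bit errors and the closest other PUF."""
    if max_intra_hd + 1 >= min_inter_hd:
        return range(0)   # no vacuum belt: the two distributions touch or overlap
    return range(max_intra_hd + 1, min_inter_hd)


def same_puf(reference: int, observed: int, threshold: int = 13) -> bool:
    """True if the observed response differs from the reference in fewer than `threshold` bits."""
    return bin(reference ^ observed).count("1") < threshold


# Values reported in the text: at most 9 bit errors, at least 18 bits between PUFs.
print(list(valid_thresholds(9, 18)))
```

A response with 9 flipped bits is accepted as the same PUF under any of these thresholds, while one that differs in 18 bits is rejected.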
B. Overhead Analysis of Modifying FSM
We performed experiments to evaluate the overhead incurred by modifying the FSM on the MCNC’91 benchmark sequential circuits and on FSM circuits randomly generated by GenFSM [31]. The circuits are described in KISS2 format. First, we use a Java program to add the states and transitions to the circuits in KISS2 format, and then use the kiss2vl tool [30] to convert KISS2 to Verilog. Finally, each FSM circuit in Verilog format was synthesized and implemented using Xilinx ISE 14.1 on the Xilinx Virtex5 FPGA XC5VLX50T, featuring 7200 slices and 28800 Slice LUTs. All experiments were conducted on a Dell OptiPlex 740 machine with a 2.4GHz AMD Athlon(tm) 64 Processor 3800+ and 1GB RAM.
Table II gives the original synthesis summary for the MCNC’91 benchmark designs. The columns “|S|”, “PI”, “PO” and “T” are the numbers of states, input variables, output variables and transitions, respectively, in each FSM benchmark. The columns “LUTs”, “Slices”, “Delay” and “Power” are the “Number of Slice LUTs”, “Number of occupied Slices”, “Minimum period” and “Estimated power”, respectively, of the design with the original FSM as reported by the ISE tools. The Minimum period was obtained by using the Timing Analyzer, and the Power is the estimated power obtained by using the XPower Analyzer.
In our experiment, the number of replicated states m in each odd layer and the number of layers M in the added FSM were
TABLE III
STATISTICS FOR MCNC’91 BENCHMARKS WITH OUR METHOD WHEN m = 4 & M = 4
TABLE IV
STATISTICS FOR MCNC’91 BENCHMARKS WITH OUR METHOD WHEN m = 4 & M = 6
TABLE V
STATISTICS FOR LARGE FSMs GENERATED BY GENFSM WITH OUR METHOD WHEN m = 4 & M = 4
set as parameters. Table III and Table IV show the synthesis summary on the benchmark circuits processed by our method when (m = 4 & M = 4) and (m = 4 & M = 6), respectively. Resource overhead is measured by the increase in the “Number of Slice LUTs” and “Number of occupied Slices”. Timing overhead is measured by the increase in the Minimum period. ΔR-LUTs and ΔR-Slices are the normalized resource overheads of our proposed scheme in LUTs and Slices, respectively. ΔD and ΔP are the normalized overheads in delay and power, respectively. We can see from Table III and Table IV that the resource, timing and power overheads due to modifying the FSM seem to be independent of the benchmark circuit size. The average resource, timing and power overheads are 52.02% for LUTs (55.34% for Slices), 11.77% and 0.03% when m = 4 & M = 4; and 61.27% for LUTs (49.21% for Slices), 13.91% and 0.03% when m = 4 & M = 6.
Table III and Table IV reveal that the power overhead is rather low, and that the timing degradation is moderate (11.77% for Table III and 13.91% for Table IV on average) and even negative in some instances. A negative percentage implies that our method has actually improved the performance. The high area overhead and moderate timing overhead on these small benchmark circuits are a direct result of the simplicity of these circuits, as they contain only control paths. In practice, an actual HWIP would be much larger, with many other components in addition to control paths. In those cases, we expect the overhead to be small.
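The text does not spell out how the normalized overheads are computed. A common reading, assumed here, is the relative change of a metric with respect to the original design, expressed as a percentage. The sketch below uses a hypothetical function name and made-up numbers, not values from the tables.

```python
def overhead_percent(original: float, modified: float) -> float:
    """Relative change of a metric (LUTs, Slices, delay or power) in percent.

    Assumed definition: (modified - original) / original * 100.
    """
    if original <= 0:
        raise ValueError("original value must be positive")
    return (modified - original) / original * 100.0


# Made-up numbers: 100 LUTs growing to 152, and a minimum period shrinking from 8 ns to 6 ns.
print(overhead_percent(100, 152))
print(overhead_percent(8.0, 6.0))
```

Under this definition, a negative result means the modified design uses less of the resource or has a shorter minimum period than the original, which matches the interpretation of negative percentages in the text.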
To demonstrate this, we use GenFSM [31] to generate ten random STGs with hundreds of states and hundreds to thousands of state transitions for experimentation by specifying the number of inputs, outputs and states. The experimental results, as shown in Table V, indicate that our method introduces very
Fig. 12. Area, delay, and power overhead with different M = (2,4,6,8,10,12) and m = (2,4,6,8,10) for circuit planet.
low “ΔR-LUTs”, “ΔR-Slices”, “ΔD” and “ΔP” for large FSM designs: the average resource, timing and power overheads are −2.67% for LUTs (−6.25% for Slices), 0.64% and 0.01%, respectively. In addition, the overhead could be much lower in large designs that contain many components other than control paths, such as memory and I/O peripherals. Furthermore, the control path realized by the binding FSM on each benchmark is only a small part of the overall size of the design (≤ 1%) [23]. Therefore, adding more control paths to the binding FSM would be acceptable in practice, with low overhead on area, timing and power.
Finally, we discuss the impact of various m and M on the resource, timing and power overheads for the benchmarks. Fig. 12 shows this impact for the benchmark planet when M was set to 2, 4, 6, 8, 10, 12 and m was set to 2, 4, 6, 8, 10, successively. It can be seen that the power overhead is negligible, and that the resource and timing overheads are roughly positively correlated with both M and m, but nonlinearly so because of the optimization of the circuits during synthesis.
From the above results, we see that the PUF-FSM binding scheme keeps the area, power and timing overhead under control, especially in large designs. Suitable values of m and M can be chosen empirically.
IX. CONCLUSION AND FUTURE WORK
This article presents a new binding method that enables binding hardware IPs to specific FPGAs by utilizing the PUF and the FSM of circuits. The method is fundamentally different from the traditional encryption-based HWIP protection methods and offers the following advantages: 1) it can be used to protect third-party FPGA IP cores in addition to a single FPGA configuration bitstream; 2) it does not need any third parties or permanent storage for secret keys in the FPGA; 3) it supports the pay-per-device licensing mechanism; and 4) it has low hardware cost. Experimental results on a reference implementation of the binding scheme show that a 128-bit delay-based PUF utilizes only 258 RAM-LUTs and 256 flip-flops on 28nm Xilinx FPGAs, and that the modified FSM introduces only 0.64% timing overhead and 0.01% power overhead on average for ten FSM designs randomly generated by GenFSM.
We conclude with a discussion of the limitations of our PUF-FSM binding scheme, which lead to several future research directions. First, in our approach, we modify the FSM of a design to lock it and use the PUF response to unlock it. This effectively protects the whole design, not only the FSM, from attacks such as cloning, copying, misuse and unauthorized integration. However, the design components without bound FSMs will still be vulnerable to
tampering attacks. Anti-tampering is beyond the scope of this article, but it will be interesting to study how our approach can be combined with anti-tamper methods such as those described in [12]. Second, although we have argued that our approach is more resilient against physical attacks than encryption-based IP protection methods, from the physical security perspective it is well known that any implementation of a cryptographic primitive or PUF-based security mechanism is vulnerable to side channel attacks when no appropriate countermeasures are taken [21]. It will be of high interest to develop effective countermeasures to enhance resiliency against various types of physical attacks.
ACKNOWLEDGMENT
The authors would like to thank Dr. Qiang Wu, Dr. Qiang Zhou, Wenjie Che and Kecheng Yang for reviewing this article and providing us feedback. We would also like to thank the anonymous reviewers for their insightful suggestions and comments.
REFERENCES
[1] S. Drimer, “Security for volatile FPGAs,” Ph.D. dissertation, Dept. Comput. Lab., Univ. Cambridge, Cambridge, U.K., Tech. Rep. UCAM-CL-TR-763, Nov. 2009.
[2] R. Maes, D. Schellekens, and I. Verbauwhede, “A pay-per-use licensing scheme for hardware IP cores in recent SRAM-based FPGAs,” IEEE Trans. Inf. Forensics Security, vol. 7, no. 1, pp. 98–108, Feb. 2012.
[3] T. Kean, “Cryptographic rights management of FPGA intellectual property cores,” in Proc. ACM/SIGDA 10th Int. Symp. Field-Program. Gate Arrays (FPGA), 2002, pp. 113–118.
[4] T. Güneysu, B. Möller, and C. Paar, “Dynamic intellectual property protection for reconfigurable devices,” in Proc. Int. Conf. Field-Program. Technol. (ICFPT), Dec. 2007, pp. 169–176.
[5] S. Drimer, T. Güneysu, M. G. Kuhn, and C. Paar. (2008). Protecting Multiple Cores in a Single FPGA Design. [Online]. Available: http://www.cl.cam.ac.uk/~sd410/papers/protect_many_cores.pdf
[6] “Design security in Stratix III devices (v1.5),” Altera, San Jose, CA, USA, White Paper 01010, Sep. 2009.
[7] “Using high security features in Virtex-II series FPGAs (v1.0),” Xilinx, San Jose, CA, USA, Appl. Note 766, Jul. 2004.
[8] S. Trimberger, J. Moore, and W. Lu, “Authenticated encryption for FPGA bitstreams,” in Proc. 19th ACM/SIGDA Symp. Field-Program. Gate Arrays (FPGA), 2011, pp. 83–86.
[9] “Protecting the FPGA design from common threats (v1.0),” Altera, San Jose, CA, USA, White Paper 01111, Jun. 2009.
[10] G. E. Suh and S. Devadas, “Physical unclonable functions for device authentication and secret key generation,” in Proc. 44th ACM/IEEE Design Autom. Conf. (DAC), Jun. 2007, pp. 9–14.
[11] M. A. Gora, A. Maiti, and P. Schaumont, “A flexible design flow for software IP binding in FPGA,” IEEE Trans. Ind. Informat., vol. 6, no. 4, pp. 719–728, Nov. 2010.
[12] S. J. Stone, “Anti-tamper method for field programmable gate arrays through dynamic reconfiguration and decoy circuits,” M.S. thesis, Dept. Elect. Comput. Eng., Air Force Inst. Technol., Wright-Patterson Air Force Base, OH, USA, 2008.
[13] K. Kepa, F. Morgan, and K. Kosciuszkiewicz, “IP protection in partially reconfigurable FPGAs,” in Proc. IEEE Int. Conf. Field-Program. Logic Appl. (FPL), Aug./Sep. 2009, pp. 403–409.
[14] U. Rührmair, F. Sehnke, J. Sölter, G. Dror, S. Devadas, and J. Schmidhuber, “Modeling attacks on physical unclonable functions,” in Proc. 17th ACM Conf. Comput. Commun. Secur. (CCS), 2010, pp. 237–249.
[15] A. L. Oliveira, “Robust techniques for watermarking sequential circuit designs,” in Proc. 36th Annu. ACM/IEEE Design Autom. Conf. (DAC), Jun. 1999, pp. 837–842.
[16] E. Simpson and P. Schaumont, “Offline hardware/software authentication for reconfigurable platforms,” in Proc. 8th Int. Conf. Cryptogr. Hardw. Embedded Syst. (CHES), 2006, pp. 311–323.
[17] M. Majzoobi and F. Koushanfar, “Time-bounded authentication of FPGAs,” IEEE Trans. Inf. Forensics Security, vol. 6, no. 3, pp. 1123–1135, Sep. 2011.
[18] J. Guajardo, S. S. Kumar, G.-J. Schrijen, and P. Tuyls, “FPGA intrinsic PUFs and their use for IP protection,” in Proc. 9th Int. Workshop Cryptogr. Hardw. Embedded Syst. (CHES), 2007, pp. 63–80.
[19] “Security solutions using Spartan-3 generation FPGAs (v1.1),” Xilinx, San Jose, CA, USA, White Paper 266, Apr. 2008.
[20] J. A. Roy, F. Koushanfar, and I. L. Markov, “EPIC: Ending piracy of integrated circuits,” in Proc. Eur. Design Test Conf. (DATE), 2008, pp. 1069–1074.
[21] R. Maes, A. Van Herrewege, and I. Verbauwhede, “PUFKY: A fully functional PUF-based cryptographic key generator,” in Proc. 14th Int. Conf. Cryptogr. Hardw. Embedded Syst. (CHES), 2012, pp. 302–319.
[22] M. Majzoobi, A. Kharaya, F. Koushanfar, and S. Devadas. (2014). “Automated design, implementation, and evaluation of arbiter-based PUF on FPGA using programmable delay lines,” Rice Univ., Houston, TX, USA, Tech. Rep. 2014/639. [Online]. Available: http://eprint.iacr.org/
[23] J. L. Hennessy and D. A. Patterson, Computer Architecture: A Quantitative Approach, 4th ed. San Mateo, CA, USA: Morgan Kaufmann, 2006.
[24] Y. M. Alkabani and F. Koushanfar, “Active hardware metering for intellectual property protection and security,” in Proc. 16th USENIX Secur. Symp., 2007, pp. 291–306.
[25] Y. Alkabani, F. Koushanfar, and M. Potkonjak, “Remote activation of ICs for piracy prevention and digital right management,” in Proc. IEEE/ACM Int. Conf. Comput.-Aided Design (ICCAD), Nov. 2007, pp. 674–677.
[26] Y. Alkabani and F. Koushanfar, “Active control and digital rights management of integrated circuit IP cores,” in Proc. Int. Conf. Compil., Archit., Synth. Embedded Syst., 2008, pp. 227–234.
[27] F. Koushanfar, “Provably secure active IC metering techniques for piracy avoidance and digital rights management,” IEEE Trans. Inf. Forensics Security, vol. 7, no. 1, pp. 51–63, Feb. 2012.
[28] A. Cui, C.-H. Chang, S. Tahar, and A. T. Abdel-Hamid, “A robust FSM watermarking scheme for IP protection of sequential circuit design,” IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 30, no. 5, pp. 678–690, May 2011.
[29] D. E. Holcomb, W. P. Burleson, and K. Fu, “Power-up SRAM state as an identifying fingerprint and source of true random numbers,” IEEE Trans. Comput., vol. 58, no. 9, pp. 1198–1210, Sep. 2009.
[30] C. Pruteanu. (2000). Kiss to Verilog FSM Converter. [Online]. Available: http://codrin.freeshell.org/
[31] C. Pruteanu and C.-G. Haba, “GenFSM: A finite state machine generation tool,” in Proc. 9th Int. Conf. Develop. Appl. Syst., 2008, pp. 165–168.
[32] A. B. Kahng et al., “Watermarking techniques for intellectual property protection,” in Proc. 35th Annu. Design Autom. Conf. (DAC), 1998, pp. 776–781.
[33] G. Qu and M. Potkonjak, Intellectual Property Protection in VLSI Designs: Theory and Practice. Boston, MA, USA: Kluwer, 2003.
[34] J. Zhang, Y. Lin, Q. Wu, and W. Che, “Watermarking FPGA bitfile for intellectual property protection,” Radioengineering, vol. 21, no. 2, pp. 764–771, 2012.
[35] J. Zhang, Y. Lin, W. Che, Q. Wu, Y. Lyu, and K. Zhao, “Efficient verification of IP watermarks in FPGA designs through lookup table content extracting,” IEICE Electron. Exp., vol. 9, no. 22, pp. 1735–1741, 2012.
[36] J. Zhang, Y. Lin, and G. Qu, “Reconfigurable binding against FPGA replay attacks,” ACM Trans. Design Autom. Electron. Syst., vol. 20, no. 2, Feb. 2015, Art. ID 33.
[37] J. Zhang et al., “FPGA IP protection by binding finite state machine to physical unclonable function,” in Proc. 23rd Int. Conf. Field-Program. Logic Appl. (FPL), Sep. 2013, pp. 1–4.
[38] F. Koushanfar and G. Qu, “Hardware metering,” in Proc. 38th Annu. Design Autom. Conf. (DAC), 2001, pp. 490–493.
[39] J.-L. Zhang, G. Qu, Y.-Q. Lv, and Q. Zhou, “A survey on silicon PUFs and recent advances in ring oscillator PUFs,” J. Comput. Sci. Technol., vol. 29, no. 4, pp. 664–678, 2014.
[40] M.-D. Yu and S. Devadas, “Secure and robust error correction for physical unclonable functions,” IEEE Des. Test Comput., vol. 27, no. 1, pp. 48–65, Jan./Feb. 2010.
[41] J. H. Anderson, “A PUF design for secure FPGA-based embedded systems,” in Proc. 15th Asia South Pacific, Design Autom. Conf. (ASP-DAC), 2010, pp. 1–6.
[42] R. Maes and I. Verbauwhede, “Physically unclonable functions: A study on the state of the art and future research directions,” in Towards Hardware-Intrinsic Security. Berlin, Germany: Springer-Verlag, 2010, pp. 3–37.
1150 IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, VOL. 10, NO. 6, JUNE 2015
[43] J. Guajardo, S. S. Kumar, G.-J. Schrijen, and P. Tuyls, “Physical unclonable functions and public-key crypto for FPGA IP protection,” in Proc. Int. Conf. Field Program. Logic Appl. (FPL), Aug. 2007, pp. 189–195.
[44] Z. Paral and S. Devadas, “Reliable and efficient PUF-based key generation using pattern matching,” in Proc. IEEE Int. Symp. Hardw.-Oriented Secur. Trust (HOST), Jun. 2011, pp. 128–133.
[45] M. Majzoobi, M. Rostami, F. Koushanfar, D. S. Wallach, and S. Devadas, “Slender PUF protocol: A lightweight, robust, and secure authentication by substring matching,” in Proc. IEEE Symp. Secur. Privacy Workshops (SPW), May 2012, pp. 33–44.
[46] G. Qu and C.-E. Yin, “Temperature-aware cooperative ring oscillator PUF,” in Proc. IEEE Int. Workshop Hardw.-Oriented Secur. Trust (HOST), Jul. 2009, pp. 36–42.
[47] A. Maiti, L. McDougall, and P. Schaumont, “The impact of aging on an FPGA-based physical unclonable function,” in Proc. Int. Conf. Field Program. Logic Appl. (FPL), Sep. 2011, pp. 151–156.
[48] D. Ganta and L. Nazhandali, “Study of IC aging on ring oscillator physical unclonable functions,” in Proc. 15th Int. Symp. Quality Electron. Design (ISQED), Mar. 2014, pp. 461–466.
[49] M. Rahman, D. Forte, J. Fahrny, and M. Tehranipoor, “ARO-PUF: An aging-resistant ring oscillator PUF design,” in Proc. Design, Autom., Test Eur. Conf. Exhibit. (DATE), 2014, pp. 1–6.
[50] R. Maes and V. van der Leest, “Countering the effects of silicon aging on SRAM PUFs,” in Proc. IEEE Int. Symp. Hardw.-Oriented Secur. Trust (HOST), May 2014, pp. 148–153.
Jiliang Zhang received the B.E. degree in chemical engineering and technology from the Shandong University of Science and Technology, Qingdao, China, in 2009, and the Ph.D. degree in computer application technology from Hunan University, Changsha, China, in 2015. From 2013 to 2014, he was a Research Scholar with the Maryland Embedded Systems and Hardware Security Laboratory, University of Maryland, College Park, MD, USA. His research interests include hardware security, such as security for field-programmable gate arrays, PUF and PUF-related applications, IC obfuscation, and IP protection.
Yaping Lin received the B.S. degree from Hunan University, Changsha, China, in 1982, the M.S. degree from the University of Defense Technology, Changsha, in 1985, and the Ph.D. degree from Hunan University, in 2000. From 2004 to 2005, he was a Visiting Scholar with the University of Texas at Arlington, Arlington, TX, USA. He is currently a Professor with the College of Information Science and Engineering, Hunan University. His primary research interests are in the area of computer networking and information security with a focus on sensor networks, cloud security, and hardware related security.
Yongqiang Lyu received the B.S. degree in computer science from Xidian University, Xi’an, China, in 2001, and the M.S. and Ph.D. degrees in computer science from Tsinghua University, Beijing, China, in 2003 and 2006, respectively. He is currently an Assistant Professor with the Research Institute of Information Technology, Tsinghua University. His research interest focuses on the hardware–software fusion architecture in emerging computing systems.
Gang Qu (SM’07) received the B.S. and M.S. degrees in mathematics from the University of Science and Technology of China, Hefei, China, in 1992 and 1994, respectively, and the Ph.D. degree in computer science from the University of California at Los Angeles, Los Angeles, CA, USA, in 2000. Upon graduation, he joined the University of Maryland, College Park, MD, USA, where he is currently a Professor with the Department of Electrical and Computer Engineering and the Institute for Systems Research. He is a member of the Maryland Cybersecurity Center and the Maryland Energy Research Center. He is the Director of Maryland Embedded Systems and Hardware Security Laboratory, College Park, and the Wireless Sensors Laboratory. His primary research interests are in the area of embedded systems and very large scale integration (VLSI) computer aided design (CAD) with a focus on low power system design and hardware related security and trust. He studies optimization and combinatorial problems and applies his theoretical discovery to applications in VLSI CAD, wireless sensor network, bioinformatics, and cybersecurity. He has received many awards for his academic achievements, teaching, and service to the research community. He serves as an Associate Editor of the IEEE EMBEDDED SYSTEMS LETTERS, and the Integration, the VLSI Journal.
sources/152/Trimberger and Moore - 2014 - FPGA Security Motivations, Features, and Applicat.pdf
INVITED PAPER
FPGA Security: Motivations, Features, and Applications
This paper discusses all aspects of FPGA security and trust.
By Stephen M. Trimberger, Fellow IEEE, and Jason J. Moore
ABSTRACT | Since their inception, field-programmable gate arrays (FPGAs) have grown in capacity and complexity so that now FPGAs include millions of gates of logic, megabytes of memory, high-speed transceivers, analog interfaces, and whole multicore processors. Applications running in the FPGA include communications infrastructure, digital cinema, sensitive database access, critical industrial control, and high-performance signal processing. As the value of the applications and the data they handle has grown, so has the need to protect those applications and data. Motivated by specific threats, this paper describes FPGA security primitives from multiple FPGA vendors and gives examples of those primitives in use in applications.
KEYWORDS | Anti-tamper (AT); authentication; encryption; field-programmable gate arrays (FPGAs); information assurance; physically uncloneable function (PUF); system on chip (SoC); trust
I. INTRODUCTION
A. FPGAs and Programming Technology
A field-programmable gate array (FPGA) is a semiconductor device that can be programmed after manufacture to perform a specific application design, typically specified as a digital logic system [43]. A taxonomy of FPGAs commonly starts with the program storage technology (Fig. 1).
SRAM-programmed FPGAs store their configuration data in internal volatile memory cells distributed throughout the device. These are generally not SRAM cells, but are more similar to static latch cells [43]. Xilinx’s 7-Series and Altera’s Stratix-5 are examples of popular SRAM-based FPGAs. A recognized disadvantage of SRAM programming stems from its volatility. When power is removed, the programming is lost, so an SRAM FPGA requires an external nonvolatile memory for permanent storage of the application program. When power is applied, the SRAM FPGA loads its programming bitstream from that external storage. Besides requiring a second device, the transmission of the program from the nonvolatile external memory to the SRAM FPGA may expose the programming to a potential adversary. The volatility of data may also be used as a positive security feature, enabling the SRAM FPGA to clear all programming if it is tampered with [48].
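The power-up behavior described above can be pictured with a small toy model. This is only an illustrative sketch: the class names (ExternalMemory, BusProbe, SramFpga) and the methods are hypothetical and do not correspond to any vendor API, and the bitstream is an arbitrary byte string. The sketch shows the three properties the text names: the configuration is lost on power-down, an unprotected load from external memory can be observed in transit, and volatility lets the device clear its programming when tampering is detected.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ExternalMemory:
    """Nonvolatile storage that holds the bitstream while power is off."""
    bitstream: bytes


@dataclass
class BusProbe:
    """Hypothetical adversary observing the configuration interface."""
    captured: List[bytes] = field(default_factory=list)

    def observe(self, data: bytes) -> None:
        self.captured.append(data)


@dataclass
class SramFpga:
    """Toy SRAM-programmed FPGA: configuration lives in volatile cells."""
    config: Optional[bytes] = None

    def power_up(self, memory: ExternalMemory,
                 probe: Optional[BusProbe] = None) -> None:
        # The bitstream travels from external memory to the FPGA; anything
        # on that path can see it if the transfer is not protected.
        if probe is not None:
            probe.observe(memory.bitstream)
        self.config = memory.bitstream

    def power_down(self) -> None:
        # Volatility: the programming is lost when power is removed.
        self.config = None

    def on_tamper(self) -> None:
        # Volatility used defensively: clear all programming on tamper.
        self.config = None


memory = ExternalMemory(bitstream=b"\x01\x02example-design")
probe = BusProbe()
fpga = SramFpga()

fpga.power_up(memory, probe)
assert fpga.config == memory.bitstream
assert probe.captured == [memory.bitstream]  # unprotected load is observable

fpga.on_tamper()
assert fpga.config is None  # design cleared from the device
```

In this model the probe captures exactly what the FPGA loads, which is the exposure that motivates protecting the bitstream in transit.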
In contrast, flash memory programmable logic devices, such as traditional complex programmable logic devices (CPLDs), the Microsemi Corporation (Aliso Viejo, CA, USA) SmartFusion2, and Lattice Semiconductor (Hillsboro, OR, USA) ispXPGA [1], [21], are nonvolatile and use internal flash memory to hold the programming. While the internal flash memory eliminates the need for an external nonvolatile storage device and the consequent exposure of the programming to potential adversaries, systems employing flash FPGAs commonly require in-system programming (ISP) of the FPGA. ISP exposes the programming of the FPGA to the same security concerns as SRAM FPGAs. The availability of reprogrammable flash provides FPGA manufacturers with the ability to build applications that “remember” information through power cycles, which is useful in cryptographic applications such as tamper logging and key revocation. Flash devices can also be erased upon command to eliminate the design when needed.
Manuscript received September 14, 2013; revised May 16, 2014; accepted June 11, 2014. Date of publication July 8, 2014; date of current version July 18, 2014.
S. M. Trimberger is with Xilinx, San Jose, CA 95124 USA (e-mail: [email protected]).
J. J. Moore is with Xilinx, Albuquerque, NM 87109 USA.
Digital Object Identifier: 10.1109/JPROC.2014.2331672
0018-9219 © 2014 IEEE. Translations and content mining are permitted for academic research only. Personal use is also permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
1248 Proceedings of the IEEE | Vol. 102, No. 8, August 2014
Fig. 1. FPGA taxonomy.
Antifuse FPGAs, such as the Microsemi Axcelerator, use a one-time programmable structure to form nonvolatile links between internal nodes [2], [25], [26]. An antifuse, commonly built as a programmable via between metal layers, is disconnected at manufacture. A high-voltage pulse programs the fuse, causing it to form a low-resistance connection between the internal nodes. Antifuses are nonvolatile, but one-time programmable. Once programmed, an antifuse FPGA cannot be changed or reprogrammed. Because there is no need for external configuration storage, the confidentiality and authentication of configuration data are more easily maintained. ISP is not possible. However, because the program cannot be erased from the device, additional system-level security concerns may remain.
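The three programming technologies can be compared side by side with a small table of properties. The sketch below is only a reading aid: the field names and the string labels for transit exposure are assumptions chosen for illustration, and the values are transcribed from the descriptions of SRAM, flash, and antifuse devices in this section rather than from any datasheet.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ProgrammingTechnology:
    name: str
    volatile: bool                # programming is lost when power is removed
    needs_external_storage: bool  # separate nonvolatile memory holds the bitstream
    erasable: bool                # design can be cleared or erased in the field
    transit_exposure: str         # when the bitstream travels outside the device


# Values follow the text of this section; field names are illustrative only.
TAXONOMY: List[ProgrammingTechnology] = [
    ProgrammingTechnology(
        name="SRAM", volatile=True, needs_external_storage=True,
        erasable=True, transit_exposure="every power-up"),
    ProgrammingTechnology(
        name="flash", volatile=False, needs_external_storage=False,
        erasable=True, transit_exposure="during in-system programming"),
    ProgrammingTechnology(
        name="antifuse", volatile=False, needs_external_storage=False,
        erasable=False, transit_exposure="none"),
]


def exposed_in_transit(techs: List[ProgrammingTechnology]) -> List[str]:
    """Technologies whose programming can be intercepted outside the device."""
    return [t.name for t in techs if t.transit_exposure != "none"]


def can_clear_design(techs: List[ProgrammingTechnology]) -> List[str]:
    """Technologies that can remove the design from the device when needed."""
    return [t.name for t in techs if t.erasable]


print(exposed_in_transit(TAXONOMY))  # ['SRAM', 'flash']
print(can_clear_design(TAXONOMY))    # ['SRAM', 'flash']
```

Under these assumptions the same two technologies appear in both lists, which mirrors the trade-off in the text: the devices whose programming can be exposed in transit are also the ones that can erase a design, while antifuse devices avoid the exposure but cannot be erased.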
B. Why SRAM?
By far, the most common FPGAs, even in security-conscious applications, are those programmed with SRAM. If SRAM programming exposes sensitive data to adversaries, why would anyone use them? The popularity of SRAM programming technology derives from the simplicity of its manufacture: SRAM FPGAs require only transistors and wires to realize the interconnect, configuration memory cells, and switches of the generic device. Therefore, SRAM FPGAs take advantage of new process nodes earlier than other FPGAs [47], and may be two process generations ahead of other technologies. This process advantage results in higher performance, greater logic density, and improved power efficiency for SRAM FPGAs. SRAM programming also simplifies manufacturing test, where the SRAM FPGA is typically programmed many times to perform self-tests. In addition, SRAM FPGA applications can be easily updated in the field in much the same way software is updated.
When used with strong bitstream security features, including those described in this paper, the security of SRAM FPGAs is on par with the security of nonvolatile internal storage of the bitstream. Therefore, despite the greater perceived security of antifuse and flash FPGAs, SRAM FPGAs are deployed in many security-conscious applications.
C. The FPGA Design Lifecycle
The FPGA lifecycle includes two design flows: the base array design and the application design (Fig. 2). Security must be maintained through both [44]. The base array design is a standard integrated circuit development flow controlled by the FPGA manufacturer. The base array is designed using commercial design tools and libraries, manufactured at a foundry, and tested. It is then typically sent to another facility for packaging and final test. The resulting base array is shipped to a customer or authorized distributor. The base array design is subject to the same supply chain trust and security concerns as any other integrated circuit, including questions about tampering with tools, supply-chain control, and reverse engineering. Large FPGA manufacturers maintain a close watch on their supply chain, tracking every manufactured device through to final customer delivery or destruction. As the security issues associated with the design and manufacture of the base array are no different than those of other semiconductor devices, this paper does not focus on the base array design and manufacture, but instead focuses on the security concerns that arise from the need to protect the application design.
The application design also has a design phase, typically performed with FPGA vendors’ tools, often augmented with commercial EDA tools. The application developer integrates design information or intellectual property (IP) from a number of sources into an FPGA application: original and reused hardware description language (HDL) code, libraries from the FPGA vendor and other parties, and software for soft and hard microprocessors. The FPGA vendor’s tools compile the application design into a bitstream, the programming of the FPGA base array to realize the application function. As with any design process, the design itself can be carried out in a secure location. Protection of IP during the design phase is no different for FPGAs than it is for ASICs or microprocessors. Therefore, this paper does not address design-phase security.
A nonvolatile FPGA, such as a flash or antifuse FPGA, may be programmed before it is shipped. An SRAM FPGA is typically shipped with a separate nonvolatile memory containing the programming, and when power is applied, the FPGA loads its programming from the nonvolatile memory.
D. This Paper
This paper begins by focusing on those FPGA aspects
that impact security, both positively and negatively. It
summarizes the common threat vectors and then introduces some early FPGA security strategies. The remainder
of the paper focuses on modern FPGA security as it relates
to two of the primary security domains: information
assurance (IA) and anti-tamper (AT). In each domain, the
presentation describes the techniques that are currently
deployed, introducing them broadly, then using specific
threats to motivate additional detail. The various FPGA
vendors have chosen solutions to these threats that are similar, yet they differ in detail. In this paper, we attempt
to describe the major security solutions deployed by large
FPGA vendors, outlining major distinctions while omitting
minor differences. Since FPGA security is continually
changing, newer FPGAs may well deploy different
mechanisms.
Fig. 2. FPGA lifecycle flows. (Left) Generic integrated circuit flow for the base array. (Right) Application design and deployment flow.
Trimberger and Moore: FPGA Security: Motivations, Features, and Applications
Vol. 102, No. 8, August 2014 | Proceedings of the IEEE 1249
Following the discussion of security features is a sec-
tion showing applications using those features to achieve security goals. This paper concludes with a short discus-
sion of the future of FPGA security capabilities.
II. FPGA SECURITY INTRODUCTION
A. Unique Aspects of FPGA Security
FPGA programming bitstreams are qualitatively much
like microprocessor software. They are susceptible to all
the same security concerns that surround software, includ-
ing unauthorized copy, theft of IP embodied in the FPGA
application program, and tampering to introduce malware
[9], [46]. FPGA programming is present in the system in
the field, whether programmed directly in antifuses, flash
memory cells, or in an external nonvolatile memory. If an
adversary can recover the programming by reading the internal memory, intercepting the programming bit-
stream, or reverse-engineering programmed fuses from a
decapped device, then the application can be duplicated
and reverse engineered. SRAM FPGAs, in particular, have
been criticized over this concern [2], although Flash-based
FPGAs have the same susceptibility if in-system
reprogrammability is required.
On the other hand, the application developer does not reveal the application design to FPGA vendors or their
suppliers. Because the FPGA base array is manufactured
without knowledge of the end application, there is no
chance of IP theft or tampering of an application design
during manufacture and test of the FPGA base array. Since
all FPGA devices are manufactured identically and sold
into a variety of applications, an adversary cannot discover
any application-dependent information by attacking the FPGA vendor’s supply chain.
Further, since the programming is not done with me-
tallization as is the case with ASIC devices, traditional
reverse engineering, where the mask layers are recognized
from a decapped device, does not work. Such reverse
engineering may yield the application-independent base
array, but not the application implemented on it.
B. Environment and the Cost of Security
FPGA security is complicated by the environment in
which the FPGA is expected to perform. The design of
FPGA security features assumes no physical barrier and no
communication network: the FPGA may be in the hands of
an adversary with no trusted party available. This envi-
ronmental assumption distinguishes FPGA security from
internet security, where servers may physically reside in a trusted environment and those servers can verify identity
through name servers with which they are in communi-
cation. In FPGA security design, it is assumed that the
adversary has physical access to the device and may mount
any electrical, physical, side channel, or replay attack. The
rationale is straightforward: if the adversary does not have
such access, then the containing system could ensure the
security of the FPGA by controlling all access to the FPGA. In this case, built-in FPGA security would be redundant.
Although military systems may employ physical secu-
rity, the cost of ‘‘guns, gates, and guards’’ is impractical in
commercial systems. The adversary is assumed to have an
economic motive, such as theft of IP. Therefore, the secu-
rity applied in the commercial domain is an economic
concern where the cost of security measures is balanced
against the value of the information being protected. FPGA security is designed to make the cost of breaking the
security greater than the adversary’s expected economic
gain. This decision is ultimately in the hands of the
application developer, not the FPGA manufacturer.
As FPGAs have become larger and more capable, the
value of the IP of the application designs has grown, moti-
vating significant investment in built-in security functions.
Further, the value of the data handled by the FPGA has also increased significantly, including such information as
decrypted digital cinema and personal-data databases. As
a result, today we find FPGAs deployed in a security-
hostile environment, protecting data of great commercial
value.
III. THREATS
An adversary may attack the IP of the application design
itself, the data stored in the application or the system of
which the FPGA is a part. Each type of data has different
value. Each attack requires different security features to
defend. The attacks of major concern to FPGA vendors can
be divided into categories.
A. Cloning/Overbuilding
In cloning, an adversary copies the FPGA programming,
then uses it in an identical device, selling it as his
own. In overbuilding, an adversary such as a contract
manufacturer builds additional systems, inserting the legi-
timate bitstream into those systems and selling them
without the designer’s approval. Cloning may apply to an
entire design or may apply to a subset of the design, for
example, purchased cores that may be restricted by the seller. In both cases, the adversary does not require de-
tailed knowledge of the design.
B. Reverse Engineering
An adversary may reverse engineer the bitstream to
recover the circuit design that it implements. This may be
done to understand and duplicate the functionality of that
application, but may also be used as part of an attack on other aspects of the system. Reverse engineering may be
used to tamper with the application to insert malware.
Historically, reverse engineering an FPGA bitstream, like
decompiling software, has been considered possible,
though tedious and nontrivial. Reverse engineering of
FPGA bitstreams is further complicated because FPGA
vendors do not have a standardized bitstream. As a result,
every new FPGA device requires a new bitstream reverse-engineering effort.
A more insidious problem is dealing with the size of the
application. Although reverse engineering may divulge the
netlist of the application, transforming a multimillion gate
netlist into an understandable design that can be modified
is problematic. The complexity of the application increases
its value, making theft attractive, but the consequent size
makes theft difficult. Regardless, researchers have periodically reported the
ability to reverse engineer unencrypted bitstreams. It
would seem imprudent to rely on the tedium of bitstream
reverse engineering to protect valuable IP.
C. Tampering
In tampering, an adversary modifies an application
design. Tampering may be employed to add logic that leaks information from an application or tampering may
disable parts of the application, potentially defeating other
security measures. For the former, tampering must control
the application to set values in the bitstream, so reverse
engineering may also be required. However, for the latter,
merely scrambling parts of the bitstream may be
sufficient.
D. Spoofing
In spoofing, an adversary replaces the FPGA bitstream
with his own. That bitstream may or may not include
components derived from cloning or reverse engineering.
A spoofed application may compromise the system in
which it operates.
E. Denial of Service, Destruction of the FPGA, and Substitution
Since it is assumed that the FPGA is in the hands of an
adversary, denial of service and malicious destruction of
the FPGA device are somewhat irrelevant. Rather than
mount a clever attack on the design to prevent the system
from operating, an adversary could simply smash the FPGA
with a hammer. Conversely, if a system requires an FPGA
containing a unique key, an adversary may choose to circumvent security measures by replacing the FPGA in a
system with another identically manufactured device from
the FPGA vendor without the key or with his own key. In
many cases, this substitution is simpler than attempting to
break the FPGA device security. Since these physical at-
tacks are so simple, FPGAs typically do not defend against
these types of threats.
IV. HISTORICAL FPGA SECURITY
Early FPGAs contained very little logic, and by inference
that logic had low value. Therefore, when they were introduced, FPGAs provided only rudimentary protection
against threats.
FPGA manufacturers did not release the coding of their
bitstreams, though they did release a considerable amount
of information about the bitstream in tools and documen-
tation [50]. They considered the task of reverse engineering
the bitstream to be more expensive than the task of re-
creating the design by black-box observation of its operation.
A. Readback
From their inception, FPGAs of all types included a
readback mechanism, whereby the program and data in the
device can be read out for test purposes. To prevent un-
authorized copy, early FPGAs followed the features of
programmable logic devices (PLDs) and included a prog-
ramming bit to disable the readback mechanism. This
method worked well for antifuse and flash-based FPGAs,
where the program could be loaded at a secure location, but SRAM FPGAs still needed to load the bitstream in the
field, while potentially in the hands of an adversary.
Preventing readback gave little protection if the bitstream
could be intercepted as it was loaded into the FPGA. For
this reason, antifuse FPGAs, which did not expose the
programming in the system, gained an early reputation for
being a more secure FPGA technology.
It is important to note that the readback function has been, and continues to be, a valuable feature for both the FPGA
manufacturer and the user. Whether the manufacturer
uses it for device test, or the user employs readback for in-
system data integrity checks, it is a feature, much like
JTAG, that is useful but needs to be adequately protected to
avoid vulnerabilities.
Readback continues to be a concern, and as late as
2012, Skorobogatov and Woods [38] discovered a keyed back-door/test mechanism that enabled the readback fea-
ture of a Microsemi antifuse FPGA that was assumed to be
protected by the FuseLock protection mechanism [27].
B. Early Bitstream Protections for SRAM FPGAs
Before bitstream encryption, two methods were used
to protect SRAM bitstreams. The first method was to load
the FPGA at a secure location and use a battery to hold
the configuration bitstream for the entire lifetime of the
fielded system [3]. Since programmable logic devices had
privacy settings to prevent readback of the program, and since the bitstream was never exposed outside the device,
this method assured that the bitstream running inside the
FPGA is both secure and unmodified. This is precisely the
same level of security achieved by antifuse and other
nonvolatile FPGAs. The drawback of this method is, of
course, the requirement that the system be powered
continually. As FPGAs grew larger and more complex, this
solution became impractical due to increased standby power requirements.
The second solution was to use an external memory
with a unique identifier, and customize the FPGA program
to require that identifier, essentially tying the FPGA appli-
cation bitstream to a unique board-level identifier. An im-
provement to the simple test of the board-level identifier
uses an external keyed device that is queried with a ran-
dom number generated by the FPGA [6]. This solution defeats simple cloning because the bitstream only func-
tions correctly in the system with the correct external
identifier device. However, the bitstream for each system
is unique, which complicates the manufacturing process.
Further, the design can be copied by an adversary who
reverse engineers the bitstream, identifies the check logic,
and rebuilds the application with the check removed.
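The challenge-response check described above can be sketched as follows. This is an illustration only: the text does not say which keyed function the external device uses, so HMAC-SHA-256 from the Python standard library is assumed, and the names KeyedIdentifierDevice and fpga_check are hypothetical.

```python
import hashlib
import hmac
import secrets


class KeyedIdentifierDevice:
    """Hypothetical external device that holds a secret and never reveals it."""

    def __init__(self, secret: bytes) -> None:
        self._secret = secret

    def respond(self, challenge: bytes) -> bytes:
        # Keyed response to the challenge (HMAC-SHA-256 is an assumption).
        return hmac.new(self._secret, challenge, hashlib.sha256).digest()


def fpga_check(device: KeyedIdentifierDevice, embedded_secret: bytes) -> bool:
    """Check logic compiled into the FPGA application (illustrative only)."""
    # A fresh random challenge means a recorded response cannot be replayed.
    challenge = secrets.token_bytes(16)
    expected = hmac.new(embedded_secret, challenge, hashlib.sha256).digest()
    return hmac.compare_digest(device.respond(challenge), expected)
```

In this sketch the secret is embedded in the application itself, which mirrors the weakness noted above: an adversary who reverse engineers the bitstream can locate the check and remove it.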
This solution increases the difficulty and cost of copying the design but still relies on the difficulty of reverse
engineering the bitstream as the basis of security. This
solution was considered strong enough for many commer-
cial applications, and there is no evidence that anyone
mounted a successful attack on a device protected with it.
However, reliance on the tedium and complexity of bit-
stream reverse engineering seemed risky [49].
C. Modern FPGA Security
As FPGAs grew in capacity, the applications grew in
value, driving the need for stronger security. Over the
years, FPGA vendors have implemented circuitry, soft-
ware, IP cores, and usage models to address security
threats. Since the FPGA application design is embodied in
a design file, aspects of information security, notably encryption
and authentication, were applied to FPGA bitstreams. But that was not enough. Given that FPGAs were
deployed into a hostile environment, measures were taken
also to improve protocols and implementations to secure
designs in the field. These include not only cryptography
on the configuration files but also development of fault-
tolerant design methodologies for the base array and for
applications. Today, FPGA security is strong enough that
FPGAs are deployed in security-sensitive applications in commercial and government systems [24].
V. INFORMATION ASSURANCE
The basic tenets of information assurance (IA) are:
confidentiality, integrity, availability, authentication, and
nonrepudiation. As mentioned earlier, since access to the
FPGA is assumed, availability is not a requirement addressed by FPGA security features. Nonrepudiation will be
addressed in the context of authentication. Therefore, we
focus on confidentiality, integrity, and authentication.
A. Confidentiality
Large FPGA designs can contain IP of significant value,
and bitstream encryption prevents a competitor from
simply copying that IP. Encryption can also provide trust assurance by limiting access to the FPGA only to designs
constructed with the proper key.
1) Overview of Bitstream Encryption: Xilinx (San Jose, CA, USA) introduced bitstream encryption in 2001 in Virtex-II
devices [40], [41] to address the problem of unauthorized
copy of the bitstream as it is loaded into the FPGA from
external memory. Since that time, other FPGA vendors have added encrypted-bitstream capability.
Preventing unauthorized copy does not strictly
require encryption, since the task from a cryptographic
point of view is to determine if the bitstream is author-
ized to operate in the FPGA. This fundamentally requires
authentication, not confidentiality: a device could verify
a message authentication code on the bitstream. However,
the adversary’s workaround is simple: reverse engineer the bitstream, recompile, and load it into a new
FPGA with the authentication removed. Therefore, re-
verse engineering must also be prevented, so confi-
dentiality of the bitstream becomes a requirement for
preventing cloning.
Virtex-II FPGAs used triple-Data Encryption Standard
(DES) encryption and subsequent Xilinx FPGAs use 256-b
Advanced Encryption Standard (AES). Recent SRAM devices from Altera Corporation (San Jose, CA, USA) [4]
and Flash devices from Microsemi Corporation (Aliso
Viejo, CA, USA) [28] also use 256 b AES. Lattice
Semiconductor (Hillsboro, OR, USA) devices use 128 b
AES [19]. Although features have changed over the years,
and details vary among vendors, the basics of FPGA
bitstream encryption for all SRAM and Flash FPGAs are
similar. The major components and use flow are described here with respect to the Xilinx, Inc. (San Jose, CA, USA)
7-series FPGA.
An application developer prepares a secured FPGA
application with the same tools and processes used for any
other application. At the end of the design process, when
the bitstream is generated, Xilinx proprietary software en-
crypts the bitstream. The Xilinx software can supply a ran-
domly generated key and initialization vector or the application developer may supply those values. The Xilinx
software produces the encrypted bitstream and a key-
insertion file.
2) Key Loading: At a secure facility, the application developer uses the key-insertion file to load the decryption
key into the FPGA through the JTAG scan chain, as shown
in Fig. 3. On-chip, the key is stored in either dedicated nonvolatile or volatile memory. FPGAs supply an inde-
pendent battery-backed array for volatile storage or one-
time-programmable eFuses for nonvolatile storage or both.
Typically, the key is loaded into the FPGA in plaintext
form, which is why this must be done at a trusted facility.
Alternative strategies for key loading and key storage are
discussed in Section VI-A2.
3) Bitstream Loading: Later, in the field when the FPGA board boots, the FPGA loads its bitstream from an external memory. The FPGA begins loading an unencrypted bit-
stream. If the bitstream includes an encrypted-bitstream
indicator, the FPGA starts the decryptor and decrypts the
remainder of the bitstream as it loads. If the encrypted-
bitstream indicator is not present, the FPGA bypasses the
decryptor. This feature allows an FPGA in the field to be
booted with either an encrypted bitstream or an unencrypted
bitstream for test purposes without compromising the confidentiality of the bitstream. In addition,
most FPGAs now offer the ability to force the device to
always configure with an encrypted bitstream.
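The loading decision described above can be summarized in a few lines. This is a sketch with hypothetical names (ConfigPath, select_config_path, force_encrypted); it assumes that a device forced to encrypted-only configuration refuses a bitstream that lacks the encrypted-bitstream indicator, which the text implies but does not spell out.

```python
from enum import Enum


class ConfigPath(Enum):
    DECRYPT = "decrypt the remainder of the bitstream"
    BYPASS = "load the bitstream without the decryptor"
    REJECT = "refuse to configure"


def select_config_path(has_encrypted_indicator: bool,
                       force_encrypted: bool = False) -> ConfigPath:
    """Choose how the configuration logic treats an incoming bitstream."""
    if has_encrypted_indicator:
        return ConfigPath.DECRYPT
    if force_encrypted:
        # Assumed behavior when the device is set to encrypted-only mode.
        return ConfigPath.REJECT
    return ConfigPath.BYPASS
```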
B. Data Integrity
Bitstream data integrity, the ability to ensure a design
has not been accidentally modified, was a feature of very
early FPGAs. In those early devices, an improperly prog-
rammed FPGA might enable two large internal drivers in
contention, generating excessive heat and current, dam-
aging the chip. To prevent this, data integrity checks were
added to FPGA bitstreams to detect corruption of the bitstream during loading. Cyclic redundancy check (CRC), a
common data integrity check in data transmission proto-
cols, was deployed in many FPGAs. While CRC is effective
in detecting accidental data corruption, it is ineffective
against intentional data modification.
1) Tampering With Encrypted Bitstreams: Xilinx FPGAs use 256 b AES encryption [11] in cipher block chaining (CBC) mode of operation [33] to produce a stream cipher.
In CBC encryption, each block of data is first xored with
the ciphertext of the previous encryption before being en-
crypted. In decryption, the decrypted plaintext of each
block is xored with the ciphertext of the previous block
(Fig. 4). CBC causes blocks with identical plaintext (for
example, all zero) to encrypt to different ciphertext,
thereby eliminating a dictionary attack on the data. Altera
devices use AES in counter mode (CTR) [30]. In CTR mode, an encryptor encrypts the output of a counter to
generate a pseudorandom stream of bits. That pseudo-
random stream is xored with the plaintext to generate
ciphertext. On decryption, an encryptor generates the
same pseudorandom stream to recover the plaintext.
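A minimal sketch of counter mode as just described: a counter is "encrypted" to produce a pseudorandom stream, which is xored with the data. Python's standard library has no AES, so SHA-256 over key, nonce, and counter is used here as an assumed stand-in for the block encryption; toy_keystream and ctr_apply are hypothetical names, and this is not any vendor's implementation.

```python
import hashlib


def toy_keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Pseudorandom stream built from successive counter values.

    SHA-256(key || nonce || counter) stands in for AES encryption of the
    counter; this substitution is an assumption made for illustration.
    """
    stream = bytearray()
    counter = 0
    while len(stream) < length:
        stream += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(stream[:length])


def ctr_apply(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """XOR data with the keystream; the same call encrypts and decrypts."""
    stream = toy_keystream(key, nonce, len(data))
    return bytes(d ^ s for d, s in zip(data, stream))
```

Because encryption and decryption are the same xor, the receiver only needs the encrypting direction of the cipher, which matches the description of an encryptor being used on decryption.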
CBC and CTR are non-error-extension modes of ope-
ration, meaning that corruption of the encrypted data
causes only a localized corruption of the corresponding plaintext. Therefore, both CBC and CTR permit a ‘‘bit-
flipping’’ attack on the plaintext. The attack is shown in
Fig. 4 with respect to CBC. If an adversary inverts a bit in
the first encrypted block, as shown by the shaded area, the
first block will decrypt to unintelligible nonsense. How-
ever, the corresponding plaintext bit in the next decrypted
block is inverted. Bit-flipping CTR mode is more straight-
forward, since a bit flip anywhere in the ciphertext inverts the corresponding bit in the decrypted plaintext without
disrupting any other data.
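The CBC bit-flipping effect can be reproduced with a short experiment. The sketch below assumes a toy 16-byte Feistel block cipher built from SHA-256 in place of AES (the standard library has no AES); all function names are hypothetical. Only the chaining logic follows the CBC description above.

```python
import hashlib

BLOCK = 16  # block size in bytes, matching AES


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def _round(key: bytes, i: int, half: bytes) -> bytes:
    # Feistel round function; SHA-256 is an assumed stand-in primitive.
    return hashlib.sha256(key + bytes([i]) + half).digest()[:BLOCK // 2]


def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:8], block[8:]
    for i in range(4):
        left, right = right, _xor(left, _round(key, i, right))
    return left + right


def toy_decrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:8], block[8:]
    for i in reversed(range(4)):
        left, right = _xor(right, _round(key, i, left)), left
    return left + right


def cbc_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    if len(plaintext) % BLOCK:
        raise ValueError("plaintext must be a whole number of blocks")
    prev, out = iv, bytearray()
    for i in range(0, len(plaintext), BLOCK):
        # Each block is xored with the previous ciphertext, then encrypted.
        prev = toy_encrypt_block(key, _xor(plaintext[i:i + BLOCK], prev))
        out += prev
    return bytes(out)


def cbc_decrypt(key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    prev, out = iv, bytearray()
    for i in range(0, len(ciphertext), BLOCK):
        block = ciphertext[i:i + BLOCK]
        # Each decrypted block is xored with the previous ciphertext block.
        out += _xor(toy_decrypt_block(key, block), prev)
        prev = block
    return bytes(out)


def flip_bit(data: bytes, byte_index: int, bit: int) -> bytes:
    tampered = bytearray(data)
    tampered[byte_index] ^= 1 << bit
    return bytes(tampered)
```

Flipping one bit in the first ciphertext block garbles the first decrypted block, inverts exactly the same bit position in the second block, and leaves later blocks untouched.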
Using this bit-flipping technique, an adversary can
selectively invert any number of bits in the decrypted bit-
stream. If the location and state of the target bit are
known, an adversary can set it. For example, if the logic to
enable bitstream readback is disabled with a ‘‘0’’ at a
specific location, an adversary could reenable bitstream readback without knowing the contents of the bitstream
by inverting that one bit. For this reason, disabling of
readback of an encrypted FPGA bitstream is not controlled
by bits in the encrypted bitstream itself, but is instead
controlled by the configuration logic of the FPGA. When
the FPGA loads an encrypted bitstream, readback is dis-
abled regardless of the bitstream contents. However, other
attacks may attempt to modify the FPGA in a simple way: enable the internal configuration access port (ICAP), ena-
ble input/output (I/O) blocks, or change clock speed in an
attempt to gain access to internal data.
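To illustrate how an adversary who knows the location and current state of a bit can set it without the key, the sketch below uses a toy counter-mode cipher (SHA-256 as an assumed stand-in for AES). The control bit at position 13 is purely hypothetical, and toy_ctr and force_bit are illustrative names.

```python
import hashlib


def toy_ctr(key: bytes, data: bytes) -> bytes:
    """Toy counter-mode cipher; the same call encrypts and decrypts."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        out += bytes(d ^ p for d, p in zip(data[offset:offset + 32], pad))
    return bytes(out)


def force_bit(ciphertext: bytes, bit_index: int,
              known_value: int, wanted_value: int) -> bytes:
    """Set a plaintext bit by editing ciphertext only (no key required).

    bit_index counts from the most significant bit of byte 0.
    """
    tampered = bytearray(ciphertext)
    if known_value != wanted_value:
        tampered[bit_index // 8] ^= 0x80 >> (bit_index % 8)
    return bytes(tampered)
```

The attacker never calls toy_ctr with the key; only the device does when it decrypts the tampered stream. This is why the text notes that readback protection is handled by the configuration logic rather than by a bit inside the encrypted bitstream.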
Fig. 3. Encryption architecture for Xilinx 7-series FPGAs.
Attacks on the bitstream are also possible without
knowing the specific bit to attack. In Fig. 4, the first block
of data is scrambled. An adversary does not know the
plaintext that results from modifying the ciphertext. If the
number of bits to control is small enough, an attacker with patience may attempt a brute-force attack on part of the
bitstream. Scrambling the bits may program the FPGA to
perform a function it was not supposed to, such as leaking
sensitive information.
Checksums and CRCs on the FPGA bitstream detect
errors in transmission, corrupted bitstreams, and uninten-
tionally flipped bits. However, it is computationally
straightforward to compute a revised CRC after tampering with the bitstream or to determine a set of bit flips that
produce the same CRC value. Further, a CRC is typically
only 16 or 32 b, so brute-force attacks on the CRC are
tractable. Finally, in some FPGA architectures, CRCs can
be disabled altogether. Simple data integrity checks are not
sufficient to ensure that a bitstream has not been intentionally
tampered with.
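The ease of revising a CRC after tampering follows from its linearity. The sketch below uses CRC-32 from Python's zlib as an assumed example (the CRC polynomial and width in a given FPGA family may differ); crc_after_tamper is a hypothetical helper that computes the CRC of the modified message from the old CRC and the bit flips alone, without seeing the message.

```python
import zlib


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def crc_after_tamper(old_crc: int, delta: bytes) -> int:
    """CRC-32 of (message XOR delta), given only the old CRC and delta.

    CRC-32 is affine over GF(2): for equal-length inputs,
    crc(a ^ b) == crc(a) ^ crc(b) ^ crc(all zeros).
    """
    zeros = bytes(len(delta))
    return old_crc ^ zlib.crc32(delta) ^ zlib.crc32(zeros)
```

An adversary who flips chosen bits can therefore patch the stored CRC to match, which is the property that rules out CRC as a defense against intentional modification.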
C. Authentication
Communication of the bitstream to the FPGA is a one-
way transfer. Therefore, two-way entity authentication
cannot be performed. Instead, FPGAs rely on one-way
message authentication, which assures the recipient of a
message that the message is exactly the message the sender intended [8]. Strong authentication requires a message
authentication code (MAC), a cryptographic hash function
computed over the entire message. The hash function must
be impossible to compute without knowing the plaintext of
the message. The difficulty of recomputation of the MAC
eliminates all forms of CRC as the hash function, since each
bit of the CRC is a known xor of a set of bits of the message.
Because authentication verifies that the application has
not been accidentally or intentionally altered, it assures
trust in the running application. That trust enables an ap-
plication developer to guarantee protection of crypto-
graphic services and the handling of sensitive data. These sensitive data may be customer data of high value, such as
personal data in a database or copyrighted video. The
cryptographic services may include key management
functions, encryption/decryption algorithms, or keys for
further partial reconfiguration of the FPGA. Data authen-
tication provides a strong root of trust, allowing an initial
FPGA configuration to act as a trusted boot loader for
trusted subsequent configuration of the FPGA. Xilinx integrated strong data authentication in Virtex-6
devices and 7-series to address the concerns of targeted
tampering with encrypted bitstreams and the inherent
cryptographic weaknesses of CRC. Microsemi also has a
dedicated data integrity check for all of the nonvolatile
configuration memory segments of some Flash devices
[28]. Authentication is described here as it is implemented
in Xilinx devices.
1) Data Authentication in Xilinx Virtex Devices: Virtex-6 and subsequent Xilinx FPGAs authenticate using the
secure hash algorithm (SHA-256) to compute a 256-b
keyed hashed MAC (HMAC) [12], [13], [42]. SHA-256 is
a one-way hashing algorithm with a compact hardware
implementation. The keyed HMAC requires a secret au-
thentication key included in the hash. The MAC result cannot be computed without knowing the key, thereby
authenticating the identity of the sender as well as veri-
fying that the message has not been altered. The 256-b
hash size ensures that any tampering with the bitstream
will be detected with a high probability. HMAC with
SHA-256 makes tampering with the bitstream as computationally difficult as guessing the encryption key, which is
also 256 b.
Fig. 4. Bit-flipping attack on CBC mode.
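A keyed HMAC with SHA-256 can be computed with Python's standard hmac module. The sketch below shows only the generic primitive, not the on-chip implementation; bitstream_mac and verify_bitstream are hypothetical names, and the byte strings are placeholders.

```python
import hashlib
import hmac


def bitstream_mac(auth_key: bytes, bitstream: bytes) -> bytes:
    """256-bit keyed MAC over the whole bitstream (HMAC-SHA-256)."""
    return hmac.new(auth_key, bitstream, hashlib.sha256).digest()


def verify_bitstream(auth_key: bytes, bitstream: bytes, mac: bytes) -> bool:
    """Recompute the MAC and compare it in constant time."""
    return hmac.compare_digest(bitstream_mac(auth_key, bitstream), mac)
```

Unlike a CRC, the MAC cannot be recomputed after tampering without the authentication key, so a modified bitstream or a wrong key both fail verification.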
2) Integration of Authentication With Bitstream Encryption: Virtex devices use generic composition of the SHA-256 keyed HMAC authentication with AES-256 encryption
[34], [37]. Generic composition allowed the two parts to be
separated, which permitted them to be developed independently and separately pipelined.
Virtex-6 and 7-series authentication and encryption are
composed using authentication then encryption (AtE). The
HMAC is computed on the plaintext, unencrypted bit-
stream. The configuration data and the MAC result value
are then encrypted. On the FPGA, the data are first de-
crypted and the MAC result is recomputed on the de-
crypted data and compared with the transmitted value in the bitstream. If the two MAC values disagree, the FPGA
configuration fails and the FPGA does not become active.
The authentication check catches errors in transmission
and attempts to configure the FPGA with the incorrect key
value as well as intentional tampering.
3) The Authentication Key: HMAC requires a secret authentication key in addition to the decryption key [12]. When generating an authenticated encrypted bitstream,
both keys are specified to the bitstream generation soft-
ware. To save nonvolatile storage space, only the decryp-
tion key is stored in the FPGA array. Because of the AtE
composition, the encrypted authentication key can be
transmitted with the bitstream. The bitstream encryption
provides the privacy to keep the authentication key secret.
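The AtE composition, with the authentication key carried inside the encrypted payload, can be sketched as follows. The cipher is a toy xor stream derived from SHA-256 and is an assumed stand-in for AES-256; the payload layout (authentication key, then configuration data, then MAC) and the function names are hypothetical and do not reproduce the actual bitstream format.

```python
import hashlib
import hmac
from typing import Optional

KEY_LEN = 32  # bytes in the authentication key
MAC_LEN = 32  # bytes in an HMAC-SHA-256 result


def _toy_cipher(enc_key: bytes, iv: bytes, data: bytes) -> bytes:
    """Toy xor stream cipher (assumption); the same call decrypts."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(enc_key + iv + offset.to_bytes(8, "big")).digest()
        out += bytes(d ^ p for d, p in zip(data[offset:offset + 32], pad))
    return bytes(out)


def build_bitstream(enc_key: bytes, auth_key: bytes,
                    iv: bytes, config: bytes) -> bytes:
    """Authenticate the plaintext first, then encrypt data, key, and MAC."""
    if len(auth_key) != KEY_LEN:
        raise ValueError("authentication key must be 32 bytes")
    mac = hmac.new(auth_key, config, hashlib.sha256).digest()
    return _toy_cipher(enc_key, iv, auth_key + config + mac)


def load_bitstream(enc_key: bytes, iv: bytes, blob: bytes) -> Optional[bytes]:
    """Decrypt, recompute the MAC, and return the data only if it matches."""
    plain = _toy_cipher(enc_key, iv, blob)
    auth_key = plain[:KEY_LEN]
    config = plain[KEY_LEN:-MAC_LEN]
    mac = plain[-MAC_LEN:]
    expected = hmac.new(auth_key, config, hashlib.sha256).digest()
    return config if hmac.compare_digest(mac, expected) else None
```

Only the decryption key has to be stored on the device in this arrangement. A wrong decryption key or a flipped ciphertext bit both produce a MAC mismatch, so configuration fails.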
4) Authentication Using Public Key Cryptography in FPGAs/SoCs: The recent introduction of programmable systems on chip (SoCs) from FPGA manufacturers, including the
Xilinx Zynq and Microsemi SmartFusion2 devices, has
brought public key cryptography to the programmable
logic market. Both of these devices use asymmetric crypto-
graphy to provide authentication during the secure boot
process. The public key is stored on-chip in nonvolatile memory and its integrity checked before use. Public-key
authentication of configuration files such as a first stage
boot loader (FSBL) detects random-data attacks such as
those commonly used for side-channel attacks. It can also
serve to provide nonrepudiation of protected applications.
D. Bitstream Structure
Fig. 5 compares bitstream structures of representative
Xilinx FPGA families, each with different security capa-
bilities [42]. Fig. 5(a) shows the bitstream format of an
unencrypted bitstream for Virtex devices. The unen-
crypted bitstream structure starts with a synchronization
word (SYNC) followed by a sequence of instructions.
Header commands set registers and control a variety of
functions, including declaring the device type and setting
up the startup sequence. The available commands and
registers are described in the Configuration User Guides
for each device family [50], [51]. The Write Frame Data
Register Immediate (Write FDRI) command begins streaming the configuration data to the FPGA’s configu-
ration memory. An unencrypted bitstream can contain any
number of Write FDRI commands, each writing a differ-
ent, possibly discontinuous, portion of the FPGA config-
uration memory. Footer commands allow setting of
register values after loading configuration data. CRC veri-
fies data integrity and STARTUP begins the FPGA startup
sequence. DESYNC prepares the configuration logic to accept postconfiguration reconfiguration commands.
Virtex-II through Virtex-5 FPGAs allowed encryption of
the FPGA configuration data, but not authentication. As a
representative of those encrypted-only bitstreams, Fig. 5(b)
shows a Virtex-5 encrypted bitstream structure. The CTL
instruction informs the FPGA that this is an encrypted bit-
stream. If the CTL command is missing, the FPGA assumes
the bitstream is unencrypted. CBC IV is the initialization vector for the AES CBC register. The CBC IV does not need
to be secret, and it is evident in the bitstream structure that
it is set with an unencrypted header command. The Write
FDRI command passes encrypted configuration data
through the decryptor. The Write FDRI command includes
a length field, also transmitted unencrypted, so the de-
cryptor decrypts the proper amount of data. Only the
configuration data are encrypted, although the CRC is computed on all data that precede it in the bitstream.
Fig. 5. Xilinx bitstream structure. (a) Unencrypted. (b) Virtex-5. Shaded area is encrypted. (c) Virtex-6/7-series. Shaded area is
authenticated and encrypted.
Fig. 5(c) shows an authenticated encrypted bitstream
from Virtex-6 and 7-series devices. Authentication and
encryption are always used together. There is no way to
specify a bitstream that is only encrypted or only
authenticated for these devices. As in earlier devices, the CTL instruction informs the FPGA that the bitstream has
security enabled. The CBC initialization vector initializes
the decryptor as before. Decrypt word count (DWC)
indicates to the FPGA the amount of secure data to follow.
DWC includes not only the configuration data but header
and footer commands as well. Header commands and
footer commands are encrypted and covered by authenti-
cation. DWC is transmitted in the clear and could be modified by an adversary, but since the length of the data is
included in the MAC computation, a modification of DWC
will invalidate the computed MAC.
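The protection of the cleartext DWC can be illustrated by binding the length into the MAC input. The sketch assumes 32-bit words and a simple layout in which the length is prepended to the data before HMAC-SHA-256; the real MAC input format is not given in the text, and the names below are hypothetical.

```python
import hashlib
import hmac

WORD_BYTES = 4  # assumed 32-bit configuration words


def mac_with_length(auth_key: bytes, dwc: int, secure_data: bytes) -> bytes:
    """MAC over the word count followed by the data it describes."""
    message = dwc.to_bytes(4, "big") + secure_data
    return hmac.new(auth_key, message, hashlib.sha256).digest()


def device_accepts(auth_key: bytes, dwc: int, stream: bytes, mac: bytes) -> bool:
    """Read dwc words from the stream and check them against the MAC."""
    secure_data = stream[:dwc * WORD_BYTES]
    return hmac.compare_digest(mac_with_length(auth_key, dwc, secure_data), mac)
```

Because the count is part of the authenticated message, shortening or lengthening DWC changes the recomputed MAC and the configuration is rejected.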
The authentication key is transmitted to the FPGA at
the start of configuration and again at the end of confi-
guration, since the key is used twice in the HMAC com-
putation. ALIGN is a variable number of no-operation
instructions, inserted to ensure the authenticated encrypted data are an even multiple of 512 b, simplifying the
MAC computation. At the end of the bitstream, the re-
quired MAC is transmitted to the FPGA where it is com-
pared with the MAC computed by the FPGA.
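The amount of ALIGN padding is simple modular arithmetic. SHA-256 processes its input in 512-bit blocks, which is consistent with the 512 b figure above. The sketch assumes 32-bit no-operation words; align_noops is a hypothetical helper name.

```python
WORD_BITS = 32    # assumed width of one no-operation instruction
BLOCK_BITS = 512  # SHA-256 input block size


def align_noops(secure_words: int) -> int:
    """No-op words needed to pad the secure data to a 512-bit multiple."""
    bits = secure_words * WORD_BITS
    return ((-bits) % BLOCK_BITS) // WORD_BITS
```

For example, 17 words of secure data need 15 no-operation words to reach two full 512-bit blocks, while 16 words need none.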
Confidentiality, data integrity, and data authentication
of the configuration data are all required to protect FPGA
configuration data that are exposed to potential adversar-
ies. To date, only a few devices available from Xilinx and Microsemi provide all three protections on their config-
uration files.
VI. ANTI-TAMPER
Physical security of the FPGA is just as critical as the ap-
plication of confidentiality, integrity, and authentication to
the device configuration. While there are focus areas of AT that overlap with IA, there are also aspects of AT that are
unique. FPGA manufacturers are faced with a number of
challenges while focusing on improving the physical secu-
rity of the device. As commercial products, some of the
challenges include, but are not limited to:
• FPGAs are readily available for adversaries to experiment on;
• compliance with U.S. and worldwide export and import restrictions: manufacturers must be able to sell their product worldwide, and do so while meeting all import/export laws;
• FPGAs are cost sensitive, requiring a careful balance between protecting customers’ IP and enabling FPGA use in all types of systems.
There has been significant investment by FPGA manu-
facturers to enhance the physical security of their devices, driven primarily by the continual growth in performance,
density, and capabilities. This puts FPGAs at the heart of
most electronic systems today, where customers’ IP must
be protected.
This section describes some of the primary security
features and protocols of the Xilinx 7-series FPGA. These
are explored by looking at the configuration lifecycle of the
device. AT protections are employed preconfiguration, during configuration, and postconfiguration.
A. Preconfiguration
1) Defense Against Trojan Insertion: FPGAs can be configured with either encrypted or unencrypted bitstreams. This
is useful for application developers who may not want to
use encryption during integration and test, but then enable encryption when the system is fielded. However, the ability to
accept either kind of bitstream exposes the device
to a class of Trojan insertion attacks.
If an FPGA contains a decrypted bitstream, an adver-
sary may attempt to load a partial configuration into a
subset of the device that spies on the resident application.
It could connect to internal signals or memories. By con-
necting to internal components, the Trojan could be used to deduce the secured application in the FPGA. Con-
versely, an adversary may operate the same attack by pre-
loading a Trojan design and interrupting the secure loading
of the protected application.
Consequently, Xilinx FPGAs do not permit mixing
encrypted and unencrypted bitstreams, or partial bit-
streams, in any order. A new configuration of the device
requires fully clearing the existing device, either by cycling power or executing the JTAG JPROGRAM command.
Both methods initiate internal device housekeeping,
which clears all configuration and internal memory.
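The no-mixing rule can be modeled as a small state machine. The following Python sketch is a toy model under stated assumptions, not vendor logic: the class names, the `partial` flag, and the way bitstreams are represented by name are all hypothetical.

```python
from enum import Enum


class Mode(Enum):
    EMPTY = "empty"
    OPEN = "unencrypted"
    SECURE = "encrypted"


class ConfigPolicyError(Exception):
    """Raised when a load would violate the configuration policy."""


class ConfigController:
    """Toy model: one security mode per configuration session."""

    def __init__(self) -> None:
        self.mode = Mode.EMPTY
        self.loaded = []  # names of bitstreams currently resident

    def clear(self) -> None:
        """Model a power cycle or JPROGRAM: wipe configuration and memory."""
        self.mode = Mode.EMPTY
        self.loaded = []

    def load(self, name: str, encrypted: bool, partial: bool = False) -> None:
        wanted = Mode.SECURE if encrypted else Mode.OPEN
        if self.mode is Mode.EMPTY:
            self.mode = wanted  # the first load fixes the mode
        elif not partial:
            raise ConfigPolicyError("full reconfiguration requires clear() first")
        elif wanted is not self.mode:
            raise ConfigPolicyError("cannot mix encrypted and unencrypted bitstreams")
        self.loaded.append(name)
```

In this model, both attack orders from the text are rejected: an unencrypted partial bitstream cannot follow an encrypted application, and an encrypted application cannot be loaded on top of a preloaded unencrypted design without clearing the device.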
Similar concerns exist today with SoCs being introduced
by the FPGA manufacturers. Xilinx, Altera, and
Microsemi are now offering processor-centric SoC devices
that typically have separate and independent regions for
the processor subsystem and the programmable logic. The independence provides users flexibility and the ability to
significantly reduce power by turning off the programmable
logic. This capability presents a vulnerability. If an adversary
can preload a Trojan, either into the processor
memory or the programmable logic before allowing the
device to boot normally, then the Trojan will have access to
the entire internal application running on the device. As
with Xilinx FPGAs, Xilinx SoCs have been designed to address this security concern. The Xilinx Zynq device boots
both the processor and the programmable fabric from the
same root of trust, either fully secured, or fully open.
The Trojan insertion vulnerability also exists postconfi-
guration. Xilinx and Altera FPGA families permit partial
reconfiguration, the ability to change the configuration of a
section of the FPGA while the rest operates normally. This
feature has proven to be very valuable for innovative applications. However, it also is susceptible to a Trojan
insertion attack after the initial configuration. The appli-
cation design must authenticate postconfiguration bit-
streams to exclude Trojans.
2) Protecting Keys: The secrecy of a cryptographic key is fundamental to security; protecting the key is the first
priority of the FPGA manufacturer. As stated earlier, this can be a challenge for commercial vendors who are de-
veloping a device that is used in nearly all types of ap-
plications, and not specifically designed for a specific
domain, application, or cost point. Also important to note
is the fact that while we address the protection of keys here
in the preconfiguration section of this paper, protection of
keys is essential before, during, and after configuration.
a) Key storage - Technology: Most FPGA manufacturers provide both volatile and nonvolatile key storage. In
the case of SRAM FPGAs, volatile key storage is imple-
mented as battery-backed RAM (BBRAM) and nonvolatile
key storage is implemented as eFuses. Each has advantages
and disadvantages.
BBRAM key storage requires no process changes, mak-
ing it easily implemented in state-of-the-art process tech-
nology. The volatile key storage also allows for key agility and key zeroization, critical components of a strong cryp-
tographic system. In Xilinx FPGAs, when primary power is
applied, the BBRAM is powered by that power supply, which
not only reduces the drain on the battery but also permits
replacing the battery in a fielded system. Altera specifies that a
battery must be attached before the key is loaded, implying
that the source for the BBRAM memory is only the external
battery [4]. Xilinx also provides an internal interface that can be used by an application to command a zeroization of
the key space. Zeroization is intended for use when the
FPGA detects tampering with an operating application.
BBRAM is not without its disadvantages. A momentary
loss of contact or low battery voltage could cause the key to
be lost. While modern coin-cell batteries hold enough
energy to hold encryption keys for the design lifetime of
20 years, and new betavoltaic batteries with great reliability are being introduced to the market, many battery
vendors do not specify thermal wearout or other failure
modes for the length of time required by most FPGA
users.
BBRAM is inherently more physically secure than
nonvolatile key storage technology. To steal the key, an
adversary would need to decap the FPGA and mill away
many levels of metal, then scan the bits with a scanning electron microscope (SEM). This attack must be per-
formed while keeping clean power to the key memory.
This is the type of attack required to extract the entire
configuration directly from the FPGA SRAM cells as well,
so no bitstream encryption method is qualitatively
stronger. This attack is considered to be beyond the capa-
bilities of all but the most sophisticated of adversaries.
An eFuse provides a simple, one-time-programmable nonvolatile memory. Because they are nonvolatile, eFuses
eliminate the maintenance issues associated with a battery.
A common eFuse structure is a narrow wire that is prog-
rammed by electromigration from high programming cur-
rent. eFuses are simple to build and program, requiring no
additional process complexity or high voltage. However,
eFuses and their programming circuitry are rather large, so
eFuses are practical only for small amounts of memory, such as a decryption key. The physical change caused by
eFuse programming is visible under a microscope, so
eFuses are comparatively easy to reverse engineer from a
decapped part. Of course, they cannot be reprogrammed or
erased. However, to zeroize an eFuse key, one could burn
all eFuse cells in the key.
b) Key loading: The JTAG test port is a common interface for loading keys into programmable logic devices [4], [22], [35]. Loading a key into a Xilinx device begins by
first executing a JTAG command to enter key access mode,
which clears the existing key and all configuration data and
memory in the FPGA. A second JTAG command writes the
new key and reads it back to verify it. Of course, on power-up,
FPGA key access is disabled.
Details of key loading vary considerably among manu-
facturers. Loading of the key may be done in plaintext (‘‘red key load’’) or ciphertext (‘‘black key load’’) or other-
wise obscured. In Xilinx devices, the key is transmitted to
the FPGA in plaintext, so it must be loaded in a secure
location. The key access control sequence ensures that the
key is cleared before any command is executed that could
read it back. Other vendors have chosen alternative solu-
tions. Altera Stratix devices include a key obfuscation
mechanism so the key may be presented to the FPGA in an encrypted form. Moradi [32] reported that two 128 b keys
are used by Altera Stratix-II devices. The bitstream key is
transmitted and stored in encrypted form, encrypted by a
second key, which is presented without any obfuscation.
Although the key used to decrypt user data is not trans-
mitted or stored in the FPGA, and hence cannot be ex-
tracted, it can be computed by a straightforward algorithm
from the readable key that accompanies it. In Altera Stratix-V devices, the user key is sent through a one-way
function before being stored on the device [4]. In both
of these scenarios, the loading of the key is obscured, and
while not cryptographically sound, may provide a level of
security acceptable at a given price point.
Microsemi is the first FPGA manufacturer to offer a
true ‘‘black-key load’’: the key is encrypted by a secret key
before loading. In selected SmartFusion2 devices, the device and user exchange public keys and perform an elliptic
curve Diffie–Hellman (ECDH) exchange to generate a key
that can be used for the authenticated/encrypted loading of
a user key [28]. The generated key is used as a key encryp-
tion key (KEK) to encrypt the user key on the transmit side,
and to decrypt the user key within the SmartFusion2 device.
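The shape of such a black-key load can be sketched in Python. To stay within the standard library, the sketch substitutes classic finite-field Diffie-Hellman for ECDH and a plain XOR mask for authenticated encryption, so it illustrates only the flow (exchange public values, derive a KEK on both sides, wrap and unwrap the user key). The group parameters are toy-sized and every name is hypothetical; none of this reflects Microsemi's actual protocol.

```python
import hashlib
import secrets

# Toy group parameters, far too small for real use (assumption for illustration).
P = 2**127 - 1  # a Mersenne prime
G = 3


def keypair():
    """Return (private, public) for the toy Diffie-Hellman group."""
    private = secrets.randbelow(P - 3) + 2  # value in [2, P - 2]
    return private, pow(G, private, P)


def derive_kek(my_private: int, their_public: int) -> bytes:
    """Hash the shared secret into a 32-byte key encryption key (KEK)."""
    shared = pow(their_public, my_private, P)
    return hashlib.sha256(shared.to_bytes(16, "big")).digest()


def wrap_key(kek: bytes, user_key: bytes) -> bytes:
    """Mask a 32-byte user key with the KEK (stand-in for real encryption)."""
    if len(user_key) != len(kek):
        raise ValueError("user key must be 32 bytes in this sketch")
    return bytes(k ^ u for k, u in zip(kek, user_key))


# XOR masking is its own inverse, so unwrapping is the same operation.
unwrap_key = wrap_key
```

Only the public values and the wrapped key cross the interface; the user key never appears in plaintext outside the two endpoints.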
c) Key storage - red or black?: Much like key loading, the device key can be stored in plaintext, ciphertext, or obfuscated form. Xilinx stores 7-series keys in plaintext
form. An adversary who decaps the part and can identify
the key storage cells can attempt to extract the actual key
bits. An obfuscated key defeats this attack until the obfus-
cation method is discovered. Altera implemented a key
obfuscation algorithm in Stratix devices, so that probing
the device could not divulge the key directly. When the
obfuscation algorithm was revealed, obfuscation was no longer a barrier to invasive key extraction [32].
Selected SmartFusion2 devices from Microsemi make
use of Intrinsic-ID’s (Eindhoven, The Netherlands)
Quiddikey technology [17]. This technology does not store
an encryption key on-chip. Instead, it generates the key
when needed through the use of an activation code gene-
rated during an enrollment phase, and the output of
Intrinsic-ID’s SRAM-based physically uncloneable function (PUF) [17], [29].
d) Eliminating keys: When an unauthorized event occurs, the application may need to eliminate sensitive
keys within the device. For systems that employ BBRAM
key storage, there are multiple options. First, passive era-
sure can be accomplished by simply electrically discon-
necting the battery from the supply. Second, for Xilinx
devices, an external device could send the appropriate JTAG command to enter key access mode. As mentioned
earlier, this actively clears the device key and the configu-
ration of the device. Finally, most FPGA vendors have the
ability to erase the key from within the device under con-
trol of the application [4], [29], [38]. Xilinx and Microsemi
offer the ability to fully zeroize the device key, actively
erase the key, and then verify that it indeed has been
erased, either through readback or dedicated hardware.
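The erase-then-verify pattern can be sketched as follows. This is a software stand-in only: a real BBRAM is a hardware memory, and the class name, key length, and tamper hook here are assumptions for illustration.

```python
class KeyStore:
    """Toy volatile key memory modeled as a bytearray."""

    KEY_BYTES = 32

    def __init__(self) -> None:
        self._cells = bytearray(self.KEY_BYTES)
        self.loaded = False

    def load(self, key: bytes) -> None:
        if len(key) != self.KEY_BYTES:
            raise ValueError("unexpected key length")
        self._cells[:] = key
        self.loaded = True

    def is_erased(self) -> bool:
        """Read back every cell and confirm that none holds key material."""
        return not any(self._cells)

    def zeroize(self) -> bool:
        """Actively overwrite every key cell, then verify the erase."""
        for i in range(len(self._cells)):
            self._cells[i] = 0
        self.loaded = False
        return self.is_erased()


def on_tamper_event(store: KeyStore) -> bool:
    """Hypothetical application response to an unauthorized event."""
    return store.zeroize()
```

Returning the result of the readback check lets the application distinguish a confirmed erase from an attempted one.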
3) Antispoofing: When they are manufactured, FPGAs can accept either an unencrypted bitstream or an en-
crypted bitstream. All programmable logic vendors that
provide encrypted bitstreams have the ability to modify the
FPGA to require an encrypted configuration. This modi-
fication involves programming a nonvolatile eFuse register
that disables unencrypted configuration. An adversary cannot substitute an alternative bitstream in the device or
change the key. Instead, that adversary must replace the
FPGA with another equivalent device. This solution pro-
vides no value with a BBRAM volatile key, as the adversary
only needs to remove the battery to clear the key, then load
a new key into the device to gain access. Of course, an
adversary can circumvent the antispoofing by replacing the
protected FPGA with a new, unprogrammed one. Nonetheless, antispoofing the device is a cost-effective component
of overall system security.
4) Test Circuitry: Because it provides access to and control of internal nodes, test circuitry has long been a
primary point of security vulnerability in integrated cir-
cuits, and must be disabled for a secure application to
indeed be secure. While protection of test circuitry is discussed in this preconfiguration section, it must be considered
during configuration and postconfiguration as well.
Test interfaces can be disabled in many ways. Pro-
prietary test interfaces are typically handled differently
than industry-standard interfaces such as JTAG. Xilinx
disables readback by setting internal security bits when an
encrypted bitstream is loaded. In the Zynq SoC, eFuses
may be used to disable test interfaces permanently [36]. Test disable is also provided in Altera devices where a
tamper-protection bit disables the test modes of the FPGA
[4]. When permanently disabling test circuitry, users must
be aware of the consequences for additional failure anal-
ysis: if the test access port has been disabled, there is very
little anyone can do to debug the device.
Microsemi and Xilinx provide mechanisms to perma-
nently disable the JTAG interface as well as monitor it internally for tamper conditions [29], [35]. Altera has the
ability to reduce the number of JTAG commands executed
to only those mandatory by the standard (e.g., Extest,
Intest, IDCODE, etc.). The execution of nonmandatory
JTAG instructions can be enabled by issuing the UNLOCK
JTAG instruction, which is only allowed to execute when
sent from within the device [4].
B. During Configuration
1) Side-Channel Attacks on Keys: In recent literature, Xilinx, Altera, and Microsemi FPGAs have been shown to
be vulnerable to differential power analysis (DPA) attacks
on their keys [30]–[32], [38]. Although noninvasive, these
published attacks employ a custom board with a significant
reduction in bypass capacitance in order to enhance the power signal. This brings up the question of the difficulty
of moving an FPGA from one board to another while
keeping the key intact. eFuse, antifuse, and Flash storage
should be unaffected, but battery backed RAM keys are lost
if, during the transfer, power is lost to the keys or if the
device temperature exceeds operating limits.
Security is always a moving target. Attacks continue to
improve, and since a custom board is not required, in principle, to mount a DPA attack, one would expect that
future side-channel attacks on FPGAs will target devices in
their native environment. Defenses improve as well. As
side-channel attacks became better understood, FPGA
vendors added countermeasures, though they are not al-
ways explicit about precisely what they have done. Micro-
semi has licensed CRI technology, but has not released
which aspects of that technology they have used. Other vendors are silent on the question of precise circuit details
to address DPA.
C. Postconfiguration
FPGAs rely on the application as an active participant
in protecting the device after configuration, a capability
somewhat novel to FPGAs [45]. FPGAs provide security-
related features, but leave the policy decisions about how to use those features to the user of the FPGA, to be implemented in the
application.
1) Readback Disable: Traditional FPGA operation allows the unencrypted bitstream and data to be read out using
the bitstream readback command. Therefore, when an
FPGA loads an encrypted bitstream, it disables the
readback mechanism, regardless of bitstream settings. This automatic, mandatory setting prevents the simple attack of
using the FPGA to decrypt the bitstream, then reading it
out. Readback continues to be a valuable feature for both
the FPGA manufacturer and the application developer.
Proper measures must be taken so that it does not jeo-
pardize the application security.
Skorobogatov and Woods [38] used a side-channel
attack to extract a key that unlocked readback in an FPGA that was advertised to have no such capability. Although this was
sensationalized as a back door, with questions raised about who inserted it
and for whom, in all practicality it was no more
than an interface used for device test.
2) Restricted Access to Base Silicon Cryptographic Logic: On-chip cryptographic functions, such as the decryptor,
are well-tested, high-speed logic designs of a standard function. It would seem efficient to allow an operating
FPGA application to use cryptographic functions after
configuration. However, user access to the decryptor, or
other cryptographic functions, permits data flow paths that
complicate the analysis of the security of the base silicon. If
the user has access to the cryptographic functions, and the
device is programmed to permit unencrypted bitstreams,
then the adversary has access to the cryptographic functions as well. The manufacturer must perform a security
analysis to verify that no key data could leak into the
application domain.
Second, there are U.S. export and various national
import laws worldwide that add risk to the manufacturer if
cryptographic functions are used for more than just the
configuration of the device. Third, most cryptographic
functions, such as AES decryptors, are simply not very large and can be implemented in the user application without
consuming much of the FPGA logic. Finally, users have a
wide range of needs for cryptographic services. This be-
comes a cost/benefit tradeoff for the manufacturer. Xilinx
and Altera do not allow access to the cryptographic
functions on the FPGAs. Microsemi allows access to the
cryptographic functions on selected models of the
SmartFusion2 devices [28].
3) Restricted Access to Base Silicon Features: Concern over tampered bitstreams in early Virtex devices led Xilinx
to prohibit reconfiguration of encrypted bitstreams. This
restriction applied to the internal configuration access port
(ICAP) as well as the external configuration port. The
concern was that a bitstream might be tampered to enable
access to ICAP, which could then be used to read back the decrypted configuration. Virtex-II through Virtex-5 de-
vices required encrypted bitstreams to pass a CRC to begin
operating, thus ensuring the integrity of the bitstream
data. However, as described earlier, CRC does not give a
strong defense against bitstream tampering.
Since the addition of authentication in Virtex-6 and
7-series, a secured bitstream must pass the authentication
check, defeating any bitstream tampering. Since an authenticated bitstream could not have been modified by an
adversary, it can be trusted. This trust applies to the appli-
cation in general, but specifically enables trusted self-
reconfiguration with the ICAP. Since the application
design is trusted, ICAP operation is permitted with au-
thenticated encrypted bitstreams. An authenticated bit-
stream may use the ICAP to launch a partial configuration
while the device continues to operate, allowing the design of a trusted reconfigurable platform [53].
ICAP is a Xilinx-specific example of a base silicon fea-
ture that, if used maliciously, could provide a vulnerability
without the appropriate protections. In all cases, the man-
ufacturer must provide safeguards, while the application
developer has final responsibility. It is, of course, possible
to construct an insecure application despite the encryption
and authentication. For example, if an application developer connected the ICAP interface directly to the external
pins, an adversary could interrogate the ICAP to read back
the unencrypted application bitstream. FPGA security
enables the construction of secure applications; it does not
guarantee them.
4) The Value of ICAP and Checking Designs in the Field: ICAP permits logic inside the FPGA to read and write its own bitstream, providing a wide range of powerful use
cases. These include:
• internal readback of the device configuration for in-system integrity checks;
• configuration clearing and zeroization;
• algorithm agility for those applications that need to change algorithms without a complete reconfiguration of the device;
• self-test;
• use of user-specific decryption and authentication algorithms with custom protections against attacks such as DPA or other side-channel attacks;
• configuration repair: random single-event upsets (SEU) [23], [52] or intentional tampering may cause configuration bits inside the FPGA to change. Jones [19] describes the SEU controller, an application in which the FPGA logic reads its own bitstream internally through ICAP, checks the stored bitstream with previously computed ECC data, and corrects configuration errors. The SEU controller is intended to detect and correct errors in a high-reliability environment, but it can be used to detect tampering with the FPGA in the field if individual bits are flipped. More recent FPGAs include the SEU detection and scrubbing feature in dedicated hardware [35].
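The read, check, and correct loop of the configuration repair use case can be sketched in Python. The ECC used by the actual SEU controller is not specified here, so the sketch swaps in a simple single-error-correcting syndrome (the XOR of the positions of all set bits); the frame representation and function names are assumptions for illustration.

```python
from typing import Optional


def syndrome(frame: bytes) -> int:
    """XOR of the 1-based positions of all set bits in the frame."""
    s = 0
    for i in range(len(frame) * 8):
        if (frame[i // 8] >> (i % 8)) & 1:
            s ^= i + 1
    return s


def scrub(frame: bytearray, stored_syndrome: int) -> Optional[int]:
    """Correct a single flipped bit in place.

    Returns the index of the corrected bit, or None if the frame is clean.
    """
    diff = syndrome(frame) ^ stored_syndrome
    if diff == 0:
        return None
    position = diff - 1
    if position >= len(frame) * 8:
        raise ValueError("multi-bit corruption: cannot correct")
    frame[position // 8] ^= 1 << (position % 8)
    return position
```

A single flipped bit changes the syndrome by exactly its own position, which is what makes the correction possible. This toy code can miscorrect when several bits change at once, so a returned position also serves as a record that the frame was disturbed, whether by an upset or by tampering.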
D. Invasive Attacks
Because of the environment of the fielded FPGA, the
difficulty of protecting FPGA keys and configuration data
persists regardless of the technology used to store them.
When an adversary can physically open the device and scan the contents, no storage technology is wholly secure.
However, using our model of cost-based security, some
storage methods are more expensive to break, sometimes
despite having no qualitative advantage.
The strongest way to prevent theft and tampering with
a bitstream is to keep it out of an adversary’s hands. The
Xilinx Spartan 3AN is a multichip package containing an
FPGA die and a flash memory die. Since nothing is transmitted from an external source, the trivial bitstream interception
method does not work. However, after decapping
the package, the signals between the dies can be probed to
pirate the bitstream.
Microsemi’s SmartFusion devices have internal nonvolatile
Flash memory storage as well. These devices are
still subject to physical, invasive attack, though that attack
is more difficult for several reasons. The storage in these devices is distributed around the device, and there is no
localized point at which one could intercept the config-
uration data, so the attack must scan the entire device.
Programmed Flash and antifuse cells are not observably
changed from unprogrammed cells, so the detection of
programming is more difficult. It may require SEM or
thermal analysis. SEM images of programmed and unprog-
rammed antifuse cells show no apparent differences [2]. Invasive physical attacks on antifuse devices and Flash
devices are qualitatively no more difficult than methods for
extracting eFuse bits. However, these attacks are consid-
ered significantly more expensive because millions of bits
of data must be extracted, rather than merely a 256-b key.
Further, the resulting extracted programming is not for-
matted for programming another FPGA, so it must be
formatted properly by the adversary in order to clone the design. The proper format is not published. This secrecy offers no
cryptographically strong protection, but reconstructing the format is considered
difficult and tedious.
Despite the concerns, there has not been a report of a
successful invasive attack on any FPGA regardless of the
internal storage: SRAM, BBRAM, eFuse, Antifuse, or Flash.
E. Environmental Attacks
The circuits inside FPGAs that implement the security
functions are no less susceptible to attack than those in
other semiconductor integrated circuits. Published attacks
on security functions in other devices include out-of-range
temperature and power adjustment, overclocking, and
other environmental attacks. Defense against these attacks
is very difficult because, by definition, semiconductor
foundries do not guarantee operation outside their guaranteed environmental range. FIPS 140-2, level 4, requires
environmental failure protection on cryptographic mod-
ules [10] and FPGA vendors provide limited protection
from environmental attacks.
The traditional response to environmental attacks has
been more robust circuitry, including dedicated voltage
regulation for security functions, large Hamming distances
in security-critical state machines, and redundant storage of critical state values, such as those disabling readback in
a secure system.
Xilinx provides an embedded analog-to-digital con-
verter (ADC) that can be used to monitor voltage and
temperature both outside and inside the FPGA. Users can
configure the circuitry to specific voltage and temperature
ranges based on the environment the system will operate
in. If the voltage or temperature exceeds this user-specified range, an internal alarm signal will be generated notifying
the application running on the device. User-specific
actions can then be taken, for example, clearing sensitive
cryptographic variables in registers or RAMs, zeroizing the
key or clearing the configuration of the device itself and
shutting down.
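The monitor, alarm, and response chain can be sketched as follows. The class names, the example limits, and the callback mechanism are assumptions for illustration; they do not describe the Xilinx ADC interface.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Limits:
    low: float
    high: float

    def ok(self, value: float) -> bool:
        return self.low <= value <= self.high


@dataclass
class EnvMonitor:
    """User-configured operating window with user-specified responses."""

    voltage: Limits
    temperature: Limits
    responses: List[Callable[[], None]] = field(default_factory=list)

    def sample(self, volts: float, celsius: float) -> bool:
        """Return True if the alarm fired; run every registered response."""
        alarm = not (self.voltage.ok(volts) and self.temperature.ok(celsius))
        if alarm:
            for respond in self.responses:
                respond()
        return alarm
```

An application would register responses such as clearing sensitive registers or zeroizing the key, and call `sample` with each new ADC reading.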
With few exceptions, FPGA manufacturers do not pub-
lish details of their security circuitry. Microsemi FuseLock and FlashLock include internal fuses or flash cells that
prevent inappropriate access. According to Microsemi,
‘‘special security keys are hidden throughout the fabric of
the device, preventing internal probing and overwriting.
They are located such that they cannot be accessed or
bypassed without destroying the rest of the device’’ [28].
Xilinx readback disabling circuitry has ‘‘hardened triple-
redundant logic’’ and key loading FSMs have ‘‘large hamming distances between states’’ [35].
1) Device Identifier: A unique identifier is a powerful way to restrict access to an FPGA, defeating cloning and
spoofing. An application can be coded to operate only on
the one device that matches a specific identifier or on a
subset of devices with a range of values.
Modern FPGAs contain a device identification register. Xilinx provides device DNA, a 57-b serial number prog-
rammed in eFuses during manufacture and used for track-
ing devices. Device DNA is accessible from outside the
FPGA via JTAG. In addition, devices include a user-
programmable 32-b eFuse field that can be used as an
identifier as well. This user eFuse field is only available to
logic within the FPGA.
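An identifier check of this kind might look like the following Python sketch. The function name and the way the identifier reaches the application are hypothetical; only the 57-bit width comes from the text.

```python
from typing import FrozenSet, Optional, Tuple

DNA_BITS = 57  # width of the device DNA value described above


def is_authorized(
    device_dna: int,
    allowed_ids: FrozenSet[int] = frozenset(),
    allowed_range: Optional[Tuple[int, int]] = None,
) -> bool:
    """Accept a device whose identifier is listed or falls in the given range."""
    if not 0 <= device_dna < 2**DNA_BITS:
        raise ValueError("identifier does not fit in 57 bits")
    if device_dna in allowed_ids:
        return True
    if allowed_range is not None:
        low, high = allowed_range
        return low <= device_dna <= high
    return False
```

Such a check is only as strong as the protection of the design that contains it, since an adversary who can edit the bitstream can remove the comparison.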
F. PUFs and FPGAs
Other alternatives exist for a device identifier. PUFs [14],
[39] provide a device-specific unique identifier derived
from random process variations. A PUF generator pro-
duces a different signature for each manufactured device.
PUFs have been demonstrated in FPGA fabric (‘‘soft PUF’’)
as well as in dedicated logic (‘‘hard PUF’’). Microsemi’s
SmartFusion2 includes a hard PUF. Other FPGA vendors have IP providers who provide soft PUF functions in fabric.
Therefore, application developers can build PUFs for de-
vice identification today with existing FPGAs.
There are several drawbacks for the use of PUFs that
have precluded their use as decryption keys in FPGAs.
First, the PUF only resides inside the device. It must be
read out of the device to encrypt the bitstream data file.
Alternatively, Kean recommended an encryption method in which the FPGA encrypts its own data file using its
internal key and emits the encrypted data for external
storage [20]. More importantly, the PUF is unique to each
unit, so the bitstream must be encrypted uniquely for each
device. This problem may be addressed by including a key
transformation word, the exclusive-or of the computed
PUF for the device with the actual key used to decrypt the
data. Still, at system build time, the FPGA must be powered on and the transformation word derived. Perhaps
most importantly, PUFs are not stable: a few bits may
change over the lifetime of the device. This is not parti-
cularly important for a device identifier, but disastrous for
a decryption key. One method to compensate for this is the
addition of helper data to the PUF-encrypted bitstream.
Helper data are fundamentally error correcting code
information for correcting errant bits in the PUF. It is unclear how much information about the key is leaked in
helper data. Finally, long-term PUF reliability data over
process, voltage, and temperature is sketchy at advanced
process nodes, leading to concern over lost keys during the
lifetime of the fielded device.
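The key transformation word, and the reason PUF instability is disastrous for a decryption key, can be shown in a few lines of Python. The PUF response is simulated with fixed bytes and the function names are hypothetical; error-correcting helper data is not modeled.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    if len(a) != len(b):
        raise ValueError("length mismatch")
    return bytes(x ^ y for x, y in zip(a, b))


def make_transformation_word(puf_response: bytes, bitstream_key: bytes) -> bytes:
    """Computed once per device at system build time and stored externally."""
    return xor_bytes(puf_response, bitstream_key)


def recover_key(puf_response: bytes, transformation_word: bytes) -> bytes:
    """Computed on the device at each configuration."""
    return xor_bytes(puf_response, transformation_word)


def bit_errors(a: bytes, b: bytes) -> int:
    """Number of differing bits between two equal-length byte strings."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))
```

With this scheme a single shared bitstream key can serve every device, but each PUF bit that drifts flips the corresponding bit of the recovered key, and a key that is wrong in even one bit cannot decrypt the bitstream.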
VII. APPLICATIONS
A. IFF Flow for Nonsecured Devices
Baetoniu [6] described ‘‘identification friend or foe’’
(IFF), a way to tie an FPGA bitstream to a specific
system. IFF uses an external storage device, a secure serial
electrically erasable programmable read-only memory
(EEPROM), such as the Dallas Semiconductor/Maxim
DS2432 (Fig. 6). The secure EEPROM includes a crypto-
graphic hash function. At system build time, the application developer programs a secret key into the EEPROM and also
programs the secret key into the FPGA application.
After the FPGA boots, it uses its random number
generator to interrogate the EEPROM. The EEPROM
computes the hash of the random string with its stored key.
The FPGA does the same. If the two hashes match, the
FPGA continues to operate. If the hashes do not match, the
FPGA enacts countermeasures such as ceasing operation or disabling premium functionality. The check may be
repeated as often as desired during operation.
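The challenge-response exchange can be sketched in Python. HMAC-SHA-256 is used here as a stand-in for the keyed hash computed by the secure EEPROM, and the class names and challenge length are assumptions for illustration.

```python
import hashlib
import hmac
import secrets


class SecureEeprom:
    """External device that holds the secret and answers challenges."""

    def __init__(self, secret: bytes) -> None:
        self._secret = secret

    def respond(self, challenge: bytes) -> bytes:
        return hmac.new(self._secret, challenge, hashlib.sha256).digest()


class FpgaApp:
    """Application holding the same secret inside the FPGA design."""

    def __init__(self, secret: bytes) -> None:
        self._secret = secret

    def check(self, eeprom: SecureEeprom) -> bool:
        """Issue a fresh random challenge and compare the two keyed hashes."""
        challenge = secrets.token_bytes(16)
        expected = hmac.new(self._secret, challenge, hashlib.sha256).digest()
        return hmac.compare_digest(expected, eeprom.respond(challenge))
```

Because each challenge is random, a recorded response from an earlier exchange is of no use to a clone that lacks the secret.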
IFF ties the FPGA bitstream to a properly programmed
secure EEPROM. Although it can be applied to an FPGA
without bitstream encryption, doing so leaves the system
vulnerable. An adversary may reverse engineer the bit-
stream and disable the check on the hash function. This
mechanism is vulnerable even with an encrypted, but not authenticated, bitstream, because an adversary may attempt
to disable the hash check through a bit-flipping
attack or random perturbation of the plaintext.
B. Metered IP
As third-party IP cores become more common, one
would like a mechanism to charge per copy for those
cores. The core vendor would be paid for each use, just as if it had been a physical device. Guajardo et al. [15] described a method for doing this, which the company Intrinsic-ID
developed into a product under the brand name Quiddicard
[17].
The method has an enrollment phase and an authenti-
cation phase. In the enrollment phase, the FPGA is prog-
rammed with a PUF which generates an identifier unique
to the FPGA. An activation code is generated from the PUF value and stored off-chip. The activation code generation is
a proprietary algorithm, but may be an encryption of the
PUF value using a private key of a public/private key pair.
In the authentication phase, the same PUF is constructed
in the FPGA and the design is authorized with the acti-
vation code (Fig. 7).
To turn this activation process into an IP metering
mechanism, the generation of the activation code may be
done by a trusted third party, possibly a trusted piece of
billing hardware at a manufacturing site that reports the
IP usage as it generates the activation code. This
mechanism has been extended to include multiple keys to
permit access to multiple pieces of IP in the FPGA
application [18].
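The enrollment, authentication, and metering steps can be sketched as follows. The real activation-code algorithm is proprietary, so this example substitutes an HMAC keyed with a vendor secret; that substitution, the `BillingStation` class, and the byte strings used as PUF values are all assumptions for illustration. The text suggests a public/private key pair, which would avoid placing the vendor secret in the design.

```python
import hashlib
import hmac


class BillingStation:
    """Hypothetical trusted party that issues activation codes and counts them."""

    def __init__(self, vendor_key: bytes) -> None:
        self._vendor_key = vendor_key
        self.units_billed = 0

    def enroll(self, puf_id: bytes) -> bytes:
        # Enrollment: derive the activation code from the device-unique PUF
        # value and record one billable use of the IP.
        self.units_billed += 1
        return hmac.new(self._vendor_key, puf_id, hashlib.sha256).digest()


def authorize(puf_id: bytes, activation_code: bytes, vendor_key: bytes) -> bool:
    """Authentication phase: recompute the code from the regenerated PUF value."""
    expected = hmac.new(vendor_key, puf_id, hashlib.sha256).digest()
    return hmac.compare_digest(expected, activation_code)


vendor_key = b"vendor-secret"
station = BillingStation(vendor_key)
code_a = station.enroll(b"puf-of-device-A")
print(authorize(b"puf-of-device-A", code_a, vendor_key))  # True
print(authorize(b"puf-of-device-B", code_a, vendor_key))  # False
print(station.units_billed)                               # 1
```

An activation code copied to a second device fails because that device's PUF produces a different identifier, so each working unit corresponds to one billed enrollment.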
This mechanism relies on confidentiality and
authentication of the application design, so that an
adversary cannot reverse engineer the device to remove the
activation code checking. There is nothing fundamental
about using a PUF for identification. Device DNA or some
other unique or nearly unique fixed device identifier can
serve.
Fig. 6. IFF design.
Trimberger and Moore: FPGA Security: Motivations, Features, and Applications
Vol. 102, No. 8, August 2014 | Proceedings of the IEEE 1261
C. Just in Time Secure Configuration
Utilizing partial reconfiguration and authenticated
encrypted bitstreams, it is possible to design a system
where critical technology (CT) is configured into the
device only when it is needed, thereby adding a layer of
security to the system. Peterson [35] proposed a method by
which a user application is partitioned between CT and
non-CT. The non-CT is resident in the FPGA at all times
and the CT logic is partially reconfigured into the FPGA
only when needed. Otherwise, it is stored externally,
encrypted and authenticated.
The CT, which exists as a partial configuration, can be
decrypted by the device using the device key or by the
application using a user-specified algorithm implemented
in the FPGA fabric, and potentially a PUF to generate the
key. The boot configuration of the FPGA sends the CT
partial bitstreams to the ICAP so that the decryption
process is completely contained within the FPGA.
Encryption is required to ensure the privacy of keys
included in the CT partial bitstreams or the boot
configuration bitstream. Authentication is required so
that the bitstreams cannot be tampered with in a way that
compromises the CT partial bitstreams. The IP described by
Zeineddini and Wesselkamper [53] for secure and
high-reliability applications utilizing partial
reconfiguration also checks for tampering. It uses the
integrated ADC to monitor power and temperature, and
checks the JTAG port to detect tamper conditions. If
necessary, the IP zeroizes the CT and its key.
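The life cycle of the CT region (load on demand, unload, zeroize on tamper) can be modeled with a small state holder. This is a sketch under stated assumptions: `CriticalTechnologySlot`, `fake_unwrap`, and the `OK:` prefix are hypothetical, and the real authentication and decryption step is represented by a caller-supplied function rather than by any vendor API.

```python
from typing import Callable, Optional


class CriticalTechnologySlot:
    """Models a reconfigurable region that holds CT only while it is needed."""

    def __init__(self, unwrap: Callable[[bytes], bytes]) -> None:
        # `unwrap` authenticates and decrypts a stored partial bitstream and
        # raises ValueError on failure. It is supplied by the caller.
        self._unwrap = unwrap
        self.loaded: Optional[bytes] = None

    def load(self, stored_blob: bytes) -> None:
        self.loaded = self._unwrap(stored_blob)

    def unload(self) -> None:
        self.loaded = None  # a blanking configuration would be written here

    def on_tamper(self, key_store: bytearray) -> None:
        # Zeroize both the CT region and its key material.
        self.unload()
        for i in range(len(key_store)):
            key_store[i] = 0


def fake_unwrap(blob: bytes) -> bytes:
    """Placeholder check standing in for real authentication and decryption."""
    if not blob.startswith(b"OK:"):
        raise ValueError("authentication failed")
    return blob[3:]


slot = CriticalTechnologySlot(fake_unwrap)
ct_key = bytearray(b"\x13\x37" * 8)
slot.load(b"OK:ct-logic")
print(slot.loaded)   # b'ct-logic'
slot.on_tamper(ct_key)
print(slot.loaded)   # None
print(ct_key)        # sixteen zero bytes
```

In a real system `on_tamper` would be driven by the monitors the text mentions, such as supply voltage, temperature, and JTAG activity.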
D. Fault-Tolerant Design
FPGA manufacturers supporting confidentiality, integrity,
and authentication of the configuration provide a strong
foundation upon which users can build high-reliability
systems. Cryptographic processing and security services,
like any high-reliability function, must be fault
tolerant. Xilinx’s isolation design flow (IDF) [7],
developed in conjunction with government entities [24],
was the first in the programmable logic industry. Altera
has since developed similar technology, called the design
separation flow [5].
IDF provides fault containment at the FPGA module level,
enabling single-chip fault tolerance by various
techniques, including modular redundancy, watchdog alarms,
segregation by safety level, and isolation of test logic
for safe removal [7]. The applicability of this type of
technology goes beyond cryptographic processing and
security. The same technology can be used to aid in
compliance for systems that must be designed to
safety-critical standards such as IEC61508, ISO26262, and
DO-254.
The basic concept is to separate critical and/or
intentionally redundant functions physically on the FPGA.
This can be accomplished through careful floorplanning and
the use of unused logic as fences. Fig. 8 represents a
design that has been floorplanned with IDF. Fig. 9 is the
same design after place and route.
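Of the techniques listed above, modular redundancy is the easiest to illustrate in software. The sketch below shows a generic two-of-three majority voter over the outputs of three redundant modules; it is a textbook construction, not something taken from the IDF documentation, and the function names are chosen for the example.

```python
from typing import List


def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority over the outputs of three redundant modules."""
    return (a & b) | (a & c) | (b & c)


def disagreeing_modules(a: int, b: int, c: int) -> List[int]:
    """Return the indices (0, 1, 2) of modules whose output differs from the vote."""
    voted = majority_vote(a, b, c)
    return [i for i, value in enumerate((a, b, c)) if value != voted]


print(hex(majority_vote(0xA5, 0xA5, 0xA4)))   # 0xa5
print(disagreeing_modules(0xA5, 0xA5, 0xA4))  # [2]
```

The voter masks a fault in any single module, which is only meaningful if one physical failure cannot affect two modules at once; that is the property the physical separation is meant to provide.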
The fences are exhaustively analyzed by the FPGA
manufacturer to show that a single failure would not
compromise the isolation or redundancy built into the
system. The goal is to minimize the size of the fence to
reduce the inefficiencies that come with its use [16]. As
an example, the width or height of a fence made of
configurable logic blocks (CLBs) in a Xilinx 7-series FPGA
is a single CLB.
In an ideal world, every module would be completely
isolated from all the others. In practice, this scenario
is not feasible: some level of communication must exist
between isolated regions. Xilinx developed the concept of
‘‘trusted routing,’’ restricted routing that is
specifically chosen by the place and route algorithms such
that the isolation established by the use of ‘‘fences’’ is
not compromised.
Fig. 7. Metered IP system architecture.
Fig. 8. Notional floorplan of a design into five isolated regions.
Finally, no high-reliability system is complete without
independent verification. To address concerns associated
with software ‘‘bugs’’ or inappropriate use of the design
methodology by the user, FPGA manufacturers must provide
independent verification tools that can be applied to the
design to validate the isolation of the modules. Xilinx
developed the isolation verification tool (IVT) for this
purpose. IVT can be used early in the development flow to
aid in isolation verification before a printed wiring
board (PWB) is committed. It is also used once the design
is complete to verify that the final placed and routed
design has the isolation that the user intended.
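The kind of check an isolation verifier performs can be illustrated on a toy floorplan. The sketch below is not the IVT algorithm: it assumes a floorplan drawn as a character grid in which each letter is one CLB belonging to a region and `.` is an unused fence CLB, and it reports any place where two different regions touch directly.

```python
from typing import List, Tuple

FENCE = "."


def isolation_violations(floorplan: List[str]) -> List[Tuple[int, int, str, str]]:
    """List every place where two different regions touch without a fence."""
    violations = []
    for r, row in enumerate(floorplan):
        for c, cell in enumerate(row):
            if cell == FENCE:
                continue
            # Compare with the right and lower neighbours only, so each
            # adjacent pair is examined once.
            for nr, nc in ((r, c + 1), (r + 1, c)):
                if nr < len(floorplan) and nc < len(floorplan[nr]):
                    other = floorplan[nr][nc]
                    if other != FENCE and other != cell:
                        violations.append((r, c, cell, other))
    return violations


good = ["AA.BB",
        "AA.BB",
        ".....",
        "CCCCC"]
bad = ["AA.BB",
       "AAABB",
       ".....",
       "CCCCC"]
print(isolation_violations(good))  # []
print(isolation_violations(bad))   # [(1, 2, 'A', 'B')]
```

A real tool would also have to examine routing resources and the trusted routes that are permitted to cross a fence, which this grid model ignores.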
E. Single-Chip Cryptography
Single-chip crypto (SCC) combines data of different levels
of secrecy or control in a single device. The device must
not only protect programs during loading, but also it must
defend against attacks from outside and attacks while
operating, including leakage of protected information
across internal boundaries. Therefore, single-chip
cryptography aggregates much of the technology discussed
in this paper.
SCC uses the authenticated encryption capability to load a
boot loader. The boot loader, isolation region #1 in
Fig. 8, manages further FPGA configuration, software for
on-chip processors, and data handling. Because it was
authenticated and encrypted, the boot loader is known to
be unaltered by potential adversaries or accidental bit
errors. In addition, sensitive data, such as session keys,
are known to be kept secret. To ensure no internal leakage
of information, SCC implements the fences of IDF as
described in Section VII-D (Fig. 8) to separate sensitive
data spatially in the FPGA. This separation assures the
confidentiality of sensitive information even in the
presence of accidental faults in, or intentional attacks
on, the fences. The spectrum of isolation capabilities is
sufficient to support applications such as the separation
of red and black data processing, key management, and
other high-reliability functions.
Bitstream scrubbing, using internal readback, continually
monitors the configuration data, in particular the
isolation fences, to ensure that changes to the
configuration are detected and corrected quickly. SCC can
even verify that the device DNA is correct, ensuring
operation on the proper individual chip.
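A scrubbing pass can be sketched as a readback loop that compares each configuration frame against a stored checksum and rewrites any frame that has changed. The frame contents, the use of CRC-32, and the function names below are assumptions for illustration; actual devices use their own frame format and error-detection codes.

```python
import zlib
from typing import List


def golden_checksums(frames: List[bytes]) -> List[int]:
    """Record a CRC-32 for each configuration frame of the known-good design."""
    return [zlib.crc32(frame) for frame in frames]


def scrub(live: List[bytearray], golden: List[bytes], crcs: List[int]) -> List[int]:
    """Read back every frame, repair any that changed, and return their indices."""
    repaired = []
    for index, frame in enumerate(live):
        if zlib.crc32(bytes(frame)) != crcs[index]:
            frame[:] = golden[index]  # rewrite the frame from the golden copy
            repaired.append(index)
    return repaired


golden = [b"\x00\x11\x22\x33", b"\x44\x55\x66\x77", b"\x88\x99\xaa\xbb"]
crcs = golden_checksums(golden)
live = [bytearray(frame) for frame in golden]
live[1][2] ^= 0x04                # simulate a single upset bit
print(scrub(live, golden, crcs))  # [1]
print(scrub(live, golden, crcs))  # []
```

Running the loop continually bounds how long a corrupted fence or logic frame can persist before it is corrected.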
Starting with the root of trust, followed by the power and
flexibility of both hardware and software, coupled with
the application of isolation technologies and partial
reconfiguration, a system that would typically have been
developed through the use of multiple devices can now be
integrated into a single device with no loss of security.
VIII. THE FUTURE OF FPGA SECURITY
A. Field-Programmable SoC
SCC was originally conceptualized and developed in
cooperation with government authorities for FPGAs [24],
and the application provides additional value in new
programmable SoCs such as Zynq. Zynq includes both a
programmable logic subsystem (PL) that comprises hundreds
of thousands of gates of logic, and a processor subsystem
(PS) that includes a dual-core ARM (ARM Holdings,
Cambridge, U.K.) Cortex A9 processor, caches, memories,
and peripherals, connected to one another and to the PL
using an Advanced Microcontroller Bus Architecture (AMBA)
Advanced eXtensible Interface (AXI) bus. The Zynq device
boots securely, using authenticated encryption
capabilities like those described for FPGAs. Zynq also
provides asymmetric and symmetric authentication,
confidentiality, and integrity. Leveraging this root of
trust, applications can implement cryptoprocessors or
systems performing cryptographic functions in the
combination of processor and FPGA with confidence that
they have not been compromised.
In Zynq, the processor subsystem is known to be isolated
physically from the programmable logic. Within the PL,
isolated regions as in IDF ensure separation of sensitive
data spatially. Within the PS, known software methods,
such as hypervisors and ARM TrustZone technology, isolate
sensitive software processes from other processes. The
trusted boot loader decrypts and authenticates all
configuration data and software.
Fig. 9. FPGA editor view of a design implemented using the IDF methodology.
Partial reconfiguration is further enhanced. The entire PL
can be reconfigured, or even powered down, under the
control of the PS. Alternatively, portions of the PL can
be partially reconfigured for applications that require
algorithm agility. The same reliability checks performed
on ICAP [52] can be applied to the processor configuration
access port (PCAP) to ensure proper data integrity of
software. Decryption and authentication of partial
configuration files can be performed by either the PS or
the PL, allowing users the flexibility to choose their own
authentication and decryption algorithms as well as to
authenticate before decryption to aid in defense against
side-channel attacks. Of course, key management remains a
critical consideration in these applications.
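The ordering "authenticate before decryption" can be made concrete with a small sketch. It assumes an encrypt-then-MAC layout in which a 32-byte HMAC-SHA-256 tag follows the ciphertext, and it uses a toy SHA-256 keystream in place of a real cipher; the layout, the keystream, and the names `seal` and `open_sealed` are all assumptions made for the example.

```python
import hashlib
import hmac


def _keystream(key: bytes, length: int) -> bytes:
    """Toy SHA-256 counter keystream, a stand-in for a real cipher."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def seal(enc_key: bytes, mac_key: bytes, plaintext: bytes) -> bytes:
    """Encrypt, then append a MAC computed over the ciphertext."""
    stream = _keystream(enc_key, len(plaintext))
    ciphertext = bytes(p ^ k for p, k in zip(plaintext, stream))
    return ciphertext + hmac.new(mac_key, ciphertext, hashlib.sha256).digest()


def open_sealed(enc_key: bytes, mac_key: bytes, blob: bytes) -> bytes:
    """Verify the MAC first. The decryption key is used only if it passes."""
    if len(blob) < 32:
        raise ValueError("blob too short")
    ciphertext, tag = blob[:-32], blob[-32:]
    expected = hmac.new(mac_key, ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, tag):
        raise ValueError("authentication failed; nothing was decrypted")
    stream = _keystream(enc_key, len(ciphertext))
    return bytes(c ^ k for c, k in zip(ciphertext, stream))


blob = seal(b"enc-key", b"mac-key", b"partial-bitstream")
print(open_sealed(b"enc-key", b"mac-key", blob))  # b'partial-bitstream'
tampered = bytes([blob[0] ^ 0x01]) + blob[1:]
try:
    open_sealed(b"enc-key", b"mac-key", tampered)
except ValueError as err:
    print(err)  # authentication failed; nothing was decrypted
```

Because a rejected file never reaches the decryption step, an adversary cannot feed chosen ciphertexts to the decryptor to collect side-channel measurements of the decryption key in use.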
B. Conclusion
Security in FPGAs has been driven by the need to address
new threats, by the growth in value of the IP of the
applications, and by the growth in the expected
sophistication of the adversary. All three drivers
continue to operate. New areas of protection, such as
confidentiality of the data handled by the FPGA, metering
of third-party IP, and counterfeit protection motivate
additional capabilities and combinations of capabilities
in the FPGA. Modern FPGAs and new programmable SoC devices
hold applications that comprise complete systems,
processing very sensitive data and controlling valuable
systems. The high value of the applications, the data they
handle, and the systems they control motivate
well-equipped adversaries to steal IP or to subvert the
systems of which the FPGA is a part.
As adversaries become more sophisticated, so do the FPGA
defenses. Future FPGA security features must continue to
improve to meet all three drivers. As in the past, these
features will include circuits on the base array,
algorithms in silicon, and IP in the programmable part of
the device.
REFERENCES
[1] Actel, ‘‘Implementation of security in Actel’s ProASIC and ProASICPLUS Flash-based FPGAs,’’ Appl. Note AC185, 2003.
[2] Actel, ‘‘Understanding Actel antifuse device security,’’ 2004. [Online]. Available: www.actel.com/documents/AntifuseSecurityWP.pdf.
[3] P. Alfke, ‘‘Configuration issues: Power-up, volatility, security, battery back-up,’’ Xilinx, Appl. Note XAPP092, 1997. [Online]. Available: http://www.xilinx.com/support/documentation/application_notes/xapp092.pdf.
[4] Altera, ‘‘Using the design security features in Altera FPGAs,’’ Appl. Note, AN-556, Jun. 19, 2013.
[5] Altera, ‘‘Quartus II design separation flow,’’ 2013. [Online]. Available: http://www.altera.com/literature/hb/qts/qts_qii51019.pdf.
[6] C. Baetoniu, ‘‘FPGA IFF copy protection using Dallas Semiconductor/Maxim DS2432 Secure EEPROMs,’’ Xilinx, Appl. Note XAPP780 v. 1.1, 2010. [Online]. Available: http://www.xilinx.com/support/documentation/application_notes/xapp780.pdf.
[7] J. D. Corbett, ‘‘The Xilinx isolation design flow for fault-tolerant systems,’’ Xilinx WP412, 2012. [Online]. Available: http://www.xilinx.com/support/documentation/white_papers/wp412_IDF_for_Fault_Tolerant_Sys.pdf.
[8] S. Drimer, ‘‘Authentication of FPGA bitstreams, why and how,’’ Reconfigurable Computing: Architectures, Tools and Applications, vol. 4419. Berlin, Germany: Springer-Verlag, 2007, pp. 73–84.
[9] S. Drimer, ‘‘Security for volatile FPGAs,’’ Ph.D. dissertation, Comput. Sci. Dept., Cambridge Univ., Cambridge, U.K., 2009.
[10] National Institute of Standards and Technology (NIST), ‘‘Security requirements for cryptographic modules,’’ FIPS 140-2, 2001.
[11] National Institute of Standards and Technology (NIST), ‘‘Announcing the advanced encryption standard,’’ FIPS 197, 2001.
[12] National Institute of Standards and Technology (NIST), ‘‘The keyed-hash message authentication code (HMAC),’’ FIPS PUB 198, Mar. 6, 2002. [Online]. Available: http://csrc.nist.gov/publications/fips/fips198-1/FIPS-198-1_final.pdf.
[13] National Institute of Standards and Technology (NIST), ‘‘Secure hash standard,’’ FIPS PUB 180-2 + Change Notice to include SHA-224, Aug. 1, 2002. [Online]. Available: http://csrc.nist.gov/publications/fips/fips180-2/fips180-2withchangenotice.pdf.
[14] J. Guajardo, S. S. Kumar, G. J. Schrijen, and P. Tuyls, ‘‘Physical unclonable functions and public-key crypto for FPGA IP protection,’’ Proc. IEEE Int. Conf. Field-Programm. Logic Appl., 2007, pp. 189–195.
[15] J. Guajardo, S. S. Kumar, G. J. Schrijen, and P. Tuyls, ‘‘Brand and IP protection with physical unclonable functions,’’ in Proc. IEEE Int. Symp. Circuits Syst., 2008, pp. 3186–3189.
[16] T. Huffmire et al., ‘‘Moats and drawbridges: An isolation primitive for reconfigurable hardware based systems,’’ in Proc. IEEE Symp. Security Privacy, 2007, pp. 281–295.
[17] Intrinsic-ID, ‘‘Quiddikey-Flex,’’ 2013. [Online]. Available: http://www.intrinsic-id.com/products/quiddikey-flex.
[18] Intrinsic-ID, ‘‘Quiddicard protecting your IP against overproduction, counterfeiting and cloning,’’ Aug. 30, 2013. [Online]. Available: www.intrinsic-id.com/products/quiddicard-.
[19] L. Jones, ‘‘Single event upset (SEU) detection and correction using Virtex-4 devices,’’ Xilinx, Appl. Note #714, 2007. [Online]. Available: http://www.xilinx.com/bvdocs/appnotes/xapp714.pdf.
[20] T. Kean, ‘‘Secure configuration of field programmable gate arrays,’’ in Proc. IEEE Annu. Symp. Field-Programm. Custom Comput. Mach., 2001, pp. 259–260.
[21] Lattice, ‘‘FPGA design security issues: Using Lattice FPGAs to achieve high design security,’’ White Paper, 2007.
[22] Lattice, ‘‘Advanced security encryption key programming guide for LatticeECP3, LatticeECP2MS, LatticeECP2S devices,’’ Tech. Note TN1215, 2012.
[23] A. Lesea, S. Drimer, J. Fabula, C. Carmichael, and P. Alfke, ‘‘The Rosetta experiment: Atmospheric soft error rate testing in differing technology FPGAs,’’ IEEE Trans. Device Mater. Reliab., vol. 5, no. 3, pp. 317–328, Sep. 2005.
[24] M. McLean and J. Moore, ‘‘FPGA-based single chip cryptographic solution,’’ Military Embedded Systems, 2007. [Online]. Available: http://www.mil-embedded.com/pdfs/NSA.Mar07.pdf.
[25] Microsemi, ‘‘Igloo2 FPGAs revision 0,’’ 2013. [Online]. Available: www.microsemi.com/document-portal/doc_download/132042-igloo2-fpga-datasheet.
[26] Microsemi, ‘‘Axcelerator family FPGAs,’’ 2012. [Online]. Available: http://www.microsemi.com/document-portal/doc_download/130669-axcelerator-family-fpgas-datasheet.
[27] Microsemi, ‘‘Implementation of security in Microsemi Antifuse FPGAs,’’ Appl. Note AC168, 2012.
[28] Microsemi, ‘‘Security architecture,’’ 2013. [Online]. Available: http://www.microsemi.com/products/fpga-soc/technology-solutions/security/security-architecture.
[29] Microsemi, ‘‘SmartFusion2 SoC FPGA reliability and security user’s guide,’’ 2013.
[30] A. Moradi, A. Barenghi, T. Kasper, and C. Paar, ‘‘On the vulnerability of FPGA bitstream encryption against power analysis attacks: Extracting keys from Xilinx Virtex-II FPGAs,’’ in Proc. ACM Conf. Comput. Commun. Security, 2011, pp. 111–124.
[31] A. Moradi, M. Kasper, and C. Paar, ‘‘Black-box side-channel attacks highlight the importance of countermeasures - An analysis of the Xilinx Virtex-4 and Virtex-5 bitstream encryption mechanism,’’ in Proc. 12th Conf. Topics Cryptol., 2012, DOI: 10.1007/978-3-642-27954-6_1.
[32] A. Moradi, D. Oswald, C. Paar, and P. Swierczynski, ‘‘Side channel attacks on the bitstream encryption mechanism of Altera Stratix II,’’ in Proc. ACM/SIGDA Int. Symp. Field-Programm. Gate Arrays, 2013, pp. 91–100.
[33] National Institute of Standards and Technology (NIST), ‘‘Recommendation for block cipher modes of operation,’’ Special Publ. 800-38A, 2001.
[34] M. Parlekar, ‘‘Authenticated encryption in hardware,’’ M.S. thesis, Electr. Comput. Eng. Dept., George Mason Univ., Fairfax, VA, USA, 2005.
[35] E. Peterson, ‘‘Developing tamper resistant designs with Xilinx Virtex-6 and 7 series FPGAs,’’ Xilinx, Appl. Note XAPP1084, 2012.
[36] L. Sanders, ‘‘Secure boot of Zynq-7000 all-programmable SoC,’’ Xilinx, Appl. Note XAPP 1175 (v1.0), 2013.
[37] B. Schneier, Applied Cryptography Second Edition. New York, NY, USA: Wiley, 1996.
[38] S. Skorobogatov and C. Woods, ‘‘Breakthrough silicon scanning discovers backdoor in military chip,’’ Cryptographic Hardware and Embedded Systems - CHES 2012, vol. 7428. Berlin, Germany: Springer-Verlag, 2012, pp. 23–40.
[39] G. E. Suh and S. Devadas, ‘‘Physical unclonable functions for device authentication and secret key generation,’’ in Proc. Design Autom. Conf., 2007, pp. 9–14.
[40] A. Telikepalli, ‘‘Is your design secure?’’ Xilinx, 2003. [Online]. Available: http://www.xilinx.com/publications/archives/xcell/Xcell47.pdf.
[41] S. Trimberger, ‘‘Method and apparatus for protecting proprietary configuration data for programmable logic devices,’’ U.S. Patent 6 654 889, 2003.
[42] S. Trimberger, J. Moore, and W. Lu, ‘‘Authenticated encryption of FPGA bitstreams,’’ in Proc. 19th ACM/SIGDA Int. Symp. Field Programm. Gate Arrays, 2011, pp. 83–86.
[43] S. Trimberger, Field-Programmable Gate Array Technology. Norwell, MA, USA: Kluwer, 1994.
[44] S. Trimberger, ‘‘Trusted design in FPGAs,’’ in Proc. Design Autom. Conf., 2007, pp. 5–8.
[45] S. Trimberger and J. Moore, ‘‘FPGA security: From features to capabilities to trusted systems,’’ in Proc. 51st Annu. Design Autom. Conf., 2014, DOI: 10.1145/2593069.2602555.
[46] S. Trimberger, ‘‘Security in SRAM FPGAs,’’ IEEE Design Test Comput., vol. 24, no. 6, p. 581, Nov./Dec. 2007.
[47] S. Trimberger, ‘‘Three ages of FPGAs,’’ in FPGA20. Highlights of the International Symposium on Field-Programmable Gate Arrays, ACM, 2011, pp. 1–18.
[48] T. Tuan, T. Strader, and S. Trimberger, ‘‘Analysis of data remanence in a 90 nm FPGA,’’ in Proc. IEEE Custom Integr. Circuits Conf., 2007, pp. 93–96.
[49] T. Wollinger and C. Paar, ‘‘How secure are FPGAs in cryptographic applications,’’ Field Programmable Logic and Application, vol. 2778, P. Y. K. Cheung, G. A. Constantinides, and J. T. de Sousa, Eds. Berlin, Germany: Springer-Verlag, 2003, pp. 91–100.
[50] Xilinx, ‘‘Virtex-4 FPGA configuration user guide, v1.11,’’ UG071, 2009.
[51] Xilinx, ‘‘Virtex-6 FPGA Configuration User Guide,’’ UG360, Jul. 30, 2010. [Online]. Available: http://www.xilinx.com/support/documentation/user_guides/ug360.pdf.
[52] Xilinx, ‘‘Device reliability report, second quarter 2013,’’ UG116, 2013.
[53] A. Zeineddini and J. Wesselkamper, ‘‘PRC/EPRC: Data integrity and security controller for partial reconfiguration,’’ Appl. Note XAPP887, 2012.
ABOUT THE AUTHORS
Stephen M. Trimberger (Fellow, IEEE) received
the B.S. degree in engineering and applied science
from the California Institute of Technology,
Pasadena, CA, USA, in 1977, the M.S. degree in
information and computer science from the
University of California at Irvine, Irvine, CA, USA,
in 1979, and the Ph.D. degree in computer science
from the California Institute of Technology in 1983.
He was employed at VLSI Technology from
1982 to 1988. Since 1988 he has been at Xilinx, San
Jose, CA, holding a number of positions. He is currently a Xilinx Fellow,
heading the Circuits and Architectures group in Xilinx Research Labs in
San Jose, CA, USA. He is an author and editor of five books as well as
dozens of papers and journal articles. He is an inventor on more than 200
U.S. patents in the areas of integrated circuit (IC) design, field-
programmable gate array (FPGA) and application-specific integrated
circuit (ASIC) architecture, computer-aided engineering (CAE), 3-D die
stacking semiconductors, and cryptography.
Dr. Trimberger is a four-time winner of the Freeman Award, Xilinx’s
annual award for technical innovation. He is a Fellow of the Association
for Computing Machinery (ACM).
Jason J. Moore received the B.S. degree in electrical
engineering from New Mexico State University, Las Cruces,
NM, USA, in 1992.
He is currently a Director of Market Segments Engineering
at Xilinx, Albuquerque, NM, USA, focused on security and
safety architectures. Prior to his assignments at Xilinx,
he was responsible for the development of
field-programmable gate array (FPGA)-based communication
security equipment in a wide range of avionics and
ground-based platforms at the Motorola Government Group.
He has been awarded multiple patents on cryptographic
design in addition to novel approaches for logical and
functional isolation within a single FPGA.
Mr. Moore is a two-time winner of the Freeman Award,
Xilinx’s annual award for technical innovation.
sources/157/Long et al. - 2019 - PUF-Based Anonymous Authentication Scheme for Hard.pdf
SPECIAL SECTION ON MOBILE EDGE COMPUTING AND MOBILE CLOUD COMPUTING: ADDRESSING HETEROGENEITY AND ENERGY ISSUES OF COMPUTE AND NETWORK RESOURCES
Received May 28, 2019, accepted June 19, 2019, date of publication June 26, 2019, date of current version September 13, 2019.
Digital Object Identifier 10.1109/ACCESS.2019.2925106
PUF-Based Anonymous Authentication Scheme for Hardware Devices and IPs in Edge Computing Environment
JING LONG (1,2), WEI LIANG (3), KUAN-CHING LI (4) (Senior Member, IEEE), DAFANG ZHANG (5), MINGDONG TANG (6) (Member, IEEE), AND HAIBO LUO (7)
(1) Hunan Provincial Key Laboratory of Intelligent Computing and Language Information Processing, Hunan Normal University, Changsha 410081, China
(2) College of Information Science and Engineering, Hunan Normal University, Changsha 410081, China
(3) School of Opto-Electronic and Communication Engineering, Xiamen University of Technology, Xiamen 361024, China
(4) Department of Computer Science and Information Engineering, Providence University, Taichung 43301, Taiwan
(5) College of Computer Science and Electronic Engineering, Hunan University, Changsha 410082, China
(6) School of Information Science and Technology, Guangdong University of Foreign Studies, Guangzhou 510006, China
(7) Industrial Robot Application of Fujian University Engineering Research Center, Minjiang University, Fuzhou, China
Corresponding author: Wei Liang ([email protected])
This work was supported in part by the National Natural Science Foundation of China under Grant 61572186 and Grant 61572188, in part by the Hunan Provincial Science & Technology Project Foundation under Grant 2018TP1018, in part by the Start-Up Funds of Hunan Normal University under Grant 531120-3812, in part by the Scientific Research Program of New Century Excellent Talents in Fujian Province University, China, in part by the Industrial Robot Application of Fujian University Engineering Research Center, China, in part by the Minjiang University, China, under Grant MJUKF-IRA201802, and in part by the Fujian Provincial Natural Science Foundation of China under Grant 2018J01570.
ABSTRACT With rapid advances in edge computing and the Internet of Things, the security of low-layer hardware devices is attracting more and more attention. As an ideal hardware solution, the field programmable gate array (FPGA) has become a mainstream technology for designing complex systems. The designed modules are called intellectual property (IP) cores. In this paper, we consider the misappropriation of both hardware devices and software IPs in edge computing and propose a PUF-based anonymous authentication scheme for IP copyright. The scheme uses a double physical unclonable function (PUF) authentication model. Both parties jointly generate the challenge during authentication to avoid replay attacks and modeling attacks on the PUF circuit. The complexity of authentication is greatly reduced. Moreover, because of the double PUF authentication model, the server of the FPGA vendor does not need to store all the challenge-response pairs (CRPs) of each PUF-based chip. This saves system resources and achieves better security. To protect software IP, the IP core vendor inserts copyright information and anonymous buyer identity information into the design before trading. The anonymity of the buyer protects the buyer's interests. With the participation of a trustable device vendor, infringement can be traced according to the extracted fingerprints. The experiments show that the resource overhead of the proposed scheme is reduced by 61.96% and 31.61% compared with the 2-1 DAPUF and the built-in self-adjustable PUF, respectively. In addition, the PUF stability is 99.54%. These results demonstrate the good performance of the proposed scheme.
INDEX TERMS Edge computing, field programmable gate array (FPGA), IP cores, PUF authentication model, anonymous authentication.
The associate editor coordinating the review of this manuscript and approving it for publication was Junaid Shuja.
VOLUME 7, 2019 This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see http://creativecommons.org/licenses/by/4.0/ 124785
J. Long et al.: PUF-Based Anonymous Authentication Scheme for Hardware Devices and IPs in Edge Computing Environment
I. INTRODUCTION
With rapid advances in the Internet of Things (IoT) and edge computing, hardware security has drawn wide attention from researchers and institutes all over the world [1]. Because Field Programmable Gate Arrays (FPGAs) are core components of hardware devices in edge computing, the security of FPGA design modules should not be neglected [2]. Owing to the integrated circuit (IC) manufacturing process, there are inevitable differences in the threshold voltage and oxide thickness of each produced chip [3]. Therefore, the physical structures of different chips differ randomly even when they are produced in the same manufacturing environment. These differences are similar to human fingerprints, irises and palm prints. They do not affect the normal functionality of chips, but can serve as a unique intrinsic characteristic to identify them. Following the idea of human identity authentication, researchers proposed using these unique manufacturing differences in physical structure to recognize the identity of chips. A physical unclonable function (PUF) is a microcircuit that extracts this manufacturing characteristic from a complex physical system [4]. It produces an unpredictable, unique response for an arbitrary input challenge owing to the inevitable random differences in chip manufacturing [5], [6]. Many types of PUF circuits have been proposed by research institutes and semiconductor companies in recent years, and they are widely used in intellectual property (IP) protection [7], secret key generation, device authentication, and other fields [8].
An FPGA is a semi-custom circuit in the Application Specific Integrated Circuit (ASIC) domain. It is widely used in IoT and edge computing environments because it is programmable and reconfigurable [9]. From a security point of view, IP protection techniques implemented on FPGAs are more flexible and require no extra resource overhead compared with those on traditional custom circuits. Therefore, PUF technology can be used for FPGA protection [10]. A PUF can extract the unique intrinsic characteristic as a secret key, namely a challenge-response pair (CRP), which can serve as the identity of a chip. In device licensing or authentication [11], [12], the identity of a chip can be recognized by comparing the PUF response with the registered one.
Research on PUF-based IP protection can ensure the security of FPGA designs at the hardware level, thereby enhancing the ability of hardware circuits to resist attacks. In this work, the participators and security issues of the entire IP trading procedure are considered. A PUF-based anonymous IP authentication scheme is proposed using PUF and digital fingerprint techniques. A double PUF structure is proposed to authenticate both the hardware FPGA and the software IP. As a result, the FPGA vendor does not need to store all CRPs in advance, which gives advantages in resource overhead, security and applicability. The IP vendor can insert copyright information and the anonymous identity of the IP buyer into the IP core before IP trading, which enables passive IP protection and infringement tracing. The anonymity protects the interests of the IP buyer, while misappropriation can still be tracked with the participation of a trustable device vendor.
This work is organized as follows. Section II analyzes previous PUF-based IP authentication schemes. Section III introduces the proposed double PUF model. Section IV proposes the PUF-based anonymous IP authentication scheme. Section V analyzes its security. Section VI evaluates the experimental results. Finally, the paper is summarized.
II. RELATED WORK
A PUF is a novel technique for extracting a "secret" from a complex physical system [13]. It exploits the inevitable random differences in hardware manufacturing to generate a secret with a unique characteristic. Many types of PUF implementations have been proposed in recent years. Based on the implementation principle, PUFs can be classified into delay-based PUFs (arbiter PUF, ring oscillator PUF) and storage-based PUFs (SRAM PUF, butterfly PUF). Based on the number of CRPs, there are strong PUFs and weak PUFs. The former (such as the arbiter PUF) have numerous CRPs and are widely used in device authentication. The latter have fewer CRPs and are mainly used in key generation. The input of a PUF is a challenge (C) and the output is a response (R); together they form a challenge-response pair. The relationship between C and R can be represented by PUF(C) = R. R differs for different C, and this difference can be evaluated by the inter-Hamming distance. The variation of R when the same challenge is applied repeatedly to a PUF can be measured by the intra-Hamming distance. Ideally, the response of a PUF to the same challenge does not change even when the PUF is affected by external environmental factors such as temperature and noise. The inter-Hamming distance and intra-Hamming distance can be shown intuitively in a histogram.
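As a rough illustration of the two metrics, the following Python sketch computes fractional Hamming distances between response bit strings. The response values and the function names are invented for the example and are not taken from the paper. Following the usage in the text, the inter-Hamming distance is taken here between responses to different challenges, and the intra-Hamming distance between repeated responses to the same challenge; the same function could equally be applied to responses of different devices.

```python
from itertools import combinations

def hamming_distance(r1, r2):
    """Number of bit positions in which two equal-length responses differ."""
    if len(r1) != len(r2):
        raise ValueError("responses must have the same length")
    return sum(b1 != b2 for b1, b2 in zip(r1, r2))

def mean_fractional_hd(responses):
    """Average pairwise Hamming distance, normalised by the response length."""
    pairs = list(combinations(responses, 2))
    total = sum(hamming_distance(a, b) / len(a) for a, b in pairs)
    return total / len(pairs)

# Hypothetical 8-bit responses of one PUF to three different challenges.
different_challenges = ["10110010", "01100111", "11010100"]
# Hypothetical repeated readings for one challenge (one noisy bit flip).
same_challenge = ["10110010", "10110010", "10110110"]

inter_hd = mean_fractional_hd(different_challenges)
intra_hd = mean_fractional_hd(same_challenge)
```

For a well-behaved PUF one would hope for an inter-Hamming distance close to 0.5 and an intra-Hamming distance close to 0; collecting the pairwise values instead of averaging them gives the data for the histogram mentioned above.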
The concept of the PUF was first proposed by Pappu et al. [14]. Since then, researchers all over the world have focused on PUF-based copyright protection techniques. Li et al. [15] used a PUF, data selectors and reconfigurable logic to hide the original logic functions, thereby preventing attackers from obtaining the complete circuit netlist by reverse engineering. This technique is suitable for combinational and sequential logic circuits. Simulation results show that it achieves high security with less than 10% area overhead. Kumar et al. [16] proposed an SRAM-based "butterfly" PUF (BPUF) and a novel IP protection protocol. The proposed PUF uses an unstable cross-coupled circuit in which the inverters are replaced by latches or flip-flops. A latch can store the circuit signal and can be cleared or reset. Real-time measurement is achieved without requiring a device power-up. The BPUF is suitable for all types of FPGAs. In addition, the same team proposed a public-key cryptography algorithm for FPGA IP protection [17]. It is unnecessary to store the key in the FPGA device, which greatly improves the security of the algorithm. The improved security comes at the cost of extra hardware overhead, but does not noticeably degrade performance.
To ensure that an IP core is legitimate and is used only on a licensed device, Gora et al. [18] extracted a 128-bit secret key from a PUF in the FPGA and used it to encrypt the software IP core. The IP core is thereby bound to a specific FPGA device. This scheme assumes that the system integration vendor is completely trustable, and all CRPs of the PUF in the FPGA are stored by the system integration vendor. Simpson et al. [19] used a PUF to authenticate third-party IP and the hardware platform. In this protocol, the trustable third party (TTP) knows the IP content, so an untrustworthy third party may leak the IP. To address this issue, the authors in [20] proposed a novel PUF structure and improved the authentication protocol so that the TTP cannot obtain the content of the IP core. The proposed PUF is used to generate the secret key for encryption and the message authentication code. The message authentication code can be used to authenticate the originality of IP cores, since encryption alone cannot provide authentication. Zhang et al. [21], [22] proposed several FPGA IP protection methods to resist illegal replay attacks. They also proposed a delay-based PUF to protect FPGA IP cores [23]. The above methods achieve good security but incur a large hardware overhead. The author in [24] proposed an RO-PUF with low overhead and high performance to protect FPGA copyright. During authentication, all CRPs have to be transmitted directly. If they are captured by attackers, this poses a great threat to the security of the PUF, especially for a strong PUF.
FIGURE 1. Improved PUF circuit.
In this work, we consider the random differences in chip manufacturing and propose a PUF-based anonymous IP authentication scheme that authenticates both the hardware chip and the software IP. First, a double PUF authentication model consisting of a physical PUF and a simulated PUF is proposed. In hardware authentication, it is unnecessary to transmit all CRPs, which improves resistance to modeling attacks. The authenticating parties jointly generate the PUF challenges, and the responses are matched for authentication, which resists replay attacks. During authentication, an IP watermark and anonymous information about the IP buyer can be inserted into the design for IP protection and infringement tracing. Existing IP watermarking techniques can be used directly in the proposed method without additional modification. Only a legal IP buyer can use the IP core. If IP infringement occurs, the seller can track the illegal distribution and provide credible evidence. The scheme can also prevent a dishonest seller from posing as a legal IP buyer to obtain compensation. Meanwhile, the identity of an honest buyer is not leaked during authentication.
III. PUF-BASED AUTHENTICATION MODEL
In this section, an improved arbiter PUF is realized on an FPGA, and a double PUF copyright authentication model is designed based on the improved PUF structure. The section introduces the modules of the PUF structure and illustrates the designed PUF authentication model.
A. PUF CIRCUIT MODEL
This section proposes an improved arbiter PUF and its implementation on an FPGA, taking into account the features of the IP protection protocol and the principle of the arbiter PUF. As shown in Fig. 1, the PUF includes three modules: challenge generation, PUF feature extraction and signal voting.
1) CHALLENGE GENERATION MODULE
The challenge generation module includes a linear feedback shift register (LFSR) and a mixing function. The random challenge signals generated by the lightweight LFSR are fed into the mixing function, which generates several groups of testable challenge signals. Ideally, an n-level LFSR produces a maximum-length sequence, and the generated sequence satisfies Golomb's randomness postulates. An n-level LFSR consists of n flip-flops and several XOR gates, as shown in Fig. 2.
Here, D denotes a flip-flop, and f0, f1, f2, ..., fn are feedback coefficients that take the value 0 or 1: fi = 0 means that there is no feedback path at that position in the circuit, and fi = 1 means that a feedback path exists. The initial challenge is the input of the LFSR. A new challenge is generated by cyclic shifting and sent to the mixing function, which produces multiple groups of challenges. The mixing function depends on the number of paths and levels of the arbiter PUF. The extension function outputs the multiple groups of challenges to the PUF circuit. For instance, a 2-XOR PUF that generates a 128-bit response requires two groups of 128-bit challenges, which are generated by the mixing function and used as the PUF input.
FIGURE 2. LFSR challenge generation module.
FIGURE 3. Structure of traditional arbiter PUF.
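The LFSR part of the challenge generator can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the register width (4 bits), the feedback taps (4, 3) and the seed are chosen so that the maximum-length property is easy to check, and they are not the parameters used in the paper. The mixing function is not specified in the text and is therefore left out.

```python
def lfsr_step(state, taps, n):
    """Advance an n-bit Fibonacci LFSR by one shift and return the new state.

    `taps` lists the positions with feedback coefficient 1 (counted from 1 to n);
    all other coefficients are 0.
    """
    feedback = 0
    for t in taps:
        feedback ^= (state >> (n - t)) & 1
    return (state >> 1) | (feedback << (n - 1))

def lfsr_challenges(seed, taps, n, count):
    """Return `count` successive n-bit challenges starting from a nonzero seed."""
    if seed == 0:
        raise ValueError("an all-zero seed locks the LFSR in the zero state")
    out, state = [], seed
    for _ in range(count):
        state = lfsr_step(state, taps, n)
        out.append(state)
    return out

def period(seed, taps, n):
    """Count shifts until the seed state recurs (assumes position n is a tap)."""
    state, steps = lfsr_step(seed, taps, n), 1
    while state != seed:
        state = lfsr_step(state, taps, n)
        steps += 1
    return steps

# Illustrative 4-bit register with taps (4, 3); not the paper's configuration.
challenges = lfsr_challenges(0b0001, (4, 3), 4, 5)
max_length = period(0b0001, (4, 3), 4) == 2 ** 4 - 1
```

With these taps the register visits all 15 nonzero states before repeating, which is the maximum-length behaviour described above for an n-level LFSR.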
2) PUF-BASED CHARACTERISTIC EXTRACTION MODULE
The proposed PUF structure is an arbiter PUF. It is a strong PUF and can provide numerous CRPs for a specific application; this large number of CRPs gives the arbiter PUF good resistance to replay attacks. An arbiter PUF does not detect the absolute delay of a specific path; instead, it checks the relative delay difference between two symmetric paths. The PUF structure consists of multiplexers and an arbiter, as shown in Fig. 3. Each multiplexer stage has two input ports and two output ports. Owing to the manufacturing process, the internal delay of each multiplexer is different, so a signal experiences a different delay on each path. If a challenge bit C = 0, the two signals pass straight through the stage. If C = 1, the two signals cross over to the opposite paths. The arbiter compares the arrival times: if the top signal reaches the arbiter first, the arbiter outputs 1; otherwise, it outputs 0.
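The race between the two paths can be mimicked in software. The sketch below is a toy behavioural model, not the circuit of the paper: each stage is assumed to have four independent path delays (straight top, straight bottom, and the two crossing paths), and the delays are drawn from an arbitrary Gaussian distribution to stand in for manufacturing variation.

```python
import random

def make_puf(n_stages, seed):
    """Draw four path delays per stage to mimic manufacturing variation.

    Each stage is (top->top, bottom->bottom, bottom->top, top->bottom).
    The Gaussian parameters are arbitrary illustration values.
    """
    rng = random.Random(seed)
    return [tuple(rng.gauss(1.0, 0.05) for _ in range(4)) for _ in range(n_stages)]

def arbiter_response(stages, challenge):
    """Propagate two edges through the stages and let the arbiter decide."""
    if len(stages) != len(challenge):
        raise ValueError("one challenge bit is needed per stage")
    t_top = t_bot = 0.0
    for (d_tt, d_bb, d_bt, d_tb), c in zip(stages, challenge):
        if c == 0:
            # C = 0: both signals pass straight through.
            t_top, t_bot = t_top + d_tt, t_bot + d_bb
        else:
            # C = 1: the signals cross over to the opposite paths.
            t_top, t_bot = t_bot + d_bt, t_top + d_tb
    # The arbiter outputs 1 if the top signal arrives first, otherwise 0.
    return 1 if t_top < t_bot else 0

chip = make_puf(n_stages=8, seed=1)
challenge = [1, 0, 1, 1, 0, 0, 1, 0]
response = arbiter_response(chip, challenge)
```

Because only the sign of the accumulated delay difference matters, two chips built from the same design (here, two different seeds) can answer the same challenge differently.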
FIGURE 4. PUF implementation on FPGA.
FIGURE 5. Structure of Slice MUX.
However, implementing the traditional arbiter PUF on an FPGA is difficult because of the coupling paths between multiplexers. They lead to asymmetric wiring, so the PUF response has low uniqueness. To address this issue, the authors in [25] proposed a double arbiter PUF. It effectively improves the uniqueness, but causes the FPGA resource usage to grow exponentially. Building on this, the authors in [26] pointed out that the coupling paths between multiplexers should be eliminated to achieve symmetric wiring on the FPGA, which ensures good uniqueness of the PUF response and reduces the hardware resources of the PUF. However, the technique of [26] mainly reduces the resource consumption of the traditional arbiter PUF and still has shortcomings in uniqueness and stability.
The proposed PUF structure builds on the technique in [26] to reduce the resource overhead of the traditional implementation. XORing the outputs of two arbiter PUFs effectively improves the uniqueness of the PUF. In addition, a signal voting module is added to the PUF to generate a stable response. This module follows the majority rule: the challenges are fed into the PUF circuit, and the signal voting module selects the value that appears most often as the PUF response.
The PUF is implemented on a Xilinx Virtex-5, as shown in Fig. 4. Here, both MUX components in each delay node are built from the Slice MUX in Fig. 5. In a Virtex-5 FPGA, each slice includes four 6-input lookup tables (LUTs), several multiplexers, and other logic resources. The LUT is the basic unit for realizing logic functions. One LUT can implement a 4:1 multiplexer, so a slice can implement four 4:1 multiplexers. Similarly, four LUTs (namely one slice) can implement a 16:1 multiplexer. In addition, the Virtex-5 has three dedicated multiplexers, F7AMUX, F7BMUX and F8MUX, which can realize a 16:1 multiplexer with 11 control signals by combining the LUTs. The paths are parallel, and the control signals only change the transmission paths of the signal within the Slice MUX. Symmetry is easily achieved on the FPGA because of the parallel structure and the identical structure of the slices.
FIGURE 6. Structure of signal voting circuit.
3) SIGNAL VOTING MODULE
The signal voting circuit applies a challenge repeatedly and selects the output value that appears most often as the response. It follows the majority rule. This avoids bit flips caused by occasional disturbances and keeps the response stable with few hardware resources. Traditional implementations widely use error correction algorithms to achieve stability, but these require a large hardware overhead and are not suitable for a lightweight PUF.
The structure of the signal voting circuit is shown in Fig. 6. The sampling counter ct is used to sample the decision result sr over several repetitions of a challenge. rmaj denotes the output that appears most often in sr, and tmaj denotes the count associated with rmaj. First, the parameters of the signal voting circuit are initialized: ct and tmaj are set to 0. When the challenge is applied, ct starts sampling, the first response sr is used as the initial value of rmaj, and tmaj is incremented by 1. For each subsequent response, tmaj is incremented by 1 if the response equals rmaj and decremented by 1 otherwise. When tmaj = 0, sr is compared with rmaj, and if they differ, rmaj is replaced by sr. These operations are repeated until sampling finishes. The valid output of the signal voting circuit is the final value of rmaj.
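The counting procedure closely resembles the Boyer-Moore majority vote algorithm, and one possible software reading of it is sketched below. The text does not say exactly how the counter is handled once it reaches zero; the sketch assumes that the next sample simply becomes the new candidate with a count of 1. The variable names follow the symbols rmaj, tmaj and sr used above.

```python
def vote(samples):
    """Return the majority value of repeated single-bit PUF responses.

    r_maj is the current majority candidate and t_maj its running counter.
    """
    r_maj = None
    t_maj = 0
    for s_r in samples:
        if t_maj == 0:
            # Counter exhausted (or first sample): adopt s_r as the candidate.
            r_maj, t_maj = s_r, 1
        elif s_r == r_maj:
            t_maj += 1
        else:
            t_maj -= 1
    return r_maj

# Five hypothetical readings of the same challenge with two bit flips.
stable_bit = vote([1, 0, 1, 1, 0])
```

For single-bit responses and an odd number of samples a strict majority always exists, so the candidate left at the end is the majority value. The circuit needs only one stored bit and one small counter, which matches the low hardware cost claimed for the voting module.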
B. DOUBLE PUF AUTHENTICATION MODEL
The security of arbiter PUFs has received wide attention in recent years. The arbiter PUF is a type of strong PUF [27]. Ideally, a PUF is unclonable; that is, a simulated model that behaves like the original physical PUF cannot be built from the PUF's CRPs. However, existing arbiter PUFs can be modeled in software given enough CRPs. The PUF response depends mainly on the challenge C and the internal delay vector ω of the PUF, and ω can be calculated from enough CRPs. The PUF model can then be simulated using a machine learning algorithm. If an attacker captures enough CRPs, a modeling attack is likely to succeed.
FIGURE 7. Double PUF based authentication model.
In previous PUF-based authentication techniques, the CRPs generated by the PUF are stored in a database at the initial stage. Each CRP is removed from the database after one round of authentication, which resists replay attacks. The drawback of these techniques is that the FPGA vendor has to store numerous CRPs. For a strong PUF, the number of CRPs grows exponentially with the IC area, so the CRPs recorded at registration may greatly exceed what authentication requires. In addition, the transmission of PUF CRPs requires a secure channel to avoid machine learning attacks.
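The conventional database approach can be pictured with a small sketch. The class and method names are hypothetical and the CRPs are placeholder strings; the point is only that every stored pair is consumed after one use, so a replayed response is rejected and the verifier's storage must cover every future authentication.

```python
class CrpDatabase:
    """Toy model of a verifier that stores CRPs at enrollment (hypothetical API)."""

    def __init__(self, crps):
        self._crps = dict(crps)

    def remaining(self):
        """Number of unused CRPs still held by the verifier."""
        return len(self._crps)

    def next_challenge(self):
        """Pick an unused challenge to send to the device."""
        if not self._crps:
            raise LookupError("no unused CRPs left; the device must be re-enrolled")
        return next(iter(self._crps))

    def verify(self, challenge, response):
        """Check a response and delete the CRP so that it cannot be replayed."""
        expected = self._crps.pop(challenge, None)
        return expected is not None and expected == response

# Placeholder CRPs recorded at registration.
db = CrpDatabase({"c1": "r1", "c2": "r2", "c3": "r3"})
challenge = db.next_challenge()
first_try = db.verify(challenge, "r1")   # genuine device answers correctly
replayed = db.verify(challenge, "r1")    # an eavesdropper repeats the answer
```

The storage cost grows with the number of authentications that must be supported, which is the overhead the double PUF model described next is meant to avoid.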
In this work, a double PUF authentication model is proposed, as shown in Fig. 7. In this model, the FPGA manufacturer uses the simulated model, while the physical PUF is deployed in the chip. The legitimate manufacturer sets an access point for the original PUF, from which the PUF CRPs can be collected. The collected CRPs can be legally analyzed and used to establish a simulated model. The access point is destroyed permanently after the model has been built successfully. The authors in [28] pointed out that a PUF model with an accuracy of 90% (an error rate of 10%) can be built in a short time with only 1000 CRPs; this applies mainly to simple PUFs whose CRPs are completely leaked. The simulated PUF behaves similarly to the original physical PUF and can be used in identity authentication.
The response of an n-level arbiter PUF depends on the delay difference of the signal on each path, namely, on the sum of the delays over all paths. The delay difference is related to the challenge signal. Let $\mu_{1,i}$ and $\mu_{0,i}$ denote the delay difference associated with challenge bit "1" and "0", respectively, on the i-th path of the n-level arbiter PUF. The FPGA manufacturer measures all CRPs of each chip via the access point and establishes the simulated model. For an m-level arbiter PUF, the delay vector $\vec{\nu} = (\nu_0, \nu_1, \ldots, \nu_m)$ can be calculated to build the simulated PUF model, as in (1).

$$
\begin{cases}
\nu_0 = \mu_{0,1} - \mu_{1,1} \\
\nu_i = \mu_{0,i} + \mu_{1,i} + \mu_{0,i+1} - \mu_{1,i+1}, \quad i \in [1, m-1] \\
\quad \vdots \\
\nu_m = \mu_{0,m-1} + \mu_{1,m-1}
\end{cases}
\tag{1}
$$
FIGURE 8. Participators in IP trading.
At the output end, the total delay $\Delta D$ is the product of the transposed delay vector and the characteristic vector $\vec{\varphi}$ of the challenge C, namely $\Delta D = \vec{\nu}^{T} \vec{\varphi}$. If $\Delta D > 0$, we have R = 1. Otherwise, R = 0. The characteristic vector $\vec{\varphi}$ of the challenge C can be represented by equation (2).

$$
\begin{cases}
\varphi_i = \prod_{t=i}^{m} (-1)^{C_t}, \quad i \in [0, k-1] \\
\varphi_k = 1
\end{cases}
\tag{2}
$$
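The linear form of this model is what makes both the legitimate simulated PUF and a modeling attack possible. The sketch below illustrates this under several assumptions that are not taken from the paper: challenge bits are indexed from 0 to n-1, the characteristic vector has n+1 entries with the last one fixed to 1, the "true" delay vector is synthetic, and a plain perceptron stands in for whatever learning algorithm is actually used to fit the model.

```python
import random

def features(challenge):
    """Characteristic vector: phi_i is the product of (-1)^C_t for t >= i,
    followed by a constant 1 (indexing assumed for this sketch)."""
    n = len(challenge)
    phi = [1] * (n + 1)
    for i in range(n - 1, -1, -1):
        phi[i] = phi[i + 1] * (-1 if challenge[i] else 1)
    return phi

def response(nu, challenge):
    """R = 1 if the total delay nu^T phi is positive, otherwise 0."""
    delta_d = sum(v * p for v, p in zip(nu, features(challenge)))
    return 1 if delta_d > 0 else 0

def train_perceptron(crps, n, epochs=30):
    """Fit a delay vector from observed CRPs (stand-in learning algorithm)."""
    w = [0.0] * (n + 1)
    for _ in range(epochs):
        for challenge, r in crps:
            if response(w, challenge) != r:
                sign = 1 if r == 1 else -1
                phi = features(challenge)
                w = [wi + sign * p for wi, p in zip(w, phi)]
    return w

rng = random.Random(0)
n = 16
# Synthetic delay vector playing the role of the physical PUF.
true_nu = [rng.gauss(0.0, 1.0) for _ in range(n + 1)]

def random_crps(count):
    """Collect CRPs, as the manufacturer would through the access point."""
    crps = []
    for _ in range(count):
        c = [rng.randint(0, 1) for _ in range(n)]
        crps.append((c, response(true_nu, c)))
    return crps

train_set, test_set = random_crps(1000), random_crps(500)
learned_nu = train_perceptron(train_set, n)
accuracy = sum(response(learned_nu, c) == r for c, r in test_set) / len(test_set)
```

The same fitting step serves the manufacturer, who collects CRPs through the access point to build the simulated PUF, and an attacker who manages to eavesdrop enough CRPs, which is why the scheme avoids transmitting CRPs in the clear.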
In this model, the PUF challenge is composed of random numbers from both authentication parties, so a malicious attacker cannot completely control the PUF challenge in a round of authentication. Moreover, the full set of PUF CRPs is never transmitted directly, which makes it difficult for attackers to capture enough CRPs for a modeling attack. In the PUF implementation, the outputs of the arbiters are XORed to enhance the resistance against attacks. The interests of the participators in IP trading are also considered in this work. A digital watermark is used to protect IP copyright and to guard against piracy by the IP buyer. Existing watermarking techniques can be used directly in the proposed scheme without extra modification.
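How the two random numbers are combined is not spelled out here, so the following sketch shows just one plausible construction: each party contributes a nonce, and the challenge bits are derived by hashing the concatenation. The use of SHA-256, the 16-byte nonce length, the 128-bit challenge width and the function name are all assumptions made for illustration.

```python
import hashlib
import secrets

def joint_challenge(nonce_a, nonce_b, n_bits=128):
    """Derive an n-bit PUF challenge from nonces contributed by both parties."""
    if not 0 < n_bits <= 256:
        raise ValueError("n_bits must be between 1 and 256 for SHA-256")
    digest = hashlib.sha256(nonce_a + nonce_b).digest()
    value = int.from_bytes(digest, "big") >> (256 - n_bits)
    # Return the challenge as a list of bits, least significant bit first.
    return [(value >> i) & 1 for i in range(n_bits)]

# Each party draws its own fixed-length nonce; neither controls the result alone.
nonce_device_side = secrets.token_bytes(16)
nonce_verifier_side = secrets.token_bytes(16)
challenge_bits = joint_challenge(nonce_device_side, nonce_verifier_side)
```

Since a fresh nonce from either side changes the derived challenge, a recorded response from an earlier session is of little use to an attacker who controls only one of the two inputs.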
IV. ANONYMOUS IP AUTHENTICATION SCHEME
A. PARTICIPATORS IN IP TRADING
This work considers authentication in both software and hardware, mainly involving device authentication and IP authentication. The former authenticates the legality of the chip, and the latter protects the copyright of the IP owner. Throughout IP trading, the participators must follow the security protocol to safeguard their interests.
The participators in IP trading are shown in Fig. 8. They include the FPGA vendor, the IP vendor, the system integration vendor, the trustable third party, etc. [29]. The FPGA vendor (FV) is a semiconductor company that produces FPGAs, such as Xilinx or Altera. The IP core vendor (CV) is a company or individual who designs and implements IP cores. The system developer (SD) uses the hardware from the FPGA vendor and the IP cores from the IP vendor to design a complex system. The trustable third party (TTP) is assumed to be an authority that the other participators can trust. It can handle data storage, processing and transmission.
FV manufactures a new type of FPGA every 12 to 18 months. The entire flow requires considerable effort in design, manufacturing and verification. The number of transistors on a single piece of silicon is limited. Therefore, FV only implements embedded functions in the FPGA that serve the majority of users or a minority of big customers. FV has two concerns. On the one hand, the FPGA design should be protected from reverse engineering, illegal copying, leakage and tampering. On the other hand, security measures should be provided to protect the designs of IP users and to secure the trading of IP cores.
FIGURE 9. Participator registration protocol.
SD integrates the purchased IP designs into a complex system. These IP designs may come from different IP vendors. The system is realized by following IP integration rules. The protection of the system should take the implementation cost into account and remain valid throughout the entire life cycle.
CV can be FV or another company that designs and sells IP cores. After an IP core has been successfully verified, CV can sell it in different forms depending on the design level. The main concern of CV is to ensure that IP cores are used only by legal IP buyers after trading. Malicious infringement and resale of IP cores should be prevented. In addition, when infringement occurs, the IP copyright should be verifiable and the infringement traceable.
TTP can handle data storage, processing and transmission in the authentication protocol. Adding a third party to a protocol is easy, but it causes many problems in practice: TTP stores a large amount of critical information and is vulnerable to attacks such as denial of service (DoS). FV has a direct relationship with both SD and CV. In this protocol, FV therefore acts as the TTP, which reduces the communication complexity of the PUF-based IP trading protocol.
B. PROTOCOL IMPLEMENTATION
The proposed anonymous IP authentication protocol includes registration, IP trading, copyright authentication and tracing.
1) REGISTRATION
The registration protocol includes chip registration, IP registration and SD registration, as shown in Fig. 9. Each is described as follows.
• FPGA Registration
The PUF module is inserted into each manufactured chip $F_i$. For a chip $ID(F^i_{PUF})$, the PUF CRPs are tested and used to analyze the delay attributes. The delay vector of the PUF is then stored in the database DB for authenticating the identity $ID(F^i_{PUF})$. FV issues $ID(F^i_{PUF})$ for trading. SD or CV sends $ID(F^i_{PUF})$ to FV and applies to buy $F^i_{PUF}$. FV records the identity of the buyer, ID(SDi) or ID(CVi), and allocates a unique delay vector to it.
• IP Registration
CV applies to FV for IP registration and sends {SCVi, Hash(IPi), Description} to FV. After receiving this registration information from CV, FV generates a random number Nc and a symmetric key KeyFC and calculates the following formulas.
ID(CVi) = Hash(SCVi) (3)
ID(IPi) = Hash(IPi)⊕Nc (4)
KFC = Hash(SCVi)⊕KeyFC (5)
VNc = Hash(SCVi||Nc||KeyFC) (6)
FV is assumed to be trustable. Nevertheless, in the registration stage CV sends only the hash and the description of the IP to FV, so that the IP content itself remains protected. After allocating the IDs, FV stores {ID(CVi), ID(IPi), Hash(IPi), Description} and returns {ID(CVi), ID(IPi), KFC, VNc} to CV. CV extracts Nc and KeyFC and verifies whether the received content has been tampered with. If verification succeeds, the registered identity ID(CVi) and the IP registration information {ID(IPi), IPi, Nc, KeyFC, Description} are stored in the database. ID(IPi) is issued for public trading.
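The IP registration exchange of (3)-(6) can be traced with a short sketch. The choice of SHA-256 for Hash, 32-byte values for Nc and KeyFC, byte strings for SCVi, and all function names are assumptions for illustration. The way CV recovers Nc and KeyFC (by XORing the returned values with hashes it can compute itself) is one reading of "extracts" and is not stated explicitly in the text.

```python
import hashlib
import hmac
import secrets

def h(*parts):
    """Hash of the concatenation of byte strings (SHA-256 assumed)."""
    return hashlib.sha256(b"".join(parts)).digest()

def xor(a, b):
    """Bytewise XOR of two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("operands must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))

def fv_register_ip(s_cv, hash_ip):
    """FV side: draw Nc and KeyFC, then evaluate equations (3)-(6)."""
    n_c = secrets.token_bytes(32)
    key_fc = secrets.token_bytes(32)
    return {
        "ID_CV": h(s_cv),               # (3) ID(CVi) = Hash(SCVi)
        "ID_IP": xor(hash_ip, n_c),     # (4) ID(IPi) = Hash(IPi) xor Nc
        "K_FC": xor(h(s_cv), key_fc),   # (5) KFC = Hash(SCVi) xor KeyFC
        "V_Nc": h(s_cv, n_c, key_fc),   # (6) VNc = Hash(SCVi || Nc || KeyFC)
    }

def cv_verify(s_cv, hash_ip, reply):
    """CV side: recover Nc and KeyFC and check them against VNc."""
    n_c = xor(reply["ID_IP"], hash_ip)
    key_fc = xor(reply["K_FC"], h(s_cv))
    ok = hmac.compare_digest(h(s_cv, n_c, key_fc), reply["V_Nc"])
    return ok, n_c, key_fc

# Placeholder identity and IP content for the example.
s_cv = b"core-vendor-secret-identity"
hash_ip = h(b"placeholder IP core netlist")
reply = fv_register_ip(s_cv, hash_ip)
verified, n_c, key_fc = cv_verify(s_cv, hash_ip, reply)
```

If any of the returned fields is altered in transit, the recomputed hash no longer matches VNc and CV rejects the registration, which corresponds to the tamper check described above.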
• SD Registration
SD needs to buy the software IP design and the FPGA device from CV and FV, respectively, to realize a complex system. For registration, SD sends the identity SSDi to FV. After receiving the registration information from SD, FV generates a random number Ns and a symmetric key KeyFS and calculates the following equations.
ID(SDi) = Hash(SSDi)⊕Ns (7)
KFS = Hash(SSDi)⊕KeyFS (8)
VNs = Hash(SSDi||Ns||KeyFS) (9)
FV stores ID(SDi), Ns, SSDi and KeyFS in the database. ID(SDi), KFS and VNs are sent to SD. The identity ID is unique for each SD. No party other than SD and FV knows the real identity behind the ID, so the identity of SD is anonymous to CV. After receiving the registration information from FV, SD extracts Ns and KeyFS and verifies their validity with SSDi. If verification succeeds, {ID(SDi), Ns, KeyFS} is stored in the database. Otherwise, the registration is invalid.
2) IP TRADING PROTOCOL
The trading protocol covers FPGA trading and IP trading. FV publishes the IDs and descriptions of FPGA devices for trading. When SD or CV needs to buy an FPGA device, ID(FiPUF) is used. FV stores the trading record {ID(SDi)/ID(CVi), FiPUF} and sends the FPGA device FiPUF to SD or CV. The IP trading protocol is shown in Fig. 10 and proceeds as follows.
Step 1: CV stores ID(IPi) and IPi in its database and issues ID(IPi) for trading IPi.
Step 2: SD sends {ID(SDi), ID(IPi), ID(FiPUF)} to FV and applies for a fingerprint for IP trading.
Step 3: FV verifies the legality of the ID and generates a random number Ni to calculate a temporary identity of SD by equation (10) and a disposable trading password by equation (11). {IDti, ID(IPi), Pi} is sent to CV and SD.
IDti = ID(SDi)⊕Ni (10)
Pi = H(Ni||IDti||ID(SDi)) (11)
Step 4: SD receives the temporary identity IDti and the password Pi for IP trading. SD then generates a random number a, calculates R(a) and sends {ID(FiPUF), IDti, a, R(a), ID(IPi)} to CV.
Step 5: CV finds the IP core with ID(IPi) and inserts the copyright information and the anonymous fingerprint of the IP buyer into the IP core, generating IPwi. CV determines whether IDti exists in its database. If so, IDti is sent to FV.
Step 6: FV calculates ID(SDi) from IDti and finds {ωSDi}ID(SDi). By decryption, ωSDi is obtained and sent to CVi.
Step 7: CVi calculates R′(a) with ωSDi. If R′(a) = R(a), SDi is successfully verified and ID(CVi) is sent to FV.
Step 8: FV searches its database for {ωCVi}ID(CVi) with ID(CVi). {ωCVi}ID(CVi) is decrypted with ID(CVi) to get ωCVi. A random number b is generated and R′(b) is calculated with ωCVi. b is sent to CVi.
Step 9: CVi uses b as the challenge of PUFmod. The response is R(b). R(b) and ID(FiPUF) are sent to FV.
Step 10: If R(b) = R′(b), FV successfully verifies CVi and updates ID(CVi) = ID(CVni).
Step 11: CVi updates ID(CVi) = ID(CVni) and calculates a combined challenge CFiPUF with a and b. The response RFiPUF is generated by PUFmod. The encrypted IP core E(RFiPUF : IPwi) is obtained. {b, ID(FiPUF), E(RFiPUF : IPwi)} is sent to SD.
Step 12: SDi receives E(RFiPUF : IPwi) and calculates the combined challenge with a and b. The response is then generated and used as the key to decrypt IPwi so that it runs normally.
124790 VOLUME 7, 2019
J. Long et al.: PUF-Based Anonymous Authentication Scheme for Hardware Devices and IPs in Edge Computing Environment
TABLE 1. Illustration of symbols in protocol.
FIGURE 10. IP trading protocol.
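The temporary identity and disposable password of equations (10) and (11) can be sketched as follows. SHA-256 for H() and the helper names are assumptions made for illustration; the sketch only shows why FV, which knows Ni, can map IDti back to ID(SDi) while two trades by the same SD yield unlinkable identities.

```python
import hashlib
import secrets

def h(data: bytes) -> bytes:
    """H(); SHA-256 is an assumption, the paper does not name the function."""
    return hashlib.sha256(data).digest()

def xor(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    assert len(a) == len(b)
    return bytes(x ^ y for x, y in zip(a, b))

def fv_issue_trading_credentials(id_sd: bytes):
    """FV side of Step 3: fresh Ni, temporary identity IDti, password Pi."""
    n_i = secrets.token_bytes(len(id_sd))   # random number Ni
    id_t = xor(id_sd, n_i)                  # (10) IDti = ID(SDi) xor Ni
    p_i = h(n_i + id_t + id_sd)             # (11) Pi = H(Ni || IDti || ID(SDi))
    return n_i, id_t, p_i

def fv_resolve_identity(id_t: bytes, n_i: bytes) -> bytes:
    """FV side of Step 6: recover ID(SDi) from IDti using the stored Ni."""
    return xor(id_t, n_i)

def fv_check_password(p_i: bytes, n_i: bytes, id_t: bytes) -> bool:
    """FV recomputes Pi from its own record to validate a trading password."""
    return p_i == h(n_i + id_t + fv_resolve_identity(id_t, n_i))
```

A fresh Ni per trade is what keeps repeated purchases by the same SD from being linked by CV.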
In each trading procedure, SD applies for a unique fingerprint to ensure the anonymity of trading. Even if the same SD buys IP cores several times, CV learns nothing about the real identity of the customer. IPwi provides passive copyright authentication and tracing after the active encryption protection is cracked. The temporary identity IDti of SD and the signature of CV can be used as the fingerprint of the IP buyer and the copyright information, respectively. In the anonymous IP authentication scheme, the identities of SDi and CVi must first be authenticated to prevent illegal users from obtaining decrypted IP cores. In the double PUF authentication model, FV generates a random number b as the challenge to verify the legality of CVi. FV and CVi do not need to store all CRPs in advance, which gives the scheme an advantage in resource overhead, security and applicability. There are two authentication processes before IP decryption.
• First, CVi authenticates SDi. In registration, a unique identity ID(SDi) is allocated to SDi and bound to a delay vector ωSDi. CVi applies to FV to search for {ωSDi}ID(SDi) with IDti. After decryption with ID(SDi), ωSDi is sent to CVi. CVi uses the random number a generated by SD as the PUF challenge with ωSDi and calculates the response R′(a). If R′(a) and R(a) are consistent, the identity of SDi is successfully authenticated. Only a legal SDi can decrypt the IP content.
• Second, FV authenticates CVi. CVi sends ID(CVi) to FV. FV searches its database for {ωCVi}ID(CVi) with ID(CVi). After decryption with ID(CVi), ωCVi is obtained. A random number b acts as the PUF challenge with ωCVi, and the response is R′(b). The random number b is sent to CVi. CVi uses b as the challenge of its PUF, producing R(b). R(b) and ID(FiPUF) are sent to FV for comparison. If R(b) is equal to R′(b), CVi is successfully authenticated. ID(CVi) should then be updated to prevent a legal CVi from leaking the delay vector. Otherwise, an illegal CVi could cheat FV and pass the authentication.
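The comparison of R(a) with R′(a) (and of R(b) with R′(b)) can be sketched with a software PUF model built from a stored delay vector. This is only an illustration under strong assumptions: it uses the standard additive delay model of a single arbiter PUF rather than the paper's 2-XOR PUF, expands the random number with a hash rather than an LFSR, and picks hypothetical sizes and a hypothetical 10% mismatch tolerance.

```python
import hashlib
import random

N_STAGES = 64   # hypothetical number of delay stages
N_BITS = 64     # hypothetical response length in bits

def make_delay_vector(seed):
    """Stand-in for a device's delay vector omega (seeded Gaussian weights)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(N_STAGES + 1)]

def expand(challenge, k):
    """Derive the k-th stage-level challenge from the random number.
    Hash-based expansion is an assumption made for this sketch."""
    digest = hashlib.sha256(challenge + k.to_bytes(4, "big")).digest()
    bits = [(byte >> s) & 1 for byte in digest for s in range(8)]
    return bits[:N_STAGES]

def arbiter_bit(omega, bits):
    """Additive delay model: the bit is the sign of omega . phi,
    where phi[i] is the parity feature prod_{j>=i} (1 - 2*bits[j])."""
    phi = [1] * (len(bits) + 1)
    for i in range(len(bits) - 1, -1, -1):
        phi[i] = phi[i + 1] * (1 - 2 * bits[i])
    return 1 if sum(w * p for w, p in zip(omega, phi)) > 0 else 0

def puf_response(omega, challenge):
    """Multi-bit response R(challenge) of the PUF described by omega."""
    return [arbiter_bit(omega, expand(challenge, k)) for k in range(N_BITS)]

def authenticate(stored_omega, challenge, claimed, max_mismatch=0.1):
    """Verifier side: recompute R' from the stored delay vector and accept
    if the claimed response differs in at most max_mismatch of the bits."""
    expected = puf_response(stored_omega, challenge)
    mismatches = sum(x != y for x, y in zip(expected, claimed))
    return mismatches <= max_mismatch * len(expected)
```

In this sketch the verifier needs only the delay vector, not a table of CRPs, which mirrors the storage argument made above.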
3) COPYRIGHT AUTHENTICATION AND INFRINGEMENT TRACING
In this section, we consider hardware authentication and software IP authentication. Assume there are two cases.
• A legal SD buys a hardware FPGA device from a seller. SD can send ID(FiPUF) to FV to authenticate the legality of the device. FV searches its database to determine whether the hardware ID exists. If so, a randomly selected CRP is returned to SD for authentication. After receiving the authentication information, SD calculates the PUF response with the challenge. If the response is equal to that from FV, the hardware identity is legal. Otherwise, SD can report the forgery to FV. Both SD and FV can track the initiator of the forgery and pursue the infringement.
• If CV finds that an IP core has been misappropriated, CV can apply to authenticate the IP copyright. CV sends the identity ID(IPi) of the suspected IP and ID(FiPUF) to FV. With the participation of FV, IDti and SSDi are extracted from the suspected IP. If the extraction is successful, the IP copyright is proven. FV then uses the extracted temporary identity IDti of SD to look up ID(SDi) and the real identity in order to track the infringement.
V. SECURITY ANALYSIS
The security analysis mainly involves the counterfeit attack, the modeling attack, the replay attack and anonymity. In a counterfeit attack, an attacker pretends to be a legal participant and steals key information. A modeling attack learns the responses of the PUF in the protocol and builds a PUF model that behaves similarly to the original. A replay attack uses historical challenges to generate the corresponding response key. Anonymity means that a participant uses a temporary identity for privacy protection. The concrete analysis is as follows.
A. COUNTERFEIT ATTACK
In the proposed protocol, an attacker cannot pretend to be a trustworthy FV. FV is a trusted participant responsible for registration, authentication, etc. The PUF hardware circuit is implemented in the manufactured chip, and the CRPs of the PUF are analyzed to build the PUF model. To impersonate FV, an attacker would need to obtain all device and trading information. However, this information is very difficult to obtain because it is critical to FV. In addition, SD and CV encrypt their identities with FV's public key before transmission. Only the trustworthy FV can decrypt them and obtain the real identity information. Taking SD as an example, FV calculates ID(SDi) = Hash(SSDi) ⊕ Ns and KFS = Hash(SSDi) ⊕ KeyFS. Ns and KeyFS are then sent to SD. No user other than SD can obtain Ns and KeyFS. The one-way hash function verifies whether the received information comes from the trustworthy FV. Similarly, Ns and KeyFS can also be verified. In the trading process, SD verifies whether VN1 = Hash(SSD1||Ns||KeyFS) holds, thereby authenticating FV. Using the hash function for authentication prevents an attacker from posing as FV. Moreover, the disposable trading password Pi = H(Ni||IDti||ID(SDi)) is used in trading, where Ni is a random key. The trustworthy FV can verify Pi to determine the validity of SD, so an attacker cannot pose as a legal SD.
B. MODELLING ATTACK
The modeling attack is the biggest threat to the arbiter PUF. The PUF response R mainly depends on the challenge C and the inner delay vector ω. With enough PUF CRPs, the inner delay vector ω can be calculated and used to model a simulated PUF. An attacker may use machine learning to perform the modeling attack. A suitability function f(.) is required to determine which PUF model is closest to the original one for a given ω. However, machine learning is effective mainly against single, simple PUFs, because an XOR operation mixes PUF responses and effectively improves PUF security. In the proposed protocol, PUF challenges are composed of random numbers from both authentication parties, so a malicious attacker cannot control a complete PUF challenge in one round of authentication. The protocol does not transmit all CRPs directly. The FPGA manufacturer implements a PUF in each manufactured chip and tests all PUF CRPs via an access point. After a simulated PUF is built from the CRPs, the access point is permanently destroyed. The simulated PUF behaves similarly to the original one, and FV can use the model to authenticate the device containing the original PUF. In addition, an attacker requires at least Nmin CRPs to mount a modeling attack on an N-level PUF [27]. Here, Nmin = N/e, where e is an error threshold. If the PUF model has an accuracy rate of 90%, the error threshold is 10%. In this work, a 2-XOR PUF is used, which has better security than the traditional one. In the double PUF authentication model, it is difficult for an attacker to obtain enough complete CRPs for a modeling attack.
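As a worked instance of the bound quoted from [27], assume purely for illustration a PUF with N = 128 levels (this value is not stated for the bound in the text) and the 10% error threshold mentioned above:

\[ N_{\min} = \frac{N}{e} = \frac{128}{0.10} = 1280 \]

Under these assumed values an attacker would need on the order of 1280 complete CRPs, and tightening the error threshold to 1% would raise the requirement tenfold to 12800.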
C. REPLAY ATTACK
Resistance to replay is analyzed in two aspects. On one hand, the transmitted CRPs cannot be captured by an attacker during hardware authentication. PUF challenges are generated jointly by FV and SD so that an attacker cannot capture a complete challenge. In the worst case, the attacker obtains half of a challenge. The proposed PUF structure performs well with respect to the avalanche effect: one changed bit of the PUF challenge causes over half of the PUF response bits to flip. An attacker therefore cannot mount a replay attack even after capturing half of a PUF challenge. On the other hand, a legal SD may try to forge the IP copyright by a replay attack and pose as the IP owner. In registration, CV applies to FV for identity authentication and IP registration. When CV needs to authenticate IP ownership, CV can extract the identity information and the temporary identity of the buyer from the IP design. FV can participate in the authentication and can track the real identity of the IP buyer with the temporary identity, so the existence of the trading record can be proven. A malicious SD may also extract forged copyright information, but it cannot convince FV and CV. In the worst case, a malicious SD removes the copyright information of CV and the fingerprint of the IP buyer, so that the IP design loses its passive protection. In this case, the hash of the IP design is compared to the one stored by FV. If both are consistent, the counterfeit behavior of SD is proven. However, if SD inserts a fingerprint and his own signature into the IP design, the hash will differ from the one in FV's database. If the trading record exists, the IP bitstream can also be analyzed. If the result exceeds the threshold, SD has probably forged the IP copyright as well.
TABLE 2. Comparison of PUF resource overhead.
D. ANONYMITY
In registration, the trustworthy FV calculates the identity of SD from the real identity information SSDi and a random key Ns. In the trading procedure, FV generates a new temporary trading identity IDti for SD from ID(SDi) and Ni. The temporary trading identity ensures the privacy and security of the IP buyer. The anonymous identity of SD is sent to CV and inserted into the IP design as tracking evidence of infringement. Anonymity makes it difficult for an illegal CV to pose as a legal SD, resell the IP, and falsely accuse SD to claim compensation. If SD resells the IP illegally, CV can extract the identity information to prove IP ownership. The extracted anonymous identity of SD can be sent to FV to track the infringement.
VI. EXPERIMENT ANALYSIS
In this section, experiments are conducted on a Xilinx Virtex-5 FPGA for performance evaluation. The design tool ISE, the logic synthesis software Synplify, and the simulation tool Modelsim are used in the experiments. The PUF circuit is implemented in the Virtex-5 FPGA device. A 128-bit binary sequence is then generated by a random function and preset in the LFSR for challenge generation. With each shift pulse, the generated challenge is input to the PUF circuit, producing a PUF response. This section mainly evaluates the resource overhead and the PUF performance.
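The LFSR-based challenge generation described above can be sketched as a 128-bit Fibonacci LFSR that is preset with a random seed and advanced once per shift pulse. The tap positions below are an assumption chosen for illustration; the text does not give the feedback polynomial used in the experiment.

```python
WIDTH = 128
TAPS = (128, 126, 101, 99)  # assumed tap positions (1-indexed), not from the paper

def lfsr_step(state: int) -> int:
    """Advance the LFSR by one shift pulse and return the new 128-bit state."""
    feedback = 0
    for t in TAPS:
        feedback ^= (state >> (t - 1)) & 1          # XOR of the tapped bits
    return ((state << 1) | feedback) & ((1 << WIDTH) - 1)

def challenges(seed: int, count: int):
    """Yield `count` successive challenges, starting from a nonzero preset."""
    if seed == 0:
        raise ValueError("an all-zero state locks up an XOR-feedback LFSR")
    state = seed
    for _ in range(count):
        state = lfsr_step(state)
        yield state
```

For example, `list(challenges(secrets.randbits(128) or 1, 5))` (with `import secrets`) would produce five successive challenges from a random preset.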
A. RESOURCE OVERHEAD
This work implements a 128-bit PUF response via a 2-XOR PUF circuit. Besides the hardware resource overhead of the PUF itself, auxiliary modules such as challenge generation, the extension function and signal voting also consume hardware resources. The PUFs used for comparison are the 2-1 DAPUF [25] and the built-in self-adjustable PUF [30]. The comparison result is listed in Table 2.
As shown in Table 2, the proposed PUF performs well in resource overhead. The built-in self-adjustable PUF determines the delay of two delay paths in implementation. It achieves good uniqueness and stability at the cost of hardware resources. The 2-1 DAPUF includes two arbiters, and an XOR operation is used to improve the uniqueness. The proposed PUF uses four delay paths, but its resource overhead is 31.61% lower than that of the self-adjustable PUF and 61.96% lower than that of the 2-1 DAPUF.
B. PERFORMANCE EVALUATION
This section evaluates the randomness, stability and uniqueness of the PUF. The equations for these metrics are taken from [9], and the evaluation results are analyzed.
1) RANDOMNESS
A PUF generates an unpredictable response. A response with good randomness is difficult to predict from the input challenge, which gives better security. Ideally, the numbers of 0-bits and 1-bits in the response signal are the same. In other words, a ratio of 0s and 1s close to 50% each in the PUF response demonstrates good randomness. The randomness can be represented by equation (12). Here, RD is the quantified value of randomness, n is the number of bits in the response, l denotes the index of a bit in the response, and ri,l is the bit value at the l-th position.
\[ RD = \frac{1}{n}\sum_{l=1}^{n} r_{i,l} \times 100\% \tag{12} \]
By sampling the generated 128-bit PUF response, the distribution of 0-bits and 1-bits is recorded. The result shows that RD is 48.64%. The difference from the ideal value is only 2.36%. The performance in randomness is encouraging.
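Equation (12) amounts to the percentage of 1-bits in a response. A minimal sketch follows; the function names are hypothetical and the bit patterns in the usage note are made up for illustration, not taken from the experiment.

```python
def randomness(response_bits):
    """RD from equation (12): percentage of 1-bits in an n-bit response."""
    n = len(response_bits)
    return sum(response_bits) / n * 100.0

def deviation_from_ideal(response_bits):
    """Absolute distance of RD from the ideal value of 50%."""
    return abs(50.0 - randomness(response_bits))
```

For instance, a 128-bit response containing 62 ones gives `randomness(...) == 48.4375`, which is 1.5625 percentage points away from the ideal 50%.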
2) STABILITY
Stability means that the response of a PUF is reliable. It can be quantified by the intra variance, which represents the number of bits that change in the response when the same challenges are input to a PUF in different environments. In theory, the response does not change. In practice, it is affected by external factors. If the differences among multiple responses to the same challenges fall within a preset threshold, they are acceptable. The intra variance is measured by HDintra. A value of HDintra close to 0 indicates good stability of the PUF. Let m and n be the number of responses and the number of bits in a response, respectively. Ri is the reference response and R′i,k is the response in the k-th round. On this basis, the stability of the PUF can be calculated by equation (13).
\[ S = 1 - HD_{\mathrm{intra}} = 1 - \frac{1}{m}\sum_{k=1}^{m}\frac{HD(R_i, R'_{i,k})}{n} \times 100\% \tag{13} \]
In this experiment, a random challenge is generated to evaluate the stability of the response. The FPGA device is partitioned into 15 regions, each of which can implement an independent PUF circuit. A challenge is input to the PUF in each region five times, and the Hamming distances between the responses are recorded. Each pair of the five responses yields one Hamming distance, giving 10 results per region. Across the 15 regions, repeating the challenge gives 15 groups of results.
In Fig. 11, the x-axis is the percentage of Hamming distance and the y-axis is the density of Hamming distance in a region. The statistics of the 150 Hamming distances show that about 71.3% of the results fall within 0 ∼ 1%. That is, the majority of PUF responses have an instability of less than 1%.
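The stability measure of equation (13) and the pair counting used in this experiment can be sketched as follows. The helper names are hypothetical, and responses are represented as lists of bits.

```python
from itertools import combinations

def hamming(a, b):
    """Number of bit positions in which two equal-length responses differ."""
    return sum(x != y for x, y in zip(a, b))

def stability(reference, repeats):
    """S from equation (13), in percent: 100% minus the average
    normalised Hamming distance between the reference and m repeats."""
    n = len(reference)
    m = len(repeats)
    hd_intra = sum(hamming(reference, r) for r in repeats) / (m * n) * 100.0
    return 100.0 - hd_intra

def pairwise_intra_distances(responses):
    """Hamming distance (in percent) for every pair of repeated responses.
    Five repetitions give C(5, 2) = 10 values per region."""
    n = len(responses[0])
    return [hamming(a, b) / n * 100.0 for a, b in combinations(responses, 2)]
```

With five repetitions per region and 15 regions, `pairwise_intra_distances` yields 10 values per region and 150 values in total, matching the count used for Fig. 11.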
FIGURE 11. Average intra hamming distance.
FIGURE 12. Stability of PUF response under the same temperature.
FIGURE 13. Stability of PUF response under different temperatures.
In Fig. 12, the x-axis is the region and the y-axis is the stability percentage of the PUF. In theory, repeating a challenge several times makes the PUF in a region generate a constant response. By equation (13), if the response is constant, HDintra is equal to 0 and the stability is 100%. The evaluation result for each point on the x-axis can be regarded as an average over multiple PUFs. Because of the signal voting module, the PUF selects the output that appears most often as the final response. The average stability value is 99.54%.
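The effect of the signal voting module can be illustrated with a bitwise majority vote over repeated readings. Voting per bit position and requiring an odd number of readings are assumptions made for this sketch; the text only states that the most frequent output is selected.

```python
def majority_vote(readings):
    """Return, for each bit position, the value seen in most readings.
    An odd number of readings is required so that ties cannot occur."""
    if len(readings) % 2 == 0:
        raise ValueError("use an odd number of readings to avoid ties")
    n = len(readings[0])
    return [1 if 2 * sum(r[i] for r in readings) > len(readings) else 0
            for i in range(n)]
```

For example, `majority_vote([[1, 0, 1], [1, 1, 0], [0, 0, 1]])` returns `[1, 0, 1]`: a single flipped bit in any one reading does not change the final response.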
In addition, we evaluate the impact of environmental factors such as temperature on PUF stability. A hairdryer is used to simulate environmental temperatures from 25◦C to 70◦C. The stability of the PUF in a region is evaluated at different temperatures and the result is shown in Fig. 13. When the temperature changes from 25◦C to 70◦C, the instability of the PUF stays within 1%, demonstrating good stability.
FIGURE 14. Density distribution of average hamming distances for pairs of PUF response in different regions.
3) UNIQUENESS
Uniqueness means that the response of a PUF is unclonable. It can be quantified by the inter-variance. A PUF with the same structure is deployed in different chips. The inter-variance is measured by the number of differing bits in the response signals when the same challenge is input to the deployed PUFs. The differences in the physical structure of ICs are randomly distributed, so the structures of different chips should be unclonable. Uniqueness can be evaluated by equation (14).
\[ U = HD_{\mathrm{inter}} = \frac{2}{t(t-1)}\sum_{i=1}^{t-1}\sum_{j=i+1}^{t}\frac{HD(R_i, R_j)}{n} \times 100\% \tag{14} \]
Here, HDinter is the average inter Hamming distance. HD(Ri,Rj) is the inter Hamming distance of two PUFs. Ri and Rj are the responses of the i-th and j-th PUF. t denotes the number of PUFs in the experiment. The quantified value of uniqueness is the average of the Hamming distances over all response pairs of the t PUFs.
In this experiment, the FPGA is again partitioned into 15 regions, each implementing a PUF, to simulate PUF implementation on different chips. With the same challenges, the numbers of 0-bits and 1-bits in the PUF responses are recorded. The Hamming distances between pairs of responses are also calculated, producing 105 results. The density distribution of these Hamming distances is shown in Fig. 14. In this figure, the x-axis is the percentage of Hamming distance and the y-axis is the distribution density of Hamming distance in a region. A tall histogram bar indicates a high distribution density of Hamming distances in that region.
TABLE 3. PUF performance comparison.
The same PUF implemented in different chips generates different responses to the same challenges. In the ideal case, the difference is 50%. However, the measured value may differ from the ideal value. In this experiment, the maximum and minimum values of the inter Hamming distance are 62.5% and 21.9%, that is, 80 and 28 differing bits, respectively. The average of the 105 Hamming distances is 59.95 bits, or about 46.84%. Different FPGAs differ more than different regions within one FPGA, so the PUF is expected to perform better across different FPGAs.
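The uniqueness measure of equation (14) and the bit-to-percentage conversions reported above can be checked with a short sketch. The helper names are hypothetical; the 128-bit response length comes from the experiment description.

```python
from itertools import combinations

def hamming(a, b):
    """Number of bit positions in which two equal-length responses differ."""
    return sum(x != y for x, y in zip(a, b))

def uniqueness(responses):
    """U from equation (14), in percent: average normalised Hamming
    distance over all t*(t-1)/2 pairs of t PUF responses."""
    t = len(responses)
    n = len(responses[0])
    total = sum(hamming(a, b) for a, b in combinations(responses, 2))
    return 2.0 / (t * (t - 1)) * total / n * 100.0

def bits_to_percent(differing_bits, n=128):
    """Convert a Hamming distance in bits to a percentage of n bits."""
    return differing_bits / n * 100.0
```

With t = 15 PUFs there are 15 × 14 / 2 = 105 pairs, and for 128-bit responses 80, 28 and 59.95 differing bits correspond to 62.5%, about 21.9% and about 46.84%, consistent with the values quoted in the text.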
Finally, Table 3 compares the three PUFs. The first column gives the ideal values of the metrics. The self-adjustable APUF improves PUF uniqueness and randomness through its self-adjustable module. It has better randomness than the proposed PUF, but at the cost of more hardware resources. Compared with the self-adjustable PUF and the 2-1 DAPUF, the proposed PUF improves stability by 7.17% and 12.86%, respectively. The uniqueness of the proposed PUF is slightly better than that of the 2-1 DAPUF.
VII. CONCLUSION
Hardware authentication issues are critical in edge computing and IoT environments. To address these issues, a PUF-based anonymous IP authentication technique is proposed for both hardware FPGAs and software IP designs. When an infringement occurs, the double PUF protocol can be used for authentication. In hardware authentication, the challenge information is generated jointly by both authentication parties, which resists replay and modeling attacks. In the double PUF authentication protocol, the FPGA vendor does not need to store numerous PUF CRPs, which saves considerable storage. The IP copyright information and the anonymous identity of the IP buyer are inserted into the IP design before trading. This provides passive IP protection and infringement tracing. The anonymity protects the interests of the IP buyer while still allowing IP infringement to be tracked with the participation of the trustworthy device vendor.
REFERENCES
[1] W. Z. Khan, M. Y. Aalsalem, and M. K. Khan, "Communal acts of IoT consumers: A potential threat to security and privacy," IEEE Trans. Consum. Electron., vol. 65, no. 1, pp. 64–72, Feb. 2019.
[2] L. Zhang and C.-H. Chang, "A pragmatic per-device licensing scheme for hardware IP cores on SRAM-based FPGAs," IEEE Trans. Inf. Forensics Security, vol. 9, no. 11, pp. 1893–1905, Nov. 2014.
[3] Q. Xiang, P. Zhang, and D. Ouyang, "Multiple frequency slots based physical unclonable functions," J. Electron. Inf. Technol., vol. 34, no. 8, pp. 2007–2012, Aug. 2012.
[4] W. Liang, B. Liao, J. Long, Y. Jiang, and L. Peng, "Study on PUF based secure protection for IC design," Microprocessors Microsyst., vol. 45, pp. 56–66, Aug. 2016.
[5] M. T. Rahman, F. Rahman, D. Forte, and M. Tehranipoor, "An aging-resistant RO-PUF for reliable key generation," IEEE Trans. Emerg. Topics Comput., vol. 4, no. 3, pp. 335–348, Jul./Sep. 2016.
[6] Z. Huang and Q. Wang, "A PUF-based unified identity verification framework for secure IoT hardware via device authentication," World Wide Web, Apr. 2019, pp. 1–32.
[7] A. Sengupta, D. Roy, and S. P. Mohanty, "Triple-phase watermarking for reusable IP core protection during architecture synthesis," IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 37, no. 4, pp. 742–755, Apr. 2018.
[8] Q. Guo, J. Ye, Y. Gong, Y. Hu, and X. Li, "PUF based pay-per-device scheme for IP protection of CNN model," in Proc. IEEE 27th Asian Test Symp. (ATS), Oct. 2018, pp. 115–120.
[9] J. Zhang, Y. Lin, and Y. Lyu, "A PUF-FSM binding scheme for FPGA IP protection and pay-per-device licensing," IEEE Trans. Inf. Forensics Security, vol. 10, no. 6, pp. 1137–1150, Jun. 2015.
[10] Q. Ma, C. Gu, N. Hanley, C. Wang, W. Liu, and M. O'Neill, "A machine learning attack resistant multi-PUF design on FPGA," in Proc. 23rd Asia South Pacific Des. Automat. Conf. (ASP-DAC), Jan. 2018, pp. 97–104.
[11] Z. Siddiqui, O. Tayan, and M. K. Khan, "Security analysis of smartphone and cloud computing authentication frameworks and protocols," IEEE Access, vol. 6, pp. 34527–34542, 2018.
[12] D. He, N. Kumar, M. K. Khan, L. Wang, and J. Shen, "Efficient privacy-aware authentication scheme for mobile cloud computing services," IEEE Syst. J., vol. 12, no. 2, pp. 1621–1631, Jun. 2018.
[13] G. Li, P. Wang, and Y. Zhang, "A highly reliable lightweight PUF circuit with temperature and voltage compensated for secure chip identification," in Proc. IEEE 12th Int. Conf. ASIC (ASICON), Oct. 2018, pp. 60–63.
[14] R. Pappu, B. Recht, J. Taylor, and N. Gershenfeld, "Physical one-way functions," Science, vol. 297, no. 5589, pp. 2026–2030, Sep. 2002.
[15] D. Li, W. Liu, X. Zou, and Z. Liu, "Hardware IP protection through gate-level obfuscation," in Proc. 14th Int. Conf. Comput.-Aided Des. Comput. Graph. (CAD/Graphics), Aug. 2015, pp. 186–193.
[16] S. S. Kumar, J. Guajardo, R. Maes, G.-J. Schrijen, and P. Tuyls, "Extended abstract: The butterfly PUF protecting IP on every FPGA," in Proc. IEEE Int. Workshop Hardware-Oriented Secur. Trust, Jun. 2008, pp. 67–70.
[17] J. Guajardo, S. S. Kumar, G.-J. Schrijen, and P. Tuyls, "Physical unclonable functions and public-key crypto for FPGA IP protection," in Proc. Int. Conf. Field Program. Logic Appl., Aug. 2007, pp. 189–195.
[18] M. A. Gora, A. Maiti, and P. Schaumont, "A flexible design flow for software IP binding in commodity FPGA," in Proc. IEEE Int. Symp. Ind. Embedded Syst., Jul. 2009, pp. 211–218.
[19] E. Simpson and P. Schaumont, "Offline hardware/software authentication for reconfigurable platforms," in Proc. Int. Workshop Cryptograph. Hardw. Embedded Syst., Oct. 2006, pp. 311–323.
[20] J. Guajardo, S. S. Kumar, G.-J. Schrijen, and P. Tuyls, "FPGA intrinsic PUFs and their use for IP protection," in Proc. Int. Workshop Cryptograph. Hardw. Embedded Syst., 2007, pp. 63–80.
[21] J. Zhang, Y. Lin, Y. Lyu, R. C. C. Cheung, W. Che, Q. Zhou, and J. Bian, "Binding hardware IPs to specific FPGA device via inter-twining the PUF response with the FSM of sequential circuits," in Proc. IEEE 21st Annu. Int. Symp. Field-Program. Custom Comput. Mach., Apr. 2013, p. 227.
[22] J. Zhang, Y. Lin, and G. Qu, "Reconfigurable binding against FPGA replay attacks," ACM Trans. Design Autom. Electron. Syst., vol. 20, no. 2, Feb. 2015, Art. no. 33.
[23] J. Zhang, Q. Wu, Y. Lyu, Q. Zhou, Y. Cai, Y. Lin, and G. Qu, "Design and implementation of a delay-based PUF for FPGA IP protection," in Proc. Int. Conf. Comput.-Aided Des. Comput. Graph., Nov. 2013, pp. 107–114.
[24] G. Zhang, Q. Liu, and Q. Zhang, "Low cost and high performance RO-PUF design for IP protection of FPGA implementations," Xi'an Dianzi Keji Daxue Xuebao/J. Xidian Univ., vol. 43, no. 6, pp. 97–102, Dec. 2016.
[25] T. Machida, D. Yamamoto, M. Iwamoto, and K. Sakiyama, "Implementation of double arbiter PUF and its performance evaluation on FPGA," in Proc. 20th Asia South Pacific Des. Automat. Conf., Jan. 2015, pp. 6–7.
[26] Z. Liu, B. Liu, and Z. Lu, "FPGA design of low resource consumed arbiter PUF," J. Huazhong Univ. Sci. Technol., vol. 2, pp. 5–8, Feb. 2016.
[27] J. Delvaux, R. Peeters, D. Gu, and I. Verbauwhede, "A survey on lightweight entity authentication with strong PUFs," ACM Comput. Surv., vol. 48, no. 2, Nov. 2015, Art. no. 26.
[28] M. Majzoobi, M. Rostami, F. Koushanfar, D. S. Wallach, and S. Devadas, "Slender PUF protocol: A lightweight, robust, and secure authentication by substring matching," in Proc. IEEE Symp. Secur. Privacy Workshops, May 2012, pp. 33–44.
[29] R. Maes, D. Schellekens, and I. Verbauwhede, "A pay-per-use licensing scheme for hardware IP cores in recent SRAM-based FPGAs," IEEE Trans. Inf. Forensics Security, vol. 7, no. 1, pp. 98–108, Feb. 2012.
[30] Y. Gong, J. Ye, and Y. Hu, "Built-in self adjustable arbiter PUF," J. Comput.-Aided Des. Comput. Graph., vol. 29, no. 9, pp. 1734–1739, 2017.
JING LONG received the M.S. degree from the College of Computer Science and Engineering, Hunan University of Science and Technology, China, in 2012, and the Ph.D. degree from the College of Computer Science and Electronic Engineering, Hunan University, China, in 2018. She is currently a Lecturer with the College of Information Science and Engineering, Hunan Normal University. Her current research interests include hardware security, the Internet of Things, and network security.
WEI LIANG received the Ph.D. degree in computing science from Hunan University, China, in 2013. He is currently a Full Professor with the School of Opto-Electronic and Communication Engineering, Xiamen University of Technology, China. His current research interests include steganography, real-time embedded systems, field programmable gate arrays, and vehicle networks.
KUAN-CHING LI (SM'07) is currently a Distinguished Professor with the Department of Computer Science and Information Engineering, Providence University, Taiwan. Besides publishing numerous research papers and articles, he is coauthor/co-editor of several technical professional books published by CRC Press/Taylor & Francis, Springer, McGraw-Hill, and IGI Global. His research interests include parallel and distributed processing, GPU/many-core computing, and big data and cloud. He is a member of the AAAS, a Life Member of the TACC, and a Fellow of the IET. He received distinguished and chair professorships from universities in China and other countries, and a recipient of awards and funding support from several agencies and high-tech companies. He has been actively involved in several major conferences and workshops in program/general/steering conference chairman positions and has organized numerous conferences on high-performance computing and computational science and engineering.
DAFANG ZHANG was born in Shanghai. He received the Ph.D. degree in applied mathematics from Hunan University, China, in 1997, where he is currently a Professor, a Doctor, and a Ph.D. Supervisor with the College of Computer Science and Electronic Engineering. His research interests include dependable systems/networks, network security, network measurement, hardware security, and IP protection.
MINGDONG TANG received the B.S. degree in electrical engineering from Tianjin University, Tianjin, China, in 2000, the M.S. degree in control engineering from Shanghai University, Shanghai, China, in 2003, and the Ph.D. degree in computer science from the Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China, in 2010. He is currently a Professor with the School of Information Science and Technology, Guangdong University of Foreign Studies, Guangzhou, China, and the Guangdong Provincial Key Laboratory of Computational Intelligence and Cyberspace Information, South China University of Technology, Guangzhou. He has published more than 100 peer-reviewed scientific research papers in various journals and conferences. His research interests include information security, service-oriented computing, data mining, and blockchain. He is a member of China Computer Federation and ACM.
HAIBO LUO received the B.E. degree in communication engineering from the Wuhan University of Technology, China, in 2006, the M.E. degree in information and communication engineering from Hunan University, China, in 2009, and the Ph.D. degree with the College of Physics and Information Engineering, Fuzhou University, Fuzhou, China. His research interests include the Internet of Things, the cognitive Internet of Things, edge computing, and mobile computing.
- INTRODUCTION
- RELATED WORK
- PUF-BASED AUTHENTICATION MODEL
- PUF CIRCUIT MODEL
- CHALLENGE GENERATION MODULE
- PUF-BASED CHARACTERISTIC EXTRACTION MODULE
- SIGNAL VOTING MODULE
- DOUBLE PUF AUTHENTICATION MODEL
- ANONYMOUS IP AUTHENTICATION SCHEME
- PARTICIPATORS IN IP TRADING
- PROTOCOL IMPLEMENTATION
- REGISTRATION
- IP TRADING PROTOCOL
- COPYRIGHT AUTHENTICATION AND INFRINGEMENT TRACING
- SECURITY ANALYSIS
- COUNTERFEIT ATTACK
- MODELLING ATTACK
- REPLAY ATTACK
- ANONYMITY
- EXPERIMENT ANALYSIS
- RESOURCE OVERHEAD
- PERFORMANCE EVALUATION
- RANDOMNESS
- STABILITY
- UNIQUENESS
- CONCLUSION
- REFERENCES
- Biographies
- JING LONG
- WEI LIANG
- KUAN-CHING LI
- DAFANG ZHANG
- MINGDONG TANG
- HAIBO LUO
sources/160/send.pdf
A Dissertation
entitled
PUF based FPGAs for Hardware Security and Trust
by
Muslim Mustapa
Submitted to the Graduate Faculty as partial fulfillment of the requirements for the
Doctor of Philosophy Degree in Electrical Engineering
Dr. Mohammed Niamat, Committee Chair
Dr. Mansoor Alam, Committee Member
Dr. Jackson Carvalho, Committee Member
Dr. Junghwan Kim, Committee Member
Dr. Weiqing Sun, Committee Member
Dr. Patricia R. Komuniecki, Dean
College of Graduate Studies
The University of Toledo
August 2015
Copyright 2015, Muslim Mustapa
This document is copyrighted material. Under copyright law, no parts of this document
may be reproduced without the expressed permission of the author
An Abstract of
PUF based FPGAs for Hardware Security and Trust
by
Muslim Mustapa
Submitted to the Graduate Faculty as partial fulfillment of the requirements for the
Doctor of Philosophy Degree in Electrical Engineering
The University of Toledo
August 2015
Hardware security threats have become a major issue in the technology sector and
cyberspace. In 2011, more than 1300 counterfeit incidents were reported from around the
world to the Electronic Resellers Association International (ERAI). The incidents
reported in 2011 were more than double compared to the incidents reported in 2010 and
2008. The federal contract report states that counterfeiting of electronic parts has
threatened the operability and reliability of the US weapons system. Electronic parts
counterfeiting has become a very big business perpetrated by corrupt operators.
Just like ASIC semiconductors, reconfigurable hardware is also prone to hardware
security threats. The most commonly used reconfigurable hardware is the Field
Programmable Gate Array (FPGA). Demand for FPGAs has increased as can be seen by
the growth in FPGA companies such as Xilinx and Altera. Despite the increased demand
and use of FPGAs in the market, there is a great concern that security is not currently a
part of the FPGA hardware and software to the fullest extent. Design theft and hardware
tampering threats on FPGAs can be dealt with using a Ring Oscillator Physical Unclonable
Function (ROPUF). A ROPUF takes advantage of the process variation on a silicon chip
to generate a unique ID for the purpose of authentication. A ROPUF can be implemented
on an FPGA chip to produce a unique ID for each FPGA chip. An adversary that tries to
tamper with the ROPUF inadvertently changes the properties of the process variation in
the silicon chip; thus any tampering attempt can be detected.
In this research, ROPUF based hardware security for FPGAs is presented. A total
of 50 Xilinx FPGAs are used in our investigation. Performance in terms of uniqueness
and reliability is evaluated. The effects of temperature variation, voltage variation, and
aging on these parameters are also studied. Our work shows that using a lower number of
stages in the Ring Oscillator (RO) offers better security. ROs with fewer stages yield a
higher number of Challenge and Response Pairs (CRPs), and a higher number of CRPs
contributes to higher security. In addition, we have introduced a
technique called Random Patch Mixer (RPM) to minimize the effect of systematic
variations on the frequencies generated by ROPUFs on FPGAs. The results obtained using
the RPM technique are shown to be better than those of previously proposed techniques.
The responses generated from the ROPUF after applying the RPM technique passed
most of the NIST statistical tests for randomness. Finally, we show how the ROPUF can
be used for the security of a Smart Grid. The security of ROPUF system is also tested
using support vector machine (SVM). The SVM is trained using a large data set of
challenges to predict the response sets. Results obtained show that the SVM fails to
predict ROPUF responses based on the challenges, thus enhancing the security offered by
the proposed authentication system.
Acknowledgements
My wonderful Ph.D. journey will never be a success without great people around
me. Firstly, I would like to thank my advisor, Dr. Mohammed Niamat for his guidance,
advice and support throughout my time here at University of Toledo. Secondly, my
thanks go to my Ph.D. committee members who have spent their time to review my
research work and share their thoughts with me. I cannot forget to thank all my
colleagues for the fun time that we have had together. I also would like to use this
opportunity to thank my sponsor, the Malaysian Government, which has given me the
opportunity to further my Ph.D. studies. Finally, and the most important one for me, I
would like to thank my wife, children, parents, and my family for their continuous
support, understanding, and sacrifices throughout this Ph.D. journey.
Table of Contents
Abstract
Acknowledgements
Table of Contents
List of Tables
List of Figures
List of Abbreviations
1 Introduction
1.1 Motivation
1.2 Research Objectives
2 Research Background
2.1 Process Variations
2.2 Physical Unclonable Functions (PUFs)
2.2.1 Arbiter Physical Unclonable Function (APUF)
2.2.2 Butterfly Physical Unclonable Function (BPUF)
2.2.3 Ring Oscillator Physical Unclonable Function (ROPUF)
2.2.4 PUF Implementation on FPGA
2.3 Challenge and Response on ROPUF
2.4 RO Delay
3 Frequency Uniqueness in ROPUF on FPGA
3.1 Introduction
3.2 Experimental Setup
3.3 Results and Analysis
3.3.1 Impact of Number of Stages on PUFs
3.3.2 Explanation of High Frequency Variation in a 3-Stage RO
3.4 Summary
4 Relationship between Number of Stages in ROPUF and CRP Generation
4.1 Introduction
4.2 Background
4.2.1 Number of Stages of the RO
4.2.2 RO Parameters
4.3 Experimental Setup
4.4 Results and Analysis
4.5 Summary
5 A Novel RPM Technique for Minimize Systematic Variations
5.1 Introduction
5.2 Systematic Variations and Bit Flip Minimization
5.2.1 The Effect of Systematic Variations
5.2.2 Regression Based Distiller
5.2.3 RPM Technique
5.3 Results and Discussion
5.3.1 Systematic Variations Effect on Frequency Distribution
5.3.2 Systematic Variations Minimization by RPM Technique
5.3.3 NIST Statistical Test for Randomness
5.4 Summary
6 Temperature, Voltage, and Aging Effects on ROPUFs Function
6.1 Introduction
6.2 Background
6.2.1 Ring Oscillator PUF Response
6.2.2 Number of Stages in Ring Oscillator
6.3 Experimental Setup
6.4 Results and Analysis
6.5 Summary
7 A Comparative Study of Ring Oscillator PUFs on Different FPGA Families
7.1 Introduction
7.2 Related Work
7.3 Background
7.3.1 Ring Oscillator PUF Response
7.3.2 Number of Stages in Ring Oscillator
7.3.3 ROPUF Parameters
7.4 Experimental Setup
7.5 Results and Analysis
7.5.1 ROPUF Uniqueness
7.5.2 ROPUF Uniformity
7.5.3 ROPUF Bit Aliasing
7.5.4 ROPUF Reliability
7.5.2 ROPUF Diverseness
7.5 Summary
8 ROPUF Application: Hardware-Oriented Security-Based Authentication for Advanced Metering Infrastructure
8.1 Introduction
8.2 Related Work
8.3 Hardware-Oriented Security-Based Authentication for AMI
8.3.1 ROPUF Design
8.3.2 Authentication
8.4 Proof of Concept
8.4 Summary
9 Conclusions
9.1 Summary and Conclusions
9.2 Contributions and Results
9.3 Future Works
References
List of Tables
2.1 Comparison of ROPUF, APUF, and BPUF
3.1 Standard deviation for all RO stages
3.2 Number of slice and CLB used on FPGA for single RO
4.1 Uniqueness, bit-aliasing, diverseness, and uniformity for 3, 5, and 7-stage ROs
4.2 SD for Chip 1, Chip 2, and Chip 3
4.3 Reliability for Chip 3
4.4 Number of comparison pairs generated on Chip 1, Chip 2, and Chip 3
5.1 Hamming Distance (HD) for FPGA chip 1 before RPM is applied
5.2 Hamming Distance (HD) for FPGA chip 1 after RPM is applied
5.3 NIST statistical test for randomness results
6.1 Bit flip occurrences on Spartan 3E
6.2 Bit flip occurrences due to aging on Spartan 3E
6.3 Percentage of bit flip occurrences
6.4 Number of comparison pairs according to threshold frequency
7.1 ROPUF’s parameters comparison
7.2 ROPUF’s reliability due to changes in temperature, voltage, and aging
8.1 Comparison of different schemes based on Smart Grid requirements
8.2 Number of possible CRPs
8.3 Authentication time for each level
8.4 Data storage size for each authentication level
8.5 Data storage size needed based on number of devices on the AMI
List of Figures
2-1 Arbiter Physical Unclonable Function (APUF)
2-2 Butterfly Physical Unclonable Function (BPUF)
2-3 Ring Oscillator Physical Unclonable Function (ROPUF)
2-4 ROPUF circuit
3-1 ROs mapped used three slices
3-2 Test circuit diagram
3-3 Frequency pattern for a 3-stage ring oscillator
3-4 Frequencies (MHz) for 3, 5, 7, 9, and 11 stage ring oscillators
4-1 5-stage RO
4-2 3-stage RO
4-3 7-stage RO
4-4 Xilinx Spartan 2 CLBs layout
4-5 FST circuit diagram
4-6 Difference between comparison pairs and CRPs
5-1 Frequencies across columns 1-3 of the CLBs
5-2 Location of ROs on Spartan 2 FPGA
5-3 3-stage RO frequencies across Spartan 3E
5-4 3-stage RO frequencies across Spartan 3E FPGA after RPM technique has been applied
6-1 ROs numbering system based on spatial location
6-2 The relationship between the RO frequency distance and the probability of a PUF output flip
6-3 Frequency changes in ROs due to the aging effect on Spartan 3E
6-4 Frequency changes with respect to the temperature variations on Spartan 3E
6-5 Frequency changes with respect to the voltage variations on Spartan 3E
6-6 Temperature chamber
7-1 RO frequencies versus location on Spartan 3E
7-2 RO frequencies versus location on Artix-7
7-3 Spartan 3E
7-4 Artix-7
7-5 RO frequency changes with respect to aging
8-1 AMI in Smart Grid
8-2 ROPUF connected to a smart meter
8-3 Smart meter to utility company authentication
8-4 Smart meter to utility company fail authentication
8-5 Data concentrator to utility company authentication
8-6 Utility company to smart meter authentication
8-7 Utility company to smart meter fail authentication
8-8 ROPUF logic blocks
8-9 Parity Bits PBi generator
8-10 Parity bits from 128 response bits form 64 parity bits
8-11 ROPUFs registration with utility company
8-12 Parity bits and corresponding ROs
8-13 SVM prediction accuracy for ROPUF
8-14 Bit flip probability vs. frequency difference (MHz)
List of Abbreviations
AMI: Advanced Metering Infrastructure
APUF: Arbiter Physical Unclonable Function
BPUF: Butterfly Physical Unclonable Function
CLB: Configurable Logic Blocks
CPBP: Challenges and Parity Bit Pairs
FDT: Frequency Difference Thresholds
FPGA: Field Programmable Gate Array
FST: Full Scan Technique
HD: Hamming Distance
HW: Hamming Weight
MUX: Multiplexer
PUF: Physical Unclonable Function
RO: Ring Oscillator
ROPUF: Ring Oscillator Physical Unclonable Function
RPM: Random Patch Mixer
SD: Standard Deviation
SVM: Support Vector Machine
Chapter 1
Introduction
As a society, we place a lot of trust in the hardware we use on a daily basis. For
example, communication regularly occurs on sophisticated digital phones or computers.
These same devices are also capable of monitoring our bank accounts and buying and
selling goods electronically. For these reasons, it is vital that the security of hardware
devices continues to improve in order to ensure the secure transfer of information across
untrusted networks. By using hardware based authentication, a digital system can verify
that a user is in fact who he or she claims to be through the use of unique secret keys. This
secret key can be stored in the memory, or generated specifically when it needs to be used.
The first option is not used because memory is vulnerable to inexpensive attacks [5][6].
The second option is more appealing because it is both simple to implement and difficult
to attack.
Physically Unclonable Functions (PUFs) are one way of generating secret keys on
the spot, without relying on memory. PUFs exploit process variations, which are
unintentionally introduced during the manufacturing process of integrated circuits. The
process variation in turn causes small amounts of additional delays within the circuit. By
using this additional delay effectively, secure bits can be generated. Silicon PUFs (SPUFs)
are PUFs that are specifically designed to take advantage of the silicon manufacturing
process. The SPUFs are designed to exploit the process variation and circuit delays to
create unique challenge-response patterns [6]. There are two kinds of SPUF: Arbiter PUFs
and Ring Oscillator PUFs (ROPUFs). An Arbiter PUF is constructed from multiplexers
and an arbiter. ROPUFs are constructed from delay loops (ring oscillators) and counters.
Arbiter PUF circuits need to be symmetric in order to ensure that the routing
lengths are the same [6]. ROPUFs on the other hand do not need to be symmetric. For this
reason, ROPUFs are the preferred solution when working with FPGAs [7]. There are
many techniques used by researchers in order to improve the reliability and uniqueness of
ROPUFs [6][8][9][10]. Reliability means that the secret key generated by the ROPUF will
be the same despite any change in operating conditions [8]. Uniqueness, on the other hand,
refers to how each and every FPGA is able to generate a unique secret key [8].
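The two metrics just defined can be illustrated with a minimal Python sketch. It assumes the Hamming-distance-based formulations that are common in the PUF literature (average inter-chip distance for uniqueness, 100% minus the average intra-chip distance for reliability); the exact formulas used later in this dissertation are not given at this point, and the response bits below are made up for illustration only.

```python
from itertools import combinations


def hamming_distance(a, b):
    """Number of positions where two equal-length bit lists differ."""
    if len(a) != len(b):
        raise ValueError("responses must have the same length")
    return sum(x != y for x, y in zip(a, b))


def uniqueness_percent(chip_responses):
    """Average pairwise inter-chip Hamming distance, as a percentage.

    The ideal value is 50%: two chips disagree on half of their bits.
    """
    pairs = list(combinations(chip_responses, 2))
    n_bits = len(chip_responses[0])
    total = sum(hamming_distance(a, b) / n_bits for a, b in pairs)
    return 100.0 * total / len(pairs)


def reliability_percent(reference, remeasured):
    """100% minus the average intra-chip Hamming distance, as a percentage.

    The ideal value is 100%: the same chip always returns the same bits.
    """
    n_bits = len(reference)
    total = sum(hamming_distance(reference, r) / n_bits for r in remeasured)
    return 100.0 - 100.0 * total / len(remeasured)


# Hypothetical 8-bit responses from three chips (illustrative values only).
chips = [
    [1, 0, 1, 1, 0, 0, 1, 0],
    [0, 0, 1, 0, 1, 0, 1, 1],
    [1, 1, 0, 1, 0, 1, 1, 0],
]
# Chip 0 measured again under two other (hypothetical) operating conditions.
chip0_again = [
    [1, 0, 1, 1, 0, 0, 1, 0],
    [1, 0, 1, 1, 0, 1, 1, 0],
]

print(f"uniqueness:  {uniqueness_percent(chips):.2f}%")
print(f"reliability: {reliability_percent(chips[0], chip0_again):.2f}%")
```

With these toy values, one flipped bit out of eight in one of the two re-measurements lowers the reliability below 100%, while the uniqueness reflects how far apart the three chips' responses are on average.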
1.1 Motivation
A ROPUF takes advantage of the process variation on the silicon chip to generate
a unique ID for authentication. A ROPUF can be implemented on any VLSI chip,
including an FPGA, to produce a unique ID for each chip. An adversary who tries to
tamper with a ROPUF will change the properties of the process variation in the silicon chip;
thus any tampering effort will fail [3]. A ROPUF cannot be modeled because the process
variation on a silicon chip is random. To date, no technology can
measure the process variation with high accuracy [3].
1.2 Research Objectives
ROPUF research areas can be divided into four main categories: fabrication
variation extraction [11], secret selection [6][9][10][11][22][25][26], error correction, and
tests for security and reliability. Fabrication variation extraction is the study of the
physical behavior of the silicon chip. This is the most fundamental research area in
ROPUFs because it interacts directly with the process variation. The uniqueness and reliability
parameters of the ROPUF are studied thoroughly in this work to take full advantage of
the process variation [11][12]. Secret selection is the study of the algorithm that selects the
comparison pairs, which are known as challenge and response pairs. The randomness
parameter of the ROPUF is also studied in this research. Error correction is studied by
using an algorithm that corrects any flipped bits. This is especially important when a ROPUF
is used as a cryptographic technique, where zero bit-flip occurrences are
expected [6]. Finally, research on tests for security and reliability looks into the diffuseness,
bit-aliasing, and probability of misidentification parameters of the ROPUF.
This research focuses on process variation extraction, secret selection of
challenge-response pairs, and tests for reliability and security. For the process variation
extraction, the relationship between the number of stages used in an RO and the
ROPUF’s reliability and uniqueness is studied. The objective of this study is to improve
the ROPUF’s uniqueness and reliability parameters by manipulating the structure of ROs.
For the challenge-response secret selection, the systematic variation effect on the ROPUF
is studied. The objective in this study is to develop an algorithm that can minimize the
systematic variation effect on the ROPUF. For the tests for security and reliability, we have
conducted a study on the weaknesses of the ROPUFs. The objective in this study is to
develop an algorithm which will enhance the security and reliability of the ROPUF.
This dissertation is organized as follows:
Chapter One: This chapter briefly introduces the motivation and objective of this
research.
Chapter Two: This chapter gives background information about PUF, RO and ROPUF.
Chapter Three: This chapter discusses the relationship between the number of stages used
in an RO and the uniqueness of the frequency of the RO.
Chapter Four: This chapter discusses the relationship between the number of stages used
in a ROPUF and the number of challenge-response pairs (CRPs) on an FPGA.
Chapter Five: This chapter discusses the RPM technique developed to minimize the
effect of systematic variations on a ROPUF.
Chapter Six: This chapter discusses the temperature, voltage, and aging effects in
ROPUF.
Chapter Seven: This chapter discusses a comparative study of ring oscillator PUFs on
different FPGA families.
Chapter Eight: This chapter discusses the hardware-oriented security-based
authentication for advanced metering infrastructure.
Chapter Nine: This chapter forms the conclusion of this dissertation and also discusses
future work.
Chapter 2
Research Background
2.1 Process Variations
The reduced feature sizes in silicon chip devices make it hard to attain uniformity
in manufacturing. This results in variation in the transistor gate length and oxide
thickness that introduces propagation delays in the silicon chip [13]. This variability in
the manufacturing is known as process variation. Process variations are random and
cannot be controlled. Process variations can be divided into two types, namely intra-die
variations and inter-die variations. Intra-die variations are the variations within a single
die, and inter-die variations are variations from chip to chip.
Intra-die variations can be categorized into two types: systematic (process shift)
and stochastic (process spread) variations [13]. Systematic variations are created by
reticle stepper alignment errors, mask errors from inaccuracies in the process model, and
lithographic off-axis focusing errors. The sources of stochastic variations are: wafer
unevenness, non-uniformity in resist thickness, and vibrations during lithography.
2.2 Physical Unclonable Functions (PUFs)
A PUF is a chip level structure that deliberately exploits random process
manufacturing variations to establish the chip’s identity. There are three common types
of delay PUFs that can be used to extract the delay introduced by the process variations:
Arbiter PUF (APUF), Butterfly PUF (BPUF), and ROPUF.
2.2.1 Arbiter Physical Unclonable Function (APUF)
An APUF is composed of two identically configured delay paths that are
stimulated by an activating signal as shown in Figure 2-1. The difference in the
propagation delays of the signals in the two delay paths is measured by an edge triggered
flip-flop known as an arbiter. There are two main components used in an APUF:
switches, and the arbiter [14]. Various response bits can be generated by configuring
different delay paths.
Figure 2-1: Arbiter Physical Unclonable Function (APUF).
2.2.2 Butterfly Physical Unclonable Function (BPUF)
A BPUF consists of two cross coupled latches as shown in Figure 2-2. The BPUF
exploits the random assignment of a stable state from an unstable state that is forcefully
imposed by holding one latch in preset while holding the other in clear mode by an
excitation signal. The final state is determined by the random delay mismatch in the pair
of feedback paths and the excitation signal paths due to process variations [15].
Figure 2-2: Butterfly Physical Unclonable Function (BPUF).
2.2.3 Ring Oscillator Physical Unclonable Function (ROPUF)
A ROPUF is built from ROs, each composed of an odd number of inverters connected in a loop. The RO frequency is generated from
the inverted signal that travels through the RO loop as shown in Figure 2-3. The presence
of process variations inside logic gates and wires causes an uneven delay across the chip.
Figure 2-3: Ring Oscillator Physical Unclonable Function (ROPUF).
A pair of ROs could produce two different frequencies because of the presence of process
variations.
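As a rough illustration of why two nominally identical ROs oscillate at different frequencies, the sketch below uses the ideal ring-oscillator relation f = 1 / (2 × total loop delay), which is a textbook approximation and not a formula taken from this dissertation. The per-inverter delays are hypothetical values chosen only to mimic small process-variation differences.

```python
def ro_frequency_hz(stage_delays_s):
    """Ideal oscillation frequency of a ring oscillator.

    The signal must travel around the loop twice (once per output level)
    to complete one period, so f = 1 / (2 * total loop delay).
    """
    if len(stage_delays_s) % 2 == 0:
        raise ValueError("a ring oscillator needs an odd number of inverters")
    return 1.0 / (2.0 * sum(stage_delays_s))


# Hypothetical per-inverter delays in seconds for two 3-stage ROs.
# The small differences stand in for process variation.
ro_a = [1.00e-9, 1.02e-9, 0.99e-9]
ro_b = [1.01e-9, 0.98e-9, 1.00e-9]

f_a = ro_frequency_hz(ro_a)
f_b = ro_frequency_hz(ro_b)
print(f"RO a: {f_a / 1e6:.1f} MHz, RO b: {f_b / 1e6:.1f} MHz")
```

In this toy model the RO with the smaller total loop delay runs faster, and that frequency ordering is what a ROPUF turns into a response bit.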
2.2.4 PUF Implementation on FPGA
Researchers have compared the implementation of APUF, BPUF and ROPUF on
FPGAs [16]. For APUF and BPUF, the implementation on FPGA is tedious because both
designs need to be symmetric as shown in Table 2.1. It is almost impossible to get a
symmetric design on an FPGA because the design needs to be mapped using a fixed
routing. A ROPUF does not need a symmetric design, which makes it the best
candidate for a PUF on FPGAs. ROPUF implementation on FPGAs requires identical
instantiation of the building blocks as shown in Figure 2-3.
Table 2.1: Comparison of ROPUF, APUF, and BPUF [16].
| ROPUF | APUF | BPUF |
|---|---|---|
| Does not require symmetric routing in a building block. | Requires symmetric routing in a building block. | Requires symmetric routing in a building block. |
| Building blocks require identical instantiation. | Identical instantiation of building blocks may not be necessary. | Identical instantiation of building blocks may not be necessary. |
2.3 Challenge and Response on ROPUF
A response bit from a ROPUF can be generated by comparing the output
signals of two ROs. More response bits can be generated by comparing additional RO
pairs. The selection of RO pairs is determined by the challenge. For example,
the RO1 and RO2 pair generates one response bit and the RO3 and RO4 pair generates another
response bit. All ROs are connected to two MUXs, as shown in Figure 2-4. The challenge
bits for the ROPUF circuit shown in Figure 2-4 are applied at the input of each MUX.
Figure 2-4: ROPUF circuit.
The challenge selects one RO at each MUX. The RO selected by each MUX is fed into a counter, which measures the number of cycles the RO generates in a fixed period of time. After both counters have finished measuring, the comparator compares the two cycle counts and the response bit is generated. The logic used is: if the number of cycles measured by the first counter is larger than the number measured by the second counter, then the response bit is ‘1’; otherwise, it is ‘0’ (or vice versa, depending on the convention chosen).
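The counter-and-comparator logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the cycle counts are hypothetical and are not taken from the thesis hardware, and the convention "first counter larger gives 1" is the one named in the text.

```python
def response_bit(count_a: int, count_b: int) -> int:
    """Return 1 if the first counter saw more RO cycles than the second, else 0."""
    return 1 if count_a > count_b else 0


def ropuf_response(ro_counts, challenge_pairs):
    """Build a response bit string from (index_a, index_b) RO selections.

    Each tuple stands for one challenge: the two indices play the role of
    the select inputs of the two MUXs.
    """
    return "".join(
        str(response_bit(ro_counts[a], ro_counts[b])) for a, b in challenge_pairs
    )


# Hypothetical cycle counts for four ROs measured over the same time window.
counts = [1402, 1398, 1377, 1391]
print(ropuf_response(counts, [(0, 1), (2, 3)]))  # prints "10"
```

Equal counts map to 0 in this sketch; a real design would have to decide how to treat ties.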
2.4 RO Delay
Equation 2-1 shows that the RO delay is composed of three components [11]. The parameter 𝑑𝑎𝑣𝑔 is the delay component that comes from the routing and is the same for all ROs. The parameter 𝑑𝑃𝑉𝑎 is the delay component that comes from the process variations and is expected to be different for different ROs. The parameter 𝑑𝑁𝑂𝐼𝑆𝐸𝑎 comes from noise and is a dynamic component that changes over time. When the delays of two ROs (𝑑𝑎 and 𝑑𝑏) are compared, the common 𝑑𝑎𝑣𝑔 terms cancel. Thus, the delay difference between two ROs comes only from the process variation and noise components.
𝑑𝑎 = 𝑑𝑎𝑣𝑔 + 𝑑𝑃𝑉𝑎 + 𝑑𝑁𝑂𝐼𝑆𝐸𝑎 (2-1)
𝑑𝑏 = 𝑑𝑎𝑣𝑔 + 𝑑𝑃𝑉𝑏 + 𝑑𝑁𝑂𝐼𝑆𝐸𝑏 (2-2)
𝑑𝑎 − 𝑑𝑏 = (𝑑𝑃𝑉𝑎 − 𝑑𝑃𝑉𝑏) + (𝑑𝑁𝑂𝐼𝑆𝐸𝑎 − 𝑑𝑁𝑂𝐼𝑆𝐸𝑏) = ∆𝑑𝑃𝑉 + ∆𝑑𝑁𝑂𝐼𝑆𝐸 (2-3)
Chapter 3
Frequency Uniqueness in ROPUF on FPGA
Hardware security in Field Programmable Gate Arrays (FPGAs) that use PUFs relies on the ability to produce a large number of unique frequencies. This chapter explores frequency uniqueness as it relates to the number of stages used to build an RO.
3.1 Introduction
There are many techniques used by researchers in order to improve the reliability
and uniqueness of ROPUFs [6][8][9][10]. Reliability means that the secret key generated
by the ROPUF will be the same despite changing operating conditions [8]. Uniqueness, on the other hand, refers to the ability of each FPGA to generate its own unique secret key [8].
The measure of how much each ring oscillator frequency varies from the next is
called frequency uniqueness. By increasing the frequency uniqueness of a system, it is
possible to increase the security of that system. Until now, to the best of our knowledge,
there has not been any research on how the number of stages in a ring oscillator PUF
affects the frequency uniqueness. This chapter addresses this issue.
3.2 Experimental Setup
This section explains the procedure used to determine frequency uniqueness for ring oscillators with varying numbers of stages. For this purpose, three FPGA development boards are used. Each board has a single Xilinx Spartan 2 XC2S100 TQ144 FPGA. The three boards provide a total of 60 ring oscillators for each number of stages. Initially, data is obtained at room temperature. Various configurations of ring oscillators are tested, including ring oscillators with 3, 5, 7, 9, and 11 stages. For each configuration, the frequency produced by each ring oscillator is measured and recorded.
The first step in designing the experiment is to create a hard macro for a single ring
oscillator. This ensures that the routing lengths for each ring oscillator are identical. It is
important that the routing lengths are the same so that there is no additional delay in any
single ring oscillator. The total delay for each ring oscillator is as shown in Equation 2-1.
For the FPGAs used in this experiment, one Configurable Logic Block (CLB) is composed of two slices, as shown in Figure 3-1. For the hard macros used in this experiment, each slice contains only one inverter. So an N-stage ring oscillator uses N inverters, N slices, and N/2 CLBs rounded up to the next whole CLB (see Table 3.2). This macro is placed horizontally 20 times on the Spartan 2 FPGA, as shown in Figure 3-2. Each of these ring oscillators is connected to a 1-to-20 demultiplexer, which provides an enable signal for each ring oscillator. The purpose of this is to ensure that neighboring ring oscillators do not cause extra noise while they are not in use. Enabling all oscillators at the same time would also produce extra heat that could affect the frequencies being generated [13].
The output of the ring oscillators is fed to a 20-to-1 multiplexer with the same
select lines as the demultiplexer shown in Figure 3-2. The output of the multiplexer can
be sampled and measured. Measurements for this experiment are done using an Agilent
16801A Logic Analyzer. By using a logic analyzer, the entire waveform produced by the
ring oscillators is observed and counted. The counting feature of the logic analyzer is
particularly useful because the patterns produced by the ring oscillators are not uniform,
as shown in Figure 3-3. So, the frequencies reported are actually the average frequencies.
Figure 3-1: RO mapped using three slices.
Figure 3-2: Test circuit diagram.
Figure 3-3: Frequency pattern for a 3-stage ring oscillator.
3.3 Results and Analysis
In this section, the results of the experiments described in the previous section are
presented and analyzed. This section is divided into two parts relating to the number of
stages used in the ring oscillator and the reasons why some ring oscillators vary in
frequency more than others. The first part focuses on the security of the Physically
Unclonable Functions on an FPGA as it relates to the number of stages used. The second
part focuses on the reasons why 3-stage ring oscillators have a higher variation in terms of
frequency as compared to ring oscillators with more stages.
3.3.1 Impact of Number of Stages on PUF
Figure 3-4 displays the average frequency (MHz) produced by the ring oscillators.
Each line represents the results based on the number of stages that were used while
implementing the ring oscillators on the Spartan 2 FPGA. For each of the different stage
ring oscillators, the frequency produced is nearly constant, except for the 3-stage ring
oscillators. The 3-stage ring oscillators have a much greater frequency variation compared
with others.
While ring oscillators with more than 3 stages may vary by a few MHz, 3-stage
ring oscillators have been shown to vary from 120 MHz to nearly 200 MHz. The
frequencies produced by 5, 7, 9, and 11 stage ring oscillators remain almost constant in
comparison. Their frequencies are centered around the 115 MHz, 80 MHz, 65 MHz and
55 MHz marks, respectively. As the number of stages increases, it appears that the
frequencies become more consistent. The average frequency generated by the ring oscillator also decreases as the number of stages increases, because each added stage introduces more delay into the circuit.
Table 3.1 shows the standard deviation of the frequencies produced by the ring oscillators when configured with different numbers of stages. As the table shows, the standard deviation is very low for all ring oscillators with more than 3 stages. A low standard deviation means the frequencies lie close together, so the ring oscillators may be more susceptible to bit flipping caused by noise and, therefore, to erroneous results. For this reason, these ring oscillators are not suitable for ROPUF applications. The larger standard deviation of the 3-stage ring oscillator, however, makes it more appropriate for ROPUF applications.
Table 3.1: Standard deviation for all RO stages.
Number of stages | Standard deviation (MHz)
3 stages | 11.3
5 stages | 1.6
7 stages | 0.74
9 stages | 1
11 stages | 0.62
In Table 3.1, the highest standard deviation for multistage ring oscillators occurs
when there are 3 stages. This indicates that a 3-stage ring oscillator will have the highest
frequency uniqueness of any of the multistage ring oscillators measured. Due to the high
frequency uniqueness, this model will be useful in generating secret keys, since the
likelihood of flipping the bits is low.
By assuming that there are N ring oscillators that produce unique frequencies, the
circuit will produce log2(N!) bits of entropy [6]. If 60 unique ring oscillators existed on a
device, 272 security bits could reliably be produced. If 100 unique ring oscillators existed,
525 security bits could reliably be produced. This is only possible when each frequency is
sufficiently unique from the others. As the frequency uniqueness is reduced, so is the
possible number of security bits generated.
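The log2(N!) bound quoted above can be checked numerically. The following is a minimal sketch; the function name entropy_bits is an illustrative choice, and the calculation assumes, as the text does, that all N frequencies are distinct enough to be ordered reliably.

```python
import math


def entropy_bits(n_oscillators: int) -> float:
    """Upper bound on entropy, log2(N!), for N ROs with distinct frequencies."""
    # lgamma(n + 1) equals ln(n!), which avoids building the huge factorial.
    return math.lgamma(n_oscillators + 1) / math.log(2)


for n in (60, 100):
    print(n, round(entropy_bits(n)))
```

Rounded to the nearest bit, this gives 272 bits for 60 oscillators and 525 bits for 100 oscillators, in agreement with the figures in the text.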
As the frequencies of two ring oscillators approach each other, the possibility of comparison errors due to noise increases. At one instant, the first frequency may be higher; at another instant, the second might be slightly higher. This results in flipped security bits, making the secured message completely unreadable by the receiving party. To reduce the possibility of this happening, it is important that there be a
minimum difference between all ring oscillator frequencies that will be used. Ring
oscillators that are not sufficiently unique should not be used. To generate a large number
of security bits, it is important to maximize the number of unique frequencies produced by
the ring oscillators.
3.3.2 Explanation of High Frequency Variation in a 3-stage RO
This section discusses the reasons why 3-stage ring oscillators have the highest
variations among the ring oscillators tested. In [18], some of the basic ideas of why lower
stage oscillators have higher process variation are discussed. One of the reasons this
occurs is that increasing the number of stages used by a ring oscillator also increases the
correlation coefficient between the actual delays and the theoretical delays. As more
stages are added, the delay is less dependent on process variation. This means that the
frequencies generated by a ring oscillator will converge to a central frequency as the
number of stages increases.
The authors of [13] have shown that there are two types of variation within-die,
systematic and stochastic. Systematic variation is caused by lithographic off-axis focusing
errors, reticle stepper alignment errors, and mask errors due to inaccuracies in the process
model. Stochastic variation is caused by non-uniformity in resist thickness, vibrations during lithography, and wafer unevenness. The researchers in [11] suggest that systematic variations are the primary cause of process variation and thus the largest influence on frequency uniqueness. Systematic variation imposes certain patterns on the die, and these patterns reduce frequency uniqueness as the number of stages in the ring oscillators increases.
There is a direct relationship between the amount of space consumed on an FPGA
and the number of stages in a ring oscillator. Table 3.2 shows that as the number of stages
increases, so do the number of slices and CLBs used, and therefore, space consumed on
the FPGA also increases. As this area increases, the delay from one ring oscillator will
begin to correspond more with the delay from other ring oscillators [18]. In effective
ROPUF applications, this correspondence should be minimized. By minimizing the
correspondence in delay, the ROPUF will effectively be able to produce more secure bits,
and thus, increase the security of the application.
Table 3.2: Number of slices and CLBs used on the FPGA for a single RO.
Ring oscillator | Slices used | CLBs used
3 stages | 3 | 2
5 stages | 5 | 3
7 stages | 7 | 4
9 stages | 9 | 5
11 stages | 11 | 6
Figure 3-4: Frequencies (MHz) for 3, 5, 7, 9, and 11 stage ring oscillators on (a) FPGA Board 1, (b) FPGA Board 2, and (c) FPGA Board 3. Each plot shows frequency (MHz, axis from 0 to 200) against ring oscillator position (Row 1 to Row 20), with one series per number of stages.
3.4 Summary
The number of stages of a ring oscillator plays a critical role in generating secure bits on an FPGA. By choosing the correct number of stages while designing a ring oscillator, the number of unique frequencies can be maximized. As the number of unique frequencies increases, the number of usable frequency comparisons also increases, which creates more secure bits that could be used in a secret key.
Chapter 4
Relationship between Number of Stages in ROPUF and
CRP Generation
4.1 Introduction
Physical Unclonable Functions (PUFs) are commonly used to prevent hackers from stealing information from semiconductor chips. PUFs utilize the process variations on the chip to create an irreversible function that generates unique response bits for each challenge. A good response bit can be generated by comparing two Ring Oscillator (RO) frequencies that differ by a significant amount. An insignificant frequency difference can cause a bit flip in the generated response bit. A higher threshold for the frequency difference is preferred to rule out bit flips. As the frequency difference threshold (FDT) increases, the number of challenge and response pairs (CRPs) is reduced. In this chapter, it is shown that a higher Standard Deviation (SD) of RO frequencies can compensate for a higher FDT. The Full Scan Technique (FST) is used on ROs with different numbers of stages to determine the number of stages that gives the highest SD of RO frequencies. The experimental results show that the SD of RO frequencies increases as the number of stages decreases. It is also shown that by reducing the number of stages, good Inter-Hamming Distance (HD), Hamming Weight (HW), and percentage of bit flip occurrences can still be obtained.
Despite the promising solution offered by the ROPUF, there are still challenges that need to be overcome for it to become a practical solution. Among them are improving the uniqueness of the ROPUF response and increasing its reliability. Uniqueness refers to the ability of similar ROPUF circuits to generate unique responses on different chips. Reliability refers to the generation of the same response under various environmental conditions, such as temperature and humidity.
This chapter focuses on process variation extraction for a ROPUF on an FPGA. Three different RO stages are tested and compared in terms of the SD, HW, and inter-HD. The three different RO stages are tested using our newly proposed Full Scan Technique (FST), which records frequencies from all CLBs available on the FPGA.
4.2 Background
RO frequency is generated from the inverted signal that travels through the RO
loop, as shown in Figure 4-1. The presence of process variation inside logic gates and
wires causes an uneven delay across the chip. As a result, a pair of ROs will produce two
different frequencies: fa and fb. The frequencies are compared to see if fa is greater than
fb. If fa is greater than fb, response bit 1 is generated; otherwise, the response is 0 as
shown in Equation 4-1.
Figure 4-1: 5-stage RO.
$$\text{Response bit} = \begin{cases} 1 & \text{if } f_a > f_b \\ 0 & \text{otherwise} \end{cases} \tag{4-1}$$
4.2.1 Number of Stages of the RO
In this experiment, ROs with three different numbers of stages are used. Figure 4-1 shows the 5-stage RO, where each component in the RO counts as one stage. The 5-stage RO consists of one NAND gate and 4 inverter gates. The NAND gate is used to control the switching of the RO. The RO is activated (starts to oscillate) when the input is set to high. Figure 4-2 shows the 3-stage RO. The 3-stage RO consists of one NAND gate, one buffer gate and two inverter gates, that is, three inverting components plus one non-inverting buffer. The reason for using a 3-stage RO is explained in Section 4.4. A buffer gate is used instead of another inverter gate because the number of inverting components must be odd in order to produce an oscillation. The buffer gate is added to increase the total delay in the RO and thereby reduce the RO frequency. Finally, Figure 4-3 shows the 7-stage RO. The 7-stage RO consists of one NAND gate and 6 inverter gates.
Figure 4-2: 3-stage RO.
Figure 4-3: 7-stage RO.
4.2.2 RO Parameters
There are a number of parameters proposed to measure PUF performance, such as uniformity, reliability, steadiness, uniqueness, diverseness, bit-aliasing, and probability of misidentification [12][13][14][15][16]. In this research, four existing parameters and one newly proposed parameter are used. The four existing parameters are chosen for their suitability in measuring the performance of the different numbers of stages used in the ROPUF. They are uniqueness, reliability, uniformity and bit-aliasing. The new parameter proposed in this research is diverseness. Uniqueness represents the ability of a PUF to uniquely differentiate a particular chip among a group of chips of the same type [12]. Uniqueness can be measured by calculating the inter-chip HD, as shown in Equation 4-2. In this equation, m is the number of chips used, u and v are the two chips being compared, and n is the number of response bits generated. R_u and R_v are the response bits produced by chips u and v for the same challenge C. HD is the Hamming distance between the response bits generated by chips u and v. A good uniqueness value is around 50%. This means that about 50% of the response bits generated by chips u and v for the same challenge differ from each other.
$$\text{Uniqueness} = \frac{2}{m(m-1)} \sum_{u=1}^{m-1} \sum_{v=u+1}^{m} \frac{HD(R_u, R_v)}{n} \times 100\% \tag{4-2}$$
Reliability refers to how consistently a PUF reproduces its response bits. Reliability can be measured using Equations 4-3 and 4-4. R_s is the response from chip s at the normal operating condition (room temperature). R'_{s,t} is the t-th sample of the response from chip s at a different operating condition, such as a different temperature setting, and k is the number of such samples. A good reliability value is 100%. As can be seen in Equation 4-4, if the intra-chip HD (the comparison of the responses under normal and different operating conditions) is low or zero, then the reliability is close to 100%.
$$\text{Intra-chip HD} = \frac{1}{k} \sum_{t=1}^{k} \frac{HD(R_s, R'_{s,t})}{n} \times 100\% \tag{4-3}$$
$$\text{Reliability} = 100\% - \text{Intra-chip HD} \tag{4-4}$$
Uniformity estimates how uniform the ratio of ‘0’s and ‘1’s is in the response bits
of a PUF. Uniformity can be measured by calculating the intra-chip Hamming Weight
HW, as shown in Equation 4-5 where rs,l is the l-th binary bit. A good value for
uniformity is around 50%, which means the response from RO is well distributed
between ‘0’s and ‘1’s.
$$\text{Uniformity} = \frac{1}{n} \sum_{l=1}^{n} r_{s,l} \times 100\% \tag{4-5}$$
Bit-aliasing estimates the uniformity of ‘1’s and ‘0’s in each bit in the responses
across a group of chips of the same type. Bit-aliasing can be measured by calculating the
inter-chip HW, as shown in Equation 4-6. A good value for bit-aliasing is 50%, which
means each bit in the responses across a group of chip is well distributed between ‘0’s
and ‘1’s. Uniformity and bit-aliasing are the parameters that can measure the randomness
features in the responses generated.
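$$\text{Bit-aliasing} = \frac{1}{m} \sum_{i=1}^{m} r_{i,l} \times 100\% \tag{4-6}$$
where r_{i,l} is the l-th response bit of chip i and m is the number of chips.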
Finally, the new parameter, diverseness, is used to measure the spread of frequencies for each number of stages used in a ROPUF. Diverseness is measured by calculating the standard deviation (SD) of the RO frequencies for each number of stages, as shown in Equations 4-7, 4-8, and 4-9. h is the number of ROs used on a chip. f_{i,j} is the mean frequency of the j-th RO in the i-th chip, f_{i,j,q} is the q-th frequency sample of that RO, Q is the number of frequency samples per RO, and f_{avg} is the average frequency on a chip.
$$\text{Diverseness} = \sqrt{\frac{1}{h-1} \sum_{j=1}^{h} \left( f_{i,j} - f_{avg} \right)^2} \tag{4-7}$$
$$f_{i,j} = \frac{1}{Q} \sum_{q=1}^{Q} f_{i,j,q} \tag{4-8}$$
$$f_{avg} = \frac{1}{h} \sum_{j=1}^{h} f_{i,j} \tag{4-9}$$
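The five parameters defined in this section can be computed directly from response bits and frequencies. The sketch below is one possible reading of Equations 4-2 to 4-9, not the code used in the thesis: responses are assumed to be lists of 0/1 integers, all function names are illustrative, and the sample data in the usage lines is made up.

```python
from itertools import combinations
from statistics import mean, stdev


def hamming_distance(a, b):
    """Number of positions at which two equal-length bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


def uniqueness(responses):
    """Equation 4-2: mean pairwise inter-chip HD, as a percentage."""
    n = len(responses[0])
    pairs = list(combinations(responses, 2))  # m(m-1)/2 chip pairs
    return 100.0 * sum(hamming_distance(u, v) / n for u, v in pairs) / len(pairs)


def reliability(reference, samples):
    """Equations 4-3 and 4-4: 100% minus the mean intra-chip HD."""
    n = len(reference)
    intra_hd = 100.0 * mean(hamming_distance(reference, s) / n for s in samples)
    return 100.0 - intra_hd


def uniformity(response):
    """Equation 4-5: percentage of 1 bits in one chip's response."""
    return 100.0 * sum(response) / len(response)


def bit_aliasing(responses, bit_index):
    """Equation 4-6: percentage of chips whose bit at bit_index is 1."""
    return 100.0 * sum(r[bit_index] for r in responses) / len(responses)


def diverseness(ro_mean_frequencies):
    """Equation 4-7: sample standard deviation (h - 1 divisor) of RO frequencies."""
    return stdev(ro_mean_frequencies)


# Made-up 4-bit responses from three hypothetical chips.
chips = [[1, 0, 1, 1], [0, 0, 1, 0], [1, 1, 0, 1]]
print(uniformity(chips[0]))                # 75.0
print(diverseness([100.0, 102.0, 104.0]))  # 2.0
```

The input to diverseness is assumed to be the per-RO mean frequencies, so the averaging of Equation 4-8 is expected to happen before the call.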
4.3 Experiment Setup
In this experiment, three Xilinx Spartan 2 XSA-100 boards are used. There are 600 CLBs on each chip, as shown in Figure 4-4 [19]. Each CLB contains two slices and each slice contains two Lookup Tables (LUTs). One stage in the RO occupies one LUT. One CLB is used for each 3-stage RO and two CLBs are used for each 5-stage and 7-stage RO. Six hundred 3-stage ROs and 300 each of the 5-stage and 7-stage ROs are mapped on each chip.
Figure 4-4: Xilinx Spartan 2 CLBs layout.
The FPGA area is divided into two parts, left and right (300 CLBs in each part). The experiment is run twice for each chip and RO stage. The first run occupies the right area with ROs and the left area with the other circuits needed, such as the MUXs and counters. The blue boxes in Figure 4-4 show the occupied CLBs. As can be seen on the right side of Figure 4-4, 300 ROs occupy half of the FPGA. The other half of the FPGA is partially occupied by the other logic used in the FST. In the second run, the areas are swapped: the left area is used for ROs and the right area for the other logic. For each RO, the frequency is recorded 10 times. Overall, 18,000 frequencies for the 3-stage ROs and 9,000 frequencies each for the 5-stage and 7-stage ROs are recorded.
Figure 4-5 shows the logic blocks of the FST test circuit. The challenge generator produces the MUX inputs that activate one RO at a time. Each RO is activated for 0.4 ms, and there is a 0.1 ms idle period before the next RO is activated. This reduces the noise, in the form of heat, generated by the adjacent CLB [20]. The ROs are activated from the top to the bottom of each column of CLBs. A 0.2 ms gap between RO activation and counter activation allows the signal to stabilize before the measurement starts. The timing controller regulates all time intervals involved, such as the interval for which each RO is activated and the interval over which the counter measures each RO.
Figure 4-5: FST circuit diagram.
Frequency is computed using Equation 4-10, where x is the cycle count from each RO and y is the cycle count of the 50 MHz reference clock. The preset value for y is 7,000 cycles, which means the RO cycles are measured within a 0.14 ms period. The resolution of the measurement is about 0.007 MHz per counted cycle, which is adequate to record the differences between the frequencies generated by the ROs.
$$f = \frac{x \times 50}{y}\ \text{MHz} \tag{4-10}$$
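The arithmetic behind Equation 4-10 and the quoted measurement window can be reproduced as follows. This is a minimal sketch: the constant and function names are illustrative, and the RO cycle count of 25588 is a hypothetical value chosen only to land near the frequencies discussed in this chapter.

```python
REFERENCE_CLOCK_MHZ = 50.0  # reference clock frequency stated in the text
REFERENCE_CYCLES = 7000     # preset reference count y stated in the text


def ro_frequency_mhz(ro_cycles, ref_cycles=REFERENCE_CYCLES,
                     ref_clock_mhz=REFERENCE_CLOCK_MHZ):
    """Equation 4-10: f = x * 50 / y, in MHz."""
    return ro_cycles * ref_clock_mhz / ref_cycles


# Length of the counting window in milliseconds: y cycles of a 50 MHz clock.
window_ms = REFERENCE_CYCLES / (REFERENCE_CLOCK_MHZ * 1e3)

# Frequency change represented by one additional RO cycle in the window.
resolution_mhz = ro_frequency_mhz(1)

print(window_ms)                          # 0.14
print(round(resolution_mhz, 4))           # 0.0071
print(round(ro_frequency_mhz(25588), 2))  # 182.77
```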
4.4 Results and Analysis
Response bits from 3, 5, and 7-stage ROs are generated to calculate diverseness, uniformity, uniqueness, bit-aliasing, and reliability. The response bits are generated using the chain-like neighbor coding method, in which neighboring ROs are compared [6]. The first response bit is generated by comparing RO1, mapped in row 1 and column 1 of the CLB array, with RO2, mapped in row 2 and column 1. Equation 4-1 is used for the comparison.
Table 4.1 shows the diverseness, uniformity, uniqueness, and bit-aliasing for 3, 5, and 7-stage ROs. The diverseness of frequencies for the 3-stage RO is the highest compared to the 5 and 7-stage ROs. The results in Table 4.1 clearly show that as the number of stages used in ROs is reduced, the diverseness of the RO frequencies increases. However, there is a limit on the usability of ROs with a low number of stages because each FPGA chip has a maximum operating frequency. The Spartan 2 FPGA family has a maximum operating frequency of 200 MHz [19]. The lowest number of RO stages that can be used on the Spartan 2 FPGA is 4 because the average frequency generated from the 4-stage RO on Spartan 2 is 182.77 MHz, which is below this limit. The average frequency generated from the 3-stage RO on Spartan 2 is 220 MHz, which exceeds the maximum operating frequency for Spartan 2. If an RO produces a frequency beyond the operating frequency of the FPGA, the frequency from the RO cannot be measured correctly. Frequencies generated from the ROs for all stages are verified using the Agilent Logic Analyzer, where the RO output is connected directly to the output pin of the FPGA board [17].
A high diverseness of RO frequencies is good for a ROPUF because it indicates a large amount of frequency variation. High frequency variation is desirable for generating a large number of good CRPs, which are discussed later in this section. For authentication in ROPUF applications, a challenge cannot be reused because the response bits traverse the open domain for verification and are susceptible to adversary attack, so reuse reduces the security level of the ROPUF [6]. This means that in order to make the ROPUF effective, an ample number of CRPs is needed.
A good RO should exhibit high diverseness of RO frequencies and should also have good uniformity and uniqueness. As mentioned in Section 4.2, a good average value for uniformity and uniqueness is 50%. For the uniformity average, the 5-stage ROs have the highest value, and the 7-stage ROs have the lowest. However, the difference in uniformity between the two is only about 1.56 percentage points (47.7146% versus 46.1538% in Table 4.1). The average uniformity results for all stages used can be considered good, as the values are close to 50%. A uniformity value close to 50% means the secret bits generated are evenly distributed between 1s and 0s, which is a desired randomness characteristic.
Table 4.1: Uniqueness, bit-aliasing, diverseness, and uniformity for 3, 5, and 7-stage ROs.
Stage | Uniqueness (%) | Bit-aliasing (%) | Diverseness (MHz) | Uniformity (%)
3 | 40.178 | 47.0228 | 1.9469 | 47.0228
5 | 34.5596 | 47.7146 | 1.2375 | 47.7146
7 | 40.5797 | 46.1539 | 0.736 | 46.1538
Table 4.1 shows that the 3-stage and 7-stage ROs have better uniqueness than the 5-stage ROs for inter-chip measurements. Average uniqueness is obtained by comparing the responses generated from all three FPGA chips. The greater the differences between the responses from each chip, the higher the value of uniqueness. It is important to make sure that the uniqueness is high because this indicates that the ROPUF could generate unique responses from a large number of FPGA chips under the same challenge.
For bit-aliasing, the 5-stage RO has the highest percentage at 47.71%, and the 7-stage RO has the lowest percentage at 46.15%. Nevertheless, all stages have good bit-aliasing percentages that are close to 50%. Table 4.2 shows the diverseness of RO frequencies for 3, 5 and 7-stage ROs on each FPGA chip used. The 3-stage ROs have the highest SD of RO frequencies on all three chips compared to the other stages. These results are consistent with the previous results presented in [17], where it was shown that as the number of stages used in an RO is reduced, the diverseness of the RO frequencies obtained is higher. All three FPGA chips show the same pattern: the diverseness of RO frequencies increases as the number of stages in the ROs is reduced.
Table 4.2: SD for Chip 1, Chip 2, and Chip 3.
Standard Deviation (MHz)
CHIP 1 CHIP 2 CHIP 3
3-Stage 2.117440 2.586673 1.136851
5-Stage 0.938292 1.878756 0.895488
7-Stage 0.828066 0.821414 0.558649
We also study the effect of the number of stages on a ROPUF based on the
percentage of bit flip occurrences. To calculate the percentage of bit flip occurrences,
responses are recorded at different environmental conditions. In this experiment,
responses from 3, 5 and 7-stage ROs are generated at four different temperature settings,
as shown in Table 4.3. The experiment is conducted in a temperature controlled test
chamber. The frequencies from each RO are recorded 10 times at each temperature
setting. The responses are generated by comparing the average RO frequencies obtained.
The bit generation equation used is shown in Equation 4-1.
All responses obtained at the various temperature settings are compared with the responses generated at room temperature. The results obtained are shown in Table 4.3. The lowest reliability is 97.32% at 0°C for the 5-stage ROs. In this case 8 bits flipped out of 299 bits. The highest reliability is found to be 99.33% at 20°C for the 3 and 7-stage ROs. In this case 4 bits flipped out of 599 bits for the 3-stage ROs, and 2 bits flipped out of 299 bits for the 7-stage ROs. From Table 4.3, it can be observed that reducing the number of stages in ROs has no direct relationship with the percentage of bit flip occurrences.
Table 4.3: Reliability for Chip 3.
Reliability %
ROs stage 0°C 20°C 45°C 70°C
3 98.1636 99.3322 98.9983 98.9983
5 97.3244 98.9967 98.9967 97.9933
7 98.3278 99.3311 98.9967 98.6622
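As a check on how the reliability percentages relate to bit flips, Equations 4-3 and 4-4 can be applied to the flip counts quoted in this section. The working below treats each temperature setting as a single comparison against the room-temperature response (k = 1), which is an assumption made here for illustration.

```latex
\begin{align*}
\text{Reliability} &= 100\% - \frac{8}{299} \times 100\% \approx 97.3244\% \\
\text{Reliability} &= 100\% - \frac{4}{599} \times 100\% \approx 99.3322\% \\
\text{Reliability} &= 100\% - \frac{2}{299} \times 100\% \approx 99.3311\%
\end{align*}
```

These values coincide with three entries of Table 4.3: the 5-stage ROs at 0°C, the 3-stage ROs at 20°C, and the 7-stage ROs at 20°C, respectively.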
We also investigate the relationship between the number of stages used in ROs and CRP generation. To do this, all possible comparison pairs need to be generated. It should be noted that there is a difference between a challenge and a comparison pair. A challenge is the selection of the comparison pairs that form a response bitstream. One challenge can consist of many comparison pairs, depending on the design of the challenge and the length of the response. For example, a challenge that produces a 128-bit response might have 128 comparison pairs.
Figure 4-6 shows the list of possible challenge formations. The first three response bits are generated from Pair 1, Pair 2, and Pair 3. Assuming that the comparison results for Pair 1, Pair 2, and Pair 3 are 1, 0, and 1, the response bits are 101. The challenge for this response is the combination of the MUX inputs for Pair 1, Pair 2, and Pair 3, which is 0000 0001 0010. The number of possible challenges is given by n!/((n-r)! r!), where n is the number of available comparison pairs and r is the number of response bits. As the number of available comparison pairs increases, the number of possible challenges will also increase.
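The challenge count n!/((n-r)! r!) is the binomial coefficient, which Python exposes as math.comb (available from Python 3.8). The sketch below uses hypothetical values of n and r purely to show how quickly the count grows; the function name is illustrative.

```python
import math


def possible_challenges(available_pairs: int, response_bits: int) -> int:
    """Number of ways to choose r comparison pairs out of n: n! / ((n-r)! r!)."""
    return math.comb(available_pairs, response_bits)


# Hypothetical example: 10 available comparison pairs, 3-bit response.
print(possible_challenges(10, 3))  # 120
# Doubling the available pairs increases the count almost tenfold.
print(possible_challenges(20, 3))  # 1140
```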
Figure 4-6: Difference between comparison pairs and CRPs.
The easiest way to generate all possible comparison pairs is by selecting a sort algorithm that has O(n^2) complexity. As mentioned earlier, the comparison pairs generated need to be good. This means each comparison pair needs to pass a certain Frequency Difference Threshold (FDT). To determine the FDT, the frequency differences at all bit flip occurrences on all FPGA chips are checked. The maximum frequency difference at which a bit flip occurs is set as the FDT. It is observed that the majority of bit flips occur when the frequency difference between ROs is 1 MHz or below. The maximum frequency difference at which bit flips occur is 3.5 MHz, which is therefore used as the FDT.
The pseudocode of the algorithm used to generate the comparison pairs is shown below. The inputs to the algorithm are all RO frequencies generated at room temperature. The algorithm compares the frequency of each RO with that of every other RO available, which has O(n^2) complexity. If the frequency difference exceeds the FDT, those two ROs are selected as a comparison pair.
Comparison Pair Generation in pseudocode
Input:
  1) 600 frequencies for 3-stage ROs and 300 frequencies for 5 and 7-stage ROs, represented as RO frequencies(i).
  2) n is equal to the number of ROs.
Output:
  1) The list of all possible RO comparison pairs that passed the FDT, represented as comparison pair(k,i).
Algorithm
  i <- 0, j <- 0, k <- 1
  for i = 1 to n-1
      for j = i + 1 to n
          frequency difference = absolute(RO frequencies(i) - RO frequencies(j))
          if frequency difference > FDT
              comparison pair(k,1) = i
              comparison pair(k,2) = j
              k++
          end if
      end for
  end for
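For readers who prefer runnable code, the comparison pair generation pseudocode can be written in Python roughly as follows. This is a sketch rather than the thesis implementation: the function name and the four sample frequencies are hypothetical, and indices are kept 1-based to match the pseudocode.

```python
from itertools import combinations


def comparison_pairs(frequencies, fdt):
    """Return 1-based (i, j) RO index pairs whose frequency difference exceeds fdt."""
    pairs = []
    # combinations() visits every unordered pair once, i.e. n(n-1)/2 comparisons.
    for (i, f_i), (j, f_j) in combinations(enumerate(frequencies, start=1), 2):
        if abs(f_i - f_j) > fdt:
            pairs.append((i, j))
    return pairs


# Hypothetical room-temperature frequencies (MHz) for four ROs.
freqs = [180.0, 183.0, 184.0, 176.0]
print(comparison_pairs(freqs, fdt=3.5))  # [(1, 3), (1, 4), (2, 4), (3, 4)]
```

Raising fdt can only shrink the returned list, which mirrors the trend reported for increasing FDT values.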
Table 4.4 shows the results obtained for comparison pair generation. The highest number of comparison pairs is generated from the 3-stage ROs on FPGA chip 2 at an FDT of 2 MHz. The lowest number of comparison pairs (a single pair) is generated from the 7-stage ROs on FPGA chip 1 at an FDT of 3.5 MHz and on FPGA chip 3 at FDTs of 3 and 3.5 MHz. In general, Table 4.4 shows that the number of comparison pairs generated is higher when the number of stages used in the RO is reduced. As the FDT increases, the number of comparison pairs is reduced.
Table 4.4: Number of comparison pairs generated on Chip 1, Chip 2, and Chip 3, by Frequency Difference Threshold (FDT).
FPGA Chip | ROs stage | FDT 2 MHz | FDT 2.5 MHz | FDT 3 MHz | FDT 3.5 MHz
1 | 3 | 45757 | 28746 | 16606 | 8685
1 | 5 | 5955 | 2749 | 1116 | 381
1 | 7 | 1221 | 167 | 8 | 1
2 | 3 | 102800 | 95161 | 86713 | 75831
2 | 5 | 22287 | 18422 | 14036 | 9776
2 | 7 | 1171 | 525 | 342 | 302
3 | 3 | 37932 | 23095 | 12769 | 7122
3 | 5 | 3811 | 2790 | 843 | 150
3 | 7 | 154 | 44 | 1 | 1
As mentioned earlier, the FDT used to filter all the bit flip occurrences is 3.5 MHz.
In Table 4.4, it is observed that the 7-stage RO is adversely affected by the higher value
of FDT. The comparison pairs generated from 7-stage ROs are 302 on chip 2, and only 1
on chips 1 and 3. This shows that 7-stage ROs cannot be used in ROPUF as the lower
number of comparison pairs generated diminishes the ROPUF application. For 5-stage
ROs, the comparison pairs generated at FDT 3.5 MHz are very low for chips 1 and 3 (381
and 150). The 3-stage ROs have the highest number of comparison pairs that can be
generated at FDT equal to 3.5 MHz.
4.5 Summary
This experiment was run on the Xilinx Spartan 2 FPGA chip that uses 180 nm
semiconductor process technology. Therefore, the conclusions are based on the Xilinx
Spartan 2 FPGA and cannot be generalized to other FPGA technologies. For the
Spartan 2 FPGA chips, it can be concluded that the diverseness of RO frequencies
increases as the number of stages is reduced. The lowest number of stages that can be
used in an RO is dependent on the operating frequency of the FPGA chip. For Spartan 2
FPGA, the maximum operating frequency is 200 MHz. Therefore, the lowest number of
RO stages that can be used is 4 as the frequency produced from a 3-stage RO exceeds the
maximum operating frequency. This chapter shows that the lower number of stages used
in an RO does not compromise the uniqueness, uniformity, bit-aliasing, and reliability of
the ROs. The relationship between the number of stages used in the ROs and CRPs is
also established experimentally. It is found that more comparison pairs are generated
when a lower number of stages is used.
Chapter 5
A Novel RPM Technique to Minimize Systematic
Variations
Because PUFs rely highly on process variations, the response bits generated are
governed by the systematic process variations which reduce the randomness in the
response bits. In this chapter, we describe a novel Random Patch Mixer (RPM) technique
to minimize the systematic variation effects on the response bits. The RPM technique is
applied on data obtained from FPGA chips. It is shown that the RPM technique
successfully nullifies the systematic variation effect on the response bits generated by the
ROPUF on FPGA. It also demonstrates that the responses generated after application of
the RPM Technique pass the NIST statistical test for randomness [21].
5.1 Introduction
The ROPUF produces a stream of ‘1’s and ‘0’s based on the process variation on
a silicon chip. The process variation is a random process that occurs during silicon chip
fabrication and is caused by inaccuracy in the fabrication process. This inaccuracy
produces a small delay that is not visible in the functional operation of the circuit on a
silicon chip. The ROPUF magnifies this small delay through the frequency generation
from a Ring Oscillator (RO). The difference in the frequencies generated by ROs is used
to generate a random binary bit stream, which is used for authentication or producing a
cryptographic encryption and decryption key [6]. The random binary bit stream is known
as the response. Each response is generated by a given challenge from the user. A
challenge is a binary bit stream that selects the RO pairs for comparison. Each challenge
produces a unique response.
The response generated from the ROPUF needs to be truly random for it to be
used for authentication or developing the cryptographic key. One way to verify true
randomness is by applying the NIST statistical test for randomness. Generating a truly
random response is one of the main challenges in a ROPUF. By default, a ROPUF does
not produce a truly random response because the process variation is not completely
random [8]. In silicon chips, there are two types of variations, systematic and stochastic
[13]. Systematic variation is caused by process and equipment non-uniformity, dissimilar
interactions between circuit layout and chemical mechanical polishing process, and the
gradient of thermal annealing[13][27][28]. A stochastic variation is caused by the random
component that accounts for the difference between the observed data and the model
estimates. These include atomic-level stochastic phenomena, such as random dopant
profiles, any unidentified patterns, and measurement errors [3][28][29]. It seems that
systematic variation is more prevalent than stochastic variation in a ROPUF [8].
A systematic variation has a direct link with the true randomness in the response
generated from a ROPUF. It has been shown that the immediate output from a ROPUF
fails the NIST statistical test for randomness [29]. Therefore, the responses generated
from an ROPUF are not truly random [29]. Amongst the several techniques used to deal
with systematic variation is the regression based distiller [29]. The regression based
distiller is based on the polynomial regression and is applied before the secret selection
step. The regression based distiller has high computational cost when implemented on the
hardware.
In this chapter, a new Random Patch Mixer (RPM) technique used to cancel the
systematic variation effect on an FPGA chip is developed.
5.2 Systematic Variations and Bit Flip Minimization
5.2.1 The Effect of Systematic Variations
As shown in Figure 5-1, systematic variations cause neighboring frequencies to be
correlated with each other. The graph shows a repeating pattern. Response bits are
generated by comparing the neighboring ROs, RO-n and RO-n+1. The response is 1 if
the frequency of RO-n is greater than that of RO-n+1; otherwise the response is 0.
For RO-1 to RO-20, the response A is 00110000100100100101, and for RO-21 to RO-40,
the response B is 01010100110110100101. RO-1 to RO-20 are mapped on the first column
of the CLBs and RO-21 to RO-40 are mapped on the second column of the CLBs, as shown
in Figure 5-2. The Hamming distance between the two 20-bit responses is 5, which is
very low. This implies that responses A and B are 75% similar to each other, which
reduces the randomness of the responses.
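The Hamming distance and similarity figures quoted for responses A and B can be checked with a few lines of Python. The bit strings are taken from the text; the function name `hamming_distance` is only an illustrative choice.

```python
def hamming_distance(a, b):
    """Count the positions at which two equal-length bit strings differ."""
    if len(a) != len(b):
        raise ValueError("responses must have equal length")
    return sum(x != y for x, y in zip(a, b))


response_a = "00110000100100100101"  # RO-1 to RO-20 (first CLB column)
response_b = "01010100110110100101"  # RO-21 to RO-40 (second CLB column)

hd = hamming_distance(response_a, response_b)
similarity = 100.0 * (len(response_a) - hd) / len(response_a)
print(hd, similarity)  # 5 75.0
```

A distance near half the response length (about 10 of 20 bits) would indicate uncorrelated columns, so the observed value of 5 reflects the column-to-column correlation described above.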
5.2.2 Regression Based Distiller
One way to normalize the systematic variation effect on frequency distribution is
to apply the regression-based distiller technique as proposed in [13]. This technique uses
polynomial regression to capture the systematic variation. The regression-based distiller
technique eliminates the systematic variation effect for polynomial regression of order 2
and above. The regression-based distiller has been tested on several challenge selection
techniques such as S-sequence, T-sequence, 1-out-of-8 coding and neighbor coding. The
responses generated from all 4 techniques are evaluated using the NIST randomness test,
but none of the responses fully pass the tests. Nonetheless the response from S-sequence,
T-sequence and 1-out-of-8 coding pass most of the tests. Only the neighbor coding
technique failed the entire test.
Figure 5-1: Frequencies across columns 1-3 of CLBs.
Figure 5-2: Location of ROs on Spartan 2 FPGA.
The regression-based distiller also incurs more computational cost. It can be seen
that the majority of the challenge selection techniques require a polynomial regression of
order 2 or above to pass the NIST randomness test.
5.2.3 RPM Technique
The RPM technique is based on the uniform random number generated from the
pseudorandom number generator. There are three steps involved in this technique:
a) Generation of N uniform random numbers that range from 0 to 1 (N is equal to
number of ROs) as shown in Equation 5-1.
b) Normalization of the random numbers generated to the maximum value of RO
frequency difference from the average RO frequency on an FPGA chip as given
by Equation 5-2, 5-3, and 5-4. The normalized random numbers will be the Patch
of the RO frequencies.
c) Addition of Patch to the RO frequencies. We use Equation 5-5 to determine
how the Patch can be added to the RO frequencies.
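$$\mathrm{PRN}(x) = \{x_i, x_{i+1}, \ldots, x_n\}, \quad i = 1, 2, 3, \ldots, n, \quad 0 < x < 1 \tag{5-1}$$

$$f_{avg} = \frac{1}{n} \sum_{i=1}^{n} f_i \tag{5-2}$$

$$a = \max\{f_i - f_{avg},\ f_{i+1} - f_{avg},\ \ldots,\ f_n - f_{avg}\} \tag{5-3}$$

$$\hat{x} = a\,\{x_i, x_{i+1}, \ldots, x_n\} \tag{5-4}$$

$$f_i' = \begin{cases} \hat{x}_i + f_i, & \text{if } f_i < f_{avg} \\ \hat{x}_i - f_i, & \text{otherwise} \end{cases} \tag{5-5}$$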
The RPM technique is designed to be simple and easy to implement on the
ROPUF circuit. The technique utilizes the normalized pseudorandom numbers (the Patch) to
improve the frequency distribution randomness. The Patch used is different for each of
the FPGA chips used. The Patch is stored in memory as part of the ROPUF circuit. A
concern with ROPUF security is whether or not an adversary could predict the responses
if the Patch of a certain ROPUF is known. In a later section, the responses generated after
the RPM technique has been applied will be shown to have no correlation with the
Patch; hence, the responses cannot be predicted from a Patch.
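The three RPM steps of Section 5.2.3 can be sketched in Python as follows. This is an illustration under stated assumptions, not the thesis implementation: the function names, the seed parameter, and the sample frequencies are hypothetical. In particular, the sketch reads the second branch of Equation 5-5 as subtracting the Patch from an above-average frequency; the equation as printed has the operands in the opposite order, so this reading is an assumption.

```python
import random


def rpm_patch(frequencies, seed=None):
    """Steps (a) and (b): draw uniform random numbers and scale them into a Patch."""
    rng = random.Random(seed)
    n = len(frequencies)
    # Eq. 5-1: n uniform numbers. random() returns values in [0, 1).
    x = [rng.random() for _ in range(n)]
    # Eq. 5-2: average RO frequency on the chip.
    f_avg = sum(frequencies) / n
    # Eq. 5-3: largest deviation above the average.
    a = max(f - f_avg for f in frequencies)
    # Eq. 5-4: normalized random numbers, i.e. the Patch.
    patch = [a * x_i for x_i in x]
    return patch, f_avg


def apply_patch(frequencies, patch, f_avg):
    """Step (c): add the Patch below the average, subtract it otherwise.

    ASSUMPTION: the 'otherwise' branch is taken as f_i - patch_i.
    """
    return [f + p if f < f_avg else f - p for f, p in zip(frequencies, patch)]


# Hypothetical RO frequencies in MHz, for illustration only.
freqs = [198.0, 199.5, 203.0, 205.5]
patch, f_avg = rpm_patch(freqs, seed=1)
patched = apply_patch(freqs, patch, f_avg)
print(f_avg, patch, patched)
```

Under this reading, each frequency moves toward the chip average by a random amount bounded by the largest above-average deviation.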
5.3 Results and Discussion
In this section, the RPM technique is applied to the frequencies obtained from 3-
stage ROs on 29 Spartan 3E FPGA chips. The Spartan 3E chip has 240 CLBs and each
CLB has 4 slices with two LUTs on each slice. Each stage occupies one LUT. Thus, a 3-
stage RO can be fitted in one CLB. A total of 240 3-stage ROs are mapped on each
FPGA chip.
5.3.1 Systematic Variation Effect on Frequency Distribution
Figure 5-3 shows the frequency distributions across Spartan 3E FPGA 1, 2, and 3
for 3-stage ROs. In Figure 5-3 (a), the ROs on the left side region of the FPGA chip 1
produce lower frequencies, and the ROs in the center region produce higher frequencies.
It can be observed that the frequency of each RO is close to the neighboring ROs’
frequency. A similar trend can be observed on all three FPGA chips shown in Figure 5-3.
Figure 5-3: 3-stage RO frequencies across Spartan 3E. (a) Board 1, (b) Board 2, (c) Board 3.
It is observed that for the 29 Spartan 3E FPGA chips, high frequency ROs are
grouped mostly in the center region and low frequency ROs are distributed around the
high frequency ROs’ region. Figure 5-3 shows the effect of systematic variation on the
ROs frequency distribution.
5.3.2 Systematic Variations Minimization by RPM Technique
Figure 5-4 shows the ROs frequency distribution after the RPM technique is
applied to FPGA chips 1, 2, and 3. It can be observed that the RPM technique efficiently
increases the frequency distribution randomness and minimizes the systematic variation
effect on the frequency distribution across FPGA chips 1, 2, and 3. There are no more
low and high frequency regions, as can be observed in Figure 5-3.
Figure 5-4: 3-stage RO frequencies across Spartan 3E FPGA after RPM technique has
been applied. (a) Chip 1, (b) Chip 2, (c) Chip 3.
Responses from FPGA Spartan 3E chip 1 before and after the RPM technique is
applied are compared to verify that the RPM technique successfully minimizes the
systematic variation effect on the responses generated. A response is generated using
neighbor coding. A total of 65 response bits are generated from each FPGA chip, as
shown in Table 5.1 and 5.2. The response bits are generated from the center part of the
Spartan 3E FPGA where the systematic variation is visibly present. The correlation for
responses before and after the RPM technique is applied is compared by measuring the
hamming distance (HD) percentage between each CLB column (comparisons are made
for CLB columns 7 until 11). For this comparison, an acceptable HD percentage is 50%
or higher, which means the responses generated between the CLBs columns have around
50% dissimilarity. Table 5.1 shows the response bits generated from each CLBs column
(table on the left) and the HD between neighboring CLBs column (table on the right)
before the RPM technique is applied. The results in Table 5.1 show that the HD
percentages for CLBs column 7-8 and 8-9 are low which represents the effect of the
systematic variation.
Table 5.2 shows the response bits generated from each CLBs column (table on
the left) and the HD between neighboring CLBs column (table on the right) after the
RPM technique is applied. The results in Table 5.2 show how the RPM technique
successfully minimized the systematic variation effect on the response generated. The
HD percentage for CLBs column 7-8 is increased from 15.38% to 46.15% and for CLBs
column 8-9 is increased from 30.77% to 53.85%.
Table 5.1: Hamming Distance (HD) for FPGA chip 1 before RPM is applied.
                  Response bits, CLBs column      HD between CLBs columns
                   7    8    9   10   11           7-8    8-9   9-10  10-11
                   0    0    0    0    0             0      0      0      0
                   1    1    1    1    1             0      0      0      0
                   0    0    0    0    1             0      0      0      1
                   1    0    0    1    0             1      0      1      1
                   1    1    1    0    1             0      0      1      1
                   0    0    1    0    0             0      1      1      0
                   1    1    1    1    1             0      0      0      0
                   1    1    0    1    0             0      1      1      1
                   0    0    0    0    1             0      0      0      1
                   1    1    0    1    0             0      1      1      1
                   1    0    1    0    0             1      1      1      0
                   0    0    0    0    1             0      0      0      1
                   1    1    1    1    0             0      0      0      1
Total HD                                             2      4      6      8
HD Percentage %                                  15.38  30.77  46.15  61.54
Table 5.2: Hamming Distance (HD) for FPGA chip 1 after RPM is applied.
                  Response bits, CLBs column      HD between CLBs columns
                   7    8    9   10   11           7-8    8-9   9-10  10-11
                   1    1    1    1    0             0      0      0      1
                   1    0    0    1    0             1      0      1      1
                   0    0    0    1    1             0      0      1      0
                   1    1    0    0    0             0      1      0      0
                   0    0    1    0    1             0      1      1      1
                   0    0    0    0    0             0      0      0      0
                   0    1    0    1    1             1      1      1      0
                   1    0    1    0    1             1      1      1      1
                   0    0    1    0    1             0      1      1      1
                   0    1    0    1    0             1      1      1      1
                   1    1    0    1    0             0      1      1      1
                   1    0    0    1    1             1      0      1      0
                   0    1    1    0    0             1      0      1      0
Total HD                                             6      7     10      7
HD Percentage %                                  46.15  53.85  76.92  53.85
The responses from Patch values for each FPGA chip used are generated to
measure the correlations between a Patch and the responses generated from the ROs
frequency after applying the RPM technique. A total of 239 bits are generated from each
Patch and ROs’ frequency. This is done to ensure that no correlation exists between the
Patch and response generated. If there is a correlation between the Patch and response
generated, the level of ROPUF security is compromised. HD percentage is used to
measure the correlation between the response generated from the Patch and the ROs’
frequency. The average HD obtained for 29 FPGA chips is 49.97%. Therefore, there is no
correlation between the Patch and the response.
5.3.3 NIST Statistical Test for Randomness
The responses generated from a ROPUF need to be truly random to ensure a good
security level for the ROPUF. The NIST Statistical Test for Randomness can be used to
measure the randomness feature inside the response generated from a ROPUF. In this
research, we tested responses generated by using neighbor coding, and an 8-to-1 selection
technique for 29 FPGA chips.
A total of 240 ROs are used to generate the responses. Results obtained for the
NIST statistical test for randomness are shown in Table 5.3. The responses generated
after applying the RPM technique by using the Neighbor Coding selection passed the
entire test except for ‘runs’ test. The responses generated from the 8-to-1 selection passed
all the tests.
Table 5.3: NIST statistical test for randomness results.
                                            Success Percentage (%)
Statistical test          NIST input        Neighbor
                          parameters        Coding        8-to-1
Frequency                 -                 100           100
BlockFrequency            128               100           100
CumulativeSums            -                 100           100
Runs                      -                 Failed        100
LongestRun                -                 60            100
FFT                       -                 100           100
NonOverlappingTemplate    9                 80            80
OverlappingTemplate       9                 100           100
ApproximateEntropy        4                 100           100
Serial                    16                80            100
LinearComplexity          500               100           100
The results obtained are better than the regression-based distiller technique results [3].
No polynomial distiller order makes a meaningful improvement in the randomness
for the neighbor coding selection. For the 8-to-1 selection, the responses passed all of the
tests only when the 4th-order distiller is applied.
5.4 Summary
As a ROPUF utilizes the process variations to generate the secured response bits,
vulnerability still exists. The systematic variations dominated the overall process
variations; therefore it is important to nullify the systematic variation effect and increase
the true randomness on the response generation in a ROPUF. To address this issue, we
propose our RPM technique which gives better results in terms of the response
randomness generated from the ROPUF.
Chapter 6
Temperature, Voltage, and Aging Effects on ROPUFs
Function
6.1 Introduction
Silicon physical unclonable functions (PUFs) take advantage of the random
process variations inherent in silicon chips. The random process variations are unique for
each chip and cannot be modeled. These randomness and uniqueness characteristics of
silicon chips have been exploited by researchers in designing PUFs for hardware security.
As PUFs are highly reliant on process variations in the chip, it is desirable that they
should be resilient towards other temporal changes. The changes may occur due to
exposure to temporal variabilities which can be observed in the frequency of the Ring
Oscillator. The temporal variabilities can be divided into reversible and irreversible
variabilities. The reversible variability causes temporary changes to the circuit’s behavior
inside the silicon chip and can be caused by the environmental variations such as voltage
and temperature. However, overexposure to high voltage and temperature may lead to
irreversible variability [30]. Irreversible variability can also be caused by silicon chip
aging. There are three types of aging effects, the first one is the hot carrier injection or
HCI, the second is the trap charge in the dielectric due to bias temperature instability, and
the third is the oxide breakdown due to electrically active defects known as traps. These
traps occur within the dielectric. In this study, we simulate the first type of aging that is
caused by HCI. The HCI causes the electrical charges to build up within the dielectric
layer thereby increasing the threshold voltage needed to turn the transistor on. The
increased threshold voltage results in increased transistor switching time, thus slowing
the transistor speed [31].
In this chapter, we present accelerated aging experimental results along with
temperature and voltage effects done under normal environmental conditions on 9
Spartan 3E FPGAs. ROs having 3, 5, and 7-stages are mapped on Spartan 3E FPGAs.
Voltage and temperature variation experiments are performed separately (3 and 5-stage
ROs).
6.2 Background
We briefly describe work done in the past to study the effect of temperature,
voltage, and aging on ROs. Accelerated aging experiment for 5-stage ROs mapped on 90
nm FPGAs is presented in [32]. In this study, it is observed that aging causes ROPUF
responses to be unreliable. Simulated aging on ROs using HSPICE is presented in [33]. It
is observed that 4% of the ROPUF bits are prone to instability due to aging. Temperature
and voltage effects on ROs have been analyzed in [11]. This study concludes that
ROPUF reliability reduces due to voltage and temperature variations. Whereas prior work
is focused on studying the effect of temporal changes for fixed stage ROPUFs, we
analyze the effect of temporal changes on ROPUFs having different stages.
6.2.1 Ring Oscillator PUF response
RO frequency is typically generated from a series of inverters comprising the RO
loop. The presence of process variation inside the chip causes uneven delays across the
chip. Hence a pair of ROs mapped at two different chip locations produces two different
frequencies: fa and fb. Frequencies fa and fb are compared to see which one has the
higher frequency. If fa is greater than fb, a response bit 1 is generated; otherwise the
response is 0.
6.2.2 Number of Stages in Ring Oscillator
In this experiment, ROs having three different stages are used. The 5-stage RO
consists of one NAND gate and 4 inverter gates as shown in Figure 4-1. The NAND gate
is used to control the on and off switching of the RO. The RO is activated (starts to
produce an oscillation) when the input is set to high. The 3-stage RO consists of one
NAND gate, one buffer, and two inverter gates. The buffer is used instead of the inverter
to obtain an odd number of inversions. The buffer gate is added to increase the total delay
in the RO in order to reduce the RO frequency. The 7-stage RO consists of one NAND
gate and 6 inverters.
6.3 Experimental Setup
The experimental circuitry is shown in Figure 4-5. The challenge generator is used
to produce the inputs to the MUX which activates one RO at a time. ROs are activated,
one at a time, from the top to the bottom of each column of the FPGA. Each RO is
activated for 0.4 ms. There is a 0.1 ms delay before the next RO is activated; this is to
reduce the noise in the form of heat that can be generated from the adjacent CLB [20]. A
0.2 ms delay gap is given between the RO and the counter activation for the signal to be
stabilized before the measurement starts. The timing controller controls all time intervals
involved, such as the time interval for the RO activation and the time interval for the
counter to measure each RO.
The frequency is computed using x × (50/y), where x is the cycle counts from each
RO and y is the cycle counts for the 50 MHz reference clock. The preset value for y is set
to 10000 cycles implying that the RO cycles are measured within a 0.2 ms period. The
accuracy of the measurement is 0.005 MHz/cycle which is good enough to measure the
differences between frequencies generated from the ROs.
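The frequency computation above can be written out as a short Python sketch. The 50 MHz reference clock and the preset of 10000 reference cycles come from the text; the function name and the sample RO count of 40000 are hypothetical values used only for illustration.

```python
REFERENCE_CLOCK_MHZ = 50.0   # reference clock stated in the text
REFERENCE_CYCLES = 10_000    # preset value of y stated in the text


def ro_frequency_mhz(ro_cycles, ref_cycles=REFERENCE_CYCLES):
    """Frequency in MHz from x RO cycles counted over y reference cycles: x * (50 / y)."""
    return ro_cycles * REFERENCE_CLOCK_MHZ / ref_cycles


# 10000 cycles of a 50 MHz clock last 200 microseconds, i.e. 0.2 ms.
measurement_window_ms = REFERENCE_CYCLES / REFERENCE_CLOCK_MHZ / 1000.0

# One extra RO count changes the result by 0.005 MHz.
resolution_mhz = ro_frequency_mhz(1)

# Hypothetical counter reading, for illustration only.
example_mhz = ro_frequency_mhz(40_000)
print(measurement_window_ms, resolution_mhz, example_mhz)
```

The resolution follows directly from the preset: a larger value of y would give a finer frequency step at the cost of a longer measurement window.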
For the accelerated aging experiment on Spartan 3E, each RO is activated every
64 ms. Each activation turns on the RO for a time period of 0.4 ms. Thus, each RO is
activated 1.3 million times a day. This aging experiment is conducted for 30 days. The
number of ROs mapped on Spartan 3E is 120. ROs are numbered according to the
location they are mapped on the Spartan 3E as shown in Figure 6-1. Responses are
generated by using a neighbor chain scheme where RO(n) is compared with RO(n+1). In
total, there are 119 response bits generated from 120 ROs.
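The neighbor chain scheme and the activation count can be illustrated with a short Python sketch. The function name and the randomly generated frequencies are assumptions made for the example; only the comparison rule, the 120 ROs, and the 64 ms activation interval come from the text.

```python
import random


def neighbor_chain_response(frequencies):
    """Bit n is 1 if RO(n) is faster than RO(n+1), otherwise 0."""
    return [1 if f_a > f_b else 0 for f_a, f_b in zip(frequencies, frequencies[1:])]


# Hypothetical frequencies in MHz for 120 ROs (illustrative values only).
rng = random.Random(0)
frequencies = [rng.uniform(190.0, 210.0) for _ in range(120)]
response = neighbor_chain_response(frequencies)

# One activation every 64 ms over a 24-hour day.
activations_per_day = (24 * 60 * 60 * 1000) // 64
print(len(response), activations_per_day)  # 119 1350000
```

A chain of n ROs always yields n - 1 bits, and the daily count of 1,350,000 is consistent with the roughly 1.3 million activations quoted above.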
Figure 6-1: ROs numbering system based on spatial location.
6.4 Results and Analysis
Table 6.1 shows the total number of bit flip occurrences for Spartan 3E FPGAs
for the 30 day aging period. Responses from all FPGAs are recorded once every day.
Thus 30 responses are recorded for 30 days from each FPGA. These responses are
compared to the responses generated at normal setting to measure the bit flip occurrences.
The total bit flip occurrences for 3, 5, and 7 stage ROs are found to be 192, 250 and 267,
respectively (three FPGAs are used for each RO stage). FPGA 3 has the lowest number
of bit flip occurrences of 18. This is because many of the RO comparison pairs in FPGA
3 have a high frequency difference. On the other hand, FPGA 9 has the highest number of
bit flip occurrences since many of the RO comparison pairs have a low frequency
difference. Figure 6-2 (a) shows the example of the bit flip occurrence when the
difference between RO comparison pair is small (below 1 MHz) [6]. The frequency
generated from the blue RO tends to reduce faster when compared to the green RO’s
frequency when the temperature is increased, therefore bit flip occurs. Figure 6-2 (b)
shows how the bit flip occurrence can be prevented by selecting an RO comparison pair
that has higher frequency difference. It is important to note that most of the bit flips occur
at the same bit locations which have lower frequency difference in the RO pairs.
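The link between a small frequency difference and a bit flip can be illustrated with a small Python sketch. The function name and all frequency values are hypothetical; the sketch only shows the mechanism described above, in which a closely spaced pair changes order under stress while a widely spaced pair does not.

```python
def flipped_pairs(reference_freqs, stressed_freqs):
    """Return (pair index, reference gap in MHz) for each neighbor pair whose bit flipped."""
    flips = []
    pairs = zip(reference_freqs, reference_freqs[1:], stressed_freqs, stressed_freqs[1:])
    for n, (ref_a, ref_b, str_a, str_b) in enumerate(pairs, start=1):
        # The response bit is (f_a > f_b); a flip means the comparison changed.
        if (ref_a > ref_b) != (str_a > str_b):
            flips.append((n, abs(ref_a - ref_b)))
    return flips


# Hypothetical frequencies in MHz: pair 1 is 0.4 MHz apart, pair 2 is 3.4 MHz apart.
reference = [200.0, 199.6, 203.0]
# Hypothetical readings after stress; the first RO slows down more than the second.
stressed = [199.2, 199.3, 202.1]

flips = flipped_pairs(reference, stressed)
print(flips)
```

Only the closely spaced first pair flips in this example, which is why selecting pairs above a frequency difference threshold suppresses such flips.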
Table 6.1: Bit flip occurrences on Spartan 3E.
              3-stage ROs        5-stage ROs        7-stage ROs
FPGA           1    2    3        4    5    6        7    8    9
Bit flips     88   86   18       75   78   97       82   82  103
Total              192                250                267
Figure 6-2: The relationship between the RO frequency distance and the probability of a
PUF output flip, panels (a) and (b).
Table 6.2 shows the bit flip occurrences for 3, 5, and 7-stage ROs with respect to
frequency differences in RO pairs. For 5 and 7-stage ROs, most of the bit flips occur
when the frequency difference in RO comparison pairs is lower than 0.4 MHz. Few bit
flips occur when the frequency difference lies between 0.3 and 0.7 MHz. For the 3-stage
ROs, maximum bit flips occur when the frequency difference is lower than 0.3 MHz.
There are some bit flips at the higher frequency range. These results suggest that the 3-
stage ROs are more susceptible to noise compared to 5 and 7-stage ROs. For the
maximum frequency difference (1.0-1.2 MHz), the number of bit flips in the 3-stage
ROs is 8 which still can be considered as low since 3-stage ROs have a standard
deviation of 1.9 MHz compared to 1.2 and 0.7 MHz for 5 and 7-stage ROs, respectively
[34]. High standard deviation in ROs implies that the range between the minimum and
maximum frequency used in the ROPUF is high. Therefore, many RO comparison pairs
that have high frequency differences are generated.
Figure 6-3 (a), (b), and (c) show the frequency changes due to aging effects for 10
different ROs for 3, 5, and 7-stages. It can be seen that the 3, 5, and 7-stage ROs have
minimal frequency fluctuations for the 30 day aging period. Some frequencies overlap
(e.g. RO1 and RO2, RO4 and RO9 in Figure 6-3 (b)). This illustrates how bit flips can
occur. It can also be seen from Figure 6-3 that there is no significant difference in
frequency fluctuations, as a result of aging, when different numbers of stages are used in
the ROPUF.
Table 6.2: Bit flip occurrences due to aging on Spartan 3E.
RO pairs' frequency             Bit flip occurrences
difference range (MHz)      3-stage   5-stage   7-stage
0-0.09                           46       109       182
0.1-0.19                         42       114        70
0.2-0.29                         46        14         7
0.3-0.39                         11        12         1
0.4-0.49                         12         0         0
0.5-0.59                          7         1         1
0.6-0.69                          5         0         2
0.7-0.79                         12         0         1
0.8-0.89                          2         0         0
0.9-0.99                          1         0         0
1.0-1.2                           8         0         0
Figure 6-3: Frequency changes in ROs due to the aging effect on Spartan 3E.
(a) 3-stage ROs, (b) 5-stage ROs, (c) 7-stage ROs.
Figure 6-4 (a) and (b) show the frequency changes with respect to the temperature
variations for 40 ROs that are mapped at different spatial locations on the same FPGA
chip for 3 and 5-stage ROs. Responses are generated at three different environment
temperatures, namely, room temperature, 45°C, and 70°C. Different temperatures are
generated using a temperature chamber, as shown in Figure 6-6. It is observed that the ROs
are sensitive to the temperature variations. As the environment temperature increases,
both 3 and 5-stage RO frequencies decrease uniformly. Similar patterns are observed for
all RO frequencies at each of the three different environment temperatures, which
suggests that the effect of the temperature variations is uniformly distributed on the
FPGA chip.
Figure 6-5 (a) and (b) show the frequency changes with respect to the voltage
variations for 40 ROs that are mapped on different locations (as shown in Figure 6-1) in
the same FPGA chip for 3 and 5-stage ROs. Responses are generated for three different
internal core supply voltages (VCCINT), namely, 1.2V (normal), 1.3V and 1.4V. It is
observed that both 3 and 5-stage RO frequencies are sensitive to the voltage variations.
The average frequency increment is 20 MHz for 1.3V and 50 MHz for 1.4V when
compared to the normal 1.2V VCCINT. Although the frequency has high increment with
respect to the higher VCCINT, the RO frequency for 3 and 5-stage ROs follow the same
pattern which implies the voltage variation effect is uniformly distributed throughout the
FPGA chip. The bit flips do occur due to temperature and voltage variations but only
when the frequency difference in the RO comparison pair is lower than 1.5 MHz.
Figure 6-4: Frequency changes with respect to the temperature variations on Spartan 3E.
(a) 3-stage ROs, (b) 5-stage ROs.
Figure 6-5: Frequency changes with respect to the voltage variations on Spartan 3E.
(a) 3-stage ROs, (b) 5-stage ROs.
Figure 6-6: Temperature chamber.
Table 6.3 shows the percentages of bit flip occurrences due to temperature,
voltage variations, and aging on 9 Spartan 3E FPGAs. For temperature variations, the
responses from ROPUF are generated at three different settings: room temperature, 45°C,
and 70°C. For voltage variations, the responses are generated using three different
internal core supply voltages: 1.2V (normal), 1.3V, and 1.4V.
Responses generated at different temperature and voltage settings are compared
with the responses generated at normal settings to measure the percentage of bit flip
occurrences. As the number of stages increases, the percentage bit flip occurrences also
increase (except at 70°C). Voltage variations seem to be causing the most bit flip
occurrences followed by temperature and aging. The maximum bit flip percentages for 3,
5, and 7-stage ROPUFs are 2.8%, 5.6%, and 8.4%, respectively. Based on our
experimental results, we conclude that bit flips occur only when the frequency difference
in the RO comparison pair is lower than 1.5 MHz.
Table 6.3: Percentage of bit flip occurrences.
                     Percentage of bit flip occurrences (%)
RO number        Temperature         Voltage
of stages        45°C     70°C       1.3V     1.4V       Aging
3                2.24     1.96       2.24     2.80       1.79
5                2.52     3.92       5.88     5.60       2.33
7                4.20     3.08       7.28     8.40       2.46
The results presented in this chapter suggest that the temporal variabilities can
affect the ROPUF functionality only if the frequency difference between RO comparison
pair is low. We propose that a high threshold be used to select the RO comparison pairs
in ROPUF to prevent the effect of temporal variabilities. Table 6.4 shows the number of
RO comparison pairs generated based on different frequency thresholds. The comparison
pairs are generated using a selection sort algorithm that has O(n^2) complexity [34].
A higher number of RO comparison pairs is also required for better security
[34]. The results in Table 6.4 suggest that the 3-stage RO has a better security feature as it
has the highest number of RO comparison pairs compared to the 5 and 7-stage ROs.
Table 6.4: Number of comparison pairs according to threshold frequency.
               Number of RO comparison pairs at threshold frequency (MHz)
RO stages           2       2.5         3       3.5
3                4671      4158      3682      3152
5                4043      3446      2860      2336
7                3128      2344      1701      1191
6.5 Summary
In this chapter, we study the effect of accelerated aging, voltage, and temperature
variations for different number of stages used in a ROPUF. Our experimental results show
that RO frequencies are sensitive to aging, voltage, and temperature regardless of the
number of RO stages used in a ROPUF. The percentage of bit flips is observed to increase
as the number of stages increases. Most bit flips occur when the frequency difference
between RO comparison pairs is low. We suggest that only RO comparison pairs that have
high frequency differences be used in a ROPUF in order to reduce the effect of temporal variabilities.
Our work shows that the 3-stage ROPUF has the lowest percentage of bit flip occurrences
and is more secure.
Chapter 7
A Comparative Study of Ring Oscillator PUFs on
Different FPGA Families
7.1 Introduction
ROPUF utilizes ring oscillators (ROs) to exploit the process variation inside a
silicon chip to generate a unique ID. A typical ROPUF comprises ring oscillators,
multiplexers (MUXs), counters, and a comparator. A ROPUF can generate a binary bit
stream (response) from a given input bit stream (challenge). A ROPUF can generate
multiple sets of responses from different sets of challenges. A challenge together with the
response it produces is known as a challenge-response pair (CRP).
Earlier studies have shown that ROPUF can be implemented on FPGAs [3][44].
ROPUFs are attractive because their circuits do not need to be symmetric, unlike other
types of PUFs, such as the Arbiter PUF (APUF) and Butterfly PUF (BPUF), which require a
stringently symmetric circuit. The only requirement in ROPUF circuit
is that the ROs need to be identical, and this can be achieved by creating a hard macro for
the RO and instantiating it as many times as needed. If two ROs are identical then the
difference in the frequencies generated is due to process variation.
FPGA security is a concern among the FPGA manufacturers. FPGAs are prone to
several security issues such as IP protection, cloning, side channel attack and tampering.
Xilinx, for example, has introduced Device DNA as the additional security feature in
some of its FPGA chips [45]. ROPUF can be used as an additional security feature in
FPGAs. Any tampering attempt by hackers will change the unique parameters of the
process variation [3]. ROPUF can be applied as a secret bit generator (which is known as
response in PUF applications) where it can generate n bits of response for authentication
purpose. Besides that, response bits generated from ROPUF can be applied as a
cryptography key to encode and decode secure information [6].
Despite the promise of the ROPUF, there are still challenges that must be
overcome for it to become a practical solution, among them improving the uniqueness
and the reliability of the response.
Uniqueness refers to the ability of identical ROPUF circuits to generate different
responses on different chips. Reliability refers to the generation of the same response
under various environmental conditions such as temperature and voltage. A ROPUF’s
reliability can also be affected by silicon aging.
Current FPGA families are fabricated using the latest silicon technology, which
provides smaller transistor sizes. Smaller transistors give the FPGA chip better
speed and power consumption, but their effect on the performance of a ROPUF
implemented on the FPGA still needs to be studied [6]. As the silicon
technology shrinks, the process variation parameters also change [13]. In this work,
we analyze ROPUF parameters on two Xilinx FPGA families that use different
silicon technologies: 28 nm (Artix-7) and 90 nm (Spartan 3E).
The work focuses on:
1) ROPUF comparison on two FPGA families that use different
silicon technologies: We compare ROPUF responses from two different
FPGA families in terms of five parameters: uniqueness, reliability,
uniformity, bit aliasing, and diverseness.
2) Temperature, voltage, and aging effects: For reliability, we compare the
responses generated at different temperature and voltage settings. We also
compare the responses generated through an accelerated aging
experiment.
7.2 Related Work
Some work has been done in the past to study ROPUF performance on FPGAs.
A large-scale characterization of ROPUFs on Spartan 3E (90 nm silicon technology)
FPGAs is reported in [11]. The authors show that the average inter-die Hamming
distance (HD) for the ROPUF is 47.31% and the average intra-die HD is 0.86% at normal
operating conditions. The Hamming weight (HW) of the ROPUF responses is shown to lie
between 46% and 56%. In [6], an implementation of the ROPUF on Virtex 4 (90 nm silicon
technology) FPGAs is presented, and the inter-chip HD is shown to be 46.15%. An
accelerated aging experiment on 5-stage ROs mapped on Spartan 3E FPGAs is presented
in [46], where it is observed that aging makes ROPUF responses unreliable. Simulated
aging of ROs using HSPICE is shown in [47], where it is observed that 4% of the ROPUF
bits are prone to instability due to aging. An experiment on temperature and voltage
effects on ROs is presented in [11], which shows that ROPUF reliability is reduced by
voltage and temperature variations.
7.3 Background
7.3.1 Ring Oscillator PUF response
RO frequency is typically generated from a series of inverters comprising the RO
loop. The presence of process variation inside the chip causes uneven delays across the
chip. Hence a pair of ROs mapped at two different chip locations produces two different
frequencies: fa and fb. Frequencies fa and fb are compared to see which one has the
higher frequency. If fa is greater than fb, a response bit 1 is generated; otherwise the
response is 0.
7.3.2 Number of Stages in Ring Oscillator
In this experiment, 5-stage ROs are used. The 5-stage RO consists of one NAND
gate and 4 inverter gates as shown in Figure 4-1. The NAND gate is used to control the
on and off switching of the RO. The RO is activated (starts to produce an oscillation)
when the input is set to high.
7.3.3 ROPUF parameters
For PUF implementations, different researchers have used different parameters in
the past [11][34][48]. In this work, we use five of the most common parameters. These
parameters are uniqueness, reliability, uniformity, bit-aliasing and diverseness. The
uniqueness can be measured by comparing the Hamming Distance (HD) between
responses from different FPGA chips in the same family. The equation used to measure
the uniqueness is shown in Equation 7-1:
\[
\text{Uniqueness} = \frac{2}{m(m-1)} \sum_{u=1}^{m-1} \sum_{v=u+1}^{m} \frac{HD(R_u, R_v)}{n} \times 100\% \tag{7-1}
\]
where m is the number of chips used, u and v are the two chips being compared,
and n is the number of response bits generated. Ru and Rv are the responses to the same
challenge C from chips u and v, and HD is the Hamming distance between them. A higher
percentage means that the responses differ in more bit positions, but for a large
number of response bits the ideal uniqueness is around 50%. This means that, on
average, half of the response bits generated by chips u and v differ from each other
when both chips are given the same challenge.
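Equation 7-1 can be sketched in a few lines of Python. This is a minimal illustration, not the tooling used in this work: the function names and the three 4-bit responses are hypothetical, and responses are assumed to be equal-length lists of 0/1 values.

```python
from itertools import combinations


def hamming_distance(a, b):
    """Number of positions at which two equal-length bit sequences differ."""
    if len(a) != len(b):
        raise ValueError("responses must have the same length")
    return sum(x != y for x, y in zip(a, b))


def uniqueness(responses):
    """Average pairwise inter-chip Hamming distance in percent (Equation 7-1).

    responses: one n-bit response per chip, all produced by the same challenge.
    """
    m = len(responses)
    if m < 2:
        raise ValueError("need at least two chips")
    n = len(responses[0])
    # combinations() visits every pair (u, v) with u < v exactly once.
    total = sum(hamming_distance(ru, rv) / n
                for ru, rv in combinations(responses, 2))
    return 2.0 / (m * (m - 1)) * total * 100.0


# Made-up 4-bit responses from three chips, for illustration only.
chips = [[0, 1, 1, 0], [1, 1, 0, 0], [0, 0, 1, 1]]
print(f"uniqueness = {uniqueness(chips):.2f}%")
```

With only three short responses the value is far from 50%; the ideal value is approached only with many chips and long responses.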
The reliability can be measured by comparing the responses from the same FPGA
chip generated under different environmental conditions such as temperature and
voltage. The equations used to measure the reliability are shown in Equations 7-2 and
7-3. Rs is the response from a chip at the normal operating condition (room temperature
and normal operating voltage). R′s,t is the t-th sample of the response from the same
chip at a different operating condition, such as a different temperature or voltage
setting, and k is the number of such samples. The ideal reliability value is 100%. As
can be seen in Equation 7-3, if HD Intra is low or zero, then the reliability
approaches 100%.
\[
\text{HD Intra} = \frac{1}{k} \sum_{t=1}^{k} \frac{HD(R_s, R'_{s,t})}{n} \times 100\% \tag{7-2}
\]
\[
\text{Reliability} = 100\% - \text{HD Intra} \tag{7-3}
\]
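As a hedged illustration of Equations 7-2 and 7-3, the following Python sketch computes the reliability of one chip from a reference response and k re-measured samples. The function names and the 8-bit data are hypothetical.

```python
def hamming_distance(a, b):
    """Number of positions at which two equal-length bit sequences differ."""
    if len(a) != len(b):
        raise ValueError("responses must have the same length")
    return sum(x != y for x, y in zip(a, b))


def reliability(reference, samples):
    """Reliability in percent (Equations 7-2 and 7-3).

    reference: n-bit response at the normal operating condition (R_s).
    samples:   k responses from the same chip at other conditions (R'_{s,t}).
    """
    n = len(reference)
    k = len(samples)
    hd_intra = sum(hamming_distance(reference, s) / n for s in samples) / k * 100.0
    return 100.0 - hd_intra


# Made-up data: the second sample has one flipped bit out of eight.
reference = [1, 0, 1, 1, 0, 0, 1, 0]
samples = [
    [1, 0, 1, 1, 0, 0, 1, 0],
    [1, 0, 1, 0, 0, 0, 1, 0],
]
print(f"reliability = {reliability(reference, samples):.2f}%")
```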
The uniformity and bit-aliasing parameters are measured using Hamming
weights (HWs), as shown in Equations 7-4 and 7-5, where rs,l is the l-th binary bit of
the n-bit response from chip s and m is the number of chips. The HW of the response
from a single FPGA chip gives the uniformity, and the HW measured across the same bit
location in the responses from different FPGA chips gives the bit-aliasing. A good
value for uniformity and bit aliasing is around 50%, which means that the responses
are well distributed between ‘0’s and ‘1’s.
\[
\text{Uniformity} = \frac{1}{n} \sum_{l=1}^{n} r_{s,l} \times 100\% \tag{7-4}
\]
\[
\text{Bit-aliasing} = \frac{1}{m} \sum_{s=1}^{m} r_{s,l} \times 100\% \tag{7-5}
\]
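The two Hamming-weight measures differ only in the direction of the sum: uniformity sums along one chip's response, bit-aliasing sums one bit position across chips. A minimal Python sketch, with hypothetical function names and made-up 4-bit responses from four chips:

```python
def uniformity(response):
    """Percentage of 1s in a single chip's n-bit response (Equation 7-4)."""
    return sum(response) / len(response) * 100.0


def bit_aliasing(responses):
    """Percentage of 1s at each bit position across m chips (Equation 7-5).

    Returns one value per bit position l.
    """
    m = len(responses)
    # zip(*responses) yields, for each bit position, that bit from every chip.
    return [sum(column) / m * 100.0 for column in zip(*responses)]


# Made-up responses, one row per chip.
chips = [
    [0, 1, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 1, 0, 1],
]
print([uniformity(r) for r in chips])
print(bit_aliasing(chips))
```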
The diverseness of the frequency is measured using the standard deviation, as
shown in Equations 7-6, 7-7 and 7-8. Diverseness represents the spread of the
frequencies generated by the ROs. A diverseness value close to 0 shows that the RO
frequencies lie very close to their mean frequency. A high diverseness shows that the
frequencies are spread out over a wider range of values. The advantages of higher
diverseness are discussed in detail in [34]. In Equation 7-6, h is the number of ROs
and fi,j is the frequency of the j-th RO in the i-th chip, obtained in Equation 7-7 by
averaging the Q frequency samples of that RO, where fi,j,q is the q-th sample. favg is
the average frequency of the ROs on an FPGA chip (Equation 7-8).
\[
\text{Diverseness} = \sqrt{\frac{1}{h-1} \sum_{j=1}^{h} \left(f_{i,j} - f_{avg}\right)^2} \tag{7-6}
\]
\[
f_{i,j} = \frac{1}{Q} \sum_{q=1}^{Q} f_{i,j,q} \tag{7-7}
\]
\[
f_{avg} = \frac{1}{h} \sum_{j=1}^{h} f_{i,j} \tag{7-8}
\]
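Equations 7-6 to 7-8 amount to the sample standard deviation of the per-RO mean frequencies on one chip. The sketch below assumes the frequency samples are available as nested lists. The function name and the numbers are hypothetical, and Python's `statistics.stdev` is used because it divides by h-1, as Equation 7-6 does.

```python
from statistics import mean, stdev


def diverseness(samples_per_ro):
    """Sample standard deviation of the per-RO mean frequencies on one chip.

    samples_per_ro: one list of frequency samples per RO on the chip.
    """
    # Equation 7-7: average the samples of each RO.
    ro_means = [mean(samples) for samples in samples_per_ro]
    # Equations 7-6 and 7-8: stdev() subtracts the mean of ro_means and
    # divides by (h - 1) before taking the square root.
    return stdev(ro_means)


# Made-up frequency samples (MHz) for three ROs, two samples each.
chip = [[100.0, 102.0], [104.0, 106.0], [98.0, 100.0]]
print(f"diverseness = {diverseness(chip):.3f}")
```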
7.4 Experimental Setup
In this work, ROPUF performance on 29 Xilinx Spartan 3E and 20 Xilinx Artix-7
FPGA chips is analyzed. Test circuitry that runs completely on the FPGA chip has been
developed. The RO frequencies are recorded using an Agilent 16801A logic analyzer. The
architecture of the design is shown in Figure 4-5. The challenge generator produces
the inputs to the MUX, which activates one RO at a time. ROs are activated, one
at a time, from the top to the bottom of each column of the FPGA. Each RO is activated
for 0.4 ms. There is a 0.1 ms delay before the next RO is activated; this reduces the
noise, in the form of heat, that can be generated by the adjacent CLB [20]. A 0.2 ms
gap is left between RO activation and counter activation so that the signal stabilizes
before the measurement starts. The timing controller controls all time intervals involved,
such as the time interval for the RO activation and the time interval for the counter to
measure each RO.
The frequency is computed using Equation 7-9, where x is the cycle count from
each RO and y is the cycle count of the 50 MHz reference clock. The preset value for y
is 10000 cycles, so the RO cycles are counted within a 0.2 ms window.
The resulting measurement resolution is 0.005 MHz per counted cycle, which is fine
enough to measure the differences between frequencies generated from ROs.
\[
\text{frequency} = \frac{x \times 50}{y} \tag{7-9}
\]
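The numbers quoted for the measurement window and resolution follow directly from Equation 7-9. A small Python sketch, with hypothetical function names, that reproduces them:

```python
REFERENCE_CLOCK_MHZ = 50.0   # reference clock stated in the text
DEFAULT_REF_CYCLES = 10_000  # preset value of y stated in the text


def ro_frequency_mhz(ro_cycles, ref_cycles=DEFAULT_REF_CYCLES):
    """Equation 7-9: RO frequency in MHz from the two cycle counts x and y."""
    return ro_cycles * REFERENCE_CLOCK_MHZ / ref_cycles


def measurement_window_ms(ref_cycles=DEFAULT_REF_CYCLES):
    """Time taken by y reference-clock cycles, in milliseconds."""
    return ref_cycles / (REFERENCE_CLOCK_MHZ * 1e6) * 1e3


def resolution_mhz(ref_cycles=DEFAULT_REF_CYCLES):
    """Frequency change corresponding to one additional counted RO cycle."""
    return REFERENCE_CLOCK_MHZ / ref_cycles


# An RO that completes 40000 cycles in the window (made-up count).
print(ro_frequency_mhz(40_000))
print(measurement_window_ms())
print(resolution_mhz())
```

A larger y would give a finer resolution at the cost of a longer measurement window for each RO.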
For the accelerated aging experiment on Spartan 3E and Artix-7 FPGAs, each RO
is activated every 64 ms and 107.5 ms respectively. Each activation turns the RO on for a
time period of 0.4 ms. Therefore, each RO is activated 1.3 million times a day for Spartan
3E and 0.8 million times a day for Artix-7. This aging experiment is conducted for 30
days. The number of ROs mapped on Spartan 3E and Artix-7 is 120 and 171 respectively.
ROs are numbered according to the location at which they are mapped, as shown in Figure 6-1.
Responses are generated by using a chain-like neighbor coding where RO(n) is compared
with RO(n+1). In total, there are 119 response bits generated from 120 ROs for Spartan
3E and 170 response bits are generated from 171 ROs for Artix-7.
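The daily activation counts quoted above follow from the activation periods, since one day is 86,400 s. A quick check of the arithmetic, rounded:

\[
\frac{86\,400\ \text{s}}{64\ \text{ms}} = 1.35 \times 10^{6}, \qquad
\frac{86\,400\ \text{s}}{107.5\ \text{ms}} \approx 8.04 \times 10^{5}
\]

These values are consistent with the roughly 1.3 million (Spartan 3E) and 0.8 million (Artix-7) activations per day stated in the text.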
7.5 Results and Analysis
ROs are mapped on all the CLBs available on the Spartan 3E FPGAs, and on half
of the CLBs available on Artix-7 to record the frequencies. Responses are generated from
the frequencies recorded. The chain-like neighbor coding technique is used to select the RO
comparison pairs [3]. Equation 7-10 is used to generate the response. Table 7.1 shows the
uniqueness, reliability, uniformity, bit aliasing and diverseness results for both FPGA
families used in this experiment.
\[
\text{Response bit} =
\begin{cases}
1 & \text{if } f_a > f_b \\
0 & \text{otherwise}
\end{cases} \tag{7-10}
\]
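Chain-like neighbor coding combined with Equation 7-10 can be sketched as follows. The function name and the frequency values are hypothetical. The list is assumed to be ordered by RO number, so that RO(n) is compared with RO(n+1).

```python
def chain_neighbor_response(frequencies):
    """Compare each RO with its next neighbor and emit one bit per pair.

    frequencies: RO frequencies ordered by RO number.
    Returns len(frequencies) - 1 response bits (Equation 7-10).
    """
    return [1 if fa > fb else 0
            for fa, fb in zip(frequencies, frequencies[1:])]


# Made-up frequencies (MHz) for five ROs; equal frequencies give a 0 bit.
freqs = [201.3, 199.8, 200.4, 200.4, 198.9]
print(chain_neighbor_response(freqs))
```

With this coding, N ROs always yield N-1 response bits, which matches the 119 bits from 120 ROs and 170 bits from 171 ROs reported for the aging experiment.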
Table 7.1: ROPUF’s parameters comparison.
Parameter          Spartan 3E (90 nm)   Artix-7 (28 nm)
Uniqueness (%)     39.79                45.15
Uniformity (%)     51.25                50.17
Bit aliasing (%)   50.54                50.17
Reliability (%)    96.34                97.28
Diverseness        2.09                 3.88
7.5.1 ROPUF Uniqueness
ROPUF responses on Artix-7 have a higher uniqueness percentage (45.16%)
than those on Spartan 3E (39.79%). Each response from Artix-7 and Spartan 3E contains
780 bits and 239 bits, respectively. The Artix-7 used in this experiment has 101,440
logic cells, whereas the Spartan 3E has 2160 cells, so the Spartan 3E has limited
resources by comparison. The maximum number of ROs that can be mapped on the
Spartan 3E is 240. The Artix-7 uniqueness is closer to the ideal uniqueness value
of 50%, while the Spartan 3E uniqueness is further from it.
Figure 7-1 and Figure 7-2 show planar views of RO frequency versus RO location
for Spartan 3E and Artix-7. These graphs are plotted to better understand the
difference in ROPUF response uniqueness between the two FPGA families. Figure 7-1
shows RO frequencies from three Spartan 3E FPGAs. The dark blue blocks are the
inaccessible areas on the FPGA. It can be observed that ROs with high
frequency are mostly distributed within the red circle. The same observation holds for
all 29 Spartan 3E FPGAs, where most of the ROs with high frequencies are located in the
middle of the FPGA. Figure 7-2 shows RO frequencies for three Artix-7 FPGAs. In
Figure 7-2 (a) and (b), most of the ROs with high frequency are located in the top
part of the FPGA. However, this pattern does not hold across all 20 Artix-7 FPGAs.
Figure 7-1: RO frequencies versus location on Spartan 3E: (a) FPGA 1, (b) FPGA 2, (c) FPGA 3.
Figure 7-2: RO frequencies versus location on Artix-7: (a) FPGA 1, (b) FPGA 2, (c) FPGA 3.
The concentration of high frequencies at the same location across the Spartan 3E
FPGAs used in this experiment demonstrates the effect of systematic variation. There
are two types of process variation: systematic and stochastic [13]. Systematic
variation is usually caused by mask, lithographic, and reticle stepper errors, and
is highly correlated across all ICs that are manufactured on the same
line. Stochastic variation is caused by vibrations during lithography, wafer
unevenness, and non-uniformity in resist thickness. It is more random in character
and differs for each chip [13]. The effect of stochastic variation appears as a
random distribution of high and low RO frequencies. In contrast, the effect of
systematic variation appears as a recurring pattern of high or low RO frequencies
across a group of FPGAs. Systematic variation decreases ROPUF response uniqueness,
because response bits generated from ROs located in an area that is affected by
systematic variation tend to be the same [49].
7.5.2 ROPUF Uniformity
As far as uniformity is concerned, the Spartan 3E and Artix-7 chips both have
good uniformity percentages (51.25% and 50.17%, respectively), which represent a good
balance between ‘1’ and ‘0’ bits in the responses. The uniformity result shows that the
ROPUF response remains random within the FPGA chip regardless of the
silicon technology used. This result also shows that the systematic
variation observed in Spartan 3E FPGAs does not affect the ROPUF’s
uniformity.
7.5.3 ROPUF Bit Aliasing
The bit aliasing percentages for Spartan 3E (50.54%) and Artix-7 (50.17%) chips
are close to the ideal value of 50%. These results indicate a balance between
‘1’ and ‘0’ bits at the same bit location in the responses from different
FPGAs. The ROPUF responses are observed to remain random across the
FPGAs in the same family regardless of the silicon technology used.
7.5.4 ROPUF Reliability
The ROPUF responses are generated at different temperature and voltage
settings to measure the reliability. The temperature variation experiment is done
using a temperature chamber. Four different temperature settings are used: 0°C, 20°C,
45°C, and 70°C. For voltage variations, three different internal core supply voltages
(VCCINT) are used for Spartan 3E: 1.2V (normal), 1.3V, and 1.4V. For Artix-7, two
different VCCINT values are used: 1.0V (normal) and 1.2V. The responses generated at
the different temperature and voltage settings are compared with the responses
generated at room temperature. The 30-day accelerated aging experiment is also carried
out to extend the ROPUF reliability study to FPGAs fabricated using different silicon
technologies.
The average ROPUF reliability for Spartan 3E and Artix-7 is 96.34% and
97.28%, respectively. Table 7.2 shows the individual reliability under
temperature, voltage, and aging effects. It can be seen that Artix-7 reliability
is higher than Spartan 3E reliability for temperature, voltage, and aging. For both
Spartan 3E and Artix-7, the lowest reliability is caused by voltage variation:
94.26% and 94.82%, respectively. Even so, the ROPUF’s reliability under temperature
and voltage effects is fairly high for both Spartan 3E and Artix-7.
Table 7.2: ROPUF’s reliability due to change in temperature, voltage, and aging.
Effect         Spartan 3E reliability (%)   Artix-7 reliability (%)
Temperature    97.10                        98.16
Voltage        94.26                        94.82
Aging          97.67                        98.86
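As a consistency check, and assuming that the average reliability in Table 7.1 is the unweighted mean of the three per-effect values in Table 7.2 (the text does not state how the average was formed), the two tables agree:

\[
\frac{97.10 + 94.26 + 97.67}{3} = \frac{289.03}{3} \approx 96.34\%, \qquad
\frac{98.16 + 94.82 + 98.86}{3} = \frac{291.84}{3} = 97.28\%
\]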
Figures 7-3 and 7-4 show the RO frequencies with respect to the temperature and
voltage variations for Spartan 3E and Artix-7, respectively. Figure 7-3 (a) and Figure 7-4
(a) show that the RO frequencies on Spartan 3E and Artix-7 decrease as the
ambient temperature increases. The frequency of each RO decreases
uniformly with temperature, which suggests
that temperature changes affect the RO frequency uniformly regardless of its location.
The only noticeable difference between Spartan 3E and Artix-7 in the response to
temperature is the size of the frequency decrease.
On average, the RO frequency on Spartan 3E drops by 2 to 3 MHz at 45°C and by 5
to 6 MHz at 70°C. For Artix-7, the RO frequency drops by 0.5 to 1 MHz at 45°C and by
1 to 1.5 MHz at 70°C.
Figure 7-3 (b) and Figure 7-4 (b) show the RO frequency changes due to
voltage variation for Spartan 3E and Artix-7, respectively. The RO frequency
increases uniformly as VCCINT is increased, which suggests that
voltage changes affect the RO frequency uniformly regardless of the RO’s location.
These figures also show that the RO frequency is sensitive to VCCINT
changes. The RO frequency on Spartan 3E increases by 20 MHz at 1.3V and by 40 MHz at
1.4V. The RO frequency on Artix-7 increases by 90 MHz at 1.2V. The RO frequency
change due to voltage variation is significantly larger than the change due to
temperature variation, which is consistent with voltage producing the lowest ROPUF
reliability for both Spartan 3E and Artix-7.
Figure 7-3: Spartan 3E: (a) RO frequency changes with respect to temperature variations, (b) RO frequency changes with respect to voltage variations.
Figure 7-4: Artix-7: (a) RO frequency changes with respect to temperature variations, (b) RO frequency changes with respect to voltage variations.
Figure 7-5 shows the aging effect on 10 RO frequencies on Spartan 3E and Artix-7.
For Spartan 3E, the frequencies of the 10 ROs show only normal fluctuations;
no increasing or decreasing frequency pattern is observed. For Artix-7, however, the
frequencies of the 10 ROs show a similar decreasing pattern. On average, the
RO frequencies on Artix-7 are reduced by 0.5 MHz at the end of the aging experiment.
This observation suggests that aging affects the frequency of the ROs uniformly
regardless of their spatial location.
Figure 7-5: RO frequency changes with respect to aging: (a) Spartan 3E, (b) Artix-7.
7.5.5 ROPUF Diverseness
The Artix-7 has the higher ROPUF diverseness, 3.89, which represents a wide
gap between the maximum and minimum frequencies. High diverseness is a good feature
for a ROPUF as it increases the number of CRPs that can be generated [34]. The Spartan
3E diverseness is lower, at 2.09. The diverseness is observed to increase
when a more advanced silicon technology is used. This is due to the
reduced transistor size, which increases the RO frequency, so that slight changes in
the process variation are amplified by the higher RO frequency.
7.6 Summary
In this work, we have implemented ROPUFs on 20 Artix-7 FPGA chips and 29
Spartan 3E chips, which span silicon technologies from 28 nm to 90 nm. We have recorded
and analyzed thousands of RO frequencies from each chip. We conclude that only the
diverseness parameter changes with respect to the silicon technology used, and that
the uniqueness parameter improves as the FPGA chip density increases.
Chapter 8
ROPUF Application: Hardware-Oriented Security-
Based Authentication for Advanced Metering
Infrastructure
8.1 Introduction
A smart grid is often described as the merging of a traditional power grid with
advanced communication technology to increase the power network’s delivery efficiency.
The smart grid supports a two-way flow of electricity and information. Two-way
communication in the smart grid allows many parties in the network to exchange
information. For example, the power provider can receive information on the customer’s
power usage, and the customer can receive recent pricing information from the power
provider. These information exchanges help the power provider estimate and control
power generation more efficiently, and the customer can use the pricing information
to optimize electricity usage. Other benefits of smart grid implementations are
described in [35].
Though the smart grid has been deployed in some places in the world, there is
growing concern about its security. The smart grid’s security objectives can be grouped
into three categories: availability, integrity, and confidentiality [35]. Availability
ensures timely and reliable access to and use of information in all components of the
smart grid. Integrity guards the information against modification or destruction to
ensure information nonrepudiation and authenticity. Confidentiality preserves the
authorized restrictions on information access in order to protect personal privacy and
proprietary information. Authentication is one of the important security features that
contribute to the integrity and confidentiality security objectives.
In a smart grid, the authentication scheme has to be different from the one used by
internet technology [35]. This is because the threats that exist in the smart grid
network differ from those that exist in internet technology. The possible threats to a
smart grid network include attacks targeting data integrity and operation disruption
[35][36]. These types of attacks can be managed by having secure authentication. There
are some basic requirements for a smart grid authentication protocol, such as high
efficiency and tolerance to faults and attacks [35]. We discuss our design based on
these requirements later in Section 8.4.
We use a Physical Unclonable Function (PUF) as the authentication scheme for
the Advanced Metering Infrastructure (AMI) in the smart grid. There are many types of
PUFs; this work uses a silicon PUF, specifically a PUF on an FPGA. During fabrication
of a silicon chip, minor irregularities occur. These irregularities cause slight
differences in the electrical delay in the silicon chip. The differences are not
noticeable in the functionality of the chip, but a PUF exploits them to generate a
number of binary IDs that are unique for each chip. A delay-based PUF uses ring
oscillators (ROs) to extract the minor irregularities and make them visible as
different frequencies (discussed in Section 8.3). A Ring Oscillator Physical Unclonable
Function (ROPUF) is highly secure: it cannot be modeled, since the minor irregularities
that occur during the fabrication process are random for each chip, and no confidential
information is stored on its circuit.
In this work, we discuss our proposed hardware-oriented security-based
authentication for AMI using ROPUFs on FPGAs. Our contributions are as follows:
1. We introduce an authentication scheme using ROPUF. The authentication scheme
is limited to the network between the utility company and smart meter which we
refer to as AMI. The intention is to set a design boundary so that the proposed
scheme is developed to meet the authentication requirements within that boundary.
2. The proposed authentication scheme focuses on the current enhancement of the
existing AMI. No major changes in the protocol are needed to implement the
scheme. Our scheme can be combined with the existing protocol.
3. We have shown that our ROPUF is tolerant to modeling attacks. A linear support
vector machine (SVM) is used to test the ROPUF, and the results show that the
SVM fails to model the ROPUF.
4. We have also shown that the proposed ROPUF meets the efficiency and tolerance
requirements through experiments conducted for the proof of concept.
8.2 Related Work
In this chapter, we limit the scope of authentication to communication in AMI.
Many authentication schemes have been proposed. We divide the proposed
schemes into two groups. The first group comprises schemes proposed as
algorithms on existing resources. In [37], a lightweight message authentication
scheme that uses a shared session key established through the Diffie-Hellman exchange
protocol is presented. In [38], an authentication scheme based on a Merkle hash tree,
which is constructed from a one-way cryptographic hash function, is described. In [39],
smart grid key management (SGKM) based on enhanced identity-based cryptography (EIBC)
is suggested. These schemes are based on non-volatile
memory technologies that are vulnerable to invasive/spoofing attacks.
The second group of authentication schemes is based on hardware-oriented
security. In [40], interoperable device identification in a smart grid based on a
trusted platform module (TPM) is proposed. This technology is defined by the Trusted
Computing Group as a means of implementing consistently behaving computer systems.
Trusted computing technology provides methods for reliably checking a system’s
integrity and identifying anomalous or unwanted characteristics. In [41], an
authentication and key management scheme for advanced metering infrastructures using a
PUF is proposed.
In this chapter, our proposed scheme for authentication in AMI fits in the second
group, which uses a PUF for authentication. The advantages offered by our scheme
compared to [40] and [41] are discussed in Section 8.5. We list the requirements of the
smart grid security environment [36] and assess each authentication scheme in the
second group against these requirements in Table 8.1.
Table 8.1: Comparison of different schemes based on Smart Grid requirements
Columns: Smart Grid Security Applications Requirements; Trusted Computing Technology [40]; PUF Scheme [41]

1) High performance in terms of latency and jitter in message exchange (efficiency).
   Trusted Computing Technology [40]: Message length is not mentioned. Authentication
   process takes 11 steps to complete.
   PUF Scheme [41]: Message length is not mentioned. Average authentication process
   takes 3 steps to complete.

2) Timeliness: computation and communications subsystems must meet real-time
   requirements of applications (efficiency).
   Trusted Computing Technology [40]: The scheme takes 982.91 ms to complete the
   authentication process.
   PUF Scheme [41]: The exact information is not stated. The only information
   mentioned about time is 2.4 ms for PUF execution and 0.2 ms (average) for the SHA-1
   on 32 bit PUF response using the PC.

3) Comprehensive security design, as schemes are likely targets for sophisticated
   cyber-attacks (tolerance to attacks).
   Trusted Computing Technology [40]: Highly sophisticated TPM which provides unique
   identity for each module and strong cryptography co-processor. TPM has secure
   storage using a unique asymmetric storage root key (SRK) of which the private part
   never leaves the TPM.
   PUF Scheme [41]: Scheme based on the unclonable features derived from the process
   variation on the silicon chip.

4) Adaptable and evolvable designs because components typically have a lifetime of 15
   or more years once deployed (tolerance to faults).
   Trusted Computing Technology [40]: This scheme is adaptable since it is added to the
   existing devices on a smart grid. It is not evolvable as the design is made on
   application specific IC (ASIC).
   PUF Scheme [41]: This scheme is both adaptable and evolvable. The scheme is
   independent and can be added to any smart meter. It is also evolvable because it
   uses FPGA; hence, it can be reprogrammed.
The authentication process takes 11 steps for the trusted
computing scheme and 3 steps for the PUF scheme. The trusted computing scheme takes
982.91 ms to complete the authentication process. The timing information reported for
the PUF scheme is not sufficient to determine its total authentication time. The
latency determines whether the scheme can operate in real time. Both technologies
have very good features that support tolerance to attacks, since the security scheme is
embedded in the hardware. The last requirement in Table 8.1 concerns
adaptable and evolvable designs, given the components’ life span. The
trusted computing scheme is adaptable because the module can be replaced over time, but
it is not evolvable as it uses an ASIC in the design. The PUF scheme is adaptable
because it is designed as a stand-alone unit that can integrate with any smart
meter, and evolvable because of the reconfigurability offered by the FPGA.
8.3 Hardware-Oriented Security-Based Authentication for
AMI
In the AMI, the utility companies monitor their customers’ usage through
the smart meter. All data from a smart meter is sent to the utility company through a
number of smart meters (a hopping network) and data concentrators. This means that all
data sent to and from the utility company goes through a number of hops before it
reaches its destination, as shown in Figure 8-1. For the authentication process, we
propose that the utility company be the center point for authenticating data
concentrators and smart meters. There are three reasons for this choice. First,
utility companies need to monitor and update their
customers regularly. Second, all critical control messages, such as
switching off certain users’ appliances, can only come from the utility companies.
Third, the utility companies have larger and more secure storage (secured from
side-channel attacks).
All devices (data concentrators and smart meters) involved in the communication
from the utility company to the smart meter need to have a ROPUF chip, as shown in
Figure 8-2. The utility company does not need a ROPUF chip as it is the trusted
authority that controls and monitors the network. The ROPUF uses the existing network
protocol. In this chapter we present a short authentication protocol that is
added on top of the existing protocol. Our focus is to provide a practical
solution that can be applied to the existing technology at low cost.
The first step in the implementation is to recognize all devices present in the AMI.
The utility company scans and records all the challenge and parity bits pairs (CPBPs)
from each ROPUF chip. The scanned ROPUF chips are then connected to the devices in the
AMI. In this way a utility company can keep track of the devices that are
present in the AMI.
Figure 8-1: AMI in Smart Grid [42].
Figure 8-2: ROPUF connected to a smart meter.
When the smart meter needs to send data to the utility company, it needs to
authenticate itself. The smart meter first sends the utility company a request to send
data, as shown in Figure 8-3. The utility company then sends a challenge Ci to the
smart meter. The smart meter uses the received Ci to produce the authentication code
in the form of Parity Bits (PBi) and sends it back to the utility company. The utility
company verifies the response PBi; if it is correct, permission is granted, and if it
is wrong, the smart meter is given two more chances to send a correct response, as
shown in Figure 8-4. In the worst case, the smart meter fails to send a correct
response, and the utility company sends a broadcast signal (BLOCK(broadcast)) so that
all devices in the AMI drop any packet received from that particular smart meter and
no data can be sent through it. This action automatically isolates the smart
meter from the AMI. To resolve the situation, the utility company needs to verify with
the customer whether it is a technical error or an adversary attack.
Figure 8-3: Smart meter to utility company authentication.
Figure 8-4: Smart meter to utility company fail authentication.
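The verify-and-retry logic on the utility side can be sketched as follows. This is an illustration under stated assumptions rather than the protocol implementation: all names are hypothetical, the stored CPBPs are modeled as a dictionary, the device is modeled as a callable, and the same challenge is assumed to be reused on each retry, which the text does not specify.

```python
MAX_ATTEMPTS = 3  # first attempt plus the two further chances described above


def authenticate(challenge, expected_pb, respond, max_attempts=MAX_ATTEMPTS):
    """Utility-side verification of one device.

    challenge   -- the challenge Ci sent to the device
    expected_pb -- parity bits recorded for Ci when the ROPUF chip was scanned
    respond     -- callable standing in for the device: challenge -> parity bits
    Returns (granted, attempts_used).
    """
    for attempt in range(1, max_attempts + 1):
        if respond(challenge) == expected_pb:
            return True, attempt
    return False, max_attempts


def handle_request(device_id, challenge, cpbp_table, respond, blocked):
    """Grant access, or record the device as blocked after repeated failures."""
    expected_pb = cpbp_table[device_id][challenge]
    granted, _ = authenticate(challenge, expected_pb, respond)
    if not granted:
        # Stands in for BLOCK(broadcast): every device would drop its packets.
        blocked.add(device_id)
    return granted


# Made-up challenge and parity bits for one meter.
cpbp_table = {"meter-1": {"1010": "0110"}}
blocked = set()


def genuine(challenge):
    return cpbp_table["meter-1"][challenge]


def impostor(challenge):
    return "0000"


print(handle_request("meter-1", "1010", cpbp_table, genuine, blocked))
print(handle_request("meter-1", "1010", cpbp_table, impostor, blocked))
print(blocked)
```

Allowing a small number of retries tolerates an occasional bit flip in the ROPUF response while still isolating a device that repeatedly fails.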
The next potential adversary attack is against the data concentrator. The data
concentrator acts as the forwarding device that completes the data path from the smart
meter to the utility company. Identity impersonation and data jamming are potential
94
attacks on the utility company. To make sure that all data concentrators in the network
are real, the authentications need to be done regularly. For the smart grid authentication,
we propose an authentication every 15 minutes, as suggested in [36][41]. For the data
concentrator authentication, the utility company first sends a verification request signal
with the challenge (VER(Ci)) to the data concentrator as shown in Figure 8-5. The data
concentrator uses the Ci sent by the utility company to produce the PBi and sends it to
the utility company. The utility company verifies the PBi, and if the PBi matches, then an
acknowledgment (ACK) is sent and the data concentrator can operate as usual. If the PBi
does not match, then two more chances are given for the data concentrator to send a correct PB.
If the data concentrator still fails to produce a correct PB, the utility company drops that
data concentrator from the network by sending a broadcast signal to all devices on the
network in order to isolate the particular data concentrator. The steps taken are the same
as shown in Figure 8-4.
Figure 8-5: Data concentrator to utility company authentication.
The most critical attack possible is the utility company impersonation. The utility
company holds the main authority over all devices in the network. An adversary can get
control of the customer’s smart meter if they can impersonate the utility company. The
ANSI C12.18 standard defines six security levels of access that indicate different
privileges, L0 to L5, with L5 being the highest privilege. The security level L0 requires
no password [41]. To differentiate the security levels we use five different lengths
of PBs: 64, 128, 256, 512 and 1024 bits. A higher
security privilege requires a longer PB. Authentication begins when the
utility company sends a request to the smart meter for a specific level of access
permission (REQ(level)) as shown in Figure 8-6. Then the utility company sends a
challenge and Hamming code parity bits pair (CPBPi-length(level)) to the smart meter.
The smart meter verifies the PBi-length sent from the utility company by generating its own
PBi-length from the given Ci-length. The utility company has two more chances if the smart
meter fails to verify the CPBPi-length(level) as shown in Figure 8-7. If an adversary tries
to impersonate the utility company, the utility company detects the attack when receiving
the first NACK from the smart meter.
Figure 8-6: Utility company to smart meter authentication.
Figure 8-7: Utility company to smart meter fail authentication.
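The mapping from access level to PB length can be written down directly. The sketch below assumes that L1 to L5 use 64 to 1024 bits in increasing order (consistent with Table 8.3) and that L0 needs no PB because it requires no password; the function name is hypothetical.

```python
# PB length in bits for each password-protected access level (L1 to L5).
PB_LENGTH_BITS = {1: 64, 2: 128, 3: 256, 4: 512, 5: 1024}


def pb_length(level: int) -> int:
    """Return the PB length required for an access level, 0 for L0."""
    if level == 0:
        return 0  # L0 requires no password, so no PB is exchanged
    if level not in PB_LENGTH_BITS:
        raise ValueError(f"unknown security level: L{level}")
    return PB_LENGTH_BITS[level]


for level in range(6):
    print(f"L{level}: {pb_length(level)} bits")
```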
8.3.1 ROPUF Design
In our ROPUF design we use 3-stage ROs. The benefits of using 3-stage ROs
have been discussed in another chapter [17]. For the smart grid authentication application
we propose a ROPUF that can generate responses of up to 2048 bits. Our ROPUF
design uses 120 ROs. If the smart meter in the smart grid application is required to
authenticate itself every 15 minutes in order to update the usage information,
it needs to authenticate 1,752,000 times over a 50-year time span. Our
ROPUF design is capable of handling this requirement. Each authentication uses
new response bits, which makes it harder for the adversary to launch an attack.
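The authentication count follows from simple arithmetic. The sketch below assumes 365-day years, which reproduces the 1,752,000 figure; the function name is illustrative.

```python
def authentications(years: int, interval_minutes: int = 15,
                    days_per_year: int = 365) -> int:
    """Number of authentications when one is performed every interval."""
    per_day = (24 * 60) // interval_minutes  # 96 for a 15-minute interval
    return per_day * days_per_year * years


print(authentications(1))   # one year
print(authentications(50))  # 50-year life span
```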
Figure 8-8 shows the logic blocks for the ROPUF circuit. Each RO is activated for
0.4 ms and there is a 0.1 ms gap before the next RO is activated; this is to reduce the
heat-induced noise generated by the adjacent CLB [17]. A 0.2 ms gap between the RO and
counter activation allows the signal to stabilize before the measurement starts. The
timing controller controls all time intervals involved, such as the time interval for each
RO’s activation and the time interval for the counter to measure each RO. Counter 1
measures the number of cycles generated by the RO selected through
multiplexer 1 (Mux 1), and Counter 2 measures the number of cycles generated by the
RO selected through Mux 2. The comparator then compares the number of
cycles recorded by Counters 1 and 2 to generate one response bit. The generated response
is stored in the register.
Figure 8-8: ROPUF logic blocks.
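The counter-and-comparator path can be modeled in software as follows. This is a behavioral sketch only: the cycle counts are made-up values, the challenge is represented as a list of (Mux 1, Mux 2) RO index pairs, and the bit polarity (1 when Counter 1 is larger) is an assumption that the text does not specify.

```python
from typing import List, Sequence, Tuple


def response_bit(count_1: int, count_2: int) -> int:
    """Comparator: one response bit from the two counter values."""
    return 1 if count_1 > count_2 else 0  # polarity is an assumption


def generate_response(cycle_counts: Sequence[int],
                      challenge: Sequence[Tuple[int, int]]) -> List[int]:
    """Apply each (Mux 1, Mux 2) selection and collect the response bits."""
    return [response_bit(cycle_counts[i], cycle_counts[j])
            for i, j in challenge]


# Hypothetical cycle counts for four ROs measured over the same window.
counts = [80120, 79850, 81010, 80400]
challenge = [(0, 1), (1, 2), (3, 2), (2, 0)]
print(generate_response(counts, challenge))
```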
To reduce bit flip occurrences, we measured the response generation under
various environmental conditions, such as temperature and voltage
variations [34]. From the study, we found that it is important to use
RO comparison pairs with large frequency differences to avoid bit flips. Figure
6-2 (a) shows an example of a ROPUF output flip that occurs when the difference between
two selected ROs is small. Figure 6-2 (b) shows how the output flip can be prevented by
selecting two ROs that have a larger frequency difference. From the data obtained on
specific FPGA chips (S3E100), we found the best threshold to be 5 MHz. The threshold
determines the number of possible RO comparison pairs, namely those whose frequency
difference exceeds 5 MHz. The number of RO comparison pairs and CRPs that pass the
5 MHz threshold for 5 different FPGA chips (S3E100) are shown in Table 8.2 (we use
the combination formula n!/((n-r)! r!)). The number of possible CRPs is large enough to
support the frequent authentication requirement, and the security is guaranteed because CRPs
will not be reused.
Table 8.2: Number of possible CRPs.
ROPUF   Comparison pairs   Number of possible CRPs (128 bits)
1       1863               1.18 x 10^285
2       3942               5.758 x 10^243
3       5880               1.947 x 10^266
4       4504               1.919 x 10^251
5       3381               1.18 x 10^235
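The pair selection and the combination formula can be illustrated with a small example. The RO frequencies below are made-up values, and reading n as the number of usable pairs and r as the response length in bits is an assumption; the sketch does not attempt to reproduce the entries in Table 8.2.

```python
from itertools import combinations
from math import comb
from typing import List, Sequence, Tuple


def usable_pairs(freqs_mhz: Sequence[float],
                 threshold_mhz: float = 5.0) -> List[Tuple[int, int]]:
    """RO index pairs whose frequency difference exceeds the threshold."""
    return [(i, j)
            for (i, fi), (j, fj) in combinations(enumerate(freqs_mhz), 2)
            if abs(fi - fj) > threshold_mhz]


# Hypothetical frequencies (MHz) for four ROs.
freqs = [200.0, 203.0, 206.5, 212.0]
pairs = usable_pairs(freqs)
print(pairs)

# n!/((n-r)! r!) with n usable pairs and an r-bit response (assumed meaning).
n, r = len(pairs), 2
print(comb(n, r))
```

With realistic values of n (thousands of pairs) and r = 128, `math.comb` returns an exact integer with hundreds of digits, which is why the challenge space is described as abundant.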
8.3.2 Authentication
For the authentication in the AMI, the Ri generated from the ROPUF is not sent directly
because the adversary could model the ROPUF based on the Ci and Ri that are sent
through the network. To enhance security, the Hamming code parity bits (PBs) are sent
out for authentication as shown in Figure 8-9. Hamming code is a linear error-correcting
code that generalizes the Hamming (7,4) code. A block of data that has a length of k bits
is assigned n-k parity bits. The length of the message after adding the parity bits is n
bits. The block length is n = 2^r - 1, and the message length is
k = 2^r - r - 1, where r is the number of parity bits and r ≥ 2. The block of data represented
by m and the data with parity bits represented by x are related by x = mG, where G is the
generator matrix.
Figure 8-9: Parity Bits PBi generator.
PBi is used in the authentication by generating 4 parity bits for every 8 bits of Ri
as shown in Figure 8-10. For a 128-bit Ri, 64 bits of PBi are generated. Based on the
equation mentioned before, 4 Hamming code parity bits are able to cover up to 11 bits of data
for error detection and correction. In our ROPUF design, however, we use the Hamming code
as an authentication code generator. We maximize the length of the authentication code to
increase the security level by generating 4 Hamming code parity bits for every 8 bits of
data. There are three advantages of using Hamming code parity bits as the authentication
code. First, Hamming code is a one-way function. The second advantage is that there is no
way to model the ROPUF using the Ci and PBi (discussed in Section 8.4). Third, the
authentication code has a shorter length compared to Ri, yet produces better security.
Figure 8-10: 128 response bits form 64 parity bits.
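The parity bit generation can be sketched as follows. The coverage sets are those of the standard Hamming (12,8) layout with even parity, which is an assumption: the text only confirms that the first parity bit covers b0, b1, b3, b4 and b6 (Section 8.4), and the function names are illustrative.

```python
from typing import List, Sequence

# Response-bit indices covered by each of the 4 parity bits (p0..p3),
# following the standard Hamming (12,8) positions (assumed layout).
COVERAGE = (
    (0, 1, 3, 4, 6),
    (0, 2, 3, 5, 6),
    (1, 2, 3, 7),
    (4, 5, 6, 7),
)


def parity_bits(block: Sequence[int]) -> List[int]:
    """Four even-parity bits for one 8-bit block of the response."""
    if len(block) != 8:
        raise ValueError("block must contain exactly 8 bits")
    return [sum(block[i] for i in cover) % 2 for cover in COVERAGE]


def pb_from_response(response: Sequence[int]) -> List[int]:
    """Concatenate the parity bits of every 8-bit block of Ri."""
    if len(response) % 8 != 0:
        raise ValueError("response length must be a multiple of 8")
    pb: List[int] = []
    for start in range(0, len(response), 8):
        pb.extend(parity_bits(response[start:start + 8]))
    return pb


print(parity_bits([1, 0, 0, 0, 0, 0, 0, 0]))
print(len(pb_from_response([0, 1] * 64)))  # 128-bit Ri
```

Because 256 possible 8-bit blocks map onto only 16 parity patterns, several different response blocks produce the same PB block, so the parity bits alone do not reveal the underlying response bits.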
Figure 8-11 shows how each ROPUF chip is registered with the utility company.
The utility company has a database that stores all possible challenges for each ROPUF.
The combinations of challenges for each ROPUF are different, as discussed in the previous
section, because only RO comparison pairs that pass the frequency difference threshold
are selected. Challenges for all ROPUFs should be provided by the manufacturer. The utility
company sends one Ci at a time to each ROPUF and records the PBi generated by each
particular ROPUF in the database. The ROPUF registration process starts with the
challenges for 128-bit responses and continues up to 2048-bit responses for the different levels
of security access.
Figure 8-11: ROPUFs registration with utility company.
ROPUFs that have been registered can be used by the smart meters and data
concentrators in the AMI. The ROPUF does not need any additional storage on the
devices. It only needs to be connected to the device through a serial connection, as shown
in Figure 8-2, and the firmware on the devices needs to be updated to support the
additional protocol proposed to generate the authentication key.
8.4 Proof of Concept
In this section, we discuss the proposed ROPUF design for smart grid
authentication based on three smart grid requirements mentioned in Section 8.1. To prove
the concept, we implement our ROPUF design on Spartan 3E FPGAs. A PC acts as the
utility company that stores all the CRPs, and the smart meters and data concentrators are
implemented on FPGAs. This set-up simulates the protocol and the effectiveness of our
scheme. Table 8.3 shows the time taken to transfer the PBi and Ci through the USB
connection. For the first authentication level, the total authentication time taken is 65.364
ms via the USB connection. This time is within the range of achieving real-time
communication. The first authentication level is used the most as it involves the
information passing between the utility company and smart meter. The second level
authentication takes 130.73 ms, followed by third (261.46 ms), fourth (522.91 ms) and
fifth (1045.83 ms) level authentications. In terms of high efficiency, our system meets the
smart grid requirement in which the authentication can be achieved in real-time.
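The total authentication time in Table 8.3 is the sum of the PBi generation, PBi transfer, and Ci transfer times. The short check below copies the three component times from the table; the dictionary and function names are illustrative.

```python
# (PBi generation, PBi transfer, Ci transfer) in ms, copied from Table 8.3.
TABLE_8_3 = {
    "L1": (64.0, 0.047, 1.317),
    "L2": (128.0, 0.094, 2.634),
    "L3": (256.0, 0.188, 5.268),
    "L4": (512.0, 0.376, 10.537),
    "L5": (1024.0, 0.753, 21.074),
}


def total_authentication_ms(level: str) -> float:
    """Sum the three time components for one authentication level."""
    return sum(TABLE_8_3[level])


for level in TABLE_8_3:
    print(level, round(total_authentication_ms(level), 3))
```

The sums agree with the totals reported in the table to within rounding of the last digit.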
Another factor to consider is the cost of storage. Extra data storage is needed to store
the challenges and PBs from the ROPUFs. Table 8.4 shows the data storage needed to store all
challenge and parity bits pairs (CPBPs) for 50 years. For the first authentication level, we
assume that authentication takes place every 15 minutes. In this case, one ROPUF
needs 35136 CPBPs in one year. The data storage needed to store the CPBPs for
one year is 8 MB. If the life span of the AMI in the smart grid is extended to 50 years, then
the data storage needed to store CPBPs for one ROPUF is 408 MB. For the other
authentication levels, we assume that usage of 5 times a day requires 1830 CPBPs in
one year for one ROPUF. For a 50-year life span, the second authentication level needs
46 MB of data storage, followed by the third, fourth and fifth authentication levels (93 MB,
186 MB and 371 MB, respectively).
Table 8.5 shows the total data storage needed to store all the CPBPs according to
the number of devices (data concentrators and smart meters) involved in the AMI. Storing
the complete CPBPs (all authentication levels) for one device takes 1.1 GB of data storage. If
the AMI has 2000 devices, 2207 GB of data storage is needed. Two TB of data storage
currently costs around $140. Thus, the proposed authentication scheme using ROPUF
does not incur a high cost for the utility company and is cost-effective.
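The storage figures can be checked with a short calculation. The sketch assumes that each stored CPBP occupies exactly the Ci bits plus the PBi bits and that 1 MB is 10^6 bytes; these assumptions reproduce the first-level values, while the 50-year values for levels two to five are taken from Table 8.4 rather than recomputed.

```python
def storage_mb(ci_bits: int, pb_bits: int,
               cpbps_per_year: int, years: int) -> float:
    """Storage in MB (10^6 bytes) for the CPBPs of one ROPUF."""
    bytes_per_cpbp = (ci_bits + pb_bits) // 8
    return bytes_per_cpbp * cpbps_per_year * years / 1e6


# First authentication level: Ci = 1792 bits, PBi = 64 bits, 35136 per year.
first_1y = storage_mb(1792, 64, 35136, 1)
first_50y = storage_mb(1792, 64, 35136, 50)
print(round(first_1y, 3), round(first_50y, 3))

# Per-device total over 50 years, using the Table 8.4 values for L2 to L5.
per_device_mb = round(first_50y, 3) + 46.4 + 92.8 + 185.6 + 371.2
devices = 2000
print(round(per_device_mb * devices / 1000, 3), "GB")
```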
Table 8.3: Authentication time for each level.
Authentication  Ri      PBi     Ci      Time (ms)
level           (bits)  (bits)  (bits)  PBi generation  PBi transfer  Ci transfer  Total authentication
First (L1)      128     64      1792    64              0.047         1.317        65.364
Second (L2)     256     128     3584    128             0.094         2.634        130.728
Third (L3)      512     256     7168    256             0.188         5.268        261.457
Fourth (L4)     1024    512     14336   512             0.376         10.537       522.913
Fifth (L5)      2048    1024    28672   1024            0.753         21.074       1045.827
Table 8.4: Data storage size for each authentication level.
CPBP authentication level   Year(s)   Data size (megabytes)
First                       1         8.152
First                       10        81.516
First                       20        163.031
First                       30        244.547
First                       40        326.062
First                       50        407.578
Second                      50        46.4
Third                       50        92.8
Fourth                      50        185.6
Fifth                       50        371.2
Table 8.5: Data storage size needed based on number of devices on the AMI.
Number of devices Data Size (gigabytes)
1 1.103578
100 110.3578
200 220.7156
300 331.0734
400 441.4312
500 551.789
600 662.1468
700 772.5046
800 882.8624
900 993.2202
1000 1103.578
2000 2207.156
To test the security level of our ROPUF, we use a support vector machine (SVM) to
model the ROPUF based on the Ci and PBi. In this model we assume that the adversary
has knowledge of the encryption code used in the network and also of the Hamming code
used to generate the PBi for every 8 bits of the response. We use an SVM because the Ci and PBi
can be classified as ‘1’ and ‘0’. An SVM classifies data by finding the best hyperplane
that separates one class from another. The best hyperplane has the largest margin
between the two classes.
First, we train the SVM classifier with a set of data with the correct class labels.
The data X and class labels Y are fed to the SVM train function to train the classifier. The
Gaussian Radial Basis Function kernel with a scaling factor sigma equal to one is used
for the classifier training. The data set consists of the RO pairs used to generate the parity
bit as shown in Figure 8-12. As an example, response bits b0, b1, b3, b4, and b6 are used for
the generation of the first parity bit. RO1 and RO2 are used to generate b1, RO3 and RO4
are used to generate b2, and so on as shown in Figure 8-12. The parity bit p0 is the
class label Y, and the data X consists of RO1 to RO10. One authentication key comprises 64 parity
bits, p0 through p63.
Figure 8-12: Parity bits and corresponding ROs.
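For reference, a Gaussian radial basis function kernel with scaling factor sigma can be written as below. The parameterization exp(-||x - y||^2 / (2 sigma^2)) is a common convention and is an assumption here, since the exact form used by the SVM train function is not stated; the feature vectors in the example are made up.

```python
import math
from typing import Sequence


def rbf_kernel(x: Sequence[float], y: Sequence[float],
               sigma: float = 1.0) -> float:
    """Gaussian RBF kernel: exp(-||x - y||^2 / (2 * sigma^2))."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


# Two hypothetical feature vectors built from 10 RO selections (RO1..RO10).
x1 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
x2 = [1, 2, 3, 4, 5, 6, 7, 8, 10, 9]
print(rbf_kernel(x1, x1))  # identical inputs give the maximum similarity
print(rbf_kernel(x1, x2))
```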
Figure 8-13 shows the accuracy of the prediction results. As the amount of data
used to train the SVM classifier increases, the accuracy also increases up to a point, but then
gradually decreases as the amount of data is increased further. The best accuracy
obtained is 60.9%, showing that the SVM cannot model the ROPUF by using the Ci and
PBi. This test proves that the proposed ROPUF design for the AMI is secure from the
adversary’s attack. Another advantage of using ROPUF as the authentication system is
that no clues useful for the adversary to crack the ROPUF are stored on the devices. Even
if the adversary is able to model one of the ROPUFs in the AMI, it will take more time to
break another ROPUF because the only way to break the ROPUF (if there is a way of
doing it) is by gathering the challenge and PB pairs and creating a model.
The security of the database that stores all challenges and PBi for the devices on
the AMI is also important. However, the database is less vulnerable to adversary attack
since the other devices have no access to the utility company. The only communication
the utility company has with the devices in the network is sending and receiving
information. However, if the adversary is able to hack the utility company, or some of the
utility company employees breach trust, then the threat is unavoidable. If that
type of attack occurs, our ROPUF authentication system can be repaired: the ROPUF can
be reprogrammed, and different sets of ROs can be used, rendering the previous
challenges and PBs invalid. In terms of the tolerance-to-attack requirement, the ROPUF
design requires endless effort from the adversary in order to model the ROPUFs. The
vulnerability exists at the utility company database, and we assume that the utility
company has a good firewall to protect the system as a whole.
Figure 8-13: SVM prediction accuracy for ROPUF.
Regarding the tolerance to faults, the authentication system is designed to tolerate
10% discrepancies in the PB. Additionally, the comparison pairs used to generate the
response in the ROPUFs have at least a 5 MHz difference. This ensures that the Ri
generated from the ROPUFs will not flip when exposed to abnormal voltage and
temperature conditions. Figure 8-14 shows that the bit flip probability decreases as
the frequency difference increases. We find that bit flips occur most often when the
maximum frequency difference is 1 MHz [34]. To ensure that the ROPUF is able to deal
with the worst case scenario, though, we recommend using the largest threshold
frequency possible.
Figure 8-14: Bit flip probability vs. frequency difference (MHz)
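One way to realize the 10% discrepancy tolerance is to accept a received PB when the fraction of mismatched bits does not exceed the tolerance. This interpretation (a bitwise Hamming-distance check) and the function name are assumptions; the text does not specify how discrepancies are counted.

```python
from typing import Sequence


def pb_matches(received: Sequence[int], expected: Sequence[int],
               tolerance: float = 0.10) -> bool:
    """Accept the PB if at most `tolerance` of its bits differ."""
    if len(received) != len(expected):
        return False
    mismatches = sum(r != e for r, e in zip(received, expected))
    return mismatches <= tolerance * len(expected)


expected = [0] * 64
six_flips = [1] * 6 + [0] * 58    # 6 of 64 bits differ (about 9.4%)
seven_flips = [1] * 7 + [0] * 57  # 7 of 64 bits differ (about 10.9%)
print(pb_matches(six_flips, expected))
print(pb_matches(seven_flips, expected))
```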
8.5 Summary
In this work, we propose a new scheme for authentication of the Advanced
Metering Infrastructure in a smart grid. The novel authentication scheme using ROPUF
offers high security with less overhead compared to previously proposed schemes [40][41].
In terms of latency, our proposed scheme takes, at most, four steps for authentication.
The complete authentication times for the most used security levels, L1 and L2, are 65.4 ms
and 130.7 ms, respectively. These times satisfy the availability requirements in a smart
grid. The authentication keys sent through the network do not provide any clues that
allow the adversary to model the ROPUF, as shown by the SVM modeling results. This
system is designed as a stand-alone unit so that it can work in addition to the existing
protocol currently used in the industry. The system is also designed to tolerate faults,
for example by using only RO comparison pairs with large frequency differences and by tolerating
10% discrepancies in the PB. The reconfigurability feature offered by the FPGA makes
the ROPUF evolvable as it can be reprogrammed at any time.
Chapter 9
Conclusions
9.1 Summary and Conclusions
The importance of hardware security and trust is increasing as the industry supply
chain has become more complex and also more vulnerable to adversary attacks. An
article in the IEEE Spectrum entitled “The Hidden Dangers of Chop-Shop Electronics”
describes how clever counterfeiters sell old components as new, thus threatening both
military and commercial systems [43]. On August 17, Boeing warned the U.S. Navy that
an ice-detection module in the P-8A Poseidon (new reconnaissance aircraft) contained a
reworked part that should not have been put on the airplane originally and should have
been replaced immediately. The company that supplies the ice-detection module has
blamed the part, a Xilinx field-programmable gate array (FPGA), for the failure of the
ice-detection module during a test flight. However, retracing that FPGA’s path led not to
Xilinx but to a Chinese company called “A Access Electronics”. It apparently had turned
a quick profit by selling used Xilinx parts as new. This incident is one example of
how vulnerabilities in hardware security and trust have become a security threat.
There are two points that can be highlighted from this true incident. The first is how
widely the programmable chip or FPGA is used in the industry. The second is the
vulnerability that exists in the industry’s supply chain that could lead to serious safety
and security issues. Therefore, it is important to increase the security of FPGAs and other
custom designed chips. Some techniques have been proposed in the past to enhance the
security of FPGAs. In this work, we have proposed a ring oscillator based technique
which extracts the process variation effects from the FPGA and converts it to a unique
ID. A ROPUF can be used as an authentication technique for an FPGA to verify its
trustworthiness. Apart from that, ROPUFs can also be used as a cryptography technique
to encrypt and decrypt information.
9.2 Contributions and Results
Major contributions made in this research are listed below:
• Three different FPGA families, fabricated using different silicon
technologies, are used to explore the ROPUF. The ROs are studied and compared
based on five parameters: uniqueness, reliability, uniformity, bit-aliasing, and
diverseness.
o Temperature variation, voltage variation, and accelerated aging
experiments are performed to measure the reliability.
o The FPGA fabricated using the latest technology shows better
performance based on the five parameters used.
o Different numbers of stages used in ROs are explored. The experimental
results obtained suggest that a lower number of stages used in a ROPUF
on FPGA contributes to better performance regardless of the silicon
technologies used.
o Different FPGAs may have different minimum numbers of stages that can
be used in ROs due to the limitation on the FPGA components’ maximum
operating frequency. The minimum number of RO stages that can be used
in Spartan 2 and Spartan 3E is 3, and in Artix-7 it is 5.
o Based on the experimental results, we conclude that a ROPUF is
applicable regardless of the silicon fabrication technology used to
produce an FPGA.
• The systematic variation effect on the ROPUF’s security reliability has also been
studied in this work.
o The experimental results showed that systematic variation does affect the
randomness and uniqueness of the ROPUF responses.
o The RPM technique is developed to overcome this effect. The results
obtained by using the RPM technique are shown to be better than those of other
techniques that have been proposed before.
o The responses generated from ROPUFs after applying the RPM technique
passed most of the NIST statistical tests for randomness.
• The ROPUF is applied as the hardware-oriented security-based authentication for
advanced metering infrastructure (AMI). The authentication system is developed
based on ROPUF and is targeted for AMI.
o ROPUF is used to generate the unique ID for the devices involved in the
AMI for the authentication.
o This system is designed to fit in the current AMI system with low cost
implementation.
o Details of the implementation cost are shown in this work as the proof of
concept.
o The security of the ROPUF system used is also tested using a support vector
machine (SVM). The SVM is trained using a large data set and challenges
are fed into the SVM to predict the response sets.
o Results obtained show that the SVM failed to predict ROPUF responses based
on the challenges, thus lending credence to the security offered by the
proposed authentication system.
9.3 Future Works
The research work done in this dissertation can be further extended by performing
the following:
• Implementation of the ROPUF authentication scheme for the AMI network using
simulation software such as NS-2 to analyze its performance.
• An efficient error correction circuit can be developed to improve the ROPUF
security.
• A ROPUF scheme can be developed for cryptography.
• A hardware Trojan detection technique can be designed using the RO.
References
[1] The Bureau Of National Affairs, INC, “Counterfeit Electronic Parts: What to do
Before the Regulations (and Regulators) come?” ISSN 0014-9063 2012.
[2] Ryan Kastner and Ted Huffmire, “Threats and Challenges in Reconfigurable
Hardware Security,” California University San Diego La Jolla Department of
Computer Science and Engineering, July 2008.
[3] C.E. Yin and Q. Gang, “Improving PUF Security with Regression-based Distiller,”
Design Automation Conference (DAC), pp. 1-6, Jun 2013.
[4] A. Maiti, V. Gunreddy, and P. Schaumont, "A Systematic Method to Evaluate and
Compare the Performance of Physical Unclonable Functions," Chapter 11 in
"Embedded System Design with FPGAs," Eds. P. Athanas, D. Pnevmatikatos, N.
Sklavos, Springer 2012, ISBN 978-1-4614-1361-5.
[5] B. Gassend, D. E. Clarke, M. Van Dijk and S. Devadas, “Silicon physical unknown
functions,” in ACM Conference on Computer and Communications Security (CCS)
2002, pp. 148–160.
[6] G.E. Suh and S. Devadas, “Physical unclonable functions for device authentication
and secret key generation,” in Proc. 44th Design Automation Conf. (DAC 07), ACM
Press, pp. 9–14.
[7] Susana Eiroa and Iluminada Baturone, “An analysis of ring oscillator PUF behavior
on FPGAs,” in Int. Conf. on Field-Programmable Technology (FPT) 2011.
[8] A. Maiti, and P. Schaumont, “Improved ring oscillator PUF: an FPGA-friendly secure
primitive,” Journ. of Cryptology, Vol. 4 (2), pp. 375-397, April 2011.
[9] C.E.D. Yin and Q. Gang, “LISA: Maximizing RO PUF’s secret extraction,” In HOST
2010, pp. 100-105.
[10] A. Maiti, and P. Schaumont, “Improving the quality of a physical unclonable
function using configurable ring oscillators,” in FPL 2009, pp.703-707.
[11] A. Maiti, J. Casarona, L. McHale, and P. Schaumont, ”A large characterization of
RO-PUF,” HOST 2010, pp. 66-71, 2010.
[12] H. Yu, P.H.W. Leong, H. Hinkelmann, L. Moller, M. Glesner, and P. Zipf,"Towards
a unique FPGA-based identification circuit using process variations," International
Conference on Field Programmable Logic and Application, pp.397,402, Aug. 31
2009-Sept. 2 2009.
[13] P. Sedcole and P. Y. K. Cheung, “Within-die delay variability in 90nm FPGAs and
beyond,” Proc. FPT 2006, pp. 97-104.
[14] D. Lim, J.W. Lee, B. Gassend, M. Van Dijk, and S. Devadas, “Extracting secret keys
from integrated circuits,” IEEE Transactions on Very Large Scale Integration (VLSI)
Systems, 2005.
[15] S.S. Kumar, J. Guajardo, R. Maes, G.J. Schrijen, and P. Tuyls, “The Butterfly PUF:
Protecting IP on every FPGA,” IEEE International Workshop on Hardware Oriented
Security and Trust (HOST), 2006.
[16] S. Morozov, A. Maiti, and P. Schaumont, "A Comparative Analysis of Delay Based
PUF Implementations on FPGA," 6th International Symposium on Applied
Reconfigurable Computing, March 2010.
[17] M. Mustapa, M. Niamat, M. Alam and T. Killian, “Frequency Uniqueness in Ring
Oscillator Physical Unclonable Functions on FPGAs,” MWSCAS 2013, pp. 465-468,
Aug. 2013
[18] Q. Liu and S.S. Sapatnekar, “A Framework for scalable post-silicon statistical delay
prediction under process variations,” in IEEE Trans. on CAD of Integrated Circuits
and Systems 2009, IEEE Press, pp. 1201-1212.
[19] Xilinx, “Spartan-II FPGA Family Data Sheet,” DS001 June 13 2008
[20] S. Lopez-Buedo, J. Garrido, and E. Boemo, "Thermal testing on reconfigurable
computers," Design & Test of Computers, IEEE , vol.17, no.1, pp.84,91, Jan-Mar
2000.
[21] National Institute of Standards and Technology, “A Statistical Test Suite for Random
and Pseudorandom Number Generators for Cryptographic Applications,” April 2010.
[22] M. D. Yu and S. Devadas, “Secure and Robust Error Correction for Physical
Unclonable Functions,” IEEE Trans. on Design & Test of Computers, 2010.
[23] Y. Dodis, R. Ostrovsky, L. Reyzin and L. Smith, “Fuzzy Extractors: How to
Generate Strong Keys from Biometrics and Other Noisy Data,” SIAM Journal on
Computing 38(1), 97-139, 2008.
[24] R. Maes, A. V. Herrewege and I. Verbauwhede, “PUFKY: A Fully Functional PUF-
based Cryptographic Key Generator,” Cryptographic Hardware and Embedded
Systems CHES 2012, pp 302-319, 2012.
[25] C. E. Yin and G. Qu, “Temperature-Aware Cooperative Ring Oscillator PUF,”
Proceedings of 2nd IEEE International Workshop on Hardware Oriented Security
and Trust, Jun 2009.
[26] C. E. Yin, G. Qu and Q. Zhou, “Design and Implementation of a Group-based RO
PUF,” Design, Automation Test (DATE13), March 2013.
[27] B. E. Stine, D. S. Boning, and J. E. Chung, “Analysis and decomposition of spatial
variation in integrated circuit processes and devices,” IEEE Transactions on
Semiconductor Manufacturing, Vol. 10, Issue 1, pp. 24-91, Feb 1997.
[28] K. Bernstein, D. J. Frank, A. E. Gattiker, W. Haensch, B. L. Ji, S. R. Nassif, E. J.
Nowak, D. J. Pearson, and N. J. Rohrer, “High-performance cmos variability in the
65-nm regime and beyond,” IBM Journal of Research and Development, vol. 50, no.
4.5, pp. 433-449, 2006.
[29] B. E. Stine, T. Maung, R. Divecha, et al., “Using a statistical metrology
framework to identify systematic and random sources of die and wafer-level ILD
thickness variation in cmp processes,” International Electron Devices Meeting, pp.
499-502, Dec. 1995.
[30] J.R. Celaya, P. Wysocki, V. Vashchenko, S. Saha and K. Goebel, "Accelerated aging
system for prognostics of power semiconductor devices," AUTOTESTCON, 2010
IEEE , vol., no., pp.1,6, 13-16 Sept. 2010.
[31] J. Keane and C.H. Kim, "An odometer for CPUs," Spectrum, IEEE , vol.48, no.5,
pp.28,33, May 2011.
[32] A. Maiti, L. McDougall, and P. Schaumont, "The Impact of Aging on an FPGA-
Based Physical Unclonable Function," Field Programmable Logic and Applications
(FPL), 2011 International Conference on , vol., no., pp.151,156, 5-7 Sept. 2011.
[33] D. Ganta and L. Nazhandali, "Study of IC aging on ring oscillator physical
unclonable functions," Quality Electronic Design (ISQED), 2014 15th International
Symposium on , vol., no., pp.461,466, 3-5 March 2014.
[34] M. Mustapa, and M. Niamat, “Relationship between Number of Stages in ROPUF
and CRP Generation on FPGA,” The 2014 International Conference on Security and
Management (SAM’14), 21-24 July 2014.
[35] W. Wang, and Z. Li, “Cyber security in the Smart Grid: Survey and challenges”,
Computer Networks, Volume 57, Issue 5, Pages 1344-1371 April 2013.
[36] H. Khurana, R. Bobba, T. Yardley, P. Agarwal, and E. Heine, "Design Principles for
Power Grid Cyber-Infrastructure Authentication Protocols," System Sciences
(HICSS), 2010 43rd Hawaii International Conference on , vol., no., pp.1,10, 5-8 Jan.
2010
[37] M.M. Fouda, Z.M. Fadlullah, N. Kato, Rongxing Lu, and Xuemin Shen, "A
Lightweight Message Authentication Scheme for Smart Grid
Communications," Smart Grid, IEEE Transactions on , vol.2, no.4, pp.675,685, Dec.
2011
[38] Li Hongwei, Lu Rongxing, Liang Zhou, Bo Yang, and Xuemin Shen, "An Efficient
Merkle-Tree-Based Authentication Scheme for Smart Grid," Systems Journal, IEEE ,
vol.8, no.2, pp.655,663, June 2014
[39] H. Nicanfar, P. Jokar, K. Beznosov, and V.C.M. Leung, "Efficient Authentication
and Key Management Mechanisms for Smart Grid Communications," Systems
Journal, IEEE , vol.8, no.2, pp.629,640, June 2014
[40] N. Kuntze, C.Rudolph, I. Bente, J. Vieweg, and J. Von Helden, "Interoperable device
identification in Smart-Grid environments," Power and Energy Society General
Meeting, 2011 IEEE , vol., no., pp.1,7, 24-29 July 2011
[41] M. Nabeel, S. Kerr, Xiaoyu Ding, and E. Bertino, "Authentication and key
management for Advanced Metering Infrastructures utilizing physically unclonable
functions," Smart Grid Communications (SmartGridComm), 2012 IEEE Third
International Conference on , vol., no., pp.324,329, 5-8 Nov. 2012.
[42] R. Lehrbaum. (2013, Sept. 30). Smart grid data concentrator dev kit runs Linux
[online]. Available: http://linuxgizmos.com
[43] J. Villasenor and M. Tehranipoor, "Chop shop electronics," Spectrum, IEEE , vol.50,
no.10, pp.41,45, October 2013.
[44] S. Morozov, A. Maiti, and P. Schaumont, “An Analysis of Delay Based PUF
Implementations of FPGA,” 6th International Symposium ARC 2010, Bangkok,
Thailand, pp. 382-387, March 17-19 2010.
[45] Xilinx, “Virtex-6 FPGA Configuration,” User Guide, Aug. 2014.
[46] A. Maiti, L. McDougall, and P. Schaumont, "The Impact of Aging on an FPGA-
Based Physical Unclonable Function," Field Programmable Logic and Applications
(FPL), 2011 International Conference on , vol., no., pp.151,156, 5-7 Sept. 2011.
[47] D. Ganta and L. Nazhandali, "Study of IC aging on ring oscillator physical
unclonable functions," Quality Electronic Design (ISQED), 2014 15th International
Symposium on , vol., no., pp.461,466, 3-5 March 2014.
[48] Y. Hori, T. Yoshida, T. Katashita, and A. Satoh, “Quantitative and Statistical
Performance Evaluation of Arbiter Physical Unclonable Functions on FPGAs,”
International Conference on Reconfigurable Computing and FPGAs (ReConFig)
2010, pp 298-303, December 2010.
[49] M. Mustapa and M. Niamat, “Novel RPM Technique to Dismiss Systematic
Variation for ROPUF on FPGA,” IEEE National Aerospace & Electronics
Conference (NAECON 2014), 25-27 June 2014.
sources/162/Duncan et al. - 2019 - FPGA Bitstream Security A Day in the Life.pdf
FPGA Bitstream Security: A Day in the Life Adam Duncan∗, Fahim Rahman†, Andrew Lukefahr∗, Farimah Farahmandi†, Mark Tehranipoor†
∗Intelligent Systems Engineering, Indiana University, Bloomington, Indiana 47401 USA †Electrical and Computer Engineering, University of Florida, Gainesville, Florida 32611 USA
Abstract—Security concerns for field-programmable gate array (FPGA) applications and hardware are evolving as FPGA designs grow in complexity, involve sophisticated intellectual properties (IPs), and pass through more entities in the design and implementation flow. FPGAs are now routinely found integrated into system-on-chip (SoC) platforms, cloud-based shared computing resources, and in commercial and government systems. The IPs included in FPGAs are sourced from multiple origins and passed through numerous entities (such as design house, system integrator, and users) through the lifecycle. This paper thoroughly examines the interaction of these entities from the perspective of the bitstream file responsible for the actual hardware configuration of the FPGA. Five stages of the bitstream lifecycle are introduced to analyze this interaction: 1) bitstream-generation, 2) bitstream-at-rest, 3) bitstream-loading, 4) bitstream-running, and 5) bitstream-end-of-life. Potential threats and vulnerabilities are discussed at each stage, and both vendor-offered and academic countermeasures are highlighted for a robust and comprehensive security assurance.
Keywords—FPGA Security, Encryption, Bitstream Protection
I. INTRODUCTION
A field-programmable gate array (FPGA) is an integrated circuit with post-fabrication hardware programming capabilities used to implement custom functionality on a dedicated hardware platform [1]. Products ranging from low-cost consumer electronics to high-end commercial systems use FPGAs for reconfigurability, low development cost, and high performance [2]. The specific hardware functionality programmed into an FPGA is defined by a binary file commonly known as a bitstream, which is generated following a rigorous design, synthesis, and validation process. FPGAs are typically classified by the type of on-chip configuration memory used to store this bitstream file, with common examples being static random access memory (SRAM), Flash, and antifuse. Each configuration memory variant has associated performance, fabrication, and security tradeoffs as discussed in [2]. However, in each FPGA type, the primary FPGA-specific security concern eventually simplifies down to protecting the bitstream from either tampering or intellectual property (IP) piracy. Tampering with an FPGA bitstream can compromise the root of trust, and thus the security, of an entire system. Just as consequential, IP piracy conducted at the bitstream level can have an enormous financial impact on the design house and system manufacturer.
A simplified design flow illustrating the loading of a bitstream into an FPGA is depicted in Figure 1(a). The FPGA manufacturer, such as Xilinx, Intel, or Microsemi, first produces the FPGA integrated circuit (IC), along with the proprietary bitstream development software. The user loads the design into the bitstream development software to generate the bitstream file. The bitstream is then loaded into the FPGA configuration memory when the device is powered on for functional operation.
Early FPGAs could only hold simple designs, e.g., 1000 ASIC equivalent gates for Xilinx XC2064 [3], making this design flow tractable. However, FPGA technology has matured, and the size and the complexity of the FPGA have grown over time. The Xilinx VU19P device released in 2019 contains over 9-million logic cells, or roughly 90-million ASIC gates [4]. A design utilizing a significant portion of these logic resources often requires a large team of designers, incorporating multiple third party IP (3PIP) blocks and legacy designs. The VU19P, like most recent FPGAs, also allows for partial reconfiguration; that is, a system programmer can reconfigure the FPGA with partial bitstream updates while it operates in the field. Figure 1(b) shows the modern-day FPGA design flow with these additional entities interacting with each other and highlights their connection paths to the final bitstream responsible for the FPGA hardware configuration. As each entity shares a connection to the bitstream, they also pose a potential security threat to the authenticity, integrity, and confidentiality of the bitstream.
In this paper, we explore the journey an FPGA bitstream takes from conception to FPGA-based system obsolescence and present a comprehensive threat taxonomy to guide the reader. Industry and academic countermeasures are then presented to illustrate defenses against each threat. The reviewed protection mechanisms are composed of five stages: 1) bitstream-generation, 2) bitstream-at-rest, 3) bitstream-loading, 4) bitstream-running, and lastly, 5) bitstream-end-of-life (EOL). Our main contribution in this paper is to provide a comprehensive security assessment of the bitstream as it travels between these stages.
The rest of the paper is organized as follows: Related work and additional background information is provided in Section II. We introduce our bitstream lifecycle stages and present the threat taxonomy in Section III. Security threats and vulnerabilities, along with selected countermeasures, associated with the bitstream-generation stage are discussed in Section IV. Similar analysis is provided for the subsequent stages (bitstream-at-rest, bitstream-loading, bitstream-running, and bitstream-end-of-life) in Sections V, VI, VII, and VIII, respectively. Finally, the paper is concluded in Section IX.
Security Invited 1.1 978-1-7281-4823-6/19/$31.00 ©2019 IEEE
INTERNATIONAL TEST CONFERENCE 1
Authorized licensed use limited to: University College London. Downloaded on May 23,2020 at 18:03:46 UTC from IEEE Xplore. Restrictions apply.
Fig. 1: a) Classical view of the FPGA design flow. b) Modern FPGA design flow involving multiple entities.
II. BACKGROUND
Different entities involved in a modern and complex FPGA design flow are highlighted in Figure 1(b). The 3PIP design house produces generic or client-specific IPs for the system integrator. The system integrator obtains and integrates the 3PIPs with in-house IPs to produce the actual bitstream for the FPGA. The system programmer represents the entity in charge of loading the bitstream into the FPGA. Lastly, in-field refers to the FPGA operating in the field, such as inside a computer networking router, with its bitstream loaded into its physical configuration memory.
The physical configuration memory that stores the bitstream in an FPGA has a direct impact on the security and accessibility of a bitstream. SRAM-based FPGAs are the most common FPGA type, using volatile SRAM-based latches to store the bitstream. They are fabricated using standard state-of-the-art manufacturing processes allowing for high-performance and high-density [5]. However, they require off-chip bitstream storage, and must transmit the bitstream into the FPGA after it is powered on. Hence it is possible for an attacker to intercept the unprotected bitstream at the board level [5].
There also exist non-volatile FPGAs, such as Flash memory and antifuse-based devices, which store their bitstream inside the FPGA, eliminating the board-level bitstream interception problem. These FPGAs require additional manufacturing process steps and lag behind their SRAM-based counterparts in performance and density [5]. Academic researchers have also proposed FPGA designs utilizing emerging non-volatile memories such as magneto-resistive RAM (MRAM) to produce higher performance non-volatile FPGAs [6].
Irrespective of complexity and memory architecture, FPGA security issues eventually simplify down to unauthorized access to, and tampering with, the FPGA bitstream. For example, concerns may include attackers performing reverse engineering on proprietary IP or may involve the loading of an unauthorized design into an FPGA-based system to alter intended system behavior. Specific threats and countermeasures will be discussed throughout subsequent sections of this paper.
FPGA vendors have included bitstream protection features dating back to the earliest FPGAs. Xilinx published an application note in 1997 describing how to program the FPGA at a secure facility and use a battery to maintain power throughout the lifetime of the system, preventing an attacker from intercepting the bitstream [7]. In 2001, Xilinx introduced bitstream encryption into their Virtex-II devices using the Data Encryption Standard (DES) [8]. Here, the bitstream is encrypted with an encryption key that is stored securely within the FPGA. Without knowledge of the encryption key, an adversary cannot reverse engineer or copy the bitstream. Other FPGA manufacturers have since included bitstream encryption in their devices, with encryption standards eventually migrating to include variants of the newer Advanced Encryption Standard (AES) [5].
In 2009, on-chip bitstream authentication was included by Xilinx in their Virtex-6 devices [5]. This authentication implements a keyed-Hash Message Authentication Code (HMAC) algorithm in hardware to compute the hash digest of a bitstream. The digest is compared to a pre-computed reference digest before bitstream loading, and the loading is aborted upon a mismatch. In 2015, Microsemi added physically unclonable function (PUF) protection for the bitstream encryption keys in their IGLOO2 and SmartFusion2 devices [9]. The PUF uses the inherent physical properties of the IC to produce a device-specific digital signature at run-time. This PUF value is then incorporated into the bitstream encryption scheme so that an attacker cannot thwart the encryption protection by obtaining the on-chip encryption key alone. Xilinx and Intel also offer similar solutions for their Ultrascale+ and Stratix-10 devices, respectively [10], [11].
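The HMAC check described above can be illustrated in software. The following is a minimal sketch, not the vendor's hardware implementation: the key, the bitstream bytes, the choice of SHA-256, and the function names are all assumptions made for illustration.

```python
import hashlib
import hmac

def bitstream_digest(key: bytes, bitstream: bytes) -> bytes:
    """Keyed digest of the bitstream (SHA-256 is an assumed hash choice)."""
    return hmac.new(key, bitstream, hashlib.sha256).digest()

def authenticate_before_load(key: bytes, bitstream: bytes, reference: bytes) -> bool:
    """Return True only if the computed digest matches the reference digest.

    A False result models the device aborting the load.
    """
    computed = bitstream_digest(key, bitstream)
    # Constant-time comparison avoids leaking where the digests differ.
    return hmac.compare_digest(computed, reference)

# Hypothetical values for illustration only.
key = b"device-authentication-key"
good_bitstream = b"\x00\x01 configuration frames"
reference = bitstream_digest(key, good_bitstream)
tampered_bitstream = good_bitstream[:-1] + b"X"

accepted = authenticate_before_load(key, good_bitstream, reference)
rejected = authenticate_before_load(key, tampered_bitstream, reference)
print(accepted, rejected)
```

Because the digest is keyed, an attacker who modifies the bitstream cannot recompute a matching reference digest without also knowing the key.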
III. THREAT MODEL
The modern FPGA design flow experiences complex interactions among multiple involved entities as discussed in Section II. We present our threat taxonomy in Figure 2 to explore the threats and vulnerabilities facing the bitstream as it travels amongst these different FPGA entities. The top flow of Figure 2 illustrates the Design Flow Entities involved: 1) 3PIP Design House, 2) System Integrator, 3) System Programmer, 4) System in-Field, and lastly 5) Recycler.
Fig. 2: A taxonomy of the different threats facing a bitstream as it traverses through a modern FPGA design flow composed of multiple entities.
Our five Bitstream Stages are located below these entities and describe the different points involved in the journey of a bitstream. Bitstream-Generation refers to the stage where the bitstream is physically being generated by either FPGA design tools or other means. Bitstream-at-Rest defines the stage where a bitstream has been generated and is stored either on a computer, in a cloud repository, or in a non-volatile memory that is not currently configuring the FPGA. Bitstream-Loading describes the physical act of loading the bitstream from its resting state into the FPGA configuration memory. Bitstream-Running is the state where a bitstream has been loaded into the configuration memory, and the FPGA is operating according to its programmed hardware configuration. Lastly, Bitstream-EOL is used to describe the decommissioning of the bitstream as well as physical FPGA-related threats tangentially related to the FPGA bitstream.
The interaction between the design flow entities and bitstream stages illustrates the complexity involved in modern FPGA security. The first observation is that each design flow entity has a connection to more than one bitstream stage. For example, the in-field system may contain a bitstream stored in a non-volatile memory on a PCB, categorized as bitstream-at-rest. After the system powers up, it enters the bitstream-loading stage, and transitions into the bitstream-running stage after the bitstream reaches the FPGA configuration memory.
Several threat categories are provided for each bitstream stage as shown in the bottom of Figure 2. The taxonomy also lists two examples for each threat category. The subsequent sections of this paper will discuss these threats and associated countermeasures in detail with respect to each bitstream stage.
IV. BITSTREAM GENERATION
The life of a bitstream begins with an intended hardware design specification that is targeted towards the FPGA. The design specification is then translated into a complete design IP that is either developed by the user, outsourced as 3PIP, includes other licensed IP, or is a combination of all. At this point, the FPGA design software takes the IP and synthesizes it into FPGA resources according to the targeted FPGA models and specifications. These resources are then placed within the FPGA fabric and routed together to create a final configuration. This final configuration is ultimately specified as the bitstream. This bitstream generation process can be seen in Figure 3. Two important things can be observed from this figure. First, the design flow is similar to that of an ASIC, and as such, the design specification and IP steps share the same threats and countermeasures found in the ASIC literature. Second, the synthesis, place and route, and bitstream generation steps have distinct differences compared to an ASIC, due to their reconfigurability and the fact that the physical FPGA fabric, including the model-specific hardware information, is often public and known to an attacker.
Our taxonomy in Figure 2 divides threats in the bitstream-generation phase into two categories: malicious intent and non-malicious intent. Malicious intent refers to an attacker deliberately performing an attack, such as Trojan insertion or IP overuse during the generation of a bitstream. Non-malicious intent is presented to cover the expanding threat space where vulnerabilities are unintentionally introduced by the complex FPGA design tools generating final bitstreams.
Fig. 3: A simplified view of the FPGA bitstream generation flow.
A. Malicious Design Flow Threats
Trojan Attacks: Attacks on the design IP within the bitstream generation flow are often very similar to attacks on design IPs in an ASIC design flow. For example, hardware Trojan insertion [12] shares the same basic attack principles for register transfer level (RTL) design IP, independent of whether the IP is targeting an FPGA or an ASIC. Once the design IP has been synthesized into the specific design elements inside the FPGA, the synthesized blocks become vulnerable to Trojan insertion attack. Mal-Sarkar et al. discussed FPGA-specific post-synthesis threat vectors [13] as illustrated in Figure 4. Here, logic blocks contain programmable lookup tables (LUTs), combinational logic primitives such as adders, and sequential elements in the form of flip-flops and latches. These elements are combined to implement combinational and sequential logic functions. There are also routing elements: local interconnects, connection boxes, and switch boxes which are used to route outputs between logic blocks.
An attacker, such as a rogue employee with access to the design in this post-synthesis state, can change the properties of logic blocks to introduce a Trojan or modify the design functionality in some way. After synthesis, logic blocks go through a place and route step where they are placed within the FPGA fabric at specific locations. Similarly, routing elements are placed and configured to achieve the desired design functionality. An attacker here can potentially modify the placement of the blocks or add additional blocks to the design.
Lastly, the generated bitstream provides the correlation between the configuration memory inside the FPGA and the behavior of the logic blocks and routing elements. The bitstream can be attacked directly to modify the configuration memory, which in turn modifies the functionality of the FPGA, as will be discussed in Section V. During the synthesis and place and route steps, the designs are often checkpointed by the bitstream development software. These design checkpoints open the possibility for an insider to modify or insert elements in the design by editing the intermediate software file, or even by maliciously modifying the FPGA design software itself [14].
Trojan Countermeasures: Borrowing from ASIC Trojan detection work, techniques presented by Salmani et al. [15] can be used to detect Trojans in FPGA designs at the IP and synthesis levels. These techniques operate on the principle that the triggering of a Trojan is likely a rare occurrence, and thus potentially identified by profiling and/or simulating a design to probe for rarely activated logic. At the place and route level, techniques have been presented for ASICs using the built-in self-authentication (BISA) [16] approach to add a test infrastructure inside a design to test for the placement of additional malicious logic. Khaleghi et al. extended this concept into the FPGA space to fill unused Logic Blocks and routing elements with a test-verifiable dummy design to prevent attackers from utilizing unused FPGA resources to insert Trojans [17].
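The rare-activation principle mentioned above can be sketched with a toy simulation. This is only an illustration of the general idea of profiling for rarely activated logic, not the specific technique of Salmani et al.; the netlist, the node names, and the 5% threshold are hypothetical.

```python
from itertools import product

def netlist(x):
    """Hypothetical 8-input design; returns the value of each internal node."""
    n1 = x[0] & x[1]
    n2 = x[2] | x[3]
    n3 = x[4] ^ x[5]
    trig = int(all(x))          # fires for only 1 of 256 input patterns
    out = n1 ^ n2 ^ n3 ^ trig   # trigger silently corrupts the output
    return {"n1": n1, "n2": n2, "n3": n3, "trig": trig, "out": out}

def rare_nodes(sim, n_inputs, threshold=0.05):
    """Exhaustively simulate and flag nodes that rarely take one of their values."""
    ones = {}
    total = 0
    for x in product((0, 1), repeat=n_inputs):
        total += 1
        for name, value in sim(x).items():
            ones[name] = ones.get(name, 0) + value
    rare = []
    for name, count in ones.items():
        p_one = count / total
        if min(p_one, 1 - p_one) < threshold:
            rare.append(name)
    return rare

suspects = rare_nodes(netlist, 8)
print(suspects)
```

Exhaustive simulation is only feasible for very small input counts; a realistic flow would presumably rely on random or statistically guided stimulus instead.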
Fig. 4: The fundamental building blocks of the FPGA with highlighted Trojan insertion points [13].
IP Piracy Attacks: IP piracy-based threats in the bitstream-generation phase [18] involve common threats familiar to ASIC and software IP piracy. IP overuse refers to implementing more instances of an IP than specified by the IP licensing agreement, and it is becoming a larger threat as the market for FPGA IP grows. IPs without specific licensing protection are vulnerable to an attacker generating more bitstreams than allowed. IP theft, IP reuse, and IP reverse engineering are also a growing concern as techniques have been published discussing tool flows to convert between intermediate formats in the FPGA design flow cycle [19]. The specific issues regarding the direct manipulation of the bitstream at the end of the FPGA design flow are discussed in Section V.
IP Piracy Countermeasures: To detect the instances of IP piracy, watermarks [20] can be inserted by the user in the IP design stage and then evaluated at a later time to provide a proof of authorship. FPGA vendor software packages currently offer the distribution of third-party IP encrypted using IEEE standard p1735 [21] to protect against reverse engineering activities. To further protect against IP overuse, researchers have proposed methodologies incorporating a PUF response from the chip into the licensing to generate a device-specific key to enable design functionality within a given device [22]. The general concept is shown in Figure 5, where a locked bitstream component containing the IP is stored alongside a challenge in a non-volatile memory. At runtime, the PUF is evaluated, and its response is used to unlock the IP to enable its design functionality. Two-party variants of these licensing schemes have been proposed as well to improve efficiency [23], [24]. Logic obfuscation is another powerful technique used at the design level to defend against IP piracy [25]. Typical logic obfuscation schemes integrate logic locking gates into a design to disable normal functionality unless correct values are applied to the logic locking gate inputs.
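The PUF-based licensing concept can be sketched as follows. This is a minimal illustration under stated assumptions, not the scheme from [22]: the PUF is modeled as a hash of a per-device secret, the lock is a toy XOR stream (not a vetted cipher), and all names and values are hypothetical.

```python
import hashlib

def puf_response(device_secret: bytes, challenge: bytes) -> bytes:
    """Software stand-in for a physical PUF: device-unique, repeatable response."""
    return hashlib.sha256(device_secret + challenge).digest()

def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream from SHA-256 in counter mode (illustration only)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]

def xor_with_key(data: bytes, key: bytes) -> bytes:
    """Lock or unlock data; XOR makes the two operations identical."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))

# Hypothetical licensing flow: the IP is locked for device A at enrollment.
ip_block = b"licensed IP configuration data"
challenge = b"challenge-01"
device_a = b"silicon-variation-A"
device_b = b"silicon-variation-B"

locked_ip = xor_with_key(ip_block, puf_response(device_a, challenge))

# At runtime each device evaluates its own PUF with the stored challenge.
unlocked_on_a = xor_with_key(locked_ip, puf_response(device_a, challenge))
unlocked_on_b = xor_with_key(locked_ip, puf_response(device_b, challenge))
print(unlocked_on_a == ip_block, unlocked_on_b == ip_block)
```

Only the enrolled device reproduces the response needed to unlock the IP, so copying the locked component and challenge to a second device does not yield a working design.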
B. Non-Malicious Threats
Fig. 5: A PUF-based licensing scheme binding IP to specific FPGA devices to prevent FPGA IP piracy and overuse [22].
Attacks: Traditionally, the vendor bitstream generation tools do not inherently offer security checking while they implement the design flow. Consequently, there may be unintended security vulnerabilities introduced during the translation of a design into the logic blocks inside the FPGA. High-level synthesis (HLS) specifically creates a higher level of design abstraction for logic block representation, which may increase the probability of unintended vulnerabilities. An example of this may be an AES encryption engine implementation, which is cryptographically secure at the C language abstraction but leaks information when it is synthesized to hardware logic for an FPGA implementation [26]. Research findings compromising FPGA bitstream generation tools at various stages have also been published [14].
Countermeasures: The defense against tool-induced vulnerabilities first begins by adhering to proven best software security processes and verifying the tool authenticity through the use of a trusted vendor-provided hash during tool download and installation. At the design level, researchers have proposed a moving target defense to defend against attacks originating from malicious FPGA software tooling [27]. The movement of the target in this situation is the randomization of the synthesis and place and route operations by the vendor tools so that the attacker cannot predict the necessary information from a user design required to conduct a meaningful attack. Researchers have addressed high-level Trojan insertion by proposing Trojan-aware HLS [26]. Here, equivalence checking is performed between the original higher-level code and the lower-level code during the design space exploration (DSE) of the HLS operation. Moreover, a set of security properties can be developed to be used in formal verification tools to ensure the safe design translation. FPGA software vendors can also provide design checkpoint hashes which should be used along with proven best practices for software security during the design development.
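Verifying a tool download against a vendor-published hash, as suggested above, might look like the following sketch. The installer bytes and the published hash are made up for illustration, and SHA-256 is an assumed choice; an actual vendor may publish a different digest type.

```python
import hashlib
import io

def sha256_of_stream(stream, chunk_size=65536):
    """Hash a binary stream in chunks so large installers need not fit in memory."""
    h = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        h.update(chunk)
    return h.hexdigest()

def installer_is_authentic(stream, published_hash: str) -> bool:
    """Compare the computed digest with the hash published by the vendor."""
    return sha256_of_stream(stream) == published_hash.strip().lower()

# Hypothetical installer contents and published hash.
installer = b"pretend FPGA tool installer"
published = hashlib.sha256(installer).hexdigest()

ok = installer_is_authentic(io.BytesIO(installer), published)
bad = installer_is_authentic(io.BytesIO(installer + b"!"), published)
print(ok, bad)
```

The check is only as trustworthy as the channel used to obtain the published hash, which should be independent of the download itself.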
V. BITSTREAM-AT-REST
After a bitstream has been generated, it needs to be stored someplace so that it can eventually be loaded onto the FPGA to perform its intended function. We use the term bitstream-at-rest to define this storage state. Storage locations for this stage include the hard disk of the computer used to run the FPGA development software during bitstream generation, a non-volatile memory used to configure the FPGA upon the application of power, or even a software repository of system-level firmware images containing an FPGA bitstream. Bitstreams in this state may be stored in their encrypted or plaintext versions, and are vulnerable to tampering or IP extraction.
In order to develop an attack against a bitstream or to extract its IP, a relationship between the bitstream and its hardware behavior must be established. The format of a bitstream for Xilinx, Intel, and Microsemi FPGAs is vendor proprietary and often serves as the first line of defense against such threats. However, multiple researchers have published techniques to successfully reverse engineer a vendor-proprietary plaintext bitstream into a netlist [29], [28]. The flow by Zhang et al. [28] depicted in Figure 6 is an example of one such technique where the bitstream is parsed into a functioning netlist that can be simulated and analyzed. Once a netlist has been obtained, it can be used to analyze a design for Trojans [28], or used for malicious activities such as IP piracy or tampering.
Our threat taxonomy in Figure 2 divides threats in the bitstream-at-rest stage into bitstream tampering and IP piracy categories. Example attacks and countermeasures are presented below.
A. Bitstream Tampering
Attacks: Chakraborty et al. first introduced the concept of Trojan insertion using plaintext bitstream manipulations in 2013 [30]. Since then, researchers have increased the sophistication of bitstream manipulation attacks to create automated blind attacks on soft IP blocks within an FPGA, such as soft-encryption cores [31]. Bitstream reverse engineering can further refine the scope of bitstream-based attacks to target specific soft IP blocks inside the FPGA fabric [32].
Fig. 6: A reverse engineering workflow translating a decrypted bitstream into a netlist [28].
Countermeasures: Techniques proposed by Kamali et al. [33] and Karam et al. [34] help defend against tampering attacks by applying logic locking at the bitstream level to help obfuscate the netlist functionality from the attacker. To accomplish the logic locking, the authors construct keys at runtime using PUFs implemented within the FPGA fabric. The PUF responses for each device are then connected to lookup tables (LUTs) within the user design to act as a key so that correct design functionality will only occur if the correct key value is applied. Hence, if the attacker does not know the correct key values, the attacker will not be able to extract the correct functional netlist and as a consequence, will not be able to find the desired node to tamper.
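The idea of a LUT whose function depends on key inputs can be sketched as below. This is a simplified illustration rather than the scheme of [33] or [34]: the key bits would come from a PUF in the fabric but are hard-coded here, and inverting the output for every wrong key is an assumed, deliberately simple corruption rule.

```python
def bits_to_index(bits):
    """Pack a tuple of bits (LSB first) into a LUT address."""
    return sum(bit << i for i, bit in enumerate(bits))

def lock_lut(table, n_inputs, key):
    """Expand a LUT with extra key inputs.

    The original truth table appears only under the correct key; every
    other key value selects the inverted function.
    """
    correct = bits_to_index(key)
    locked = []
    for k in range(1 << len(key)):
        for idx in range(1 << n_inputs):
            bit = table[idx]
            locked.append(bit if k == correct else bit ^ 1)
    return locked

def eval_lut(locked, n_inputs, inputs, key):
    """Read the locked LUT: key bits form the upper address bits."""
    return locked[(bits_to_index(key) << n_inputs) | bits_to_index(inputs)]

# 2-input AND, addressed as a | (b << 1); the key value is hypothetical.
and_table = [0, 0, 0, 1]
puf_key = (1, 0)
locked = lock_lut(and_table, 2, puf_key)

with_correct_key = [eval_lut(locked, 2, (a, b), puf_key) for b in (0, 1) for a in (0, 1)]
with_wrong_key = [eval_lut(locked, 2, (a, b), (0, 0)) for b in (0, 1) for a in (0, 1)]
print(with_correct_key, with_wrong_key)
```

An attacker who reverse engineers the LUT contents without the key sees a larger table and cannot tell which quarter of it encodes the intended function.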
B. IP Piracy
Attacks: Another attack goal may be to extract proprietary IP from a bitstream. Motivations for this may include not paying for IP and then subsequently using the IP illegally in a design, or reverse engineering an IP to extract proprietary information that may be used commercially.
Countermeasures: Inserting watermarks at the bitstream level has been proposed by Schmid et al. [35]. In contrast to RTL watermarking, this technique directly embeds a watermark into the LUT contents of a design. Watermark extraction and comparison is then performed at the bitstream level to determine authorship.
C. Single Key Encryption
Countermeasures: Modern Xilinx, Intel, and Microsemi FPGAs offer encrypted versions of the generated bitstreams to increase resistance to bitstream tampering, piracy, and reverse engineering activities. In fact, modern Microsemi FPGAs such as the Polarfire, Igloo2, and SmartFusion2 series, only store encrypted versions of their bitstream [36]. While researchers have demonstrated bitstream tampering attacks on encrypted bitstreams [31] that have an observable behavior, encryption makes targeted tampering attacks infeasible without knowledge of the encryption key.
D. Red/Black Encryption
Countermeasures: The concept of a red/black encryption scheme has been adopted by Intel, Xilinx, and Microsemi in their respective Stratix 10, Ultrascale+, and Polarfire product lines. The basic concept is outlined in Figure 7 for the Xilinx Ultrascale+ Zynq [37]. Here, the red key used to decrypt the bitstream is not stored directly inside the FPGA. Instead, a device-specific PUF is used to generate a black key that is stored inside the FPGA. Upon power-up, the PUF is exercised, and its response (black key) is used to generate the red key for decrypting the bitstream used to populate the FPGA fabric. As the black key is not directly used to encrypt the bitstream, it cannot be used by itself to decrypt the bitstream by an attacker.
Fig. 7: A red/black encryption key flow where the key decrypting the bitstream at runtime is obfuscated from the key stored inside the FPGA [37].
VI. BITSTREAM-LOADING
The Bitstream-Loading stage loads the bitstream into the configuration memory of the FPGA. The specifics of this stage vary depending upon the configuration memory variant from different vendors and FPGA device models. Non-volatile memory-based FPGAs, such as Flash-based or antifuse-based, only experience this stage when loading a new bitstream. SRAM-based FPGAs require the bitstream to be loaded every time power is applied to the FPGA to turn it on for functional application. A set of on-chip authentication and decryption circuitry is often employed by the FPGA during this stage to both authenticate and decrypt the bitstream before loading it into the configuration memory. FPGA-based system-on-chip devices, such as the Xilinx Zynq family, add additional features to the bitstream loading process in terms of bootloaders as well as physical processor cores implemented on the same silicon. Attacks considered within this stage originate from unintended side channels as well as loading outdated, and potentially vulnerable, bitstream versions.
Our threat taxonomy in Figure 2 divides threats in the bitstream-loading stage into two categories. First, we discuss side channel threats which involve the extraction of sensitive on-chip information during the loading of the bitstream. Second, we discuss replay attacks, where older, or potentially unauthorized versions of a bitstream are loaded into the FPGA.
A. Side Channel Threats
Attacks: Side channel attacks (SCA) have been applied to earlier generations of FPGAs to extract encryption keys by collecting information through unintended side channels. Encryption keys have been extracted in early generations of FPGAs by researchers analyzing the power consumption during the decryption process [38]. Similarly, information from a user design running in the fabric has been shown to leak in the electromagnetic spectrum [39]. Recently, laser-based approaches have shown the ability to read out on-chip information [40].
Countermeasures: FPGA vendors have addressed these attacks by implementing side-channel defenses [36], [10], [11] in their latest products to eliminate the leakage of key material. Key rolling limits the amount of data processed under any one key, and therefore the number of samples an attacker can collect for it, defeating attacks that rely upon multiple samples such as differential power analysis (DPA). The red/black scheme discussed in Section V-D reduces the impact of a key being extracted as well, since the key stored in the non-volatile memory of the FPGA is not the final key used to encrypt or decrypt the bitstream. In addition, the academic research community has proposed physical techniques such as nanopyramids [41] to defend against these attacks. Nanopyramids are intended to be inserted in the device manufacturing flow to introduce random changes in the optical reflectance properties of silicon when conducting optical probing attacks, preventing an attacker from using reflectance information to reveal information about the corresponding circuit storing key material.
Fig. 8: Xilinx secure boot process flow where user code first stage bootloader (FSBL) loads the bitstream into the programmable logic of an FPGA-based SoC [37].
B. Bitstream Replay Threats
Attacks: Bitstream versioning refers to the concept of having multiple versions of a bitstream for a given FPGA-based system. Analogous to the software world, a vulnerability can be discovered within an FPGA bitstream, requiring an updated bitstream to be loaded into the device. However, if the FPGA has already been deployed in the field, an adversary can potentially downgrade it to use the original bitstream containing the vulnerability [36]. Classical encryption and authentication techniques do not protect against this concern as the original vulnerable bitstream was encrypted and authenticated with the same encryption key as the updated bitstream. The act of securely transmitting an updated bitstream to the FPGA is another security concern.
Countermeasures: Microsemi addresses replay attacks by implementing a versioning control in their bitstream, combined with setting different non-volatile version control bits within their FPGA [36]. Xilinx offers QuickBoot [42] as a solution to load different bitstream versions in different non-volatile memory locations depending upon bits set in the bitstream. Researchers have addressed possible security concerns with the concept of transmitting an updated bitstream to a device by implementing an authenticated station-to-station protocol [43] or implementing custom protocols within user logic [44].
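The version-control idea behind replay protection can be sketched with a monotonic minimum-version value. This is a generic anti-rollback illustration, not Microsemi's actual mechanism; the class, its fields, and the policy of raising the minimum on every successful load are assumptions.

```python
class Device:
    """Toy model of an FPGA with non-volatile version control bits."""

    def __init__(self, min_version: int = 0):
        # Models version bits that can only increase once programmed.
        self.min_version = min_version

    def load(self, bitstream_version: int, authenticated: bool = True) -> bool:
        """Accept a bitstream only if it is authentic and not older than allowed."""
        if not authenticated:
            return False
        if bitstream_version < self.min_version:
            return False  # replay of an outdated bitstream is rejected
        # Assumed policy: a successful load raises the minimum version.
        self.min_version = bitstream_version
        return True

fpga = Device()
first_load = fpga.load(1)   # original bitstream
update = fpga.load(2)       # patched bitstream
replay = fpga.load(1)       # attacker replays the vulnerable original
print(first_load, update, replay)
```

The replayed bitstream would still pass encryption and authentication checks, which is why the version comparison has to be a separate step.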
C. FPGA-based SoCs
Attacks: The introduction of the SoC-based FPGAs such as the Xilinx Zynq [37] and Microsemi SmartFusion [9] adds additional steps to the loading of the bitstream. SoC-based FPGAs introduce the concept of a first stage bootloader (FSBL), which is user code designed to facilitate the loading of the bitstream as well as to configure the non-FPGA aspects of the SoC, such as the processor and other hard IP blocks. Figure 8 shows the Xilinx “secure boot” implementation where immutable BootROM code is used to boot the SoC and run the user code within the FSBL, which eventually loads the bitstream [37]. Attacks in this boot process can thus result from running a malicious FSBL code in the SoC processor.
Countermeasures: To protect the privacy and integrity of the FSBL, SoCs typically implement an FSBL authentication scheme, such as the RSA-based authentication applied to the Xilinx Zynq series shown in Figure 9. Here, a public/private key pair is used to compare a hash signature on FSBL code with a hash signature stored in the FPGA’s non-volatile memory to only run authenticated FSBL code [45].
Fig. 9: Xilinx Zynq first stage bootloader (FSBL) authentication process [45].
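An RSA-style FSBL check can be sketched with textbook RSA. This is a toy under loud assumptions: the modulus is far too small to be secure, no padding is used, the boot image layout and function names are hypothetical, and the exact Zynq flow in [45] is not reproduced. The sketch assumes the device stores only a hash of the trusted public key in non-volatile memory.

```python
import hashlib

# Toy key built from two Mersenne primes (insecure size, illustration only).
P, Q = 2**61 - 1, 2**89 - 1
N, E = P * Q, 65537
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (Python 3.8+)

def digest_int(data: bytes, n: int) -> int:
    """SHA-256 digest reduced modulo n (textbook RSA, no padding)."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n

def sign(data: bytes) -> int:
    """Signing is done offline by the holder of the private key."""
    return pow(digest_int(data, N), D, N)

def pubkey_hash(n: int, e: int) -> str:
    return hashlib.sha256(f"{n}:{e}".encode()).hexdigest()

def bootrom_accepts(fsbl: bytes, signature: int, n: int, e: int, stored_hash: str) -> bool:
    """Model of the boot check: trust the key first, then the FSBL signature."""
    if pubkey_hash(n, e) != stored_hash:
        return False  # public key in the boot image is not the enrolled one
    return pow(signature, e, n) == digest_int(fsbl, n)

fsbl = b"first stage bootloader code"
signature = sign(fsbl)
stored_hash = pubkey_hash(N, E)  # programmed into non-volatile memory

genuine = bootrom_accepts(fsbl, signature, N, E, stored_hash)
tampered = bootrom_accepts(fsbl + b" + implant", signature, N, E, stored_hash)
rogue_key = bootrom_accepts(fsbl, signature, N, 3, stored_hash)
print(genuine, tampered, rogue_key)
```

Storing only the hash of the public key keeps the on-chip non-volatile footprint small while still binding the device to a single signing key.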
VII. BITSTREAM-RUNNING
The Bitstream-Running stage is the stage in which the bitstream has been loaded into the configuration memory and the FPGA is operating according to its hardware configuration. As shown in our threat taxonomy in Figure 2, this stage is vulnerable to fault injection and run-time threats originating within the fabric. Faults injected into the configuration memory, or directly into logic blocks and routing resources, can modify the functionality of the FPGA and are a primary concern in this stage. To correct faults in the configuration memory, and to provide the FPGA designer with more flexibility, modern FPGAs also include a partial reconfiguration framework within their architectures to allow the bitstream to change dynamically at run-time. Partial reconfiguration allows the FPGA design itself to update portions of the design at run-time while keeping the remainder of the design intact.
We divide threats in the bitstream-running stage into two categories according to our threat model in Figure 2. First, we discuss fault injection on a running FPGA design. Next, we discuss the emerging topic of run-time attacks.
A. Fault Injection
Threats: Faults may be injected into an FPGA running a design through a variety of means, such as clock glitches, power glitches, electromagnetic pulses, laser exposure, or ionizing radiation [46]. The physical mechanisms behind fault injection are rooted in the transistors themselves. Therefore, threats to integrated circuits in general can be considered applicable to FPGAs. For example, random bit flips in the configuration memory caused by atmospheric single event upsets (SEUs) [47] become more of a concern as technology feature sizes shrink, and are thus more of a concern for newer FPGAs fabricated in state-of-the-art manufacturing processes. Targeted laser-based fault injection [48] has also been discussed by researchers, primarily as a means to replicate SEUs for hardening designs against space radiation effects.
Countermeasures: Partial reconfiguration cores, such as the internal configuration access port (ICAP) for Xilinx FPGAs, are included in modern FPGAs. Providing a user design with access to the partial reconfiguration core in an FPGA has been largely regarded as a security vulnerability [5], as the user design can then read and write any area of the configuration memory. However, partial reconfiguration is also used as a mechanism to detect and correct inadvertent bit flips in the configuration memory, such as those caused by radiation-induced single event upsets [49]. Several academic papers have proposed the use of the Xilinx ICAP to read the configuration memory of an FPGA at run-time and generate a hash for comparison against an expected hash in order to detect run-time tampering [50], [51]. In any usage of a partial reconfiguration core, proper safeguards must be put in place.
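The readback-and-hash check described above can be sketched as follows. This is an illustration under stated assumptions, not the method of [50] or [51]: the configuration memory is modeled as a list of byte strings, `read_frames` is a hypothetical callable standing in for an ICAP readback, and SHA-256 is chosen here only for concreteness.

```python
import hashlib
from typing import Callable, Iterable, List


def config_digest(frames: Iterable[bytes]) -> str:
    """SHA-256 over configuration frames, taken in readback order."""
    digest = hashlib.sha256()
    for frame in frames:
        digest.update(frame)
    return digest.hexdigest()


def tamper_detected(read_frames: Callable[[], Iterable[bytes]],
                    expected: str) -> bool:
    """Compare a fresh readback digest against the expected (golden) digest."""
    return config_digest(read_frames()) != expected


# Hypothetical two-frame configuration memory, for illustration only.
golden: List[bytes] = [bytes([0x00] * 8), bytes([0xFF] * 8)]
expected = config_digest(golden)

upset = [golden[0], bytes([0xFF] * 7 + [0xFE])]  # a single flipped bit

print(tamper_detected(lambda: golden, expected))  # unchanged memory
print(tamper_detected(lambda: upset, expected))   # modified memory
```

A hash comparison of this kind only detects a change. Locating and correcting the affected frame is the job of configuration scrubbing [49].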
B. Run-time Attacks
The massively parallel nature of FPGAs has led to their inclusion in data centers where users can purchase computing time. Amazon offers fee-based access to its FPGAs in the cloud through its Amazon Web Services (AWS) program [52]. Here, a shell architecture is described to abstract away communication links and create a separate application area in the FPGA. First, a design is created containing an ‘AWS partial reconfigurable (PR) shell’ to facilitate the loading of a user design. Next, the user design is loaded by the AWS PR shell to fit into the designated user ‘custom PR logic’ section of the floorplan. The shell protects a Peripheral Component Interconnect (PCI) Express connection in the ‘static’ region, manages clocking for the user region, and monitors activity elsewhere in the FPGA [53].
Threats: The shared FPGA computing resources described above have enabled a new class of remote side-channel attack, where one bitstream can leak or corrupt information in another bitstream from a remote location. A typical attack runs a user design, such as an oscillator-based array, on a shared FPGA fabric in order to affect another user’s design [54], [55]. As illustrated in Figure 10(a), Schellenberg et al. showed that a malicious bitstream can extract secrets from a victim bitstream sharing the same FPGA fabric [54] by corrupting the power in the shared power distribution network (PDN). They also showed that malicious code running in the FPGA SoC can corrupt the PDN in a similar way to extract secrets from a victim FPGA design (see Figure 10(b)). Similarly, research has illustrated the possibility of a shared system-wide resource, like the printed circuit board (PCB) PDN, being used to affect an FPGA design. Here, the PCB PDN is corrupted by another PCB component to induce faults in or leak information from the FPGA [55].
Countermeasures: Vendor-provided defenses to this type of attack include a bitstream-level screening of tenant bitstreams to check for suspicious functionality, such as multiple parallel ring-oscillator arrays that could potentially create power glitching. Modern FPGAs also incorporate on-chip voltage and temperature sensors to allow for the detection of anomalies in a shared FPGA resource, like the PDN. Vendor-provided soft-core defenses, such as the Xilinx Security Monitor (SecMon) core, read the on-chip voltage and temperature sensors, as well as the configuration memory health, to detect anomalous behavior and implement tampering penalties such as the zeroization of the configuration memory or AES keys, or asserting the global reset of the FPGA [56].
Fig. 10: Remote side-channel attack where user bitstream information can be extracted by a) malicious bitstreams on the same FPGA, or b) malicious CPU code on the same FPGA SoC [54].
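One simple heuristic for the tenant screening mentioned above is to look for combinational loops, since a ring oscillator is a feedback path with no register in it. The sketch below is an assumption made for illustration, not a description of any vendor's screening tool: the netlist is modeled as a hypothetical fan-out dictionary, and `has_combinational_loop` is a plain depth-first cycle search that treats registers as loop breakers.

```python
from typing import Dict, List, Set


def has_combinational_loop(fanout: Dict[str, List[str]],
                           registers: Set[str]) -> bool:
    """Return True if any cycle exists that does not pass through a register."""
    WHITE, GREY, BLACK = 0, 1, 2
    color: Dict[str, int] = {}

    def visit(node: str) -> bool:
        color[node] = GREY
        for nxt in fanout.get(node, []):
            if nxt in registers:
                continue  # a flip-flop breaks the combinational path
            state = color.get(nxt, WHITE)
            if state == GREY:
                return True  # back edge: combinational feedback found
            if state == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color.get(n, WHITE) == WHITE and visit(n)
               for n in fanout if n not in registers)


# Hypothetical netlists: a three-inverter ring and a registered feedback loop.
ring = {"inv1": ["inv2"], "inv2": ["inv3"], "inv3": ["inv1"]}
registered = {"lut": ["ff"], "ff": ["lut"]}

print(has_combinational_loop(ring, registers=set()))         # ring oscillator
print(has_combinational_loop(registered, registers={"ff"}))  # clocked loop
```

A check like this is only a first filter. Power-wasting structures can also be built without an obvious combinational loop, so it would complement, not replace, the run-time sensors described above.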
VIII. BITSTREAM-END-OF-LIFE
The last stage in our bitstream’s lifecycle is defined as bitstream-end-of-life. We use this term to represent both the end-of-life (EOL) of a bitstream as well as a stage to capture threats to the physical FPGA device that are tangentially related to the bitstream. EOL in this context refers to when the FPGA running a bitstream has been decommissioned. This could refer to a formal decommission and destruction of a high-value proprietary system or the casual disposal of an FPGA-based networking router in a public trash can. For other threats related to the bitstream, we briefly address FPGA device counterfeiting and reverse engineering.
We again refer to our threat taxonomy in Figure 2 and focus our discussion on two categories: data remanence and FPGA device counterfeiting.
A. Bitstream Remanence
Threats: As FPGA-based systems reach EOL, their bitstreams are also retired from use. A bitstream residing in an on-board non-volatile memory chip initially designed to program an SRAM-based FPGA may remain on that board indefinitely, remaining vulnerable to potential bitstream reverse engineering activities. Similarly, a bitstream stored in a Flash or antifuse-based FPGA may remain on the FPGA after the system has been disposed of, creating a potential opportunity for bitstream extraction.
Countermeasures: FPGA vendors have incorporated zeroization mechanisms into their on-chip security features that allow certain information stored within an FPGA to be deleted by a user or as a tamper penalty. This information may include the original bitstream, or any other volatile and non-volatile information inside the FPGA. The Microsemi PolarFire FPGA family offers three levels of zeroization: like-new, recoverable, and unrecoverable [36]. The like-new option deletes user data and keys and returns the device to its factory state. Recoverable is more comprehensive and places the device in a state that is only recoverable by a Microsemi factory programming file. The unrecoverable option is the most thorough, incorporating the destruction of all on-chip data. Serialization certificates are provided by the device in all three cases via a JTAG/SPI instruction to prove that the operation was successful. Similar procedures are provided by other manufacturers as well [56].
Fig. 11: Programmable ICs (FPGAs) were the counterfeit device type most often reported by ERAI in 2017 [57].
B. FPGA Device Counterfeiting
Threats: FPGAs are frequently among the most commonly counterfeited IC device types. As shown in Figure 11, ERAI listed FPGAs as accounting for approximately 20 percent of its reported counterfeit part instances [57]. One FPGA counterfeiting technique involves selling used devices as new devices. Remarking devices to represent more expensive devices is also a concern, as legacy and industrial/military grade FPGAs sell for a significant premium compared to their standard counterparts. These counterfeit FPGAs pose a threat to systems, as their electrical and mechanical specifications as well as their reliability may be compromised.
Countermeasures: FPGA vendors have addressed counterfeiting both by offering newer FPGA product lines designed to serve as drop-in replacements for legacy FPGAs and by making their devices more difficult to counterfeit. Xilinx offers pin-compatible devices in their newer Ultrascale+ product lines that can replace older Ultrascale devices [58]. The major FPGA vendors also offer device-specific markings, such as unique packaging lid shapes, that defend against simple package remarking attacks [59]. Academic researchers have proposed electrically characterizing oscillator structures programmed into the FPGA to expose the effects of aging mechanisms, such as negative bias temperature instability (NBTI) and hot carrier injection (HCI), that can indicate whether an FPGA has been used previously [60].
FPGA vendors have begun incorporating fabric-accessible mask-level device serial numbers and lot numbers, such as the DeviceDNA information found in Xilinx FPGAs, to allow users to determine the authenticity of a given FPGA. DeviceID information indicating an FPGA product family line is often accessible through the IEEE joint test action group (JTAG) interface as well. Other counterfeit detection techniques designed for ASICs are also applicable to FPGAs but are considered out of scope for this paper [61].
IX. CONCLUSION
Our journey through the life of a bitstream has now come to an end. We have ventured through five different stages within the bitstream lifecycle: 1) bitstream-generation, 2) bitstream-at-rest, 3) bitstream-loading, 4) bitstream-running, and 5) bitstream-end-of-life. Each stage offered a connection to different entities in the FPGA design flow and contained unique threats along with countermeasures available from both FPGA vendors and academia. A threat taxonomy was introduced to capture the complex interactions between the bitstream stages and the design flow entities and highlight stage-specific threats.
Our threat taxonomy divided threats into two broad categories according to each distinct bitstream stage. More specific threats and countermeasures were discussed in each threat category to help inform the reader of the current state of the art. As with any security-based research, a holistic approach towards security is recommended for each design flow entity to identify the pertinent threats and implement appropriate countermeasures.
REFERENCES
[1] Microsemi, “Field-programmable gate array technology.” Norwell, MA, USA: Kluwer, 1994.
[2] S. Trimberger, “Three ages of fpgas: A retrospective on the first thirty years of fpga technology,” Proceedings of the IEEE, vol. 103, no. 3, pp. 318–331, 2015.
[3] W. Carter, K. Duong, R. H. Freeman, H. Hsieh, J. Y. Ja, J. E. Mahoney, L. T. Ngo, and S. L. Sze, “A user programmable reconfigurable gate array,” in Proceedings Custom Integrated Circuits Conference, pp. 233– 235, IEEE, 1986.
[4] Xilinx, “Ultrascale fpga product tables and product selection guide..” Xilinx, 2016.
[5] S. M. Trimberger and J. J. Moore, “Fpga security: Motivations, features, and applications,” Proceedings of the IEEE, vol. 102, no. 8, pp. 1248– 1265, 2014.
[6] W. Zhao, E. Belhaire, C. Chappert, and P. Mazoyer, “Spin transfer torque (stt)-mram–based runtime reconfiguration fpga circuit,” ACM Transactions on Embedded Computing Systems (TECS), vol. 9, no. 2, p. 14, 2009.
[7] Xilinx, “Configuration issues: Power-up, volatility, security, battery back-up.” Xilinx, Appl. Note XAPP092, 1997.
[8] Xilinx, “Method and apparatus for protecting proprietary configuration data for programmable logic devices.” U.S. Patent 6 654 889, 2003.
[9] Microsemi, “Ug0443 user guide smartfusion2 and igloo2 fpga security and best practices.” 2015.
[10] E. Peterson, “Xapp1098 (v1.3): Developing tamper-resistant designs with ultrascale and ultrascale+ fpgas,” 2018.
[11] Intel, “Ug-s10security:intel stratix 10 device security user guide,” 2019.
[12] M. Tehranipoor and F. Koushanfar, “A survey of hardware trojan taxonomy and detection,” IEEE design & test of computers, vol. 27, no. 1, pp. 10–25, 2010.
[13] S. Mal-Sarkar, A. Krishna, A. Ghosh, and S. Bhunia, “Hardware trojan attacks in fpga devices: threat analysis and effective counter measures,” in Proceedings of the 24th Edition of the Great Lakes Symposium on VLSI, pp. 287–292, ACM, 2014.
[14] C. Krieg, C. Wolf, and A. Jantsch, “Malicious lut: a stealthy fpga trojan injected and triggered by the design flow,” in Proceedings of the 35th International Conference on Computer-Aided Design, p. 43, ACM, 2016.
[15] H. Salmani and M. Tehranipoor, “Analyzing circuit vulnerability to hardware trojan insertion at the behavioral level,” in 2013 IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems (DFTS), pp. 190–195, IEEE, 2013.
[16] K. Xiao and M. Tehranipoor, “Bisa: Built-in self-authentication for preventing hardware trojan insertion,” in 2013 IEEE international symposium on hardware-oriented security and trust (HOST), pp. 45–50, IEEE, 2013.
[17] B. Khaleghi, A. Ahari, H. Asadi, and S. Bayat-Sarmadi, “Fpga-based protection scheme against hardware trojan horse insertion using dummy logic,” IEEE Embedded Systems Letters, vol. 7, no. 2, pp. 46–50, 2015.
[18] A. Lesea, “Ip security in fpgas,” Xilinx, http://direct.xilinx.com/bvdocs/whitepapers/wp261.pdf, 2007.
[19] N. Steiner, A. Wood, H. Shojaei, J. Couch, P. Athanas, and M. French, “Torc: towards an open-source tool flow,” in Proceedings of the 19th ACM/SIGDA international symposium on Field programmable gate arrays, pp. 41–44, ACM, 2011.
[20] A. K. Jain, L. Yuan, P. R. Pari, and G. Qu, “Zero overhead watermarking technique for fpga designs,” in Proceedings of the 13th ACM Great Lakes symposium on VLSI, pp. 147–152, ACM, 2003.
[21] IEEE, “Ieee recommended practice for encryption and management of electronic design intellectual property (ip).ieee sa-1735-2014.” 2014.
[22] J. Zhang, Y. Lin, Y. Lyu, and G. Qu, “A puf-fsm binding scheme for fpga ip protection and pay-per-device licensing,” IEEE Transactions on Information Forensics and Security, vol. 10, no. 6, pp. 1137–1150, 2015.
[23] M. T. Rahman, D. Forte, Q. Shi, G. K. Contreras, and M. Tehranipoor, “Csst: an efficient secure split-test for preventing ic piracy,” in 2014 IEEE 23rd North Atlantic Test Workshop, pp. 43–47, IEEE, 2014.
[24] D. B. Roy, S. Bhasin, I. Nikolić, and D. Mukhopadhyay, “Combining puf with rluts: A two-party pay-per-device ip licensing scheme on fpgas,” ACM Transactions on Embedded Computing Systems (TECS), vol. 18, no. 2, p. 12, 2019.
[25] J. Rajendran, Y. Pino, O. Sinanoglu, and R. Karri, “Security analysis of logic obfuscation,” in Proceedings of the 49th Annual Design Automation Conference, pp. 83–89, ACM, 2012.
[26] A. Sengupta, S. Bhadauria, and S. P. Mohanty, “Tl-hls: methodology for low cost hardware trojan security aware scheduling with optimal loop unrolling factor during high level synthesis,” IEEE Transactions on computer-aided design of integrated circuits and systems, vol. 36, no. 4, pp. 655–668, 2016.
[27] Z. Zhang, Q. Yu, L. Njilla, and C. Kamhoua, “Fpga-oriented moving target defense against security threats from malicious fpga tools,” in 2018 IEEE International Symposium on Hardware Oriented Security and Trust (HOST), pp. 163–166, IEEE, 2018.
[28] T. Zhang, J. Wang, S. Guo, and Z. Chen, “A comprehensive fpga reverse engineering tool-chain: From bitstream to rtl code,” IEEE Access, vol. 7, pp. 38379–38389, 2019.
[29] F. Benz, A. Seffrin, and S. A. Huss, “Bil: A tool-chain for bitstream reverse-engineering,” in 22nd International Conference on Field Programmable Logic and Applications (FPL), pp. 735–738, IEEE, 2012.
[30] R. S. Chakraborty, I. Saha, A. Palchaudhuri, and G. K. Naik, “Hardware trojan insertion by direct modification of fpga configuration bitstream,” IEEE Design & Test, vol. 30, no. 2, pp. 45–54, 2013.
[31] P. Swierczynski, G. T. Becker, A. Moradi, and C. Paar, “Bitstream fault injections (bifi)–automated fault attacks against sram-based fpgas,” IEEE Transactions on Computers, vol. 67, no. 3, pp. 348–360, 2017.
[32] M. Ender, P. Swierczynski, S. Wallat, M. Wilhelm, P. M. Knopp, and C. Paar, “Insights into the mind of a trojan designer: the challenge to integrate a trojan into the bitstream,” in Proceedings of the 24th Asia and South Pacific Design Automation Conference, pp. 112–119, ACM, 2019.
[33] H. M. Kamali, K. Z. Azar, K. Gaj, H. Homayoun, and A. Sasan, “Lut-lock: A novel lut-based logic obfuscation for fpga-bitstream and asic-hardware protection,” in Proceedings VLSI (ISVLSI) 2018 IEEE Computer Society Annual Symposium on. EH-2001, pp. 405–410, IEEE, 2018.
[34] R. Karam, T. Hoque, S. Ray, M. Tehranipoor, and S. Bhunia, “Robust bitstream protection in fpga-based systems through low-overhead obfuscation,” in 2016 International Conference on ReConFigurable Computing and FPGAs (ReConFig), pp. 1–8, IEEE, 2016.
[35] M. Schmid, D. Ziener, and J. Teich, “Netlist-level ip protection by watermarking for lut-based fpgas,” in 2008 International Conference on Field-Programmable Technology, pp. 209–216, IEEE, 2008.
[36] Microsemi, “User guide polarfire fpga security.” Microsemi, User Guide UG07532, 2018.
[37] E. Peterson, “Xapp1323 (v1.1): Developing tamper-resistant designs with zynq ultrascale+ devices,” 2018.
[38] A. Moradi, A. Barenghi, T. Kasper, and C. Paar, “On the vulnerability of fpga bitstream encryption against power analysis attacks: extracting keys from xilinx virtex-ii fpgas,” in Proceedings of the 18th ACM conference on Computer and communications security, pp. 111–124, ACM, 2011.
[39] E. De Mulder, P. Buysschaert, S. Ors, P. Delmotte, B. Preneel, G. Vandenbosch, and I. Verbauwhede, “Electromagnetic analysis attack on an fpga implementation of an elliptic curve cryptosystem,” in EUROCON 2005 - The International Conference on “Computer as a Tool”, vol. 2, pp. 1879–1882, IEEE, 2005.
[40] S. Tajik, H. Lohrke, J.-P. Seifert, and C. Boit, “On the power of optical contactless probing: Attacking bitstream encryption of fpgas,” in Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, pp. 1661–1674, ACM, 2017.
[41] H. Shen, N. Asadizanjani, M. Tehranipoor, and D. Forte, “Nanopyramid: An optical scrambler against backside probing attacks,” in ISTFA 2018: Proceedings from the 44th International Symposium for Testing and Failure Analysis, p. 280, ASM International, 2018.
[42] Xilinx, “Quickboot method for fpga design remote update.” Xilinx, Appl. Note XAPP1081, 2014.
[43] J. Vliegen, N. Mentens, and I. Verbauwhede, “Secure, remote, dynamic reconfiguration of fpgas,” ACM Transactions on Reconfigurable Technology and Systems (TRETS), vol. 7, no. 4, p. 35, 2015.
[44] S. Drimer and M. G. Kuhn, “A protocol for secure remote updates of fpga configurations,” in International Workshop on Applied Reconfigurable Computing, pp. 50–61, Springer, 2009.
[45] E. Peterson, “Wp468 (v1.0): Leveraging asymmetric authentication to enhance security-critical applications using zynq-7000 all programmable socs,” Retrieved October, 2015.
[46] H. Li, G. Du, C. Shao, L. Dai, G. Xu, and J. Guo, “Heavy-ion microbeam fault injection into sram-based fpga implementations of cryptographic circuits,” IEEE Transactions on Nuclear Science, vol. 62, no. 3, pp. 1341–1348, 2015.
[47] A. Lesea, S. Drimer, J. J. Fabula, C. Carmichael, and P. Alfke, “The rosetta experiment: atmospheric soft error rate testing in differing technology fpgas,” IEEE Transactions on Device and Materials Reliability, vol. 5, no. 3, pp. 317–328, 2005.
[48] V. Pouget, A. Douin, G. Foucard, P. Peronnard, D. Lewis, P. Fouillat, and R. Velazco, “Dynamic testing of an sram-based fpga by time-resolved laser fault injection,” in 2008 14th IEEE International On-Line Testing Symposium, pp. 295–301, IEEE, 2008.
[49] J. Heiner, B. Sellers, M. Wirthlin, and J. Kalb, “Fpga partial reconfiguration via configuration scrubbing,” in 2009 International Conference on Field Programmable Logic and Applications, pp. 99–104, IEEE, 2009.
[50] T. Güneysu, I. Markov, and A. Weimerskirch, “Securely sealing multi-fpga systems,” in International Symposium on Applied Reconfigurable Computing, pp. 276–289, Springer, 2012.
[51] D. Owen Jr, D. Heeger, C. Chan, W. Che, F. Saqib, M. Areno, and J. Plusquellic, “An autonomous, self-authenticating, and self-contained secure boot process for field-programmable gate arrays,” Cryptography, vol. 2, no. 3, p. 15, 2018.
[52] D. Pellerin, “Announcing amazon ec2 f1 instances with custom fpgas.” https://www.slideshare.net/AmazonWebServices/announcing-amazon-ec2-f1-instances-with-custom-fpgas, retrieved April 13, 2017.
[53] S. Trimberger and S. McNeil, “Security of fpgas in data centers,” in 2017 IEEE 2nd International Verification and Security Workshop (IVSW), pp. 117–122, IEEE, 2017.
[54] F. Schellenberg, D. R. Gnad, A. Moradi, and M. B. Tahoori, “An inside job: Remote power analysis attacks on fpgas,” in 2018 Design, Automation & Test in Europe Conference & Exhibition (DATE), pp. 1111–1116, IEEE, 2018.
[55] M. Zhao and G. E. Suh, “Fpga-based remote power side-channel attacks,” in 2018 IEEE Symposium on Security and Privacy (SP), pp. 229–244, IEEE, 2018.
[56] Xilinx, “Security monitor ip core product brief.” Xilinx, Product Brief, 2015.
[57] D. Akhoundov, “2017 erai reported parts analysis.” http://www.erai.com/ERAI_Blog/3139/Damir_Akhoundov_2017_ERAI_Reported_Parts_Analysis.
[58] Xilinx, “Ultrascale architecture and product data sheet: Overview.” Xilinx, Datasheet DS890 (v3.10), 2019.
[59] Xilinx, “Xq ultrascale architecture data sheet: Overview.” Xilinx, Datasheet DS895 (v2.0), 2018.
[60] M. M. Alam, M. Tehranipoor, and D. Forte, “Recycled fpga detection using exhaustive lut path delay characterization,” in 2016 IEEE International test conference (ITC), pp. 1–10, IEEE, 2016.
[61] M. M. Tehranipoor, U. Guin, and D. Forte, “Counterfeit integrated circuits,” in Counterfeit Integrated Circuits, pp. 15–36, Springer, 2015.
sources/166/Trimberger and Moore - 2014 - FPGA Security From Features to Capabilities to Tr.pdf
FPGA Security: From Features to Capabilities to Trusted Systems
Steve Trimberger Xilinx
2100 Logic Dr. San Jose, CA 95124 USA
Jason Moore Xilinx
5051 Journal Center Boulevard NE. Albuquerque, NM 87109 USA
ABSTRACT
FPGA devices provide a range of security features which can provide powerful security capabilities. This paper describes many security features included in present-day FPGAs including bitstream authenticated encryption, configuration scrubbing, voltage and temperature sensors and JTAG-intercept. The paper explains the role of these features in providing security capabilities such as privacy, anti-tamper and protection of data handled by the FPGA. The paper concludes with an example of a single-chip cryptographic system, a trusted system built with these components.
Categories and Subject Descriptors
B.7.1. Integrated Circuits, Types and Design Styles. FPGA
General Terms
Design, Security
Keywords
FPGA, Trusted Design, Bitstream Encryption, Cryptography
1. INTRODUCTION
As FPGAs have grown in capability, the value of the applications in the FPGA has grown accordingly. Starting in the early 2000s, SRAM FPGA vendors offered bitstream encryption to protect their customers’ bitstreams from reverse-engineering. The usage of FPGAs has continued to grow into applications such as digital cinema, where the data handled by the FPGAs must be protected as well. Further, attacks on the operating FPGA device have grown in sophistication, leading FPGA vendors to provide additional security features. Today, FPGAs provide a large number of features to support secure configuration and operation.
2. FPGAS AND THE MANUFACTURING FLOW
The FPGA lifecycle includes two design flows: the base array design and the application design (Figure 1), and security must be maintained through both [8]. The base array design is a standard integrated circuit development flow controlled by the FPGA manufacturer. The base array is designed using commercial design tools and libraries, manufactured at a foundry and tested. It is then typically sent to another facility for packaging and final test. The resulting base array is shipped to a customer or authorized distributor. The base array design is subject to the same supply chain trust and security concerns as any other integrated circuit, including questions about tampering with tools, supply-chain control and reverse-engineering. Large FPGA manufacturers maintain a close watch on their supply chain, tracking every device through to final customer delivery or destruction. In addition, they audit their suppliers’ systems and processes. As the security issues associated with the design and manufacture of the base array are no different than those of other semiconductor devices, this paper does not focus on the base array design and manufacture, but instead focuses on the security concerns that arise from the need to protect the application design.
The application design also has a design phase, typically performed with FPGA vendors’ tools, but often augmented with commercial EDA tools. The application developer integrates design information from a number of sources into an FPGA application: original and re-used HDL code, libraries from the FPGA vendor and other parties and software for soft and hard microprocessors. The FPGA vendor’s tools compile the application design into a bitstream, the programming of the FPGA base array to realize the application function. As with any design process, the design itself can be carried out in a secure location, with validated IP and tools. Protection of IP during the design phase is no different for FPGAs than it is for ASICs or microprocessors. Therefore, this paper does not address design-phase security.
Figure 1. FPGA lifecycle flows. Left: base array. Right: application
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]. DAC '14, June 01 - 05 2014, San Francisco, CA, USA Copyright 2014 ACM 978-1-4503-2730-5/14/06…$15.00. http://dx.doi.org/10.1145/2593069.2602555
3. SECURITY IN CONFIGURATION
A non-volatile FPGA, such as a flash or anti-fuse FPGA, may be programmed before it is shipped. An SRAM FPGA is typically shipped with a separate non-volatile memory containing the programming, and when power is applied, the FPGA loads its programming from the non-volatile memory. This programming step was identified early as a potential security problem.
3.1 Bitstream Encryption
Xilinx introduced bitstream encryption in 2001 in Virtex-II devices to address the problem of cloning, the unauthorized copying of the bitstream as it is loaded into the FPGA from external memory [6][7]. Since that time, other FPGA vendors have added encrypted-bitstream capability.
Preventing unauthorized copy does not strictly require encryption, since the task from a cryptographic point of view is to determine if the bitstream is authorized to operate in the FPGA. This fundamentally requires authentication, not confidentiality: a device could verify a message authentication code on the bitstream. However, a conceptually-simple attack involves reverse-engineering the bitstream and recompilation[8]. Therefore, reverse-engineering must also be prevented, so confidentiality of the bitstream became a requirement for preventing cloning.
3.2 Bitstream Authentication
Encryption protects only the design, not the data handled by the design. Without some way to deter tampering with an encrypted design, one cannot guarantee that an adversary has not compromised the design to the point where he can extract data from the FPGA. The 32-bit data integrity check on the FPGA bitstream is insufficient to address this attack.
Although there have been no reports of such tampering with FPGAs, Xilinx integrated strong authentication in Virtex-6 and 7-series devices to address concerns of targeted tampering with encrypted bitstreams and the inherent cryptographic weaknesses of a CRC intended only for data integrity [9].
Virtex-6 and subsequent Xilinx FPGAs authenticate using the Secure Hash Algorithm (SHA-256) to compute a 256-bit Keyed Hashed MAC (HMAC) [1][9]. The MAC result cannot be computed without knowing the secret hash key, thereby authenticating the identity of the sender as well as verifying that the message has not been altered. The 256-bit hash size ensures that any tampering with the bitstream will be detected with high probability. HMAC with SHA-256 makes tampering with the bitstream as computationally difficult as guessing the encryption key, which is also 256 bits.
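The HMAC check described above can be illustrated with Python's standard `hmac` and `hashlib` modules. This is a sketch of HMAC-SHA-256 in general, not of the on-chip implementation: the function names `tag_bitstream` and `verify_bitstream` are hypothetical, and the key and bitstream bytes are placeholders.

```python
import hashlib
import hmac


def tag_bitstream(key: bytes, bitstream: bytes) -> bytes:
    """Compute a 256-bit HMAC-SHA-256 tag over the bitstream."""
    return hmac.new(key, bitstream, hashlib.sha256).digest()


def verify_bitstream(key: bytes, bitstream: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(tag_bitstream(key, bitstream), tag)


key = bytes(range(32))              # placeholder 256-bit secret key
bitstream = b"\xAA\x99\x55\x66" * 4  # placeholder configuration data
tag = tag_bitstream(key, bitstream)

tampered = bytearray(bitstream)
tampered[0] ^= 0x01                  # flip a single bit

print(verify_bitstream(key, bitstream, tag))        # authentic image
print(verify_bitstream(key, bytes(tampered), tag))  # modified image
```

Unlike a CRC, the tag cannot be recomputed by someone who changes the bitstream but does not hold the key, which is the property the text relies on.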
The authentication feature provides resistance to design tampering, which assures the privacy of data inside the FPGA. Privacy of data handled by the FPGA is important in a large number of applications, including digital cinema, network communications and secure database access.
One particularly useful type of data handled by the FPGA is bitstream data. An FPGA with an authenticated encrypted bitstream can reconfigure using the internal configuration access port (ICAP) and still maintain privacy and integrity of design data, basing it on the original bitstream root of trust.
3.3 Configuration Options and Restrictions
Manufacturing tests for SRAM FPGAs require that the configuration data be read back and verified, so this feature is part of the FPGA base array. To prevent theft of the application, readback is disabled when the FPGA is programmed with an encrypted bitstream. Other restrictions include prevention of mixing encrypted and non-encrypted data in a single application, since the non-encrypted application piece might be a Trojan inserted by an adversary. This restriction need only apply to external configuration. A secure application that takes control of its own programming may apply other restrictions on partial configuration, such as restricting the region for the new partial design.
As manufactured, SRAM FPGAs can be programmed with either an encrypted or unencrypted bitstream. Xilinx provides a non-volatile E-fuse that, when programmed, restricts the FPGA to accept only a secured bitstream, preventing a potential adversary from inserting a Trojan design into the system of which the FPGA is a part. Of course, an adversary can still substitute a new, unprogrammed FPGA into the system, but this substitution is difficult to carry out in practice.
4. FEATURES FOR AN OPERATING FPGA Modern FPGAs include security features available to applications operating inside the FPGA. These features are selected by the FPGA application designer and included in the FPGA application design. They make the FPGA application an active agent in device security.
Figure 2. Xilinx 7-series FPGA Secure Bitstream Flows
4.1 Device DNA Device DNA is a term used by Xilinx to refer to a unique identifier for each FPGA manufactured. Device DNA is programmed into the chip during device manufacture by setting one-time-programmable E-fuses. The Device DNA field is typically 56 or 64 bits long, depending on the FPGA family. It is not secret. Anyone can read the Device DNA field. The small size and lack of confidentiality of Device DNA preclude its use as a decryption key. Rather, Device DNA may be used to uniquely identify a specific FPGA device or a range of devices, and restrict the application to function only in those few devices.
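A minimal sketch of how an application might restrict itself to particular devices using Device DNA follows. The function name, the allow-list, and the example identifiers are hypothetical; the 56-bit width is one of the two sizes mentioned in the text. Because Device DNA is readable by anyone, a check like this identifies a device but does not by itself authenticate it.

```python
DNA_BITS = 56  # the text gives 56 or 64 bits depending on the family


def device_is_permitted(device_dna, allowed_ids=frozenset(),
                        allowed_ranges=(), dna_bits=DNA_BITS):
    """Return True if the Device DNA is an allowed ID or lies in an allowed range."""
    if not 0 <= device_dna < (1 << dna_bits):
        raise ValueError("Device DNA does not fit in the expected field width")
    if device_dna in allowed_ids:
        return True
    # Ranges are inclusive (low, high) pairs, e.g. one production lot.
    return any(low <= device_dna <= high for low, high in allowed_ranges)


# Hypothetical identifiers, for illustration only.
LICENSED_IDS = frozenset({0x01A2B3C4D5E6F7})
LICENSED_RANGES = [(0x10000000000000, 0x100000000000FF)]
```

An application would read the identifier from the device at start-up and refuse to operate when `device_is_permitted` returns False.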
4.2 Physically Unclonable Function (PUF) A Physically Unclonable Function (PUF) is an identifier derived from physical attributes of a specific manufactured device [2]. Like Device DNA, a PUF can uniquely identify a device. A PUF has advantages of privacy and possibly immutability and tamper-resistance. Typically, PUFs are built from FPGA fabric so they can be built of arbitrary size. A PUF may reside anywhere in the FPGA and be unidentifiable by an adversary. However, PUFs are not stable over the lifetime of an integrated circuit. Therefore, to use a PUF as a decryption key, a significant amount of ECC “helper data” is required to ensure a stable key value. The company Intrinsic-ID used a soft PUF structure to uniquely identify FPGAs for their metered-IP solution [5].
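To illustrate why helper data is needed, the sketch below uses a code-offset construction with a simple repetition code. This is a toy chosen for brevity, not the scheme used by any vendor named in the text; the repetition factor, the function names, and the bit values are assumptions.

```python
REP = 5  # assumed repetition factor: each key bit is spread over 5 PUF bits


def enroll(puf_bits, key_bits):
    """Helper data = PUF response XOR repetition-encoded key (code-offset)."""
    codeword = [bit for bit in key_bits for _ in range(REP)]
    if len(puf_bits) != len(codeword):
        raise ValueError("need REP PUF bits per key bit")
    return [p ^ c for p, c in zip(puf_bits, codeword)]


def reconstruct(noisy_puf_bits, helper):
    """Recover the key from a noisy PUF reading by majority vote per block."""
    noisy_codeword = [p ^ h for p, h in zip(noisy_puf_bits, helper)]
    key = []
    for start in range(0, len(noisy_codeword), REP):
        block = noisy_codeword[start:start + REP]
        key.append(1 if sum(block) > REP // 2 else 0)
    return key


# Illustrative enrollment with made-up bits.
key = [1, 0, 1, 1]
puf = [0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 0, 0]
helper = enroll(puf, key)

# A later reading in which four PUF bits have drifted.
noisy = list(puf)
for position in (0, 1, 7, 13):
    noisy[position] ^= 1
```

With this repetition factor, up to two unstable bits per five-bit block are tolerated. A practical design would use a stronger code, which is why the text describes the helper data as significant in size.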
4.3 Bitstream Scrubbing An adversary may attempt to change individual bits in the FPGA’s stored configuration data by focused radiation or power adjustment. Xilinx FPGAs include bitstream “scrubbing” hardware that includes ECC bits for each FPGA configuration data frame. When enabled, scrubbing monitors configuration data and corrects errant bits. Scrubbing has a power cost, so it is not active on all designs. An application may include an enhancement to the standard scrubbing algorithm by building the scrubbing function using the ICAP to access configuration data. Since the error correction is done in the FPGA application, the application developer selects the number of bits to correct and the encoding of the correction data.
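An application-level scrubber of the kind described above can be sketched as follows. The check code here is a simple position-XOR plus overall parity, which corrects one flipped bit per frame and detects two; it stands in for whatever encoding a developer would actually choose and is not the ECC built into Xilinx devices. The sketch also assumes the stored check data itself is intact.

```python
def frame_check(frame_bits):
    """Return (position_xor, parity) check data for one configuration frame."""
    position_xor = 0
    parity = 0
    for index, bit in enumerate(frame_bits):
        if bit:
            position_xor ^= index + 1  # 1-based so bit 0 gives a nonzero syndrome
            parity ^= 1
    return position_xor, parity


def scrub_frame(frame_bits, stored_check):
    """Compare a frame with its stored check data; fix a single flipped bit."""
    position_xor, parity = frame_check(frame_bits)
    syndrome = position_xor ^ stored_check[0]
    parity_changed = parity != stored_check[1]
    if syndrome == 0 and not parity_changed:
        return "clean", list(frame_bits)
    if parity_changed and 1 <= syndrome <= len(frame_bits):
        fixed = list(frame_bits)
        fixed[syndrome - 1] ^= 1  # the syndrome names the flipped position
        return "corrected", fixed
    return "uncorrectable", list(frame_bits)


# Illustrative frame (made-up bits) and its check data, computed at load time.
golden = [1, 0, 1, 1, 0, 0, 1, 0]
check = frame_check(golden)
```

In a real design the frame would be read back through the ICAP and a corrected frame written back the same way; those accesses are outside the scope of this sketch.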
4.4 Program Intercept As with any complex system, FPGAs include buffers, caches and other temporary data storage locations that aren’t explicitly cleared when the device is reprogrammed. This lingering temporary data may divulge sensitive information should an adversary interrupt and re-program the FPGA while it is operating. To address this, the FPGA reprogram signal can be intercepted by the operating application. The application can hold off reprogramming while it clears sensitive temporary data or terminates communications.
4.5 JTAG Intercept JTAG scan chains are useful in debugging, but problematic for security because they provide access to data and functions throughout the FPGA. In secure applications, an adversary must not have access to the JTAG scan chain. Microsemi and Xilinx provide mechanisms to permanently disable the JTAG interface as well as monitor it internally for activity. Activity on a test port such as a scan chain may indicate an attack in progress. Altera restricts the executable JTAG commands in secured applications to a bare minimum.
4.6 Voltage and Temperature Monitors Xilinx recently added internal monitors on voltage and temperature to its FPGAs. These monitors can be used to identify possible environmental attacks on an operating design.
4.7 Key Clear and Device Clear When an attack is detected, internal signals allow the operating application to clear key data or the entire programmed configuration.
5. FROM FEATURES TO CAPABILITIES Encryption, authentication, Device DNA identifiers, PUFs, bitstream scrubbing, and temperature and voltage sensors are all features of a security system. The value, however, is not in the features themselves but in the security capabilities they provide. These capabilities include prevention of theft of the application design, prevention of tampering with an application before loading, privacy of data handled by the application both before and during operation, and metered IP. Multiple features may be required to provide a capability: depending on the expected attack, authentication alone may not guarantee the privacy of data inside the FPGA. Additional active features may be required. FPGA bitstream privacy and tamper-resistance provide the basis of further FPGA security capabilities for an application. In the FPGA environment, it is incumbent on the designer of the FPGA application to apply those and other features to achieve security capabilities. The application designer decides whether or not to defend against radiation attacks on the programming of the device. If so, the designer may activate bitstream scrubbing. Similarly, it is the application developer who integrates queries of the on-chip temperature and voltage sensors. Further, the application developer decides what results indicate an attack. Finally, the application developer decides what action to take when an attack is detected.
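The division of responsibility described above (the application decides which readings indicate an attack and how to respond) can be sketched as a small policy table. Every limit, sensor name, and response here is a hypothetical choice made for illustration; none comes from a device data sheet.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limits:
    low: float
    high: float

    def contains(self, value: float) -> bool:
        return self.low <= value <= self.high


# Hypothetical operating limits chosen by the application developer.
POLICY = {
    "vccint_volts": Limits(0.95, 1.05),
    "temp_celsius": Limits(-40.0, 100.0),
}


def classify(readings, policy=POLICY):
    """Return the sorted names of readings that fall outside their limits."""
    return sorted(name for name, value in readings.items()
                  if not policy[name].contains(value))


def respond(violations):
    """Map detected violations to an action (an assumed escalation policy)."""
    if not violations:
        return "continue"
    if len(violations) == 1:
        return "clear_keys"    # corresponds to the key clear feature
    return "clear_device"      # corresponds to the device clear feature
```

For example, `respond(classify({"vccint_volts": 0.80, "temp_celsius": 45.0}))` selects the key-clear action under this policy. A different application might reasonably choose different thresholds or escalate differently.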
6. SINGLE-CHIP CRYPTOGRAPHY This section gives an overview of a security-sensitive FPGA application called Single-Chip Cryptography (SCC). SCC demands many security capabilities, which are built upon the features discussed in this paper. SCC combines algorithms and data of different levels of secrecy or control in a single device. The device must not only protect programs during loading, it must also defend against attacks from outside and attacks while operating, including leakage of protected information across internal boundaries.
SCC uses the authenticated encryption capability to load a boot loader. The boot loader manages further FPGA configuration, including software for on-chip processors and data handling. Because it was authenticated and encrypted, the boot loader is known to be unaltered by potential adversaries or accidental bit errors. In addition, sensitive data, such as session keys buried in the boot loader, are known to be kept secret. Authenticated encryption permits trust in the boot loader. That trust can be further applied to additional configuration data handled by the boot loader.
The boot loader accepts further partial device configurations through normal FPGA I/O. It authenticates and decrypts using algorithms in the boot loader itself, constructed using FPGA fabric, rather than using the dedicated FPGA functions. This permits the boot loader greater flexibility in choosing algorithms and in key handling. The boot loader loads various isolated regions through the Xilinx Internal Configuration Access Port (ICAP). The configuration data never leaves the FPGA and, if it is authenticated by the FPGA boot loader, it is known to be untampered.
Figure 3: Notional Floorplan of a design into five Isolated Regions
To ensure no internal leakage of information between regions, SCC implements the fences of the Isolation Design Flow (IDF) [4]. The basic concept is to take a design and physically separate critical and/or intentionally separate functions on the FPGA. This can be accomplished through careful floorplanning and the use of unused logic as “fences”. The empty fence regions are wide enough that a single-bit failure in configuration does not connect neighboring regions. This separation assures the confidentiality of sensitive information even in the presence of accidental or intentional attacks on the fences. Figure 3 shows a block diagram of a design that has been floorplanned with IDF, while Figure 4 shows the placed and routed view of the design. Fences are visible as black, unused regions.
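The fence idea can be illustrated with a toy floorplan checker. The grid model (one integer per tile, 0 for unused fence tiles, positive integers for region IDs) and the `min_fence` parameter are assumptions made for this sketch; the actual IDF rules and verification tools are described in [4] and are not reproduced here.

```python
def fence_violations(floorplan, min_fence=1):
    """Return sorted pairs of regions separated by fewer than min_fence empty tiles.

    Only separation along rows and columns is checked.
    """
    rows, cols = len(floorplan), len(floorplan[0])
    violations = set()
    for r in range(rows):
        for c in range(cols):
            here = floorplan[r][c]
            if here == 0:
                continue  # fence tile, nothing to protect
            # Looking right and down is enough because adjacency is symmetric.
            for dr, dc in ((0, 1), (1, 0)):
                for step in range(1, min_fence + 1):
                    rr, cc = r + dr * step, c + dc * step
                    if rr >= rows or cc >= cols:
                        break
                    there = floorplan[rr][cc]
                    if there not in (0, here):
                        violations.add((min(here, there), max(here, there)))
    return sorted(violations)


# Four hypothetical regions separated by one-tile fences.
good = [
    [1, 1, 0, 2, 2],
    [1, 1, 0, 2, 2],
    [0, 0, 0, 0, 0],
    [3, 3, 0, 4, 4],
]
# Two regions placed directly next to each other.
bad = [[1, 1, 2, 2]]
```

Here `fence_violations(good)` is empty, while asking for two-tile fences on the same floorplan reports every neighbouring pair. Choosing a wider fence is what keeps a single configuration fault from bridging two regions.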
In an ideal world, each module would be completely isolated from all others. In practice, some level of communication must exist between isolated regions. Xilinx developed the concept of “Trusted Routing”: restricted use of the FPGA interconnects through the fences, such that the isolation established by the use of fences is not compromised. Loaded as part of the boot loader, bitstream scrubbing, using internal readback, continually monitors the configuration data, in particular the isolation fences, to ensure that changes to the configuration are detected and corrected quickly. SCC can even verify the Device DNA of the chip, ensuring operation on the proper individual chip.
Without loss of security, the boot loader is itself one of the isolated regions in the device. Any attempt to configure the device from an external source triggers the program signal that is caught internally by the boot loader, which initiates a zeroization of the application inside the FPGA before permitting the reprogramming to occur.
Originally conceptualized and developed in cooperation with government authorities for FPGAs [3], the application provides additional value in All-Programmable SoCs such as Zynq. Zynq includes both a programmable logic subsystem (PL) that comprises hundreds of thousands of gates of logic, and a processor subsystem (PS) that includes a multi-core ARM processor, caches, memories and peripherals, connected to one another and to the PL using an AXI bus. The Zynq device boots securely, using authenticated encryption capabilities like those described for FPGAs. Zynq provides asymmetric and symmetric authentication, confidentiality and integrity. Leveraging this root-of-trust, applications can implement crypto-processors or systems performing cryptographic functions in the combination of processor and FPGA with confidence that they have not been compromised.
In Zynq, the Processing Subsystem (PS) is known to be isolated from the Programmable Logic (PL). Within the PL, isolated regions ensure spatial separation of sensitive data. Within the PS, known software methods, such as hypervisors and/or ARM TrustZone technology, isolate sensitive software processes from other processes. The trusted boot loader decrypts and authenticates all configuration data and software, potentially using session keys and custom algorithms implemented in the FPGA fabric or the ARM processors.
The spectrum of isolation capabilities is suitable to support applications such as the separation of red and black data processing, key management and other high-reliability functions. Partial Reconfiguration is further enhanced. The entire Zynq PL can be reconfigured, or even powered down, controlled by the PS. Alternatively, portions of the PL can be partially reconfigured for applications that require algorithm agility. Decryption and authentication of partial configuration files can be performed by either the PS or PL, allowing users the flexibility to choose their own authentication and decryption algorithms as well as perform functions such as Authenticate before Decryption to aid in defense against side channel attacks.
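The "Authenticate before Decryption" ordering mentioned above can be sketched as follows. The blob layout (ciphertext followed by a 32-byte HMAC-SHA-256 tag), the function name, and the caller-supplied `decrypt` routine are all assumptions made for illustration; the point is only that the decryption routine never runs on data that fails authentication.

```python
import hashlib
import hmac

TAG_LEN = 32  # HMAC-SHA-256 output size in bytes


class AuthenticationError(Exception):
    """Raised when a partial configuration fails its integrity check."""


def load_partial_config(blob, mac_key, decrypt):
    """Verify the tag over the ciphertext first, and only then decrypt it."""
    if len(blob) < TAG_LEN:
        raise AuthenticationError("blob shorter than the tag")
    ciphertext, tag = blob[:-TAG_LEN], blob[-TAG_LEN:]
    expected = hmac.new(mac_key, ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, tag):
        raise AuthenticationError("tag mismatch; ciphertext was not decrypted")
    # decrypt is a hypothetical user-chosen routine (running in the PS or PL).
    return decrypt(ciphertext)
```

Rejecting forged inputs before the decryption key is ever used limits how much key-dependent activity an adversary can trigger and observe, which is how this ordering aids in defense against side channel attacks.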
Starting with the root-of-trust, followed by the power and flexibility of both hardware and software, coupled with the application of isolation technologies and PR, a system that would typically have been developed through the use of multiple devices now could be integrated into just one with no loss of security.
7. REFERENCES
[1] FIPS, “The Keyed-Hash Message Authentication Code (HMAC),” FIPS PUB 198, March 6, 2002. http://csrc.nist.gov/publications/fips/fips198-1/FIPS-198-1_final.pdf
[2] J. Guajardo, et al., “Physical Unclonable Functions and Public-Key Crypto for FPGA IP Protection,” FPL 2007, IEEE.
[3] M. McLean and J. Moore, “FPGA-Based Single Chip Cryptographic Solution,” Military Embedded Systems, 2007. http://www.mil-embedded.com/pdfs/NSA.Mar07.pdf
[4] E. Peterson, “Developing Tamper Resistant Designs with Xilinx Virtex-6 and 7 Series FPGAs,” Xilinx Application Note XAPP1084, Xilinx, 2012.
[5] Intrinsic-ID, “Quiddikey-Flex,” http://www.intrinsic-id.com/products/quiddikey-flex, 2013.
[6] A. Telikepalli, “Is Your Design Secure?,” Xcell, Xilinx, 2003. http://www.xilinx.com/publications/archives/xcell/Xcell47.pdf
[7] S. Trimberger, “Method and apparatus for protecting proprietary configuration data for programmable logic devices,” US Patent 6654889, 2003.
[8] S. Trimberger, “Trusted Design in FPGAs,” Proceedings of the ACM/IEEE Design Automation Conference, 2007.
[9] S. Trimberger, J. Moore, W. Lu, “Authenticated Encryption of FPGA Bitstreams,” FPGA 2011, ACM.
Figure 4: FPGA Editor view of an SCC design with IDF
<FEFF4f7f75288fd94e9b8bbe5b9a521b5efa7684002000410064006f006200650020005000440046002065876863900275284e8e5c4f5e55663e793a3001901a8fc775355b5090ae4ef653d190014ee553ca901a8fc756e072797f5153d15e03300260a853ef4ee54f7f75280020004100630072006f0062006100740020548c002000410064006f00620065002000520065006100640065007200200035002e003000204ee553ca66f49ad87248672c676562535f00521b5efa768400200050004400460020658768633002> /CHT <FEFF4f7f752890194e9b8a2d7f6e5efa7acb7684002000410064006f006200650020005000440046002065874ef69069752865bc87a25e55986f793a3001901a904e96fb5b5090f54ef650b390014ee553ca57287db2969b7db28def4e0a767c5e03300260a853ef4ee54f7f75280020004100630072006f0062006100740020548c002000410064006f00620065002000520065006100640065007200200035002e003000204ee553ca66f49ad87248672c4f86958b555f5df25efa7acb76840020005000440046002065874ef63002> /CZE <FEFF005400610074006f0020006e006100730074006100760065006e00ed00200070006f0075017e0069006a007400650020006b0020007600790074007600e101590065006e00ed00200064006f006b0075006d0065006e0074016f002000410064006f006200650020005000440046002c0020006b00740065007200e90020007300650020006e0065006a006c00e90070006500200068006f006400ed002000700072006f0020007a006f006200720061007a006f007600e1006e00ed0020006e00610020006f006200720061007a006f007600630065002c00200070006f007300ed006c00e1006e00ed00200065002d006d00610069006c0065006d00200061002000700072006f00200069006e007400650072006e00650074002e002000200056007900740076006f01590065006e00e900200064006f006b0075006d0065006e007400790020005000440046002000620075006400650020006d006f017e006e00e90020006f007400650076015900ed007400200076002000700072006f006700720061006d0065006300680020004100630072006f00620061007400200061002000410064006f00620065002000520065006100640065007200200035002e0030002000610020006e006f0076011b006a016100ed00630068002e> /DAN 
<FEFF004200720075006700200069006e0064007300740069006c006c0069006e006700650072006e0065002000740069006c0020006100740020006f007000720065007400740065002000410064006f006200650020005000440046002d0064006f006b0075006d0065006e007400650072002c0020006400650072002000620065006400730074002000650067006e006500720020007300690067002000740069006c00200073006b00e60072006d007600690073006e0069006e0067002c00200065002d006d00610069006c0020006f006700200069006e007400650072006e00650074002e0020004400650020006f007000720065007400740065006400650020005000440046002d0064006f006b0075006d0065006e0074006500720020006b0061006e002000e50062006e00650073002000690020004100630072006f00620061007400200065006c006c006500720020004100630072006f006200610074002000520065006100640065007200200035002e00300020006f00670020006e0079006500720065002e> /DEU <FEFF00560065007200770065006e00640065006e0020005300690065002000640069006500730065002000450069006e007300740065006c006c0075006e00670065006e0020007a0075006d002000450072007300740065006c006c0065006e00200076006f006e002000410064006f006200650020005000440046002d0044006f006b0075006d0065006e00740065006e002c00200064006900650020006600fc00720020006400690065002000420069006c006400730063006800690072006d0061006e007a0065006900670065002c00200045002d004d00610069006c0020006f006400650072002000640061007300200049006e007400650072006e00650074002000760065007200770065006e006400650074002000770065007200640065006e00200073006f006c006c0065006e002e002000450072007300740065006c006c007400650020005000440046002d0044006f006b0075006d0065006e007400650020006b00f6006e006e0065006e0020006d006900740020004100630072006f00620061007400200075006e0064002000410064006f00620065002000520065006100640065007200200035002e00300020006f0064006500720020006800f600680065007200200067006500f600660066006e00650074002000770065007200640065006e002e> /ESP 
<FEFF005500740069006c0069006300650020006500730074006100200063006f006e0066006900670075007200610063006900f3006e0020007000610072006100200063007200650061007200200064006f00630075006d0065006e0074006f00730020005000440046002000640065002000410064006f0062006500200061006400650063007500610064006f007300200070006100720061002000760069007300750061006c0069007a00610063006900f3006e00200065006e002000700061006e00740061006c006c0061002c00200063006f007200720065006f00200065006c006500630074007200f3006e00690063006f0020006500200049006e007400650072006e00650074002e002000530065002000700075006500640065006e00200061006200720069007200200064006f00630075006d0065006e0074006f00730020005000440046002000630072006500610064006f007300200063006f006e0020004100630072006f006200610074002c002000410064006f00620065002000520065006100640065007200200035002e003000200079002000760065007200730069006f006e0065007300200070006f00730074006500720069006f007200650073002e> /ETI <FEFF004b00610073007500740061006700650020006e0065006900640020007300e400740074006500690064002000730065006c006c0069007300740065002000410064006f006200650020005000440046002d0064006f006b0075006d0065006e00740069006400650020006c006f006f006d006900730065006b0073002c0020006d0069007300200073006f006200690076006100640020006b00f500690067006500200070006100720065006d0069006e006900200065006b007200610061006e0069006c0020006b007500760061006d006900730065006b0073002c00200065002d0070006f0073007400690067006100200073006100610074006d006900730065006b00730020006a006100200049006e007400650072006e00650074006900730020006100760061006c00640061006d006900730065006b0073002e00200020004c006f006f0064007500640020005000440046002d0064006f006b0075006d0065006e00740065002000730061006100740065002000610076006100640061002000700072006f006700720061006d006d006900640065006700610020004100630072006f0062006100740020006e0069006e0067002000410064006f00620065002000520065006100640065007200200035002e00300020006a00610020007500750065006d006100740065002000760065007200730069006f006f006e00690064006500670061002e> /FRA 
<FEFF005500740069006c006900730065007a00200063006500730020006f007000740069006f006e00730020006100660069006e00200064006500200063007200e900650072002000640065007300200064006f00630075006d0065006e00740073002000410064006f006200650020005000440046002000640065007300740069006e00e90073002000e000200049006e007400650072006e00650074002c002000e0002000ea007400720065002000610066006600690063006800e90073002000e00020006c002700e9006300720061006e002000650074002000e0002000ea00740072006500200065006e0076006f007900e9007300200070006100720020006d006500730073006100670065007200690065002e0020004c0065007300200064006f00630075006d0065006e00740073002000500044004600200063007200e900e90073002000700065007500760065006e0074002000ea0074007200650020006f007500760065007200740073002000640061006e00730020004100630072006f006200610074002c002000610069006e00730069002000710075002700410064006f00620065002000520065006100640065007200200035002e0030002000650074002000760065007200730069006f006e007300200075006c007400e90072006900650075007200650073002e> /GRE 
<FEFF03a703c103b703c303b903bc03bf03c003bf03b903ae03c303c403b5002003b103c503c403ad03c2002003c403b903c2002003c103c503b803bc03af03c303b503b903c2002003b303b903b1002003bd03b1002003b403b703bc03b903bf03c503c103b303ae03c303b503c403b5002003ad03b303b303c103b103c603b1002000410064006f006200650020005000440046002003c003bf03c5002003b503af03bd03b103b9002003ba03b103c42019002003b503be03bf03c703ae03bd002003ba03b103c403ac03bb03bb03b703bb03b1002003b303b903b1002003c003b103c103bf03c503c303af03b103c303b7002003c303c403b703bd002003bf03b803cc03bd03b7002c002003b303b903b100200065002d006d00610069006c002c002003ba03b103b9002003b303b903b1002003c403bf0020039403b903b1002d03b403af03ba03c403c503bf002e0020002003a403b10020005000440046002003ad03b303b303c103b103c603b1002003c003bf03c5002003ad03c703b503c403b5002003b403b703bc03b903bf03c503c103b303ae03c303b503b9002003bc03c003bf03c103bf03cd03bd002003bd03b1002003b103bd03bf03b903c703c403bf03cd03bd002003bc03b5002003c403bf0020004100630072006f006200610074002c002003c403bf002000410064006f00620065002000520065006100640065007200200035002e0030002003ba03b103b9002003bc03b503c403b103b303b503bd03ad03c303c403b503c103b503c2002003b503ba03b403cc03c303b503b903c2002e> /HEB 
<FEFF05D405E905EA05DE05E905D5002005D105D405D205D305E805D505EA002005D005DC05D4002005DB05D305D9002005DC05D905E605D505E8002005DE05E105DE05DB05D9002000410064006F006200650020005000440046002005D405DE05D505EA05D005DE05D905DD002005DC05EA05E605D505D205EA002005DE05E105DA002C002005D305D505D005E8002005D005DC05E705D805E805D505E005D9002005D505D405D005D905E005D805E805E005D8002E002005DE05E105DE05DB05D90020005000440046002005E905E005D505E605E805D5002005E005D905EA05E005D905DD002005DC05E405EA05D905D705D4002005D105D005DE05E605E205D505EA0020004100630072006F006200610074002005D5002D00410064006F00620065002000520065006100640065007200200035002E0030002005D505D205E805E105D005D505EA002005DE05EA05E705D305DE05D505EA002005D905D505EA05E8002E002D0033002C002005E205D905D905E005D5002005D105DE05D305E805D905DA002005DC05DE05E905EA05DE05E9002005E905DC0020004100630072006F006200610074002E002005DE05E105DE05DB05D90020005000440046002005E905E005D505E605E805D5002005E005D905EA05E005D905DD002005DC05E405EA05D905D705D4002005D105D005DE05E605E205D505EA0020004100630072006F006200610074002005D5002D00410064006F00620065002000520065006100640065007200200035002E0030002005D505D205E805E105D005D505EA002005DE05EA05E705D305DE05D505EA002005D905D505EA05E8002E> /HRV 
<FEFF005a00610020007300740076006100720061006e006a0065002000500044004600200064006f006b0075006d0065006e0061007400610020006e0061006a0070006f0067006f0064006e0069006a006900680020007a00610020007000720069006b0061007a0020006e00610020007a00610073006c006f006e0075002c00200065002d0070006f0161007400690020006900200049006e007400650072006e0065007400750020006b006f00720069007300740069007400650020006f0076006500200070006f0073007400610076006b0065002e00200020005300740076006f00720065006e0069002000500044004600200064006f006b0075006d0065006e007400690020006d006f006700750020007300650020006f00740076006f00720069007400690020004100630072006f00620061007400200069002000410064006f00620065002000520065006100640065007200200035002e0030002000690020006b00610073006e0069006a0069006d0020007600650072007a0069006a0061006d0061002e> /HUN <FEFF00410020006b00e9007000650072006e00790151006e0020006d00650067006a0065006c0065006e00ed007400e9007300680065007a002c00200065002d006d00610069006c002000fc007a0065006e006500740065006b00620065006e002000e90073002000200049006e007400650072006e006500740065006e0020006800610073007a006e00e1006c00610074006e0061006b0020006c006500670069006e006b00e1006200620020006d0065006700660065006c0065006c0151002000410064006f00620065002000500044004600200064006f006b0075006d0065006e00740075006d006f006b0061007400200065007a0065006b006b0065006c0020006100200062006500e1006c006c00ed007400e10073006f006b006b0061006c0020006b00e90073007a00ed0074006800650074002e0020002000410020006c00e90074007200650068006f007a006f00740074002000500044004600200064006f006b0075006d0065006e00740075006d006f006b00200061007a0020004100630072006f006200610074002000e9007300200061007a002000410064006f00620065002000520065006100640065007200200035002e0030002c0020007600610067007900200061007a002000610074007400f3006c0020006b00e9007301510062006200690020007600650072007a006900f3006b006b0061006c0020006e00790069007400680061007400f3006b0020006d00650067002e> /ITA 
<FEFF005500740069006c0069007a007a006100720065002000710075006500730074006500200069006d0070006f007300740061007a0069006f006e00690020007000650072002000630072006500610072006500200064006f00630075006d0065006e00740069002000410064006f00620065002000500044004600200070006900f9002000610064006100740074006900200070006500720020006c0061002000760069007300750061006c0069007a007a0061007a0069006f006e0065002000730075002000730063006800650072006d006f002c0020006c006100200070006f00730074006100200065006c0065007400740072006f006e0069006300610020006500200049006e007400650072006e00650074002e0020004900200064006f00630075006d0065006e007400690020005000440046002000630072006500610074006900200070006f00730073006f006e006f0020006500730073006500720065002000610070006500720074006900200063006f006e0020004100630072006f00620061007400200065002000410064006f00620065002000520065006100640065007200200035002e003000200065002000760065007200730069006f006e006900200073007500630063006500730073006900760065002e> /JPN <FEFF753b97624e0a3067306e8868793a3001307e305f306f96fb5b5030e130fc30eb308430a430f330bf30fc30cd30c330c87d4c7531306790014fe13059308b305f3081306e002000410064006f0062006500200050004400460020658766f8306e4f5c6210306b9069305730663044307e305930023053306e8a2d5b9a30674f5c62103055308c305f0020005000440046002030d530a130a430eb306f3001004100630072006f0062006100740020304a30883073002000410064006f00620065002000520065006100640065007200200035002e003000204ee5964d3067958b304f30533068304c3067304d307e305930023053306e8a2d5b9a3067306f30d530a930f330c8306e57cb30818fbc307f3092884c306a308f305a300130d530a130a430eb30b530a430ba306f67005c0f9650306b306a308a307e30593002> /KOR 
<FEFFc7740020c124c815c7440020c0acc6a9d558c5ec0020d654ba740020d45cc2dc002c0020c804c7900020ba54c77c002c0020c778d130b137c5d00020ac00c7a50020c801d569d55c002000410064006f0062006500200050004400460020bb38c11cb97c0020c791c131d569b2c8b2e4002e0020c774b807ac8c0020c791c131b41c00200050004400460020bb38c11cb2940020004100630072006f0062006100740020bc0f002000410064006f00620065002000520065006100640065007200200035002e00300020c774c0c1c5d0c11c0020c5f40020c2180020c788c2b5b2c8b2e4002e> /LTH <FEFF004e006100750064006f006b0069007400650020016100690075006f007300200070006100720061006d006500740072007500730020006e006f0072011700640061006d00690020006b0075007200740069002000410064006f00620065002000500044004600200064006f006b0075006d0065006e007400750073002c0020006b00750072006900650020006c0061006200690061007500730069006100690020007000720069007400610069006b00790074006900200072006f006400790074006900200065006b00720061006e0065002c00200065006c002e002000700061016100740075006900200061007200200069006e007400650072006e0065007400750069002e0020002000530075006b0075007200740069002000500044004600200064006f006b0075006d0065006e007400610069002000670061006c006900200062016b007400690020006100740069006400610072006f006d00690020004100630072006f006200610074002000690072002000410064006f00620065002000520065006100640065007200200035002e0030002000610072002000760117006c00650073006e0117006d00690073002000760065007200730069006a006f006d00690073002e> /LVI 
<FEFF0049007a006d0061006e0074006f006a00690065007400200161006f00730020006900650073007400610074012b006a0075006d00750073002c0020006c0061006900200076006500690064006f00740075002000410064006f00620065002000500044004600200064006f006b0075006d0065006e007400750073002c0020006b006100730020006900720020012b00700061016100690020007000690065006d01130072006f007400690020007201010064012b01610061006e0061006900200065006b00720101006e0101002c00200065002d00700061007300740061006d00200075006e00200069006e007400650072006e006500740061006d002e00200049007a0076006500690064006f006a006900650074002000500044004600200064006f006b0075006d0065006e007400750073002c0020006b006f002000760061007200200061007400760113007200740020006100720020004100630072006f00620061007400200075006e002000410064006f00620065002000520065006100640065007200200035002e0030002c0020006b0101002000610072012b00200074006f0020006a00610075006e0101006b0101006d002000760065007200730069006a0101006d002e> /NLD (Gebruik deze instellingen om Adobe PDF-documenten te maken die zijn geoptimaliseerd voor weergave op een beeldscherm, e-mail en internet. De gemaakte PDF-documenten kunnen worden geopend met Acrobat en Adobe Reader 5.0 en hoger.) 
/NOR <FEFF004200720075006b00200064006900730073006500200069006e006e007300740069006c006c0069006e00670065006e0065002000740069006c002000e50020006f0070007000720065007400740065002000410064006f006200650020005000440046002d0064006f006b0075006d0065006e00740065007200200073006f006d00200065007200200062006500730074002000650067006e0065007400200066006f007200200073006b006a00650072006d007600690073006e0069006e0067002c00200065002d0070006f007300740020006f006700200049006e007400650072006e006500740074002e0020005000440046002d0064006f006b0075006d0065006e00740065006e00650020006b0061006e002000e50070006e00650073002000690020004100630072006f00620061007400200065006c006c00650072002000410064006f00620065002000520065006100640065007200200035002e003000200065006c006c00650072002000730065006e006500720065002e> /POL <FEFF0055007300740061007700690065006e0069006100200064006f002000740077006f0072007a0065006e0069006100200064006f006b0075006d0065006e007400f300770020005000440046002000700072007a0065007a006e00610063007a006f006e00790063006800200064006f002000770079015b0077006900650074006c0061006e006900610020006e006100200065006b00720061006e00690065002c0020007700790073007901420061006e0069006100200070006f0063007a0074010500200065006c0065006b00740072006f006e00690063007a006e01050020006f00720061007a00200064006c006100200069006e007400650072006e006500740075002e002000200044006f006b0075006d0065006e0074007900200050004400460020006d006f017c006e00610020006f007400770069006500720061010700200077002000700072006f006700720061006d006900650020004100630072006f00620061007400200069002000410064006f00620065002000520065006100640065007200200035002e0030002000690020006e006f00770073007a0079006d002e> /PTB 
<FEFF005500740069006c0069007a006500200065007300730061007300200063006f006e00660069006700750072006100e700f50065007300200064006500200066006f0072006d00610020006100200063007200690061007200200064006f00630075006d0065006e0074006f0073002000410064006f0062006500200050004400460020006d00610069007300200061006400650071007500610064006f00730020007000610072006100200065007800690062006900e700e3006f0020006e0061002000740065006c0061002c0020007000610072006100200065002d006d00610069006c007300200065002000700061007200610020006100200049006e007400650072006e00650074002e0020004f007300200064006f00630075006d0065006e0074006f00730020005000440046002000630072006900610064006f007300200070006f00640065006d0020007300650072002000610062006500720074006f007300200063006f006d0020006f0020004100630072006f006200610074002000650020006f002000410064006f00620065002000520065006100640065007200200035002e0030002000650020007600650072007300f50065007300200070006f00730074006500720069006f007200650073002e> /RUM <FEFF005500740069006c0069007a00610163006900200061006300650073007400650020007300650074010300720069002000700065006e007400720075002000610020006300720065006100200064006f00630075006d0065006e00740065002000410064006f006200650020005000440046002000610064006500630076006100740065002000700065006e0074007200750020006100660069015f006100720065006100200070006500200065006300720061006e002c0020007400720069006d0069007400650072006500610020007000720069006e00200065002d006d00610069006c0020015f0069002000700065006e00740072007500200049006e007400650072006e00650074002e002000200044006f00630075006d0065006e00740065006c00650020005000440046002000630072006500610074006500200070006f00740020006600690020006400650073006300680069007300650020006300750020004100630072006f006200610074002c002000410064006f00620065002000520065006100640065007200200035002e00300020015f00690020007600650072007300690075006e0069006c006500200075006c0074006500720069006f006100720065002e> /RUS 
<FEFF04180441043f043e043b044c04370443043904420435002004340430043d043d044b04350020043d0430044104420440043e0439043a043800200434043b044f00200441043e043704340430043d0438044f00200434043e043a0443043c0435043d0442043e0432002000410064006f006200650020005000440046002c0020043c0430043a04410438043c0430043b044c043d043e0020043f043e04340445043e0434044f04490438044500200434043b044f0020044d043a04400430043d043d043e0433043e0020043f0440043e0441043c043e044204400430002c0020043f0435044004350441044b043b043a04380020043f043e0020044d043b0435043a04420440043e043d043d043e04390020043f043e044704420435002004380020044004300437043c043504490435043d0438044f0020043200200418043d044204350440043d043504420435002e002000200421043e043704340430043d043d044b04350020005000440046002d0434043e043a0443043c0435043d0442044b0020043c043e0436043d043e0020043e0442043a0440044b043204300442044c002004410020043f043e043c043e0449044c044e0020004100630072006f00620061007400200438002000410064006f00620065002000520065006100640065007200200035002e00300020043800200431043e043b043504350020043f043e04370434043d043804450020043204350440044104380439002e> /SKY 
<FEFF0054006900650074006f0020006e006100730074006100760065006e0069006100200070006f0075017e0069007400650020006e00610020007600790074007600e100720061006e0069006500200064006f006b0075006d0065006e0074006f0076002000410064006f006200650020005000440046002c0020006b0074006f007200e90020007300610020006e0061006a006c0065007001610069006500200068006f0064006900610020006e00610020007a006f006200720061007a006f00760061006e006900650020006e00610020006f006200720061007a006f0076006b0065002c00200070006f007300690065006c0061006e0069006500200065002d006d00610069006c006f006d002000610020006e006100200049006e007400650072006e00650074002e00200056007900740076006f00720065006e00e900200064006f006b0075006d0065006e007400790020005000440046002000620075006400650020006d006f017e006e00e90020006f00740076006f00720069016500200076002000700072006f006700720061006d006f006300680020004100630072006f00620061007400200061002000410064006f00620065002000520065006100640065007200200035002e0030002000610020006e006f0076016100ed00630068002e> /SLV <FEFF005400650020006e006100730074006100760069007400760065002000750070006f0072006100620069007400650020007a00610020007500730074007600610072006a0061006e006a006500200064006f006b0075006d0065006e0074006f0076002000410064006f006200650020005000440046002c0020006b006900200073006f0020006e0061006a007000720069006d00650072006e0065006a016100690020007a00610020007000720069006b0061007a0020006e00610020007a00610073006c006f006e0075002c00200065002d0070006f01610074006f00200069006e00200069006e007400650072006e00650074002e00200020005500730074007600610072006a0065006e006500200064006f006b0075006d0065006e0074006500200050004400460020006a00650020006d006f0067006f010d00650020006f0064007000720065007400690020007a0020004100630072006f00620061007400200069006e002000410064006f00620065002000520065006100640065007200200035002e003000200069006e0020006e006f00760065006a01610069006d002e> /SUO 
<FEFF004b00e40079007400e40020006e00e40069007400e4002000610073006500740075006b007300690061002c0020006b0075006e0020006c0075006f00740020006c00e400680069006e006e00e40020006e00e40079007400f60073007400e40020006c0075006b0065006d0069007300650065006e002c0020007300e40068006b00f60070006f0073007400690069006e0020006a006100200049006e007400650072006e0065007400690069006e0020007400610072006b006f006900740065007400740075006a0061002000410064006f0062006500200050004400460020002d0064006f006b0075006d0065006e007400740065006a0061002e0020004c0075006f0064007500740020005000440046002d0064006f006b0075006d0065006e00740069007400200076006f0069006400610061006e0020006100760061007400610020004100630072006f0062006100740069006c006c00610020006a0061002000410064006f00620065002000520065006100640065007200200035002e0030003a006c006c00610020006a006100200075007500640065006d006d0069006c006c0061002e> /SVE <FEFF0041006e007600e4006e00640020006400650020006800e4007200200069006e0073007400e4006c006c006e0069006e006700610072006e00610020006f006d002000640075002000760069006c006c00200073006b006100700061002000410064006f006200650020005000440046002d0064006f006b0075006d0065006e007400200073006f006d002000e400720020006c00e4006d0070006c0069006700610020006600f6007200200061007400740020007600690073006100730020007000e500200073006b00e40072006d002c0020006900200065002d0070006f007300740020006f006300680020007000e500200049006e007400650072006e00650074002e002000200053006b006100700061006400650020005000440046002d0064006f006b0075006d0065006e00740020006b0061006e002000f600700070006e00610073002000690020004100630072006f0062006100740020006f00630068002000410064006f00620065002000520065006100640065007200200035002e00300020006f00630068002000730065006e006100720065002e> /TUR 
<FEFF0045006b00720061006e002000fc0073007400fc0020006700f6007200fc006e00fc006d00fc002c00200065002d0070006f00730074006100200076006500200069006e007400650072006e006500740020006900e70069006e00200065006e00200075007900670075006e002000410064006f006200650020005000440046002000620065006c00670065006c0065007200690020006f006c0075015f007400750072006d0061006b0020006900e70069006e00200062007500200061007900610072006c0061007201310020006b0075006c006c0061006e0131006e002e00200020004f006c0075015f0074007500720075006c0061006e0020005000440046002000620065006c00670065006c0065007200690020004100630072006f0062006100740020007600650020004100630072006f006200610074002000520065006100640065007200200035002e003000200076006500200073006f006e0072006100730131006e00640061006b00690020007300fc007200fc006d006c00650072006c00650020006100e70131006c006100620069006c00690072002e> /UKR <FEFF04120438043a043e0440043804410442043e043204430439044204350020044604560020043f043004400430043c043504420440043800200434043b044f0020044104420432043e04400435043d043d044f00200434043e043a0443043c0435043d044204560432002000410064006f006200650020005000440046002c0020044f043a0456043d04300439043a04400430044904350020043f045604340445043e0434044f0442044c00200434043b044f0020043f0435044004350433043b044f043404430020043700200435043a04400430043d044300200442043000200406043d044204350440043d043504420443002e00200020042104420432043e04400435043d045600200434043e043a0443043c0435043d0442043800200050004400460020043c043e0436043d04300020043204560434043a0440043804420438002004430020004100630072006f006200610074002004420430002000410064006f00620065002000520065006100640065007200200035002e0030002004300431043e0020043f04560437043d04560448043e04570020043204350440044104560457002e> /ENU (Use these settings to create Adobe PDF documents best suited for on-screen display, e-mail, and the Internet. Created PDF documents can be opened with Acrobat and Adobe Reader 5.0 and later.) 
>> /Namespace [ (Adobe) (Common) (1.0) ] /OtherNamespaces [ << /AsReaderSpreads false /CropImagesToFrames true /ErrorControl /WarnAndContinue /FlattenerIgnoreSpreadOverrides false /IncludeGuidesGrids false /IncludeNonPrinting false /IncludeSlug false /Namespace [ (Adobe) (InDesign) (4.0) ] /OmitPlacedBitmaps false /OmitPlacedEPS false /OmitPlacedPDF false /SimulateOverprint /Legacy >> << /AddBleedMarks false /AddColorBars false /AddCropMarks false /AddPageInfo false /AddRegMarks false /ConvertColors /ConvertToRGB /DestinationProfileName (sRGB IEC61966-2.1) /DestinationProfileSelector /UseName /Downsample16BitImages true /FlattenerPreset << /PresetSelector /MediumResolution >> /FormElements false /GenerateStructure false /IncludeBookmarks false /IncludeHyperlinks false /IncludeInteractive false /IncludeLayers false /IncludeProfiles true /MultimediaHandling /UseObjectSettings /Namespace [ (Adobe) (CreativeSuite) (2.0) ] /PDFXOutputIntentProfileSelector /NA /PreserveEditing false /UntaggedCMYKHandling /UseDocumentProfile /UntaggedRGBHandling /UseDocumentProfile /UseDocumentBleed false >> ] >> setdistillerparams << /HWResolution [600 600] /PageSize [612.000 792.000] >> setpagedevice