Impact of Cross-standard Cell Libraries on Machine Learning based Hardware Trojan Detection

Shang-Wen Chen, Jian-Wei Liao, Chia-Wei Tien and Jung-Hsin Hsiao

Cybersecurity Technology Institute, Institute for Information Industry, Taipei, Taiwan R.O.C.

Keywords: Hardware Trojan, Hardware Security, Machine Learning.

Abstract: Hardware Trojans (HTs) have become a new threat with a high potential to spread worldwide because of their unique characteristic: an HT resides in the hardware it invades, so once an invasion succeeds, every device built on that hardware carries the Trojan to the customers of hardware vendors all over the world. Detecting whether an HT exists in a device has therefore become an important issue. Several studies have addressed this problem and reported good results. Their common premise is that the model and the testing set use the same standard cell library. In practice, however, machine learning based HT detection does not perform well under this premise. A plausible explanation is that, in real cases, the standard cell libraries behind the model and the testing set differ, and this degrades the machine learning results. We test this view experimentally and show that cross-standard cell libraries do affect machine learning based hardware Trojan detection.

1 INTRODUCTION

In recent years, a new category of hardware attack on integrated circuits (ICs), called the hardware Trojan (HT), has appeared. HTs are known to be able to trigger several serious attacks, although far fewer HTs than software Trojans have been reported so far. Nevertheless, the effects of HTs have already been demonstrated by real incidents, and the number of HTs will increase as time goes on. The threat of HTs is therefore rising as well.

According to related research (R.S. Chakraborty et al., 2013) (Mukhopadhyay & Chakraborty, 2011), HTs have several properties. These include disabling or altering the functionality of the IC, decreasing the reliability and expected lifetime of industrial control systems (ICS) (TrendMicro, n.d.), leaking sensitive user information through covert communication channels, and bypassing software security facilities to spy on users. Several incidents have been attributed to these properties. For example, a Syrian radar failed to warn of an incoming air strike in 2007; the suspected cause was a backdoor built into the system’s chips (Mitra et al., 2015). Moreover, in 2014 the New York Times (Sanger & Shanker, 2014) reported a US National Security Agency (NSA) program called Quantum, which plans to implant HT circuitry into the USB communication protocol or USB ports. Other reports (Markoff, 2009) (Ellis, 2012) also illustrate the influence of HTs.

So far, several studies (Agrawal et al., 2007) (Danesh et al., 2014) (S. Jha & S. K. Jha, 2008) (Chakraborty et al., 2009) (Alkabani & Koushanfar, 2009) (Hasegawa et al., 2016) (Iwase et al., 2015) have tried to solve the HT problem. They applied different static methods such as IC fingerprints (Agrawal et al., 2007), side-channel analysis (Danesh et al., 2014) (Alkabani & Koushanfar, 2009), logic testing (S. Jha & S. K. Jha, 2008) (Chakraborty et al., 2009) and other static analysis (Hasegawa et al., 2016) (Iwase et al., 2015), and obtained good results. Table 1 briefly introduces these papers. The studies that use machine learning share one premise: the gate-level netlists are all compiled with the same standard cell library. However, this premise differs considerably from reality, which could cause the performance of these methods to drop in practice. We try to prove it.


Chen, S., Liao, J., Tien, C. and Hsiao, J.

Impact of Cross-standard Cell Libraries on Machine Learning based Hardware Trojan Detection.

DOI: 10.5220/0010858800003120

In Proceedings of the 8th International Conference on Information Systems Security and Privacy (ICISSP 2022), pages 420-425

ISBN: 978-989-758-553-1; ISSN: 2184-4356

Table 1: Related research on solving the HT problem.

Paper | Method | Description
(Agrawal et al., 2007) | IC fingerprints | 1. Using noise modeling to construct a set of fingerprints for an IC family. 2. The fingerprints utilize side-channel information such as power, temperature, and electromagnetic (EM) profiles.
(Danesh et al., 2014) | Side-channel analysis | Exploiting the special power characteristics of differential cascade voltage switch logic (DCVSL) to detect HTs at runtime.
(Alkabani & Koushanfar, 2009) | Side-channel analysis | 1. New Trojan detection method based on nonintrusive external IC quiescent current measurements. 2. Using consistency, which is a new self-defined metric, and properties of function to detect Trojans.
(S. Jha & S. K. Jha, 2008) | Logic testing | 1. A randomization based technique to verify whether a manufactured chip is infected by a Trojan. 2. If infected, then this result and its fingerprint input pattern will be reported.
(Chakraborty et al., 2009) | Logic testing | 1. A test pattern generation technique based on multiple excitation of rare logic conditions at internal nodes. 2. Increasing the triggered and detected probability of Trojans and the sensitivity of Trojan detection.

The contribution of this study to the literature is twofold. First, we propose the idea that machine learning results for hardware Trojan detection may be biased between real cases and related research because their premises differ. Second, we show experimentally that machine learning detection degrades by at least 10% in recall and F1-score, which verifies our idea. This provides an important new premise for future researchers.

The rest of the paper is organized as follows. Section 2 describes the problem encountered in reality and the difficulty of cross-library machine learning detection. Section 3 introduces our proposed method. Section 4 presents our experimental results for cross-standard cell library machine learning detection, which demonstrate the influence of different premises in reality. Section 5 gives our conclusion and future work.

2 BACKGROUND

In this section, we describe the composition of hardware from the cell’s point of view and explain why the commonly used features are the parameters generated by the standard cell library. Most importantly, we describe the difficulty of cross-library hardware Trojan detection with machine learning.

A hardware Trojan is a kind of malware that launches its attack through hardware. Each piece of hardware contains control chips, which are composed of many cells. Each cell has many parameters, such as leakage power, area, footprint, detailed values of each pin, timing and so on. From another perspective, a cell can be represented as a parameter pair; that is, the hardware can be viewed as a collection of parameter pairs.
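The view of a cell as a record of parameters can be sketched in a few lines of Python. This is only an illustration: the class name, the field names, and all numeric values are assumptions, chosen to mirror the parameters listed above rather than any real standard cell library format.

```python
from dataclasses import dataclass, astuple


@dataclass(frozen=True)
class CellParameters:
    """Hypothetical record for one cell; fields mirror the parameters named in the text."""
    leakage_power: float
    area: float
    footprint: str
    pin_capacitance: float  # assumed stand-in for "detailed values of each pin"
    timing_delay: float     # assumed stand-in for "timing"


def netlist_features(cells):
    """View a piece of hardware as a collection of per-cell parameter tuples."""
    return [astuple(cell) for cell in cells]


# Invented example values, not taken from any real library.
cells = [
    CellParameters(0.12, 1.5, "INVX1", 0.002, 0.03),
    CellParameters(0.30, 2.8, "NAND2X1", 0.004, 0.05),
]
features = netlist_features(cells)
print(features)
```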

The parameter values of each cell are determined by the standard cell library that the hardware adopts, and the choice of library is made by the hardware vendor. There are many different categories of standard cell library, and each vendor selects the one that suits its working environment. For security reasons, hardware vendors do not disclose the standard cell library they use or related information about their products. Thus, the parameter pairs of the hardware are not only useful data but also commonly used features that can be acquired easily.

As mentioned above, vendors do not reveal the standard cell libraries they adopt, for security reasons. In addition, many different standard cell libraries exist, each corresponding to its own specific situation. It is therefore almost impossible to find two different companies that use the same standard cell library, because each company’s specific environment is different.


This fact indicates that the premise mentioned in the previous section rarely holds in reality. However, almost no research describes how to detect hardware Trojans in hardware that uses different standard cell libraries.

If we want to use machine learning to detect hardware Trojans in hardware built with different standard cell libraries, some issues are foreseeable. First, the parameter pairs of the cells are the common features, and their values are determined by the adopted standard cell library. Every company uses a different standard cell library, and different libraries are independent of each other, yet the basic hardware cells are the same. That is, even if the standard cell libraries differ, there should be some relation between the parameter pairs generated by different libraries. However, we cannot confirm this relationship, and no definition exists to quantify it. Second, suppose we train a model with the parameter pairs determined by standard cell library A and then use the parameter pairs generated by standard cell library B as the testing set. Although the basic hardware cells are the same, the parameter values have been processed by independent standard cell libraries, so the performance of the model depends on how much of the relationship remains between these parameter pairs.

To evaluate how model performance is affected by parameter pairs generated by different cell libraries, we conduct a machine learning detection experiment using different standard cell libraries. The conditions of this experiment are close to the real situation, so its results are reliable.

3 PROPOSED METHOD

In this section, we introduce the process of our proposed method and explain the meaning of the parameters passed to the program.

In our proposed method, the execution process is divided into two stages: a pre-processing stage and a processing stage. The former generates the training set and testing set for machine learning according to the input parameters. The latter uses the output of the first stage to train a model and outputs the result of hardware Trojan detection. Figure 1 shows the process diagram of our experiment, which is described in detail below.

In the first stage, we execute pre-processing twice to obtain the training set and testing set needed for machine learning. We then use the training set to generate a model that will be used in the next stage. Each pre-processing run takes six parameters: circuit name, standard cell library, mapping on/off, mapping mode, filter threshold one and filter threshold zero. They are described in detail in Table 2. Note that only the value of mapping on/off changes between the first and second pre-processing runs; the other parameters stay the same. After executing pre-processing twice, we obtain the training set and testing set, which are used to generate the machine learning model and to test its performance. The algorithms we use for the model are Support Vector Machine (SVM) (Noble, 2006) and random forest (RF) (Breiman, 2001). After generating and testing the model, we obtain several evaluation metrics such as accuracy, precision, recall and F1-score.
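The six pre-processing parameters can be pictured as a small configuration record. The following Python sketch is only illustrative: the class name, the field names, and every value are assumptions, since the paper does not give the program's actual interface or the values it used.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class PreprocessConfig:
    """Hypothetical container for the six pre-processing parameters."""
    circuit_name: str
    standard_cell_library: str
    mapping_on: bool
    mapping_mode: str
    filter_threshold_one: float
    filter_threshold_zero: float


# All values below are placeholders invented for illustration.
first_run = PreprocessConfig(
    circuit_name="example_netlist",
    standard_cell_library="/path/to/library",
    mapping_on=True,
    mapping_mode="strict",
    filter_threshold_one=0.9,
    filter_threshold_zero=0.9,
)

# Only mapping on/off differs between the two pre-processing runs.
second_run = replace(first_run, mapping_on=not first_run.mapping_on)
print(first_run)
print(second_run)
```

Using an immutable record with `dataclasses.replace` makes it explicit that the second run is a copy of the first with a single field changed.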

In the second stage, we first repeat the first-stage process from the beginning up to the generation of the training set and testing set. The only difference is that the standard cell library used here differs from the one used in the first stage. We thereby obtain a new training set and testing set generated with the new standard cell library. We use this new testing set to test the model generated in the first stage and obtain new values of the evaluation metrics mentioned in the previous paragraph.
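The two-stage evaluation can be sketched as a toy Python program. Everything in it is an assumption made for illustration: the feature values are invented, a library is modelled as a plain rescaling of the features, and a nearest-centroid classifier stands in for the SVM and random forest models used in the paper so that the sketch needs no third-party packages. The resulting numbers say nothing about real cell libraries.

```python
# Toy sketch: train on features from library A, test on library A and library B.

def centroid(rows):
    """Component-wise mean of a list of equal-length feature tuples."""
    return tuple(sum(col) / len(rows) for col in zip(*rows))


def train(features, labels):
    """Fit a nearest-centroid model: one centroid per class label."""
    by_label = {}
    for x, y in zip(features, labels):
        by_label.setdefault(y, []).append(x)
    return {y: centroid(rows) for y, rows in by_label.items()}


def predict(model, x):
    """Return the label whose centroid is closest to x."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(model, key=lambda y: sq_dist(model[y]))


def recall(model, features, labels, positive=1):
    """Fraction of positive (Trojan) samples that the model flags as positive."""
    preds = [predict(model, x) for x in features]
    tp = sum(1 for p, y in zip(preds, labels) if y == positive and p == positive)
    fn = sum(1 for p, y in zip(preds, labels) if y == positive and p != positive)
    return tp / (tp + fn)


def compile_with_library(samples, scale):
    """Hypothetical stand-in for pre-processing with one cell library:
    the same cells receive different parameter values (here, a rescaling)."""
    return [tuple(v * scale for v in x) for x in samples]


LIB_A, LIB_B = 1.0, 0.7  # invented per-library scale factors

# Label 1 = Trojan, label 0 = normal; all values are made up.
train_raw = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (3.0, 3.1), (2.8, 3.0), (3.2, 2.9)]
train_y = [0, 0, 0, 1, 1, 1]
test_raw = [(1.1, 0.9), (2.9, 3.0), (3.1, 3.1), (2.4, 2.5)]
test_y = [0, 1, 1, 1]

# First stage: model and testing set come from the same library.
model = train(compile_with_library(train_raw, LIB_A), train_y)
recall_same = recall(model, compile_with_library(test_raw, LIB_A), test_y)

# Second stage: the same model is tested on features from the other library.
recall_cross = recall(model, compile_with_library(test_raw, LIB_B), test_y)

print(f"recall, same library:  {recall_same:.2f}")
print(f"recall, cross library: {recall_cross:.2f}")
```

The point of the sketch is structural: the model is fitted once, and only the library used to generate the testing set changes between the two measurements.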

4 EXPERIMENTS

In this section, we describe our experiment, which compares machine learning results obtained with different standard cell libraries.

We conduct an experiment to show that features compiled with different standard cell libraries affect the performance of machine learning detection. We first describe the dataset. In total, we collected 199 different netlists from various sources, both public and private. However, using the netlists from private sources would raise non-disclosure agreement problems. We therefore selected the 88 netlists collected from a public source, Trust-Hub (Trust-Hub.org, n.d.) (Salmani et al., 2013) (Shakya et al., 2017). Moreover, we collected 144 different standard cell libraries from private sources and randomly chose two of them to compile the training set and the testing set.


Figure 1: Processing diagram of the experiment.

Table 2: Description of the parameters used in the first stage of the execution process.

Parameter name | Description
circuit name | Name of the gate-level netlist waiting for pre-processing
standard cell library | Path of the standard cell library used in the composition stage
mapping on/off | Whether to add time order information to the path features in this execution
mapping mode | Whether to allow errors in a path when adding time order information
filter threshold one | Threshold on the percentage of signals indicating 1 in the path filter
filter threshold zero | Threshold on the percentage of signals indicating 0 in the path filter

To compare the machine learning results obtained with different libraries on the same gate-level netlists, we compare the evaluation metrics of the first stage and the second stage. In the first stage, we use one of our collected standard cell libraries, “saed32rvt_tt1p05v25c”. In the second stage, we adopt another standard cell library, “saed32rvt_tt1p05v125c”. Figure 2 shows the comparison. Regardless of the algorithm the model uses, the results on the testing set generated with the “saed32rvt_tt1p05v125c” library differ considerably in several metrics such as TNR, FNR, recall and F1-score. TPR, recall and F1-score decrease by at least 12% when the model is used to detect features generated with the second standard cell library. Conversely, FPR increases by at least 12% in the same setting.
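The metrics discussed above all follow from the four confusion-matrix counts. The following is a minimal Python sketch, assuming that Trojan samples are the positive class and that no denominator is zero; the function name is hypothetical and the counts are made up for illustration, not taken from the experiment.

```python
def detection_metrics(tp, fp, tn, fn):
    """Compute common detection metrics from confusion-matrix counts.

    tp/fn: Trojan samples detected/missed; tn/fp: normal samples passed/flagged.
    """
    recall = tp / (tp + fn)      # recall is the same quantity as TPR
    precision = tp / (tp + fp)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "TPR": recall,
        "FNR": fn / (tp + fn),
        "TNR": tn / (tn + fp),
        "FPR": fp / (tn + fp),
        "F1": 2 * precision * recall / (precision + recall),
    }


# Invented counts, for illustration only.
metrics = detection_metrics(tp=8, fp=2, tn=18, fn=2)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Since TPR and recall are the same quantity, with FNR = 1 - TPR and TNR = 1 - FPR, a drop in recall appears as an equal rise in FNR, and a rise in FPR as an equal drop in TNR.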

Based on the experimental results, we can confirm that adopting different cell libraries makes a large difference to machine learning based hardware Trojan detection results, so cross-standard cell library hardware Trojan detection is a great challenge for machine learning.

5 CONCLUSIONS

In this paper, we introduced a new category of threat, the hardware Trojan, and its potentially serious worldwide effects. Several studies have discussed this issue and obtained good results, but they focused on detecting hardware Trojans under the same standard cell library. In contrast, we study the detection of hardware Trojans across standard cell libraries, which is the common case in reality. This is a new research field, but few people work on it because of its difficulty. We hope this study can serve as prior knowledge for follow-up investigators.

Figure 2: Comparison of testing results between different libraries.

REFERENCES

R. S. Chakraborty, I. Saha, A. Palchaudhuri, G. K. Naik, (2013). “Hardware Trojan insertion by direct modification of FPGA configuration bitstream”, IEEE Des. Test 30 (2), pp. 45–54.

D. Mukhopadhyay and R. S. Chakraborty, (2011). “Testability of cryptographic hardware and detection of hardware Trojans,” in Proc. IEEE Asian Test Symp. (ATS’11), pp. 517–524.

S. Mitra, H. S. P. Wong, S. Wong, (2015). “The Trojan-proof chip”, IEEE Spectr. 52 (2), pp. 46–51.

D. Agrawal, S. Baktir, D. Karakoyunlu, P. Rohatgi, and B.

Sunar, (2007). “Trojan Detection using IC

Fingerprinting,” in Security and Privacy, SP ’07. IEEE

Symposium on, pp. 296–310

W. Danesh, J. Dofe and Q. Yu, (2014) "Efficient hardware

Trojan detection with differential cascade voltage

switch logic", Proc. VLSI Des., pp. 1-10.

S. Jha and S. K. Jha., (2008). “Randomization Based

Probabilistic Approach to Detect Trojan Circuits”, in

Proc. IEEE High Assurance Systems Engineering

Symposium − HASE, pp. 117–124

R. S. Chakraborty, F. Wolff, S. Paul, C. Papachristou, and

S. Bhunia., (2009). “MERO: A Statistical Approach for

Hardware Trojan Detection”, in Proc. Cryptographic

Hardware and Embedded Systems − CHES, volume

5747, pp. 396–410

Y. Alkabani and F. Koushanfar., (2009). “Consistency-

based Characterization for IC Trojan Detection”, in

Proc. IEEE International Conference on Computer-

Aided Design − ICCAD, pp. 123–127.

K. Hasegawa, M. Oya, M. Yanagisawa, and N. Togawa,

(2016) “Hardware Trojans classification for gate-level

netlists based on machine learning,” in Proc. IEEE

Symposium on On-Line Testing and Robust System

Design (IOLTS), pp. 203–206

T. Iwase, Y. Nozaki, M. Yoshikawa, and T. Kumaki, (2015)

“Detection technique for hardware Trojans using

machine learning in frequency domain,” in 2015 IEEE

4th Global Conference on Consumer Electronics

(GCCE), pp. 185–186

D. E. Sanger and T. Shanker, (2014, January 14). “N.S.A. Devises Radio Pathway Into Computers,” http://www.nytimes.com/2014/01/15/us/nsa-effort-pries-open-computers-not-connected-to-internet.html

TrendMicro. (n.d.). Industrial Control System. Retrieved October 10, 2021, from https://www.trendmicro.com/vinfo/us/security/definition/industrial-control-system

J. Markoff, (2009, October 26). “Old Trick Threatens the

Newest Weapons,” http://www.nytimes.com/2009/10/

27/science/27trojan.html?pagewanted=all& r=1&

J. Ellis, (2012, February 27). “Trojan integrated circuits,”

http://chipsecurity.org/2012/02/trojan-circuit/.

Noble, W. S., (2006). “What is a support vector machine?”

Nature Biotech, 24 (12), 1565–1567

Breiman L, (2001). Random forests. Machine Learning,

45(1): 5–32

Trust-Hub.org. (n.d.). Chip-level Trojan Benchmarks. Retrieved from https://www.trust-hub.org/#/benchmarks/chip-level-trojan

H. Salmani, M. Tehranipoor, and R. Karri, (2013). "On

Design vulnerability analysis and trust benchmark

development," in IEEE Int. Conference on Computer

Design (ICCD)

B. Shakya, T. He, H. Salmani, D. Forte, S. Bhunia, and M. Tehranipoor, (2017). “Benchmarking of Hardware Trojans and Maliciously Affected Circuits,” Journal of Hardware and Systems Security (HaSS).
