Table 1: Fault injection results for each program.

                   Normal   Hardened
Injected            10000      30000
No Effect            2240       6690
Failures             7760          0
Corrected               0      23010
Calc. Time (µs)      4.72      15.27
The results also show that approximately 77% of the
injected faults were activated. In the normal
program, 100% of these activated faults caused
computation failures. The hardened program, however,
detected and corrected 100% of the activated faults.
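
The percentages quoted above can be rechecked from the counts in Table 1. The following is a minimal sketch of that arithmetic; the struct and function names are illustrative and do not come from the benchmark software. As printed, the No Effect and Corrected rows of the hardened column sum to 29,700 rather than 30,000, so the sketch expresses each count as a share of the injected faults instead of inferring the activated count by subtraction.

```cpp
#include <cassert>
#include <cmath>

// Counts copied from Table 1. All names here are illustrative assumptions.
struct Campaign {
    int injected;         // total faults injected
    int no_effect;        // injected faults that had no effect
    int failures;         // faults that caused a computation failure
    int corrected;        // faults that were detected and corrected
    double calc_time_us;  // calculation time in microseconds
};

const Campaign kNormal{10000, 2240, 7760, 0, 4.72};
const Campaign kHardened{30000, 6690, 0, 23010, 15.27};

// Share of `part` in `total`, as a percentage.
double share_pct(int part, int total) {
    assert(total > 0);
    return 100.0 * part / total;
}

// Ratio of hardened to normal calculation time.
double slowdown(const Campaign& normal, const Campaign& hardened) {
    assert(normal.calc_time_us > 0.0);
    return hardened.calc_time_us / normal.calc_time_us;
}
```

With these figures, `share_pct(kNormal.failures, kNormal.injected)` gives 77.6, `share_pct(kHardened.corrected, kHardened.injected)` gives 76.7, and `slowdown(kNormal, kHardened)` is a little over 3.2.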
5 CONCLUSIONS
In this paper, the mechanisms that can lead to
memory corruption in COTS PC control devices
have been considered. A novel approach to software-
implemented fault tolerance has been presented. The
approach, based on an SPMD architecture, can be
used to complement existing error detection and
SBST techniques for COTS processors used in open
architecture controllers. The approach relies on data
and instruction duplication. It has been shown that
the method is easily applied, results in readable
code, and is able to tolerate 100% of the injected
faults in the benchmark described. Whilst the
application of the techniques provides high levels of
data fault tolerance, there is a trade-off in
increased code and data size and task execution
time: in the benchmark, the calculation time rose
from 4.72 µs for the normal program to 15.27 µs for
the hardened program. Prospective designers must
take these factors into account when considering
the techniques.
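
The data structures used in the paper are not reproduced in this section. As a generic illustration of how redundant storage lets a program both detect and correct a corrupted value, the sketch below keeps three copies of a variable and takes a bitwise majority vote on every read. The use of three copies with voting, the class name VotedValue, and the corrupt() test hook are all assumptions made for this example, not the authors' implementation.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical helper (not from the paper): a value stored three times,
// with a bitwise majority vote on each read.
template <typename T>
class VotedValue {
    static_assert(std::is_unsigned<T>::value,
                  "bitwise voting is sketched for unsigned integer types only");

public:
    explicit VotedValue(T v) : a_(v), b_(v), c_(v) {}

    // Write the same value to all three copies.
    void set(T v) {
        a_ = v;
        b_ = v;
        c_ = v;
    }

    // Read with correction: each result bit takes the value held by at
    // least two of the three copies, then all copies are rewritten.
    T get() {
        const T a = a_;
        const T b = b_;
        const T c = c_;
        const T voted = static_cast<T>((a & b) | (a & c) | (b & c));
        set(voted);
        return voted;
    }

    // Test hook: flip the bits in `mask` in one copy (0, 1 or 2) to
    // emulate a memory fault.
    void corrupt(int copy, T mask) {
        if (copy == 0) {
            a_ = static_cast<T>(a_ ^ mask);
        } else if (copy == 1) {
            b_ = static_cast<T>(b_ ^ mask);
        } else {
            c_ = static_cast<T>(c_ ^ mask);
        }
    }

private:
    // volatile discourages the compiler from merging the three copies.
    volatile T a_;
    volatile T b_;
    volatile T c_;
};
```

A fault confined to one copy is masked on the next read, whereas faults that hit the same bit in two copies are not. The extra storage and the vote on each access are one concrete source of the code size, data size and execution time overheads noted above.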
ACKNOWLEDGEMENTS
The work described in this paper was supported by
the Leverhulme Trust (Grant F/00 212/D).