The browser you are using is not supported by this website. All versions of Internet Explorer are no longer supported, either by us or Microsoft (read more here: https://www.microsoft.com/en-us/microsoft-365/windows/end-of-ie-support).

Please use a modern browser to fully experience our website, such as the newest versions of Edge, Chrome, Firefox or Safari etc.

Fault-tolerant average execution time optimization for general-purpose multi-processor system-on-chips

Author

Summary, in English

Fault-tolerance is due to the semiconductor technology development important, not only for safety-critical systems but also for general-purpose (non-safety critical) systems. However, instead of guaranteeing that deadlines always are met, it is for general-purpose systems important to minimize the average execution time (AET) while ensuring fault-tolerance. For a given job and a soft (transient) error probability, we define mathematical formulas for AET that includes bus communication overhead for both voting (active replication) and rollback-recovery with checkpointing (RRC). And, for a given multi-processor system-on-chip (MPSoC), we define integer linear programming (ILP) models that minimize AET including bus communication overhead when: (1) selecting the number of checkpoints when using RRC, (2) finding the number of processors and job-to-processor assignment when using voting, and (3) defining fault-tolerance scheme (voting or RRC) per job and defining its usage for each job. Experiments demonstrate significant savings in AET.

Publishing year

2009

Language

English

Pages

484-489

Publication/Series

2009 Design, Automation & Test in Europe Conference & Exhibition

Document type

Conference paper

Topic

  • Electrical Engineering, Electronic Engineering, Information Engineering

Conference name

Design Automation and Test in Europe (DATE 2009)

Conference date

2009-04-20 - 2009-04-24

Conference place

Nice, France

Status

Published

ISBN/ISSN/Other

  • ISBN: 978-1-4244-3781-8