A Nearly Optimal Comparison-Based Diagnosis Algorithm for Systems of Arbitrary Topology

Roverli P. Ziwich, Elias P. Duarte Jr.
IEEE Transactions on Parallel and Distributed Systems, Vol. PP, Issue 99, 2016.
(accepted for publication in a future issue of this journal)
 [IEEE Xplore]





Abstract

Comparison-based diagnosis is a practical approach to detect faults in hardware, software, and parallel and distributed systems. Diagnosis is based on the comparison of task outputs returned by pairs of system units. This work introduces a novel diagnosis algorithm to identify faults in $t$-diagnosable systems of arbitrary topology under the MM* model. The complexity of the proposed algorithm is $O(t^2 \Delta N)$ in the worst case for a system with $N$ units, where $t$ denotes the maximum number of faulty units allowed and $\Delta$ corresponds to the maximum degree of a unit in the system. This complexity is nearly optimal in the sense that it is very close to that of traversing the syndrome once. Besides the algorithm specification and correctness proofs, simulations results are also presented, showing the typical performance of the algorithm for different systems.


References

T. Araki and Y. Shibata. Diagnosability of Butterfly Networks Under the Comparison Approach. IEICE Transactions on Fundamentals, E85-A(5), May 2002.

T. Araki and Y. Shibata. Efficient Diagnosis on Butterfly Networks under the Comparison Approach. IEICE Transactions on Fundamentals, E85-A(4), Apr. 2002.

F. Bistouni and M. Jahanshahi. Pars Network: A Multistage Interconnection Network With Fault-Tolerance Capability. Journal of Parallel and Distributed Computing, 75:168–183, Jan. 2015.

D. M. Blough and H. W. Brown. The Broadcast Comparison Model for On-Line Fault Diagnosis in Multicomputer Systems: Theory and Implementation. IEEE Transactions on Computers, 48:470–493, May 1999.

C.-P. Chang, P.-L. Lai, J. J.-M. Tan, and L.-H. Hsu. Diagnosability of t-Connected Networks and Product Networks under the Comparison Diagnosis Model. IEEE Transactions on Computers, 53(12):1582–1590, Dec. 2004

G.-Y. Chang, G.-H. Chen, and G. J. Chang. (t, k)-Diagnosis for Matching Composition Networks under the MM* Model. IEEE Transactions on Computers, 56(1):73–79, Jan. 2007.

N.-W. Chang, W.-H. Deng, and S.-Y. Hsieh. Conditional Diagnosability of (n, k)-Star Networks Under the Comparison Diagnosis Model. IEEE Transactions on Reliability, 64(1):132–143, Sept. 2014.

N.-W. Chang and S.-Y. Hsieh. Structural Properties and Conditional Diagnosability of Star Graphs by Using the PMC Model. IEEE Transactions on Parallel and Distributed Systems, 25(11):3002– 3011, Nov. 2014.

S. Chessa and P. Santi. Comparison-Based System-Level Fault Diagnosis in Ad Hoc Networks. Proc. of the 20th Symp. on Reliable Distributed Systems, LA, USA, pages 257–266, Oct. 2001.

C.-F. Chiang and J. J. M. Tan. A Novel Approach to ComparisonBased Diagnosis for Hypercube-Like Systems. Journal of Information Science and Engineering, 24(1):1–9, Jan. 2008.

C.-F. Chiang and J. J. M. Tan. Using Node Diagnosability to Determine t-Diagnosability under the Comparison Diagnosis Model. IEEE Transactions on Computers, 58(1):251–259, Jan. 2009.

E. P. Duarte Jr., A. Weber, and K. V. O. Fonseca. Distributed Diagnosis of Dynamic Events in Partitionable Arbitrary Topology Networks. IEEE Transactions on Parallel and Distributed Systems, 23(8):1415–1426, Aug. 2012.

E. P. Duarte Jr., R. P. Ziwich, and L. C. P. Albini. A Survey of Comparison-Based System-Level Diagnosis. ACM Computing Surveys, 43(3):22:1–22:56, Apr. 2011.

M. Elhadef. A Modified Hopfield Neural Network for Diagnosing Comparison-Based Multiprocessor Systems Using Partial Syndromes. Proc. of the 17th IEEE Intl. Conf. on Parallel and Distributed Systems, pages 646–653, Dec. 2011.

M. Elhadef, A. Boukerche, and H. Elkadiki. Self-Diagnosing Wireless Mesh and Ad Hoc Networks using an Adaptable Comparison-Based Approach. Proc. of the 2nd Intl. Conf. Availability, Reliability and Security, pages 983–990, Apr. 2007.

M. Elhadef and A. Nayak. A Novel Generalized-ComparisonBased Self-Diagnosis Algorithm for Multiprocessor and Multicomputer Systems Using a Multilayered Neural Network. Proc. of the 13th IEEE Intl. Conf. on Computational Science and Engineering, pages 245–252, Dec. 2010.

M. Elhadef and A. Nayak. Comparison-Based System-Level Fault Diagnosis: A Neural Network Approach. IEEE Transactions on Parallel and Distributed Systems, 23(6):1047–1059, June 2012.

J. Fan. Diagnosability of Crossed Cubes. IEEE Transactions on Computers, 13(10):1099–1104, Oct. 2002.

D. Fussell, M. Malek, and S. Rangarajan. Wafer-Scale Testing/Design for Testability, chapter 9, pages 413–472. Kluwer, 1989.

W.-S. Hong and S.-Y. Hsieh. Strong Diagnosability and Conditional Diagnosability of Augmented Cubes Under the Comparison Diagnosis Model. IEEE Transactions on Reliability, 61(1):140– 148, Mar. 2012.

S.-Y. Hsieh and Y.-S. Chen. Strongly Diagnosable Product Networks Under the Comparison Diagnosis Model. IEEE Transactions on Computers, 57(6):721–732, June 2008.

S.-Y. Hsieh and Y.-S. Chen. Strongly Diagnosable Systems Under the Comparison Model. IEEE Transactions on Computers, 57(12):1720–1725, Dec. 2008.

S.-Y. Hsieh and C.-Y. Kao. Determining the Conditional Diagnosability of k-Ary n-Cubes Under the MM* Model. Lecture Notes in Computer Science, 6796:78–88, June 2011.

S.-Y. Hsieh, C.-Y. Tsai, and C.-A. Chen. Strong Diagnosability and Conditional Diagnosability of Multiprocessor Systems and Folded Hypercubes. IEEE Transactions on Computers, PP(99), May 2012.

M. Jahanshahi and F. Bistouni. A New Approach to Improve Reliability of the Multistage Interconnection Networks. Journal of Computers and Electrical Engineering, 40(8):348–374, Nov. 2014.

P.-L. Lai, J. J. Tan, C.-H. Tsai, and L.-H. Hsu. The Diagnosability of the Matching Composition Netork under the Comparison Diagnosis Model. IEEE Transactions on Computers, 53(8):1064–1069, Aug. 2004.

L. Lin, S. Zhou, L. Xu, and D. Wang. The Extra Connectivity and Conditional Diagnosability of Alternating Group Networks. IEEE Transactions on Parallel and Distributed Systems, PP(99):1–12, Aug. 2014.

J. Maeng and M. Malek. A Comparison Connection Assignment for Self-Diagnosis of Multiprocessor Systems. Proc. of the 11th IEEE Fault-Tolerant Computing Symp., pages 173–175, June 1981.

M. Malek. A Comparison Connection Assignment for Diagnosis of Multiprocessor Systems. Proc. of the 7th Annual Intl. Symp. on Computer Architecture, pages 31–36, May 1980.

F. S. Martins, R. M. Andrade, A. L. Santos, B. Schulze, and J. N. Souza. Detecting Misbehaving Units on Computational Grids. Concurrency and Computation: Practice & Experience, 22(3):329–342, Mar. 2009.

F. Preparata, G. Metze, and R. T. Chien. On the Connection Assignment Problem of Diagnosable Systems. IEEE Transactions on Electronic Computers, 16:848–854, Dec. 1967.

S. Rangarajan, D. Fussell, and M. Malek. Built-in Testing of Integrated Circuits Wafers. IEEE Transactions on Computers, 39(2):195– 205, Feb. 1990.

A. Sengupta and A. T. Dahbura. On Self-Diagnosable Multiprocessor Systems: Diagnosis by the Comparison Approach. IEEE Transactions on Computers, 41(11):1386–1396, Nov. 1992.

D. Wang. Diagnosability of Hipercubes and Enhanced Hypercubes under the Comparison Diagnosis Model. IEEE Transactions on Computers, 48(12):1369–1374, Dec. 1999.

H. Yang and X. Yang. A Fast Diagnosis Algorithm for Locally Twisted Cube Multiprocessor Systems under the MM* Model. Computers & Mathematics with Applications, 53(6):918–926, Mar. 2007.

X. Yang. A Linear Time Fault Diagnosis Algorithm for Hypercube Multiprocessors Under the MM* Model. Proc. of the 12th Asian Test Symp., pages 50–55, Nov. 2003.

X. Yang and Y. Y. Tang. Efficient Fault Identification of Diagnosable Systems under the Comparison Model. IEEE Transactions on Computers, 56(12):1612–1618, Dec. 2007.

X. F. Yang, G. M. Megson, and D. J. Evans. A Comparison-Based Diagnosis Algorithm Tailored for Crossed Cube Multiprocessor Systems. Microprocessors and MicroSystems, 19(4):169–175, May 2005.

T.-L. Ye and S.-Y. Hsieh. A Scalable Comparison-Based Diagnosis Algorithm for Hypercube-Like Networks. IEEE Transactions on Reliability, 62(4):789–799, Oct. 2013.

J. Zheng, S. Latifi, E. Regentova, K. Luo, and X. Wu. Diagnosability of Star Graphs under the Comparison Diagnosis Model. Information Processing Letters, 16(1):73–95, Jan. 2002.

S. Zhou. The Conditional Diagnosability of Crossed Cubes Under the Comparison Model. Intl. Journal of Computer Mathematics, 87(15):3387–3396, Dec. 2010.

R. P. Ziwich, E. P. Duarte Jr., and L. C. P. Albini. Distributed Integrity Checking for System with Replicated Data. Proc. of the 11th IEEE Intl. Conf. on Parallel and Distributed Systems, pages 363– 369, July 2005.