COMPARING TWO VERILOG FAULT SIMULATORS

by Jerry McGoveran

(Originally published in Integrated System Design, October 1995)

Introduction

With circuit complexities increasing and consumer demand for quality unabated, the designer's task of assuring that his circuit is sufficiently tested is a difficult one. This task is made even more difficult when there is not one but several designers involved with responsibilities for separate portions of the chip. When functional tests don't provide adequate coverage and cost goals don't allow for increased die area for scan test circuitry, it is time to consider fault simulation. Armed with this tool, you can identify any holes in the test suite and direct your efforts at correcting those deficiencies.

The importance of thorough testing is illustrated quite effectively by the recent Pentium bug and the resulting public relations nightmare experienced by Intel. While this was caused by a design error, it may just as easily have been a mask defect. A stuck input to a logic gate embedded in an incompletely tested function block can be just as disastrous as an incomplete lookup table in a floating point unit.

Setting Up the Evaluation

With these concerns in mind, my client asked me to conduct an evaluation of fault simulators in order to make a selection for in-house use. We first eliminated simulators that would not handle behavioral models or Verilog netlists. After deciding that we had the time and resources to evaluate only two products, the client focused on Cadence's Verifault version 1.3 and Simucad's Silos III version 94.202. Two of the client's ASIC designs were used as test cases. Circuit "Alpha", consisting of approximately 22K gates, was the primary test case. To evaluate the capability of each simulator to deal effectively with behavioral models, I used a second ASIC design, "Beta", consisting of 60K gates and internal CPU, RAM, and ROM. All tests were executed on a SUN Sparc 20 with 208MB of RAM and plenty of disk space.

The client's selection criteria are listed in Figure 1 along with how each simulator fared on each requirement. Most important was that of Verilog compatibility for both netlist and behavioral models. The goal was to be able to use existing Verilog netlists, libraries and test vectors without extensive modification. Often, fault simulation is not performed until after the ASIC is signed off, and it is important to maintain the integrity of the signed-off netlist. Ideally, you should be able to build a testbench similar to the one used for verification which will instantiate the netlist module and the test vectors without any modification at all.

Figure 1: Selection Criteria

No.  Verifault  Silos    Criterion
1    yes        yes      Simulates a Verilog netlist with both gate and behavioral models. (Faults graded only on gate models.)
2    yes        yes      Does not require any modification of Verilog netlist or models.
3    yes        partial  Accepts SDF file for back-annotation.
4    yes        yes      Has iterative mode to accumulate fault detection statistics across multiple tests.
5    yes        yes      Scope of fault analysis can be narrowed to include specific portions of the netlist.
6    yes        yes      Has "Save and Restart" capability under full user control.
7    yes        no       Has distributed mode to divide job among multiple workstations.
8    yes        no       Has additional post-processing capability to enhance analysis.
9    yes        yes      Has clear and understandable report and log files.
10   yes        yes      Runs without a hardware accelerator.
11   yes        yes      Has statistical and 100% fault sampling capability.
12   yes        partial  Is Verilog-XL compatible.

The tests used for each circuit are listed in Figure 2. Not all of the tests were completely usable for the purposes of this evaluation. Test A10 caused major problems with both simulators. Verifault took an order of magnitude more time to simulate it than the other tests, and Silos gave a non-convergence error and quit. What investigation I was able to do suggests a great deal of activity, possibly caused by an oscillating feedback path. Time limitations prevented me from tracking down the problem and fixing it, so this test was eliminated from the evaluation.

I was able to demonstrate that Verifault correctly handled behavioral models by compiling and running a small sample of faults on the "Beta" design. The CPU, RAM and ROM were modeled behaviorally, and the CPU model was also encrypted with the `protect directive. Since Cadence has not released its encryption scheme to OVI, Silos III cannot support the `protect directive. Therefore I was unable to run any simulations of "Beta" with Silos III. In order to run this design with Silos, we would first have to get the vendor to encrypt the megacell with Simucad's own encryption method, which they were willing to do. However, the vendor's libraries were partially coded behaviorally, and I was satisfied that both products met this criterion.

Figure 2: Test cases

Circuit                                Test name  Number of patterns
Alpha - 22K gates, 36K prime faults    A1         3086
                                       A2         2658
                                       A3         2179
                                       A4         1511
                                       A5         1877
                                       A6         253
                                       A7         2316
                                       A8         905
                                       A9         410
                                       A10        3153
Beta - 60K gates, CPU, RAM, ROM        B1         40734

Features

In spite of the price difference (see Figure 3), the two fault simulators compare fairly well. Both are quite able to handle the vendor libraries and Verilog coded stimulus and neither required any modifications to the netlist. Compatibility with Verilog-HDL was exceptionally complete for Verifault with one minor exception, and Silos, while not supporting 100% of OVI constructs, was compatible in all of the essentials. The major incompatibilities will be discussed in detail below.

Each simulator included most of the features that you would want for a task as lengthy as fault grading. Both can narrow the scope of analysis to a specific level of hierarchy or list of faults, and both can choose a statistical sample of the faults for a quick analysis. Either product will allow the user to accumulate results across multiple test patterns, and both provide a method to save the fault simulation state for resumption at a later time.

Though I did not test them, Verifault has two features that are not provided or not fully supported by Silos -- distributed mode simulation and SDF file back-annotation. Distributed mode allows the simulation job to be split among two or more workstations, which then work together in a Master-Slave relationship. The Master controls the overall process and distributes the processing tasks to each workstation according to user-defined weighting factors. This is extremely helpful when taking advantage of idle CPU time available for overnight runs. Each Slave must have its own license, which is an additional cost.

SDF file back-annotation allows the post-layout wire delays to be used in the simulator timing. This is probably important only if you are fault grading vectors that were developed with the post-layout delays. If the vectors were verified with pre-layout delays as well as post-layout delays, then this feature is not likely to affect your results. One exception is if the silicon vendor uses SDF annotation to read pre-layout estimates into the simulation. Cadence provides full SDF compatibility for Verifault, but currently Silos can only support IOPATH formats and will core dump if it encounters PORT or INTERCONNECT formats in the SDF file.
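To make the idea of user-defined weighting factors concrete, here is a minimal Python sketch of how a fault list might be divided among workstations in proportion to such weights. It is an illustration only: the function name, the host names and the weights are hypothetical, and nothing here is taken from Verifault's actual scheduling, which I did not test.

```python
def split_faults(total_faults, weights):
    """Divide total_faults among hosts in proportion to their weights.

    Uses the largest-remainder method so the shares always sum
    exactly to total_faults. Hypothetical helper, not a Verifault API.
    """
    total_weight = sum(weights.values())
    exact = {host: total_faults * w / total_weight for host, w in weights.items()}
    shares = {host: int(value) for host, value in exact.items()}
    leftover = total_faults - sum(shares.values())
    # Give the remaining faults to the hosts with the largest fractional parts.
    by_remainder = sorted(weights, key=lambda h: exact[h] - shares[h], reverse=True)
    for host in by_remainder[:leftover]:
        shares[host] += 1
    return shares


# Hypothetical setup: a faster Master and two Slaves sharing the
# roughly 36,000 prime faults of the Alpha circuit.
shares = split_faults(36000, {"master": 2, "slave1": 1, "slave2": 1})
print(shares)
```

A workstation with twice the weight receives twice the faults, which is the behavior one would want when mixing fast and slow machines on an overnight run.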

The reporting and log features of each simulator are complete and user friendly. In both cases, you can generate a list of detected and undetected faults, and a report of the coverage statistics. Each tool makes a distinction between a hard detected and a potentially detected fault, i.e., one that causes a transition to "x" or "z" rather than "1" or "0". Both can report faults that cause oscillations in the "bad" machine. In addition, Verifault reports on faults that it considers to be untestable, and has a post-processing utility to combine output from incremental and distributed simulations. In general, Verifault has more detail available in its reporting options and I was unable to completely explore them all, but both simulators supply clear, concise reports of all the essential information required for a complete analysis of the fault coverage of a given set of test patterns.
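The distinction between hard and potentially detected faults can be sketched in a few lines of Python. This is a simplified model under my own assumptions: outputs are compared strobe by strobe between the good machine and the faulted ("bad") machine, and the function name and string encoding are hypothetical. The exact rules each simulator applies may differ.

```python
KNOWN = {"0", "1"}
UNKNOWN = {"x", "z"}


def classify(good_out, bad_out):
    """Classify one fault from output values sampled at each strobe.

    good_out, bad_out: sequences of '0', '1', 'x' or 'z' from the good
    machine and the faulted machine, one value per strobe.
    """
    potential = False
    for good, bad in zip(good_out, bad_out):
        if good in KNOWN:
            if bad in KNOWN and bad != good:
                return "hard"        # definite 0/1 mismatch at a strobe
            if bad in UNKNOWN:
                potential = True     # bad machine is x/z where good is known
    return "potential" if potential else "undetected"


print(classify("0101", "0111"))  # definite mismatch on the third strobe
print(classify("0101", "01x1"))  # only an unknown value differs
print(classify("0101", "0101"))  # no difference observed
```

A potentially detected fault may or may not show up on a real tester, depending on what value the unknown node settles to, which is why the tools report the two categories separately.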

Figure 3: Product Information

Product:    Silos III
Company:    Simucad, Inc.
            32970 Alvarado-Niles Rd.
            Union City, CA 94587
            (510) 487-9721
            silos@simucad.com
Platforms:  PC - Windows, Windows NT
            DEC Alpha - Windows NT
            SUN and Sparc-compatible
Pricing:    $5000* - Windows and NT
            $18000* - UNIX
            * includes logic simulator

Product:    Verifault-XL
Company:    Cadence Design Systems
            555 River Oaks Parkway
            San Jose, CA 95134
            (408) 943-1234
            cadenceconnect@cadence.com
Platforms:  SUN and Sparc-compatible
            HP PA-Risc
Pricing:    Master:
            $50,000 - floating license
            $40,000 - single node
            Slave:
            $12,000 - floating license
            $10,000 - single node

Statistical Simulation

The term "statistical fault simulation" refers to selecting a random sample of faults from a given circuit and running the fault simulation only on these faults. The user specifies the size of the sample as a percentage of the total fault set. This procedure provides a quick estimate of the total fault coverage. Statistically speaking, the accuracy of the estimate improves with the size of the sample, though with diminishing returns: the sampling error shrinks roughly in inverse proportion to the square root of the number of faults sampled.
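One common way to judge how much to trust a sampled coverage figure is the normal approximation to the binomial distribution. The Python sketch below computes an approximate 95% margin of error for a coverage estimate. The sample counts are my own assumption, derived from the roughly 36K prime faults of the Alpha circuit; the function name is hypothetical, and the calculation ignores the finite size of the fault population.

```python
import math


def margin_of_error(coverage, sample_size, z=1.96):
    """Approximate half-width of a confidence interval for fault coverage.

    coverage: estimated coverage as a fraction (0..1)
    sample_size: number of faults actually simulated
    z: normal quantile (1.96 corresponds to roughly 95% confidence)
    """
    return z * math.sqrt(coverage * (1.0 - coverage) / sample_size)


# Assumed sample sizes: 1% and 10% of about 36,000 prime faults.
for percent, coverage in ((1, 0.696), (10, 0.675)):
    n = 36000 * percent // 100
    m = margin_of_error(coverage, n)
    print(f"{percent:>2}% sample ({n} faults): {coverage:.1%} +/- {m:.1%}")
```

Under these assumptions a 1% sample leaves an uncertainty of several percentage points, and quadrupling the sample only halves it, which is worth keeping in mind when weighing run time against accuracy.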

Both Verifault and Silos allow statistical simulation. Results from some of these runs are shown in Figure 4 as an example to the reader of what accuracy and run times to expect. Note that the samples chosen for fault simulation by each product will be different. Therefore, any direct comparison of performance for these tests has little meaning since different fault groups will take different amounts of processing to complete. The data is presented here for completeness.

Figure 4: Statistical fault simulation results for tests A1-A9

Product    Fault sample size (%)  Predicted fault detection (%)  CPU sec
Verifault  1                      69.6                           1202
Verifault  2                      68.5                           1768
Verifault  5                      68.2                           3306
Verifault  10                     67.5                           6504
Silos      2                      62.4                           4986
Silos      4                      64.9                           8727
Silos      10                     65.3                           21437
Silos      20                     64.6                           42517

Performance

While "fast fault simulation" may be an oxymoron, and neither simulator shows any promise of dispelling that fact, the performance of each was within the bounds of acceptability. Each simulator had its own strengths and showed a definite preference for different settings on equivalent parameters. For example, as a rule, Silos III prefers a lower setting for concurrent faults per pass. Verifault makes more efficient use of memory than Silos. By way of evidence, the save files are smaller by a factor of 2, and Verifault is able to handle 2-4 times the number of concurrent faults without causing the workstation to page its memory. More concurrent faults being simulated means fewer passes through the simulator, a fact that gains importance as the length of the test pattern increases. Verifault shows better performance on short patterns, implying that it has a faster compile time than Silos. And the evidence from the attempts to run the A10 pattern indicates that Verifault has better convergence for troublesome nodes.
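The relationship between the concurrent-faults setting and the number of passes is simple arithmetic, sketched here in Python. The fault count is the approximate figure of 36K prime faults for Alpha from Figure 2, and the sketch assumes every pass takes a full batch of faults, which is a simplification of what either simulator actually does.

```python
import math


def passes_needed(total_faults, faults_per_pass):
    """Number of simulation passes needed to cover every fault once."""
    return math.ceil(total_faults / faults_per_pass)


PRIME_FAULTS = 36000  # approximate Alpha prime-fault count (Figure 2)

for setting in (100, 250, 500):
    print(f"{setting} faults/pass -> {passes_needed(PRIME_FAULTS, setting)} passes")
```

Since each pass replays the entire test pattern, the cost of a low setting grows with the length of the pattern.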

Figure 5: Verifault 100% simulation results

Test   Fault detection (%)  CPU sec (500 faults/pass)
A1     39.0                 37289
A2     22.2                 40818
A3     1.0                  29286
A4     1.0                  19935
A5     0.1                  26667
A6     0.4                  1401
A7     3.3                  9856
A8     1.5                  1887
A9     0.8                  1045
Total  69.3                 168184

Figures 5 and 6 show the results of the 100%, or exhaustive, fault simulations. These runs test the detection of all the faults in the circuit. Comparing the two, it is clear that Silos III has some difficulty with test A1. The CPU time for this run was 42.8 hours compared to 10.4 for Verifault. Increasing the concurrent faults setting to 250 faults per pass improved Silos's run time, but it still took 37.8 hours. Even so, taking the faster A1 run, the total run time for tests A1-A9 was only 24% more for Silos than for Verifault. If you discard the anomalous results for test A1, the run time for A2-A9 favors Silos by a factor of nearly two. Overall, though, I would still give highest performance marks to Verifault by a slim margin, mostly due to its superior convergence capability as implied by tests A1 and A10 (see above).
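For readers who want to check the comparison, the following Python sketch recomputes these figures from the CPU times in Figures 5 and 6. The data is copied from those figures; the variable names are mine.

```python
# CPU seconds per test, from Figure 5 (Verifault, 500 faults/pass)
verifault = {"A1": 37289, "A2": 40818, "A3": 29286, "A4": 19935, "A5": 26667,
             "A6": 1401, "A7": 9856, "A8": 1887, "A9": 1045}
# CPU seconds per test, from Figure 6 (Silos, 100 faults/pass)
silos = {"A1": 154014, "A2": 25230, "A3": 10276, "A4": 7943, "A5": 7703,
         "A6": 2530, "A7": 12186, "A8": 4242, "A9": 1814}
SILOS_A1_AT_250 = 135969  # Silos A1 rerun at 250 faults/pass

silos_a1_hours = round(silos["A1"] / 3600, 1)
verifault_a1_hours = round(verifault["A1"] / 3600, 1)
silos_a1_250_hours = round(SILOS_A1_AT_250 / 3600, 1)

verifault_total = sum(verifault.values())
silos_total_250 = sum(silos.values()) - silos["A1"] + SILOS_A1_AT_250
extra_percent = round((silos_total_250 / verifault_total - 1) * 100)

# Excluding A1: how much longer Verifault takes than Silos on A2-A9
ratio_without_a1 = round((verifault_total - verifault["A1"])
                         / (sum(silos.values()) - silos["A1"]), 2)

print(silos_a1_hours, verifault_a1_hours, silos_a1_250_hours)  # 42.8 10.4 37.8
print(extra_percent)      # 24
print(ratio_without_a1)   # 1.82
```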

Figure 6: Silos 100% simulation results

Test   Fault detection (%)  CPU sec (100 faults/pass)  CPU sec (250 faults/pass)
A1     38.0                 154014                     135969
A2     21.6                 25230                      -
A3     0.8                  10276                      -
A4     0.9                  7943                       -
A5     0.0                  7703                       -
A6     0.4                  2530                       -
A7     2.9                  12186                      -
A8     1.5                  4242                       -
A9     0.8                  1814                       -
Total  66.9                 225938                     207893

Bugs, Incompatibilities and Limitations

Verifault has no major bugs that I could detect, and its compatibility with Verilog-XL is nearly 100%. However, there are two things worth mentioning. During compilation, when Verilog-XL or Verifault encounters a cell with unconnected ports (most commonly a flip-flop with an unused Q or QB), it issues a warning. These warnings can be numerous and clutter the output, obscuring messages that may be important. Verilog-XL has a switch to turn off the reporting of this warning message, but Verifault does not. Furthermore, Verifault is at least one revision behind Verilog-XL due to resource limitations at Cadence. Cadence reports that this situation will be rectified when it synchronizes the revisions of all its tools sometime in 1996.

Though it supports the most commonly used Verilog commands, system tasks and compiler directives, Silos does not support them all. Simucad publishes a list of supported and unsupported features with its documentation. Of main interest to my client was the `protect compiler directive, which is unsupported. This is because Cadence has not released its encryption scheme to OVI. Simucad does have its own encryption and will work with cell vendors to encrypt their proprietary megacells for its customers. One limitation of Silos is that the test patterns must be formatted in a special tabular format in order to run iterative simulations. If the vectors are already available in tabular form, it is not difficult to modify them. Otherwise, they may be translated to tabular form using a simulator. This is an inconvenient but not difficult task.

Future Enhancements

There is one enhancement that bears mentioning here. Cadence released Version 1.4 of Verifault this quarter, which they claim has the ability to adjust the number of concurrent faults per pass on the fly. It will do this by monitoring the memory usage and paging levels during simulation and adjusting the parameter after each completed pass. This would be a terrific time-saver, since it would eliminate the need for guessing the right value or using a trial and error method to fine tune the fault simulation. The optimum value will be arrived at by the tool itself, allowing the simulation to complete in as little time as possible.
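I have not seen how Cadence implements this, so the following Python sketch is purely hypothetical: it shows one plausible feedback rule of the kind described, backing off when the workstation starts paging and growing the batch while memory has headroom. Every name, threshold and step size here is my own assumption, not a Verifault parameter.

```python
def next_faults_per_pass(current, paging_rate, mem_used_fraction,
                         paging_limit=10.0, mem_target=0.8,
                         minimum=50, maximum=5000):
    """Pick the concurrent-faults setting for the next pass.

    current: setting used in the pass just completed
    paging_rate: observed page-outs per second (hypothetical metric)
    mem_used_fraction: fraction of physical memory in use (0..1)
    """
    if paging_rate > paging_limit:
        proposed = current // 2          # paging: back off sharply
    elif mem_used_fraction < mem_target:
        proposed = int(current * 1.25)   # headroom: grow gently
    else:
        proposed = current               # near the target: hold steady
    return max(minimum, min(maximum, proposed))


print(next_faults_per_pass(500, paging_rate=50.0, mem_used_fraction=0.95))
print(next_faults_per_pass(200, paging_rate=0.0, mem_used_fraction=0.50))
```

Backing off faster than growing is a deliberate choice in this sketch, since a single pass spent paging can cost far more time than a few passes run slightly below the optimum setting.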

Conclusion

The choice of which fault simulator to purchase is one which must be tailored to the priorities of the user, and it is not the purpose of this article to make a recommendation. Since none of the tests in this evaluation were longer than 3300 lines, I would withhold final judgment on performance comparisons until having compared a pattern with 20000 or more lines. Also, circuits differ greatly in their tendency to oscillate under faulted conditions, and your mileage may vary. Given that disclaimer, either of these simulators should be able to accomplish most fault grading tasks. Those users with very large or troublesome circuits may decide they need the distributed mode or superior convergence of Verifault, while cost-conscious users may be quite content with Silos III.


Jerry McGoveran is an independent consultant specializing in the design and verification of integrated circuits and the selection and integration of EDA tools. He can be reached at (925) 757-0685. Comments and questions are welcome and should be emailed to jerry@certuscg.com.



Copyright © 1995, Jerry E. McGoveran