Random Stability Change Detection

Anyone who has worked with random simulations and regressions has hit this case: a seed fails, you fix the HVL or RTL, and you re-run the seed. It now passes, but the condition that caused the failure is no longer exercised, because your changes to the RTL or HVL altered random stability.

For years I’ve been thinking about a standard way to detect such changes, but I’ve never implemented it. Any ideas, existing work, or recommendations?

In reply to jeremy.ralph:

Two papers treat this subject: Random Stability in SystemVerilog and UVM Random Stability.

Maybe some of the things presented in the papers could be linted using a tool, but a lot are patterns that you have to follow.

As a side note, it’s not just changes to the code that can cause random stability issues. Different command line arguments to the tool could affect this as well (e.g. GUI vs. command line or with vs. without coverage).

As another side note, it’s not just random stability that can be affected. There’s also the possibility that iteration order gets affected. One example I’ve seen was iterating in a foreach loop over an associative array with class handles as keys.

In reply to Tudor Timi:

Thanks Tudor! I can attest that command line args can change random stability, and it’s quite painful while trying to close coverage.

Consider the following case(s):

A 1000-seed regression is run in optimized mode (non-debug, for speed) with a low UVM verbosity, and one seed fails. The failing seed is re-run in debug mode (for breakpoints and signal visibility). The failure is now gone because random stability changed. I have observed some cases where even changing the UVM verbosity can change random stability.

Stability change detection mechanism idea:
Print a random number to a file at the end of the simulation for each seed in the regression. Also at the end of the simulation, read the previous run’s random number for that seed from the file and compare it to the new one. If they differ, random stability has changed. Ideally the vendors would have a built-in way… maybe they do… Mentor?
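
A minimal sketch of the comparison step, written as a post-processing script rather than as code inside the simulator. It assumes the testbench prints one random value at the end of the run (for example from $urandom) and that the regression script passes that value in as a string. The function name, the JSON file layout, and the signature format are all hypothetical.

```python
import json
from pathlib import Path


def check_stability(db_path, seed, signature):
    """Compare this run's signature for `seed` with the stored one, then store it.

    Returns "new" (no earlier run recorded), "stable", or "changed".
    """
    path = Path(db_path)
    # Load the signatures recorded by earlier regressions, if any.
    db = json.loads(path.read_text()) if path.exists() else {}
    previous = db.get(str(seed))
    if previous is None:
        status = "new"
    elif previous == signature:
        status = "stable"
    else:
        status = "changed"
    # Remember the latest signature so the next run compares against it.
    db[str(seed)] = signature
    path.write_text(json.dumps(db, indent=2, sort_keys=True))
    return status
```

Because the stored value is overwritten on every call, a change is reported once and the new signature then becomes the baseline. A matching signature only says that the sampled random stream did not move; it says nothing about streams that were not sampled.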

In reply to jeremy.ralph:

I’m not sure that this is enough. Each thread gets its own RNG, so it would be perfectly possible to have some threads remain unaffected (like the one that prints the number at the end), but other threads could be affected by stability issues.
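
The effect can be shown with a toy model. This is not simulator behavior, only an illustration under the assumption that each thread receives its own generator, seeded from the parent at creation time. The names run_sim, driver_rng, and reporter_rng are made up for the example.

```python
import random


def run_sim(seed, extra_draw_in_driver=False):
    """Toy model: each 'thread' owns an RNG seeded from the root at spawn time."""
    root = random.Random(seed)
    # Seeds are handed out at thread creation, in creation order.
    driver_rng = random.Random(root.getrandbits(32))
    reporter_rng = random.Random(root.getrandbits(32))

    if extra_draw_in_driver:
        # Models a code change that adds one more random call in the driver.
        driver_rng.getrandbits(32)
    stimulus = [driver_rng.getrandbits(32) for _ in range(3)]

    # The reporter thread produces the end-of-simulation signature.
    signature = reporter_rng.getrandbits(32)
    return stimulus, signature


before = run_sim(seed=1)
after = run_sim(seed=1, extra_draw_in_driver=True)
print("signature unchanged:", before[1] == after[1])
print("stimulus unchanged: ", before[0] == after[0])
```

In this model the signature matches between the two runs while the driver stimulus does not, so a single end-of-simulation number would report no change. Recording a signature per thread or object of interest would narrow the gap, at the cost of more bookkeeping.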

In reply to Tudor Timi:

Thanks Tudor. Not so straightforward, I guess. One of these days I may take a stab at it.