UVM scheduling and the need for Clocking Blocks

I am not sure if this is a SystemVerilog question rather than a UVM one, in which case I apologize.
I have tried to read all the documentation I could find regarding clocking blocks, and I have reached a sufficient understanding of how they work, but I am still confused about why they are needed, i.e. what triggers race conditions between testbench and DUT.

Let us assume our DUT has a single clocked process which assigns an output signal through a Non Blocking Assignment.
Let us also assume the TB instantiates a monitor which evaluates that same signal, connected through an interface, on the posedge of clock. In pseudo-code:

    // DUT
    always @(posedge clk)
      out_a <= new_val;

    // TB monitor
    forever begin
      @(posedge clock);
      if (out_a != old_val) begin
        // ...
      end
    end

My understanding is that the monitor might see the old or the new value of out_a, causing the race condition. What I don’t understand is, since the monitor should be triggered on the clock posedge and therefore scheduled “together” with the DUT always block, shouldn’t the change to out_a be scheduled at a later simulation delta (thanks to the NBA), ensuring therefore that the monitor samples the “old” out_a value?
I guess my confusion is I’m finding it difficult to understand how the monitor @ statement is scheduled, and I can’t for the life of me find it clearly defined in the IEEE documentation.

Thank you for your help

In reply to Francesco Colonna:

If clk and clock are indeed the same “clock” generating a posedge in the same delta cycle, then you are correct that the monitor samples the “old” value of out_a.

I’m not sure if you intended to use two different clock signals, but that's where problems can be introduced, especially in gate-level simulations with delays. If clock gets delayed more than a delta cycle from clk, you get the “new” value of out_a.
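
To make the two cases concrete, here is a minimal sketch, not taken from anyone's actual testbench. The module name race_demo, the signal clock_dly, and the 1-time-unit delay are assumptions chosen only to model a skewed clock such as a gate-level buffer delay.

```systemverilog
module race_demo;
  logic clk = 0;
  wire  clock_dly;                       // hypothetical delayed copy of clk
  logic [7:0] out_a = 0, new_val = 1;

  always #5 clk = ~clk;

  // Models clock skew (assumed 1 time unit, e.g. a gate-level buffer delay)
  assign #1 clock_dly = clk;

  // DUT: the NBA update to out_a takes effect in the NBA region,
  // after all Active-region processes of this time step have run
  always @(posedge clk) begin
    out_a   <= new_val;
    new_val <= new_val + 1;
  end

  // Monitor on the same clock: resumes in the Active region,
  // so it reads out_a before the NBA update is applied
  always @(posedge clk)
    $display("%0t same clock   : out_a = %0d", $time, out_a);

  // Monitor on the delayed clock: resumes after the update has been applied
  always @(posedge clock_dly)
    $display("%0t delayed clock: out_a = %0d", $time, out_a);

  initial #30 $finish;
endmodule
```

At each rising edge, the first monitor should print the value out_a held before the edge, and the second one the value the DUT assigned at that edge.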

A clocking block helps by guaranteeing that you use the value of out_a sampled before the clock edge, and by removing any skew between clocks. But if you do not plan on doing any gate-level simulation and have made sure all your synchronized clocks have no delta delays, you can avoid using clocking blocks.
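
As a rough sketch of what that looks like in practice: the names dut_if, mon_cb, my_monitor and cb_demo below are hypothetical, and a real UVM monitor would extend uvm_monitor and get its virtual interface from the config DB rather than by direct assignment.

```systemverilog
interface dut_if (input logic clk);
  logic [7:0] out_a = 0;

  // Inputs are sampled #1step before the clocking event, i.e. in the
  // Preponed region, so the monitor gets the pre-edge value of out_a
  clocking mon_cb @(posedge clk);
    default input #1step;
    input out_a;
  endclocking
endinterface

// Hypothetical stand-in for a UVM monitor
class my_monitor;
  virtual dut_if vif;

  task run();
    forever begin
      @(vif.mon_cb);                     // wait on the clocking block event
      $display("%0t sampled out_a = %0d", $time, vif.mon_cb.out_a);
    end
  endtask
endclass

module cb_demo;
  logic clk = 0;
  always #5 clk = ~clk;

  dut_if vif_inst (clk);
  my_monitor mon;

  // Stand-in for the DUT: updates out_a with an NBA on every posedge
  always @(posedge clk)
    vif_inst.out_a <= vif_inst.out_a + 1;

  initial begin
    mon = new();
    mon.vif = vif_inst;
    mon.run();
  end

  initial #30 $finish;
endmodule
```

The monitor never reads vif.out_a directly. It only reads the sampled copy vif.mon_cb.out_a, which is what makes the result independent of how the monitor process is scheduled relative to the DUT.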

All of what I just wrote assumes you start your UVM testbench with run_test() called in a module. All of the “old” versus “new” races get flipped around if you start your testbench in a program block (which I strongly recommend you avoid). In that case your monitor reads the “new” value of out_a, assuming there are no delays in the RTL. And now if clk gets delayed more than a delta cycle from clock, the monitor sees the “old” instead of the “new” value. Adding a clocking block makes sure you read the “old” value of out_a in your monitor.

In reply to dave_59:

Hello Dave,
indeed the different clock names were only a typo. I was trying to write a generic example because I could not easily extract actual code from my design, and I clearly didn’t write it carefully enough :)

In general I agree with your statements.
However, I have also experienced issues when not using clocking blocks. That is definitely true when using Verification IPs, for which it is pretty hard for me to know exactly what happens to the clock signals internally. That has always left me wondering what exactly the VIPs could be up to, to get the signal sampling delayed wrt the clock…
The ones that I have used have specific switches and flags to enable the use of clocking blocks, and in my experience it is really required to do so to avoid issues down the line.

Unfortunately, at this time I don’t have a practical example of a TB/monitor that I made myself where these issues occur. The last time I saw this was a couple of years back and I have always used clocking blocks since, so I’m afraid this discussion can only be fairly hypothetical.

Thank you for your reply, I know at least that my confusion was warranted :)