UVM monitor sees the inputs after 2 clock cycles

My driver code is as below for an up/down counter:

 virtual task run_phase(uvm_phase phase);
    super.run_phase (phase);
    // task to initialise the driver signals
    initialize();
    forever begin
      seq_item_port.get_next_item(req);
      drive();
      seq_item_port.item_done();
    end //forever
  endtask : run_phase

task initialize();
  //default values  
  counter_vif.load_counter <= 1'b0;
  counter_vif.count_value  <= 0;
  counter_vif.up_counter   <= 0;
endtask : initialize


virtual task drive();  
  @(posedge counter_vif.clk);
    if (counter_vif.clk_en) begin
      counter_vif.count_value  <= req.count_value;
      counter_vif.load_counter <= req.load_counter;
      counter_vif.up_counter   <= req.up_counter;
     end //if
  req.print();
endtask

My monitor code is as below:

 virtual task run_phase(uvm_phase phase);
 forever begin
   @(posedge counter_vif.MONITOR.clk);
    if (counter_vif.MONITOR.clk_en && counter_vif.MONITOR.reset) begin 
     seq_item_collected.count_value   = `MON_IF.count_value;
     seq_item_collected.load_counter  = `MON_IF.load_counter;
     seq_item_collected.up_counter    = `MON_IF.up_counter;
     // DUT needs a clk to load and start the counter
     if(`MON_IF.load_counter)
        @(posedge counter_vif.MONITOR.clk);  
     seq_item_collected.current_value = `MON_IF.current_value;
     seq_item_collected.count_reached = `MON_IF.count_reached;
    
     seq_item_collected.print();

    // write into uvm_analysis_port for reference by next components
     trans_collected_port.write(seq_item_collected);
    end //if
 end //forever
 endtask : run_phase

But the problem I see here is:
Although I set load_counter = 1 in the first sequence, the monitor and scoreboard only see load_counter = 1 after 2 clocks (at the 3rd sequence). I am trying to figure out why that is.

req.print() in the driver prints at time 15, and the monitor prints at time 35:
# UVM_INFO updn_counter_driver.sv(69) @ 15: uvm_test_top.env.counter_agent.driver [DRIVER DEBUG1]
# --------------------------------------------------
# Name            Type                   Size  Value
# --------------------------------------------------
# req             updn_counter_seq_item  -     @673 
#   count_value   integral               7     'h10 
#   load_counter  integral               1     'h1  
#   up_counter    integral               1     'h0  

# UVM_INFO updn_counter_monitor.sv(45) @ 35: uvm_test_top.env.counter_agent.monitor [MONITOR DEBUG1]  seq_item_collected 
# ---------------------------------------------------------
# Name                   Type                   Size  Value
# ---------------------------------------------------------
# updn_counter_seq_item  updn_counter_seq_item  -     @643 
#   count_value          integral               7     'h10 
#   load_counter         integral               1     'h1  
#   up_counter           integral               1     'h0

Does anyone have any idea what I am missing here?

In reply to uvmsd:

Your monitor is going to send a transaction out every clock cycle regardless of any driver transaction (you should construct a new seq_item_collected for each object you write).
What does it show for the second transaction?
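
A minimal sketch of the "construct a new seq_item_collected for each object you write" suggestion, not the poster's actual fix. The analysis port passes an object handle, so a subscriber that keeps the handle would see its fields overwritten on the next cycle if one object were reused. Assumptions: the class name updn_counter_monitor is guessed from the file name in the log, updn_counter_seq_item is registered with the factory, and the virtual interface is assigned elsewhere (for example from uvm_config_db). The sampling timing is left as in the posted monitor.

```systemverilog
import uvm_pkg::*;
`include "uvm_macros.svh"

// Sketch only: relies on updn_counter_seq_item and updn_counter_if from the thread.
class updn_counter_monitor extends uvm_monitor;
  `uvm_component_utils(updn_counter_monitor)

  virtual updn_counter_if counter_vif;  // assumed to be set before run_phase
  uvm_analysis_port #(updn_counter_seq_item) trans_collected_port;

  function new(string name, uvm_component parent);
    super.new(name, parent);
    trans_collected_port = new("trans_collected_port", this);
  endfunction

  virtual task run_phase(uvm_phase phase);
    updn_counter_seq_item item;
    forever begin
      @(posedge counter_vif.clk);
      if (counter_vif.clk_en && counter_vif.reset) begin
        // Fresh object per transaction, so earlier handles are never overwritten
        item = updn_counter_seq_item::type_id::create("item");
        item.count_value  = counter_vif.count_value;
        item.load_counter = counter_vif.load_counter;
        item.up_counter   = counter_vif.up_counter;
        // Same extra wait as the posted monitor when a load is seen
        if (item.load_counter)
          @(posedge counter_vif.clk);
        item.current_value = counter_vif.current_value;
        item.count_reached = counter_vif.count_reached;
        trans_collected_port.write(item);
      end
    end
  endtask : run_phase
endclass : updn_counter_monitor
```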

In reply to dave_59:

@dave_59

I spent a good amount of time debugging, analysing, and going through the paper below:
http://www.sunburst-design.com/papers/CummingsSNUG2016AUS_VerificationTimingTesting.pdf

I changed my code as below and things are working.
**But I have a question:**

I added the line below before sampling the DUT outputs:
@(posedge counter_vif.monitor_cb);
**Won't this line add a clock cycle of delay? What exactly does it do? Please let me know.**

 virtual task run_phase(uvm_phase phase);
   super.run_phase(phase);
 forever begin
   @(posedge counter_vif.clk);
    if (counter_vif.clk_en  && counter_vif.reset == 1) begin 
     seq_item_collected.count_value   = counter_vif.count_value;
     seq_item_collected.load_counter  = counter_vif.load_counter;
     seq_item_collected.up_counter    = counter_vif.up_counter;
  
      // Sample DUT outputs with monitor clock
     @(posedge counter_vif.monitor_cb);  
     seq_item_collected.current_value = counter_vif.current_value;
     seq_item_collected.count_reached = counter_vif.count_reached;

     seq_item_collected.sprint();
     
     trans_collected_port.write(seq_item_collected);
    end //if
 end //forever
 endtask : run_phase

In reply to uvmsd:

Can’t answer since you do not show the code in your interface. But yes, a clocking block adds another cycle. Once you introduce a clocking block, though, you should use only that clocking block event and not the raw clk signal. The same is true for the signals you are trying to sample.
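
One way to check what the extra wait does in a given simulator is to print $time and the sampled value around it. The module below is a self-contained sketch, not the poster's testbench: the names cb_event_demo, cb and d are made up for illustration, and it uses a #1step input skew rather than the #1 skew in the poster's interface. It waits on the raw clock edge and then on the clocking block event, as the modified monitor does, so the timestamps show whether the second wait consumes a cycle and the values show what each sampling method returns.

```systemverilog
// Hypothetical stand-alone demo; none of these names come from the thread.
module cb_event_demo;
  logic       clk = 0;
  logic [3:0] d   = 0;

  always #5 clk = ~clk;

  // Clocking block that samples d relative to posedge clk
  clocking cb @(posedge clk);
    default input #1step;
    input d;
  endclocking

  // Driver-like process: nonblocking update on every raw clock edge
  always @(posedge clk) d <= d + 1;

  initial begin
    repeat (3) begin
      @(posedge clk);
      $display("[%0t] after @(posedge clk): raw d = %0d", $time, d);
      @(cb);
      $display("[%0t] after @(cb): cb.d = %0d", $time, cb.d);
    end
    $finish;
  end
endmodule
```

Comparing the two timestamps in each pair answers the "does it add a cycle" question for this mixed style; a monitor that waits only on the clocking block event avoids the question altogether.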

In reply to dave_59:

@dave_59
Here is my interface:
interface updn_counter_if(input logic clk, reset, clk_en);

//data and control fields
logic [ABSOLUTE_DATA_WIDTH-1:0] count_value;
logic load_counter;
logic up_counter;
logic [ABSOLUTE_DATA_WIDTH-1:0] current_value;
logic count_reached;

clocking driver_cb @(posedge clk);
default input #1 output #1;
output load_counter;
output count_value;
output up_counter;
input current_value;
input count_reached;
endclocking

clocking monitor_cb @(posedge clk);
default input #1 output #1;
input load_counter;
input count_value;
input up_counter;
input current_value;
input count_reached;
endclocking
endinterface

You mean I shouldn’t use @(posedge counter_vif.clk); in the monitor to get the driven signals, and should instead use
@(posedge counter_vif.driver_cb); while copying the driven signals from the interface? If so, how do I refer to clk_en?

Output sampling is done if (counter_vif.clk_en == 1); is that valid? I mean, will clk_en and monitor_cb be in sync?

Will @(posedge counter_vif.monitor_cb); add another clock cycle delay?

But the DUT and my TB values are matching. If the above line had added a one-clock-cycle delay, the TB would have been 1 clk behind, right?

Sorry for all the questions. I just wanted to get further clarity. Thank you!
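
On the clk_en question, one possible arrangement (an assumption, not something confirmed in the thread) is to list clk_en and reset as inputs of monitor_cb, so that every value the monitor reads comes from the same sampling point. In the sketch below, the interface name updn_counter_if_sketch, the class monitor_loop_sketch, and the default width of 7 (taken from the "Size 7" shown in the log) are hypothetical; the signal names and skews follow the posted interface.

```systemverilog
// Hypothetical variant of the posted interface; the width default is an assumption.
interface updn_counter_if_sketch #(parameter int ABSOLUTE_DATA_WIDTH = 7)
                                  (input logic clk, reset, clk_en);
  logic [ABSOLUTE_DATA_WIDTH-1:0] count_value;
  logic                           load_counter;
  logic                           up_counter;
  logic [ABSOLUTE_DATA_WIDTH-1:0] current_value;
  logic                           count_reached;

  clocking monitor_cb @(posedge clk);
    default input #1 output #1;
    input reset, clk_en;  // added so the enables share the sampling point
    input load_counter, count_value, up_counter;
    input current_value, count_reached;
  endclocking
endinterface

// Monitor-style loop that uses only the clocking block event and its signals.
class monitor_loop_sketch;
  virtual updn_counter_if_sketch vif;  // assumed to be assigned by the testbench

  task run();
    forever begin
      @(vif.monitor_cb);
      if (vif.monitor_cb.clk_en && vif.monitor_cb.reset) begin
        $display("[%0t] load=%0b up=%0b count_value=%0h current_value=%0h reached=%0b",
                 $time,
                 vif.monitor_cb.load_counter, vif.monitor_cb.up_counter,
                 vif.monitor_cb.count_value, vif.monitor_cb.current_value,
                 vif.monitor_cb.count_reached);
      end
    end
  endtask
endclass
```

Whether inputs and outputs sampled on the same clocking event belong to the same transaction still depends on the DUT latency, which the thread does not specify.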