Start/finish_item slows simulation down

trogers · January 21, 2016, 12:12pm

Hello everyone!
I’ve noticed that my test runs pretty slowly, though the test itself and the DUT are not very complicated. I looked through the profiling report and found that calling “start_item” and “finish_item” in my sequence takes about 98% of all the resources used in simulation. Way too much, I guess =)
What’s wrong with these methods? Is this common, or are there techniques that speed up transfers between the sequencer and the driver?
Thank you!

tfitz · January 21, 2016, 2:35pm

I’m not aware of any specific problems with start_item/finish_item. Perhaps you could include your sequence code?

trogers · January 22, 2016, 9:02am

In reply to tfitz:

Sure.
Here’s the sequence:


class v_data_sequence extends uvm_sequence #(
  v_data_transaction #(
    .V_DATA_WIDTH(V_DATA_WIDTH),
    .N_COMP(N_COMP_IN),
    .Y_WIDTH(Y_WIDTH),
    .X_WIDTH(X_WIDTH)
  )
);
	parameter IN_IMAGE_X_WIDTH = 2560;
	parameter IN_IMAGE_Y_WIDTH = 1920;
	parameter PIX_IN_INPUT_IMAGE = IN_IMAGE_X_WIDTH + IN_IMAGE_Y_WIDTH;
	
  read_file_class #(
    .IMAGE_SIZE_X(IN_IMAGE_X_WIDTH),
    .IMAGE_SIZE_Y(IN_IMAGE_Y_WIDTH),
    .PIX_WIDTH(24),
    .ADJUST_PIX_WIDTH(10),
    .LITTLE_ENDIAN(1)
  ) reader;

  v_data_transaction #(
    .V_DATA_WIDTH(V_DATA_WIDTH),
    .N_COMP(N_COMP_IN),
    .Y_WIDTH(Y_WIDTH),
    .X_WIDTH(X_WIDTH)
  ) v_data_tx;
	
  function new( string name = "" );
    super.new( name );
  endfunction: new
     
  task body();
    bit [24-1:0] vd_mx [IN_IMAGE_Y_WIDTH-1:0][IN_IMAGE_X_WIDTH-1:0];
    int line_num = 0;
    int elem_n = 0;
    int file_id;
    int results;

    bit [V_DATA_WIDTH-1:0] pix;

    file_id = $fopen("input_image.dat", "rb");

    reader = read_file_class #(
      .IMAGE_SIZE_X(IN_IMAGE_X_WIDTH),
      .IMAGE_SIZE_Y(IN_IMAGE_Y_WIDTH),
      .PIX_WIDTH(24),
      .ADJUST_PIX_WIDTH(10),
      .LITTLE_ENDIAN(1)
    )::type_id::create( .name( "reader" ) );

    reader.read_from_file(
      .vd_mx(vd_mx),
      .start_offset(0),
      .file_id(file_id)
    );

    for (int line_num = 0; line_num < Y_SIZE; line_num++)
    begin
      for (int elem_n = 0; elem_n < X_SIZE + 2*UNACTIVE_PART; elem_n++)
      begin
        v_data_tx = v_data_transaction #(
          .V_DATA_WIDTH(V_DATA_WIDTH),
          .N_COMP(N_COMP_IN),
          .Y_WIDTH(Y_WIDTH),
          .X_WIDTH(X_WIDTH)
        )::type_id::create( .name( "v_data_tx" ) );
        start_item( v_data_tx );
        //assert( v_data_tx.randomize() );

        // Mark the pixel as active only inside the visible part of the line.
        if ((elem_n >= UNACTIVE_PART) && (elem_n < X_SIZE + UNACTIVE_PART))
        begin
          v_data_tx.x_active = 1;
        end
        else
        begin
          v_data_tx.x_active = 0;
        end

        if (v_data_tx.x_active)
        begin
          v_data_tx.x = elem_n - UNACTIVE_PART;
        end
        else
        begin
          v_data_tx.x = '0;
        end
        v_data_tx.y = line_num;

        // Note: ~ is a bitwise NOT, so (~line_num) and (~elem_n) are nonzero
        // for every non-negative index and only the R branch is ever taken.
        // A parity test such as (line_num % 2 == 0) may have been intended.
        if (~line_num)
        begin
          if (~elem_n)
          begin
            pix = vd_mx[line_num+512][elem_n+1024][7:0];//R
          end
          else
          begin
            pix = vd_mx[line_num+512][elem_n+1024][15:8];//G
          end
        end
        else
        begin
          if (~elem_n)
          begin
            pix = vd_mx[line_num+512][elem_n+1024][15:8];//G
          end
          else
          begin
            pix = vd_mx[line_num+512][elem_n+1024][23:16];//B
          end
        end

        v_data_tx.v_data = pix;//line_num + elem_n;
        finish_item( v_data_tx );
      end
    end
  endtask: body

  `uvm_object_utils( v_data_sequence )

endclass: v_data_sequence

And here’s part of the driver code:


forever
begin
  seq_item_port.get_next_item( v_data_tx );
  send_word(
    .v_data(v_data_tx.v_data),
    .y(v_data_tx.y),
    .x(v_data_tx.x),
    .x_active(v_data_tx.x_active)
  );
  seq_item_port.item_done();
end
...
task v_data_driver::send_word (
  bit [N_COMP-1:0] [V_DATA_WIDTH-1:0] v_data,
  bit [Y_WIDTH-1:0]                   y,
  bit [X_WIDTH-1:0]                   x,
  bit                                 x_active
);

  foreach (v_data[i])
  begin
    v_data_bus.v_data[i] <= v_data[i];
  end

  v_data_bus.y <= y;
  v_data_bus.x <= x;
  v_data_bus.x_active <= x_active;
  repeat (3) @(posedge v_data_bus.clk);
  v_data_bus.pix_en <= 1'b1;
  @(posedge v_data_bus.clk);
  v_data_bus.pix_en <= 1'b0;
endtask

Sorry if the indentation is off. I used Notepad, and it turned out to be a disaster)

dave_59 · January 22, 2016, 9:56pm

In reply to trogers:

I find it hard to believe if this is your only running sequence that start/finish_item would be slowing your simulation down. You would be calling them only once every 4 clock cycles. Maybe you are reading the profile incorrectly. You should work with your simulation vendor to explore more options.
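
The "once every 4 clock cycles" figure follows from the timing in send_word: three clock edges before pix_en rises, plus one more before it falls. A minimal standalone sketch that checks this is below. It is not the poster's driver; the module name, task name and clock period are made up for illustration, and only the timing skeleton of send_word is kept.

```systemverilog
// Hypothetical standalone sketch: measures how many clock periods one
// send_word-style call consumes. Not part of the original testbench.
module send_word_timing_sketch;
  localparam time PERIOD = 10;  // arbitrary clock period in time units
  bit clk;
  bit pix_en;

  // Free-running clock.
  always #(PERIOD/2) clk = ~clk;

  // Same timing skeleton as send_word in the post above.
  task automatic send_word_like();
    repeat (3) @(posedge clk);
    pix_en <= 1'b1;
    @(posedge clk);
    pix_en <= 1'b0;
  endtask

  initial begin
    time t_start;
    @(posedge clk);  // align to a clock edge so every call starts on one
    repeat (3) begin
      t_start = $time;
      send_word_like();
      $display("one word took %0d clock periods", ($time - t_start) / PERIOD);
    end
    $finish;
  end
endmodule
```

Each call should report 4 clock periods, so a sequence feeding such a driver gets through start_item/finish_item at most once per 4 clocks.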

trogers · January 26, 2016, 7:51am

In reply to dave_59:

Yeah, it’s a surprise for me too. But here’s the profiling report; lines 92 and 146 are where start_item and finish_item are called, respectively. I use QuestaSim 10.0b; maybe I’m facing this problem because of my old Questa version?

UPD: I looked at what happens in uvm_sequencer_base.svh:1017:

task uvm_sequencer_base::wait_for_item_done(uvm_sequence_base sequence_ptr,
                                            int transaction_id);
  int sequence_id;

  sequence_id = sequence_ptr.m_get_sqr_sequence_id(m_sequencer_id, 1);
  m_wait_for_item_sequence_id = -1;
  m_wait_for_item_transaction_id = -1;

  if (transaction_id == -1)
    wait (m_wait_for_item_sequence_id == sequence_id);
  else
    wait ((m_wait_for_item_sequence_id == sequence_id &&
           m_wait_for_item_transaction_id == transaction_id));
endtask

Line 1017 is:

if (transaction_id == -1)
    wait (m_wait_for_item_sequence_id == sequence_id);
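
For anyone unfamiliar with the level-sensitive wait statement on that line, here is a minimal standalone sketch of its behaviour. It is not UVM code: the module name, the variable waiting_id (standing in for m_wait_for_item_sequence_id), the id value and the delay are all made up for illustration. It only shows that the waiting process stays suspended until some other process makes the expression true.

```systemverilog
// Hypothetical standalone sketch of a level-sensitive wait; not UVM code.
module wait_level_sketch;
  localparam int MY_ID = 7;   // made-up id the waiting process looks for
  int waiting_id = -1;        // stands in for m_wait_for_item_sequence_id

  // Waiting side: suspended until waiting_id matches MY_ID.
  initial begin
    $display("[%0d] waiter: blocking on wait()", $time);
    wait (waiting_id == MY_ID);
    $display("[%0d] waiter: released", $time);
    $finish;
  end

  // Other side: makes the expression true after an arbitrary delay.
  initial begin
    #40 waiting_id = MY_ID;
  end
endmodule
```

The waiter is released at the moment the second process writes the matching id, so simulated time spent on such a line reflects how long the other side takes to respond.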