Suppose I have a protocol with two phases, ctrl and data, and I want to implement an outstanding-transaction limiter inside the driver.
The following simplified code illustrates my first attempt.
These are the phase drivers:
task ctrl_phase_driver();
  trans tr;
  vif.master_cb.ctrl_valid <= 0;
  forever
  begin
    if(tr == null)
    begin
      wait(pending_tr.size());
      tr = pending_tr.pop_front();
      @(vif.master_cb);
    end
    vif.master_cb.ctrl_sig <= tr.ctrl_sig;
    vif.master_cb.ctrl_valid <= 1;
    @(vif.master_cb iff vif.master_cb.ctrl_ready);
    vif.master_cb.ctrl_valid <= 0;
    outstanding_tr.push_back(tr); // push transaction to data phase driver queue
    tr = pending_tr.pop_front();
  end
endtask
task data_phase_driver();
  trans tr;
  vif.master_cb.data_valid <= 0;
  forever
  begin
    if(tr == null)
    begin
      wait(outstanding_tr.size());
      tr = outstanding_tr.pop_front();
      @(vif.master_cb);
    end
    foreach(tr.data[beat_num])
    begin
      vif.master_cb.data <= tr.data[beat_num];
      vif.master_cb.last <= (beat_num+1 == tr.data.size());
      vif.master_cb.data_valid <= 1;
      @(vif.master_cb iff vif.master_cb.data_ready);
      if(beat_num+1 == tr.data.size())
        --num_outstanding;
      vif.master_cb.data_valid <= 0;
      repeat(tr.delay_between_beats[beat_num]) @(vif.master_cb);
    end
    tr = outstanding_tr.pop_front();
  end
endtask
In addition, there is a kind of central scheduler which pops transaction items from seq_item_port and pushes them to pending_tr.
The outstanding mechanism has to be central because there are actually multiple 2-phase interfaces, and the outstanding limit is a function of both max_total_outstanding and max_if_outstanding
(for example, there are 3 interfaces with max_if_outstanding of 4, 5 and 6, but the whole system can be configured with max_total_outstanding=8).
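To make the combined limit concrete, the central check could look roughly like the sketch below. This is only an illustration: the class and its names (outstanding_limiter, can_issue, issue, complete, num_if_outstanding, num_total_outstanding) are hypothetical and are not part of my actual code.

```systemverilog
// Hypothetical sketch of a central limiter shared by all interfaces.
class outstanding_limiter;
  int unsigned max_total_outstanding;
  int unsigned max_if_outstanding[];   // one limit per interface
  int unsigned num_if_outstanding[];   // current count per interface
  int unsigned num_total_outstanding;  // current count over all interfaces

  function new(int unsigned max_total, int unsigned max_if[]);
    max_total_outstanding = max_total;
    max_if_outstanding    = max_if;
    num_if_outstanding    = new[max_if.size()]; // elements start at 0
  endfunction

  // A new transaction may start on interface if_id only if both the
  // system-wide limit and that interface's own limit still have room.
  function bit can_issue(int unsigned if_id);
    return (num_total_outstanding < max_total_outstanding) &&
           (num_if_outstanding[if_id] < max_if_outstanding[if_id]);
  endfunction

  // Called when the scheduler hands a transaction to interface if_id.
  function void issue(int unsigned if_id);
    num_if_outstanding[if_id]++;
    num_total_outstanding++;
  endfunction

  // Called when the last data beat of a transaction on if_id completes.
  function void complete(int unsigned if_id);
    num_if_outstanding[if_id]--;
    num_total_outstanding--;
  endfunction
endclass
```

With limits of 4, 5 and 6 and max_total_outstanding=8, if the first two interfaces each hold 4 transactions, can_issue(2) returns 0 even though the third interface is idle.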
But for this discussion we can neglect that and view the scheduler as (pseudocode):
task sch();
  forever
  begin
    wait(
      (req_fifo.size())
      &&
      (num_outstanding < cfg.num_if_outstanding_transactions)
    );
    pending_tr.push_back( req_fifo.pop_front() );
    ++num_outstanding;
  end
endtask
The problem is this:
Let's say that cfg.num_if_outstanding_transactions=1.
Since the num_outstanding decrement (inside data_phase_driver) is done after @(vif.master_cb iff vif.master_cb.data_ready),
sch becomes aware of it only after the clocking block event, and hence the next ctrl phase starts one cycle after the last data phase beat (i.e., the next ctrl phase cannot be back-to-back with the last beat of the previous data phase).
My current solution is to add a separate thread to data_phase_driver which monitors the end of the transaction:
task data_phase_driver();
  trans tr;
  fork
    forever
    begin
      wait(vif.master_cb.triggered && vif.data_ready && vif.data_valid && vif.last);
      --num_outstanding;
      wait(!vif.master_cb.triggered);
    end
  join_none
  vif.master_cb.data_valid <= 0;
  forever
  begin
    if(tr == null)
    begin
      wait(outstanding_tr.size());
      tr = outstanding_tr.pop_front();
      @(vif.master_cb);
    end
    foreach(tr.data[beat_num])
    begin
      vif.master_cb.data <= tr.data[beat_num];
      vif.master_cb.last <= (beat_num+1 == tr.data.size());
      vif.master_cb.data_valid <= 1;
      @(vif.master_cb iff vif.master_cb.data_ready);
      vif.master_cb.data_valid <= 0;
      repeat(tr.delay_between_beats[beat_num]) @(vif.master_cb);
    end
    tr = outstanding_tr.pop_front(); // fetch the next transaction, as in the first version
  end
endtask
This code works with my current simulator, but I have the feeling that I'm doing it badly (e.g., sampling signals without going through the clocking block).
Any suggestions or comments would be great.
Thanks!