Choosing a scoreboard methodology to implement

Hi all,

I have a couple of years of experience working as a verification engineer. I have verified several IPs (bus matrix interconnect, RDMA, etc.) using UVM. I have found that there are two methodologies for implementing a scoreboard:

  1. On-the-fly checking: check incoming transactions immediately as they arrive at the scoreboard during the run-time phases (e.g. main_phase).
  2. Post-run checking: collect all transactions during the run-time phases and check them in the end-of-test phases (check_phase, report_phase).

Sometimes my scenarios are very difficult to check on the fly (methodology #1), so I have to use methodology #2, which is easier to implement.

My question: is methodology #2 acceptable, and should I adopt it? My tests usually generate a really large amount of stimulus/transactions; could storing all of it consume so much memory that the simulation crashes?
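For concreteness, here is a minimal sketch of where the check lives in each methodology. The my_txn type, its fields, and the expected queues are just illustrative; the later sketches in this thread assume the same uvm_pkg import.

```systemverilog
import uvm_pkg::*;
`include "uvm_macros.svh"

// Hypothetical transaction type, only for illustration.
class my_txn extends uvm_sequence_item;
  rand int unsigned data;
  `uvm_object_utils_begin(my_txn)
    `uvm_field_int(data, UVM_ALL_ON)
  `uvm_object_utils_end
  function new(string name = "my_txn");
    super.new(name);
  endfunction
endclass

// Methodology #1: check each transaction the moment it arrives.
class sb_on_the_fly extends uvm_scoreboard;
  `uvm_component_utils(sb_on_the_fly)
  uvm_analysis_imp #(my_txn, sb_on_the_fly) ap;
  my_txn expected[$];   // short queue fed by the stimulus/predictor side

  function new(string name, uvm_component parent);
    super.new(name, parent);
    ap = new("ap", this);
  endfunction

  function void write(my_txn t);
    my_txn exp;
    if (expected.size() == 0) begin
      `uvm_error("SB", "Unexpected transaction")
      return;
    end
    exp = expected.pop_front();          // storage shrinks as soon as we check
    if (!t.compare(exp))
      `uvm_error("SB", "Mismatch")       // flagged at the instant it occurs
  endfunction
endclass

// Methodology #2: store everything, check after the run-time phases.
class sb_post_run extends uvm_scoreboard;
  `uvm_component_utils(sb_post_run)
  uvm_analysis_imp #(my_txn, sb_post_run) ap;
  my_txn expected[$];
  my_txn observed[$];   // grows for the entire test

  function new(string name, uvm_component parent);
    super.new(name, parent);
    ap = new("ap", this);
  endfunction

  function void write(my_txn t);
    observed.push_back(t);               // memory grows with stimulus count
  endfunction

  function void check_phase(uvm_phase phase);
    if (observed.size() != expected.size())
      `uvm_error("SB", "Transaction count mismatch")
    foreach (observed[i])
      if (i < expected.size() && !observed[i].compare(expected[i]))
        `uvm_error("SB", $sformatf("Mismatch at index %0d", i))
  endfunction
endclass
```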

Thanks, everybody; I hope to receive your responses.
Chris

In reply to cuonghl:

I recommend checking transactions as they are generated. This allows you to immediately identify any errors so that you can pinpoint when the errors occur. You can also stop the simulation immediately to reduce computing resource requirements.
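A sketch of one way to get that immediate stop (the quit count of 1 is just an example value; UVM_ERROR counts toward the quit count by default):

```systemverilog
// In the base test: make the first error terminate the simulation,
// so no CPU time is wasted past the point of failure.
class base_test extends uvm_test;
  `uvm_component_utils(base_test)
  function new(string name, uvm_component parent);
    super.new(name, parent);
  endfunction
  function void start_of_simulation_phase(uvm_phase phase);
    uvm_report_server svr = uvm_report_server::get_server();
    svr.set_max_quit_count(1);   // quit after the first counted error
  endfunction
endclass
```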

In reply to cuonghl:

Earlier, when we implemented checkers or loggers in our testbenches, one of the rules we normally followed was to clearly segregate the different steps as follows:

  1. collect data
  2. process
  3. log them (if this is a monitor or data logger)

As you can see, it is not a good idea to interleave steps 1 and 2 unless there is a compelling reason to do so. Interleaving not only makes the logic difficult to understand and maintain, it is also inefficient simulation-wise. For example, it makes sense to dump a transaction once rather than partially dumping it in four steps.
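A rough sketch of that separation in a monitor, reusing the hypothetical my_txn type from the first post (my_if and the sampling details are made up):

```systemverilog
// Hypothetical interface; a real one matches your bus protocol.
interface my_if(input logic clk);
  logic [31:0] data;
endinterface

class my_monitor extends uvm_monitor;
  `uvm_component_utils(my_monitor)
  uvm_analysis_port #(my_txn) ap;
  virtual my_if vif;   // set via uvm_config_db in build_phase (not shown)

  function new(string name, uvm_component parent);
    super.new(name, parent);
    ap = new("ap", this);
  endfunction

  task run_phase(uvm_phase phase);
    my_txn tr;
    forever begin
      tr = my_txn::type_id::create("tr");
      collect(tr);                              // step 1: assemble the whole transaction
      `uvm_info("MON", tr.sprint(), UVM_HIGH)   // step 3: dump it once, fully formed
      ap.write(tr);                             // step 2 (processing) happens downstream
    end
  endtask

  // Hypothetical sampling task; the real one follows your protocol timing.
  task collect(my_txn tr);
    @(posedge vif.clk);
    tr.data = vif.data;
  endtask
endclass
```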

But I would like to understand why you think on-the-fly checking is difficult. Can you give an example?

An on-the-fly checker has some benefits, but in my view the associated cost is not always worth it.
For example, on-the-fly checking can help us stop or terminate the simulation the moment we see bad behaviour, but that is hardly useful, and there are hardly any major savings in time.

Anyway, those are some thoughts from my side. I hope they help.

In reply to cgales:

Yes, I agree with you. But sometimes our scenario is too difficult to check immediately ("on the fly"). Actually, I have never seen my simulation crash, so I wonder: how much memory does methodology #2 really consume?

In reply to cuonghl:

  2. Post-run checking: collect all transactions during the run-time phases and check them in the end-of-test phases (check_phase, report_phase).
    Sometimes my scenarios are very difficult to check on the fly (methodology #1), so I have to use methodology #2, which is easier to implement.
    Chris

Could you please elaborate on why you have to use approach #2? If you have millions of patterns, it will slow down your simulation.
In most cases you can compare your data during simulation.

In reply to verif_learner:

In reply to cuonghl:
But I would like to understand why you think on-the-fly checking is difficult. Can you give an example?

As I said, we can't always use on-the-fly checking. Consider the following example:
Our test sends the DUT 2 messages (red, green); each message has 4 packets (transactions): FIRST, MIDDLE, MIDDLE and LAST. The packets of the two messages are interleaved, and the DUT only handles a message once it has received all of its packets.

Now, I want to check:

  • Whether the data of these messages is correct
  • Whether the packet ordering is correct
  • Whether the packets of the 2 messages actually interleave

Obviously, we can't check this on the fly, right? In this case, we have to collect all transactions/packets of these 2 messages and check them once the simulation has finished.
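To make the scenario concrete, the packets might look like this (a hypothetical type; the later sketches in this thread reuse it):

```systemverilog
typedef enum {FIRST, MIDDLE, LAST} pkt_pos_e;

class pkt extends uvm_sequence_item;
  rand int unsigned msg_id;   // which message: red = 0, green = 1, ...
  rand pkt_pos_e    pos;      // position within the message
  rand byte         payload[];
  `uvm_object_utils_begin(pkt)
    `uvm_field_int(msg_id, UVM_ALL_ON)
    `uvm_field_enum(pkt_pos_e, pos, UVM_ALL_ON)
    `uvm_field_array_int(payload, UVM_ALL_ON)
  `uvm_object_utils_end
  function new(string name = "pkt");
    super.new(name);
  endfunction
endclass

// One legal interleaved stream of the two 4-packet messages:
//   red.FIRST, green.FIRST, red.MIDDLE, green.MIDDLE,
//   green.MIDDLE, red.MIDDLE, red.LAST, green.LAST
```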

In reply to cuonghl:

The scenarios you are describing can easily be checked during simulation, without storing a lot of data.

In reply to chr_sue:

In reply to cuonghl:
The scenarios you are describing can easily be checked during simulation, without storing a lot of data.

Do you mean we can easily check the following items as well?

  • Whether the packet ordering is correct
  • Whether the packets of the 2 messages actually interleave

I just gave you an example with 2 messages. But what happens if we have 100 or more messages interleaved together?
Could you give me an example of how to check them?

In reply to cuonghl:

I hope you appreciate that "on the fly" means within the limits of the protocol, not literally at every time instant or every atomic transaction.

In reply to verif_learner:

In reply to cuonghl:
I hope you appreciate that "on the fly" means within the limits of the protocol, not literally at every time instant or every atomic transaction.

I don't really understand what you said, but could you give me a suggestion, with an example, for the scenario I posted in my previous post? I have 2 ways to check it:

  • Check during the run-time phases, whenever a packet arrives at the scoreboard
  • Collect everything and check in the report phases after the simulation has finished

What is the trade-off between the 2 methods?

Thanks.

In reply to cuonghl:

You can store the data that belongs together in a tlm_fifo and then perform the compare.
It does not matter how many different data packets you have; in the end you need only a very few storage places. Storing all the data and comparing after the run_phase is very, very expensive in terms of resources and time: you might store several GB of data before even starting the compare.
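A sketch of what this could look like, reusing the hypothetical pkt type from above and assuming the DUT emits packets in generation order (otherwise you would key the expected side by msg_id). Only in-flight state is ever stored, yet ordering and interleaving are both checked:

```systemverilog
class sb_interleave extends uvm_scoreboard;
  `uvm_component_utils(sb_interleave)
  uvm_tlm_analysis_fifo #(pkt) exp_fifo;  // fed by the generation side
  uvm_tlm_analysis_fifo #(pkt) act_fifo;  // fed by the DUT output monitor

  pkt_pos_e last_pos[int];   // last position seen per in-flight msg_id
  bit       interleaved;     // set once two messages are in flight at the same time

  function new(string name, uvm_component parent);
    super.new(name, parent);
    exp_fifo = new("exp_fifo", this);
    act_fifo = new("act_fifo", this);
  endfunction

  task run_phase(uvm_phase phase);
    pkt exp, act;
    forever begin
      act_fifo.get(act);        // blocks until the DUT produces a packet
      exp_fifo.get(exp);
      if (!act.compare(exp))
        `uvm_error("SB", "Data mismatch")
      check_order(act);
    end
  endtask

  function void check_order(pkt p);
    // Ordering: FIRST must open a message, LAST must close it.
    if (p.pos == FIRST && last_pos.exists(p.msg_id))
      `uvm_error("SB", $sformatf("msg %0d restarted before LAST", p.msg_id))
    if (p.pos != FIRST && !last_pos.exists(p.msg_id))
      `uvm_error("SB", $sformatf("msg %0d has no FIRST", p.msg_id))
    if (p.pos == LAST) last_pos.delete(p.msg_id);   // message done: free its state
    else               last_pos[p.msg_id] = p.pos;
    if (last_pos.num() > 1) interleaved = 1;        // the scenario did occur
  endfunction

  function void check_phase(uvm_phase phase);
    if (!interleaved)
      `uvm_warning("SB", "No packet interleaving was observed")
  endfunction
endclass
```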

In reply to chr_sue:

In reply to cuonghl:
You can store the data that belongs together in a tlm_fifo and then perform the compare.
It does not matter how many different data packets you have; in the end you need only a very few storage places. Storing all the data and comparing after the run_phase is very, very expensive in terms of resources and time: you might store several GB of data before even starting the compare.

No, that doesn't solve my example's problems. Comparing all the packets is easy with on-the-fly checking, but checking that the scenario actually happened is something more.

Thanks.

In reply to cuonghl:

What is the difference between checking on the fly and checking the scenario?
The scenario you have described above can easily be checked during the run_phase.
But if you believe you don't want to do this, it is your decision.

In reply to cuonghl:

You should ALWAYS be able to predict what the response of your DUT will be to the stimulus sent to it. This is what the design specification is for. If your DUT has an input of packets A, B and C, then you predict that it will output packets X, Y, and Z. There should never be any point where you have to decide whether the packet received should be X or Y or Z. If you have to make these decisions, then your predictor model for the DUT isn't detailed enough.
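A minimal sketch of that predict-then-compare structure, reusing the hypothetical pkt type (the byte-wise inversion is a placeholder for whatever transfer function the spec defines):

```systemverilog
// Predictor: turns each input packet into the exact expected output packet.
class my_predictor extends uvm_subscriber #(pkt);
  `uvm_component_utils(my_predictor)
  uvm_analysis_port #(pkt) expected_ap;   // feeds the scoreboard's expected side

  function new(string name, uvm_component parent);
    super.new(name, parent);
    expected_ap = new("expected_ap", this);
  endfunction

  function void write(pkt t);
    pkt exp;
    $cast(exp, t.clone());
    // Placeholder for the specification-defined behaviour; a real predictor
    // models exactly what the spec says the DUT must do with each input.
    foreach (exp.payload[i])
      exp.payload[i] = ~exp.payload[i];
    expected_ap.write(exp);
  endfunction
endclass
```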

In reply to cgales:

cgales,

I wouldn't fully agree with the above, though in most cases we should aim for this level of prediction.

In many cases the DUT uses very sophisticated algorithms, and on top of that the response will often depend on how the DUT is implemented (3 pipeline stages vs. 4 internally).
So the predictor not only has to implement the same complex algorithm (which is fine) but also has to mimic the exact internals of the DUT implementation. The second part is a pain, and the designer can change the design at any time, which is difficult to track.

So, what do we do?

If the overall sequence is known at the output, then we have safely used that. For example, check message-in against message-out without worrying about how the packet chunks within a message come out.

Normally, we evaluate all of these options, along with the cost and complexity of implementing them, and only then embark on implementation.

Anyway, that's my 2 cents, and it is what we have used in multiple projects.

In reply to verif_learner:

I consider this a very dangerous attitude for a verification engineer to have, as it can lead to very expensive design respins. The behavioral model should be 100% accurate to the RTL design. If an implementation decision affects the behavior of the design, then the implementation should be completely specified. There should be no room for interpretation by different engineers, as this leads to ambiguity and bugs.

RTL functionality is 100% deterministic. For any given set of inputs, there is exactly one output. It is the job of the verification engineer to determine what this output is and ensure that the DUT functions correctly. Anything less is unacceptable.

In reply to cgales:

Unfortunately, we live in a world driven by timelines and schedules, where designers make a huge number of changes, often driven by inputs other than verification; for example, physical-design feedback.

Anyway, let me articulate what I am saying through an example. If

a = b + c
d = e + f
y = a + d

then checking that y is as expected should be good enough. I may not want to worry about whether a and d individually match expectations.

In reply to verif_learner:

In reply to cgales:
So the predictor not only has to implement the same complex algorithm (which is fine) but also has to mimic the exact internals of the DUT implementation. The second part is a pain, and the designer can change the design at any time, which is difficult to track.

I agree with you. Sometimes we face this challenge when the DUT:

  1. is a black-box IP,
  2. includes complex algorithms,
  3. has a complex hierarchy.

It is really hard, or may take huge effort, to create a golden model that predicts the output of a DUT whose data transfers pass through an arbiter or some other unknown-delay unit.

In reply to cuonghl:

I just gave you an example with 2 messages. But what happens if we have 100 or more messages interleaved together?

Hi Chris,

Agreed; given this specification of 100 interleaved messages, the verification environment has to store all 100 messages in the FIFO to compare them against the expected output messages. For that matter, any messages that are still in the pipeline and haven't been processed by the DUT yet cannot be removed from the scoreboard FIFO.

But as soon as one message (say message 'n') is processed and comes out of the DUT, that message can be compared and deleted from the FIFOs. This should be doable in your example, at least. I believe this still counts as on-the-fly checking, even though you are storing some input messages (as required by the design scenario).

But if we also store the processed (output) messages, we may end up saving tons of messages in the scoreboard FIFOs, which can lead to simulation performance issues as well (and you cannot run worst-case heavy-traffic tests).
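A sketch of that delete-as-you-go bookkeeping, again using the hypothetical pkt type (the expected side is assumed to be filled by the stimulus path, not shown): completed messages leave the scoreboard immediately, so only in-flight traffic stays resident.

```systemverilog
class sb_msg extends uvm_scoreboard;
  `uvm_component_utils(sb_msg)
  uvm_analysis_imp #(pkt, sb_msg) dut_out_imp;

  byte expected[int][$];   // expected payload bytes, keyed by msg_id
  byte observed[int][$];   // observed payload bytes for in-flight messages only

  function new(string name, uvm_component parent);
    super.new(name, parent);
    dut_out_imp = new("dut_out_imp", this);
  endfunction

  function void write(pkt t);
    foreach (t.payload[i])
      observed[t.msg_id].push_back(t.payload[i]);
    if (t.pos == LAST) begin
      if (!expected.exists(t.msg_id) || observed[t.msg_id] != expected[t.msg_id])
        `uvm_error("SB", $sformatf("msg %0d data mismatch", t.msg_id))
      observed.delete(t.msg_id);   // message complete: reclaim its storage
      expected.delete(t.msg_id);
    end
  endfunction
endclass
```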

In reply to S.P.Rajkumar.V:

You are storing the messages on the generation side. When the DUT gives you a response, you can directly compare it with the corresponding message, thus avoiding storing all of the generated and actual messages. This is the common approach you should follow.