File parsing in SystemVerilog

Hi,

I have several test patterns provided by an external vendor in this format:

cmd_pattern.txt

data in {0x24,0x32,0x58,0x64…
0x14,0x16,0x32…}

data out {0x24,0x32,0x58,0x64…
0x14,0x16,0x32…}

randomized_key {0x45,0x64,0x56 …}

There are about 30-40 command types with argument lists of varying length, which the test case needs to parse and provide as stimulus to the DUT.
I would like to know the best method available in SystemVerilog for parsing such patterns. These are my concerns:

  1. Using $sscanf() will lead to about 30-40 conditional statements, which may not scale to future changes in the patterns.
  2. Using a Perl/Python script to pre-process the patterns helps, but it still doesn’t let me directly create a Txn object containing all this information that I can pass straight to my SystemVerilog BFMs.

I’m thinking of something along the lines of parsing the cmd file in Perl and using it to print out an SV file as shown below:

class data_txn;
bit [31:0] addr;
byte data_in[];   // dynamic arrays, so each pattern can set its own length
byte data_out[];
endclass

// the statements below would sit inside an initial block or a task
data_txn q[$];
data_txn data_1;
data_1 = new();
data_1.data_in = new[16];
data_1.data_in = '{8'h24,8'h32,8'h58,8'h64…};
q.push_back(data_1);

data_txn data_2;
data_2 = new();
data_2.data_in = new[16];
data_2.data_in = '{8'h24,8'h32,8'h58,8'h64…};
q.push_back(data_2);


Any suggestions or ideas on how to implement this better are greatly appreciated.

Thanks
Venkat

One potential problem with converting the pattern file to SystemVerilog source is that you would need to recompile your source for each test, and again each time you changed a test. That might present a performance problem.

A better approach would be to read in the file a line at a time and either use the UVM’s regular expression DPI code to pattern match the commands, or use $sscanf to parse the line. It might help to scan just the command string at the beginning of the line and use that as an index into an associative array of format strings for scanning the rest of the line. For example, something like

int code;
string line, command;
// format string for each command, indexed by command name
string cmd_format[string] = '{
    "cmd1":"cmd1 { %x, %x, %x }",
    "cmd2":"cmd2 { %x, %x, %x, %x }"
};

// scan just the command name, then rescan the whole line with its format
code = $sscanf(line,"%s",command);
if (cmd_format.exists(command))
    code = $sscanf(line,cmd_format[command],a[0],a[1],a[2],a[3],a[4],a[5],... a[max]);
else
    $error("bad command");
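
To show how the lookup above might sit inside a line-at-a-time read loop, here is a minimal sketch. It assumes the file is named cmd_pattern.txt as in the original post, that no command takes more than four arguments, and that the simulator tolerates more $sscanf arguments than the format has conversions. The module name pattern_reader and the array a are placeholders, not part of any existing testbench.

```systemverilog
module pattern_reader;
  // Format table: command name -> $sscanf format for the whole line.
  // "cmd1" and "cmd2" are the placeholder commands from the example above.
  string cmd_format[string] = '{
    "cmd1": "cmd1 { %x, %x, %x }",
    "cmd2": "cmd2 { %x, %x, %x, %x }"
  };

  initial begin
    int fd, code;
    string line, command;
    int a[4];  // assumed upper bound on arguments per command

    fd = $fopen("cmd_pattern.txt", "r");
    if (fd == 0) $fatal(1, "cannot open cmd_pattern.txt");

    // $fgets returns the number of characters read; 0 means end of file or error.
    while ($fgets(line, fd)) begin
      // Skip lines where no command word can be scanned (for example, blank lines).
      if ($sscanf(line, "%s", command) != 1) continue;
      if (!cmd_format.exists(command)) begin
        $error("bad command: %s", command);
        continue;
      end
      // code holds the number of arguments that were successfully converted.
      code = $sscanf(line, cmd_format[command], a[0], a[1], a[2], a[3]);
      $display("%s: %0d argument(s) parsed", command, code);
    end
    $fclose(fd);
  end
endmodule
```

Adding a new command then only means adding an entry to cmd_format, rather than another conditional branch.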

In reply to dave_59:

Hi Dave,

Thanks for pointing out the potential performance issue. In my case, however, the patterns are provided by the vendor and mostly act as a sanity test suite, which I don’t edit or modify at present. So each pattern translates to exactly one kind of test, without much variation. However, I do like the idea of using an associative array and writing the parser once, which gives additional flexibility in the test without the need to recompile. I also need to generate some covergroups for the Txn objects I will create from the test patterns, to get an up-front view of what portion of this testing is covered by the vendor IP.

Thanks
Venkat

In reply to venkstart:

Hi Venkat/Dave,

I believe the size of data_in or data_out is not mentioned. So how do you intend to determine the number of byte elements provided?
If the size is dynamic, a fixed cmd_format pattern cannot cover all the possible lengths.

I have a similar requirement now, and I’m stuck on how to decide on the length and how to extract the bytes from the given data.
E.g.: assume I have an unpacked array in my SV file and I need to fill it with the byte elements from data given in a text file.
bit [7:0] data[]; // data type in SV (unpacked, dynamically sized)
// data format in the text file
data_in 00010203040506
data_in 000102
data_in 121314151617181920212223
So, I would like to know whether there is a better or easier way of extracting the data bytes from this sort of data.
Any other suggestions on this extraction are most welcome.

Thanks,
Somasekhar M

In reply to Somasekhar M:

The easy way to do this is to read the data as a single string. Then declare a queue and push a byte at a time as you convert each ASCII element to binary.

The performance of this should be OK but if not, you can do this with a DPI-C routine.
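
The queue approach described above could be sketched roughly as follows. This assumes each byte is written as exactly two hex digits, as in the data_in examples earlier in the thread. The names hex_to_bytes, byte_q_t and hex_string_demo are made up for illustration.

```systemverilog
module hex_string_demo;
  typedef bit [7:0] byte_q_t[$];

  // Convert a string of hex digits, two per byte, into a queue of bytes.
  // Returns an empty queue if the digit count is odd.
  function automatic byte_q_t hex_to_bytes(string s);
    byte_q_t q;
    if (s.len() % 2 != 0) begin
      $error("odd number of hex digits in \"%s\"", s);
      return q;
    end
    for (int i = 0; i < s.len(); i += 2) begin
      // substr(i, j) returns characters i through j inclusive.
      string pair = s.substr(i, i + 1);
      q.push_back(pair.atohex());
    end
    return q;
  endfunction

  initial begin
    string line, name, digits;
    byte_q_t bytes;
    line = "data_in 000102";
    // First %s takes the field name, second %s takes the whole digit string.
    if ($sscanf(line, "%s %s", name, digits) == 2) begin
      bytes = hex_to_bytes(digits);
      $display("%s: %0d byte(s)", name, bytes.size());
    end
  end
endmodule
```

Because the queue grows as bytes are pushed, the length of the data never has to be known in advance.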

In reply to dave_59:

Thanks Dave.

I shall try the queue approach first.
Could you let me know what the pseudocode for this DPI would look like? Or could you point me to some references that would help build this DPI quickly?

I have two more questions regarding this file parsing.

  1. If the elements are separated by ‘,’ instead of spaces, how do I extract them? The above code doesn’t seem to work with ‘,’.
  2. I want to extract the command and process the data based on the number of arguments successfully returned. Can I do that? Currently $sscanf always returns ‘0’.
    Eg:
    command.txt:
    READ ADDR // Perform read from the address specified
    READ ADDR EXP_DATA // Perform read from the address and match with EXP_DATA. Trigger an error in case of mismatch

Thanks,
Somasekhar M

In reply to Somasekhar M:

Dealing with dynamically sized data generated in C and passed to SystemVerilog through the DPI can be tricky and if not done properly, you can lose the efficiency you were trying to gain in the first place. I would avoid trying to use the DPI in this case unless you really think the performance of interpreting the text file will be an issue when compared to the performance of the rest of the simulation.

With the %s format, $fscanf and $sscanf both read a string as a sequence of non-white-space characters. So the first %s will consume the entire line unless there is some white space in it, which is why comma-separated elements are not split.
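
As a rough illustration of both points, one option is to turn commas into spaces before scanning and then branch on the value $sscanf returns, which is the number of items it converted. This is only a sketch: the hex values stand in for the ADDR and EXP_DATA placeholders, and commas_to_spaces, run_line and read_cmd_demo are hypothetical names.

```systemverilog
module read_cmd_demo;
  // Replace every comma with a space so that %s and %h see separate tokens.
  function automatic string commas_to_spaces(string s);
    for (int i = 0; i < s.len(); i++)
      if (s[i] == ",") s[i] = " ";
    return s;
  endfunction

  // Decode one READ line and act on how many fields were present.
  task automatic run_line(string line);
    string clean, cmd;
    int unsigned addr, exp_data;
    int n;
    clean = commas_to_spaces(line);
    n = $sscanf(clean, "%s %h %h", cmd, addr, exp_data);
    case (n)
      2: $display("%s 0x%0h: read only", cmd, addr);
      3: $display("%s 0x%0h: read and compare with 0x%0h", cmd, addr, exp_data);
      default: $error("cannot parse \"%s\" ($sscanf returned %0d)", line, n);
    endcase
  endtask

  initial begin
    run_line("READ 1000");       // address only
    run_line("READ,1000,ABCD");  // comma-separated, with expected data
  end
endmodule
```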

For your simple test format, you might consider building a state machine to scan the line a character at a time, since you will need to do that anyway to handle your variable-sized data.
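
A character-at-a-time scanner along those lines might look like the sketch below. It assumes a line holds a field name followed by hex digits, two per byte, with spaces or commas allowed between bytes. The two states and all of the names (scan_line, hex_val, line_scanner_demo) are invented for this example.

```systemverilog
module line_scanner_demo;
  typedef enum {S_NAME, S_DATA} state_e;
  typedef bit [7:0] byte_q_t[$];

  // Value of one hex digit, or -1 if the character is not a hex digit.
  function automatic int hex_val(byte c);
    if (c >= "0" && c <= "9") return c - "0";
    if (c >= "a" && c <= "f") return c - "a" + 10;
    if (c >= "A" && c <= "F") return c - "A" + 10;
    return -1;
  endfunction

  // Scan one line; returns 1 on success, 0 on a malformed line.
  function automatic bit scan_line(string line,
                                   output string name,
                                   output byte_q_t data);
    state_e state = S_NAME;
    bit [7:0] cur = 0;
    int nibbles = 0;
    name = "";
    data.delete();
    for (int i = 0; i < line.len(); i++) begin
      byte c = line[i];
      bit is_sep = (c == " " || c == "," || c == "\t" || c == "\n");
      case (state)
        S_NAME:  // collect the field name up to the first separator
          if (is_sep) begin
            name = line.substr(0, i - 1);
            state = S_DATA;
          end
        S_DATA:  // collect hex digits two at a time
          if (is_sep) begin
            if (nibbles != 0) return 0;  // separator in the middle of a byte
          end
          else begin
            int v = hex_val(c);
            if (v < 0) return 0;         // not a hex digit
            cur = {cur[3:0], v[3:0]};
            nibbles++;
            if (nibbles == 2) begin
              data.push_back(cur);
              nibbles = 0;
            end
          end
      endcase
    end
    if (state == S_NAME) name = line;    // line had a name but no data
    return (nibbles == 0);
  endfunction

  initial begin
    string name;
    byte_q_t data;
    if (scan_line("data_in 00,01 0203\n", name, data))
      $display("%s: %0d byte(s)", name, data.size());
    else
      $error("malformed line");
  end
endmodule
```

More states (for example, one for a trailing comment or for braces) can be added as further case items without touching the existing ones.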

In reply to dave_59:

Thanks Dave for your inputs. I shall be going with a state machine kind of approach.