Randomization of "triangular shaped" distribution around desired average

I would like to randomize an array of numbers with the following given constraints:

  1. all numbers are inside [1:7]
  2. array average is 4.882
  3. “triangular shaped” distribution around the array average, i.e. the majority of the numbers (>90%) should be in the range [$floor(avg):$ceil(avg)] (=[4:5]); then, with lower probability, the numbers {3,6}; and with the least probability (~1%), the numbers {1,2,7}

I’ve tried the following code:


class credit_flow_data #(int size = 1000);
  int                         credit_prob;
  rand int                    credit_high_duration_arr[size];
  longint                     credit_high_duration_arr_desired_sum;
  real                        credit_high_duration_desired_avg;
  real                        credit_low_duration_desired_avg;

  function new(int credit_prob, real credit_low_duration_desired_avg);
    this.credit_prob = credit_prob;
    this.credit_low_duration_desired_avg = credit_low_duration_desired_avg;
    calc_credit_high_duration_desired_avg();
    this.credit_high_duration_arr_desired_sum = longint'(this.size*this.credit_high_duration_desired_avg);
  endfunction
  
  function void calc_credit_high_duration_desired_avg();
    real credit_prob_normalized = real'(this.credit_prob)/100.0;
    if (credit_prob_normalized == 0 || credit_prob_normalized == 1) begin
      this.credit_high_duration_desired_avg = credit_prob_normalized;
    end
    else begin
      this.credit_high_duration_desired_avg = credit_prob_normalized/(1-credit_prob_normalized)*this.credit_low_duration_desired_avg;
    end
    $display("credit_high_duration_desired_avg = %0f", credit_high_duration_desired_avg);
  endfunction: calc_credit_high_duration_desired_avg
  
  
  constraint credit_high_duration_arr_c {
    if (this.credit_prob == 0 || this.credit_prob == 100) {
      foreach (credit_high_duration_arr[i]) {
        credit_high_duration_arr[i] == this.credit_prob/100;
      }
    }
    else {
      foreach (credit_high_duration_arr[i]) {
        credit_high_duration_arr[i] inside {[1:7]};
        credit_high_duration_arr[i] dist {4:=950, 5:=950, [1:2]:/5, 3:=50, 6:=50, 7:=5};
      }
      credit_high_duration_arr.sum() with (int'(item)) == credit_high_duration_arr_desired_sum;
    }
  }
endclass

module tb;
  credit_flow_data #(1000) data;
  int credit_prob;
  initial begin
    credit_prob = 83;
    data = new(credit_prob, 1);
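    // Check the return value so that a failed randomization is not silently ignored
    if (!data.randomize()) $error("randomize() failed");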
    for (int i=0; i<data.size; i++) begin
      $display("%0d", data.credit_high_duration_arr[i]);
    end
  end
endmodule

The average is as expected (4.882), but oddly the share of 7’s is far too high: 154/1000 = 15.4%!
In addition, most of the 7’s are located at the end of the array (as a sequence of 7’s), which is undesirable. I would like a more shuffled behavior (i.e. the 7’s should be spread throughout the array).

Any suggestions?

Other solutions to the problem are also welcome :)

In reply to l87:

The dist construct is not really a hard constraint, except that values with a weight of 0, or values not mentioned at all, are excluded (meaning the inside {[1:7]} constraint is redundant).

Without the sum constraint, the fixed-value distribution you wrote would produce an average of 4.4988 for each array element. Note that [1:2]:/5 is just a shortcut for writing 1:=2.5, 2:=2.5, not that an array element has a choice of having the value 1 or 2, with a combined weight of 5.
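For reference, the 4.4988 figure can be reproduced from the weights in the dist expression, taking [1:2]:/5 as a weight of 2.5 for each of 1 and 2:

```latex
E[x] = \frac{\sum_v v\,w_v}{\sum_v w_v}
     = \frac{1(2.5) + 2(2.5) + 3(50) + 4(950) + 5(950) + 6(50) + 7(5)}
            {2.5 + 2.5 + 50 + 950 + 950 + 50 + 5}
     = \frac{9042.5}{2010} \approx 4.4988
```

Since 4.4988 is below the requested 4.882, the sum constraint can only be met if larger values occur more often than the weights alone suggest, which is consistent with the surplus of 7’s that was observed.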

In reply to dave_59:

Thanks Dave!

I know that with the specific dist values I wrote, the inside constraint is redundant.

Actually, the desired average, which I first tried to simplify, can be any value in the range [1:7], and it’s derived from the “credit_prob” value, which is in the range [50:87]. That is, the user sets the desired “credit_prob” (using the new function), and from it I need to derive the desired average (using the “calc_credit_high_duration_desired_avg” function) and the distribution for the randomized values.
For example, the value 4.882 that I originally wrote corresponds to “credit_prob”=83. The dist values were given as an example of the desired “triangular shaped” distribution.
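As a worked check of that mapping, here is the formula from calc_credit_high_duration_desired_avg written out, with p = credit_prob/100 and credit_low_duration_desired_avg = 1 (the symbol p is only shorthand introduced here):

```latex
\text{avg}(p) = \frac{p}{1 - p} \cdot 1, \qquad
\text{avg}(0.83) = \frac{0.83}{0.17} \approx 4.882, \qquad
\text{avg}(0.50) = 1, \qquad
\text{avg}(0.87) = \frac{0.87}{0.13} \approx 6.692
```

So for credit_prob in [50:87] the desired average stays inside [1:7], and for size = 1000 and credit_prob = 83 the desired sum is 4882.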

So, clarifying the above, what I actually need is:

  1. All numbers are inside [1:7]
  2. User sets “credit_prob” in the range [50:87] → calculate desired average (in the range [1:7]) - “credit_high_duration_desired_avg”
  3. “triangular shaped” distribution around the array average. i.e. majority of the numbers (>90%) should be in the range [$floor(avg):$ceil(avg)], and then with lower probability the numbers {$floor(avg)-1,$ceil(avg)+1}, and then with the least probability (~1%) the numbers {[1:$floor(avg)-2],[$ceil(avg)+2:7]}

Please note that currently I assume “credit_low_duration_desired_avg”=1 (it may change later).

Hopefully now it’s more clear :)

In reply to l87:

What I’m trying to ask is whether the average of the array elements is a hard constraint, or whether it is enough to just have the triangular distribution. It might be too difficult to satisfy both constraints.

In reply to dave_59:

I need the sum to be around the given value.
I can accept some range around the desired sum if that allows me to get the triangular distribution I want.
For example, if the desired sum is 1000, then I can live with an actual sum in the range [990:1010] if that gives a more accurate triangular distribution (if it’s even possible, of course).
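
One possible direction, shown only as a sketch and not tested on any particular simulator: instead of constraining all 1000 elements directly, randomize a small histogram (how many elements get each value), constrain the weighted sum to the tolerated range, then fill and shuffle the array in post_randomize(). This replaces the dist-based approach with a different technique. The class name credit_hist_gen, the members cnt, arr, tol, near_max and far_max, and the 4% / 0.5% caps are all assumptions made for illustration.

```systemverilog
// Hypothetical sketch: credit_hist_gen, cnt, arr, tol, near_max and far_max
// are illustrative names, not taken from the original post.
class credit_hist_gen #(int size = 1000);
  rand int cnt[1:7];    // cnt[v] = number of array elements that get value v
  int      arr[size];   // final array, built in post_randomize()
  int      lo, hi;      // floor and ceil of the desired average
  int      target_sum;  // size * desired average, rounded
  int      tol;         // accepted deviation of the actual sum
  int      near_max;    // assumed cap per "near" value (lo-1, hi+1): 4% of size
  int      far_max;     // assumed cap per "far" value: 0.5% of size

  function new(real desired_avg, int tol = 10);
    this.lo         = int'($floor(desired_avg));
    this.hi         = int'($ceil(desired_avg));
    this.target_sum = int'(size * desired_avg);
    this.tol        = tol;
    this.near_max   = (size * 4) / 100;
    this.far_max    = (size * 5) / 1000;
  endfunction

  constraint hist_c {
    foreach (cnt[v]) {
      cnt[v] >= 0;
      if (v >= lo && v <= hi)              cnt[v] <= size;      // core values
      else if (v == lo - 1 || v == hi + 1) cnt[v] <= near_max;  // near values
      else                                 cnt[v] <= far_max;   // far values
    }
    // Every array element gets exactly one value
    cnt.sum() == size;
    // Sum of the final array must land within the tolerated range
    (1*cnt[1] + 2*cnt[2] + 3*cnt[3] + 4*cnt[4] + 5*cnt[5] + 6*cnt[6] + 7*cnt[7])
      inside {[target_sum - tol : target_sum + tol]};
  }

  function void post_randomize();
    int k;
    k = 0;
    // Expand the histogram into the array ...
    foreach (cnt[v]) begin
      repeat (cnt[v]) begin
        arr[k] = v;
        k++;
      end
    end
    // ... then shuffle so that equal values are not grouped together
    arr.shuffle();
  endfunction
endclass

module tb_hist;
  credit_hist_gen #(1000) gen;
  initial begin
    gen = new(4.882);
    if (!gen.randomize()) $fatal(1, "randomize() failed");
    $display("sum = %0d", gen.arr.sum());
    foreach (gen.cnt[v]) $display("value %0d: %0d times", v, gen.cnt[v]);
  end
endmodule
```

With these assumed caps, at most 2*4% + 4*0.5% = 10% of the elements can fall outside [lo:hi], so the core range always holds at least 90%. The solver only has to handle seven variables, and the spread of 7’s across the array comes from shuffle() rather than from the solver’s ordering. How evenly the solver picks among valid histograms is tool-dependent.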