Why is simulation performance better with unpacked arrays than with packed arrays?

Suppose there are two arrays, one packed and one unpacked.
Ex: bit [1:0][7:0] packed_arr;
bit unpacked_arr[1:0][7:0];

Accessing a single bit, i.e. packed_arr[0][0] or unpacked_arr[0][0], would require masking off the 15 MSBs in either array if it is a (16X255) memory.

So how can one say that simulation performance is better with unpacked arrays?

In reply to Shruti Kamble:

How a tool decides to organize an array in the host simulator’s memory is not visible to the tool user. You get better performance when the array element boundaries align with the host memory byte or word boundaries. Since unpacked arrays are more likely to be accessed one element at a time, the compiler might decide to allocate one byte for each bit of the array.
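To make the difference concrete, here is a minimal C sketch of two host-memory layouts a simulator could choose. Both layouts and all names (packed_store, unpacked_store, packed_read, unpacked_read) are hypothetical illustrations, not how any particular tool stores arrays.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout A: all 16 bits of the packed array share one host word. */
typedef struct { uint16_t word; } packed_store;

/* Hypothetical layout B: every bit of the unpacked array gets its own host byte. */
typedef struct { uint8_t bit[16]; } unpacked_store;

/* Reading one bit out of the shared word needs a shift and a mask. */
static inline unsigned packed_read(const packed_store *s, unsigned idx) {
    assert(idx < 16);
    return (s->word >> idx) & 1u;
}

/* Reading one bit from the byte-per-bit layout is a plain indexed load. */
static inline unsigned unpacked_read(const unpacked_store *s, unsigned idx) {
    assert(idx < 16);
    return s->bit[idx];
}
```

The byte-per-bit layout spends eight times the memory in exchange for skipping the shift and mask on every element access. That trade only pays off when elements are accessed individually, which is the typical pattern for unpacked arrays.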

In reply to dave_59:

Thank you dave_59.
But do you mean to say that packed arrays don’t align with the host memory byte or word boundaries?
Is that why packed array performance is poorer than unpacked?

In reply to Shruti Kamble:
Packed array elements are less likely to align with the host memory byte or word boundaries.

In reply to dave_59:

Thank you Dave… :)

In reply to dave_59:
How exactly does alignment with host memory help in faster access to the unpacked array?
Does it mean fewer clock cycles are needed to get the element from memory?

In reply to Curious_cat:

Yes, performance here comes down to clock cycles per operation. An unaligned read may have to touch more memory locations, and an unaligned write may require a read-modify-write operation.
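The write case can be sketched in C as follows. This assumes a hypothetical layout in which 16 one-bit elements share a single 16-bit host word, compared against one host byte per element. The function names are made up for illustration and do not come from any simulator.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical: element idx lives inside a host word shared with 15 others,
   so updating it takes a read, a modify, and a write. */
static void shared_word_write(uint16_t *word, unsigned idx, unsigned value) {
    assert(idx < 16);
    uint16_t old = *word;                               /* read */
    uint16_t cleared = old & (uint16_t)~(1u << idx);    /* modify: clear the target bit */
    *word = cleared | (uint16_t)((value & 1u) << idx);  /* write the merged word back */
}

/* Hypothetical: each element owns a full host byte, so a single store suffices. */
static void own_byte_write(uint8_t bits[16], unsigned idx, unsigned value) {
    assert(idx < 16);
    bits[idx] = (uint8_t)(value & 1u);
}
```

In the shared-word version the neighbouring 15 bits have to be preserved, which is what forces the extra read before the write. The byte-per-element version never touches its neighbours.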