Simulation speed and memory footprint of SV data-types

Dear experts,

I am looking for some information and associated guidelines regarding the use of SystemVerilog data-types to optimize the simulation speed and memory footprint with Questa, especially in the context of UVM classes (sequence items, components, etc).

Regarding the 2-state logic data-types, SV provides bit, byte, shortint, int and longint. Am I right to assume that SV data-types may not be equivalent in terms of simulation speed and memory footprint?

Booleans/Binary decisions:
They can be encoded in a single bit. In C, the result of a boolean operation is commonly encoded as an int data-type. In SV with Questa, is there any speed/memory benefit or penalty of using the bit data-type instead of the int data-type (or any other data-type) to store the result of boolean operations?

Packed bits vs. byte/shortint/int/longint data-types:
For instance, a 4-bit value (nibble) can be declared as bit [3:0] or stored in a larger standard SV data-type (byte, shortint, int, longint). As for the performance of packed bits versus standard SV data-types, I guess the simulation speed depends on the operations performed on the variable. However, if there is no significant difference between packed bits and standard data-types in terms of memory footprint and simulation speed, it might be easier to use packed bits. If there is a noticeable difference, it might be preferable to use standard SV data-types and mask the unused bits whenever necessary. Is there an evaluation of the speed/memory benefit or penalty of using packed bits rather than a standard data-type?
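To make the two options concrete, here is a minimal sketch of what I mean. The module and variable names are purely illustrative, and it says nothing about which form simulates faster:

```systemverilog
module nibble_sketch;
  // Option A: packed 2-state vector sized exactly to the data.
  bit [3:0] nibble_packed;

  // Option B: a wider built-in type; the unused upper bits are masked by hand.
  // byte is signed by default, so it is declared unsigned here.
  byte unsigned nibble_in_byte;

  initial begin
    nibble_packed  = 4'hF;
    nibble_in_byte = 8'h0F;

    // The packed vector wraps at 4 bits when the result is assigned back.
    nibble_packed = nibble_packed + 1;

    // The byte needs an explicit mask to get the same wrap-around.
    nibble_in_byte = (nibble_in_byte + 1) & 8'h0F;

    // Both values are 0 after the wrap.
    $display("packed=%0h byte=%0h", nibble_packed, nibble_in_byte);
  end
endmodule
```

With option B, every write that can overflow 4 bits needs the mask, which is the extra effort I would like to weigh against any performance gain.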

Byte vs shortint vs int vs longint data-types:
While the number of bits required for each data-type differs, is there a significant difference in terms of memory footprint and simulation speed between these 4 SV data-types with Questa on current Intel processors (linux x64)?
For instance, is the simulation faster when accessing/manipulating members of int data-type rather than of other data-types?
Is the memory footprint always proportional to the bit width of the standard data-type (would 4 data-members of int data-type always require 4 times the memory of 4 data-members of byte data-type)?

Can the relative performance of bits, packed-bits, byte, shortint, int and longint be considered as similar in case of data-members of classes and in case of variables/arguments of functions and tasks?

PS: On 2-state versus 4-state logic data-types, I assume that 2-state logic simulates faster than 4-state logic, leading to the underlying rule: preferably use 2-state logic unless 4-state logic is required.

In reply to Denis Lavaud:

This public methodology forum is not for tool specific help, but I can give you some general advice about the use of SystemVerilog data types.

The memory layout of any SystemVerilog data type is not defined by the standard, and in fact a simulator's optimizations may change the layout depending on how a variable is used in a particular design. Users have no way of determining the actual memory layout because there are no pointers in SystemVerilog. You can reasonably assume that a 4-state type takes up twice the memory of a 2-state type of the same width. But you need to remember that there is a lot of extra overhead around each variable, depending on things like event controls (is any piece of code waiting for this variable, or for individual bits of it, to change?) and randomization.
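As a rough illustration of what the language does let you observe, here is a minimal sketch using $bits. The module and typedef names are made up for the example. $bits reports the logical width of a type, not the storage a simulator allocates for it:

```systemverilog
module bits_sketch;
  // Hypothetical typedefs, used only to give the packed types a name.
  typedef bit   [3:0] nibble2_t;  // 2-state, 4 bits
  typedef logic [7:0] byte4_t;    // 4-state, 8 bits

  initial begin
    // 2-state built-in types
    $display("nibble2_t : %0d", $bits(nibble2_t)); // 4
    $display("byte      : %0d", $bits(byte));      // 8
    $display("shortint  : %0d", $bits(shortint));  // 16
    $display("int       : %0d", $bits(int));       // 32
    $display("longint   : %0d", $bits(longint));   // 64

    // 4-state types report the same logical width as their 2-state
    // counterparts, even though they are likely to need more storage.
    $display("byte4_t   : %0d", $bits(byte4_t));   // 8
    $display("integer   : %0d", $bits(integer));   // 32
  end
endmodule
```

Note that byte and byte4_t both report 8 bits, so nothing here reveals the per-variable overhead described above.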

Performance of 2-state versus 4-state depends more on your usage. If you are going to be accessing your variables bit-by-bit, then 2-state versus 4-state makes no difference in performance. But operations on larger aggregates may be faster with 2-state types because less memory has to be moved. Most simulators are bound by cache throughput more than by raw processor power.
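To show the two access patterns I am contrasting, here is a minimal sketch with illustrative names and an arbitrary test value. It only shows the coding styles and makes no claim about how any particular simulator executes them:

```systemverilog
module access_sketch;
  bit   [63:0] word2 = 64'hDEAD_BEEF_0123_4567;  // 2-state aggregate
  logic [63:0] word4 = 64'hDEAD_BEEF_0123_4567;  // 4-state aggregate

  bit   parity_loop;
  bit   parity_reduce;
  logic parity_reduce4;

  initial begin
    // Bit-by-bit access: one bit is touched per iteration.
    parity_loop = 0;
    foreach (word2[i]) parity_loop ^= word2[i];

    // Aggregate access: a single reduction over the whole vector.
    parity_reduce  = ^word2;
    parity_reduce4 = ^word4;

    // All three values agree because word4 holds no X or Z bits.
    $display("loop=%0b reduce=%0b reduce4=%0b",
             parity_loop, parity_reduce, parity_reduce4);
  end
endmodule
```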

Most tools provide profiling features to help you determine where the memory and performance bottlenecks are.