Matrix multiplication

Can you explain what you are trying to accomplish? I see that variables a and b are each 32 bits wide. Are you trying to make each element 4 bits wide? If so, each element of the result must be at least 8 bits wide, because the product of two 4-bit values can need up to 8 bits.
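
To make the width arithmetic concrete, here is a rough sketch in Python rather than in your own language. It assumes unsigned elements and that element 0 sits in the least significant 4 bits of the 32-bit word. Both are guesses on my part, since the original code isn't shown, and the function names are made up for illustration. It also shows that summing several products, as a matrix multiply does, can push the requirement past 8 bits.

```python
def unpack(word, elem_bits=4, total_bits=32):
    """Split a packed word into elem_bits-wide unsigned fields.

    Assumption: element 0 is the least significant field.
    """
    mask = (1 << elem_bits) - 1
    return [(word >> shift) & mask for shift in range(0, total_bits, elem_bits)]


def product_bits(elem_bits):
    """Bits needed for the largest product of two unsigned elem_bits-wide values."""
    max_val = (1 << elem_bits) - 1
    return (max_val * max_val).bit_length()


def dot_product_bits(elem_bits, terms):
    """Bits needed for the largest sum of `terms` such products."""
    max_val = (1 << elem_bits) - 1
    return (terms * max_val * max_val).bit_length()


if __name__ == "__main__":
    print(unpack(0x12345678))      # eight 4-bit fields from one 32-bit word
    print(product_bits(4))         # width of a single 4-bit x 4-bit product
    print(dot_product_bits(4, 2))  # width once two products are added together
```

So 8 bits is the minimum for a single product; if each result element is a sum of two or more products, it may need an extra bit or more on top of that.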