In reply to eda2k4:
This is a perfect example of a completely superfluous #0 (at least for the reason stated in the comment).
The line immediately after the #0 is:
wait (m_phase_all_done == 1);
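Any suspension or completion of the parent thread causes the child threads to start. The wait is itself a blocking statement, so it already gives the child threads their chance to run without the #0.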
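
A minimal sketch of the point, assuming the child threads come from a fork/join_none (the original code is not quoted here, so the module name tb, the #10 delay, and the $display messages are made up for illustration; only m_phase_all_done is taken from the line above):

```systemverilog
module tb;
  bit m_phase_all_done;

  initial begin
    fork
      begin
        // Hypothetical child thread: it does not start until the
        // parent blocks or terminates.
        $display("[%0t] child started", $time);
        #10 m_phase_all_done = 1;
      end
    join_none

    // No #0 here. The parent keeps running until it reaches a
    // blocking statement, which is the wait below.
    $display("[%0t] parent reached wait", $time);
    wait (m_phase_all_done == 1);
    $display("[%0t] parent resumed", $time);
  end
endmodule
```

In a simulator that follows the LRM scheduling rules for join_none, the "parent reached wait" message should appear before "child started", both at time 0, and the parent resumes once the child sets the flag. Adding a #0 before the wait would only move the point at which the parent first suspends; it would not change whether the child gets to run.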