Welcome!
On many computers I have experienced poor performance of 32-bit guests running on a 64-bit Linux host (I have only used the Debian family). I have finally managed to collect benchmark data.
I ran the benchmark using a custom VBA macro (which we use in our company) that generates a 284-page Word document full of Excel pie charts, tables and comments. The macro runs as the only task (apart from the standard services) on a set of identically configured 32-bit Windows XP systems. I measured the time (in seconds) needed to complete the test.
The computer (my Asus P53E notebook) supports the VT-x extensions and can also run Windows XP natively. It has a 2-core processor with hyper-threading on each core, so in total there are 4 mostly independent execution units.
I use the latest VirtualBox 4.2 and VMware Workstation 9.0 for Linux, both installed on the same host (running Mint 13 Maya) but never run simultaneously.
The times (in the Time column) are accurate to within ±10%.
Here are the results (sorry for the format, but I couldn't find a better way to do tables on SO):
+---------------+--------------+------------------------------------------------------+---------+------------+----------------+----------+
| Host software | # processors | Windows kernel                                       | IO APIC | VT-x/AMD-V | 2D Video Accel | Time (s) |
+---------------+--------------+------------------------------------------------------+---------+------------+----------------+----------+
| VirtualBox    | 1            | Advanced Configuration and Power Interface (ACPI) PC | 0       | 1          | 0              | 1139     |
| VirtualBox    | 1            | Advanced Configuration and Power Interface (ACPI) PC | 0       | 1          | 1              | 1050     |
| VirtualBox    | 1            | Advanced Configuration and Power Interface (ACPI) PC | 0       | 0          | 1              | 1644     |
| VirtualBox    | 4            | ACPI Multiprocessor PC                               | 1       | 1          | 1              | 6809     |
| VMware        | 1            | ACPI Uniprocessor PC                                 |         | 1          | 1              | 1175     |
| VMware        | 4            | ACPI Multiprocessor PC                               |         | 1          | 1              | 3412     |
| Native        | 4            | ACPI Multiprocessor PC                               |         |            |                | 1693     |
| Native        | 1            | Advanced Configuration and Power Interface (ACPI) PC |         |            |                | 1170     |
+---------------+--------------+------------------------------------------------------+---------+------------+----------------+----------+
Here are the striking conclusions:
Although I've read on the VirtualBox forums about abysmal performance with 32-bit guests on 64-bit hosts, VMware also has problems compared to the native run (3412 s vs. 1693 s with 4 processors), while still being twice as fast(!) as VirtualBox (6809 s).
Although VBA is inherently single-threaded, the Excel calculations, which take well over half of the total computation time, supposedly aren't. So one would expect some speed gain when running on 2+ cores ("+" for hyper-threading). What we see instead is a speed loss, and quite a big one too.
For VirtualBox, the VT-x extension isn't a big deal (1050 s with it vs. 1644 s without).
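For anyone who wants to check the arithmetic behind these conclusions, here is a minimal Python sketch that recomputes the ratios from the Time column. The row labels and the helper names `ratio` and `distinguishable` are made up for illustration, and treating the ±10% accuracy as a symmetric relative error on each time is an assumption on my part.

```python
# Times in seconds, copied from the Time column of the results table.
# The dictionary keys are hypothetical labels for the table rows.
times = {
    "VirtualBox, 1 CPU, VT-x, no 2D accel": 1139,
    "VirtualBox, 1 CPU, VT-x, 2D accel": 1050,
    "VirtualBox, 1 CPU, no VT-x, 2D accel": 1644,
    "VirtualBox, 4 CPUs": 6809,
    "VMware, 1 CPU": 1175,
    "VMware, 4 CPUs": 3412,
    "Native, 4 CPUs": 1693,
    "Native, 1 CPU": 1170,
}


def ratio(slow, fast):
    """Return how many times longer the `slow` run took than the `fast` run."""
    return times[slow] / times[fast]


def distinguishable(a, b, rel_err=0.10):
    """True if the +/- rel_err intervals around the two times do not overlap.

    Assumption: the stated accuracy applies as a relative error to each time.
    """
    lo_a, hi_a = times[a] * (1 - rel_err), times[a] * (1 + rel_err)
    lo_b, hi_b = times[b] * (1 - rel_err), times[b] * (1 + rel_err)
    return hi_a < lo_b or hi_b < lo_a


if __name__ == "__main__":
    comparisons = [
        ("VirtualBox, 4 CPUs", "VMware, 4 CPUs"),
        ("VMware, 4 CPUs", "Native, 4 CPUs"),
        ("Native, 4 CPUs", "Native, 1 CPU"),
        ("VirtualBox, 1 CPU, no VT-x, 2D accel", "VirtualBox, 1 CPU, VT-x, 2D accel"),
        ("VirtualBox, 1 CPU, VT-x, no 2D accel", "VirtualBox, 1 CPU, VT-x, 2D accel"),
    ]
    for slow, fast in comparisons:
        print(f"{slow} vs. {fast}: "
              f"{ratio(slow, fast):.2f}x slower, "
              f"beyond measurement error: {distinguishable(slow, fast)}")
```

Under that assumption, the difference between the two 2D Video Accel settings falls inside the measurement error, while the 1-processor vs. 4-processor gap does not.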
Can anyone shed some light on why the uniprocessor Windows kernel is so much faster than the SMP one?