The reason registered memory is specified for 8GB DIMMs is to reduce the DIMM's electrical loading on the memory channels, which improves signal integrity.
Registered DIMMs typically have more ranks, i.e., more memory ICs on the DIMM than a single-rank DIMM. Some large registered DIMMs have 36 ICs, arranged as 4 ranks. All of these ICs put a higher load on the channel. More loads mean more capacitance, which can either cause errors or force the system to run these DIMMs at a slower speed.
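To make the IC-to-rank arithmetic concrete, here is a rough sketch. It assumes ECC DIMMs, where each rank is 72 bits wide (64 data bits plus 8 ECC bits), and the function name `rank_count` is just something I made up for illustration. With x8 ICs, 36 ICs work out to 4 ranks; with x4 ICs, the same 36 ICs would be only 2 ranks.

```python
# Assumption: ECC DIMM, so one rank is 64 data bits + 8 ECC bits wide.
ECC_RANK_WIDTH_BITS = 72


def rank_count(num_ics: int, device_width_bits: int) -> int:
    """Return how many ranks a DIMM has, given its IC count and IC data width.

    num_ics: number of DRAM ICs on the DIMM (e.g. 9, 18, 36)
    device_width_bits: data width of each IC (4 for x4 parts, 8 for x8 parts)
    """
    total_bits = num_ics * device_width_bits
    if total_bits % ECC_RANK_WIDTH_BITS != 0:
        raise ValueError("IC count and width do not fill whole 72-bit ranks")
    return total_bits // ECC_RANK_WIDTH_BITS


print(rank_count(9, 8))    # 1 rank: nine x8 ICs
print(rank_count(36, 8))   # 4 ranks: thirty-six x8 ICs
print(rank_count(36, 4))   # 2 ranks: thirty-six x4 ICs
```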
To overcome this, these large DIMMs are "buffered", i.e., a register on the DIMM buffers the command and address signals so the memory controller does not have to drive every IC directly. This reduces the memory channel loading. The register might add another cycle or two (I don't remember exactly how it works), so registered memory can be slightly slower than unbuffered memory, but that is the tradeoff: more capacity for a slight decrease in speed.
Unbuffered vs. registered memory has nothing to do with a single or dual CPU configuration.
BambiBoomZ, you said you added 4x 8GB memory, for a total of 40GB. What was the original 8GB in the system? Was it a single 8GB DIMM, 2x 4GB, or 4x 2GB DIMMs? For maximum performance, each channel (pair of black and white connectors) must have the same size and type of DIMMs installed. If you now have mismatched channels (that is, some channels without the second DIMM installed), you could find that memory performance is slower with 40GB (according to a benchmark) than with only the 4x 8GB installed in the black connectors. FYI.
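If it helps to picture what "mismatched channels" means, here is a small sketch. The four-channel layout, the channel names, and the 2x 4GB guess for the original memory are all assumptions for illustration only; check your own slot population. Each channel is listed with the sizes (in GB) of the DIMMs in its black and white connectors, and the layout counts as balanced only when every channel holds the same set of DIMMs.

```python
from typing import Dict, List

# Hypothetical layout type: channel name -> DIMM sizes in GB on that channel.
Layout = Dict[str, List[int]]


def total_gb(layout: Layout) -> int:
    """Total installed memory across all channels."""
    return sum(sum(dimms) for dimms in layout.values())


def is_balanced(layout: Layout) -> bool:
    """True when every channel holds the same set of DIMM sizes."""
    populations = {tuple(sorted(dimms)) for dimms in layout.values()}
    return len(populations) == 1


# Only the 4x 8GB DIMMs, one per channel (black connectors).
only_black = {"ch0": [8], "ch1": [8], "ch2": [8], "ch3": [8]}

# Same, plus an assumed original 2x 4GB in the white connectors of two channels.
mixed = {"ch0": [8, 4], "ch1": [8, 4], "ch2": [8], "ch3": [8]}

print(total_gb(only_black), is_balanced(only_black))  # 32 True
print(total_gb(mixed), is_balanced(mixed))            # 40 False
```

This only checks DIMM sizes; matching "type" (rank count, registered vs. unbuffered) would need extra fields.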
Memory IC capacities have increased since the Z620 was released. A single-rank 8GB unbuffered DIMM might work, but it was not verified by HP when the system was released. So, as an HP employee, I cannot recommend it or any other non-HP memory.