Tuesday, March 27, 2012

Database Cluster CPU choice.

Hi all,
We are considering a hardware upgrade. Can SQL Server 2005 Enterprise and
Standard Edition fully take advantage of the quad-core architecture? The
quad-core Intel Xeon X5355 is top of the range. The dual-core 5160 is another
choice worth considering. At current prices, which one would be the more
cost-effective choice?
Regards,
http://www.intel.com/performance/server/xeon/database.htm
I don't know how well SQL2005 would take advantage of quad cores; I have not
had a chance to run my tests yet. But I'm curious which systems (servers)
and how many sockets you have in mind.
Linchi

This is a pretty general processor architecture question.
Can SS2K and SS2K5 make use of Intel's Hyper-Threading and either Intel's or
AMD's multi-core technologies? Yes, they can. The OS presents any or all of
these as logical processors to SQL Server. Beyond that, both versions are
NUMA-aware, and SS2K5 adds NUMA optimization for those processor
architectures that support it. This is critical because only Intel's
Itanium and AMD's x64 processor architectures provide this capability.
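
To see the "logical processor" point for yourself, here is a minimal Python sketch (not SQL Server specific) that asks the OS how many logical processors it exposes. The helper name logical_processor_count is my own, made up for illustration. The count covers sockets, cores, and hardware threads alike, which is the same flat view the OS hands to any application.

```python
import os


def logical_processor_count() -> int:
    """Return the number of logical processors the OS reports (at least 1)."""
    # os.cpu_count() does not distinguish sockets, cores, or hardware threads.
    # It can return None when the count is unknown, so fall back to 1.
    return os.cpu_count() or 1


if __name__ == "__main__":
    print(f"Logical processors reported by the OS: {logical_processor_count()}")
```

On a two-socket quad-core box without Hyper-Threading you would expect this to report 8, but the script cannot tell you how those 8 are split between sockets and cores.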
Each processor architecture provides its own pros and cons depending on the
workload characteristics. As such, a deep understanding of your system's
workload requirements and each processor's architectural capabilities is
necessary before any recommendation could be made.
The fact is that for single-threaded workloads, a single-socket cpu with a
larger on-die cache and higher clock rate will outperform all other
technologies.
Now, when you start talking about multi-tasking, as in desktop/workstation
scenarios, or multi-threaded applications, as in server DBMS or high-end
gaming systems, multi-core/multi-socket presents real multi-tasking and
parallel computing opportunities that are just not possible with
single-processor solutions, no matter how fast.
Whether you use Hyper-Threading technology, multi-core, or a combination of
the two, you cannot match the power of SMP multi-socket. Even in the link
that you provided, Intel reports a 50% to 80% improvement of the Quad-Core
over the previous generation Dual-Core products. Wait a minute! It just
doubled the number of cores, but only obtained, at best, an 80% performance
improvement? If you were to double the number of sockets, you would double
the amount of power, and, unfortunately, you would double your cost and
energy consumption as well. This is the advantage multi-core provides over
SMP scalability, not to mention that most software vendors only require you
to license the sockets, not the cores nor the Hyper-Threads.
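
The arithmetic behind the "only 80%" remark can be sketched in a few lines of Python. The function name scaling_efficiency and the definition used here (achieved speedup divided by the ideal speedup) are my own choices for illustration; the 50% and 80% gains are the figures quoted above, and the 2x resource ratio is simply the doubling of cores.

```python
def scaling_efficiency(resource_ratio: float, speedup: float) -> float:
    """Fraction of the ideal (linear) speedup that was actually achieved.

    resource_ratio: how much the hardware grew, e.g. 2.0 for twice the cores.
    speedup: measured throughput ratio, e.g. 1.8 for an 80% improvement.
    """
    if resource_ratio <= 0:
        raise ValueError("resource_ratio must be positive")
    return speedup / resource_ratio


if __name__ == "__main__":
    # Doubling the cores (2.0x) with the quoted 50% and 80% improvements.
    for gain in (0.50, 0.80):
        efficiency = scaling_efficiency(2.0, 1.0 + gain)
        print(f"{gain:.0%} gain from 2x cores -> {efficiency:.0%} of linear scaling")
```

Under that definition, the quoted range works out to roughly 75% to 90% of linear scaling for the second pair of cores.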
However, straight SMP architecture also hits a scalability limit: as you add
additional processors to the system (sockets), you increase the amount of
overhead required to manage thread context switches and the associated
memory to cpu loading. Once you get above 8-way solutions, the SMP
controller and cpus start to spend more time moving data on and off the
processors as scheduling requests come in than they do running actual
requested work. Intel's Hyper-Threading Technology was an attempt to
alleviate this congestion. But there are drawbacks associated with its use
as well.
The advantage of Hyper-Threading is that two full thread contexts are loaded
onto the CPU at a time. It works on the assumption that the next thread to be
scheduled is more likely to be one of the most recently signaled threads than
any of the others in the queue. In that case, the memory context has already
been loaded onto the processor, which reduces the latency inherent in SMP
overhead. However, if that assumption is false, as is most often the case
in OLTP DBMS solutions, the overhead of loading 2 contexts with every
scheduling request can actually double the SMP overhead. For DSS and some
OLAP DBMS solutions, very few context switches, relatively speaking, are
requested anyway; so, again, the technology can actually increase the SMP
scheduling overhead.
Now, for multi-core solutions. As was stated above, although their per-core
processing power is lower than that of a comparable multi-socket solution, so
are the electrical power and thermal requirements, in addition to the
licensing costs. The consideration here, then, is the implementation
differences between Intel's and AMD's solutions. To help explain these, a
quick look at the Itanium's solution and NUMA architecture can prove useful.
Besides the inherent benefits of the Itanium's EPIC instruction set, which
rivals and can exceed the processing capabilities of RISC-based solutions,
engineers had to contend with the SMP scaling problem. Although 8-way and
even 32-way SMP chipset architectures had been constructed, the complexity
and expense of these systems deterred their use. To overcome this, 4-way
systems were deemed optimal and NUMA architectures were constructed by use
of "cell" construction.
The way it works is that each "cell" is a 4-way collection of SMP
processors. Cells are then interlinked through a technology similar to
AMD's Hyper-Transport memory bus solution. A NUMA (Non-Uniform Memory
Access) architecture attempts to localize memory allocations and usage to
each cell, which avoids the "bridging" mechanisms used in standard SMP
solutions and is much more efficient. Because cells are interlinked, memory
on other cells and processors can still be requested, but the latency induced
by such calls drags efficiency back down to that of the old SMP
architectures. So, if applications are written to be aware of this
architecture, these interlink calls can be minimized, and memory requests and
thread scheduling can be optimized to maintain NUMA cell proximity, thereby
scaling the overall solution beyond the 4-way limitations and actually
increasing the memory bus speeds. Unfortunately, for non-NUMA-aware
applications, this configuration provides little memory optimization benefit,
although the raw NUMA configuration can be provided more cheaply than an
equivalent straight SMP one.
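
A rough way to picture why cell locality matters is a weighted average of local and remote access times. The Python sketch below is only a toy model: the function name average_memory_latency and the 100 ns local / 200 ns remote figures are hypothetical values I picked for illustration, not measurements of any real system.

```python
def average_memory_latency(local_fraction: float,
                           local_ns: float,
                           remote_ns: float) -> float:
    """Expected latency when a given fraction of accesses stay in the local cell."""
    if not 0.0 <= local_fraction <= 1.0:
        raise ValueError("local_fraction must be between 0 and 1")
    # Weighted average of local-cell and cross-cell (interlink) accesses.
    return local_fraction * local_ns + (1.0 - local_fraction) * remote_ns


if __name__ == "__main__":
    LOCAL_NS = 100.0   # hypothetical local-cell latency
    REMOTE_NS = 200.0  # hypothetical cross-cell latency
    for fraction in (1.0, 0.9, 0.5, 0.25):
        latency = average_memory_latency(fraction, LOCAL_NS, REMOTE_NS)
        print(f"{fraction:.0%} local accesses -> {latency:.0f} ns average")
```

With these made-up numbers, a NUMA-aware application that keeps most accesses local stays close to the local latency, while an unaware one that scatters its memory across cells drifts toward the remote figure.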
Keep in mind that NUMA is a macro-architecture implementation utilizing
multi-way processor sockets and 4-way per cell interlinks. Interestingly,
HP went a step further with their MX-2 technology utilizing two full CPUs on
a single PCB per-socket, thus scaling each NUMA-node (i.e., cell) to 8-way
SMP. These presented the first "macro-scale" multi-core solutions.
When it came time for real multi-core, AMD architected their solution with
the above understanding of memory transport and NUMA efficiency. From the
point of view of the OS, AMD multi-core processors behave very similarly to
NUMA solutions. On a single socket, encased in a single processor package,
two full cores were introduced, along with a localized, large,
hyper-fast memory buffer. When SMP chipsets were constructed, these
on-socket memory buffers were interlinked, again in a NUMA style fashion,
through high speed interlinks, called Hyper-Transport.
Intel, on the other hand, to catch up and get to market quicker, simply
extended their solutions to multi-core, but introduced no new memory
transport improvements. Even with these new Quad-Core processors, the memory
transport architecture remains the same as the single-core Xeon solutions,
and thus still suffers from some of the SMP scalability limitations.
What does this mean from the point of view of the DBMS? It means that you
can use SQL Server's NUMA-awareness against Intel's Itanium and AMD's
multi-core processors, but not on Intel's dual or quad cores. Moreover, SQL
Server for Itanium has been optimized for the EPIC instruction set in
addition to large-scale NUMA. AMD, however, is more of a logical NUMA and
still uses the x86-derived x64 instruction set, yet it still blows the doors
off of the Intel Xeon solutions.
However, Intel has also recently released their hyper-threaded, multi-core
Itanium processor (series 9000): the Montecito, which actually doubles the
processing capability over the older Itanium 2 Madison chips. Moreover, the
quad-core Itanium Tukwila processor is expected to launch sometime in 2007.
From these, the full-scale, macro-style, NUMA cell implementations can be
constructed.
Moreover, in early 2008, AMD will be launching a new quad-core processor
built on a newly updated Hyper-Transport 3.0 interconnect protocol.
Hope this helps. I would keep my eye on the TPC-C sites. The Itanium-2
solutions are all in the top performer categories, boasting millions of
transactions per minute (Intel's Xeon quad-core only reached 200 thousand
TPM). But for less expensive, smaller-scale requirements, the AMD
multi-cores also reach into the top mark lists. AMD's dual-core reached the
same relative limits it took the Xeon quad-core to reach. Recent
expectations are that the AMD Barcelona quad-core 4-way will exceed 500,000
TPM, almost double the Intel marks.
Sincerely,
Anthony Thomas
