
Showing posts with label ARMv8 architecture. Show all posts

Wednesday, 20 July 2016

Unknown

Gigabyte and Cavium announce 14 ARM-based server products

Gigabyte and Cavium have announced a new lineup of products that provide “a compelling, high performance alternative” to the incumbent server technologies available on the market. In total 14 new server SKUs, leveraging Gigabyte’s server expertise and Cavium’s ThunderX ARM-based platform, have been announced.

Cavium’s ThunderX ARM-based platform
The Cavium ThunderX 64-bit ARMv8 SoCs are described as workload-optimised chips for data centre and cloud processing. They are available with up to 48 cores and at clock speeds of up to 2.5GHz. Gigabyte’s new servers are available in dual-socket configurations and allow for clock speeds of up to 2.0GHz. Other key specs of Gigabyte’s market-ready server products are:
  • The highest integrated I/O capability with up to 160Gb of I/O bandwidth
  • Four DDR4 72-bit memory controllers capable of supporting up to 1TB of memory in a dual socket configuration at 2133MHz
  • Best in class performance per watt and performance per dollar for storage and compute applications
  • A comprehensive range of designs, from cost-focused entry level solutions to high density storage and compute focused platforms
Adopters of the ‘disruptive’ new Cavium ThunderX based servers from Gigabyte will achieve similar performance but a “more compelling TCO” than using traditional x86 server systems. As well as the expected fanfare for these server products from hardware partners Gigabyte, Cavium, and ARM, industry big hitters such as Red Hat, Innodisk, SUSE and QLogic were among those to welcome the introduction of these servers.


Customers in the US, Europe and Asia are already starting to receive Cavium ThunderX based Gigabyte server products, and they are available to order now. Cloud service providers have already demonstrated strong demand for the new SKUs, says Gigabyte.




Monday, 13 June 2016

Unknown

Cavium's 64-Bit ARM ThunderX2 Packs up to 54 Cores


Cavium unveiled its 64-bit ARM-based ThunderX2 processor for servers in cloud data centers used for workloads such as compute, security, storage, data analytics, network function virtualization (NFV) and distributed databases.

The second generation ARM processor from Cavium, which offers a number of on-board accelerators and advanced capabilities, packs up to 54 cores, enabling it to deliver two to three times the performance of ThunderX across a wide range of standard benchmarks and applications. It is built on a 14nm FinFET process and is compliant with the ARMv8.2 architecture as well as ARM's Server Base System Architecture (SBSA) standard.


Cavium ARM's Server Base System


Key ThunderX2 features will include:

  • 2nd generation of full custom Cavium ARM core: 2.4 to 2.8GHz in normal mode, up to 3GHz in Turbo mode; > 2X single thread performance compared to ThunderX.
  • Up to 54 cores per socket delivering 2-3X socket level performance compared to ThunderX.
  • Cache: 40K I-Cache and 64K D-cache, highly associative; 32MB shared Last Level Cache (LLC).
  • Single and dual socket configuration support using 2nd generation of Cavium Coherent Interconnect with > 2.5X coherent bandwidth compared to ThunderX.
  • System Memory: 6 DDR4 memory controllers per socket; Dual DIMM per memory controller, for a total of 12 DIMMs per socket.
  • Full system virtualization for low latency from virtual machine to IO enabled through Cavium virtSOC technology.
  • Integrated 10/25/40/50/100GbE network connectivity.
  • Multiple integrated SATAv3 interfaces.
  • Integrated PCIe Gen3 interfaces, x1, x4, x8 and x16 support.
  • Integrated Hardware Accelerators: OCTEON style packet parsing, shaping, lookup, QoS and forwarding; Virtual Switch (vSwitch) offload; Virtualization, storage and NITROX V security.
Four versions of the ThunderX2 will be offered:
  • ThunderX2_CP: Optimized for cloud compute workloads such as private and public clouds, web serving, web caching and web search, and for commercial HPC workloads such as computational fluid dynamics (CFD) and reservoir modeling. This family supports multiple 10/25/40/50/100GbE network interfaces and PCIe Gen3 interfaces. It also includes accelerators for virtualization and vSwitch offload.
  • ThunderX2_ST: Optimized for big data, cloud storage, massively parallel processing (MPP) databases and data warehousing workloads. This family supports multiple 10/25/40/50/100GbE network interfaces, PCIe Gen3 interfaces and SATAv3 interfaces. It also includes hardware accelerators for data protection, integrity and security, and for efficient user-to-user data movement.
  • ThunderX2_SC: Optimized for secure web front-end, security appliances and cloud RAN type workloads. This family supports multiple 10/25/40/50/100GbE interfaces and PCIe Gen3 interfaces. Integrated hardware accelerators include Cavium’s industry leading, 5th generation NITROX security technology with acceleration for IPSec, RSA and SSL.
  • ThunderX2_NT: Optimized for media servers, scale-out embedded applications and NFV type workloads. This family supports multiple 10/25/40/50/100GbE interfaces. It also includes OCTEON style hardware accelerators for packet parsing, shaping, lookup, QoS and forwarding.
"ThunderX2 combines our next generation core that will deliver significantly higher single thread performance with next generation IO and hardware accelerators to provide a compelling value proposition for the server market and greatly expand the serviceable server TAM," said Syed Ali, President and CEO of Cavium. "ThunderX2 will enable flexible, scalable and fully optimizable servers for next generation software defined data centers."

Unknown

Exploring A Production Cavium ThunderX Platform With A Gigabyte R120-T30 1U Server



Cavium sent us a Gigabyte R120-T30 just about three weeks after the production-scale 48-core ThunderX ARM processor began shipping. We are going to have a much more comprehensive view of performance over the next few weeks, and we are seeing much better performance out of this system than we initially expected. In the meantime, we wanted to provide an overview of the platform we received. The impact of the platform cannot be overstated. This is production 48-core 64-bit ARMv8 silicon, and it is the first direct competitor to the Intel Xeon E5-1600 and Xeon E5-2600 lines we have seen since AMD effectively exited the market. After spending a few weeks with the platform, we feel ready to publish an overview. The Cavium ThunderX is a very different approach to a server processor. While we will explore the performance of the 48-core Cavium ThunderX in a series of upcoming pieces, the Gigabyte R120-T30 1U server provides insights into some of the biggest selling points of the ThunderX platform.

Test Configuration 
In the test configuration supplied by Cavium, we had the following configuration:
  • Server: Gigabyte R120-T30 (4x 3.5″ hot swap bay 1U)
  • Motherboard: Gigabyte MT30-GS0
  • CPU: Cavium ThunderX 48-core ARMv8
  • RAM: 128GB (4x 32GB DDR4 2133MHz RDIMMs) (1TB max capacity using 8x 128GB DIMMs)
  • SSDs: 2x SanDisk CloudSpeed Ultra 800GB SATA
  • PSU: 400W 80 Plus Gold
What this configuration hides is the fact that this system is a lot more advanced than a similar Intel platform from an I/O perspective. The Cavium ThunderX provides significantly more I/O as standard than Intel’s Xeon E5 architecture.

Gigabyte R120-T30 1U server with Cavium ThunderX Overview

From the outside, the Gigabyte R120-T30 1U looks much like a standard 1U 4-bay 3.5″ server. What is inside the machine is different than what almost anyone who has purchased a server over the past 7 years has experienced.
1U server with Cavium

Opening the top cover, we see what appears to be a standard form factor motherboard with a large CPU heatsink. The Gigabyte R120-T30 has 5x hot swap fans and a PCIe riser slot as well.

1U Gigabyte server with Cavium

On either side of the CPU we have 4 DIMM slots. Each can take DDR4 RDIMMs to fill the ThunderX’s quad channel memory design with up to 2 DIMMs per channel (DPC.) We had 4x 32GB DDR4 2133MHz RDIMMs installed but we did also add a set of 8x 32GB (256GB) DDR4 RDIMMs just to validate all eight DIMM slots could be populated. The ThunderX supports up to 128GB DDR4 DIMMs so with 8x 128GB we get a total of 1TB of memory in a single CPU platform. Here is what the Gigabyte MT30-GS0 motherboard looks like without the fan shroud and PCIe riser:

SoC for Data Center Servers

While the newest Intel Xeon E5-1600 and E5-2600 V4 systems can utilize 3DPC for 1.5TB of RAM per CPU, 1TB is still an impressive total. For comparison, Intel’s Xeon D SoC tops out at 128GB (dual channel memory and 2 DPC). Moving further down the Intel stack, the Xeon E3 V5 lineup tops out at a paltry 64GB. We have verified via the STREAM benchmark that the ThunderX has significantly more memory bandwidth than the Intel Xeon D platform, even when the Xeon D uses faster DDR4-2400 RAM, but not as much as the Intel Xeon E5-2600 V4 platform. More on this in a future piece.
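The capacity figures above all come from the same product: memory channels × DIMMs per channel × largest supported DIMM. Here is a minimal sketch of that arithmetic. The class and function names are ours, and the per-DIMM sizes for the Xeon D and Xeon E3 V5 rows are assumptions chosen to match the totals quoted above. The peak-bandwidth helper assumes a 64-bit data path per channel (the extra 8 bits of a 72-bit controller carry ECC) and gives a theoretical ceiling, not a STREAM result.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryConfig:
    name: str
    channels: int
    dimms_per_channel: int
    max_dimm_gb: int  # largest supported DIMM, in GB

    def max_capacity_gb(self) -> int:
        # channels x DIMMs per channel x DIMM size
        return self.channels * self.dimms_per_channel * self.max_dimm_gb


def peak_bandwidth_gbs(channels: int, mega_transfers: int, bus_bytes: int = 8) -> float:
    """Theoretical peak in GB/s: channels x MT/s x bytes per transfer."""
    return channels * mega_transfers * bus_bytes / 1000


PLATFORMS = [
    MemoryConfig("Cavium ThunderX", 4, 2, 128),
    MemoryConfig("Xeon E5-2600 V4", 4, 3, 128),
    MemoryConfig("Xeon D", 2, 2, 32),      # assumed 32GB DIMMs
    MemoryConfig("Xeon E3 V5", 2, 2, 16),  # assumed 16GB DIMMs
]

if __name__ == "__main__":
    for p in PLATFORMS:
        print(f"{p.name}: {p.max_capacity_gb()} GB")
    print(f"ThunderX DDR4-2133 peak: {peak_bandwidth_gbs(4, 2133):.1f} GB/s")
```

With four channels of DDR4-2133 the theoretical ceiling works out to roughly 68GB/s; measured STREAM numbers will land below that.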

As we removed the CPU heatsink, it seemed very familiar. We noticed that the heatsink mounting points in the Gigabyte platform were just about an LGA2011 narrow ILM width. We pulled out an LGA2011 narrow ILM heatsink and, sure enough, the mounting points lined up. We did not try installing a different heatsink as we were unsure of the pressure specs of the CPU package. Gigabyte did a nice job re-purposing existing designs for this new platform.
Cavium ThunderX SoC
 Underneath the heatsink we see a soldered Cavium ThunderX SoC:


The overall package is not socketed, but it is large compared to a LGA2011 chip. Here is an example as a size comparison:
 Cavium ThunderX UP – arm server SoC
In terms of storage, the Cavium ThunderX has access to 16x SATA ports per CPU. To put that in perspective, an Intel Xeon E5-1600 V3 system, a dual socket Intel Xeon E5-2600 V4 system, and even the quad socket Xeon E5-4600 series all rely on the same PCH for SATA connectivity (Intel onboard SAS was abandoned after Ivy Bridge). With Cavium, each processor can give you up to 16 SATA III ports (note this can be lower depending on SKU). Our 1U platform only has 4 of the 16x SATA III ports wired, but it does mean that a 3U 16-bay chassis will not require an add-on HBA.

1U arm server platform

We did verify that the platform was correctly telling Ubuntu 14.04 LTS that we have a SATA III 6.0Gbps connection:

ARM servers for cloud computing
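For readers who want to run a similar check on their own Linux system, here is a minimal sketch. It is not how we verified the link above. It assumes the kernel exposes the libata transport class under /sys/class/ata_link, with one sata_spd file per link; the function name and the sysfs_root parameter are ours.

```python
from pathlib import Path
from typing import Dict


def read_sata_link_speeds(sysfs_root: str = "/sys/class/ata_link") -> Dict[str, str]:
    """Map each ATA link name to the negotiated speed string the kernel reports."""
    speeds: Dict[str, str] = {}
    root = Path(sysfs_root)
    if not root.is_dir():
        # No libata transport class on this system (or a non-Linux host).
        return speeds
    for link in sorted(root.iterdir()):
        spd_file = link / "sata_spd"
        if spd_file.is_file():
            speeds[link.name] = spd_file.read_text().strip()
    return speeds


if __name__ == "__main__":
    for name, speed in read_sata_link_speeds().items():
        print(f"{name}: {speed}")
```

A link that negotiated SATA III should report a 6.0 Gbps string; ports with nothing attached typically report an unknown speed.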

The SATA ports are arranged in two 8-port sets with 7-pin SATA III connectors. It seems like the dual socket Gigabyte/ Cavium systems are using higher-density connectors.


The motherboard also has two PCIe 3.0 x8 slots. The ThunderX supports only up to an x8 PCIe slot, but since most add-on cards come in PCIe x1, x4 or x8 form factors, that should cover most needs.

The networking side is completely unlike what we see from most Intel platforms. Our test unit has a total of 80Gbps worth of networking from the SoC. There is a QSFP+ 40Gb Ethernet port and four SFP+ 10Gb Ethernet ports. Just to give one an idea, this is equivalent to having an Intel Fortville X710-am2 onboard. In card format, on the Intel side that would occupy a PCIe 3.0 x8 slot with an Intel XL710-QDA2 installed.
Cavium ThunderX UP for HPC appliances
For those looking at high network bandwidth, this is a truly awesome setup onboard.
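To put the 80Gbps figure in context, here is a minimal sketch that adds up the onboard ports and compares the total with the raw capacity of a PCIe 3.0 x8 link. The function names are ours. The PCIe figure uses the standard 8 GT/s per lane and 128b/130b encoding, and ignores packet and protocol overhead, so real-world throughput is lower still.

```python
from typing import Dict


def total_port_bandwidth_gbps(ports: Dict[int, int]) -> int:
    """Sum line rates; keys are port speeds in Gbps, values are port counts."""
    return sum(speed * count for speed, count in ports.items())


def pcie3_usable_gbps(lanes: int) -> float:
    """Raw PCIe 3.0 link capacity: 8 GT/s per lane with 128b/130b encoding."""
    return lanes * 8.0 * 128 / 130


if __name__ == "__main__":
    onboard = {40: 1, 10: 4}  # one QSFP+ 40GbE port, four SFP+ 10GbE ports
    print(f"Onboard networking: {total_port_bandwidth_gbps(onboard)} Gbps")
    print(f"PCIe 3.0 x8 link:   {pcie3_usable_gbps(8):.1f} Gbps")
```

A PCIe 3.0 x8 link tops out at about 63Gbps, so a single x8 add-in card could not carry all 80Gbps of these ports at line rate.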
Rounding out the list of features, one can see a standard VGA port, serial port, IPMI out-of-band management LAN port and four USB 3.0 connectors (two front, two rear). The second ARM SoC onboard this platform is the AST2400, which is probably the #1 BMC in the industry. It is great to see Gigabyte/Cavium use the industry standard BMC.

Final Words
If you cannot tell where this is headed, we have some interesting comparisons coming in future pieces. The Intel Xeon D platform we have today still provides lower power and better single threaded performance, but in terms of memory bandwidth, memory capacity, SATA storage expansion, and networking, it is thoroughly out-classed by the Cavium ThunderX. From what we have heard in terms of pricing both list and street, we expect the Cavium ThunderX platforms to be very competitive with the higher-end Xeon D platforms in terms of price.

Shifting to the Intel Xeon E5-1600 V3 and Intel Xeon E5-2600 V4 side, for NVMe storage, those platforms have a clear PCIe lane advantage. Furthermore, we expect areas like Windows / VMware virtualization hosts, HPC servers, and systems that need high single threaded performance or higher memory capacities to stay as Intel strongholds for the near future.

On the other hand, as the software side matures, especially with releases like the recent Ubuntu 16.04 LTS, we are seeing performance in web stack applications get significantly better. As an example, our OpenSSL / RSA testing was showing an ~80% performance increase moving from stock OpenSSL 1.0.1g to 1.0.2g on the ThunderX 48-core part. With OpenSSL 1.1, it looks like we will again see a solid performance gain. These types of performance increases are unlike what we see on the Intel side and show how rapidly the ARM side is maturing.
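For anyone wanting to make the same kind of before/after comparison, here is a minimal sketch that pulls the sign/s figure out of `openssl speed rsa2048` output and computes the relative gain. It is not our test harness. It assumes the two-operation result line printed by OpenSSL 1.0.x and 1.1.x, and the sample lines in the usage section are made-up placeholders, not measurements from the ThunderX.

```python
import re
from typing import Optional

# Matches result lines such as:
#   rsa 2048 bits 0.002000s 0.000060s    500.0  16666.7
# Columns after "bits": sign time, verify time, sign/s, verify/s.
_RSA_LINE = re.compile(
    r"^rsa\s+(\d+)\s+bits\s+[\d.]+s\s+[\d.]+s\s+([\d.]+)\s+([\d.]+)\s*$"
)


def parse_rsa_signs_per_sec(output: str, bits: int = 2048) -> Optional[float]:
    """Return the sign/s figure for the given key size, or None if absent."""
    for line in output.splitlines():
        match = _RSA_LINE.match(line.strip())
        if match and int(match.group(1)) == bits:
            return float(match.group(2))
    return None


def percent_gain(before: float, after: float) -> float:
    """Relative improvement of `after` over `before`, in percent."""
    return (after - before) / before * 100.0


if __name__ == "__main__":
    # Placeholder output, for illustration only.
    old_run = "rsa 2048 bits 0.002000s 0.000060s    500.0  16666.7"
    new_run = "rsa 2048 bits 0.001111s 0.000033s    900.0  30303.0"
    old = parse_rsa_signs_per_sec(old_run)
    new = parse_rsa_signs_per_sec(new_run)
    if old is not None and new is not None:
        print(f"sign/s gain: {percent_gain(old, new):.0f}%")
```

Comparing sign/s rather than verify/s is a choice on our part: RSA signing is the expensive private-key operation a TLS server performs on each full handshake.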

We will have more on management features in a coming piece in this series, but we can share that the experience is an absolute pleasure compared to some of the ARM development boards. The AST2400 integration is something we were pleased to see.

If you can use the onboard I/O or accelerators in the Cavium ThunderX workload optimized SKUs, Cavium is going to make a strong case. From the pricing we have seen thus far, if one can utilize a unique combination of 48x ARM cores, high-speed networking, SATA ports and accelerators, the Cavium platform is very compelling.
We are going to have a more in-depth look at day to day management and (small scale) operations with the ThunderX platform in the coming weeks. We have also been running baseline performance figures using the ThunderX as a web appliance (e.g. as an nginx, redis, and SSL offload server). Stay tuned to STH for some very interesting tips, tricks and performance results.




