Intel unveiled their Silvermont microarchitecture on May 6th, giving a broad overview of the innovations that will soon come to Intel Atom processors. Silvermont is the first major rework of the Atom microarchitecture. After its initial launch in 2008 with the Bonnell architecture at the 45nm node, Intel only delivered a 32nm shrink named Saltwell that integrated the north bridge while leaving the CPU cores basically unchanged. With the 22nm Silvermont core, the company introduces some major changes.

Furthermore, the company plans to accelerate the roadmap and deliver a 14nm successor called Airmont as early as next year. At this point, Intel doesn’t go into detail on whether Airmont is merely a die shrink or comes with additional enhancements. The generation after that, which will also be a 14nm part, doesn’t have a codename yet. Above all, this shows that Intel is committed to bringing their Atom products up to par in terms of manufacturing.

It is also a testament to the growing importance of small cores. Indeed, Silvermont-based products are going into vastly different markets, and each target market gets its own codename: Avoton for servers and storage, Rangeley for networking, Bay Trail for convertibles, tablets, small notebooks and the like, Merrifield for smartphones, and an as yet unnamed design for in-car infotainment systems. Intel even said that they are prepared to do customized designs, something AMD has heavily touted in recent months.

The 2-issue, Out of Order Execution Engine

Ever since the original Pentium, Atom has been the only CPU featuring an in-order execution engine, a deliberate decision to maximize power savings. With Silvermont this changes: it uses a higher-performance out-of-order execution engine, as found in other x86 CPUs. Being able to reorder instruction execution comes with a power hit but provides greatly improved performance. As a tradeoff, support for Hyper-Threading is removed from Silvermont; it would have further impacted the power budget and doesn’t provide as tangible a performance advantage as it does with an in-order engine.
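
The benefit of reordering can be illustrated with a toy scheduling model. The sketch below is not a model of Silvermont’s actual pipeline: the function `cycles`, the instruction latencies and the six-instruction program are all assumptions chosen for illustration, and the out-of-order case assumes an unlimited reorder window with no resource conflicts. Both variants issue at most two instructions per cycle; the only difference is whether a stalled instruction blocks the younger ones behind it.

```python
def cycles(program, width=2, in_order=True):
    """Return the cycle in which the last result becomes available.

    program: list of (latency, deps) tuples, where deps holds the indices
    of earlier instructions whose results this instruction needs.
    """
    done_at = {}                          # index -> cycle its result is ready
    pending = list(range(len(program)))   # not yet issued, in program order
    cycle = 0
    while pending:
        issued = []
        for i in pending:
            if len(issued) == width:
                break                     # issue width exhausted this cycle
            _, deps = program[i]
            if all(d in done_at and done_at[d] <= cycle for d in deps):
                issued.append(i)
            elif in_order:
                break                     # in-order: younger instructions wait
        for i in issued:
            done_at[i] = cycle + program[i][0]
            pending.remove(i)
        cycle += 1
    return max(done_at.values(), default=0)


# Hypothetical program: two 3-cycle loads, each feeding a 1-cycle add,
# followed by two independent 1-cycle adds.
PROGRAM = [
    (3, []),   # 0: load a
    (1, [0]),  # 1: add using a
    (3, []),   # 2: load b
    (1, [2]),  # 3: add using b
    (1, []),   # 4: independent add
    (1, []),   # 5: independent add
]

print("in-order:    ", cycles(PROGRAM, in_order=True))
print("out-of-order:", cycles(PROGRAM, in_order=False))
```

In this example the in-order variant needs 8 cycles and the out-of-order variant 4, because independent work fills the cycles that would otherwise be spent waiting for the loads.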

The issue width (the number of instructions that can be sent to execution units simultaneously) wasn’t changed and remains at two. This is similar to AMD’s Bobcat and Jaguar architectures, which target a similar segment. Intel’s big cores like Ivy Bridge are 4-issue cores. However, the bigger cores are internally really RISC machines (reduced instruction set computer, meaning they only execute simple instructions) with a CISC front end (complex instruction set computer, referring to the x86 architecture, which has many complex instructions that do several things at once). With Atom, ever since the original Bonnell generation, Intel has aimed to execute many instructions as-is without splitting them up into micro-operations. Intel calls this macro-execution.

Many other details of the branch prediction, execution units, internal caches and so on were improved with the aim of increasing performance and reducing power consumption, but Intel didn’t go into much technical detail at this unveiling.

Core Organization and Features

The cores are organized in modules, with two cores sharing a common L2 cache that is up to 1MB in size. The company made an interesting remark about how one of their competitors shares the L2 cache among ‘many’ cores. That was without doubt directed at AMD’s Jaguar architecture, which uses a 2MB L2 cache shared across four cores. So on a per-core level both chips have the same amount of cache, but Intel believes their solution will provide much more bandwidth. There are differences that will affect performance one way or the other, but in general one can’t call either implementation superior based solely on the number of cores that share the L2 cache.

A Silvermont-based CPU can have up to four of those modules, allowing configurations with up to eight cores, which will be especially interesting for the microserver segment. Inside the chip, the modules are connected to an SoC fabric via a point-to-point interface. In Silvermont, frequency and power management states can be set at the core level. Unlike in previous Atom SoCs, power can be dynamically allocated between the CPU cores and the GPU. If the constraints of the platform allow it, cores can even enter a turbo mode.
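
The cache and core counts quoted above reduce to simple arithmetic. The following sketch, with an illustrative helper name `l2_per_core_kib` and an assumed even split of a shared cache between its cores, works out the nominal per-core L2 capacity for both designs and the largest Silvermont configuration.

```python
def l2_per_core_kib(l2_kib, cores_sharing):
    """Nominal L2 capacity per core in KiB, assuming an even split."""
    return l2_kib // cores_sharing


# Figures from the text: up to 1MB per two-core Silvermont module,
# 2MB shared by four Jaguar cores.
silvermont_per_core = l2_per_core_kib(1024, 2)
jaguar_per_core = l2_per_core_kib(2048, 4)

# Largest Silvermont configuration: four modules of two cores each.
max_modules = 4
max_cores = max_modules * 2
max_total_l2_kib = max_modules * 1024

print(f"Silvermont: {silvermont_per_core} KiB per core")
print(f"Jaguar:     {jaguar_per_core} KiB per core")
print(f"Max config: {max_cores} cores, {max_total_l2_kib // 1024} MiB L2 in total")
```

A shared cache is not actually partitioned this way; a core can use more than its nominal share while its neighbors are idle, which is one reason the per-core figure alone says little about performance.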

The instruction set extensions have been upgraded to Westmere level, which means Silvermont supports SSE up to and including SSE4.1 and SSE4.2, AES-NI, the POPCNT instruction and, of course, the Intel 64 instructions. Hardware virtualization now also supports Extended Page Tables. Given Intel’s track record of fusing off certain features depending on the target market, it could be that not all of these features will be available on all SKUs.
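
Since feature availability may differ between SKUs, it can be useful to check what a given chip actually reports. The following is a minimal sketch, assuming a Linux system where `/proc/cpuinfo` contains a `flags` line; the helper names `parse_flags` and `missing_features` are illustrative, and the flag spellings are the ones the Linux kernel is assumed to use for SSE4.1, SSE4.2, AES-NI, POPCNT and 64-bit long mode.

```python
from pathlib import Path

# Assumed Linux /proc/cpuinfo flag names for the extensions named above:
# SSE4.1, SSE4.2, AES-NI, POPCNT and Intel 64 (long mode).
WESTMERE_LEVEL = {"sse4_1", "sse4_2", "aes", "popcnt", "lm"}


def parse_flags(cpuinfo_text):
    """Return the set of feature flags from the first 'flags' line."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def missing_features(cpuinfo_text, required=WESTMERE_LEVEL):
    """Return the required flags the CPU does not report, sorted."""
    return sorted(set(required) - parse_flags(cpuinfo_text))


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")
    if path.exists():
        missing = missing_features(path.read_text())
        print("missing:", ", ".join(missing) if missing else "none")
    else:
        print("/proc/cpuinfo is not available on this system")
```

An empty result means the operating system reports all of the listed extensions; it says nothing about features such as Extended Page Tables, which may be reported elsewhere depending on the kernel version.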

As always, Intel touts the advantage of being an IDM (integrated device manufacturer), where the whole technology stack comes from one company and every aspect can be optimized to improve the final product. This time they highlight how they optimized their 22nm process for SoCs.

Performance Figures and Projections

The company claims a 2.8x improvement in performance, or a 4.4x reduction in power consumption at the same performance, compared to current Saltwell-based Atom products. Most interestingly, the single-core comparisons are more favorable, with a 2x increase both in performance at the same power level and in peak performance compared to Saltwell cores. Since the scaling is based on the geometric mean of a number of commonly used benchmarks, one can’t make a detailed analysis of the multi-core scaling or of the performance hit from the removal of Hyper-Threading. Compared to the predecessor, Silvermont performance looks solid.
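
To see why a geometric mean limits the analysis, consider that very different sets of per-benchmark results can produce the same aggregate. The sketch below uses made-up speedup figures (not Intel’s data) and a hypothetical helper `geometric_mean` to show two such sets.

```python
import math


def geometric_mean(values):
    """Geometric mean of positive numbers, computed via logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical per-benchmark speedups; these are NOT Intel's figures.
uniform = [2.8, 2.8, 2.8]  # every benchmark improves equally
uneven = [1.4, 2.8, 5.6]   # one barely scales, one scales very well

print(round(geometric_mean(uniform), 2))
print(round(geometric_mean(uneven), 2))
```

Both sets average out to 2.8x, so the headline number alone cannot reveal which workloads gained the most or where the loss of Hyper-Threading hurts.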

Intel included a number of additional colorful slides comparing Silvermont with unnamed ARM-based competitors. For those interested, Intel put up the whole presentation on their investor relations website. Basically, Intel claims a 40 to 110% performance advantage at 1W (their smartphone target) and a 60 to 130% advantage at 1.5W (their ARM tablet comparison). These advantages in turn amount to enormous power savings compared to the competition (1.6x to 5.8x). The smartphone comparison is based on a dual-core Silvermont vs quad-core ARM SoCs, while the tablet comparison is based on a quad-core Silvermont. While those are surely nice graphs, they are based solely on the SPECint_rate_base2000 benchmark, which even Intel acknowledges is outdated. So take these with a grain of salt.

Aside from their occasional remarks directed at AMD and ARM (without specifically naming either competitor), Intel made clear at the very end of the webcast that putting a chip into the mobile design space has nothing to do with the architecture and that Intel is more than capable of competing in this space. Without doubt, Intel is committed to succeeding here. Silvermont appears to be a very promising design that brings some much-needed innovations to Atom. However, we would caution against taking the bold claims at face value and recommend waiting for independent evaluations of the hardware before jumping to conclusions.