The architecture and feature set of the Cortex-A7 processor are identical to those of the Cortex-A15 processor; the Cortex-A7 differs only in its microarchitecture, which is tuned for energy efficiency. This lets the two processors operate in tandem in a big.LITTLE configuration that combines high performance with very low power consumption.
As a standalone processor, the Cortex-A7 will enable entry-level smartphones at a price point below $100 in the 2013-2014 timeframe that are equivalent to a $500 high-end smartphone from 2010. These entry-level smartphones will redefine connectivity and internet usage in the developing world.
The Cortex-A7 processor is a highly energy-efficient applications processor designed to enable low-cost, fully featured entry-level smartphones in addition to other low-power applications.
The processor is fully compatible with other Cortex-A series processors and incorporates all of the features of the high-performance Cortex-A15 processor, including virtualization, Large Physical Address Extensions (LPAE), NEON advanced SIMD, and AMBA 4 ACE coherency.
A single Cortex-A7 processor delivers 5x the energy efficiency and 50% greater performance than the ARM Cortex-A8 processor, which powers many of today's most popular smartphones, at one fifth of the size.
- Best power-efficiency and footprint as a standalone applications processor
- More performance than a 2011 mainstream smartphone CPU
- Up to 20% more performance while consuming 60% less power
- Companion CPU to Cortex-A15 to enable big.LITTLE Processing
- Software can run on an energy-efficient Cortex-A7 processor and on a high-performance Cortex-A15 processor as needed, without recompilation
- AMBA 4 ACE coherency interface enables sub-20 µs context migration between big and LITTLE CPU clusters
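The big.LITTLE idea in the list above can be illustrated with a toy model: work runs on the LITTLE (Cortex-A7) cluster until load rises, moves to the big (Cortex-A15) cluster, and moves back once load falls. The sketch below is only an illustration under stated assumptions. The two load thresholds and the names `ClusterState`, `step` and `run` are hypothetical and do not describe ARM's switching software; the 20 µs figure is used purely as an upper bound on the cost of one migration.

```python
from dataclasses import dataclass

# Hypothetical thresholds, chosen only for illustration.
UP_THRESHOLD = 0.80      # load above which work moves to the big cluster
DOWN_THRESHOLD = 0.30    # load below which work returns to the LITTLE cluster
MIGRATION_COST_US = 20   # upper bound per migration ("sub-20 µs" in the text)


@dataclass
class ClusterState:
    cluster: str = "LITTLE"  # "LITTLE" (Cortex-A7) or "big" (Cortex-A15)
    migrations: int = 0      # number of cluster switches so far


def step(state: ClusterState, load: float) -> ClusterState:
    """Choose the cluster for the next interval, given load in [0.0, 1.0]."""
    target = state.cluster
    if state.cluster == "LITTLE" and load > UP_THRESHOLD:
        target = "big"
    elif state.cluster == "big" and load < DOWN_THRESHOLD:
        target = "LITTLE"
    migrated = target != state.cluster
    return ClusterState(target, state.migrations + int(migrated))


def run(loads):
    """Return the cluster chosen per interval and a bound on migration time (µs)."""
    state = ClusterState()
    trace = []
    for load in loads:
        state = step(state, load)
        trace.append(state.cluster)
    return trace, state.migrations * MIGRATION_COST_US


if __name__ == "__main__":
    trace, cost_us = run([0.1, 0.9, 0.5, 0.2, 0.1])
    print(trace, cost_us)
```

The gap between the two thresholds is a simple hysteresis band: a load of 0.5 keeps the work wherever it already is, so the model does not bounce between clusters on small load changes.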
The Cortex-A7 processor has been licensed by a large number of the industry's leading silicon manufacturers including:
Power, Performance, and Area (PPA)
- >1GHz in 28nm
- Single core: 0.45 mm²
  - With FPU, NEON™, and 32 KB L1 caches
- Power similar to highly-efficient Cortex-A5
Efficient web browsing performance enabled by:
- In-order, 8-stage pipeline with branch prediction
- Improvements in the memory management unit and bus interface
- Integrated low-latency L2 cache
Performance improvements in the Cortex-A7 CPU Design
- Integrated L2 cache
  - Lower transaction latencies
  - Improved OS support for L2 cache maintenance due to simplified software control
  - Designed with a low-power approach
    - Consecutive tag-data lookup and fixed 8-way set associativity balance performance against lookup energy
    - External request on L2 miss is non-speculative, to reduce energy
- Branch prediction improvements
  - Branch Target Instruction Cache (BTIC) caches fetches after a direct branch, hiding the branch shadow on tight loops
- Improved memory system performance
  - 64-bit load/store path improves integer and NEON performance over Cortex-A5 (32-bit path)
  - 128-bit AMBA 4 buses improve bandwidth
  - Increased TLB size (256 entries, up from 128 entries for Cortex-A9 and Cortex-A5)
    - Increases performance for large workloads like web browsing
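Two of the figures in the list above lend themselves to a quick calculation: the number of sets in an 8-way set-associative cache, and the amount of memory a TLB can map without a miss (its "reach"). The following is a minimal sketch, not a description of the actual hardware. The 64-byte line size, the 4 KiB page size and the 512 KiB example cache size are assumptions made for illustration; only the 8-way associativity and the 256- and 128-entry TLB sizes come from the text.

```python
KIB = 1024

L2_WAYS = 8            # from the text: fixed 8-way set associativity
LINE_BYTES = 64        # assumption: cache line size
PAGE_BYTES = 4 * KIB   # assumption: small-page size
L2_BYTES = 512 * KIB   # assumption: one possible L2 size


def cache_sets(size_bytes: int, ways: int, line_bytes: int) -> int:
    """Number of sets in a set-associative cache: size / (ways * line size)."""
    bytes_per_set = ways * line_bytes
    if size_bytes % bytes_per_set:
        raise ValueError("cache size must be a multiple of ways * line size")
    return size_bytes // bytes_per_set


def tlb_reach(entries: int, page_bytes: int) -> int:
    """Bytes of memory that can be mapped by the TLB at one time."""
    return entries * page_bytes


if __name__ == "__main__":
    print("L2 sets:", cache_sets(L2_BYTES, L2_WAYS, LINE_BYTES))
    print("256-entry TLB reach (KiB):", tlb_reach(256, PAGE_BYTES) // KIB)
    print("128-entry TLB reach (KiB):", tlb_reach(128, PAGE_BYTES) // KIB)
```

Under these assumptions, doubling the TLB from 128 to 256 entries doubles the reach from 512 KiB to 1 MiB, which is one way to see why a larger TLB helps workloads with a large memory footprint such as web browsing.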
|Memory Management||ARMv7 Memory Management Unit|
|Debug and Trace||CoreSight™ SoC-400|
|Cortex-A7 MPCore Key Features|
|Thumb-2 Technology||Delivers the peak performance of traditional ARM code while also providing up to a 30% reduction in memory required to store instructions|
|TrustZone Technology||Ensures reliable implementation of security applications ranging from digital rights management to electronic payment|
|NEON||NEON technology can accelerate multimedia and signal processing algorithms such as video encode/decode, 2D/3D graphics, gaming, audio and speech processing, image processing, telephony, and sound synthesis|
|DSP & SIMD Extensions||Increase the DSP processing capability of ARM solutions in high-performance applications, while offering the low power consumption required by portable, battery-powered devices. The DSP extensions are optimized for a broad range of software applications including servo motor control, Voice over IP (VOIP) and video & audio codecs.|
|Floating-Point||Hardware support for floating-point operations in half-, single- and double-precision floating-point arithmetic. The floating-point capabilities of the Cortex-A7 processor offer increased performance for floating-point arithmetic used in the next generation of consumer products such as Internet appliances, set-top boxes, and home gateways.|
|Hardware Virtualization||Highly efficient hardware support for data management and arbitration, whereby multiple software environments and their applications can access the system capabilities simultaneously. This enables the realization of devices that are robust, with virtual environments that are well isolated from each other.|
|Large Physical Address Extensions (LPAE)||The Large Physical Address Extensions (LPAE) enable the processor to access up to 1 TB of memory.|
|Optimized Level 1 Caches||Performance- and power-optimized L1 caches combine minimal access latency techniques to maximize performance and minimize power consumption. The instruction and data caches are each configurable in size from 8 KB to 64 KB. Cache coherence is also available as an option, for enhanced inter-processor communication or to support a rich SMP-capable OS for simplified multicore software development.|
|Integrated, Configurable Size Level 2 Cache Controller||Provides low-latency, high-bandwidth access to up to 1 MB of cached memory in high-frequency designs, or in designs that need to reduce the power consumption associated with off-chip memory access. The L2 cache is optional on Cortex-A7.|
|AMBA® 4 Cache Coherent Interconnect (CCI)||The CCI provides AMBA 4 AXI™ Coherency Extensions (ACE) compliant ports for full coherency between multiple Cortex-A7 MPCore processors, better utilizing caches and simplifying software development. This feature is essential for high bandwidth applications including gaming, servers and networking that require clusters of coherent single and multicore processors. Combined with the ARM CoreLink™ network interconnect and memory controller IP, the CCI increases system performance and power efficiency.|
|Cortex-A7 NEON Media Processing Engine (MPE)||The Cortex-A7 MPE provides an engine that offers both the performance and functionality of the Cortex-A7 Floating-Point Unit and an implementation of the NEON Advanced SIMD instruction set for further acceleration of media and signal processing functions. The MPE extends the Cortex-A7 processor's floating-point unit (FPU) to provide a quad-MAC and additional 64-bit and 128-bit register set supporting a rich set of SIMD operations over 8, 16 and 32-bit integer and 32-bit Floating-Point data quantities.|
|Cortex-A7 Floating-Point Unit (FPU)||The FPU provides high-performance single- and double-precision floating-point instructions compatible with the ARM VFPv4 architecture, which is software compatible with previous generations of ARM floating-point coprocessors.|
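The 1 TB figure in the LPAE row of the table above follows from the width of the physical address: ARMv7-A with LPAE uses 40-bit physical addresses, compared with 32 bits without it. The short sketch below works through that arithmetic; the helper names `addressable_bytes` and `format_binary` are illustrative only.

```python
LEGACY_PA_BITS = 32  # physical address width without LPAE
LPAE_PA_BITS = 40    # physical address width with LPAE


def addressable_bytes(address_bits: int) -> int:
    """Size of a byte-addressed space with the given address width."""
    return 1 << address_bits


def format_binary(size_bytes: int) -> str:
    """Render a size using binary units (integer division, so exact
    only for sizes that are whole multiples of the chosen unit)."""
    units = ["B", "KiB", "MiB", "GiB", "TiB"]
    index = 0
    while size_bytes >= 1024 and index < len(units) - 1:
        size_bytes //= 1024
        index += 1
    return f"{size_bytes} {units[index]}"


if __name__ == "__main__":
    legacy = addressable_bytes(LEGACY_PA_BITS)
    lpae = addressable_bytes(LPAE_PA_BITS)
    print("Without LPAE:", format_binary(legacy))
    print("With LPAE:   ", format_binary(lpae))
    print("Ratio:       ", lpae // legacy)
```

The eight extra address bits multiply the addressable physical memory by 2^8 = 256, from 4 GiB to 1 TiB.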
|Advanced MultiCore Features|
|The processor also utilizes the widely established ARM MPCore multicore technology, enabling performance scalability and control over power consumption to exceed the performance of today's comparable high-performance devices while remaining within tight mobile power constraints. Multicore processing provides the ability for any of the four component processors to be shut down when not in use, for instance when the device is in standby mode, to save power. When higher performance is required, every processor is utilized to meet the demand while still sharing the workload to keep power consumption as low as possible.|
|Snoop Control Unit||The SCU is responsible for managing the interconnect, arbitration, communication, cache-to-cache and system memory transfers, cache coherence and other capabilities for the processor. The Cortex-A7 MPCore processor also exposes these capabilities to other system accelerators and non-cached DMA-driven peripherals to increase performance and reduce system-wide power consumption. This system coherence also reduces the software complexity involved in maintaining software coherence within each OS driver.|
|AMBA® 4 AXI Coherency Extensions (ACE)-Lite||This mechanism enables external non-cached bus masters to perform coherent reads and writes to the Cortex-A7 memory map. The snoop control unit manages coherency and makes the connection through AMBA 4 ACE-Lite; this acts as a functional replacement for the accelerator coherency port (ACP) that was present in Cortex-A5 and Cortex-A9. ACE-Lite is particularly useful for applications where the Cortex-A7 CPU is managing IO traffic driven by an external DMA.|
|Generic Interrupt Controller||Implementing the standardized and architected interrupt controller, the GIC provides a rich and flexible approach to inter-processor communication and to the routing and prioritization of system interrupts. It supports up to 480 independent interrupts; under software control, each interrupt can be distributed across CPUs, hardware prioritized, and routed between the operating system and the TrustZone software management layer. This routing flexibility, together with support for virtualization of interrupts into the operating system, provides one of the key features required to enhance the capabilities of a solution utilizing a hypervisor.|
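The multicore description above (cores shut down when idle, all cores used and the workload shared when demand is high) can be pictured with a toy sizing model. This is a sketch under stated assumptions, not ARM's power-management policy: demand is expressed in hypothetical "core-equivalents", at least one core is assumed to stay online to handle wake-ups, and the names `cores_needed` and `share_load` are illustrative only. The limit of four cores comes from the text.

```python
import math

MAX_CORES = 4  # from the text: four component processors
MIN_CORES = 1  # assumption: one core stays online to handle wake-ups


def cores_needed(demand: float, per_core_capacity: float = 1.0) -> int:
    """Smallest number of cores that covers the demand, clamped to the
    range [MIN_CORES, MAX_CORES]. Units are hypothetical core-equivalents."""
    if demand < 0 or per_core_capacity <= 0:
        raise ValueError("demand must be >= 0 and capacity must be > 0")
    wanted = math.ceil(demand / per_core_capacity)
    return max(MIN_CORES, min(MAX_CORES, wanted))


def share_load(demand: float, active_cores: int) -> list:
    """Split the demand evenly across the active cores."""
    return [demand / active_cores] * active_cores


if __name__ == "__main__":
    for demand in (0.0, 0.4, 2.5, 10.0):
        active = cores_needed(demand)
        print(demand, active, share_load(demand, active))
```

In this model a demand above four core-equivalents saturates all four cores, while a near-idle system keeps a single core online and leaves the other three powered down.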
The ARM CoreLink™ interconnect and memory controller IP addresses the critical challenge of efficiently moving and storing data between multiple Cortex-A7 MPCore processors, high-performance media processors and dynamic memories to optimize the system performance and power consumption of the SoC. The CoreLink system IP enables SoC designers to maximize the utilization of system memory bandwidth and reduce static and dynamic latencies. The ARM CoreSight™ technology provides complete on-chip debug and correlated, real-time trace visibility for all cores of the Cortex-A7 MPCore processor, reducing risk and speeding development of high-quality multiprocessing software. The new AMBA® 4 Cache Coherent Interconnect (CCI) provides optimum system bandwidth and latency.
ARM Physical IP Platforms deliver process-optimized IP for best-in-class implementations of the Cortex-A7 processor at 40nm and below. The Cortex-A7 processor is supported by a set of high-performance Processor Optimization Packs (POPs) containing advanced ARM Physical IP for 28nm technologies, which enable rapid development of leading physical implementations. ARM is also working early to assure a roadmap to 20nm optimizations. Optimization packs support ARM's strategy of offering specifically targeted Physical IP to enable Partners to achieve tuned implementations of ARM cores. ARM is uniquely able to design the optimization packs in parallel with the Cortex-A7 MPCore processor architecture, enabling the processor and physical IP combination to deliver workstation-class performance in a mobile power envelope while facilitating rapid time-to-market.
The ARM Development Studio 5 (DS-5™) tool suite, as well as a wide range of third-party tools, operating systems and EDA flows, fully support all ARM processors. ARM DS-5 software development tools are unique in their ability to provide solutions that take full advantage of the complete ARM technology portfolio. DS-5 provides a complete range of software tools to create, debug and optimize systems based on the Cortex-A15 MPCore processor. It incorporates the DS-5 Debugger, whose powerful and intuitive graphical environment enables fast debugging of bare-metal, Linux and Android native applications. In addition, its new ARM Streamline™ Performance Analyzer simplifies the identification of hot spots in software and load balancing between cores. The ARM Compiler, which already includes specific optimizations for the Cortex-A15 MPCore processor, enables early software development before silicon availability, as does an ARM Versatile™ Reference Virtual Platform built on ARM Fast Models technology. This Virtual Platform is available for a free 6-month evaluation.
The Mali™ family of products combine to provide the complete graphics stack for all embedded graphics needs, enabling device manufacturers and content developers to deliver the highest quality, cutting-edge graphics solutions across the broadest range of consumer devices.
ARM training courses and Active Assist on-site system-design advisory services enable licensees to integrate efficiently the Cortex-A7 MPCore processor into their design to realize maximum system performance with lowest risk and fastest time-to-market.
big.LITTLE Processing with the Cortex-A15 and Cortex-A7 Processors (150Kb PDF)
Cortex™-A7 MPCore™ Technical Reference Manual
Cortex™-A7 NEON™ Media Processing Engine Technical Reference Manual