P870 High-Performance RISC-V Processor
RISC-V is based on **standards**

Standards Accelerate Software Adoption and Portability

- Standards reduce cost
  - Faster Adoption
  - Compatibility across vendors

- Layered standards enable customization
  - RISC-V embraces customization without breaking compatibility

- More than just ISA Standards
  - RISC-V Standards extend beyond the Core ISA to system-level components

---

**Example SoC**

**Microcontroller Processor**

- **RVM-CSI Platform**
  - Specifies ISA Profile and System Level Requirements for a common RISC-V MCU

- **RVM23 Profile**
  - Specifies a set of RISC-V ISA extensions suitable for Microcontrollers

**Application Processor**

- **OS-A Platform**
  - Specifies ISA Profile and System Level Requirements for a common RISC-V APU

- **RVA23 Profile**
  - Specifies a set of RISC-V ISA extensions suitable for Application Processors

**Accelerators**

- **RVA23 Profile**
  - Specifies a set of RISC-V ISA extensions suitable for Application Processors

- **Custom**
  - RISC-V allows for custom instructions without breaking compatibility with existing software

---

**WorldGuard**

**RISC-V Advanced Interrupt Architecture**

**Debug and Trace**
SiFive Performance family: relentless innovation

[Roadmap figure: customer in-silicon dates, 2022 to 2024]

- Cores: P550, P450/P470, P650/P670, P870, P870-A
- Profiles: RVA20+, RVA22/RVA23, RVA23
- More performance
- Higher core count
- Leading RISC-V feature deployment
- Automotive specific features
- 3rd generation OoO core
SiFive Provides **Complete & Scalable** Solutions

**SiFive IP Complex**

- **CPU Clusters**
- **SiFive Cores**
- **Scalable Coherent Interconnect**
- **Advanced Power Management**

**Scalable High-Performance & High-Efficiency Cores:** P870, P670, & P470 (with selected Mix+Match)

**Shared Cluster L2 Cache**

**System IP to Enable Complete RISC-V SoC solutions**

**Advanced Interrupt Controller**
**SiFive Insight Debug & Trace**
**IOMMU**
**SiFive WorldGuard Security**
P870 Pipeline

Branch Predict/Fetch

Decode/Rename

Integer

Load/Store

Floating Point

Vector
P870 µArch

[Block diagram: P870 microarchitecture]

- Front end: 64KB Instruction Cache, 36-byte fetch
- Decode, Rename, Dispatch: 6-wide
- Vector: Vector Sequencer, Vector Dispatch Buffer, issue queue; ADD/MUL/MAC, Crypto, Div, and Perm units
- Floating point: FP Dispatch Buffer, issue queue; ADD/MUL/MAC and DIV/SQRT units
- Integer: Integer Dispatch Buffer, issue queues; ALU and BR units
- Load/Store: Load/Store Dispatch Buffer, issue queue; three AGUs feeding one LD and two LDST pipes
- Memory: Prefetchers, 64KB Data Cache, Cluster L2, and Shared L3, linked by 32-byte/cycle paths
©2023 SiFive

P870 µArch

- 64KB Icache with 32-byte/cycle fill
- 1K Next Line Predictor
- 64-entry Return Address Stack
- 16K entry TAGE Direction Predictor
- 2.5K entry Indirect Predictor
- 36-byte Fetch
- 32 entry ITLB
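
As a rough illustration of what a 36-byte fetch means in instruction terms, the sketch below estimates how many instructions fit in one fetch block for a given mix of 16-bit and 32-bit encodings. The function name and the `compressed_fraction` parameter are illustrative assumptions, and the result is an upper bound: it ignores taken branches, alignment, and cache misses.

```python
FETCH_BYTES = 36  # per-cycle fetch width quoted on the slide


def instructions_per_fetch(compressed_fraction: float,
                           fetch_bytes: int = FETCH_BYTES) -> float:
    """Average number of instructions in one fetch block.

    compressed_fraction is the share of instructions that use the 16-bit
    encoding; the rest are assumed to be 32-bit. This mix is a workload
    assumption, not a figure from the slides.
    """
    if not 0.0 <= compressed_fraction <= 1.0:
        raise ValueError("compressed_fraction must be between 0 and 1")
    avg_bytes = 2 * compressed_fraction + 4 * (1 - compressed_fraction)
    return fetch_bytes / avg_bytes


if __name__ == "__main__":
    for frac in (0.0, 0.5, 1.0):
        print(f"{frac:.0%} compressed: {instructions_per_fetch(frac):.1f} instructions")
```

Under these assumptions a fetch block holds between 9 and 18 instructions, so fetch can run ahead of a 6-wide decoder even when the code is entirely 32-bit.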
P870 µArch

- 6-wide decode
- 32-bit and 16-bit instructions
- Register Renames:
  - 228 Integer
  - 240 Floating Point
  - 128 Vector
- ROB – up to 1120 instructions
- 6-wide dispatch
P870 µArch

- Vector Sequencer unrolls multi-register instructions
- 32 Vector issue queue entries
- Two 128-bit, mostly symmetric, vector execution pipelines
- Dual Crypto Units
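
One way to picture the sequencer's unrolling step is sketched below. It assumes that "multi-register instructions" refers to RISC-V vector instructions operating on register groups (LMUL greater than 1) and that each register in the group becomes one micro-op for a 128-bit pipeline. The `unroll` function and the tuple layout are hypothetical, not SiFive's implementation.

```python
from typing import List, Tuple

MicroOp = Tuple[str, int, int, int]  # (opcode, vd, vs1, vs2), illustrative layout


def unroll(opcode: str, vd: int, vs1: int, vs2: int, lmul: int) -> List[MicroOp]:
    """Split one register-group instruction into single-register micro-ops.

    Only integer LMUL values are modeled. The RISC-V vector specification
    requires register numbers in a group to be multiples of LMUL.
    """
    if lmul not in (1, 2, 4, 8):
        raise ValueError("lmul must be 1, 2, 4, or 8")
    for reg in (vd, vs1, vs2):
        if not 0 <= reg < 32 or reg % lmul != 0:
            raise ValueError(f"v{reg} is not a valid base register for LMUL={lmul}")
    return [(opcode, vd + i, vs1 + i, vs2 + i) for i in range(lmul)]


if __name__ == "__main__":
    for uop in unroll("vadd.vv", vd=8, vs1=16, vs2=24, lmul=4):
        print(uop)
```

In this model an LMUL=4 `vadd.vv` becomes four micro-ops, which two execution pipelines could in the best case drain two at a time.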
P870 µArch

- 48 FP Issue queue entries
- Two mostly symmetric FP pipelines
- 2-cycle fadd SP/DP
- 2-cycle fmul SP/DP
- 4-cycle fmac SP/DP
P870 µArch

- 96 Integer issue queue entries
- 4 ALU units
- 1 branch unit
- 1 BR/ALU unit

[Block diagram: P870 microarchitecture, showing the branch predictors (NLP, RAS, Cond BP, Ind BP), 36-byte fetch from the 64KB Instruction Cache, the vector, floating-point, and integer dispatch buffers with their issue queues and execution units, and the 64KB Data Cache with prefetchers, connected to the Cluster L2 and Shared L3 at 32 bytes/cycle]
P870 µArch

- 64KB Data Cache
- 1 Load pipe
- 2 Load/Store pipes
- 32 LdSt issue queue entries
- 48 Load buffers
- 48 Store buffers
- 64 entry L1 DTLB
- 1K entry L2 TLB
- Stride and pattern prefetchers
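
The slides do not describe how the prefetchers work internally. As a generic illustration of the stride idea only, the sketch below implements a textbook per-PC stride detector: once the same address delta repeats often enough for a given load, it requests the next address in the sequence. The class, its `threshold` and `degree` parameters, and the table organization are all assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Entry:
    last_addr: int
    stride: int = 0
    confidence: int = 0


class StridePrefetcher:
    """Textbook per-PC stride prefetcher (not the P870 design)."""

    def __init__(self, threshold: int = 2, degree: int = 1) -> None:
        self.table: Dict[int, Entry] = {}
        self.threshold = threshold  # repeats needed before prefetching
        self.degree = degree        # how many strides ahead to request

    def access(self, pc: int, addr: int) -> List[int]:
        """Record a load and return the addresses to prefetch, if any."""
        entry = self.table.get(pc)
        if entry is None:
            self.table[pc] = Entry(last_addr=addr)
            return []
        stride = addr - entry.last_addr
        if stride != 0 and stride == entry.stride:
            entry.confidence = min(entry.confidence + 1, self.threshold)
        else:
            # New or broken pattern: remember the delta and start over.
            entry.stride = stride
            entry.confidence = 0
        entry.last_addr = addr
        if entry.confidence >= self.threshold:
            return [addr + entry.stride * k for k in range(1, self.degree + 1)]
        return []


if __name__ == "__main__":
    pf = StridePrefetcher()
    for a in (0, 64, 128, 192, 256):
        print(hex(a), "->", [hex(p) for p in pf.access(pc=0x100, addr=a)])
```

With the default threshold, a load walking memory in 64-byte steps starts generating prefetches on its fourth access; an irregular access resets the confidence counter.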
P870 µArch

- Shared Cluster L2 Cache
  - Up to 4 cores per L2
  - L3 Cache Shared with all Clusters
  - 32-byte/cycle evicts and fills
Cluster topology with shared L2 cache and distributed L3 cache
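
To put the 32-byte/cycle figure in perspective, the short calculation below converts it into cycles per cache line and peak bandwidth. The 64-byte line size and the 3 GHz clock are placeholder assumptions for the arithmetic; neither value appears in the slides.

```python
LINK_BYTES_PER_CYCLE = 32  # evict/fill width quoted on the slide


def fill_cycles(line_bytes: int, bytes_per_cycle: int = LINK_BYTES_PER_CYCLE) -> int:
    """Cycles needed to move one cache line across the link (rounded up)."""
    return -(-line_bytes // bytes_per_cycle)


def peak_bandwidth_gb_s(clock_hz: float,
                        bytes_per_cycle: int = LINK_BYTES_PER_CYCLE) -> float:
    """Peak one-direction bandwidth in GB/s (1 GB = 1e9 bytes)."""
    return bytes_per_cycle * clock_hz / 1e9


if __name__ == "__main__":
    print(fill_cycles(64))                # assumed 64-byte line
    print(peak_bandwidth_gb_s(3.0e9))     # assumed 3 GHz clock
```

These are peak numbers per link; sustained bandwidth depends on contention, cache hit rates, and the actual clock frequency.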
P870 Consumer example platform

- SiFive Advanced Debug & Interrupt
  - Interrupt Controller
  - Debug & Trace
- SiFive CPU Clusters: Performance cluster, High-efficiency cluster, Always-On cluster
  - 2x P870 with Shared L2$
  - 2x P470 with Shared L2$
  - 2x P470 with Shared L2$
  - E6
  - Shared L3$
- SiFive System IPs
  - IOMMU
  - WorldGuard gadgets
P870-A Functional safety features

- Core pairs in lockstep
- Online diagnostic and STL
- SECDED ECC
- High integrity L2/L3 cache controllers and SECDED ECC
- Advanced RAS architecture enabling error configuration, reporting, reaction and injection
- Multi-Cluster Coherent Crossbar
- Multi-Cluster Non-Coherent Crossbar
- Cluster bus and crossbar with error detection code and error handling
- High integrity interrupt controller
- ASIL D
- ASIL B
- Core Clusters
  - CPU Tile 0
  - CPU Tile 1
  - CPU Tile 2
  - CPU Tile 3
  - L1 I$  L1 D$
  - L2$ slice 0  L2$ slice 1
  - L3$ slice 0  L3$ slice 1
- Memory Ports
- System, Peripheral, Front ports
- Debug
- Interrupt
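
For readers unfamiliar with SECDED (single error correct, double error detect), the sketch below shows the principle on a toy extended Hamming (8,4) code: three Hamming parity bits locate a single flipped bit, and one overall parity bit separates single errors from double errors. The slides do not state which code or word width the P870-A uses; production designs typically protect much wider words, so treat this as a concept demo only.

```python
from typing import List, Optional, Tuple


def encode(nibble: int) -> List[int]:
    """Encode 4 data bits into an 8-bit extended Hamming codeword.

    Positions 1, 2, 4 hold Hamming parity; 3, 5, 6, 7 hold data;
    position 0 holds overall parity.
    """
    code = [0] * 8
    code[3], code[5], code[6], code[7] = [(nibble >> i) & 1 for i in range(4)]
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    code[0] = sum(code[1:]) % 2  # makes the total number of ones even
    return code


def decode(received: List[int]) -> Tuple[Optional[int], str]:
    """Return (data, status); data is None for an uncorrectable double error."""
    code = list(received)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos
    overall = sum(code) % 2
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        # Odd parity means one flipped bit; the syndrome gives its position
        # (0 means the overall parity bit itself was hit).
        code[syndrome] ^= 1
        status = "corrected"
    else:
        return None, "double-error"
    data = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return data, status


if __name__ == "__main__":
    word = encode(0b1011)
    word[5] ^= 1                 # inject a single-bit error
    print(decode(word))
    word[2] ^= 1                 # inject a second error
    print(decode(word))
```

The second half of the demo mirrors the error injection mentioned in the RAS bullet: flipping one bit is repaired transparently, while flipping two is reported rather than silently miscorrected.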
SiFive broad IP portfolio
Scalable from MCU to high-performance compute

<table>
<thead>
<tr>
<th>Automotive</th>
<th>Intelligence</th>
<th>Performance</th>
</tr>
</thead>
<tbody>
<tr>
<td>P870-A</td>
<td>X200-Series</td>
<td>P500-Series</td>
</tr>
<tr>
<td>64-bit</td>
<td>AI processor for Edge</td>
<td>&gt;8.6 SpecInt 2k6/GHz</td>
</tr>
<tr>
<td></td>
<td>Hypervisor extension</td>
<td>3-wide OoO core</td>
</tr>
<tr>
<td></td>
<td>Vector crypto</td>
<td>RVA20+</td>
</tr>
<tr>
<td></td>
<td>IOMMU &amp; AIA</td>
<td></td>
</tr>
<tr>
<td></td>
<td>WorldGuard</td>
<td></td>
</tr>
<tr>
<td></td>
<td>Shared cluster cache</td>
<td></td>
</tr>
<tr>
<td></td>
<td>RVA23</td>
<td></td>
</tr>
<tr>
<td></td>
<td>ASIL B, D</td>
<td></td>
</tr>
</tbody>
</table>

**Automotive**

- S7-A: 32/64-bit
- E6-A: 32-bit, balanced performance and efficiency; ASIL B, D

**Performance**

- P400-Series: >8.6 SpecInt 2k6/GHz, 3-wide OoO core, RVA22
- P600-Series: >13.1 SpecInt 2k6/GHz, 4-wide OoO core, RVA22
- P800-Series: >18 SpecInt 2k6/GHz, 6-wide OoO core, 128b vector length, Hypervisor extension

**Essential**

- U6-Series: 64-bit, high performance
- S2-Series: 64-bit, area optimized
- E2-Series: 32-bit, balanced performance and efficiency
- S6-Series
- S7-Series: 64-bit, high performance embedded
- E6-Series: 32-bit, optimized
- E7-Series

Empowering innovators

www.sifive.com