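OpenSPARC T1 Processor Walkthrough (Section 1.2)
Run OpenSPARC T1 on CentOS (Baidu Netdisk) https://blog.csdn.net/oYeXingDeHuHuan1/article/details/124620324
This time we cover Section 1.2. If your English is good, you can simply read the original text yourself. I will highlight and annotate (#annotations appear between hash marks, like this#) some of the key information, and find the corresponding Verilog source to explain it briefly. Readers who have read the following three books will find this easier to follow: "Computer Organization and Design: The Hardware/Software Interface", "Computer Architecture: A Quantitative Approach", and "Superscalar Processor Design" (《超標量處理器設計》). The list is of course not limited to these three; other books on computer architecture and processor design work just as well.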
1.2 Functional Description
The features of the OpenSPARC T1 processor include:
■ 8 SPARC V9 CPU cores, with 4 threads per core, for a total of 32 threads #8 cores, 4 threads per core#
■ 132 Gbytes/sec crossbar interconnect for on-chip communication #a 132 GB/s crossbar#
■ 16 Kbytes of primary (Level 1) instruction cache per CPU core #16 KB L1 instruction cache#
■ 8 Kbytes of primary (Level 1) data cache per CPU core #8 KB L1 data cache#
■ 3 Mbytes of secondary (Level 2) cache – 4 way banked, 12 way associative shared
by all CPU cores #3 MB L2 cache, split into 4 banks, 12-way set associative, shared by all cores#
■ 4 DDR-II DRAM controllers – 144-bit interface per channel, 25 GBytes/sec peak
total bandwidth #4 DDR-II DRAM controllers; to bring the design up on a ZedBoard, one could try switching to DDR3#
■ IEEE 754 compliant floating-point unit (FPU), shared by all CPU cores #the floating-point unit, one for the whole chip#
■ External interfaces: J-Bus interface (JBI) for I/O – 2.56 Gbytes/sec peak bandwidth, 128-bit
multiplexed address/data bus #external interface; we will look at it later#
■ Serial system interface (SSI) for boot PROM #the boot PROM interface; which PROMs exactly does it support?#
FIGURE 1-1 shows a block diagram of the OpenSPARC T1 processor illustrating the
various interfaces and integrated components of the chip.
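A quick sanity check of the numbers in the feature list. This is only a minimal sketch: it assumes "Kbytes" and "Mbytes" are powers of two and that the L2 capacity is split evenly across banks and ways. The variable names are mine, not from the spec.

```python
KIB = 1024
MIB = 1024 * KIB

# Figures quoted in the feature list above.
CORES = 8
THREADS_PER_CORE = 4
L1I_BYTES_PER_CORE = 16 * KIB
L1D_BYTES_PER_CORE = 8 * KIB
L2_BYTES = 3 * MIB
L2_BANKS = 4
L2_WAYS = 12

total_threads = CORES * THREADS_PER_CORE
total_l1_bytes = CORES * (L1I_BYTES_PER_CORE + L1D_BYTES_PER_CORE)

# Assumption: the L2 capacity is divided evenly across banks, then across ways.
l2_bytes_per_bank = L2_BYTES // L2_BANKS
l2_bytes_per_way = l2_bytes_per_bank // L2_WAYS

print(total_threads)             # 32
print(total_l1_bytes // KIB)     # 192
print(l2_bytes_per_bank // KIB)  # 768
print(l2_bytes_per_way // KIB)   # 64
```

The 12-way figure looks odd at first, but under these assumptions it is what makes 3 Mbytes divide evenly: each of the four banks holds twelve ways of 64 Kbytes.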
1.3 OpenSPARC T1 Components
This section provides further details about the OpenSPARC T1 components.
1.3.1 SPARC Core
Each SPARC core has hardware support for four threads #hardware support for 4 threads#. This support consists of a
full register file (with eight register windows) per thread, with most of the address
space identifiers (ASI), ancillary state registers (ASR), and privileged registers
replicated per thread. The four threads share the instruction, the data caches, and the
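TLBs. Each instruction cache is 16 Kbytes with a 32-byte line size #covered in the I-cache discussion#. The data caches are write through, 8 Kbytes, and have a 16-byte line size #covered in the D-cache discussion#. The TLBs include an autodemap feature which enables the multiple threads to update the TLB without
locking. #The four threads share the instruction cache, the data cache, and the TLBs (translation lookaside buffers, which cache virtual-to-physical address translations). Each thread has its own register file, and most of the ASIs, ASRs, and privileged registers are also replicated per thread. Autodemap means that writing a new TLB entry automatically removes an existing entry that conflicts with it, which is what lets multiple threads update the TLB without taking a lock.#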
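To make the autodemap idea concrete, here is a toy model. It is a sketch under my own assumptions, not the hardware's actual behavior: entries are matched only on (context, virtual page number), the TLB is fully associative, and replacement is plain FIFO. The class name `TinyTLB` and its methods are hypothetical.

```python
class TinyTLB:
    """Toy fully associative TLB keyed by (context, virtual page number)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = []  # list of (context, vpn, ppn), oldest first

    def insert(self, context, vpn, ppn):
        # Autodemap: drop any entry that already maps the same (context, vpn),
        # so a second insert for the same page never leaves a duplicate behind.
        self.entries = [e for e in self.entries if (e[0], e[1]) != (context, vpn)]
        if len(self.entries) >= self.capacity:
            self.entries.pop(0)  # FIFO replacement (an assumption of this sketch)
        self.entries.append((context, vpn, ppn))

    def lookup(self, context, vpn):
        matches = [e[2] for e in self.entries if (e[0], e[1]) == (context, vpn)]
        assert len(matches) <= 1, "a multiple hit would indicate a stale entry"
        return matches[0] if matches else None


tlb = TinyTLB(capacity=4)
tlb.insert(context=0, vpn=0x10, ppn=0xA)  # e.g. installed on behalf of thread 0
tlb.insert(context=0, vpn=0x10, ppn=0xB)  # thread 1 installs the same page again
print(hex(tlb.lookup(0, 0x10)))           # 0xb
print(len(tlb.entries))                   # 1
```

Without the filtering step in `insert`, the second write would leave two entries for the same page, and software would have to serialize TLB updates with a lock to prevent that.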
Each SPARC core has single issue, six stage pipeline. #each core is single-issue with a 6-stage pipeline# These six stages are:
1. Fetch
2. Thread Selection
3. Decode
4. Execute
5. Memory
6. Write Back
FIGURE 1-2 shows the SPARC core pipeline used in the OpenSPARC T1 Processor.
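The Thread Selection stage is what distinguishes this pipeline from a classic single-threaded one: each cycle it picks which of the four threads gets to issue. The sketch below assumes a least-recently-selected policy among the threads that are ready; that policy and the function name `select_threads` are my assumptions for illustration, and the real selection logic should be checked against the Verilog.

```python
from collections import deque


def select_threads(ready_per_cycle, num_threads=4):
    """Return the thread chosen in each cycle, or None when no thread is ready."""
    order = deque(range(num_threads))  # front = least recently selected
    chosen = []
    for ready in ready_per_cycle:
        # Pick the least recently selected thread that is ready this cycle.
        pick = next((t for t in order if t in ready), None)
        if pick is not None:
            order.remove(pick)
            order.append(pick)  # the winner becomes the most recently selected
        chosen.append(pick)
    return chosen


# All four threads ready every cycle: the selector degenerates to round-robin.
print(select_threads([{0, 1, 2, 3}] * 5))
# [0, 1, 2, 3, 0]

# Thread 1 stalls (say, on a cache miss) for a few cycles, then returns.
print(select_threads([{0, 1, 2, 3}, {0, 2, 3}, {0, 2, 3}, set(), {1}]))
# [0, 2, 3, None, 1]
```

The second run shows the point of the scheme: while one thread waits, the pipeline keeps issuing from the others instead of sitting idle.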
Each SPARC core has the following units:
1. Instruction fetch unit (IFU) includes the following pipeline stages – fetch, thread
selection, and decode. The IFU also includes an instruction cache complex.
2. Execution unit (EXU) includes the execute stage of the pipeline.
3. Load/store unit (LSU) includes memory and writeback stages, and a data cache
complex.
4. Trap logic unit (TLU) includes trap logic and trap program counters.
5. Stream processing unit (SPU) is used for modular arithmetic functions for crypto.
6. Memory management unit (MMU).
7. Floating-point frontend unit (FFU) interfaces to the FPU.
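The unit list above can be cross-checked against the six pipeline stages with a small lookup table. This is just a bookkeeping sketch: the stage ownership is copied from the list, units for which the list names no stage get an empty entry, and the helper `owner_of` is a hypothetical name.

```python
PIPELINE_STAGES = [
    "Fetch", "Thread Selection", "Decode", "Execute", "Memory", "Write Back",
]

# Stage ownership as stated in the unit list above. The list names no
# pipeline stage for TLU, SPU, MMU, or FFU, so they map to an empty list here.
UNIT_STAGES = {
    "IFU": ["Fetch", "Thread Selection", "Decode"],
    "EXU": ["Execute"],
    "LSU": ["Memory", "Write Back"],
    "TLU": [],
    "SPU": [],
    "MMU": [],
    "FFU": [],
}


def owner_of(stage):
    """Return the single unit that contains the given pipeline stage."""
    owners = [unit for unit, stages in UNIT_STAGES.items() if stage in stages]
    if len(owners) != 1:
        raise ValueError(f"stage {stage!r} has {len(owners)} owners")
    return owners[0]


for stage in PIPELINE_STAGES:
    print(f"{stage:17s} -> {owner_of(stage)}")
```

Every stage resolves to exactly one unit, which is a handy map to keep next to you when opening the Verilog directories for the first time.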
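Summary: each of the four threads has its own copy of some of the hardware. In the pipeline figure (FIGURE 1-2), the Strand Instruction Registers, Register Files, and Store Buffers each exist in four copies; everything else is shared, and each thread's identifier and priority are used to select the instructions and data belonging to that thread.
I am writing this as I read, so mistakes and incomplete understanding are inevitable; comments pointing them out and discussion are welcome. I am still reading the corresponding code and will cover it later.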
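The split between per-thread and shared state can be written down as a small data model. This is only an illustration of the summary above, not a description of the RTL: the class and field names are mine, and Python dicts and lists stand in for the real structures.

```python
from dataclasses import dataclass, field


@dataclass
class ThreadState:
    """State that exists once per hardware thread (four copies per core)."""
    instruction_register: int = 0
    register_file: dict = field(default_factory=dict)
    store_buffer: list = field(default_factory=list)


@dataclass
class SparcCoreModel:
    """One core: four private thread states plus structures shared by all threads."""
    threads: list = field(default_factory=lambda: [ThreadState() for _ in range(4)])
    icache: dict = field(default_factory=dict)
    dcache: dict = field(default_factory=dict)
    tlb: dict = field(default_factory=dict)


core = SparcCoreModel()
core.threads[0].register_file["g1"] = 42  # private: only thread 0 sees this
core.dcache[0x1000] = 7                   # shared: visible to every thread

print("g1" in core.threads[1].register_file)  # False
print(core.dcache[0x1000])                    # 7
```

A write to one thread's register file leaves the other three untouched, while a line brought into the data cache is visible to all four threads.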