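Linux Storage I/O Stack (4): Overview of the SCSI Subsystem
Overview
The Linux SCSI subsystem has a layered architecture:
Low layer: the actual drivers for the physical interfaces to SCSI, for example the drivers that vendors write for their specific host bus adapters (HBAs). A low-level driver discovers the SCSI devices attached to the host adapter, builds in memory the data structures the SCSI subsystem needs, and provides the message-passing interface that translates the sending and receiving of SCSI commands into host adapter operations.
High layer: the drivers for each type of SCSI device, such as the SCSI disk driver and the SCSI tape driver. A high-level driver claims the SCSI devices discovered by the low-level drivers, assigns names to them, and converts I/O on a device into SCSI commands, which it hands to the low-level driver.
Middle layer: the common service functions of the SCSI stack. The high and low layers do their work by calling mid-layer functions, and the mid layer in turn calls back into functions registered by the high and low layers for driver-specific handling.
The Linux SCSI model
The Linux SCSI model is a kernel abstraction. A host adapter connects the host I/O bus (such as PCI) to the storage I/O bus (such as a SCSI bus). A computer can have several host adapters, a host adapter can control one or more SCSI buses, a bus can have several target nodes attached to it, and a target node can contain several logical units.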
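In the Linux SCSI subsystem, a target in the kernel corresponds to a SCSI disk. A SCSI disk can contain several logical units, all managed by the disk controller. These logical units are the storage devices that actually serve as I/O endpoints, and the kernel abstracts each of them as a device. A Host in the kernel corresponds to a host adapter: a physical HBA or RAID card, or a virtual adapter such as the one registered for software iSCSI.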
The kernel uses the four-tuple <host:channel:id:lun> to uniquely identify a SCSI logical unit. In sysfs, the sda disk at <2:0:0:0> looks like this:
root@ubuntu16:/home/comet/Costor/bin# ls /sys/bus/scsi/devices/2\:0\:0\:0/block/sda/
alignment_offset device events_poll_msecs integrity removable sda5 subsystem
bdi discard_alignment ext_range power ro size trace
capability events holders queue sda1 slaves uevent
dev events_async inflight range sda2 stat
root@ubuntu16:/home/comet/Costor/bin# cat /sys/bus/scsi/devices/2\:0\:0\:0/block/sda/dev
8:0
root@ubuntu16:/home/comet/Costor/bin# ll /dev/sda
brw-rw---- 1 root disk 8, 0 Sep 19 11:36 /dev/sda
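- host: unique number of the host adapter.
- channel: number of the SCSI channel within the host adapter, maintained by the adapter firmware.
- id: unique identifier of the target node.
- lun: number of the logical unit within the target node.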
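The four fields map directly onto the directory names under /sys/bus/scsi/devices. As a minimal user-space sketch (the `scsi_addr` type and `scsi_addr_parse` function are hypothetical names for illustration, not kernel APIs), such a name could be split into its components like this:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical user-space mirror of the <host:channel:id:lun> address. */
struct scsi_addr {
	unsigned int host;
	unsigned int channel;
	unsigned int id;
	unsigned long long lun;
};

/*
 * Parse a sysfs-style name such as "2:0:0:0".
 * Returns 0 on success and -1 if the string is not exactly four
 * colon-separated numbers.
 */
int scsi_addr_parse(const char *name, struct scsi_addr *out)
{
	char trailing;
	int n;

	assert(name != NULL && out != NULL);
	/* The final %c only matches if unexpected characters follow the LUN. */
	n = sscanf(name, "%u:%u:%u:%llu%c", &out->host, &out->channel,
		   &out->id, &out->lun, &trailing);
	return n == 4 ? 0 : -1;
}
```

Calling `scsi_addr_parse("2:0:0:0", &addr)` fills `addr` with host 2, channel 0, id 0 and lun 0, which matches the sda example above.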
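SCSI commands
SCSI commands are defined in a Command Descriptor Block (CDB). A CDB contains an operation code that identifies the specific operation to perform, along with a number of operation-specific parameters.
Command | Purpose |
---|---|
Test unit ready | Check whether the device is ready for transfers |
Inquiry | Request basic device information |
Request sense | Request error information for the previous command |
Read capacity | Request storage capacity information |
Read | Read data from the device |
Write | Write data to the device |
Mode sense | Request mode pages (device parameters) |
Mode select | Configure device parameters in a mode page |
With roughly 60 commands available, SCSI suits many kinds of devices, including random-access devices such as disks and sequential-access devices such as tape. SCSI also provides dedicated commands for enclosure services, such as current sensing and temperature inside a storage enclosure.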
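To make the CDB layout concrete, here is a minimal sketch that fills in two of the commands from the table. It assumes the standard 6-byte INQUIRY layout (opcode 0x12, allocation length in bytes 3 and 4) and the 10-byte READ(10) layout (opcode 0x28, big-endian LBA in bytes 2 to 5, block count in bytes 7 and 8). The function names are hypothetical, and the sketch only builds the bytes; it does not send them to a device.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define OP_INQUIRY 0x12	/* INQUIRY operation code */
#define OP_READ_10 0x28	/* READ(10) operation code */

/* Fill a 6-byte INQUIRY CDB asking for alloc_len bytes of standard data. */
void build_inquiry_cdb(uint8_t cdb[6], uint16_t alloc_len)
{
	assert(cdb != NULL);
	memset(cdb, 0, 6);
	cdb[0] = OP_INQUIRY;
	cdb[3] = (uint8_t)(alloc_len >> 8);	/* allocation length, big-endian */
	cdb[4] = (uint8_t)(alloc_len & 0xff);
}

/* Fill a 10-byte READ(10) CDB: 32-bit LBA and 16-bit block count. */
void build_read10_cdb(uint8_t cdb[10], uint32_t lba, uint16_t nblocks)
{
	assert(cdb != NULL);
	memset(cdb, 0, 10);
	cdb[0] = OP_READ_10;
	cdb[2] = (uint8_t)(lba >> 24);		/* logical block address, big-endian */
	cdb[3] = (uint8_t)(lba >> 16);
	cdb[4] = (uint8_t)(lba >> 8);
	cdb[5] = (uint8_t)(lba & 0xff);
	cdb[7] = (uint8_t)(nblocks >> 8);	/* transfer length in blocks */
	cdb[8] = (uint8_t)(nblocks & 0xff);
}
```

The first byte is always the operation code; everything after it is specific to the operation, which is why the two CDBs differ in both length and field layout.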
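Kernel data structures
Host adapter template: scsi_host_template
A host adapter template holds what is common to all host adapters of the same model, including the request queue depth, the SCSI command handling callbacks, and the error recovery functions. Its fields are used to initialize each host adapter structure when one is allocated. The first step in writing a SCSI low-level driver is therefore to define a scsi_host_template; only then can host adapters be created from the template.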
struct scsi_host_template {
struct module *module; // the low-level driver module that implements hosts from this template
const char *name; // host adapter name
int (* detect)(struct scsi_host_template *);
int (* release)(struct Scsi_Host *);
const char *(* info)(struct Scsi_Host *); // returns information about the HBA; optional
int (* ioctl)(struct scsi_device *dev, int cmd, void __user *arg); // handles ioctl calls from user space; optional
#ifdef CONFIG_COMPAT
// handles ioctl calls from 32-bit user space on a 64-bit kernel
int (* compat_ioctl)(struct scsi_device *dev, int cmd, void __user *arg);
#endif
// queues a SCSI command to the low-level driver; called by the mid layer; mandatory
int (* queuecommand)(struct Scsi_Host *, struct scsi_cmnd *);
// the following five are error-handling callbacks, which the mid layer invokes in order of increasing severity
int (* eh_abort_handler)(struct scsi_cmnd *); //Abort
int (* eh_device_reset_handler)(struct scsi_cmnd *); //Device Reset
int (* eh_target_reset_handler)(struct scsi_cmnd *); //Target Reset
int (* eh_bus_reset_handler)(struct scsi_cmnd *); //Bus Reset
int (* eh_host_reset_handler)(struct scsi_cmnd *); //Host Reset
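// called by the mid layer when a new device is found during scanning; the low-level driver can allocate and initialize its per-device structures here
int (* slave_alloc)(struct scsi_device *);
// called after the device has answered the INQUIRY command, to apply device-specific configuration
int (* slave_configure)(struct scsi_device *);
// called by the mid layer before a SCSI device is destroyed, to free the private data allocated in slave_alloc
void (* slave_destroy)(struct scsi_device *);
// called by the mid layer when a new target is discovered, to allocate target-private data
int (* target_alloc)(struct scsi_target *);
// called by the mid layer before a target is destroyed; the low-level driver frees what target_alloc allocated
void (* target_destroy)(struct scsi_target *);
// for custom target scanning: the mid layer polls this function until it returns 1, which means the scan has finished
int (* scan_finished)(struct Scsi_Host *, unsigned long);
// for custom target scanning: called before the scan starts
void (* scan_start)(struct Scsi_Host *);
// changes the queue depth of a device on this host; returns the depth actually set
int (* change_queue_depth)(struct scsi_device *, int);
// determines the BIOS parameters of a disk; parameters: size, device, list (heads, sectors, cylinders)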
int (* bios_param)(struct scsi_device *, struct block_device *,
sector_t, int []);
void (*unlock_native_capacity)(struct scsi_device *);
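// read and write callbacks for the host's procfs entry
int (*show_info)(struct seq_file *, struct Scsi_Host *);
int (*write_info)(struct Scsi_Host *, char *, int);
// called when the mid layer detects that a SCSI command has timed out
enum blk_eh_timer_return (*eh_timed_out)(struct scsi_cmnd *);
// called when the host adapter is reset through its sysfs attribute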
int (*host_reset)(struct Scsi_Host *shost, int reset_type);
#define SCSI_ADAPTER_RESET 1
#define SCSI_FIRMWARE_RESET 2
const char *proc_name; // name used in the proc filesystem
struct proc_dir_entry *proc_dir;
int can_queue; // number of commands the host adapter can accept at the same time
int this_id;
/*
* This determines the degree to which the host adapter is capable
* of scatter-gather.
*/ // scatter-gather list parameters
unsigned short sg_tablesize;
unsigned short sg_prot_tablesize;
/*
* Set this if the host adapter has limitations beside segment count.
*/ // maximum number of sectors a single SCSI command can access
unsigned int max_sectors;
/*
* DMA scatter gather segment boundary limit. A segment crossing this
* boundary will be split in two.
*/
unsigned long dma_boundary; // DMA scatter-gather segment boundary; a segment that crosses it is split in two
#define SCSI_DEFAULT_MAX_SECTORS 1024
short cmd_per_lun;
/*
* present contains counter indicating how many boards of this
* type were found when we did the scan.
*/
unsigned char present;
/* If use block layer to manage tags, this is tag allocation policy */
int tag_alloc_policy;
/*
* Track QUEUE_FULL events and reduce queue depth on demand.
*/
unsigned track_queue_depth:1;
/*
* This specifies the mode that a LLD supports.
*/
unsigned supported_mode:2; // modes the low-level driver supports (initiator or target)
/*
* True if this host adapter uses unchecked DMA onto an ISA bus.
*/
unsigned unchecked_isa_dma:1;
unsigned use_clustering:1;
/*
* True for emulated SCSI host adapters (e.g. ATAPI).
*/
unsigned emulated:1;
/*
* True if the low-level driver performs its own reset-settle delays.
*/
unsigned skip_settle_delay:1;
/* True if the controller does not support WRITE SAME */
unsigned no_write_same:1;
/*
* True if asynchronous aborts are not supported
*/
unsigned no_async_abort:1;
/*
* Countdown for host blocking with no commands outstanding.
*/
unsigned int max_host_blocked; // starting value of the host_blocked countdown, used when the host reports busy with no commands outstanding
#define SCSI_DEFAULT_HOST_BLOCKED 7
/*
* Pointer to the sysfs class properties for this host, NULL terminated.
*/
struct device_attribute **shost_attrs; // host adapter class attributes
/*
* Pointer to the SCSI device properties for this host, NULL terminated.
*/
struct device_attribute **sdev_attrs; // attributes of the SCSI devices on this host
struct list_head legacy_hosts;
u64 vendor_id;
/*
* Additional per-command data allocated for the driver.
*/ // SCSI command pool: commands are preallocated and kept in cmd_pool
unsigned int cmd_size;
struct scsi_host_cmd_pool *cmd_pool;
/* temporary flag to disable blk-mq I/O path */
bool disable_blk_mq; // flag that disables the block layer's multi-queue mode
};
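The five eh_* callbacks form an escalation ladder: the mid layer starts with the least disruptive action and moves on to harsher ones only when the previous one fails. The following user-space sketch models just that ordering. All names in it (`mock_eh_template`, `mock_eh_escalate` and so on) are hypothetical, and the real mid layer does considerably more between steps, so this is an illustration of the idea rather than the kernel's implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result codes; the kernel uses its own SUCCESS/FAILED values. */
enum mock_eh_result { MOCK_EH_FAILED = 0, MOCK_EH_SUCCESS = 1 };

struct mock_cmd { int id; };

typedef enum mock_eh_result (*mock_eh_fn)(struct mock_cmd *);

/* Hypothetical subset of scsi_host_template: only the recovery callbacks. */
struct mock_eh_template {
	mock_eh_fn eh_abort_handler;
	mock_eh_fn eh_device_reset_handler;
	mock_eh_fn eh_target_reset_handler;
	mock_eh_fn eh_bus_reset_handler;
	mock_eh_fn eh_host_reset_handler;
};

/* Stub handlers standing in for what a low-level driver would provide. */
enum mock_eh_result mock_eh_fail(struct mock_cmd *cmd) { (void)cmd; return MOCK_EH_FAILED; }
enum mock_eh_result mock_eh_ok(struct mock_cmd *cmd) { (void)cmd; return MOCK_EH_SUCCESS; }

/*
 * Try each implemented handler from least to most disruptive.
 * Returns the index (0 = abort ... 4 = host reset) of the first handler
 * that succeeds, or -1 if every step failed or was not implemented.
 */
int mock_eh_escalate(const struct mock_eh_template *t, struct mock_cmd *cmd)
{
	assert(t != NULL && cmd != NULL);
	const mock_eh_fn ladder[] = {
		t->eh_abort_handler,
		t->eh_device_reset_handler,
		t->eh_target_reset_handler,
		t->eh_bus_reset_handler,
		t->eh_host_reset_handler,
	};

	for (size_t i = 0; i < sizeof(ladder) / sizeof(ladder[0]); i++) {
		if (ladder[i] != NULL && ladder[i](cmd) == MOCK_EH_SUCCESS)
			return (int)i;
	}
	return -1;
}
```

A driver that leaves a callback unset simply has that rung skipped, which is why the sketch checks for NULL before calling.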
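Host adapter: Scsi_Host
Scsi_Host describes a SCSI host adapter, which is usually a PCI expansion card or a SCSI controller chip. Each SCSI host adapter can have several channels, and each channel in effect extends one SCSI bus. Each channel can connect several SCSI target nodes; how many depends on the load capacity of the SCSI bus or on limits set by the specific SCSI protocol. A real host bus adapter is attached to the host I/O bus (usually PCI). At boot, the system scans the devices on the PCI bus, and the host bus adapter is allocated at that point.
The Scsi_Host structure embeds a generic device, which is linked into the device list of the SCSI bus type (scsi_bus_type).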
struct Scsi_Host {
struct list_head __devices; // list of devices
struct list_head __targets; // list of target nodes
struct scsi_host_cmd_pool *cmd_pool; // SCSI command pool
spinlock_t free_list_lock; // protects free_list
struct list_head free_list; /* backup store of cmd structs: preallocated spare commands */
struct list_head starved_list; // list of starved SCSI devices
spinlock_t default_lock;
spinlock_t *host_lock;
struct mutex scan_mutex;/* serialize scanning activity */
struct list_head eh_cmd_q; // list of failed SCSI commands awaiting error handling
struct task_struct * ehandler; /* Error recovery thread. */
struct completion * eh_action; /* Wait for specific actions on the
host. */
wait_queue_head_t host_wait; // wait queue for SCSI device recovery
struct scsi_host_template *hostt; // host adapter template
struct scsi_transport_template *transportt; // points to the SCSI transport template
/*
* Area to keep a shared tag map (if needed, will be
* NULL if not).
*/
union {
struct blk_queue_tag *bqt;
struct blk_mq_tag_set tag_set; // used when SCSI runs in multi-queue mode
};
// number of SCSI commands already dispatched to the host adapter (low-level driver)
atomic_t host_busy; /* commands actually active on low-level */
atomic_t host_blocked; // countdown during which the host stays blocked
unsigned int host_failed; /* commands that failed.
protected by host_lock */
unsigned int host_eh_scheduled; /* EH scheduled without command */
unsigned int host_no; /* Used for IOCTL_GET_IDLUN, /proc/scsi et al. Unique within the system. */
/* next two fields are used to bound the time spent in error handling */
int eh_deadline;
unsigned long last_reset; // time of the last reset
/*
* These three parameters can be used to allow for wide scsi,
* and for host adapters that support multiple busses
* The last two should be set to 1 more than the actual max id
* or lun (e.g. 8 for SCSI parallel systems).
*/
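unsigned int max_channel; // highest channel number on the host adapter
unsigned int max_id; // upper bound on target ids (one more than the highest id)
u64 max_lun; // upper bound on LUNs (one more than the highest LUN)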
unsigned int unique_id;
/*
* The maximum length of SCSI commands that this host can accept.
* Probably 12 for most host adapters, but could be 16 for others.
* or 260 if the driver supports variable length cdbs.
* For drivers that don't set this field, a value of 12 is
* assumed.
*/
unsigned short max_cmd_len; // longest SCSI command the host adapter can accept
// the fields below also exist in scsi_host_template and are initialized from the template
int this_id;
int can_queue;
short cmd_per_lun;
short unsigned int sg_tablesize;
short unsigned int sg_prot_tablesize;
unsigned int max_sectors;
unsigned long dma_boundary;
/*
* In scsi-mq mode, the number of hardware queues supported by the LLD.
*
* Note: it is assumed that each hardware queue has a queue depth of
* can_queue. In other words, the total queue depth per host
* is nr_hw_queues * can_queue.
*/
unsigned nr_hw_queues; // number of hardware queues the low-level driver supports in scsi-mq mode
/*
* Used to assign serial numbers to the cmds.
* Protected by the host lock.
*/
unsigned long cmd_serial_number; // counter used to assign command serial numbers
unsigned active_mode:2; // whether the host acts as initiator or target
unsigned unchecked_isa_dma:1;
unsigned use_clustering:1;
/*
* Host has requested that no further requests come through for the
* time being.
*/
unsigned host_self_blocked:1; // the low-level driver has asked to block this host; the mid layer dispatches no further commands to its queue
/*
* Host uses correct SCSI ordering not PC ordering. The bit is
* set for the minority of drivers whose authors actually read
* the spec ;).
*/
unsigned reverse_ordering:1;
/* Task mgmt function in progress */
unsigned tmf_in_progress:1; // a task management function is running
/* Asynchronous scan in progress */
unsigned async_scan:1; // an asynchronous scan is running
/* Don't resume host in EH */
unsigned eh_noresume:1; // do not resume the host adapter during error handling
/* The controller does not support WRITE SAME */
unsigned no_write_same:1;
unsigned use_blk_mq:1; // whether SCSI multi-queue mode is in use
unsigned use_cmd_list:1;
/* Host responded with short (<36 bytes) INQUIRY result */
unsigned short_inquiry:1;
/*
* Optional work queue to be utilized by the transport
*/
char work_q_name[20]; // name of the work queue used by the SCSI transport layer
struct workqueue_struct *work_q;
/*
* Task management function work queue
*/
struct workqueue_struct *tmf_work_q; // work queue for task management functions
/* The transport requires the LUN bits NOT to be stored in CDB[1] */
unsigned no_scsi2_lun_in_cdb:1;
/*
* Value host_blocked counts down from
*/
unsigned int max_host_blocked; // value that host_blocked counts down from before the host is tried again
/* Protection Information */
unsigned int prot_capabilities;
unsigned char prot_guard_type;
/*
* q used for scsi_tgt msgs, async events or any other requests that
* need to be processed in userspace
*/
struct request_queue *uspace_req_q; // request queue for scsi_tgt messages, asynchronous events and other requests handled in user space
/* legacy crap */
unsigned long base;
unsigned long io_port; // I/O port number
unsigned char n_io_port;
unsigned char dma_channel;
unsigned int irq;
enum scsi_host_state shost_state; // host state
/* ldm bits */ // shost_gendev: embedded generic device that links the host into the device list of the SCSI bus type (scsi_bus_type)
struct device shost_gendev, shost_dev;
// shost_dev: embedded class device that links the host into the device list of the SCSI host class (shost_class)
/*
* List of hosts per template.
*
* This is only for use by scsi_module.c for legacy templates.
* For these access to it is synchronized implicitly by
* module_init/module_exit.
*/
struct list_head sht_legacy_list;
/*
* Points to the transport data (if any) which is allocated
* separately
*/
void *shost_data; // separately allocated transport data, used by the SCSI transport layer
/*
* Points to the physical bus device we'd use to do DMA
* Needed just in case we have virtual hosts.
*/
struct device *dma_dev;
/*
* We should ensure that this is aligned, both for better performance
* and also because some compilers (m68k) don't automatically force
* alignment to a long boundary.
*/ // host adapter private data
unsigned long hostdata[0] /* Used for storage of host specific stuff */
__attribute__ ((aligned (sizeof(unsigned long))));
};
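Two details of this structure are easier to see in a small model: several fields are copied from the template when the host is allocated, and the trailing hostdata array gives the low-level driver a private area inside the same allocation. The sketch below assumes nothing beyond that idea. `mock_host_alloc` and `mock_host_priv` are hypothetical user-space stand-ins, loosely modeled on how the kernel's scsi_host_alloc and shost_priv are used, and only two of the copied fields are shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, heavily reduced template. */
struct mock_host_template {
	const char *name;
	int can_queue;
	short cmd_per_lun;
};

/* Hypothetical, heavily reduced host. */
struct mock_host {
	const struct mock_host_template *hostt;	/* template this host came from */
	int can_queue;				/* copied from the template */
	short cmd_per_lun;			/* copied from the template */
	unsigned long hostdata[];		/* driver-private area, must be last */
};

/* Allocate a host plus privsize bytes of private data in one block. */
struct mock_host *mock_host_alloc(const struct mock_host_template *t, size_t privsize)
{
	struct mock_host *h;

	assert(t != NULL);
	h = calloc(1, sizeof(*h) + privsize);
	if (h == NULL)
		return NULL;
	h->hostt = t;
	h->can_queue = t->can_queue;
	h->cmd_per_lun = t->cmd_per_lun;
	return h;
}

/* Return the driver-private area that follows the host structure. */
void *mock_host_priv(struct mock_host *h)
{
	assert(h != NULL);
	return h->hostdata;
}
```

A driver would typically pass the size of its own per-adapter structure as `privsize` and cast the pointer returned by `mock_host_priv` to that type, so that one `free` releases both the host and the private data.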
Target node: scsi_target
The scsi_target structure embeds a driver-model device, which is linked into the device list of the SCSI bus type scsi_bus_type.
struct scsi_target {
struct scsi_device *starget_sdev_user; // the SCSI device currently doing I/O, or NULL if there is none
struct list_head siblings; // links into the host adapter's target list
struct list_head devices; // list of devices that belong to this target
struct device dev; // generic device, used to join the driver model
struct kref reap_ref; /* last put renders target invisible; reference count of this structure */
unsigned int channel; // channel number this target is on
unsigned int id; /* target id ... replace
* scsi_device.id eventually */
unsigned int create:1; /* signal that it needs to be added */
unsigned int single_lun:1; /* Indicates we should only
* allow I/O to one of the luns
* for the device at a time. */
unsigned int pdt_1f_for_no_lun:1; /* PDT = 0x1f
* means no lun present. */
unsigned int no_report_luns:1; /* Don't use
* REPORT LUNS for scanning. */
unsigned int expecting_lun_change:1; /* A device has reported
* a 3F/0E UA, other devices on
* the same target will also. */
/* commands actually active on LLD. */
atomic_t target_busy;
atomic_t target_blocked; // countdown during which the target stays blocked
/*
* LLDs should set this in the slave_alloc host template callout.
* If set to zero then there is not limit.
*/
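unsigned int can_queue; // number of commands the target can process at the same time
unsigned int max_target_blocked; // value that target_blocked counts down from
#define SCSI_DEFAULT_TARGET_BLOCKED 3
char scsi_level; // SCSI specification level supported
enum scsi_target_state state; // target state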
void *hostdata; /* available to low-level driver */
unsigned long starget_data[0]; /* for the transport: used by the SCSI transport layer */
/* starget_data must be the last element!!!! */
} __attribute__((aligned(sizeof(unsigned long))));
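The interplay of target_busy, target_blocked, can_queue and max_target_blocked can be modeled in a few lines. The sketch below is a simplified, single-threaded approximation of the kind of check the mid layer makes before dispatching a command to a target. The function names are hypothetical, plain integers replace the atomics, and starvation handling is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical reduction of scsi_target to its throttling fields. */
struct mock_target {
	int target_busy;	/* commands currently outstanding */
	int target_blocked;	/* remaining countdown while blocked */
	int can_queue;		/* 0 means no limit */
	int max_target_blocked;	/* value the countdown starts from */
};

/* Called when the low-level driver reports that the target is busy. */
void mock_target_block(struct mock_target *t)
{
	assert(t != NULL);
	t->target_blocked = t->max_target_blocked;
}

/*
 * Decide whether one more command may be sent to the target.
 * On success the command is counted in target_busy.
 */
bool mock_target_queue_ready(struct mock_target *t)
{
	assert(t != NULL);
	if (t->target_blocked > 0) {
		if (t->target_busy > 0)
			return false;	/* let outstanding commands drain first */
		if (--t->target_blocked > 0)
			return false;	/* still counting down */
	}
	if (t->can_queue > 0 && t->target_busy >= t->can_queue)
		return false;		/* target is at its queue limit */
	t->target_busy++;
	return true;
}
```

With max_target_blocked set to 3, a blocked and idle target refuses the next two dispatch attempts and accepts the third, which is the countdown behaviour the field comments describe.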
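Logical device: scsi_device
scsi_device describes a SCSI logical device and represents one logical unit (LUN) of a SCSI disk. The device behind a scsi_device descriptor may be a SATA/SAS/SCSI disk or SSD located in another storage system. When the operating system scans the logical devices attached to a host adapter, it creates a scsi_device structure for each of them, which the SCSI high-level drivers then use to communicate with the device.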
struct scsi_device {
struct Scsi_Host *host; // host bus adapter this device belongs to
struct request_queue *request_queue; // request queue
/* the next two are protected by the host->host_lock */
struct list_head siblings; /* list of all devices on this host */ // links into the host adapter's device list
struct list_head same_target_siblings; /* just the devices sharing same target id */ // links into the target's device list
atomic_t device_busy; /* commands actually active on LLDD */
atomic_t device_blocked; /* Device returned QUEUE_FULL. */
spinlock_t list_lock;
struct list_head cmd_list; /* queue of in use SCSI Command structures */
struct list_head starved_entry; // links into the host adapter's "starved" list
struct scsi_cmnd *current_cmnd; /* currently active command */
unsigned short queue_depth; /* How deep of a queue we want */
unsigned short max_queue_depth; /* max queue depth */
unsigned short last_queue_full_depth; /* These two are used by */
unsigned short last_queue_full_count; /* scsi_track_queue_full() */
unsigned long last_queue_full_time; /* last queue full time */
unsigned long queue_ramp_up_period; /* ramp up period in jiffies */
#define SCSI_DEFAULT_RAMP_UP_PERIOD (120 * HZ)
unsigned long last_queue_ramp_up; /* last queue ramp up time */
unsigned int id, channel; // target id and channel number this device belongs to
u64 lun; // LUN of this device
unsigned int manufacturer; /* Manufacturer of device, for using
* vendor-specific cmd's */
unsigned sector_size; /* size in bytes of a hardware sector */
void *hostdata; /* available to low-level driver (private data) */
char type; // SCSI device type
char scsi_level; // version of the SCSI specification supported, obtained from INQUIRY
char inq_periph_qual; /* PQ from INQUIRY data */
unsigned char inquiry_len; /* valid bytes in 'inquiry' */
unsigned char * inquiry; /* INQUIRY response data */
const char * vendor; /* [back_compat] point into 'inquiry' ... */
const char * model; /* ... after scan; point to static string */
const char * rev; /* ... "nullnullnullnull" before scan */
#define SCSI_VPD_PG_LEN 255
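int vpd_pg83_len; // length of VPD page 0x83 (device identification), read with INQUIRY
unsigned char *vpd_pg83;
int vpd_pg80_len; // length of VPD page 0x80 (unit serial number), read with INQUIRY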
unsigned char *vpd_pg80;
unsigned char current_tag; /* current tag */
struct scsi_target *sdev_target; /* used only for single_lun */
unsigned int sdev_bflags; /* black/white flags as also found in
* scsi_devinfo.[hc]. For now used only to
* pass settings from slave_alloc to scsi
* core. */
unsigned int eh_timeout; /* Error handling timeout */
unsigned removable:1;
unsigned changed:1; /* Data invalid due to media change */
unsigned busy:1; /* Used to prevent races */
unsigned lockable:1; /* Able to prevent media removal */
unsigned locked:1; /* Media removal disabled */
unsigned borken:1; /* Tell the Seagate driver to be
* painfully slow on this device */
unsigned disconnect:1; /* can disconnect */
unsigned soft_reset:1; /* Uses soft reset option */
unsigned sdtr:1; /* Device supports SDTR messages (synchronous data transfer) */
unsigned wdtr:1; /* Device supports WDTR messages (16-bit wide data transfer) */
unsigned ppr:1; /* Device supports PPR (parallel protocol request) messages */
unsigned tagged_supported:1; /* Supports SCSI-II tagged queuing */
unsigned simple_tags:1; /* simple queue tag messages are enabled */
unsigned was_reset:1; /* There was a bus reset on the bus for
* this device */
unsigned expecting_cc_ua:1; /* Expecting a CHECK_CONDITION/UNIT_ATTN
* because we did a bus reset. */
unsigned use_10_for_rw:1; /* first try 10-byte read / write */
unsigned use_10_for_ms:1; /* first try 10-byte mode sense/select */
unsigned no_report_opcodes:1; /* no REPORT SUPPORTED OPERATION CODES */
unsigned no_write_same:1; /* no WRITE SAME command */
unsigned use_16_for_rw:1; /* Use read/write(16) over read/write(10) */
unsigned skip_ms_page_8:1; /* do not use MODE SENSE page 0x08 */
unsigned skip_ms_page_3f:1; /* do not use MODE SENSE page 0x3f */
unsigned skip_vpd_pages:1; /* do not read VPD pages */
unsigned try_vpd_pages:1; /* attempt to read VPD pages */
unsigned use_192_bytes_for_3f:1; /* ask for 192 bytes from page 0x3f */
unsigned no_start_on_add:1; /* do not issue start on add */
unsigned allow_restart:1; /* issue START_UNIT in error handler */
unsigned manage_start_stop:1; /* Let HLD (sd) manage start/stop */
unsigned start_stop_pwr_cond:1; /* Set power cond. in START_STOP_UNIT */
unsigned no_uld_attach:1; /* disable connecting to upper level drivers */
unsigned select_no_atn:1;
unsigned fix_capacity:1; /* READ_CAPACITY is too high by 1 */
unsigned guess_capacity:1; /* READ_CAPACITY might be too high by 1 */
unsigned retry_hwerror:1; /* Retry HARDWARE_ERROR */
unsigned last_sector_bug:1; /* do not use multisector accesses on
SD_LAST_BUGGY_SECTORS */
unsigned no_read_disc_info:1; /* Avoid READ_DISC_INFO cmds */
unsigned no_read_capacity_16:1; /* Avoid READ_CAPACITY_16 cmds */
unsigned try_rc_10_first:1; /* Try READ_CAPACITY_10 first */
unsigned is_visible:1; /* is the device visible in sysfs */
unsigned wce_default_on:1; /* Cache is ON by default */
unsigned no_dif:1; /* T10 PI (DIF) should be disabled */
unsigned broken_fua:1; /* Don't set FUA bit */
unsigned lun_in_cdb:1; /* Store LUN bits in CDB[1] */
atomic_t disk_events_disable_depth; /* disable depth for disk events */
DECLARE_BITMAP(supported_events, SDEV_EVT_MAXBITS); /* supported events */
DECLARE_BITMAP(pending_events, SDEV_EVT_MAXBITS); /* pending events */
struct list_head event_list; /* asserted events */
struct work_struct event_work;
unsigned int max_device_blocked; /* what device_blocked counts down from */
#define SCSI_DEFAULT_DEVICE_BLOCKED 3
atomic_t iorequest_cnt;
atomic_t iodone_cnt;
atomic_t ioerr_cnt;
struct device sdev_gendev, // embedded generic device, linked into the device list of the SCSI bus type (scsi_bus_type)
sdev_dev; // embedded class device, linked into the device list of the SCSI device class (sdev_class)
struct execute_work ew; /* used to get process context on put */
struct work_struct requeue_work;
struct scsi_device_handler *handler; // custom device handler
void *handler_data;
enum scsi_device_state sdev_state; // SCSI device state
unsigned long sdev_data[0]; // used by the SCSI transport layer
} __attribute__((aligned(sizeof(unsigned long))));
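Because every scsi_device is linked into its host's device list and carries its own channel, id and lun, finding a logical unit on a host comes down to a list walk that compares those three fields. The sketch below shows the idea with a plain singly linked list in place of the kernel's list_head. All names are hypothetical and no locking or reference counting is modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reduction of scsi_device to its address fields. */
struct mock_device {
	unsigned int channel, id;
	uint64_t lun;
	struct mock_device *next;	/* stand-in for the siblings list_head */
};

/* Hypothetical reduction of Scsi_Host to its device list. */
struct mock_dev_host {
	unsigned int host_no;
	struct mock_device *devices;	/* stand-in for __devices */
};

/* Link a device at the head of the host's device list. */
void mock_host_add_device(struct mock_dev_host *h, struct mock_device *d)
{
	assert(h != NULL && d != NULL);
	d->next = h->devices;
	h->devices = d;
}

/* Return the device at <channel:id:lun> on this host, or NULL if absent. */
struct mock_device *mock_device_lookup(const struct mock_dev_host *h,
				       unsigned int channel,
				       unsigned int id, uint64_t lun)
{
	assert(h != NULL);
	for (struct mock_device *d = h->devices; d != NULL; d = d->next) {
		if (d->channel == channel && d->id == id && d->lun == lun)
			return d;
	}
	return NULL;
}
```

Together with the host number, the three compared fields make up the four-tuple that names the device in sysfs.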
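The kernel's SCSI command structure: scsi_cmnd
A scsi_cmnd structure is created by the SCSI mid layer and passed to the SCSI low-level driver. A scsi_cmnd is created for every I/O request, but a scsi_cmnd is not necessarily an I/O request. Each scsi_cmnd is eventually turned into one concrete SCSI command. Besides the command descriptor block, scsi_cmnd carries richer information, including the data buffer, the sense data buffer, the completion callback, and the associated block-layer request. It is the context in which the SCSI mid layer executes a SCSI command.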
struct scsi_cmnd {
struct scsi_device *device; // the SCSI device this command belongs to
struct list_head list; /* scsi_cmnd participates in queue lists; links into the device's command list */
struct list_head eh_entry; /* entry for the host eh_cmd_q */
struct delayed_work abort_work;
int eh_eflags; /* Used by error handler */
/*
* A SCSI Command is assigned a nonzero serial_number before passed
* to the driver's queue command function. The serial_number is
* cleared when scsi_done is entered indicating that the command
* has been completed. It is a bug for LLDDs to use this number
* for purposes other than printk (and even that is only useful
* for debugging).
*/
unsigned long serial_number; // unique serial number of the command
/*
* This is set to jiffies as it was when the command was first
* allocated. It is used to time how long the command has
* been outstanding
*/
unsigned long jiffies_at_alloc; // jiffies at allocation time, used to measure how long the command has been outstanding
int retries; // number of retries so far
int allowed; // number of retries allowed
unsigned char prot_op; // protection operation (DIF and DIX)
unsigned char prot_type; // DIF protection type
unsigned char prot_flags;
unsigned short cmd_len; // command length
enum dma_data_direction sc_data_direction; // data transfer direction
/* These elements define the operation we are about to perform */
unsigned char *cmnd; // the command bytes (CDB) in SCSI specification format
/* These elements define the operation we ultimately want to perform */
struct scsi_data_buffer sdb; // data buffer of the command
struct scsi_data_buffer *prot_sdb; // protection information buffer of the command
unsigned underflow; /* Return error if less than
this amount is transferred */
unsigned transfersize; /* Transfer unit: how much we are guaranteed to
transfer with each SCSI transfer
(ie, between disconnect /
reconnects. Probably == sector
size */
struct request *request; /* The block-layer request we are
working on */
#define SCSI_SENSE_BUFFERSIZE 96
unsigned char *sense_buffer; // sense data buffer of the command
/* obtained by REQUEST SENSE when
* CHECK CONDITION is received on original
* command (auto-sense) */
/* Low-level done function - can be used by low-level driver to point
* to completion function. Not used by mid/upper level code. */
void (*scsi_done) (struct scsi_cmnd *); // called back when the low-level driver completes the command
/*
* The following fields can be written to by the host specific code.
* Everything else should be left alone.
*/
struct scsi_pointer SCp; /* Scratchpad used by some host adapters */
unsigned char *host_scribble; /* The host adapter is allowed to
* call scsi_malloc and get some memory
* and hang it here. The host adapter
* is also expected to call scsi_free
* to release this memory. (The memory
* obtained by scsi_malloc is guaranteed
* to be at an address < 16Mb). */
int result; /* Status code from lower level driver */
int flags; /* Command flags */
unsigned char tag; /* SCSI-II queued command tag */
};
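The result field packs several status codes into one integer. As a sketch, assuming the layout used by kernels of this era (SCSI status in bits 0 to 7, message byte in bits 8 to 15, host byte in bits 16 to 23, driver byte in bits 24 to 31), it could be unpacked as shown below. The `MOCK_` constants and function names are hypothetical stand-ins; newer kernels have changed parts of this layout, so the sketch should be checked against the kernel version in use.

```c
#include <assert.h>
#include <stdint.h>

/* Values assumed from the SCSI specifications and kernel headers of that era. */
#define MOCK_SAM_STAT_GOOD            0x00
#define MOCK_SAM_STAT_CHECK_CONDITION 0x02
#define MOCK_DID_OK                   0x00
#define MOCK_DID_TIME_OUT             0x03

/* The four bytes packed into scsi_cmnd.result. */
struct mock_result {
	uint8_t status;	/* bits 0-7: SCSI status returned by the device */
	uint8_t msg;	/* bits 8-15: message byte */
	uint8_t host;	/* bits 16-23: code set by the low-level driver */
	uint8_t driver;	/* bits 24-31: code set by the mid layer */
};

/* Split a packed result into its four bytes. */
struct mock_result mock_decode_result(uint32_t result)
{
	struct mock_result r;

	r.status = (uint8_t)(result & 0xff);
	r.msg = (uint8_t)((result >> 8) & 0xff);
	r.host = (uint8_t)((result >> 16) & 0xff);
	r.driver = (uint8_t)((result >> 24) & 0xff);
	return r;
}

/* A command is good only if both the host byte and the device status say so. */
int mock_result_is_good(uint32_t result)
{
	struct mock_result r = mock_decode_result(result);

	return r.host == MOCK_DID_OK && r.status == MOCK_SAM_STAT_GOOD;
}
```

A low-level driver that hits a timeout would, under this layout, report it by placing the timeout code in the host byte, for example `MOCK_DID_TIME_OUT << 16`, before calling the completion callback.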
Driver: scsi_driver
struct scsi_driver {
struct device_driver gendrv; // "inherits" from device_driver
void (*rescan)(struct device *); // callback invoked when the device is rescanned
int (*init_command)(struct scsi_cmnd *);
void (*uninit_command)(struct scsi_cmnd *);
int (*done)(struct scsi_cmnd *); // called when the low-level driver completes a SCSI command; computes the number of bytes completed
int (*eh_action)(struct scsi_cmnd *, int); // error-handling callback
};
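The "inheritance" noted on gendrv is the usual kernel idiom of embedding the base structure and recovering the outer one from a pointer to the embedded member. Below is a minimal user-space sketch of that idiom. The `mock_` types are hypothetical reductions, and `container_of` is defined locally with `offsetof` rather than taken from kernel headers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Recover a pointer to the enclosing structure from a pointer to a member. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-in for struct device_driver. */
struct mock_device_driver {
	const char *name;
};

/* Hypothetical stand-in for struct scsi_driver. */
struct mock_scsi_driver {
	struct mock_device_driver gendrv;	/* embedded "base class" */
	int (*done)(int good_bytes);
};

/* Convert from the generic driver back to the SCSI-specific driver. */
struct mock_scsi_driver *to_mock_scsi_driver(struct mock_device_driver *drv)
{
	assert(drv != NULL);
	return container_of(drv, struct mock_scsi_driver, gendrv);
}
```

The driver core only ever sees the embedded `gendrv`; a conversion like `to_mock_scsi_driver` is what lets SCSI code get back to the SCSI-specific callbacks from that generic pointer.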
Device model
- scsi_bus_type: the bus type of the SCSI subsystem
struct bus_type scsi_bus_type = {
.name = "scsi", // corresponds to /sys/bus/scsi
.match = scsi_bus_match,
.uevent = scsi_bus_uevent,
#ifdef CONFIG_PM
.pm = &scsi_bus_pm_ops,
#endif
};
EXPORT_SYMBOL_GPL(scsi_bus_type);
- shost_class: the host class of the SCSI subsystem
static struct class shost_class = {
.name = "scsi_host", // corresponds to /sys/class/scsi_host
.dev_release = scsi_host_cls_release,
};
Initialization
The SCSI subsystem is loaded when the operating system boots. Its entry point is init_scsi, which is registered with subsys_initcall:
static int __init init_scsi(void)
{
int error;
error = scsi_init_queue(); // set up the memory pools needed for scatter-gather lists
if (error)
return error;
error = scsi_init_procfs(); // create the SCSI-related entries in procfs
if (error)
goto cleanup_queue;
error = scsi_init_devinfo(); // set up the dynamic SCSI device info list
if (error)
goto cleanup_procfs;
error = scsi_init_hosts(); // register the shost_class class, which creates the scsi_host directory under /sys/class/
if (error)
goto cleanup_devlist;
error = scsi_init_sysctl(); // register the SCSI sysctl table
if (error)
goto cleanup_hosts;
error = scsi_sysfs_register(); // register the scsi_bus_type bus type and the sdev_class class
if (error)
goto cleanup_sysctl;
scsi_netlink_init(); // initialize the SCSI transport netlink interface
printk(KERN_NOTICE "SCSI subsystem initialized\n");
return 0;
cleanup_sysctl:
scsi_exit_sysctl();
cleanup_hosts:
scsi_exit_hosts();
cleanup_devlist:
scsi_exit_devinfo();
cleanup_procfs:
scsi_exit_procfs();
cleanup_queue:
scsi_exit_queue();
printk(KERN_ERR "SCSI subsystem failed to initialize, error = %d\n",
-error);
return error;
}
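init_scsi follows the common kernel pattern of unwinding with goto labels: each step that fails jumps to a label that undoes only the steps that had already succeeded, in reverse order. The user-space sketch below reproduces the pattern with three made-up steps and records the call order in a string so the unwinding can be observed. All names are hypothetical, and the failure injection through `fail_at` exists only for the demonstration.

```c
#include <assert.h>
#include <string.h>

/* Records the order of setup ("A", "B", "C") and teardown ("a", "b") calls. */
static char mock_trace_buf[16];

/* Run one step: append its tag to the trace and optionally fail. */
static int mock_step(const char *tag, int fail)
{
	strcat(mock_trace_buf, tag);
	return fail ? -1 : 0;
}

/* Return the recorded call order of the last mock_init() run. */
const char *mock_trace(void)
{
	return mock_trace_buf;
}

/*
 * Three-step initialization with goto-based unwinding.
 * fail_at: 0 = every step succeeds, 1..3 = that step fails.
 */
int mock_init(int fail_at)
{
	int error;

	assert(fail_at >= 0 && fail_at <= 3);
	mock_trace_buf[0] = '\0';

	error = mock_step("A", fail_at == 1);
	if (error)
		return error;		/* nothing to undo yet */
	error = mock_step("B", fail_at == 2);
	if (error)
		goto cleanup_a;
	error = mock_step("C", fail_at == 3);
	if (error)
		goto cleanup_b;
	return 0;

cleanup_b:
	strcat(mock_trace_buf, "b");	/* undo step B */
cleanup_a:
	strcat(mock_trace_buf, "a");	/* undo step A */
	return error;
}
```

Because the labels fall through into each other, a failure late in the sequence runs more cleanup than an early one, which is the same shape as the cleanup_sysctl to cleanup_queue chain in init_scsi.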
scsi_init_hosts registers shost_class, the class that the SCSI subsystem's host adapters belong to:
int scsi_init_hosts(void)
{
return class_register(&shost_class);
}
scsi_sysfs_register registers the SCSI subsystem's bus type scsi_bus_type and sdev_class, the class that SCSI devices belong to:
int scsi_sysfs_register(void)
{
int error;
error = bus_register(&scsi_bus_type);
if (!error) {
error = class_register(&sdev_class);
if (error)
bus_unregister(&scsi_bus_type);
}
return error;
}
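SCSI low-level drivers are host-adapter oriented: when a low-level driver is loaded, it has to add its host adapter. A host adapter can be added in two ways: 1. when the PCI subsystem scans the bus and binds the driver; 2. manually. All host adapters with a hardware PCI interface use the first way. Adding a host adapter takes two steps:
1. Allocate the host adapter data structure sc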