A summary of virtual machine CPU pinning in OpenStack
The flavor extra specs will be enhanced to support two new parameters:
- hw:cpu_policy=shared|dedicated
- hw:cpu_threads_policy=avoid|separate|isolate|prefer
If the policy is set to ‘shared’ no change will be made compared to the current default guest CPU placement policy. The guest vCPUs will be allowed to freely float across host pCPUs, albeit potentially constrained by NUMA policy.
The threads policy will control how the scheduler / virt driver place guests with respect to CPU threads. It will only apply if the CPU policy is ‘dedicated’; the possible values are listed below, followed by a short usage sketch.
· avoid: the scheduler will not place the guest on a host which has hyperthreads.
· separate: if the host has threads, each vCPU will be placed on a different core, i.e. no two vCPUs will be placed on thread siblings.
· isolate: if the host has threads, each vCPU will be placed on a different core and no vCPUs from other guests will be able to be placed on the same core, i.e. one thread sibling is guaranteed to always be unused.
· prefer: if the host has threads, vCPUs will be placed on the same core, so they are thread siblings.
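For illustration, assuming a flavor named pinned.medium exists (one is created later in this walkthrough), dedicated CPUs with the ‘isolate’ threads policy could be requested like this:
$ nova flavor-key pinned.medium set hw:cpu_policy=dedicated
$ nova flavor-key pinned.medium set hw:cpu_threads_policy=isolate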
The image metadata properties will also allow specification of the threads policy:
- hw_cpu_threads_policy=avoid|separate|isolate|prefer
This will only be honoured if the flavor does not already have a threads policy set. This ensures the cloud administrator can have absolute control over threads policy if desired.
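As a rough sketch, the same property could be set on an image with the nova client (the image name ‘fedora’ is only an example):
$ nova image-meta fedora set hw_cpu_threads_policy=isolate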
The scheduler will have to be enhanced so that it considers the usage of CPUs by existing guests. Use of a dedicated CPU policy will have to be accompanied by the setup of aggregates to split the hosts into two groups, one allowing overcommit of shared pCPUs and the other only allowing dedicated CPU guests, i.e. we do not want a situation with dedicated CPU and shared CPU guests on the same host. It is likely that the administrator will already need to set up host aggregates for the purpose of using huge pages for guest RAM. The same grouping will be usable for both dedicated RAM (via huge pages) and dedicated CPUs (via pinning).
The compute host already has a notion of CPU sockets which are reserved for execution of base operating system services (vcpu_pin_set). This facility will be preserved unchanged, i.e. dedicated CPU guests will only be placed on CPUs which are not marked as reserved for the base OS.
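For reference, this reservation is expressed with vcpu_pin_set in /etc/nova/nova.conf on each compute node; with the illustrative setting below (a host with 16 pCPUs is assumed), pCPUs 0-1 and 8-9 stay reserved for the base OS and only 2-7 and 10-15 are handed to guests:
[DEFAULT]
vcpu_pin_set=2-7,10-15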
Note that exact vCPU to pCPU pinning is not exposed to the user, as this would require them to have direct knowledge of the host pCPU layout. Instead they request that the instance receive "dedicated" CPU resourcing, and Nova handles the allocation of pCPUs.
In other words, this so-called pinning does not mean the user explicitly binds a vCPU to a particular physical CPU; OpenStack does not expose the host pCPU layout to the user. The user merely selects the ‘dedicated’ CPU policy and a threads policy, and Nova, through its scheduling, decides which pCPU each vCPU is pinned to. The usual approach is to create two host aggregates, one called cpu_pinning and one called normal, add different compute hosts to each, and build instances that need pinning on the hosts in the cpu_pinning aggregate. Guests with pinning requirements and guests without are never placed on the same host.
Example usage:
* Create a host aggregate and set metadata on it to indicate it is to be used for pinning. 'pinned' is used for the example, but any key can be used; the same key must be used in later steps though:
$ nova aggregate-create cpu_pinning
$ nova aggregate-set-metadata cpu_pinning pinned=true
For aggregates/flavors that won't be dedicated, set pinned=false:
$ nova aggregate-create normal
$ nova aggregate-set-metadata normal pinned=false
Before creating the new flavor for performance-intensive instances, update all existing flavors so that their extra specifications match them to the compute hosts in the normal aggregate, i.e. set all existing flavors to avoid the pinned aggregate:
$ for FLAVOR in `nova flavor-list | cut -f 2 -d ' ' | grep -o "[0-9]*"`; \
do nova flavor-key ${FLAVOR} set \
"aggregate_instance_extra_specs:pinned"="false"; \
done
Create a new flavor for performance intensive instances. The differences in behaviour between the two will be the result of the metadata we add to the new flavor shortly.
$ nova flavor-create pinned.medium 6 2048 20 2
Set the hw:cpu_policy flavor extra specification to dedicated. This denotes that all instances created using this flavor will require dedicated compute resources and be pinned accordingly.
$ nova flavor-key 6 set hw:cpu_policy=dedicated
Set the aggregate_instance_extra_specs:pinned flavor extra specification to true. This denotes that all instances created using this flavor will be sent to hosts in host aggregates with pinned=true in their aggregate metadata:
$ nova flavor-key 6 set aggregate_instance_extra_specs:pinned=true
Finally, we must add some hosts to our performance host aggregate. Hosts that are not intended to be targets for pinned instances should be added to the normal host aggregate (see nova host-list to get the host name(s)):
$ nova aggregate-add-host cpu_pinning compute1.nova
$ nova aggregate-add-host normal compute2.nova
Scheduler Configuration
On each node where the OpenStack Compute Scheduler (openstack-nova-scheduler) runs, edit /etc/nova/nova.conf. Add the AggregateInstanceExtraSpecsFilter and NUMATopologyFilter values to the list of scheduler_default_filters. These filters are used to segregate the compute nodes that can be used for CPU pinning from those that can not, and to apply NUMA aware scheduling rules when launching instances:
scheduler_default_filters=RetryFilter,AvailabilityZoneFilter,RamFilter,ComputeFilter,ComputeCapabilitiesFilter,ImagePropertiesFilter,CoreFilter,NUMATopologyFilter,AggregateInstanceExtraSpecsFilter
Once the change has been applied, restart the openstack-nova-scheduler service:
# systemctl restart openstack-nova-scheduler
After the above, as a normal (non-admin) user, try to boot an instance with the newly created flavor:
$ nova boot --image fedora --flavor 6 test_pinning
Confirm the instance has successfully booted and that each of its vCPUs is pinned to a single host CPU by observing the <cputune> element of the generated domain XML:
# virsh list
Id Name State
----------------------------------------------------
2 instance-00000001 running
# virsh dumpxml instance-00000001
...
<vcpu placement='static' cpuset='2-3'>2</vcpu>
<cputune>
<vcpupin vcpu='0' cpuset='2'/>
<vcpupin vcpu='1' cpuset='3'/>
</cputune>
The resultant output will be quite long, but there are some key elements related to NUMA layout and vCPU pinning to focus on:
- As you might expect, the vCPU placement for the 2 vCPUs remains static, though a cpuset range is no longer specified alongside it; instead the more specific placement definitions defined later on are used:
<vcpu placement='static'>2</vcpu>
- The vcpupin and emulatorpin elements have been added. These pin the virtual machine instance’s vCPU cores and the associated emulator threads respectively to physical host CPU cores. In the current implementation the emulator threads are pinned to the union of all physical CPU cores associated with the guest (physical CPU cores 2-3).
<cputune>
<vcpupin vcpu='0' cpuset='2'/>
<vcpupin vcpu='1' cpuset='3'/>
<emulatorpin cpuset='2-3'/>
</cputune>
3. Combining Parts 1 and 2
This way we can both reserve cores for the physical host and let instances use pinned CPUs; the procedure is simply to combine the steps of Parts 1 and 2.
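A minimal sketch of the combined setup, assuming the flavor and aggregate steps above are already in place (the CPU ranges and the instance name are illustrative only). On each compute node in the cpu_pinning aggregate, reserve cores for the host in /etc/nova/nova.conf:
[DEFAULT]
vcpu_pin_set=2-7,10-15
Then restart the compute service and boot an instance with the dedicated flavor; its vCPUs will only be pinned to pCPUs listed in vcpu_pin_set:
# systemctl restart openstack-nova-compute
$ nova boot --image fedora --flavor 6 test_pinning_combined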