Codeforces 605E. Intergalaxy Trips Solution

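To check the runtime state of Docker containers, you can use Docker's stats command to get statistics for the containers running on the current host, including CPU utilization, memory usage, total network I/O, and total disk I/O.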
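
As a small illustration of the stats command, the sketch below prints a single snapshot instead of the continuously refreshing view. The function name `container_stats_once` is hypothetical; `--no-stream` and the `--format` placeholders are standard `docker stats` options.

```shell
# Print a one-shot table of resource usage for running containers.
# With no arguments it covers all running containers; pass names or IDs to narrow it down.
container_stats_once() {
  docker stats --no-stream \
    --format 'table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}\t{{.NetIO}}\t{{.BlockIO}}' \
    "$@"
}
```

Calling `container_stats_once` alone lists every running container; `container_stats_once cadvisor` would limit the output to a container with that name.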

Besides the command, you can also query detailed monitoring statistics for a container through the HTTP API that Docker provides.
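
A rough sketch of such a query: the Docker Engine API serves `GET /containers/{id}/stats` on its Unix socket, and with `stream=false` it returns a single JSON sample. The function name `container_stats_api` is hypothetical, and the socket path assumes the default `/var/run/docker.sock`.

```shell
# Fetch one stats sample (JSON) for a container from the Docker Engine API.
# $1 is a container name or ID.
container_stats_api() {
  curl --silent --unix-socket /var/run/docker.sock \
    "http://localhost/containers/$1/stats?stream=false"
}
```

For example, `container_stats_api cadvisor` would print the JSON document for a container named cadvisor, provided the calling user can read the socket.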

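cAdvisor is an open-source tool from Google for displaying and analyzing the runtime state of containers. Running cAdvisor on a host makes it easy to collect runtime statistics for the containers on that host and to present them as charts.
Running cAdvisor locally is also simple. Just run the following command: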

docker run \
  --volume=/:/rootfs:ro \
  --volume=/var/run:/var/run:rw \
  --volume=/sys:/sys:ro \
  --volume=/var/lib/docker/:/var/lib/docker:ro \
  --publish=8080:8080 \
  --detach=true \
  --name=cadvisor \
  google/cadvisor:latest

However, port 8080 on the host was already in use, so the command above was changed to the following:

docker run \
  --volume=/:/rootfs:ro \
  --volume=/var/run:/var/run:rw \
  --volume=/sys:/sys:ro \
  --volume=/var/lib/docker/:/var/lib/docker:ro \
  --publish=9095:9095 \
  --detach=true \
  --name=cadvisor \
  google/cadvisor:latest

After startup, however, the container shows two ports: one is 8080 and the other is 9095.
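
A likely reason both ports appear: `--publish=9095:9095` forwards host port 9095 to container port 9095, while cAdvisor inside the container still listens on its default port 8080 (see `-port` in the help output further down), which the image presumably also declares as exposed. A minimal sketch of an alternative mapping that publishes host port 9095 onto container port 8080 follows. The function name `run_cadvisor_on_port` is hypothetical, and the remaining flags are copied from the command above.

```shell
# Start cAdvisor with a chosen host port mapped onto the container's port 8080.
# Assumption: cAdvisor inside the container listens on its default port 8080.
run_cadvisor_on_port() {
  host_port="${1:?usage: run_cadvisor_on_port HOST_PORT}"
  docker run \
    --volume=/:/rootfs:ro \
    --volume=/var/run:/var/run:rw \
    --volume=/sys:/sys:ro \
    --volume=/var/lib/docker/:/var/lib/docker:ro \
    --publish="${host_port}:8080" \
    --detach=true \
    --name=cadvisor \
    google/cadvisor:latest
}
```

With this mapping, `run_cadvisor_on_port 9095` should make the UI reachable on host port 9095. An existing container named cadvisor has to be removed first (for example with `docker rm -f cadvisor`), because the name is reused.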

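Logging in to the Docker container with the steps below and listing the command's options shows a -port parameter, and the official documentation also describes it explicitly: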

In practice, however, this parameter could not be used.

So the Docker-based deployment was abandoned in favor of running the binary directly.

Entering the container to view the command options

# docker exec -it cadvisor /bin/sh
/ # cd /usr/bin/
/usr/bin # ./cadvisor --help
Usage of ./cadvisor:
  -allow_dynamic_housekeeping
        Whether to allow the housekeeping interval to be dynamic (default true)
  -alsologtostderr
        log to standard error as well as files
  -application_metrics_count_limit int
        Max number of application metrics to store (per container) (default 100)
  -boot_id_file string
        Comma-separated list of files to check for boot-id. Use the first one that exists. (default "/proc/sys/kernel/random/boot_id")
  -bq_account string
        Service account email
  -bq_credentials_file string
        Credential Key file (pem)
  -bq_id string
        Client ID
  -bq_project_id string
        Bigquery project ID
  -bq_secret string
        Client Secret (default "notasecret")
  -collector_cert string
        Collector's certificate, exposed to endpoints for certificate based authentication.
  -collector_key string
        Key for the collector's certificate
  -container_hints string
        location of the container hints file (default "/etc/cadvisor/container_hints.json")
  -containerd string
        containerd endpoint (default "unix:///var/run/containerd.sock")
  -disable_metrics metrics
        comma-separated list of metrics to be disabled. Options are 'disk', 'network', 'tcp', 'udp', 'percpu', 'sched', 'process'. Note: tcp and udp are disabled by default due to high CPU usage. (default process,tcp,udp,sched)
  -docker string
        docker endpoint (default "unix:///var/run/docker.sock")
  -docker-tls
        use TLS to connect to docker
  -docker-tls-ca string
        path to trusted CA (default "ca.pem")
  -docker-tls-cert string
        path to client certificate (default "cert.pem")
  -docker-tls-key string
        path to private key (default "key.pem")
  -docker_env_metadata_whitelist string
        a comma-separated list of environment variable keys that needs to be collected for docker containers
  -docker_only
        Only report docker containers in addition to root stats
  -docker_root string
        DEPRECATED: docker root is read from docker info (this is a fallback, default: /var/lib/docker) (default "/var/lib/docker")
  -enable_load_reader
        Whether to enable cpu load reader
  -event_storage_age_limit string
        Max length of time for which to store events (per type). Value is a comma separated list of key values, where the keys are event types (e.g.: creation, oom) or "default" and the value is a duration. Default is applied to all non-specified event types (default "default=24h")
  -event_storage_event_limit string
        Max number of events to store (per type). Value is a comma separated list of key values, where the keys are event types (e.g.: creation, oom) or "default" and the value is an integer. Default is applied to all non-specified event types (default "default=100000")
  -global_housekeeping_interval duration
        Interval between global housekeepings (default 1m0s)
  -housekeeping_interval duration
        Interval between container housekeepings (default 1s)
  -http_auth_file string
        HTTP auth file for the web UI
  -http_auth_realm string
        HTTP auth realm for the web UI (default "localhost")
  -http_digest_file string
        HTTP digest file for the web UI
  -http_digest_realm string
        HTTP digest file for the web UI (default "localhost")
  -listen_ip string
        IP to listen on, defaults to all IPs
  -log_backtrace_at value
        when logging hits line file:N, emit a stack trace
  -log_cadvisor_usage
        Whether to log the usage of the cAdvisor container
  -log_dir string
        If non-empty, write log files in this directory
  -log_file string
        If non-empty, use this log file
  -logtostderr
        log to standard error instead of files
  -machine_id_file string
        Comma-separated list of files to check for machine-id. Use the first one that exists. (default "/etc/machine-id,/var/lib/dbus/machine-id")
  -max_housekeeping_interval duration
        Largest interval to allow between container housekeepings (default 1m0s)
  -max_procs int
        max number of CPUs that can be used simultaneously. Less than 1 for default (number of cores).
  -mesos_agent string
        Mesos agent address (default "127.0.0.1:5051")
  -mesos_agent_timeout duration
        Mesos agent timeout (default 10s)
  -port int
        port to listen (default 8080)
  -profiling
        Enable profiling via web interface host:port/debug/pprof/
  -prometheus_endpoint string
        Endpoint to expose Prometheus metrics on (default "/metrics")
  -skip_headers
        If true, avoid header prefixes in the log messages
  -stderrthreshold value
        logs at or above this threshold go to stderr (default 2)
  -storage_driver driver
        Storage driver to use. Data is always cached shortly in memory, this controls where data is pushed besides the local cache. Empty means none. Options are: <empty>, bigquery, elasticsearch, influxdb, kafka, redis, statsd, stdout
  -storage_driver_buffer_duration duration
        Writes in the storage driver will be buffered for this duration, and committed to the non memory backends as a single transaction (default 1m0s)
  -storage_driver_db string
        database name (default "cadvisor")
  -storage_driver_es_enable_sniffer
        ElasticSearch uses a sniffing process to find all nodes of your cluster by default, automatically
  -storage_driver_es_host string
        ElasticSearch host:port (default "http://localhost:9200")
  -storage_driver_es_index string
        ElasticSearch index name (default "cadvisor")
  -storage_driver_es_type string
        ElasticSearch type name (default "stats")
  -storage_driver_host string
        database host:port (default "localhost:8086")
  -storage_driver_influxdb_retention_policy string
        retention policy
  -storage_driver_kafka_broker_list string
        kafka broker(s) csv (default "localhost:9092")
  -storage_driver_kafka_ssl_ca string
        optional certificate authority file for TLS client authentication
  -storage_driver_kafka_ssl_cert string
        optional certificate file for TLS client authentication
  -storage_driver_kafka_ssl_key string
        optional key file for TLS client authentication
  -storage_driver_kafka_ssl_verify
        verify ssl certificate chain (default true)
  -storage_driver_kafka_topic string
        kafka topic (default "stats")
  -storage_driver_password string
        database password (default "root")
  -storage_driver_secure
        use secure connection with database
  -storage_driver_table string
        table name (default "stats")
  -storage_driver_user string
        database username (default "root")
  -storage_duration duration
        How long to keep data stored (Default: 2min). (default 2m0s)
  -store_container_labels
        convert container labels and environment variables into labels on prometheus metrics for each container. If flag set to false, then only metrics exported are container name, first alias, and image name (default true)
  -v value
        log level for V logs
  -version
        print cAdvisor version and exit
  -vmodule value
        comma-separated list of pattern=N settings for file-filtered logging

Deploying with the binary

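mkdir -p /home/cadvisor-0.37.0 && cd /home/cadvisor-0.37.0
wget https://github.com/google/cadvisor/releases/download/v0.37.0/cadvisor
chmod +x cadvisor   # a file fetched with wget is not executable by default
# Plain local run: ./cadvisor -port=8080 &>>/var/log/cadvisor.log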

Managing the program as a systemd service

# chown -R prometheus:prometheus /home/cadvisor-0.37.0
# chmod -R 777 /home/cadvisor-0.37.0   # avoids this startup error (attributed here to SELinux): Failed at step EXEC spawning /home/cadvisor-0.37.0/cadvisor: Permission denied

# vim /usr/lib/systemd/system/cadvisor.service
[Unit]
Description=cadvisor
Documentation=https://github.com/google/cadvisor/tree/master/docs
After=network.target

[Service]
Type=simple
User=prometheus
ExecStart=/home/cadvisor-0.37.0/cadvisor -port 9096
Restart=on-failure

[Install]
WantedBy=multi-user.target
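
After the unit file is saved, systemd has to reload its configuration before the service can be started. A minimal sketch, assuming the unit is named `cadvisor` as in the file above and that the commands run as root; the wrapper name `enable_cadvisor_service` is hypothetical.

```shell
# Reload unit files, enable the service at boot, start it, and show its status.
# Each step runs only if the previous one succeeded.
enable_cadvisor_service() {
  systemctl daemon-reload &&
  systemctl enable cadvisor &&
  systemctl start cadvisor &&
  systemctl status --no-pager cadvisor
}
```

If the start fails, `journalctl -u cadvisor` shows the messages logged by the unit.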

Visit http://localhost:9096 to view the runtime state of the containers on the current host.

The table below lists some typical monitoring metrics exposed by cAdvisor:

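Metric                                  Type     Meaning
container_cpu_load_average_10s          gauge    Average CPU load of the container over the last 10 seconds
container_cpu_usage_seconds_total       counter  Cumulative CPU time consumed by the container on each CPU core (seconds)
container_cpu_system_seconds_total      counter  Cumulative system CPU time consumed (seconds)
container_cpu_user_seconds_total        counter  Cumulative user CPU time consumed (seconds)
container_fs_usage_bytes                gauge    File system space used by the container (bytes)
container_fs_limit_bytes                gauge    Total file system space available to the container (bytes)
container_fs_reads_bytes_total          counter  Cumulative amount of data read by the container (bytes)
container_fs_writes_bytes_total         counter  Cumulative amount of data written by the container (bytes)
container_memory_max_usage_bytes        gauge    Maximum memory usage of the container (bytes)
container_memory_usage_bytes            gauge    Current memory usage of the container (bytes)
container_spec_memory_limit_bytes       gauge    Memory usage limit of the container (bytes)
machine_memory_bytes                    gauge    Total memory of the current host (bytes)
container_network_receive_bytes_total   counter  Cumulative amount of data received over the container's network (bytes)
container_network_transmit_bytes_total  counter  Cumulative amount of data transmitted over the container's network (bytes)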
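
To look at the raw samples behind these metrics, one option is to filter cAdvisor's Prometheus endpoint, which is `/metrics` by default according to the `-prometheus_endpoint` flag. The helper name `metric_samples` is hypothetical; it only filters text read from standard input.

```shell
# Read Prometheus text-format metrics on stdin and print the samples of one metric.
# Matching "name{" or "name " skips the "# HELP" / "# TYPE" comment lines and
# other metrics that merely share a prefix.
metric_samples() {
  grep -E "^$1(\{| )"
}
```

With the service from the previous section listening on port 9096, a call could look like `curl -s http://localhost:9096/metrics | metric_samples container_memory_usage_bytes`.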

Integrating with Prometheus

Edit /etc/prometheus/prometheus.yml and add cAdvisor to the scrape job targets:

- job_name: cadvisor
  static_configs:
  - targets:
    - localhost:9096

Restart the Prometheus service, then view the result.
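
One way to confirm that the new job is being scraped is to query the built-in `up` metric through the Prometheus HTTP API. This is a sketch under assumptions: Prometheus is taken to listen on `http://localhost:9090`, which the text above does not state, and the function name `cadvisor_target_up` is hypothetical.

```shell
# Ask Prometheus for the scrape health of the cadvisor job.
# $1 optionally overrides the assumed Prometheus base URL.
cadvisor_target_up() {
  prom_url="${1:-http://localhost:9090}"   # assumed Prometheus address
  curl --silent --get \
    --data-urlencode 'query=up{job="cadvisor"}' \
    "${prom_url}/api/v1/query"
}
```

A sample value of 1 in the returned JSON means the last scrape of `localhost:9096` succeeded, and 0 means it failed.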