heat CLI and ceilometer CLI

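In OpenStack, when we use heat to launch a stack
and auto-scale it based on some measured values,
we often run into questions such as:

What value is ceilometer actually reading right now?
Has it reached the threshold for scaling out or scaling down?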

These questions can all be answered with the heat and ceilometer command line interfaces (CLI):
http://docs.openstack.org/cli-reference/heat.html
http://docs.openstack.org/cli-reference/ceilometer.html

Below, we introduce some commonly used CLI commands
and explain what each of them does.



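Query the state of the alarms. An alarm is in one of three states: {alarm, ok, insufficient data}.
alarm means the alarm has been triggered, ok means the alarm condition is not met, and insufficient data means there is not enough data to evaluate it.
(For layout reasons I have altered part of the displayed output; please run the commands yourself to confirm the real output.)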

# ceilometer alarm-list
+------------+--------------------------+-------+--------+-------+----------+---------+---+
| Alarm ID   | Name                     | State |Severity|Enabled|Continuous|condition|TC |
+------------+--------------------------+-------+--------+-------+----------+---------+---+
| 6cd64cce-* | check-cpu_alarm_low-***  | alarm | low    | True  | True     | ***     |No |
| 7d07f884-* | test-cpu_alarm_low-***   | alarm | low    | True  | True     | ***     |No |
| 9e0339ca-* | check-cpu_alarm_high-*** | ok    | low    | True  | True     | ***     |No |
| ae8b756e-* | test-cpu_alarm_high-***  | ok    | low    | True  | True     | ***     |No |
+------------+--------------------------+-------+--------+-------+----------+---------+---+
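When there are many alarms, it can help to list only those in one state. The following is a minimal sketch, not part of the ceilometer CLI: the function name alarms_in_state is made up for illustration, and it assumes the State column is the third column, as in the table above.

```shell
# Hypothetical helper: print "ID NAME" for every alarm whose State
# column equals $1. Reads `ceilometer alarm-list` table output on stdin.
# Assumes the column order shown above: Alarm ID, Name, State, ...
alarms_in_state() {
    awk -F'|' -v want="$1" '
        NF > 4 {                      # skip the +----+ border lines
            id = $2; name = $3; state = $4
            gsub(/^[ \t]+|[ \t]+$/, "", id)
            gsub(/^[ \t]+|[ \t]+$/, "", name)
            gsub(/^[ \t]+|[ \t]+$/, "", state)
            if (state == want) print id, name
        }'
}
```

It would be used as: ceilometer alarm-list | alarms_in_state alarm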


With the alarm id (or name) found above, we can use alarm-show to inspect the details of an alarm
and check whether the configured threshold is reasonable:

# ceilometer alarm-show 6cd64cce-3b52-439a-809b-f83d534e5192
+---------------------------+-------------------------------------------------------------+
| Property                  |Value                                                        |
+---------------------------+-------------------------------------------------------------+
| alarm_actions             | [u'http://192.168.0.70:8000/v1/signal/*****']               |
| alarm_id                  | 6cd64cce-3b52-439a-809b-f83d534e5192                        |
| comparison_operator       | lt                                                          |
| description               | Scale-down if the average CPU < 15% for 10 minutes          |
| enabled                   | True                                                        |
| evaluation_periods        | 1                                                           |
| exclude_outliers          | False                                                       |
| insufficient_data_actions | None                                                        |
| meter_name                | network.incoming.bytes                                      |
| name                      | check-cpu_alarm_low-fbe6asq7qwrx                            |
| ok_actions                | None                                                        |
| period                    | 60                                                          |
| project_id                | 898c4669a3734faab1813ec0b27c2a24                            |
| query                     | metadata.user_metadata.stack == 9be4b574-***                |
| repeat_actions            | True                                                        |
| severity                  | low                                                         |
| state                     | alarm                                                       |
| statistic                 | avg                                                         |
| threshold                 | 65000.0                                                     |
| type                      | threshold                                                   |
| user_id                   | bf76fd40a1a8450cb5328f21c5bd5f94                            |
+---------------------------+-------------------------------------------------------------+
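The fields statistic, comparison_operator and threshold together describe the condition: the alarm fires when the statistic over one period satisfies the comparison against the threshold. As a rough sketch of that check (the function name threshold_crossed is hypothetical, and this only mimics the comparison, not ceilometer's full evaluation over evaluation_periods):

```shell
# Hypothetical helper: return 0 if "VALUE OP THRESHOLD" holds, 1 if not,
# and 2 for an unknown operator.
# OP is one of the comparison operators used by threshold alarms:
# lt, le, eq, ne, ge, gt.
threshold_crossed() {
    awk -v v="$1" -v op="$2" -v t="$3" 'BEGIN {
        v += 0; t += 0               # force numeric comparison
        if      (op == "lt") r = (v <  t)
        else if (op == "le") r = (v <= t)
        else if (op == "eq") r = (v == t)
        else if (op == "ne") r = (v != t)
        else if (op == "ge") r = (v >= t)
        else if (op == "gt") r = (v >  t)
        else exit 2
        exit !r
    }'
}
```

For the alarm shown above (lt, threshold 65000.0), an average of 12000 would satisfy the condition, so threshold_crossed 12000 lt 65000.0 succeeds, while 70000 would not.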


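With sample-list we can see the meter values measured on the different VMs.
Because the database holds a large number of samples, we can use the timestamp as a filter condition: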

# ceilometer sample-list -m network.incoming.bytes.rate -q 'timestamp>2016-08-10T04:10:00'
+--------------------+-----------------------------+-------+------+------+-----------+
| Resource ID        | Name                        | Type  |Volume| Unit | Timestamp |
+--------------------+-----------------------------+-------+------+------+-----------+
| instance-00000041* | network.incoming.bytes.rate | gauge | 0.0  | B/s  | *04:11:23 |
| instance-0000003a* | network.incoming.bytes.rate | gauge | 0.0  | B/s  | *04:11:23 |
| instance-0000002b* | network.incoming.bytes.rate | gauge | 0.0  | B/s  | *04:11:23 |
| instance-0000002a* | network.incoming.bytes.rate | gauge | 0.0  | B/s  | *04:11:23 |
| instance-00000029* | network.incoming.bytes.rate | gauge | 16.0 | B/s  | *04:11:23 |
| instance-00000041* | network.incoming.bytes.rate | gauge | 0.0  | B/s  | *04:10:23 |
| instance-0000003a* | network.incoming.bytes.rate | gauge | 0.0  | B/s  | *04:10:23 |
| instance-0000002b* | network.incoming.bytes.rate | gauge | 0.0  | B/s  | *04:10:23 |
| instance-0000002a* | network.incoming.bytes.rate | gauge | 0.0  | B/s  | *04:10:23 |
| instance-00000029* | network.incoming.bytes.rate | gauge | 17.11| B/s  | *04:10:23 |
+--------------------+-----------------------------+-------+------+------+-----------+
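Since the alarm compares an average, it is often easier to look at the mean per VM than at the raw samples. Here is a small sketch that averages the Volume column per Resource ID from the table output. The function name avg_volume_per_resource is made up, and the column positions are assumed to match the table above.

```shell
# Hypothetical helper: average the Volume column per Resource ID.
# Reads `ceilometer sample-list` table output on stdin.
# Assumes the column order shown above:
# Resource ID, Name, Type, Volume, Unit, Timestamp.
avg_volume_per_resource() {
    awk -F'|' '
        NF > 5 {                      # skip the +----+ border lines
            id = $2; vol = $5
            gsub(/^[ \t]+|[ \t]+$/, "", id)
            gsub(/^[ \t]+|[ \t]+$/, "", vol)
            # the header row ("Volume") is not numeric and is skipped
            if (vol ~ /^[0-9.]+$/) { sum[id] += vol; n[id]++ }
        }
        END { for (id in sum) printf "%s %.2f\n", id, sum[id] / n[id] }
    ' | sort
}
```

It would be used by piping the sample-list command above into avg_volume_per_resource. Note that this averages whatever rows were returned, so the timestamp filter decides the time window.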

heat stack-list shows the names and ids of the current stacks:

# heat stack-list
+-----------------+------------+-----------------+----------------------+
| id              | stack_name | stack_status    | creation_time        |
+-----------------+------------+-----------------+----------------------+
| 9be4b574-a0fe-* | check      | CREATE_COMPLETE | 2016-08-10T03:01:36Z |
+-----------------+------------+-----------------+----------------------+


With the stack id (or name),
we can use heat event-list to look up the change history of the stack and its alarms:

# heat event-list check
+-------------------+------------+----------------------+--------------------+------------+
| resource_name     | id         |resource_status_reason| resource_status    | event_time |
+-------------------+------------+----------------------+--------------------+------------+
| check             | bc3b927d-* | Stack CREATE started | CREATE_IN_PROGRESS | *03:01:36Z |
| asg               | 7b4ae407-* | state changed        | CREATE_IN_PROGRESS | *03:01:36Z |
| asg               | 5e4679ff-* | state changed        | CREATE_COMPLETE    | *03:01:51Z |
| scale_down_policy | 00b59076-* | state changed        | CREATE_IN_PROGRESS | *03:01:51Z |
| scale_up_policy   | c2b9b954-* | state changed        | CREATE_IN_PROGRESS | *03:01:51Z |
| scale_down_policy | e973af5d-* | state changed        | CREATE_COMPLETE    | *03:01:52Z |
| scale_up_policy   | 76c9238f-* | state changed        | CREATE_COMPLETE    | *03:01:52Z |
| cpu_alarm_low     | db44fbd0-* | state changed        | CREATE_IN_PROGRESS | *03:01:52Z |
| cpu_alarm_high    | e470b998-* | state changed        | CREATE_IN_PROGRESS | *03:01:52Z |
| cpu_alarm_high    | 8a9a8947-* | state changed        | CREATE_COMPLETE    | *03:01:53Z |
| cpu_alarm_low     | e4880d68-* | state changed        | CREATE_COMPLETE    | *03:01:53Z |
| check             | 7853bc5d-* | Stack CREATE ***     | CREATE_COMPLETE    | *03:01:53Z |
| scale_down_policy | 9c599fb5-* | alarm state changed *| SIGNAL_COMPLETE    | *03:02:53Z |
| scale_down_policy | 80ca69d0-* | alarm state changed *| SIGNAL_COMPLETE    | *03:04:54Z |
+-------------------+------------+----------------------+--------------------+------------+
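The SIGNAL_COMPLETE rows are the ones that show an alarm actually triggering a scaling policy. To see at a glance how often each policy was signalled, a sketch like the following can count them. The function name count_signals is hypothetical, and it assumes resource_status is the fourth column, as in the table above.

```shell
# Hypothetical helper: count SIGNAL_COMPLETE events per resource.
# Reads `heat event-list` table output on stdin.
# Assumes the column order shown above:
# resource_name, id, resource_status_reason, resource_status, event_time.
count_signals() {
    awk -F'|' '
        NF > 5 {                      # skip the +----+ border lines
            name = $2; status = $5
            gsub(/^[ \t]+|[ \t]+$/, "", name)
            gsub(/^[ \t]+|[ \t]+$/, "", status)
            if (status == "SIGNAL_COMPLETE") count[name]++
        }
        END { for (name in count) print name, count[name] }
    ' | sort
}
```

It would be used as: heat event-list check | count_signals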

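Finally, although the following link is a bug report
(and it turned out to be an operator error rather than an OpenStack bug),
it is worth a look for how the problem was traced and how enough information was provided for others to debug it.
It also explains a lot about how heat and ceilometer work together, and is well worth reading.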

https://bugs.launchpad.net/ceilometer/+bug/1457923
