[ORAN] AI/ML workflow description and requirements ~2

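Unlike 3GPP, O-RAN does not define its standards in two stages (Study->Spec).

Also, perhaps because the organization is not as large as 3GPP,

the Spec text contains many passages that read as if they were still at the discussion stage.

For example, this AI/ML document lays out the AI/ML design principles that O-RAN has in mind.

Text like this is too vague to really belong in a Spec,

but on the other hand, it works something like a statement of legislative intent and gives us a view into how the O-RAN organization thinks.

Below are the AI/ML design principles, each given in the original wording, followed by my paraphrase and interpretation: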

Principle 1: In O-RAN we will always have some offline learning as a proposed best practice (even for reinforcement learning type of scenarios). In the current document, offline training means a model is first trained with offline data, and trained model is deployed in the network for inference. Online training refers to scenarios such as reinforcement learning, where the model ‘learns’ as it is executing in the network. However, even in the latter scenario, it is possible that some offline training may happen.

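[Principle 1] Every ML model in O-RAN should be pre-trained (offline learning). Even architectures such as reinforcement learning, which learn from real-time feedback, should first be trained on previously collected data.

[Note] This principle probably exists because, in the O-RAN architecture, the actions a model takes directly affect the communication performance of a real network, so deploying a model that has not yet converged cannot be tolerated.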

Principle 2: A model needs to be trained and tested before deploying in the network. A completely untrained model will not be deployed in the network.

[Principle 2] Same as Principle 1, with the added requirement that the model be tested before deployment.
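
Principles 1 and 2 together amount to a gate in front of deployment. Here is a minimal sketch of how I picture that gate. The `ModelRecord` class, the `can_deploy` function, and the `min_accuracy` threshold are all my own hypothetical names; the spec only says a model must be trained and tested first and does not define any pass criterion.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ModelRecord:
    """Hypothetical bookkeeping for one ML model (not an O-RAN data structure)."""
    name: str
    trained_offline: bool = False           # Principle 1: offline training has happened
    test_accuracy: Optional[float] = None   # Principle 2: None means "never tested"


def can_deploy(model: ModelRecord, min_accuracy: float) -> bool:
    """Allow deployment only for a model that was trained offline and passed testing.

    min_accuracy is an assumed, operator-chosen threshold.
    """
    if not model.trained_offline:
        return False                        # completely untrained model
    if model.test_accuracy is None:
        return False                        # trained but never tested
    return model.test_accuracy >= min_accuracy
```

A reinforcement learning agent would go through the same gate: it can keep learning online afterwards, but it only enters the network once `can_deploy` returns True.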

Principle 3: As a best practice, it would be useful if ML Applications are designed in a modular manner that are decoupled from one another. This includes their ability to share data without knowing each other’s data needs. It also implies that they need not understand the location or nature of a data source. For example, an ML Application that is consuming RAN data need not know whether that data is being provided directly via E2, or by some other ML Application on the same Inference Host that is consuming and re-publishing that RAN data.

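[Principle 3] ML applications should be designed modularly and be deployable independently. The input data an ML application needs should be abstracted so that it can come from different sources, for example the E2 interface or another ML application.

[Note] The requirement here is presumably meant to give O-RAN operators the flexibility to decide which ML applications to deploy, and to realize their intended use cases by combining modules. However, a common input interface also implies a common input data standard that different ML applications can be written against. That standard does not appear to have been defined yet...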
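
To make the decoupling in Principle 3 concrete, here is a small publish/subscribe sketch. Everything in it is an assumption of mine for illustration: the `DataBus` class, the topic names `ran.cell_load` and `ran.raw_cell_load`, and the message fields are hypothetical, and the spec does not define any of them. The point is only that the consumer subscribes to a topic and never learns whether the data came from an E2 adapter or from another ML application.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, List

Handler = Callable[[Any], None]


class DataBus:
    """Hypothetical in-process topic bus standing in for the Inference Host's data layer."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: Any) -> None:
        # Iterate over a copy so handlers may subscribe while we deliver.
        for handler in list(self._subscribers[topic]):
            handler(message)


bus = DataBus()
predictions: List[dict] = []


def load_predictor(sample: dict) -> None:
    """Consumer ML application: knows only the topic, not who publishes to it."""
    predictions.append({"cell": sample["cell"], "overloaded": sample["prb_usage"] > 0.8})


def smoother(sample: dict) -> None:
    """Another ML application that consumes raw data and re-publishes a processed version."""
    bus.publish("ran.cell_load", {"cell": sample["cell"], "prb_usage": sample["prb_usage"] * 0.5})


bus.subscribe("ran.cell_load", load_predictor)
bus.subscribe("ran.raw_cell_load", smoother)

# Source 1: what an adapter forwarding E2 data might publish (simulated here).
bus.publish("ran.cell_load", {"cell": "A", "prb_usage": 0.91})
# Source 2: raw data that reaches the predictor only via the smoother application.
bus.publish("ran.raw_cell_load", {"cell": "B", "prb_usage": 0.6})
```

Both samples end up in `predictions`, and `load_predictor` is identical in the two cases. This is also where the missing piece shows up: the two publishers have to agree on the message fields, which is exactly the common data standard mentioned in the note.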

Principle 4: Given that the criteria for determining the deployment scenario for a given ML Application may differ between service providers, as a best practice, it should be possible for a Service Provider to decide whether an ML Application should be deployed to a Non-RT RIC or a Near-RT RIC as its Inference Host.

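[Principle 4] ML applications should be designed so that they can be deployed in different environments, for example the Non-RT RIC or the Near-RT RIC, and the O-RAN service provider can decide where to deploy them according to its needs.

[Note] The same ML application may run on different O-RAN service providers' infrastructure with different hardware, so deployment has to accommodate different compute frameworks. This ties in with Principle 3: both need the support of a standardized data format.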

Principle 5: As a best practice, to improve the execution efficiency and inference performance in the inference host, the ML model for inference should be optimized and compiled with the consideration of the inference host hardware capability. A trade off between efficiency and inference accuracy need to be taken into consideration. Therefore, the optimization should take acceptable accuracy loss as one of the goals, and optimization parameters should be obtained based on this threshold.

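[Principle 5] ML applications must take differing compute resources into account and strike a balance between computational efficiency and the accuracy of results. Ideally, an ML application can be recompiled and re-optimized according to the constraints of the hardware. On that basis, an ML application can run in a lower-accuracy mode when hardware support is limited.

[Note] Many O-RAN use cases have real-time requirements. To meet both those requirements and the hardware constraints (for example, GPU support), an ML application may need to be re-optimized and its model complexity reduced.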
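
Principle 5 treats the acceptable accuracy loss as a threshold that drives the optimization. One simple way to read that, sketched below, is to pick the fastest optimized variant of a model whose accuracy loss stays within the threshold. The `Variant` class, the `pick_variant` function, the variant names, and all the numbers are made up for illustration; they are not measurements and not anything the spec defines.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Variant:
    """Hypothetical optimized build of one model for a given inference host."""
    name: str
    accuracy: float     # measured on a held-out test set
    latency_ms: float   # measured on the target hardware


def pick_variant(
    baseline_accuracy: float,
    variants: List[Variant],
    max_accuracy_loss: float,
) -> Optional[Variant]:
    """Return the lowest-latency variant whose accuracy loss is within the threshold.

    Returns None when no variant is acceptable.
    """
    acceptable = [
        v for v in variants
        if baseline_accuracy - v.accuracy <= max_accuracy_loss
    ]
    if not acceptable:
        return None
    return min(acceptable, key=lambda v: v.latency_ms)


# Illustrative numbers only.
candidates = [
    Variant("fp32", accuracy=0.95, latency_ms=12.0),
    Variant("fp16", accuracy=0.945, latency_ms=7.0),
    Variant("int8", accuracy=0.91, latency_ms=3.0),
]

strict = pick_variant(0.95, candidates, max_accuracy_loss=0.01)
relaxed = pick_variant(0.95, candidates, max_accuracy_loss=0.05)
```

With the strict threshold the `fp16` variant is chosen, and with the relaxed one the `int8` variant is chosen. The threshold, not the developer's preference, decides how aggressively the model is compressed.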

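From the five design principles above, we can summarize two key points:

ML applications are designed modularly, with standardized input and output formats
ML applications should be able to optimize for limited compute resources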

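For the first point, the later sections of the document (4.2, 4.3) show what O-RAN has in mind,

where ML applications (A, B, and C in the figure below) are connected in a chain or over a bus

to carry out dependent computations and to read their input/output data.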
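
The chained arrangement can be sketched as a simple function pipeline. This is only my own toy illustration: the three functions `app_a`, `app_b`, `app_c`, the field names, and the thresholds are hypothetical and do not come from the spec. It shows that chaining only works if each application's output matches the input format the next one expects.

```python
from functools import reduce
from typing import Callable, Dict, List

Record = Dict[str, float]
MLApp = Callable[[Record], Record]


def app_a(record: Record) -> Record:
    """Hypothetical app A: derive a feature from raw counters."""
    return {**record, "load_ratio": record["used_prbs"] / record["total_prbs"]}


def app_b(record: Record) -> Record:
    """Hypothetical app B: toy 'prediction' that depends on A's output field."""
    return {**record, "predicted_load": min(1.0, record["load_ratio"] * 1.5)}


def app_c(record: Record) -> Record:
    """Hypothetical app C: decision that depends on B's output field."""
    return {**record, "offload": 1.0 if record["predicted_load"] > 0.8 else 0.0}


def run_chain(apps: List[MLApp], record: Record) -> Record:
    """Feed the output of each application into the next one, in order."""
    return reduce(lambda data, app: app(data), apps, record)


result = run_chain([app_a, app_b, app_c], {"used_prbs": 60.0, "total_prbs": 100.0})
```

If `app_b` were replaced by a third-party module that names its output field differently, `app_c` would fail with a `KeyError`, which is why the input/output formats have to be agreed on in advance.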

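In the use cases O-RAN envisions,

standardizing ML applications can indeed reduce the burden on service providers and accommodate different hardware requirements.

The drawback of this architecture, however, is a loss of flexibility in ML application development.

For example, all three applications A, B, and C above have to be defined, down to their input/output formats,

so that interfaces written by different developers can exchange data according to the standard format.

Such design principles let third-party applications get started quickly and be deployed on different platforms,

but they also seem to confine third parties to developing algorithms alone, with less control over the data flow.

Whether anyone other than a large Internet service provider (ISP) has the capability to deploy ML modules while still meeting the communication requirements

will be a major challenge when this architecture is actually put into practice.

Perhaps by then a separate O-RAN system integrator will be needed to help out!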
