Main features of the Cloud Natural Language API:
The quickstart lab demonstrates entity recognition: identifying the entities mentioned in a snippet of text.

    # Define an environment variable for later use
    export GOOGLE_CLOUD_PROJECT=$(gcloud config get-value core/project)
    # Create the my-natlang-sa service account and a key for it
    gcloud iam service-accounts create my-natlang-sa \
      --display-name "my natural language service account"
    gcloud iam service-accounts keys create ~/key.json \
      --iam-account my-natlang-sa@${GOOGLE_CLOUD_PROJECT}.iam.gserviceaccount.com
    export GOOGLE_APPLICATION_CREDENTIALS="/home/USER/key.json"

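Before moving on, it can help to confirm that the exported variable really points at a service account key. The following is a minimal sketch, not part of the lab: `check_key_file` is a hypothetical helper, and it only assumes that the key file is JSON containing the standard `type` and `client_email` fields.

```python
import json
import os


def check_key_file(path):
    """Return the service account e-mail stored in a key file, or raise ValueError."""
    with open(path, encoding="utf-8") as f:
        key = json.load(f)
    # Service account key files carry "type": "service_account"
    # and the account's address in "client_email".
    if key.get("type") != "service_account":
        raise ValueError(f"{path} does not look like a service account key")
    return key["client_email"]


if __name__ == "__main__":
    path = os.environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        print("GOOGLE_APPLICATION_CREDENTIALS is not set")
    elif not os.path.isfile(path):
        print("No key file found at", path)
    else:
        print("Key belongs to:", check_key_file(path))
```

If the script reports that no file was found, the path in the `export` line most likely does not match where `~/key.json` was actually written.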
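Go to the VM instances page in Compute Engine. The lab has already created an instance; click [SSH] to open an SSH connection to it.
From the instance, run an entity recognition request.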

    gcloud ml language analyze-entities --content="Michelangelo Caravaggio, Italian painter, is known for 'The Calling of Saint Matthew'." > result.json
    cat result.json

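For readers who want to see what such a request looks like outside of gcloud, here is a rough sketch of the JSON body for the API's `documents:analyzeEntities` REST method. The lab itself only uses the gcloud command; `build_request` is a hypothetical helper, and the sketch only builds and prints the body without sending it.

```python
import json

# Public REST endpoint of the analyzeEntities method (v1).
ENDPOINT = "https://language.googleapis.com/v1/documents:analyzeEntities"


def build_request(content):
    """Build the JSON body for an analyzeEntities call on plain text."""
    return {
        "document": {"type": "PLAIN_TEXT", "content": content},
        # With UTF8, offsets in the response are counted in UTF-8 bytes.
        "encodingType": "UTF8",
    }


if __name__ == "__main__":
    body = build_request(
        "Michelangelo Caravaggio, Italian painter, "
        "is known for 'The Calling of Saint Matthew'."
    )
    print("POST", ENDPOINT)
    print(json.dumps(body, indent=2))
```

Actually sending the request would also need an access token for the service account, which is outside the scope of this sketch.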
Use cat result.json to view the result.

    {
      "entities": [
        {
          "mentions": [
            { "text": { "beginOffset": 0, "content": "Michelangelo Caravaggio" }, "type": "PROPER" },
            { "text": { "beginOffset": 33, "content": "painter" }, "type": "COMMON" }
          ],
          "metadata": {
            "mid": "/m/020bg",
            "wikipedia_url": "https://en.wikipedia.org/wiki/Caravaggio"
          },
          "name": "Michelangelo Caravaggio",
          "salience": 0.82904786,
          "type": "PERSON"
        },
        {
          "mentions": [
            { "text": { "beginOffset": 25, "content": "Italian" }, "type": "PROPER" }
          ],
          "metadata": {},
          "name": "Italian",
          "salience": 0.13981608,
          "type": "LOCATION"
        },
        {
          "mentions": [
            { "text": { "beginOffset": 56, "content": "The Calling of Saint Matthew" }, "type": "PROPER" }
          ],
          "metadata": {
            "mid": "/m/085_p7",
            "wikipedia_url": "https://en.wikipedia.org/wiki/The_Calling_of_St_Matthew_(Caravaggio)"
          },
          "name": "The Calling of Saint Matthew",
          "salience": 0.031136045,
          "type": "EVENT"
        }
      ],
      "language": "en"
    }

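The fields in the JSON output above:
Field | Description |
---|---|
name, type | The entity's name and type |
metadata | Metadata; if the name has a matching Wikipedia entry, its URL appears here |
salience | How important the entity is within the text as a whole (a score from 0 to 1) |
mentions | The different words in the text that refer to the same entity |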
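
The fields described in the table can be pulled out of the response with a few lines of Python. This is a minimal sketch rather than anything from the lab: `summarize` is a hypothetical helper, and `SAMPLE` is a trimmed copy of the English output so that the block runs on its own. On the lab instance you could load `result.json` with `json.load` instead.

```python
def summarize(result):
    """Return (name, type, salience, wikipedia_url) tuples, most salient first."""
    rows = []
    for entity in result.get("entities", []):
        rows.append((
            entity["name"],
            entity["type"],
            entity.get("salience", 0.0),
            # metadata may be empty, so the URL can be missing (None).
            entity.get("metadata", {}).get("wikipedia_url"),
        ))
    return sorted(rows, key=lambda row: row[2], reverse=True)


# Trimmed copy of the English response (mentions omitted).
SAMPLE = {
    "entities": [
        {"name": "Italian", "type": "LOCATION", "salience": 0.13981608,
         "metadata": {}},
        {"name": "Michelangelo Caravaggio", "type": "PERSON", "salience": 0.82904786,
         "metadata": {"mid": "/m/020bg",
                      "wikipedia_url": "https://en.wikipedia.org/wiki/Caravaggio"}},
    ],
    "language": "en",
}

if __name__ == "__main__":
    for name, kind, salience, url in summarize(SAMPLE):
        print(f"{salience:.3f}  {kind:<10} {name}  {url or '-'}")
```

Sorting by salience puts the entity the text is mostly about at the top, which is usually the first thing worth checking in a result.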
Now try it with Chinese text.

    gcloud ml language analyze-entities --content="米開朗基羅是一位義大利畫家,以聖瑪竇蒙召為徒聞名" > result-zh.json
    cat result-zh.json


    {
      "entities": [
        {
          "mentions": [
            { "text": { "beginOffset": 0, "content": "米開朗基羅" }, "type": "PROPER" }
          ],
          "metadata": {
            "mid": "/m/058w5",
            "wikipedia_url": "https://en.wikipedia.org/wiki/Michelangelo"
          },
          "name": "米開朗基羅",
          "salience": 0.45181116,
          "type": "CONSUMER_GOOD"
        },
        {
          "mentions": [
            { "text": { "beginOffset": 45, "content": "聖瑪竇蒙" }, "type": "PROPER" }
          ],
          "metadata": {},
          "name": "聖瑪竇蒙",
          "salience": 0.21992774,
          "type": "OTHER"
        },
        {
          "mentions": [
            { "text": { "beginOffset": 24, "content": "義大利" }, "type": "PROPER" }
          ],
          "metadata": {
            "mid": "/m/03rjj",
            "wikipedia_url": "https://en.wikipedia.org/wiki/Italy"
          },
          "name": "義大利",
          "salience": 0.17133038,
          "type": "LOCATION"
        },
        {
          "mentions": [
            { "text": { "beginOffset": 33, "content": "畫家" }, "type": "COMMON" }
          ],
          "metadata": {},
          "name": "畫家",
          "salience": 0.1569307,
          "type": "PERSON"
        },
        {
          "mentions": [
            { "text": { "beginOffset": 18, "content": "一" }, "type": "TYPE_UNKNOWN" }
          ],
          "metadata": { "value": "1" },
          "name": "一",
          "salience": 0.0,
          "type": "NUMBER"
        }
      ],
      "language": "zh-Hant"
    }

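As the output shows, the analysis of Chinese text is not very accurate: 米開朗基羅 (Michelangelo) is typed as CONSUMER_GOOD, and the painting's title is cut short to 聖瑪竇蒙 and typed as OTHER.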
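One detail of the Chinese output that is easy to misread is `beginOffset`. The values are larger than the character positions, and they line up if they are read as UTF-8 byte offsets (each of these Chinese characters takes three bytes). The sketch below checks that reading against the first clause of the sentence; `slice_by_byte_offset` is a hypothetical helper, and the offsets are copied from the output above.

```python
def slice_by_byte_offset(text, begin_offset, content):
    """Return the substring starting at a UTF-8 byte offset, as long as `content`."""
    raw = text.encode("utf-8")
    length = len(content.encode("utf-8"))
    return raw[begin_offset:begin_offset + length].decode("utf-8")


# First clause of the Chinese sentence analyzed above.
CLAUSE = "米開朗基羅是一位義大利畫家"

# (beginOffset, content) pairs copied from result-zh.json.
MENTIONS = [(0, "米開朗基羅"), (18, "一"), (24, "義大利"), (33, "畫家")]

if __name__ == "__main__":
    for offset, content in MENTIONS:
        by_bytes = slice_by_byte_offset(CLAUSE, offset, content)
        # Treating the offset as a character index lands in the wrong place.
        by_chars = CLAUSE[offset:offset + len(content)]
        print(offset, content, "| by bytes:", by_bytes, "| by chars:", repr(by_chars))
```

So when highlighting mentions in non-ASCII text, slice the encoded bytes rather than the Python string.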
Because this is a quickstart, the lab does not explain in detail how the instance was set up.