Physical Address
304 North Cardinal St.
Dorchester Center, MA 02124
Main features of the Cloud Natural Language API:
The quickstart lab demonstrates entity recognition, which identifies the entities mentioned in a snippet of text.
# Define an environment variable for later use
export GOOGLE_CLOUD_PROJECT=$(gcloud config get-value core/project)
# Create the my-natlang-sa service account
gcloud iam service-accounts create my-natlang-sa \
  --display-name "my natural language service account"
# Create a key for the service account and save it to ~/key.json
gcloud iam service-accounts keys create ~/key.json \
  --iam-account my-natlang-sa@${GOOGLE_CLOUD_PROJECT}.iam.gserviceaccount.com
# Point the API client at the key file (replace USER with your user name)
export GOOGLE_APPLICATION_CREDENTIALS="/home/USER/key.json"
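Before sending a request, it can help to confirm that GOOGLE_APPLICATION_CREDENTIALS really points at a readable service account key. The following is a minimal Python sketch and not part of the lab. The helper name `check_key_file` is hypothetical, and the sketch assumes the key is the JSON file written by `gcloud iam service-accounts keys create`, containing `type` and `client_email` fields.

```python
import json
import os


def check_key_file(path):
    """Return the service account email if path looks like a key file, else None."""
    if not path or not os.path.isfile(path):
        return None
    try:
        with open(path, encoding="utf-8") as f:
            key = json.load(f)
    except (OSError, ValueError):
        # Unreadable file or invalid JSON.
        return None
    if not isinstance(key, dict) or key.get("type") != "service_account":
        return None
    return key.get("client_email")


if __name__ == "__main__":
    # Reads the same variable exported in the shell commands above.
    email = check_key_file(os.environ.get("GOOGLE_APPLICATION_CREDENTIALS"))
    print(email or "GOOGLE_APPLICATION_CREDENTIALS does not point at a valid key file")
```

If the script reports a problem, the usual cause is that the placeholder USER in the exported path was not replaced with the actual user name.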
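Go to the VM instances page in Compute Engine. The lab has already created an instance; click [SSH] to connect to it over SSH.
Run the entity analysis request from inside the instance.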
gcloud ml language analyze-entities --content="Michelangelo Caravaggio, Italian painter, is known for 'The Calling of Saint Matthew'." > result.json
View the result with cat result.json:
cat result.json
{
  "entities": [
    {
      "mentions": [
        {
          "text": {
            "beginOffset": 0,
            "content": "Michelangelo Caravaggio"
          },
          "type": "PROPER"
        },
        {
          "text": {
            "beginOffset": 33,
            "content": "painter"
          },
          "type": "COMMON"
        }
      ],
      "metadata": {
        "mid": "/m/020bg",
        "wikipedia_url": "https://en.wikipedia.org/wiki/Caravaggio"
      },
      "name": "Michelangelo Caravaggio",
      "salience": 0.82904786,
      "type": "PERSON"
    },
    {
      "mentions": [
        {
          "text": {
            "beginOffset": 25,
            "content": "Italian"
          },
          "type": "PROPER"
        }
      ],
      "metadata": {},
      "name": "Italian",
      "salience": 0.13981608,
      "type": "LOCATION"
    },
    {
      "mentions": [
        {
          "text": {
            "beginOffset": 56,
            "content": "The Calling of Saint Matthew"
          },
          "type": "PROPER"
        }
      ],
      "metadata": {
        "mid": "/m/085_p7",
        "wikipedia_url": "https://en.wikipedia.org/wiki/The_Calling_of_St_Matthew_(Caravaggio)"
      },
      "name": "The Calling of Saint Matthew",
      "salience": 0.031136045,
      "type": "EVENT"
    }
  ],
  "language": "en"
}
The fields in the JSON above are described below:
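| Field | Description |
|---|---|
| name, type | The entity's name and type |
| metadata | Metadata; if the name has a matching Wikipedia entry, the corresponding URL is shown |
| salience | How important the entity is within the whole text |
| mentions | The different expressions in the text that refer to the same entity |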
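To make the fields in the table concrete, here is a small Python sketch that reads a response and lists each entity by salience. It is an illustration and not part of the lab: the function name `summarize_entities` is hypothetical, and `SAMPLE` is a trimmed copy of the English response above rather than a live API call.

```python
import json

# Trimmed copy of the English response shown above (mentions omitted).
SAMPLE = """
{
  "entities": [
    {
      "metadata": {
        "mid": "/m/020bg",
        "wikipedia_url": "https://en.wikipedia.org/wiki/Caravaggio"
      },
      "name": "Michelangelo Caravaggio",
      "salience": 0.82904786,
      "type": "PERSON"
    },
    {"metadata": {}, "name": "Italian", "salience": 0.13981608, "type": "LOCATION"},
    {"metadata": {}, "name": "The Calling of Saint Matthew", "salience": 0.031136045, "type": "EVENT"}
  ],
  "language": "en"
}
"""


def summarize_entities(response):
    """Return (name, type, salience, wikipedia_url) tuples, most salient first."""
    rows = []
    for entity in response.get("entities", []):
        metadata = entity.get("metadata", {})
        rows.append((
            entity["name"],
            entity["type"],
            entity["salience"],
            metadata.get("wikipedia_url"),  # None when no Wikipedia entry is linked
        ))
    return sorted(rows, key=lambda row: row[2], reverse=True)


rows = summarize_entities(json.loads(SAMPLE))
for name, entity_type, salience, url in rows:
    print(f"{salience:.3f}  {entity_type:<10} {name}  {url or '-'}")
```

To run it against the real output, replace `json.loads(SAMPLE)` with the parsed contents of result.json.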
Now let's try it with Chinese.
gcloud ml language analyze-entities --content="米開朗基羅是一位義大利畫家,以聖瑪竇蒙召為徒聞名" > result-zh.json
cat result-zh.json
{
  "entities": [
    {
      "mentions": [
        {
          "text": {
            "beginOffset": 0,
            "content": "米開朗基羅"
          },
          "type": "PROPER"
        }
      ],
      "metadata": {
        "mid": "/m/058w5",
        "wikipedia_url": "https://en.wikipedia.org/wiki/Michelangelo"
      },
      "name": "米開朗基羅",
      "salience": 0.45181116,
      "type": "CONSUMER_GOOD"
    },
    {
      "mentions": [
        {
          "text": {
            "beginOffset": 45,
            "content": "聖瑪竇蒙"
          },
          "type": "PROPER"
        }
      ],
      "metadata": {},
      "name": "聖瑪竇蒙",
      "salience": 0.21992774,
      "type": "OTHER"
    },
    {
      "mentions": [
        {
          "text": {
            "beginOffset": 24,
            "content": "義大利"
          },
          "type": "PROPER"
        }
      ],
      "metadata": {
        "mid": "/m/03rjj",
        "wikipedia_url": "https://en.wikipedia.org/wiki/Italy"
      },
      "name": "義大利",
      "salience": 0.17133038,
      "type": "LOCATION"
    },
    {
      "mentions": [
        {
          "text": {
            "beginOffset": 33,
            "content": "畫家"
          },
          "type": "COMMON"
        }
      ],
      "metadata": {},
      "name": "畫家",
      "salience": 0.1569307,
      "type": "PERSON"
    },
    {
      "mentions": [
        {
          "text": {
            "beginOffset": 18,
            "content": "一"
          },
          "type": "TYPE_UNKNOWN"
        }
      ],
      "metadata": {
        "value": "1"
      },
      "name": "一",
      "salience": 0.0,
      "type": "NUMBER"
    }
  ],
  "language": "zh-Hant"
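}
As the output shows, the analysis of Chinese text is not very accurate. For example, 米開朗基羅 (Michelangelo) is typed as CONSUMER_GOOD, and the painting title is cut off at 聖瑪竇蒙 and typed as OTHER.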
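One detail of the Chinese output is easy to misread: the beginOffset values are not character positions. 義大利 starts at the ninth character (index 8) but is reported at offset 24. The values are consistent with UTF-8 byte offsets, since each of these characters takes 3 bytes. The Python sketch below checks that reading. It is an inference from the output above, not something the lab states, and it assumes the original input used a fullwidth comma (U+FF0C), which is what offset 45 implies. The helper name `slice_by_byte_offset` is hypothetical.

```python
def slice_by_byte_offset(text, begin_offset, content):
    """Return the part of text starting at a UTF-8 byte offset.

    The slice is as long as the UTF-8 encoding of content.
    """
    encoded = text.encode("utf-8")
    end = begin_offset + len(content.encode("utf-8"))
    return encoded[begin_offset:end].decode("utf-8")


# Input sentence, with the comma written as U+FF0C (assumption, see above).
TEXT = "米開朗基羅是一位義大利畫家\uff0c以聖瑪竇蒙召為徒聞名"

# (beginOffset, content) pairs copied from the Chinese response above.
MENTIONS = [(0, "米開朗基羅"), (45, "聖瑪竇蒙"), (24, "義大利"), (33, "畫家"), (18, "一")]

for offset, content in MENTIONS:
    found = slice_by_byte_offset(TEXT, offset, content)
    # Compare the byte-offset slice with the reported mention text,
    # and show the character index for contrast.
    print(offset, TEXT.index(content), content, found == content)
```

When highlighting mentions in the original string, convert byte offsets to character indexes first; slicing the Python string directly with beginOffset would point at the wrong text for non-ASCII input.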
Because this is a quickstart, the lab does not explain in detail how the instance was set up.