Elasticsearchチュートリアル -- 起動~データ投入~クエリまで

Elasticsearch勉強メモ

出典: 

起動

docker container run --rm -d \
  --health-cmd='curl localhost:9200/_health' \
  --health-interval=5s \
  -p 9200:9200 -p 9300:9300 \
  -e "discovery.type=single-node" \
  docker.elastic.co/elasticsearch/elasticsearch:7.6.2
  • さいきんdockerのhealthcheckを覚えた
CONTAINER ID        IMAGE                                                 COMMAND                  CREATED             STATUS                    PORTS                                            NAMES
cc3e9750e00a        docker.elastic.co/elasticsearch/elasticsearch:7.6.2   "/usr/local/bin/dock…"   23 seconds ago      Up 22 seconds (healthy)   0.0.0.0:9200->9200/tcp, 0.0.0.0:9300->9300/tcp   silly_banzai
  • 9300はレプリケーションに使うらしい

Index some documents

curl -XPUT localhost:9200/customer/_doc/1 -d '
{
  "name": "John Doe"
}
'
  • ヘッダがないぞと怒られる
{"error":"Content-Type header [application/x-www-form-urlencoded] is not supported","status":406}
  • Content-Type:application/json ヘッダ必須
curl -XPUT localhost:9200/customer/_doc/1 -H 'Content-Type:application/json' -d '
{
  "name": "John Doe"
}
'
{"_index":"customer","_type":"_doc","_id":"1","_version":1,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":0,"_primary_term":1}
  • 読み出し
curl localhost:9200/customer/_doc/1
{"_index":"customer","_type":"_doc","_id":"1","_version":1,"_seq_no":0,"_primary_term":1,"found":true,"_source":
{
  "name": "John Doe"
}
  • ?pretty=trueをつけると整形される

    • シェルに解釈されないようにクォート
curl 'localhost:9200/customer/_doc/1?pretty=true'
{
  "_index" : "customer",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 1,
  "_seq_no" : 0,
  "_primary_term" : 1,
  "found" : true,
  "_source" : {
    "name" : "John Doe"
  }
}

Indexing documents in bulk

curl -H 'Content-Type:application/json' -XPUT 'localhost:9200/bank/_bulk?pretty&refresh' --data-binary '@accounts.json'
curl 'localhost:9200/_cat/indices?v'
health status index    uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   bank     ButQrX7MTD2d7Ng5ZwR_-g   1   1       1000            0      414kb          414kb
yellow open   customer 0WRs2HR8TAK3OwANxEMOLw   1   1          1            0      3.4kb          3.4kb

Start Searching

[https://www.elastic.co/guide/en/elasticsearch/reference/current/getting-started-search.html:embed:cite]

curl -H 'Content-Type:application/json' localhost:9200/bank/_search?pretty -d '
{
  "query": { "match_all": {} },
  "sort": [
    { "account_number": "asc" }
  ]
}
'
{
  "took" : 4,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1000,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [
      {
        "_index" : "bank",
        "_type" : "_doc",
        "_id" : "0",
        "_score" : null,
        "_source" : {
          "account_number" : 0,
          "balance" : 16623,
          "firstname" : "Bradshaw",
          "lastname" : "Mckenzie",
          "age" : 29,
          "gender" : "F",
          "address" : "244 Columbus Place",
          "employer" : "Euron",
          "email" : "bradshawmckenzie@euron.com",
          "city" : "Hobucken",
          "state" : "CO"
        },
        "sort" : [
          0
        ]
      },
      {
        "_index" : "bank",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : null,
        "_source" : {
          "account_number" : 1,
          "balance" : 39225,
          "firstname" : "Amber",
          "lastname" : "Duke",
          "age" : 32,
          "gender" : "M",
          "address" : "880 Holmes Lane",
          "employer" : "Pyrami",
          "email" : "amberduke@pyrami.com",
          "city" : "Brogan",
          "state" : "IL"
        },
        "sort" : [
          1
        ]
      },
...
      {
        "_index" : "bank",
        "_type" : "_doc",
        "_id" : "9",
        "_score" : null,
        "_source" : {
          "account_number" : 9,
          "balance" : 24776,
          "firstname" : "Opal",
          "lastname" : "Meadows",
          "age" : 39,
          "gender" : "M",
          "address" : "963 Neptune Avenue",
          "employer" : "Cedward",
          "email" : "opalmeadows@cedward.com",
          "city" : "Olney",
          "state" : "OH"
        },
        "sort" : [
          9
        ]
      }
    ]
  }
}
  • デフォルトで先頭10件取ってくる
  • 読み方

    • "took" : 4,
    • かかった時間(ms)
    • "_shards"
    • 検索対象となったシャード内訳

  • fromsizeで取得先頭位置・件数を指定
curl -H 'Content-Type:application/json' localhost:9200/bank/_search?pretty -d '
{
  "query": { "match_all": {} },
  "sort": [
    { "account_number": "asc" }
  ],
  "from": 10,
  "size": 3
}
'
{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1000,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [
      {
        "_index" : "bank",
        "_type" : "_doc",
        "_id" : "10",
        "_score" : null,
        "_source" : {
          "account_number" : 10,
          "balance" : 46170,
          "firstname" : "Dominique",
          "lastname" : "Park",
          "age" : 37,
          "gender" : "F",
          "address" : "100 Gatling Place",
          "employer" : "Conjurica",
          "email" : "dominiquepark@conjurica.com",
          "city" : "Omar",
          "state" : "NJ"
        },
        "sort" : [
          10
        ]
      },
      {
        "_index" : "bank",
        "_type" : "_doc",
        "_id" : "11",
        "_score" : null,
        "_source" : {
          "account_number" : 11,
          "balance" : 20203,
          "firstname" : "Jenkins",
          "lastname" : "Haney",
          "age" : 20,
          "gender" : "M",
          "address" : "740 Ferry Place",
          "employer" : "Qimonk",
          "email" : "jenkinshaney@qimonk.com",
          "city" : "Steinhatchee",
          "state" : "GA"
        },
        "sort" : [
          11
        ]
      },
      {
        "_index" : "bank",
        "_type" : "_doc",
        "_id" : "12",
        "_score" : null,
        "_source" : {
          "account_number" : 12,
          "balance" : 37055,
          "firstname" : "Stafford",
          "lastname" : "Brock",
          "age" : 20,
          "gender" : "F",
          "address" : "296 Wythe Avenue",
          "employer" : "Uncorp",
          "email" : "staffordbrock@uncorp.com",
          "city" : "Bend",
          "state" : "AL"
        },
        "sort" : [
          12
        ]
      }
    ]
  }
}
  • もう少しマシな検索をしてみる
  • addressmillまたはlaneを含むやつで、一致度の高いものトップ3
curl -H 'Content-Type:application/json' localhost:9200/bank/_search?pretty -d '
{
  "query": { "match": { "address": "mill lane" } },
  "sort": [
    { "_score": "desc" }
  ],
  "size": 3
}
'
{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 19,
      "relation" : "eq"
    },
    "max_score" : 9.507477,
    "hits" : [
      {
        "_index" : "bank",
        "_type" : "_doc",
        "_id" : "136",
        "_score" : 9.507477,
        "_source" : {
          "account_number" : 136,
          "balance" : 45801,
          "firstname" : "Winnie",
          "lastname" : "Holland",
          "age" : 38,
          "gender" : "M",
          "address" : "198 Mill Lane",
          "employer" : "Neteria",
          "email" : "winnieholland@neteria.com",
          "city" : "Urie",
          "state" : "IL"
        }
      },
      {
        "_index" : "bank",
        "_type" : "_doc",
        "_id" : "970",
        "_score" : 5.4032025,
        "_source" : {
          "account_number" : 970,
          "balance" : 19648,
          "firstname" : "Forbes",
          "lastname" : "Wallace",
          "age" : 28,
          "gender" : "M",
          "address" : "990 Mill Road",
          "employer" : "Pheast",
          "email" : "forbeswallace@pheast.com",
          "city" : "Lopezo",
          "state" : "AK"
        }
      },
      {
        "_index" : "bank",
        "_type" : "_doc",
        "_id" : "345",
        "_score" : 5.4032025,
        "_source" : {
          "account_number" : 345,
          "balance" : 9812,
          "firstname" : "Parker",
          "lastname" : "Hines",
          "age" : 38,
          "gender" : "M",
          "address" : "715 Mill Avenue",
          "employer" : "Baluba",
          "email" : "parkerhines@baluba.com",
          "city" : "Blackgum",
          "state" : "KY"
        }
      }
    ]
  }
}
  • 確かにmilllaneを両方含むもの = スコアの高いものがトップに出てきた
  • 複合条件: boolを使う
curl -H 'Content-Type:application/json' localhost:9200/bank/_search?pretty -d '
{
  "query": {
    "bool": {
      "must": [
        { "match": { "age": "40" } }
      ],
      "must_not": [
        { "match": { "state": "ID" } }
      ]
    }
  }
}
'
{
  "took" : 6,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 43,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "bank",
        "_type" : "_doc",
        "_id" : "474",
        "_score" : 1.0,
        "_source" : {
          "account_number" : 474,
          "balance" : 35896,
          "firstname" : "Obrien",
          "lastname" : "Walton",
          "age" : 40,
          "gender" : "F",
          "address" : "192 Ide Court",
          "employer" : "Suremax",
          "email" : "obrienwalton@suremax.com",
          "city" : "Crucible",
          "state" : "UT"
        }
      },
      {
        "_index" : "bank",
        "_type" : "_doc",
        "_id" : "479",
        "_score" : 1.0,
        "_source" : {
          "account_number" : 479,
          "balance" : 31865,
          "firstname" : "Cameron",
          "lastname" : "Ross",
          "age" : 40,
          "gender" : "M",
          "address" : "904 Bouck Court",
          "employer" : "Telpod",
          "email" : "cameronross@telpod.com",
          "city" : "Nord",
          "state" : "MO"
        }
      },
      {
        "_index" : "bank",
        "_type" : "_doc",
        "_id" : "549",
        "_score" : 1.0,
        "_source" : {
          "account_number" : 549,
          "balance" : 1932,
          "firstname" : "Jacqueline",
          "lastname" : "Maxwell",
          "age" : 40,
          "gender" : "M",
          "address" : "444 Schenck Place",
          "employer" : "Fuelworks",
          "email" : "jacquelinemaxwell@fuelworks.com",
          "city" : "Oretta",
          "state" : "OR"
        }
      },
      {
        "_index" : "bank",
        "_type" : "_doc",
        "_id" : "878",
        "_score" : 1.0,
        "_source" : {
          "account_number" : 878,
          "balance" : 49159,
          "firstname" : "Battle",
          "lastname" : "Blackburn",
          "age" : 40,
          "gender" : "F",
          "address" : "234 Hendrix Street",
          "employer" : "Zilphur",
          "email" : "battleblackburn@zilphur.com",
          "city" : "Wanamie",
          "state" : "PA"
        }
      },
      {
        "_index" : "bank",
        "_type" : "_doc",
        "_id" : "885",
        "_score" : 1.0,
        "_source" : {
          "account_number" : 885,
          "balance" : 31661,
          "firstname" : "Valdez",
          "lastname" : "Roberson",
          "age" : 40,
          "gender" : "F",
          "address" : "227 Scholes Street",
          "employer" : "Delphide",
          "email" : "valdezroberson@delphide.com",
          "city" : "Chilton",
          "state" : "MT"
        }
      },
      {
        "_index" : "bank",
        "_type" : "_doc",
        "_id" : "948",
        "_score" : 1.0,
        "_source" : {
          "account_number" : 948,
          "balance" : 37074,
          "firstname" : "Sargent",
          "lastname" : "Powers",
          "age" : 40,
          "gender" : "M",
          "address" : "532 Fiske Place",
          "employer" : "Accuprint",
          "email" : "sargentpowers@accuprint.com",
          "city" : "Umapine",
          "state" : "AK"
        }
      },
      {
        "_index" : "bank",
        "_type" : "_doc",
        "_id" : "998",
        "_score" : 1.0,
        "_source" : {
          "account_number" : 998,
          "balance" : 16869,
          "firstname" : "Letha",
          "lastname" : "Baker",
          "age" : 40,
          "gender" : "F",
          "address" : "206 Llama Court",
          "employer" : "Dognosis",
          "email" : "lethabaker@dognosis.com",
          "city" : "Dunlo",
          "state" : "WV"
        }
      },
      {
        "_index" : "bank",
        "_type" : "_doc",
        "_id" : "40",
        "_score" : 1.0,
        "_source" : {
          "account_number" : 40,
          "balance" : 33882,
          "firstname" : "Pace",
          "lastname" : "Molina",
          "age" : 40,
          "gender" : "M",
          "address" : "263 Ovington Court",
          "employer" : "Cytrak",
          "email" : "pacemolina@cytrak.com",
          "city" : "Silkworth",
          "state" : "OR"
        }
      },
      {
        "_index" : "bank",
        "_type" : "_doc",
        "_id" : "165",
        "_score" : 1.0,
        "_source" : {
          "account_number" : 165,
          "balance" : 18956,
          "firstname" : "Sims",
          "lastname" : "Mckay",
          "age" : 40,
          "gender" : "F",
          "address" : "205 Jackson Street",
          "employer" : "Comtour",
          "email" : "simsmckay@comtour.com",
          "city" : "Tilden",
          "state" : "DC"
        }
      },
      {
        "_index" : "bank",
        "_type" : "_doc",
        "_id" : "177",
        "_score" : 1.0,
        "_source" : {
          "account_number" : 177,
          "balance" : 48972,
          "firstname" : "Harris",
          "lastname" : "Gross",
          "age" : 40,
          "gender" : "F",
          "address" : "468 Suydam Street",
          "employer" : "Kidstock",
          "email" : "harrisgross@kidstock.com",
          "city" : "Yettem",
          "state" : "KY"
        }
      }
    ]
  }
}
  • must, should_scoreに影響する
  • must_not_scoreに影響しない

    • 検索結果に現れるか現れないかにしか影響しない
  • filterもスコアに影響しない
curl -H 'Content-Type:application/json' localhost:9200/bank/_search?pretty -d '
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "age": "40"
          }
        }
      ],
      "filter": {
        "range": {
          "balance": {
            "gte": 20000,
            "lte": 30000
          }
        }
      }
    }
  },
  "sort": [
    { "_score": "desc" }
  ],
  "size": 3
}
'
{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 7,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "bank",
        "_type" : "_doc",
        "_id" : "432",
        "_score" : 1.0,
        "_source" : {
          "account_number" : 432,
          "balance" : 28969,
          "firstname" : "Preston",
          "lastname" : "Ferguson",
          "age" : 40,
          "gender" : "F",
          "address" : "239 Greenwood Avenue",
          "employer" : "Bitendrex",
          "email" : "prestonferguson@bitendrex.com",
          "city" : "Idledale",
          "state" : "ND"
        }
      },
      {
        "_index" : "bank",
        "_type" : "_doc",
        "_id" : "477",
        "_score" : 1.0,
        "_source" : {
          "account_number" : 477,
          "balance" : 25892,
          "firstname" : "Holcomb",
          "lastname" : "Cobb",
          "age" : 40,
          "gender" : "M",
          "address" : "369 Marconi Place",
          "employer" : "Steeltab",
          "email" : "holcombcobb@steeltab.com",
          "city" : "Byrnedale",
          "state" : "CA"
        }
      },
      {
        "_index" : "bank",
        "_type" : "_doc",
        "_id" : "523",
        "_score" : 1.0,
        "_source" : {
          "account_number" : 523,
          "balance" : 28729,
          "firstname" : "Amalia",
          "lastname" : "Benjamin",
          "age" : 40,
          "gender" : "F",
          "address" : "173 Bushwick Place",
          "employer" : "Sentia",
          "email" : "amaliabenjamin@sentia.com",
          "city" : "Jacumba",
          "state" : "OK"
        }
      }
    ]
  }
}
  • mustで書き換えてみるとスコアが違ってくる

    • 1.0 -> 2.0
curl -H 'Content-Type:application/json' localhost:9200/bank/_search?pretty -d '
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "age": "40"
          }
        },
        {
          "range": {
            "balance": {
              "gte": 20000,
              "lte": 30000
            }
          }
        }
      ]
    }
  },
  "sort": [
    { "_score": "desc" }
  ],
  "size": 3
}
'
{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 7,
      "relation" : "eq"
    },
    "max_score" : 2.0,
    "hits" : [
      {
        "_index" : "bank",
        "_type" : "_doc",
        "_id" : "432",
        "_score" : 2.0,
        "_source" : {
          "account_number" : 432,
          "balance" : 28969,
          "firstname" : "Preston",
          "lastname" : "Ferguson",
          "age" : 40,
          "gender" : "F",
          "address" : "239 Greenwood Avenue",
          "employer" : "Bitendrex",
          "email" : "prestonferguson@bitendrex.com",
          "city" : "Idledale",
          "state" : "ND"
        }
      },
      {
        "_index" : "bank",
        "_type" : "_doc",
        "_id" : "477",
        "_score" : 2.0,
        "_source" : {
          "account_number" : 477,
          "balance" : 25892,
          "firstname" : "Holcomb",
          "lastname" : "Cobb",
          "age" : 40,
          "gender" : "M",
          "address" : "369 Marconi Place",
          "employer" : "Steeltab",
          "email" : "holcombcobb@steeltab.com",
          "city" : "Byrnedale",
          "state" : "CA"
        }
      },
      {
        "_index" : "bank",
        "_type" : "_doc",
        "_id" : "523",
        "_score" : 2.0,
        "_source" : {
          "account_number" : 523,
          "balance" : 28729,
          "firstname" : "Amalia",
          "lastname" : "Benjamin",
          "age" : 40,
          "gender" : "F",
          "address" : "173 Bushwick Place",
          "employer" : "Sentia",
          "email" : "amaliabenjamin@sentia.com",
          "city" : "Jacumba",
          "state" : "OK"
        }
      }
    ]
  }
}