
Backing up data from AWS ElasticSearch Service to a local ES + attaching a Korean morphological analyzer

Situation: the AWS ElasticSearch Service I used for a course project is running up a large bill, so I want to back up the data. I may need to run the code again later, though, so I would like to install ES on an idle machine in the lab and move the data there. There are plenty of blog posts on this, but surprisingly few recent ones, so I will write it up myself.

Finding a Docker image

The official site helpfully provides a Docker image to start from, along with instructions. I changed a few lines of the example as needed. There were three changes:

  • Raised memory to 2 GB per container / 1 GB for the JVM heap (this machine has 32 GB of RAM) -> edited ES_JAVA_OPTS under environment, and mem_limit
  • Moved the volumes from the folder where Docker is installed to the hard disk (SSD space is precious) -> volumes
  • Added a Kibana image

version: '2'
services:
  elasticsearch1:
    image: docker.elastic.co/elasticsearch/elasticsearch:5.6.1
    container_name: elasticsearch1
    environment:
      - cluster.name=docker-cluster
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms1g -Xmx1g"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    mem_limit: 2g
    volumes:
      - esdata1:/usr/share/elasticsearch/data
    ports:
      - 9200:9200
    networks:
      - esnet
  elasticsearch2:
    image: docker.elastic.co/elasticsearch/elasticsearch:5.6.1
    environment:
      - cluster.name=docker-cluster
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms1g -Xmx1g"
      - "discovery.zen.ping.unicast.hosts=elasticsearch1"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    mem_limit: 2g
    volumes:
      - esdata2:/usr/share/elasticsearch/data
    networks:
      - esnet
  kibana:
    image: docker.elastic.co/kibana/kibana:5.6.1
    depends_on: ['elasticsearch1']
    environment:
      SERVER_NAME: lab-kibana
      ELASTICSEARCH_URL: "http://elasticsearch1:9200"
    ports:
      - 5601:5601
    networks:
      - esnet

volumes:
  esdata1:
      driver_opts:
          type: none
          device: /hdd/lg_elasticsearch/volumes/esdata1
          o: bind
  esdata2:
      driver_opts:
          type: none
          device: /hdd/lg_elasticsearch/volumes/esdata2
          o: bind

networks:
  esnet:

After this, run docker-compose up and open Kibana on port 5601, and as expected it works. Looking around Kibana, it had slightly more features than the ES that AWS provides (monitoring, for example?). To load the data I use elasticsearch-dump, made by TaskRabbit.
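As a rough sketch of the loading step, elasticsearch-dump (the `elasticdump` command) can copy an index's mapping and then its documents from one cluster to another. The function name `copy_index`, the endpoint URLs, and the index name are placeholders rather than values from this setup, and the batch size is an arbitrary choice.

```shell
# copy_index SRC_URL DST_URL INDEX
# Copies one index between clusters with elasticdump (npm install -g elasticdump).
copy_index() {
    src="$1"; dst="$2"; index="$3"
    # Mapping first, so the target index gets the same field types.
    elasticdump --input="$src/$index" --output="$dst/$index" --type=mapping &&
    # Then the documents, 1000 per batch (arbitrary; elasticdump defaults to 100).
    elasticdump --input="$src/$index" --output="$dst/$index" --type=data --limit=1000
}
```

A call might look like `copy_index https://my-aws-endpoint.example.com http://localhost:9200 my_index`, with both the endpoint and the index name replaced by real ones.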

So can I delete the cluster on AWS now?

I was uneasy about it, so as a quick test I first loaded only the 120,000-document index that had been on AWS. (The other one has 47 million documents, so it takes a long time...)

  • All the data lives in Docker volumes, so after a graceful stop and start with docker-compose down and docker-compose up the documents are still there, intact.
  • Even after mashing Ctrl-C to force-kill everything, the documents survive a restart. I am not sure whether that is the expected behavior or whether I just got lucky.
  • Both nodes will die at the same time anyway, so why run two of them? Just because Status: Yellow is unpleasant to look at? In the end I am not comfortable treating a Docker volume on a local machine as the backup location. I do not trust the hard disk much either.
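On the Status: Yellow question: yellow generally means every primary shard is assigned but some replica shards are not, which is what a single node produces when its indices ask for replicas. Below is a minimal sketch for checking the status and, if only one node is kept, setting replicas to zero so the cluster can report green. It assumes ES is reachable on localhost:9200 as in the compose file above; the function names are made up for illustration.

```shell
ES_URL="${ES_URL:-http://localhost:9200}"

# Print cluster health, including the "status" field (green / yellow / red).
cluster_health() {
    curl -s "$ES_URL/_cluster/health?pretty"
}

# Set number_of_replicas to 0 on every existing index (single-node setups only).
drop_replicas() {
    curl -s -XPUT "$ES_URL/_all/_settings" \
        -H 'Content-Type: application/json' \
        -d '{"index": {"number_of_replicas": 0}}'
}
```

Dropping replicas only changes what the status reports; it does nothing for durability, which is the real concern in the list above.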

In the end

I dumped all the data as gzipped JSON and uploaded it to S3 to cover the worst case (70 GB shrank to 16 GB), and I also re-indexed it into the ES on the lab machine so that it stays usable.
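A sketch of how such a dump might be produced, assuming elasticdump for the export and the AWS CLI for the upload. The function name `dump_to_s3`, the bucket name, and the /tmp path are placeholders. elasticdump writes to stdout when the output is `$`, so the stream can be piped straight through gzip.

```shell
# dump_to_s3 ES_URL INDEX S3_BUCKET
# Streams one index out as JSON lines, gzips it, and uploads the archive.
dump_to_s3() {
    es_url="$1"; index="$2"; bucket="$3"
    archive="/tmp/$index.json.gz"
    # '$' as the output means stdout; single quotes keep the shell from touching it.
    # Caveat: plain sh has no pipefail, so a failed export still leaves a
    # (truncated) archive behind. Check the document count before trusting it.
    elasticdump --input="$es_url/$index" '--output=$' --type=data | gzip > "$archive" &&
    aws s3 cp "$archive" "s3://$bucket/$index.json.gz"
}
```

For example, `dump_to_s3 http://localhost:9200 my_index my-backup-bucket`, where both names are hypothetical.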

One more thing

When I was using the ES that AWS provides, I could not install plugins and so could not attach a morphological analyzer, but this time it looks doable. I have seen plenty of tutorials on blogs, so let's go.

(Warning: trial and error ahead) First, go into the container and install the plugin.

docker exec -it <container name> /bin/bash

It looks like a tool called Eunjeon Hannip (은전한닢, the seunjeon project) is widely used. Following its instructions, run the download script with the ES version in use and the analyzer version you need.

bash <(curl -s https://bitbucket.org/eunjeon/seunjeon/raw/master/elasticsearch/scripts/downloader.sh) -e 5.6.1 -p 5.4.1.0

Partway through, the script fails with an error saying zip is not installed in the container. It seems easier to just run the command on a laptop and upload the resulting archive somewhere on the web. Either way, once the generated elasticsearch-analysis-seunjeon-5.4.1.0.zip file is inside the container, run

./bin/elasticsearch-plugin install file://`pwd`/elasticsearch-analysis-seunjeon-5.4.1.0.zip

and then try to create an index as in the example, but the tokenizer is not recognized. A restart seems to be required, but how do I restart an ES that is running inside an already started Docker container?
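One possible answer to that question, as a sketch: restarting the container restarts the ES process inside it, and files written by elasticsearch-plugin stay in the container's filesystem as long as the container is restarted rather than removed and recreated (docker-compose down removes it). The container name elasticsearch1 comes from the compose file above; the function name, the polling loop, and its timeout are illustrative.

```shell
# restart_es CONTAINER [ES_URL]
# Restarts the container and waits until ES answers HTTP again.
restart_es() {
    container="$1"; es_url="${2:-http://localhost:9200}"
    docker restart "$container" || return 1
    tries=0
    # Poll every 2 seconds, giving up after 30 attempts.
    until curl -s -o /dev/null "$es_url"; do
        tries=$((tries + 1))
        [ "$tries" -ge 30 ] && return 1
        sleep 2
    done
}
```

With the compose file above this would be `restart_es elasticsearch1`. The second node does not publish port 9200, so it would need a plain `docker restart` without the wait.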


On reflection, going into every container to install the plugin is not a good approach, and rather than installing the plugin and then restarting, it is better to install it before ES ever starts. So I decided to build an image with the morphological analyzer already installed and start the containers from that.

FROM  docker.elastic.co/elasticsearch/elasticsearch:5.6.1
RUN wget https://www.dropbox.com/s/n59vve9sfguztjg/elasticsearch-analysis-seunjeon-5.4.1.0.zip
RUN ./bin/elasticsearch-plugin install file://`pwd`/elasticsearch-analysis-seunjeon-5.4.1.0.zip

Create the Dockerfile and point compose at it (done this way, one image gets built per node, which is wasteful; it would be better to build the image once and reference it)

build: . # instead of image

The cluster comes up fine, and the tokenizer is recognized.
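The parenthetical above suggests building the image once and referencing it. A sketch of that, where the tag es-seunjeon:5.6.1 is a made-up name: build and tag the image from the Dockerfile, then give both ES services `image: es-seunjeon:5.6.1` in the compose file instead of `build: .`. The second function lists the plugins each node reports, which is one way to confirm the install took effect.

```shell
IMAGE_TAG="${IMAGE_TAG:-es-seunjeon:5.6.1}"   # hypothetical tag

# Build the image once from the Dockerfile in the current directory.
build_image() {
    docker build -t "$IMAGE_TAG" .
}

# List installed plugins per node; the seunjeon plugin should show up for each.
list_plugins() {
    curl -s "${ES_URL:-http://localhost:9200}/_cat/plugins?v"
}
```

Run `build_image` before docker-compose up, then `list_plugins` once the nodes have started.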

So does it actually work?

Without an analyzer set in the mapping, searching the data for '장사꾼' (merchant) returned 8 hits. After configuring the analyzer (it can be set on a field, on the mapping, and so on; there are several ways, see the link), a few more results come back.
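For reference, a sketch of setting the analyzer at index creation time. The tokenizer type seunjeon_tokenizer is assumed from the seunjeon README and should be checked against the plugin version in use. The function names, the mapping type doc, the field name body, and the analyzer name korean are all placeholders, and the mapping layout follows ES 5.x (one named type per mapping).

```shell
ES_URL="${ES_URL:-http://localhost:9200}"

# create_korean_index INDEX
# Creates an index whose "body" field is analyzed with the seunjeon tokenizer.
create_korean_index() {
    curl -s -XPUT "$ES_URL/$1" -H 'Content-Type: application/json' -d '{
      "settings": {
        "analysis": {
          "tokenizer": { "seunjeon": { "type": "seunjeon_tokenizer" } },
          "analyzer": { "korean": { "type": "custom", "tokenizer": "seunjeon" } }
        }
      },
      "mappings": {
        "doc": {
          "properties": {
            "body": { "type": "text", "analyzer": "korean" }
          }
        }
      }
    }'
}

# analyze INDEX TEXT
# Shows the tokens the "korean" analyzer produces (TEXT must not contain double quotes).
analyze() {
    curl -s -XPOST "$ES_URL/$1/_analyze" -H 'Content-Type: application/json' \
        -d "{\"analyzer\": \"korean\", \"text\": \"$2\"}"
}
```

The analyzer of an existing field cannot be changed in place, so documents loaded earlier have to be re-indexed into the new index before search results change.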

Conclusion

  • I realized, belatedly, that running several nodes on a single machine is only good for practice.
  • It is less trustworthy than the managed ES Service on AWS, but running nodes across several machines with something like Docker Swarm, with storage like EBS attached, might be reasonably reliable.
  • If you do not need real-time search, something like AWS Athena / GCP BigQuery is probably the best choice. Blogs these days seem to have many stories of teams that started with ElasticSearch and then moved to another approach. Even when starting small, you should store the data in at least two places, S3 and ElasticSearch, to be able to do anything with it later.
  • (Update) On this topic, the NDC18 talk "Data Engineering Stories from Durango: Wild Lands: Sharing Our Experience Building a Log System" seems to be the best-organized account. It is impressive that one person handles all of that, and Amazon is impressive too for continuing to put out good services.
  • (Update) An official Korean morphological analyzer is now available. It is said to perform better. (link)

References

  1. Server Monitoring for Stable Service Operation #2 (안정적인 서비스 운영을 위한 서버 모니터링 #2)
  2. elasticsearch-analysis-seunjeon

Published Sep 21, 2017
