Summary
Missing log messages in Elasticsearch indices when running the openshift-test.sh script.
Details
Note: the fix from PR #5 needs to be applied/merged first.
Assume the following environment variables are set:
export USE_FLUENTD=true
export NMESSAGES=110
export NPROJECTS=10
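For reference (an inference from the outputs below rather than documented behavior of the script): NPROJECTS controls how many per-project log files are generated and NMESSAGES how many messages each file contains, so the expected counts are:
# echo "$NMESSAGES messages per project index, $((NMESSAGES * NPROJECTS)) in total"
110 messages per project index, 1100 in total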
When the openshift-test.sh test is executed, it fails (times out) while verifying the expected records in the index for the last project (i.e. this-is-project-09). Specifically:
error: 110 messages not found in this-is-project-09
Further investigation reveals that this index is missing some records (note that the index this-is-project-09.1.2016.06.21 contains only 88 documents instead of 110):
# curl http://10.40.2.198:9200/_cat/indices?v
health status index pri rep docs.count docs.deleted store.size pri.store.size
yellow open this-is-project-07.1.2016.06.21 5 1 110 0 52.6kb 52.6kb
yellow open this-is-project-09.1.2016.06.21 5 1 88 0 42.6kb 42.6kb
yellow open .operations.2016.06.21 5 1 110 0 37.9kb 37.9kb
yellow open this-is-project-10.1.2016.06.21 5 1 110 0 53.2kb 53.2kb
yellow open this-is-project-06.1.2016.06.21 5 1 110 0 52.9kb 52.9kb
yellow open this-is-project-08.1.2016.06.21 5 1 110 0 53.1kb 53.1kb
yellow open this-is-project-04.1.2016.06.21 5 1 110 0 53.1kb 53.1kb
yellow open this-is-project-03.1.2016.06.21 5 1 110 0 52.6kb 52.6kb
yellow open this-is-project-02.1.2016.06.21 5 1 110 0 52.6kb 52.6kb
yellow open this-is-project-01.1.2016.06.21 5 1 110 0 53.1kb 53.1kb
yellow open this-is-project-05.1.2016.06.21 5 1 110 0 52.6kb 52.6kb
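The per-index document count can also be cross-checked directly with the _count API (a manual check against the same host and port as above; it is not part of the test script), comparing the deficient index with a healthy one:
# curl -s 'http://10.40.2.198:9200/this-is-project-09.1.2016.06.21/_count?pretty'
# curl -s 'http://10.40.2.198:9200/this-is-project-10.1.2016.06.21/_count?pretty'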
Note the sizes of the source log files in the /tmp/tmp.pKiGhuIsLw/data/docker folder below: they are all equal, which is expected and means they all contain the same number of log messages:
+ ls -R -alrtF /tmp/tmp.pKiGhuIsLw
/tmp/tmp.pKiGhuIsLw:
total 240
drwx------. 4 root root 100 Jun 21 13:47 ./
-rw-r--r--. 1 root root 243210 Jun 21 13:47 orig
drwxr-xr-x. 2 root root 80 Jun 21 13:47 config/
drwxr-xr-x. 3 root root 120 Jun 21 13:47 data/
drwxrwxrwt. 13 root root 280 Jun 21 13:50 ../
/tmp/tmp.pKiGhuIsLw/config:
total 12
drwx------. 4 root root 100 Jun 21 13:47 ../
-rw-r--r--. 1 root root 583 Jun 21 13:47 syslog-input.conf
-rw-r--r--. 1 root root 4582 Jun 21 13:47 fluent.conf
drwxr-xr-x. 2 root root 80 Jun 21 13:47 ./
/tmp/tmp.pKiGhuIsLw/data:
total 40
drwx------. 4 root root 100 Jun 21 13:47 ../
drwxr-xr-x. 2 root root 240 Jun 21 13:47 docker/
-rw-r--r--. 1 root root 30030 Jun 21 13:47 messages
-rw-r--r--. 1 root root 52 Jun 21 13:47 system.pos
drwxr-xr-x. 3 root root 120 Jun 21 13:47 ./
-rw-r--r--. 1 root root 1740 Jun 21 13:47 es-containers.log.pos
/tmp/tmp.pKiGhuIsLw/data/docker:
total 320
drwxr-xr-x. 2 root root 240 Jun 21 13:47 ./
-rw-r--r--. 1 root root 29810 Jun 21 13:47 this-is-pod-01_this-is-project-01_this-is-container-01-4355a46b19d348dc2f57c046f8ef63d4538ebb936000f3c9ee954a27460dd865.log
-rw-r--r--. 1 root root 29810 Jun 21 13:47 this-is-pod-02_this-is-project-02_this-is-container-02-53c234e5e8472b6ac51c1ae1cab3fe06fad053beb8ebfd8977b010655bfdd3c3.log
-rw-r--r--. 1 root root 29810 Jun 21 13:47 this-is-pod-03_this-is-project-03_this-is-container-03-1121cfccd5913f0a63fec40a6ffd44ea64f9dc135c66634ba001d10bcf4302a2.log
-rw-r--r--. 1 root root 29810 Jun 21 13:47 this-is-pod-04_this-is-project-04_this-is-container-04-7de1555df0c2700329e815b93b32c571c3ea54dc967b89e81ab73b9972b72d1d.log
-rw-r--r--. 1 root root 29810 Jun 21 13:47 this-is-pod-05_this-is-project-05_this-is-container-05-f0b5c2c2211c8d67ed15e75e656c7862d086e9245420892a7de62cd9ec582a06.log
-rw-r--r--. 1 root root 29810 Jun 21 13:47 this-is-pod-06_this-is-project-06_this-is-container-06-06e9d52c1720fca412803e3b07c4b228ff113e303f4c7ab94665319d832bbfb7.log
-rw-r--r--. 1 root root 29810 Jun 21 13:47 this-is-pod-07_this-is-project-07_this-is-container-07-10159baf262b43a92d95db59dae1f72c645127301661e0a3ce4e38b295a97c58.log
-rw-r--r--. 1 root root 29810 Jun 21 13:47 this-is-pod-08_this-is-project-08_this-is-container-08-aa67a169b0bba217aa0aa88a65346920c84c42447c36ba5f7ea65f422c1fe5d8.log
-rw-r--r--. 1 root root 29810 Jun 21 13:47 this-is-pod-09_this-is-project-09_this-is-container-09-2e6d31a5983a91251bfae5aefa1c0a19d8ba3cf601d0e8a706b4cfa9661a6b8a.log
-rw-r--r--. 1 root root 29810 Jun 21 13:47 this-is-pod-10_this-is-project-10_this-is-container-10-917df3320d778ddbaa5c5c7742bc4046bf803c36ed2b050f30844ed206783469.log
drwxr-xr-x. 3 root root 120 Jun 21 13:47 ../
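Since every file is exactly 29810 bytes and each docker JSON log record occupies one line, a quick sanity check (not part of the test script) is to confirm the line counts match as well; all ten files should report the same number:
# wc -l /tmp/tmp.pKiGhuIsLw/data/docker/*.log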
Further, we can see that there was a possible issue when pushing the data to Elasticsearch: there was 1 rejected bulk request (see the bulk.rejected column):
# curl http://10.40.2.198:9200/_cat/thread_pool?v
host ip bulk.active bulk.queue bulk.rejected index.active index.queue index.rejected search.active search.queue search.rejected
15dfd31fb381 172.17.0.2 0 0 1 0 0 0 0 0 0
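In Elasticsearch 1.x, a bulk request is rejected when the bulk thread pool and its queue (50 entries by default) are full, and the documents in a rejected request are not indexed unless the client retries them. The configured pool and queue sizes can be inspected via the nodes info API, and thread pool settings were still dynamically updatable in 1.x, so the queue could be enlarged as an experiment (the value 500 below is an arbitrary example, not a recommendation):
# curl -s 'http://10.40.2.198:9200/_nodes/thread_pool?pretty'
# curl -X PUT 'http://10.40.2.198:9200/_cluster/settings' -d '{"transient":{"threadpool.bulk.queue_size":500}}'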
However, the Elasticsearch log does not show any errors. We can see that after the cluster started, the expected indices were created and their mappings were updated as documents were indexed. That is all:
# docker logs --tail=2000 15dfd31fb381
CLUSTER_NAME not set - using viaq
+ set -e
+ PORT=9200
+ HOST=localhost
+ LOG_FILE=elasticsearch_connect_log.txt
+ RETRY_COUNT=30
+ RETRY_INTERVAL=1
+ retry=30
+ max_time=30
+ timeouted=false
+ '[' -z '' ']'
+ echo CLUSTER_NAME not set - using viaq
+ export CLUSTER_NAME=viaq
+ CLUSTER_NAME=viaq
+ mkdir -p /elasticsearch/viaq
+ '[' -n '' ']'
+ regex='^([[:digit:]]+)([GgMm])$'
+ [[ 8G =~ ^([[:digit:]]+)([GgMm])$ ]]
+ num=8
+ unit=G
+ [[ G =~ [Gg] ]]
+ (( num = num * 1024 ))
+ [[ 8192 -lt 512 ]]
+ ES_JAVA_OPTS=' -Xms256M -Xmx4096m'
+ /usr/share/elasticsearch/bin/elasticsearch
+ add_index_template
+ wait_for_port_open
+ rm -f elasticsearch_connect_log.txt
+ echo -n 'Checking if Elasticsearch is ready on localhost:9200 '
+ curl -s --max-time 30 -o elasticsearch_connect_log.txt localhost:9200
+ '[' false == false ']'
+ echo -n .
+ sleep 1
Checking if Elasticsearch is ready on localhost:9200 .[2016-06-21 11:47:32,270][INFO ][node ] [Mister Sensitive] version[1.5.2], pid[11], build[62ff986/2015-04-27T09:21:06Z]
[2016-06-21 11:47:32,271][INFO ][node ] [Mister Sensitive] initializing ...
[2016-06-21 11:47:32,275][INFO ][plugins ] [Mister Sensitive] loaded [], sites []
+ (( retry -= 1 ))
+ (( retry == 0 ))
+ curl -s --max-time 30 -o elasticsearch_connect_log.txt localhost:9200
+ '[' false == false ']'
+ echo -n .
+ sleep 1
+ (( retry -= 1 ))
+ (( retry == 0 ))
+ curl -s --max-time 30 -o elasticsearch_connect_log.txt localhost:9200
+ '[' false == false ']'
+ echo -n .
+ sleep 1
..[2016-06-21 11:47:34,351][INFO ][node ] [Mister Sensitive] initialized
[2016-06-21 11:47:34,353][INFO ][node ] [Mister Sensitive] starting ...
[2016-06-21 11:47:34,487][INFO ][transport ] [Mister Sensitive] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/172.17.0.2:9300]}
[2016-06-21 11:47:34,499][INFO ][discovery ] [Mister Sensitive] elasticsearch/ABy2trHPR2-4ve1jp0SiEQ
+ (( retry -= 1 ))
+ (( retry == 0 ))
+ curl -s --max-time 30 -o elasticsearch_connect_log.txt localhost:9200
+ '[' false == false ']'
+ echo -n .
+ sleep 1
+ (( retry -= 1 ))
+ (( retry == 0 ))
+ curl -s --max-time 30 -o elasticsearch_connect_log.txt localhost:9200
+ '[' false == false ']'
+ echo -n .
+ sleep 1
+ (( retry -= 1 ))
+ (( retry == 0 ))
+ curl -s --max-time 30 -o elasticsearch_connect_log.txt localhost:9200
+ '[' false == false ']'
+ echo -n .
+ sleep 1
...[2016-06-21 11:47:37,520][INFO ][cluster.service ] [Mister Sensitive] new_master [Mister Sensitive][ABy2trHPR2-4ve1jp0SiEQ][15dfd31fb381][inet[/172.17.0.2:9300]], reason: zen-disco-join (elected_as_master)
[2016-06-21 11:47:37,545][INFO ][http ] [Mister Sensitive] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/172.17.0.2:9200]}
[2016-06-21 11:47:37,545][INFO ][node ] [Mister Sensitive] started
[2016-06-21 11:47:37,595][INFO ][gateway ] [Mister Sensitive] recovered [0] indices into cluster_state
+ (( retry -= 1 ))
+ (( retry == 0 ))
+ curl -s --max-time 30 -o elasticsearch_connect_log.txt localhost:9200
+ '[' -f elasticsearch_connect_log.txt ']'
+ grep -q 200 elasticsearch_connect_log.txt
+ echo '- connection successful'
- connection successful
+ curl -v -X GET 'http://localhost:9200/_cluster/health?wait_for_status=yellow&timeout=30s'
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0* About to connect() to localhost port 9200 (#0)
* Trying ::1...
* Connected to localhost (::1) port 9200 (#0)
> GET /_cluster/health?wait_for_status=yellow&timeout=30s HTTP/1.1
> User-Agent: curl/7.29.0
> Host: localhost:9200
> Accept: */*
>
< HTTP/1.1 200 OK
< Content-Type: application/json; charset=UTF-8
< Content-Length: 252
<
{ [data not shown]
100 252 100 252 0 0 15247 0 --:--:-- --:--:-- --:--:-- 14823
* Connection #0 to host localhost left intact
+ curl -v -X PUT -d@/usr/share/elasticsearch/config/com.redhat.viaq-template.json http://localhost:9200/_template/viaq
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0* About to connect() to localhost port 9200 (#0)
* Trying ::1...
* Connected to localhost (::1) port 9200 (#0)
> PUT /_template/viaq HTTP/1.1
> User-Agent: curl/7.29.0
> Host: localhost:9200
> Accept: */*
> Content-Length: 7320
> Content-Type: application/x-www-form-urlencoded
> Expect: 100-continue
>
< HTTP/1.1 100 Continue
} [data not shown]
< HTTP/1.1 200 OK
< Content-Type: application/json; charset=UTF-8
< Content-Length: 21
<
{ [data not shown]
100 7341 100 21 100 7320 246 86075 --:--:-- --:--:-- --:--:-- 87142
* Connection #0 to host localhost left intact
{"cluster_name":"elasticsearch","status":"green","timed_out":false,"number_of_nodes":1,"number_of_data_nodes":1,"active_primary_shards":0,"active_shards":0,"relocating_shards":0,"initializing_shards":0,"unassigned_shards":0,"number_of_pending_tasks":0}{"acknowledged":true}[2016-06-21 11:47:49,686][INFO ][cluster.metadata ] [Mister Sensitive] [.operations.2016.06.21] creating index, cause [auto(bulk api)], templates [], shards [5]/[1], mappings [fluentd]
[2016-06-21 11:47:49,881][INFO ][cluster.metadata ] [Mister Sensitive] [this-is-project-02.1.2016.06.21] creating index, cause [auto(bulk api)], templates [], shards [5]/[1], mappings [fluentd]
[2016-06-21 11:47:49,963][INFO ][cluster.metadata ] [Mister Sensitive] [this-is-project-07.1.2016.06.21] creating index, cause [auto(bulk api)], templates [], shards [5]/[1], mappings [fluentd]
[2016-06-21 11:47:50,030][INFO ][cluster.metadata ] [Mister Sensitive] [this-is-project-01.1.2016.06.21] creating index, cause [auto(bulk api)], templates [], shards [5]/[1], mappings [fluentd]
[2016-06-21 11:47:50,088][INFO ][cluster.metadata ] [Mister Sensitive] [this-is-project-10.1.2016.06.21] creating index, cause [auto(bulk api)], templates [], shards [5]/[1], mappings [fluentd]
[2016-06-21 11:47:50,134][INFO ][cluster.metadata ] [Mister Sensitive] [this-is-project-05.1.2016.06.21] creating index, cause [auto(bulk api)], templates [], shards [5]/[1], mappings [fluentd]
[2016-06-21 11:47:50,184][INFO ][cluster.metadata ] [Mister Sensitive] [this-is-project-06.1.2016.06.21] creating index, cause [auto(bulk api)], templates [], shards [5]/[1], mappings [fluentd]
[2016-06-21 11:47:50,243][INFO ][cluster.metadata ] [Mister Sensitive] [this-is-project-03.1.2016.06.21] creating index, cause [auto(bulk api)], templates [], shards [5]/[1], mappings [fluentd]
[2016-06-21 11:47:50,293][INFO ][cluster.metadata ] [Mister Sensitive] [this-is-project-08.1.2016.06.21] creating index, cause [auto(bulk api)], templates [], shards [5]/[1], mappings [fluentd]
[2016-06-21 11:47:50,374][INFO ][cluster.metadata ] [Mister Sensitive] [this-is-project-09.1.2016.06.21] creating index, cause [auto(bulk api)], templates [], shards [5]/[1], mappings [fluentd]
[2016-06-21 11:47:50,432][INFO ][cluster.metadata ] [Mister Sensitive] [this-is-project-04.1.2016.06.21] creating index, cause [auto(bulk api)], templates [], shards [5]/[1], mappings [fluentd]
[2016-06-21 11:47:53,609][INFO ][cluster.metadata ] [Mister Sensitive] [.operations.2016.06.21] update_mapping [fluentd] (dynamic)
[2016-06-21 11:47:53,695][INFO ][cluster.metadata ] [Mister Sensitive] [this-is-project-05.1.2016.06.21] update_mapping [fluentd] (dynamic)
[2016-06-21 11:47:53,789][INFO ][cluster.metadata ] [Mister Sensitive] [this-is-project-01.1.2016.06.21] update_mapping [fluentd] (dynamic)
[2016-06-21 11:47:53,868][INFO ][cluster.metadata ] [Mister Sensitive] [this-is-project-06.1.2016.06.21] update_mapping [fluentd] (dynamic)
[2016-06-21 11:47:53,933][INFO ][cluster.metadata ] [Mister Sensitive] [this-is-project-02.1.2016.06.21] update_mapping [fluentd] (dynamic)
[2016-06-21 11:47:53,981][INFO ][cluster.metadata ] [Mister Sensitive] [this-is-project-07.1.2016.06.21] update_mapping [fluentd] (dynamic)
[2016-06-21 11:47:54,035][INFO ][cluster.metadata ] [Mister Sensitive] [this-is-project-08.1.2016.06.21] update_mapping [fluentd] (dynamic)
[2016-06-21 11:47:54,084][INFO ][cluster.metadata ] [Mister Sensitive] [this-is-project-03.1.2016.06.21] update_mapping [fluentd] (dynamic)
[2016-06-21 11:47:54,146][INFO ][cluster.metadata ] [Mister Sensitive] [this-is-project-10.1.2016.06.21] update_mapping [fluentd] (dynamic)
[2016-06-21 11:47:54,148][INFO ][cluster.metadata ] [Mister Sensitive] [this-is-project-04.1.2016.06.21] update_mapping [fluentd] (dynamic)
[2016-06-21 11:47:54,252][INFO ][cluster.metadata ] [Mister Sensitive] [this-is-project-09.1.2016.06.21] update_mapping [fluentd] (dynamic)
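Nothing in the log above indicates a server-side error, which is consistent with a queue-level bulk rejection: Elasticsearch reports the rejection inside the bulk response itself (as a per-item status), so nothing is logged by the server. The shortfall of 22 documents in this-is-project-09 (110 expected, 88 indexed) plausibly corresponds to the single rejected bulk request, which would mean the fluentd side did not retry the rejected chunk. If that hypothesis holds, smaller chunks and more patient retries might avoid the drops; a sketch using standard fluentd v0.12 buffered-output parameters (example values, not taken from this test's fluent.conf):
<match **>
  @type elasticsearch
  host 10.40.2.198
  port 9200
  # smaller chunks put fewer shard-level bulk tasks on the ES queue at once
  buffer_chunk_limit 1m
  flush_interval 5s
  # retries apply to transport-level failures; per-item bulk rejections may
  # still be dropped depending on the fluent-plugin-elasticsearch version
  retry_limit 10
  retry_wait 1s
</match>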