Missing records from project logs #6

@lukas-vlcek

Description

Summary

Log messages are missing from Elasticsearch indices when running the openshift-test.sh script.

Details

Note: the fix from PR #5 needs to be applied/merged first.

Assume the following ENV variables:

export USE_FLUENTD=true
export NMESSAGES=110
export NPROJECTS=10

When openshift-test.sh is executed, it fails (times out) while verifying the expected records in the index for the last project (i.e. project-09). Specifically:

error: 110 messages not found in this-is-project-09

Further investigation reveals that this index is missing some records (note that the index this-is-project-09.1.2016.06.21 contains only 88 documents instead of 110):

# curl http://10.40.2.198:9200/_cat/indices?v
health status index                           pri rep docs.count docs.deleted store.size pri.store.size 
yellow open   this-is-project-07.1.2016.06.21   5   1        110            0     52.6kb         52.6kb 
yellow open   this-is-project-09.1.2016.06.21   5   1         88            0     42.6kb         42.6kb 
yellow open   .operations.2016.06.21            5   1        110            0     37.9kb         37.9kb 
yellow open   this-is-project-10.1.2016.06.21   5   1        110            0     53.2kb         53.2kb 
yellow open   this-is-project-06.1.2016.06.21   5   1        110            0     52.9kb         52.9kb 
yellow open   this-is-project-08.1.2016.06.21   5   1        110            0     53.1kb         53.1kb 
yellow open   this-is-project-04.1.2016.06.21   5   1        110            0     53.1kb         53.1kb 
yellow open   this-is-project-03.1.2016.06.21   5   1        110            0     52.6kb         52.6kb 
yellow open   this-is-project-02.1.2016.06.21   5   1        110            0     52.6kb         52.6kb 
yellow open   this-is-project-01.1.2016.06.21   5   1        110            0     53.1kb         53.1kb 
yellow open   this-is-project-05.1.2016.06.21   5   1        110            0     52.6kb         52.6kb 
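
The mismatch can be spotted mechanically. The following is a minimal sketch (not part of the test script): it runs awk over the _cat/indices listing and flags any project index whose docs.count differs from NMESSAGES. A shortened sample of the output above is embedded here so the sketch is self-contained; in a live run the input would come from the curl command shown above.

```shell
# Flag project indices whose docs.count differs from NMESSAGES.
# In a live setup the listing would come from:
#   curl -s 'http://10.40.2.198:9200/_cat/indices?v'
NMESSAGES=110

sample_indices() {
  cat <<'EOF'
health status index                           pri rep docs.count docs.deleted store.size pri.store.size
yellow open   this-is-project-07.1.2016.06.21   5   1        110            0     52.6kb         52.6kb
yellow open   this-is-project-09.1.2016.06.21   5   1         88            0     42.6kb         42.6kb
yellow open   this-is-project-10.1.2016.06.21   5   1        110            0     53.2kb         53.2kb
EOF
}

find_short_indices() {
  # Column 3 is the index name, column 6 is docs.count (with the ?v header row).
  sample_indices | awk -v expected="$1" '
    $3 ~ /^this-is-project-/ && $6 != expected {
      printf "%s has %s docs, expected %s\n", $3, $6, expected
    }'
}

find_short_indices "$NMESSAGES"
# -> this-is-project-09.1.2016.06.21 has 88 docs, expected 110
```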

Note the sizes of the source log files in the /tmp/tmp.pKiGhuIsLw/data/docker folder below: they are all equal, which is expected and means they all contain the same number of log messages:

+ ls -R -alrtF /tmp/tmp.pKiGhuIsLw
/tmp/tmp.pKiGhuIsLw:
total 240
drwx------.  4 root root    100 Jun 21 13:47 ./
-rw-r--r--.  1 root root 243210 Jun 21 13:47 orig
drwxr-xr-x.  2 root root     80 Jun 21 13:47 config/
drwxr-xr-x.  3 root root    120 Jun 21 13:47 data/
drwxrwxrwt. 13 root root    280 Jun 21 13:50 ../

/tmp/tmp.pKiGhuIsLw/config:
total 12
drwx------. 4 root root  100 Jun 21 13:47 ../
-rw-r--r--. 1 root root  583 Jun 21 13:47 syslog-input.conf
-rw-r--r--. 1 root root 4582 Jun 21 13:47 fluent.conf
drwxr-xr-x. 2 root root   80 Jun 21 13:47 ./

/tmp/tmp.pKiGhuIsLw/data:
total 40
drwx------. 4 root root   100 Jun 21 13:47 ../
drwxr-xr-x. 2 root root   240 Jun 21 13:47 docker/
-rw-r--r--. 1 root root 30030 Jun 21 13:47 messages
-rw-r--r--. 1 root root    52 Jun 21 13:47 system.pos
drwxr-xr-x. 3 root root   120 Jun 21 13:47 ./
-rw-r--r--. 1 root root  1740 Jun 21 13:47 es-containers.log.pos

/tmp/tmp.pKiGhuIsLw/data/docker:
total 320
drwxr-xr-x. 2 root root   240 Jun 21 13:47 ./
-rw-r--r--. 1 root root 29810 Jun 21 13:47 this-is-pod-01_this-is-project-01_this-is-container-01-4355a46b19d348dc2f57c046f8ef63d4538ebb936000f3c9ee954a27460dd865.log
-rw-r--r--. 1 root root 29810 Jun 21 13:47 this-is-pod-02_this-is-project-02_this-is-container-02-53c234e5e8472b6ac51c1ae1cab3fe06fad053beb8ebfd8977b010655bfdd3c3.log
-rw-r--r--. 1 root root 29810 Jun 21 13:47 this-is-pod-03_this-is-project-03_this-is-container-03-1121cfccd5913f0a63fec40a6ffd44ea64f9dc135c66634ba001d10bcf4302a2.log
-rw-r--r--. 1 root root 29810 Jun 21 13:47 this-is-pod-04_this-is-project-04_this-is-container-04-7de1555df0c2700329e815b93b32c571c3ea54dc967b89e81ab73b9972b72d1d.log
-rw-r--r--. 1 root root 29810 Jun 21 13:47 this-is-pod-05_this-is-project-05_this-is-container-05-f0b5c2c2211c8d67ed15e75e656c7862d086e9245420892a7de62cd9ec582a06.log
-rw-r--r--. 1 root root 29810 Jun 21 13:47 this-is-pod-06_this-is-project-06_this-is-container-06-06e9d52c1720fca412803e3b07c4b228ff113e303f4c7ab94665319d832bbfb7.log
-rw-r--r--. 1 root root 29810 Jun 21 13:47 this-is-pod-07_this-is-project-07_this-is-container-07-10159baf262b43a92d95db59dae1f72c645127301661e0a3ce4e38b295a97c58.log
-rw-r--r--. 1 root root 29810 Jun 21 13:47 this-is-pod-08_this-is-project-08_this-is-container-08-aa67a169b0bba217aa0aa88a65346920c84c42447c36ba5f7ea65f422c1fe5d8.log
-rw-r--r--. 1 root root 29810 Jun 21 13:47 this-is-pod-09_this-is-project-09_this-is-container-09-2e6d31a5983a91251bfae5aefa1c0a19d8ba3cf601d0e8a706b4cfa9661a6b8a.log
-rw-r--r--. 1 root root 29810 Jun 21 13:47 this-is-pod-10_this-is-project-10_this-is-container-10-917df3320d778ddbaa5c5c7742bc4046bf803c36ed2b050f30844ed206783469.log
drwxr-xr-x. 3 root root   120 Jun 21 13:47 ../
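
Assuming one log message per line (an assumption about how the test generator writes these files, not something confirmed above), the completeness of the source files can also be checked by line count rather than byte size. A self-contained sketch, with two generated sample files standing in for the real /tmp/tmp.pKiGhuIsLw/data/docker/*.log:

```shell
# Sanity check of the source files: assuming one log message per line,
# every file should contain exactly NMESSAGES lines. In the live run this
# would scan /tmp/tmp.pKiGhuIsLw/data/docker/*.log; here two generated
# sample files stand in so the sketch is self-contained.
NMESSAGES=110

dir=$(mktemp -d)
for p in 01 02; do
  seq "$NMESSAGES" | sed 's/^/log line /' > "$dir/this-is-pod-$p.log"
done

count_lines() {
  # Report any .log file in $1 that does not hold exactly NMESSAGES lines.
  for f in "$1"/*.log; do
    lines=$(wc -l < "$f" | tr -d ' ')
    if [ "$lines" -ne "$NMESSAGES" ]; then
      echo "SHORT: $f has $lines lines"
    fi
  done
}

count_lines "$dir"   # prints nothing: both sample files are complete
rm -r "$dir"
```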

Furthermore, we can see that there was a possible issue when pushing the data to Elasticsearch: the bulk.rejected column below shows 1 rejected bulk request. This is consistent with the symptom, since a single rejected bulk request could account for the 22 missing documents (110 - 88) in this-is-project-09:

# curl http://10.40.2.198:9200/_cat/thread_pool?v
host         ip         bulk.active bulk.queue bulk.rejected index.active index.queue index.rejected search.active search.queue search.rejected 
15dfd31fb381 172.17.0.2           0          0             1            0           0              0             0            0               0 
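
For context on why a rejected bulk request can silently lose documents: Elasticsearch answers a partially rejected bulk request with HTTP 200, "errors": true, and per-item error entries (the exact status code and error wording vary by version). If the client does not inspect and retry these item-level failures, the rejected documents are simply dropped. A sketch with an illustrative, hand-written bulk response (not captured from this run):

```shell
# A rejected bulk request does not fail the whole HTTP call: the response
# is HTTP 200 with "errors": true, and only the affected items carry an
# "error" entry. Clients that ignore item-level errors lose those documents.
bulk_response() {
  cat <<'EOF'
{"took":12,"errors":true,"items":[
{"create":{"_index":"this-is-project-09.1.2016.06.21","_type":"fluentd","status":201}},
{"create":{"_index":"this-is-project-09.1.2016.06.21","_type":"fluentd","status":429,"error":"EsRejectedExecutionException[rejected execution (queue capacity 50)]"}}
]}
EOF
}

count_rejected() {
  # Count items that carry an "error" entry (one item per line above).
  bulk_response | grep -c '"error"'
}

count_rejected
# -> 1
```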

However, the Elasticsearch log does not show any errors. We can see that after the cluster started, the expected indices were created and their mappings were updated as documents were indexed. That is all:

# docker logs --tail=2000 15dfd31fb381
CLUSTER_NAME not set - using viaq
+ set -e
+ PORT=9200
+ HOST=localhost
+ LOG_FILE=elasticsearch_connect_log.txt
+ RETRY_COUNT=30
+ RETRY_INTERVAL=1
+ retry=30
+ max_time=30
+ timeouted=false
+ '[' -z '' ']'
+ echo CLUSTER_NAME not set - using viaq
+ export CLUSTER_NAME=viaq
+ CLUSTER_NAME=viaq
+ mkdir -p /elasticsearch/viaq
+ '[' -n '' ']'
+ regex='^([[:digit:]]+)([GgMm])$'
+ [[ 8G =~ ^([[:digit:]]+)([GgMm])$ ]]
+ num=8
+ unit=G
+ [[ G =~ [Gg] ]]
+ (( num = num * 1024 ))
+ [[ 8192 -lt 512 ]]
+ ES_JAVA_OPTS=' -Xms256M -Xmx4096m'
+ /usr/share/elasticsearch/bin/elasticsearch
+ add_index_template
+ wait_for_port_open
+ rm -f elasticsearch_connect_log.txt
+ echo -n 'Checking if Elasticsearch is ready on localhost:9200 '
+ curl -s --max-time 30 -o elasticsearch_connect_log.txt localhost:9200
+ '[' false == false ']'
+ echo -n .
+ sleep 1
Checking if Elasticsearch is ready on localhost:9200 .[2016-06-21 11:47:32,270][INFO ][node                     ] [Mister Sensitive] version[1.5.2], pid[11], build[62ff986/2015-04-27T09:21:06Z]
[2016-06-21 11:47:32,271][INFO ][node                     ] [Mister Sensitive] initializing ...
[2016-06-21 11:47:32,275][INFO ][plugins                  ] [Mister Sensitive] loaded [], sites []
+ ((  retry -= 1  ))
+ ((  retry == 0  ))
+ curl -s --max-time 30 -o elasticsearch_connect_log.txt localhost:9200
+ '[' false == false ']'
+ echo -n .
+ sleep 1
+ ((  retry -= 1  ))
+ ((  retry == 0  ))
+ curl -s --max-time 30 -o elasticsearch_connect_log.txt localhost:9200
+ '[' false == false ']'
+ echo -n .
+ sleep 1
..[2016-06-21 11:47:34,351][INFO ][node                     ] [Mister Sensitive] initialized
[2016-06-21 11:47:34,353][INFO ][node                     ] [Mister Sensitive] starting ...
[2016-06-21 11:47:34,487][INFO ][transport                ] [Mister Sensitive] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/172.17.0.2:9300]}
[2016-06-21 11:47:34,499][INFO ][discovery                ] [Mister Sensitive] elasticsearch/ABy2trHPR2-4ve1jp0SiEQ
+ ((  retry -= 1  ))
+ ((  retry == 0  ))
+ curl -s --max-time 30 -o elasticsearch_connect_log.txt localhost:9200
+ '[' false == false ']'
+ echo -n .
+ sleep 1
+ ((  retry -= 1  ))
+ ((  retry == 0  ))
+ curl -s --max-time 30 -o elasticsearch_connect_log.txt localhost:9200
+ '[' false == false ']'
+ echo -n .
+ sleep 1
+ ((  retry -= 1  ))
+ ((  retry == 0  ))
+ curl -s --max-time 30 -o elasticsearch_connect_log.txt localhost:9200
+ '[' false == false ']'
+ echo -n .
+ sleep 1
...[2016-06-21 11:47:37,520][INFO ][cluster.service          ] [Mister Sensitive] new_master [Mister Sensitive][ABy2trHPR2-4ve1jp0SiEQ][15dfd31fb381][inet[/172.17.0.2:9300]], reason: zen-disco-join (elected_as_master)
[2016-06-21 11:47:37,545][INFO ][http                     ] [Mister Sensitive] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/172.17.0.2:9200]}
[2016-06-21 11:47:37,545][INFO ][node                     ] [Mister Sensitive] started
[2016-06-21 11:47:37,595][INFO ][gateway                  ] [Mister Sensitive] recovered [0] indices into cluster_state
+ ((  retry -= 1  ))
+ ((  retry == 0  ))
+ curl -s --max-time 30 -o elasticsearch_connect_log.txt localhost:9200
+ '[' -f elasticsearch_connect_log.txt ']'
+ grep -q 200 elasticsearch_connect_log.txt
+ echo '- connection successful'
- connection successful
+ curl -v -X GET 'http://localhost:9200/_cluster/health?wait_for_status=yellow&timeout=30s'
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0* About to connect() to localhost port 9200 (#0)
*   Trying ::1...
* Connected to localhost (::1) port 9200 (#0)
> GET /_cluster/health?wait_for_status=yellow&timeout=30s HTTP/1.1
> User-Agent: curl/7.29.0
> Host: localhost:9200
> Accept: */*
> 
< HTTP/1.1 200 OK
< Content-Type: application/json; charset=UTF-8
< Content-Length: 252
< 
{ [data not shown]
100   252  100   252    0     0  15247      0 --:--:-- --:--:-- --:--:-- 14823
* Connection #0 to host localhost left intact
+ curl -v -X PUT -d@/usr/share/elasticsearch/config/com.redhat.viaq-template.json http://localhost:9200/_template/viaq
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0* About to connect() to localhost port 9200 (#0)
*   Trying ::1...
* Connected to localhost (::1) port 9200 (#0)
> PUT /_template/viaq HTTP/1.1
> User-Agent: curl/7.29.0
> Host: localhost:9200
> Accept: */*
> Content-Length: 7320
> Content-Type: application/x-www-form-urlencoded
> Expect: 100-continue
> 
< HTTP/1.1 100 Continue
} [data not shown]
< HTTP/1.1 200 OK
< Content-Type: application/json; charset=UTF-8
< Content-Length: 21
< 
{ [data not shown]
100  7341  100    21  100  7320    246  86075 --:--:-- --:--:-- --:--:-- 87142
* Connection #0 to host localhost left intact
{"cluster_name":"elasticsearch","status":"green","timed_out":false,"number_of_nodes":1,"number_of_data_nodes":1,"active_primary_shards":0,"active_shards":0,"relocating_shards":0,"initializing_shards":0,"unassigned_shards":0,"number_of_pending_tasks":0}{"acknowledged":true}[2016-06-21 11:47:49,686][INFO ][cluster.metadata         ] [Mister Sensitive] [.operations.2016.06.21] creating index, cause [auto(bulk api)], templates [], shards [5]/[1], mappings [fluentd]
[2016-06-21 11:47:49,881][INFO ][cluster.metadata         ] [Mister Sensitive] [this-is-project-02.1.2016.06.21] creating index, cause [auto(bulk api)], templates [], shards [5]/[1], mappings [fluentd]
[2016-06-21 11:47:49,963][INFO ][cluster.metadata         ] [Mister Sensitive] [this-is-project-07.1.2016.06.21] creating index, cause [auto(bulk api)], templates [], shards [5]/[1], mappings [fluentd]
[2016-06-21 11:47:50,030][INFO ][cluster.metadata         ] [Mister Sensitive] [this-is-project-01.1.2016.06.21] creating index, cause [auto(bulk api)], templates [], shards [5]/[1], mappings [fluentd]
[2016-06-21 11:47:50,088][INFO ][cluster.metadata         ] [Mister Sensitive] [this-is-project-10.1.2016.06.21] creating index, cause [auto(bulk api)], templates [], shards [5]/[1], mappings [fluentd]
[2016-06-21 11:47:50,134][INFO ][cluster.metadata         ] [Mister Sensitive] [this-is-project-05.1.2016.06.21] creating index, cause [auto(bulk api)], templates [], shards [5]/[1], mappings [fluentd]
[2016-06-21 11:47:50,184][INFO ][cluster.metadata         ] [Mister Sensitive] [this-is-project-06.1.2016.06.21] creating index, cause [auto(bulk api)], templates [], shards [5]/[1], mappings [fluentd]
[2016-06-21 11:47:50,243][INFO ][cluster.metadata         ] [Mister Sensitive] [this-is-project-03.1.2016.06.21] creating index, cause [auto(bulk api)], templates [], shards [5]/[1], mappings [fluentd]
[2016-06-21 11:47:50,293][INFO ][cluster.metadata         ] [Mister Sensitive] [this-is-project-08.1.2016.06.21] creating index, cause [auto(bulk api)], templates [], shards [5]/[1], mappings [fluentd]
[2016-06-21 11:47:50,374][INFO ][cluster.metadata         ] [Mister Sensitive] [this-is-project-09.1.2016.06.21] creating index, cause [auto(bulk api)], templates [], shards [5]/[1], mappings [fluentd]
[2016-06-21 11:47:50,432][INFO ][cluster.metadata         ] [Mister Sensitive] [this-is-project-04.1.2016.06.21] creating index, cause [auto(bulk api)], templates [], shards [5]/[1], mappings [fluentd]
[2016-06-21 11:47:53,609][INFO ][cluster.metadata         ] [Mister Sensitive] [.operations.2016.06.21] update_mapping [fluentd] (dynamic)
[2016-06-21 11:47:53,695][INFO ][cluster.metadata         ] [Mister Sensitive] [this-is-project-05.1.2016.06.21] update_mapping [fluentd] (dynamic)
[2016-06-21 11:47:53,789][INFO ][cluster.metadata         ] [Mister Sensitive] [this-is-project-01.1.2016.06.21] update_mapping [fluentd] (dynamic)
[2016-06-21 11:47:53,868][INFO ][cluster.metadata         ] [Mister Sensitive] [this-is-project-06.1.2016.06.21] update_mapping [fluentd] (dynamic)
[2016-06-21 11:47:53,933][INFO ][cluster.metadata         ] [Mister Sensitive] [this-is-project-02.1.2016.06.21] update_mapping [fluentd] (dynamic)
[2016-06-21 11:47:53,981][INFO ][cluster.metadata         ] [Mister Sensitive] [this-is-project-07.1.2016.06.21] update_mapping [fluentd] (dynamic)
[2016-06-21 11:47:54,035][INFO ][cluster.metadata         ] [Mister Sensitive] [this-is-project-08.1.2016.06.21] update_mapping [fluentd] (dynamic)
[2016-06-21 11:47:54,084][INFO ][cluster.metadata         ] [Mister Sensitive] [this-is-project-03.1.2016.06.21] update_mapping [fluentd] (dynamic)
[2016-06-21 11:47:54,146][INFO ][cluster.metadata         ] [Mister Sensitive] [this-is-project-10.1.2016.06.21] update_mapping [fluentd] (dynamic)
[2016-06-21 11:47:54,148][INFO ][cluster.metadata         ] [Mister Sensitive] [this-is-project-04.1.2016.06.21] update_mapping [fluentd] (dynamic)
[2016-06-21 11:47:54,252][INFO ][cluster.metadata         ] [Mister Sensitive] [this-is-project-09.1.2016.06.21] update_mapping [fluentd] (dynamic)
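
The "no errors" observation can be double-checked by grepping the container log for levels above INFO. Note that a bulk rejection is reported to the client in the bulk response and does not necessarily produce a server-side WARN/ERROR line, so a clean server log is consistent with the bulk.rejected counter above. A sketch, with two sample lines standing in for the real docker logs output:

```shell
# Grep the container log for WARN/ERROR lines. In the live setup the input
# would be `docker logs 15dfd31fb381`; sample INFO lines stand in here.
es_log() {
  cat <<'EOF'
[2016-06-21 11:47:34,353][INFO ][node                     ] [Mister Sensitive] starting ...
[2016-06-21 11:47:37,545][INFO ][node                     ] [Mister Sensitive] started
EOF
}

# grep exits non-zero when nothing matches, so fall through to a message.
es_log | grep -E 'WARN|ERROR' || echo "no warnings or errors logged"
```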
