Elasticsearch
5.1 Install Python Elasticsearch Client
5.2 Install git and clone the repository
5.3 Put the movie_db data onto the Elasticsearch cluster
5.4 Perform a simple search in python
5.5 Additional notes
Elasticsearch is a distributed NoSQL JSON document database built on Apache Lucene. It provides a full-text search service and is used extensively by websites such as Quora, GitHub, Stack Exchange and many more. The RESTful API provides a simple-to-use interface to the distributed database, allowing straightforward integration with websites. In this dev-op we will deploy Elasticsearch on an AWS cluster and perform a simple query.
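To make the RESTful interface concrete, here is a minimal Python 3 sketch that builds a full-text `match` query for the `_search` endpoint using only the standard library. The host, the `movie_db` index name, the `title` field and the search text are placeholder assumptions for illustration; sending the request requires a running cluster, so the example only constructs it.

```python
import json
import urllib.request


def build_search_request(host, index, field, text, port=9200):
    """Build (but do not send) a full-text match query for the _search endpoint."""
    url = "http://{}:{}/{}/_search".format(host, port, index)
    body = {"query": {"match": {field: text}}}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_search(request):
    """Send the request and return the decoded JSON response (needs a live cluster)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Hypothetical values: host, index, field and search text are placeholders.
request = build_search_request("localhost", "movie_db", "title", "star wars")
print(request.get_method(), request.full_url)
```

Calling `run_search(request)` against a reachable node would return the search response as a Python dictionary.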
We recommend using t2.micro instances with Ubuntu Server 14.04 LTS (HVM), SSD Volume Type, to take advantage of Amazon’s Free Tier program. Be sure to terminate the instances when you are finished so that you are not charged for going over the Free Tier limit of 750 instance-hours per month. For practice you can try spinning up 3 nodes for Elasticsearch.
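As a rough guide to how quickly a multi-node cluster uses up the free allowance, the sketch below estimates how many days a cluster can run continuously before reaching the monthly limit. It assumes the allowance is counted in instance-hours pooled across all running instances and defaults to 750 hours per month; check both assumptions against the current AWS terms.

```python
def days_until_limit(node_count, monthly_limit_hours=750.0):
    """Days of continuous running before the pooled instance-hour limit is reached."""
    hours_per_day = node_count * 24  # every node accrues 24 instance-hours per day
    return monthly_limit_hours / hours_per_day


for nodes in (1, 3):
    print("{} node(s): {:.2f} days".format(nodes, days_until_limit(nodes)))
```

Under these assumptions a single node fits within a full month, while three nodes left running would reach the limit in under eleven days.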
Elasticsearch will be installed on all nodes with the same configuration.
Run the following on all nodes by SSH-ing into each node:
node$ sudo apt-get update
Install java-development-kit
node$ sudo apt-get install openjdk-7-jdk
Install Elasticsearch
node$ wget https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-1.5.2.tar.gz -P ~/Downloads
node$ sudo tar -xvf ~/Downloads/elasticsearch-1.5.2.tar.gz -C /usr/local
node$ sudo mv /usr/local/elasticsearch-1.5.2 /usr/local/elasticsearch
Set the ELASTICSEARCH_HOME environment variable and add to PATH in .profile
node$ nano ~/.profile
# Add the following
export ELASTICSEARCH_HOME=/usr/local/elasticsearch
export PATH=$PATH:$ELASTICSEARCH_HOME/bin
node$ source ~/.profile
Install AWS Cloud Plugin for Elasticsearch
node$ sudo $ELASTICSEARCH_HOME/bin/plugin install elasticsearch/elasticsearch-cloud-aws/2.5.0
Configure Elasticsearch for node discovery
node$ sudo nano $ELASTICSEARCH_HOME/config/elasticsearch.yml
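Once configuration is complete and Elasticsearch has been started on every node, one way to confirm that the nodes discovered each other is to query the cluster health endpoint and compare the reported node count with the number of instances launched. The following Python 3 sketch assumes a node is reachable on `localhost:9200`; the sample payload is made up and only mimics the shape of a health response.

```python
import json
import urllib.request


def fetch_cluster_health(host="localhost", port=9200):
    """Fetch /_cluster/health from a node (needs a live cluster)."""
    url = "http://{}:{}/_cluster/health".format(host, port)
    with urllib.request.urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))


def all_nodes_joined(health, expected_nodes):
    """True if the cluster reports the expected node count and is not red."""
    return (
        health.get("number_of_nodes") == expected_nodes
        and health.get("status") in ("green", "yellow")
    )


# Made-up payload shaped like a cluster health response, for illustration only.
sample = {"cluster_name": "example", "status": "green", "number_of_nodes": 3}
print(all_nodes_joined(sample, expected_nodes=3))
```

On a live cluster, `all_nodes_joined(fetch_cluster_health(), 3)` would perform the same check against the real response.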