First, start by cloning the repository using the following command:
git clone https://github.com/lapiceroazul4/App_Airbnb.git
It's essential to have Docker installed beforehand. We recommend using version 26.0.0. You can find more information about the installation process at the following link: Docker Installation.
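To confirm that your installation matches the recommended version, you can check it from a terminal:

```bash
# Confirm Docker Engine and the Compose plugin are available
docker --version          # expect: Docker version 26.0.0 (or later)
docker compose version    # the `docker compose` subcommand requires the Compose v2 plugin
```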
Recommendation: Execute the following commands with superuser permissions. You can do this by running `sudo -i` on Unix systems, or by running the command prompt as an administrator on Windows.
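Alternatively, on Linux you can avoid working in a root shell by adding your user to the `docker` group (a common Docker setup; not something this repository requires):

```bash
# Optional alternative to `sudo -i`: let your user talk to the Docker daemon directly
sudo usermod -aG docker "$USER"
newgrp docker     # pick up the new group membership in the current shell
docker ps         # should now run without sudo
```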
- Navigate to the `Docker/Compose/` directory.
- Execute the following command: `docker compose up -d --build`
- Verify that the services have been successfully deployed by running `docker ps` (see the verification sketch after this list).
- At this point, the services should be deployed, and you can test the application at the following URL:
- To log in, use the following credentials:
  - User: `admin@admin.com`
  - Pass: `Admin`
- To check the HAProxy statistics, you can visit http://localhost:5080/haproxy?stats using these credentials:
  - User: `admin`
  - Pass: `admin`
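If `docker ps` shows containers missing or restarting, a quick way to inspect the deployment (run from the same `Docker/Compose/` directory):

```bash
# List only the containers belonging to this Compose project
docker compose ps
# Follow the logs of every service if something failed to start
docker compose logs -f
```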
Recommendation: The VMs must be named `serverAirbnb` and `workerAirbnb`; otherwise, the deployment with Swarm will fail.
- Navigate to the `Docker/Swarm/` directory.
- Execute the following command to initialize the Swarm: `docker swarm init --advertise-addr localhost`. Copy the token and run it on the worker node (a sketch of the join flow follows this list).
- Deploy by executing: `docker stack deploy -c docker-compose.yml App_Airbnb`
- Scale the web service by running: `docker service scale App_Airbnb_web1=3`. At this point, the web1 service will have 3 replicas; if you wish to change the number of replicas, simply replace the 3 with the desired value.
- If you want to scale another service, the process is similar: `docker service scale App_Airbnb_<service_name>=<number_of_replicas>`
- To verify that the process was successful, you can execute `docker service ls`, where you will see the name of the services, the number of replicas, and other additional information.
- Similar to the execution with Docker Compose, you can also verify the functionality from the URL, using the same route. To log in, use the following credentials:
  - User: `admin@admin.com`
  - Pass: `Admin`
- To check the HAProxy statistics, you can visit http://localhost:5080/haproxy?stats using these credentials:
  - User: `admin`
  - Pass: `admin`
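As a sketch of the join flow referenced above (the token shown is illustrative; use the one printed by your manager node):

```bash
# On the manager: print the full join command, including the worker token
docker swarm join-token worker
# On the worker: paste the command that was printed, e.g.
#   docker swarm join --token SWMTKN-1-<token> <manager-ip>:2377
# Back on the manager: confirm both nodes are listed and Ready
docker node ls
```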
Ensure that the latest version of Spark is installed on your VM. The current version is 3.5.1, which can be downloaded from this link.
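If Spark is not yet installed, a minimal download sketch (the Apache archive URL and the `labSpark` install path are assumptions chosen to match the paths used in the steps below):

```bash
# Download and unpack Spark 3.5.1 with the Hadoop 3 binaries
mkdir -p /home/vagrant/labSpark
wget https://archive.apache.org/dist/spark/spark-3.5.1/spark-3.5.1-bin-hadoop3.tgz
tar -xzf spark-3.5.1-bin-hadoop3.tgz -C /home/vagrant/labSpark/
```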
- Log into your Ubuntu server.
- Install the required Python libraries using pip: `pip install -r requirements.txt`
- Navigate to the directory of the cloned repository: `cd App_Airbnb/`
- **Move the `clusterAirbnbsApache/` directory to your desired execution path** (note: you must update the file path in the application where the CSV is read): `sudo mv clusterAirbnbsApache/ /home/vagrant/`
- Create a directory in `/home/vagrant` to store the results: `sudo mkdir clusterAirbnb`
- Start the master node by navigating to the Spark `sbin` directory: `cd /home/vagrant/labSpark/spark-3.5.1-bin-hadoop3/sbin && ./start-master.sh`
- Start a worker node in the same directory: `./start-worker.sh spark://192.168.100.3:7077` (a quick health check follows this list).
- Launch the application: `cd /home/vagrant/labSpark/spark-3.5.1-bin-hadoop3/bin && ./spark-submit --master spark://192.168.100.3:7077 --conf spark.executor.memory=1g /home/vagrant/clusterAirbnbsApache/appReservas.py`
- Move the `clusterAirbnb/` directory to a shared folder: `mv clusterAirbnb/ /vagrant/`
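A quick health check before and after launching the application (assumes the master runs on 192.168.100.3 with Spark's default web UI port):

```bash
# Both daemons run on the JVM; jps (shipped with the JDK) should list them
jps | grep -E 'Master|Worker'
# The master web UI listens on port 8080 by default; check that it responds
curl -sI http://192.168.100.3:8080
```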
NOTE: You can now perform various operations with the CSV files. In our case, we upload these CSVs to the cloud using the `script.py` located in the Power BI folder. We recommend entering the application as admin and exploring the dashboards created.
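For reference, Spark typically writes each result as a directory of `part-*` files; a hedged example of locating them in the shared folder (the exact layout depends on how `appReservas.py` writes its output):

```bash
# Inspect the result CSVs from the shared folder on the host
find /vagrant/clusterAirbnb -name 'part-*' -exec ls -lh {} +
```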