Realtime Logs Processing With Apache Airflow, Kafka & Elasticsearch
Airflow commands

```bash
export AIRFLOW_HOME=$(pwd)    # must be set before running any other airflow command

airflow db init               # Airflow < 2.7; on 2.7+ use: airflow db migrate
airflow webserver -p 8180
airflow scheduler

# Prompts for a password, since --password is not passed
airflow users create \
    --username admin \
    --firstname Vaibhav \
    --lastname Bansal \
    --role Admin \
    --email vaibhav.bansal2020@gmail.com
```
- Store secrets in AWS Secrets Manager (a sketch for reading them at runtime follows below):
  a. KAFKA_SASL_USERNAME
  b. KAFKA_SASL_PASSWORD
  c. KAFKA_BOOTSTRAP_SERVER
  d. ELASTICSEARCH_URL
  e. ELASTICSEARCH_API_KEY
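  One way the DAG code can consume these values: a minimal sketch using boto3, assuming each value is stored as its own plain-string secret under the names above and that the region is `us-east-1` (adjust to yours).

  ```python
  import boto3

  def get_secret(name: str, region: str = "us-east-1") -> str:
      """Fetch a plain-string secret from AWS Secrets Manager."""
      client = boto3.client("secretsmanager", region_name=region)
      return client.get_secret_value(SecretId=name)["SecretString"]

  KAFKA_SASL_USERNAME = get_secret("KAFKA_SASL_USERNAME")
  KAFKA_SASL_PASSWORD = get_secret("KAFKA_SASL_PASSWORD")
  KAFKA_BOOTSTRAP_SERVER = get_secret("KAFKA_BOOTSTRAP_SERVER")
  ELASTICSEARCH_URL = get_secret("ELASTICSEARCH_URL")
  ELASTICSEARCH_API_KEY = get_secret("ELASTICSEARCH_API_KEY")
  ```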
- Create an account in Confluent Kafka -> Create Environment -> Provision Cluster -> Create Topic (a topic-creation sketch follows below)
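  The topic can also be created programmatically instead of through the Confluent UI. A sketch using the confluent-kafka Python client's AdminClient, assuming the secret variables from the Secrets Manager sketch above are in scope; the topic name `website_logs` and the partition/replication counts are assumptions, not values from this repo.

  ```python
  from confluent_kafka.admin import AdminClient, NewTopic

  admin = AdminClient({
      "bootstrap.servers": KAFKA_BOOTSTRAP_SERVER,
      "security.protocol": "SASL_SSL",
      "sasl.mechanisms": "PLAIN",
      "sasl.username": KAFKA_SASL_USERNAME,
      "sasl.password": KAFKA_SASL_PASSWORD,
  })

  # Hypothetical topic name; match whatever your DAG produces to
  futures = admin.create_topics([NewTopic("website_logs", num_partitions=3, replication_factor=3)])
  for topic, future in futures.items():
      future.result()   # raises if creation failed
  ```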
- Get the bootstrap server from Cluster Settings in Kafka
- Get the bootstrap server URL, username, and password from API Keys (generate a key if one is not available); a producer configuration sketch using these values follows below
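  These three values are exactly what a SASL/PLAIN producer configuration needs. A minimal sketch with the confluent-kafka Python client, again assuming the secret variables from the sketch above are in scope and the assumed `website_logs` topic name.

  ```python
  import json
  from confluent_kafka import Producer

  producer = Producer({
      "bootstrap.servers": KAFKA_BOOTSTRAP_SERVER,
      "security.protocol": "SASL_SSL",
      "sasl.mechanisms": "PLAIN",
      "sasl.username": KAFKA_SASL_USERNAME,
      "sasl.password": KAFKA_SASL_PASSWORD,
  })

  log = {"level": "INFO", "message": "user logged in"}          # example payload
  producer.produce("website_logs", value=json.dumps(log).encode("utf-8"))
  producer.flush()                                              # block until delivery
  ```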
- Create an account in Elasticsearch
- Create an index in Elasticsearch (with the desired number of shards) whose name matches the index name used in the DAGs, and copy the Elasticsearch URL and API key so they can be stored in AWS Secrets Manager (see the sketch below)
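  A sketch of creating that index with the official elasticsearch Python client (8.x), reusing the secrets fetched earlier; the index name `website_logs` and the shard/replica counts are assumptions to replace with whatever the DAG references.

  ```python
  from elasticsearch import Elasticsearch

  es = Elasticsearch(ELASTICSEARCH_URL, api_key=ELASTICSEARCH_API_KEY)

  # Hypothetical index name; it must match the one used in the DAGs
  es.indices.create(
      index="website_logs",
      settings={"number_of_shards": 3, "number_of_replicas": 1},
  )
  ```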
- Create an IAM user
- Create an S3 bucket
- Go to Amazon Managed Workflows for Apache Airflow (MWAA) -> Create a new environment -> link it to your DAGs -> run the DAG on Airflow (a skeleton of such a DAG is sketched below)
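  The DAG itself lives in this repo; purely as an illustration, a skeleton of what a log-processing DAG along these lines could look like (the DAG id, schedule, and task bodies are assumptions, not this project's actual code):

  ```python
  from datetime import datetime

  from airflow import DAG
  from airflow.operators.python import PythonOperator

  def produce_logs(**context):
      """Produce log events to the Kafka topic (see the producer sketch above)."""
      ...

  def consume_and_index(**context):
      """Consume log events from Kafka and index them into Elasticsearch."""
      ...

  with DAG(
      dag_id="realtime_logs_processing",   # hypothetical DAG id
      start_date=datetime(2024, 1, 1),
      schedule="@daily",
      catchup=False,
  ) as dag:
      produce = PythonOperator(task_id="produce_logs", python_callable=produce_logs)
      index = PythonOperator(task_id="consume_and_index", python_callable=consume_and_index)
      produce >> index
  ```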
- Configure the paths for the DAG folder, S3 bucket, and requirements file (see the upload sketch below)
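  MWAA expects the DAGs and requirements file to live in S3. A boto3 sketch of creating the bucket and uploading them; the bucket name, key layout, and file names are assumptions.

  ```python
  import boto3

  s3 = boto3.client("s3", region_name="us-east-1")

  BUCKET = "my-mwaa-artifacts"   # hypothetical bucket name
  s3.create_bucket(Bucket=BUCKET)

  s3.upload_file("dags/logs_dag.py", BUCKET, "dags/logs_dag.py")   # DAG folder -> s3://<bucket>/dags/
  s3.upload_file("requirements.txt", BUCKET, "requirements.txt")   # pinned Python deps for MWAA
  ```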
- Create a VPC and attach a Secrets Manager read/write policy to the MWAA execution role
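  The policy attachment can also be scripted. A sketch using boto3 and the AWS-managed SecretsManagerReadWrite policy; the role name is an assumption, so substitute your environment's actual execution role.

  ```python
  import boto3

  iam = boto3.client("iam")
  iam.attach_role_policy(
      RoleName="mwaa-execution-role",   # hypothetical; use your MWAA execution role's name
      PolicyArn="arn:aws:iam::aws:policy/SecretsManagerReadWrite",
  )
  ```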