This tool enables containerized load testing of Azure Data Explorer (ADX) clusters by executing KQL queries from Docker containers. The KQL query is kept outside the tool and exposed to it as an online Python script (configured by QUERY_SCRIPT_URL). The script must define a function get_query() that returns the KQL query. Using a function allows aspects of the query to be randomized. Here is an example Python script.
To report query performance, use ADX's built-in management command .show queries. To measure E2E query latency, optional Application Insights instrumentation can be enabled by providing APPINSIGHTS_INSTRUMENTATIONKEY.
The tool authenticates with Azure AD (AAD). Follow this document to provision an AAD application and assign it the relevant permissions on the ADX cluster.
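As an illustration of the get_query() contract described above, here is a minimal sketch of a query script. The table name, column names and device IDs are hypothetical placeholders, not part of the tool. Only the function name get_query() and the fact that it returns a KQL string come from the description above.

```python
import random

# Hypothetical table and values, for illustration only.
TABLE = "Telemetry"
DEVICE_IDS = ["device-001", "device-002", "device-003"]


def get_query():
    """Return a KQL query string with a randomized device and lookback window."""
    device_id = random.choice(DEVICE_IDS)
    lookback_minutes = random.randint(5, 60)
    return (
        f"{TABLE}\n"
        f"| where Timestamp > ago({lookback_minutes}m)\n"
        f"| where DeviceId == '{device_id}'\n"
        f"| summarize count() by bin(Timestamp, 1m)"
    )
```

Randomizing the filter values gives each generated query a different text, which is the reason the description above asks for a function rather than a fixed string.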
CLUSTER_QUERY_URL=https://<ADX_CLUSTER>.<REGION>.kusto.windows.net
CLIENT_ID=<AAD_CLIENT_ID>
CLIENT_SECRET=<AAD_SECRET>
TENANT_ID=<AAD_TENANT>
DATABASE_NAME=adx_db
QUERY_SCRIPT_URL=https://.../query.py
TEST_ID=my_stressful_test
QUERY_CONSISTENCY=weakconsistency
APPINSIGHTS_INSTRUMENTATIONKEY=<APPINSIGHTS_INSTRUMENTATIONKEY>
QUERIES_TOTAL=100
- If TEST_ID is not provided, a GUID will be generated.
- The default QUERY_CONSISTENCY value is weakconsistency. This document describes query consistency in detail.
- Application Insights instrumentation will be ignored if APPINSIGHTS_INSTRUMENTATIONKEY is not provided.
- The tool will run indefinitely if QUERIES_TOTAL is not provided.
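The defaults listed above can be summarized in a short sketch. This is not the tool's actual code. The function name load_optional_settings and the returned dictionary keys are assumptions made for illustration; only the variable names and default behaviors come from the list above.

```python
import os
import uuid


def load_optional_settings(env=os.environ):
    """Resolve the optional settings, applying the documented defaults."""
    queries_total = env.get("QUERIES_TOTAL")
    return {
        # A GUID is generated when TEST_ID is missing.
        "test_id": env.get("TEST_ID") or str(uuid.uuid4()),
        # weakconsistency is the documented default.
        "query_consistency": env.get("QUERY_CONSISTENCY") or "weakconsistency",
        # None means Application Insights instrumentation is skipped.
        "appinsights_key": env.get("APPINSIGHTS_INSTRUMENTATIONKEY") or None,
        # None means the test runs indefinitely.
        "queries_total": int(queries_total) if queries_total else None,
    }
```

Calling load_optional_settings({}) yields a generated test ID, weakconsistency, no instrumentation key and no query limit.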
Create a .env file with the above configuration, then run:
docker run -it --rm --env-file .env syedhassaanahmed/azure-kusto-load-test
Generating concurrent load requires a Kubernetes (k8s) cluster. Some options for creating one:
- Use the built-in Kubernetes in Docker Desktop for Windows.
- Install Minikube.
- Create an AKS cluster in Azure.
Once the k8s cluster is up and running, set the above environment variables in the deployment.yaml file and run:
kubectl apply -f deployment.yaml
Logs from a successfully running deployment can be viewed with:
kubectl logs -l app=adx-load-test
To stop the load tests:
kubectl delete deployment adx-load-test
The following queries analyze a test run whose TEST_ID is my_stressful_test.
E2E duration of all completed queries (run against Application Insights):
customMetrics
| where name == "query_time" and customDimensions.test_id == "my_stressful_test"
| summarize percentiles(value, 5, 50, 95) by bin(timestamp, 1m)
| render timechart
Duration of all completed queries as measured by the ADX query engine:
.show queries 
| where Database == "<DATABASE_NAME>" 
    and State == "Completed"
    and Text endswith "TEST_ID=my_stressful_test"
| extend Duration = Duration / time(1s)
| summarize percentiles(Duration, 5, 50, 95) by bin(StartedOn, 1m)
| render timechart
Number of queries/second issued during a test run:
.show queries 
| where Database == "<DATABASE_NAME>" and Text endswith "TEST_ID=my_stressful_test"
| summarize TotalQueriesSec=count() by bin(StartedOn, 1s)  
| render timechart
Correlation between average query duration and disk misses:
.show queries
| where Database == "<DATABASE_NAME>"
    and State == "Completed"
    and Text endswith "TEST_ID=my_stressful_test"
| summarize DiskMisses = avg(toint(CacheStatistics.Disk.Misses)), 
    Duration = avg(Duration / 1s) 
    by bin(StartedOn, 1m)
| render timechart
Find Test IDs for tests executed in the last 7 days:
.show queries 
| where Database == "<DATABASE_NAME>" and StartedOn > ago(7d)
| parse Text with * "TEST_ID=" TestId
| where isnotempty(TestId)
| distinct TestId
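The .show queries filters above match query text ending in TEST_ID=<id>, which suggests that each query is tagged with its test ID before it is sent. The sketch below shows one way such tagging could work, assuming a trailing KQL line comment. The helper names tag_query and extract_test_id are hypothetical, and the tool's actual mechanism may differ.

```python
def tag_query(query, test_id):
    """Append the test ID as a trailing KQL comment (assumed tagging scheme)."""
    return f"{query.rstrip()}\n// TEST_ID={test_id}"


def extract_test_id(text):
    """Return whatever follows the first 'TEST_ID=' marker, or '' if absent."""
    _, found, tail = text.partition("TEST_ID=")
    return tail.strip() if found else ""
```

With this scheme, tag_query("Telemetry | count", "my_stressful_test") produces text that ends with TEST_ID=my_stressful_test, so the endswith filters above would match it. extract_test_id roughly mirrors the parse step in the last query.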