This project is made available under a modified MIT license. See the LICENSE file.
- This is not a Qlik-supported project/product.
- Contributions such as issues, pull requests, and additional code are welcome.
- Qlik Inc. and Qlik Support have no affiliation with this project. The initial version was developed by Clever Anjos, currently a Principal Solution Architect on the Qlik Partner Engineering team.
The purpose of this script is to use the AWS Cost Explorer API to retrieve cost data about one or more AWS accounts.
Based on a configuration file, the script generates CSV files that can be ingested by BI tools such as Qlik Sense. We provide a QVF that serves as a template for consuming that data.
This project was inspired by this AWS sample
Since the AWS Cost Explorer API only permits grouping by two dimensions at a time, you need to make some decisions before deploying this project. Define which dimension pairs you want to collect, for example Subscription/Service and Service/Tag (Cost Center). Then configure your reports.json file (check "Configuration file" for more details), and finally adjust your Qlik Sense app to retrieve the data you are extracting.
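To illustrate the two-dimension limit, here is a sketch of the parameters one dimension pair would produce for a Cost Explorer request. The helper name, dates, and tag key are hypothetical; the actual script may assemble its requests differently.

```python
# Sketch: building the request parameters for one dimension pair.
# The GroupBy list accepts at most two entries; DIMENSION and TAG can be mixed.
def build_request(start, end, granularity, group_by, metric):
    """Assemble kwargs for a get_cost_and_usage call (hypothetical helper)."""
    return {
        "TimePeriod": {"Start": start, "End": end},
        "Granularity": granularity,
        "Metrics": [metric],
        "GroupBy": group_by,  # at most 2 entries accepted by the API
    }

params = build_request(
    "2022-06-01", "2022-07-01", "MONTHLY",
    [{"Type": "DIMENSION", "Key": "SERVICE"},
     {"Type": "TAG", "Key": "CostCenter"}],
    "UnblendedCost",
)
# An actual call would then be: boto3.client("ce").get_cost_and_usage(**params)
```

Asking for a third GroupBy entry would be rejected by the API, which is why each report is limited to one pair.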
There are several options to run this script.
One option is to use this script as a standalone program executed on a server or machine with access to an S3 bucket.
The script may also be deployed as an AWS Lambda function. This serverless approach is possible because the footprint needed to execute the script is tiny (128 MB for small AWS accounts), growing with the number of records retrieved from the API. For example, daily detailed extractions consume more memory than monthly ones.
After you decide how to execute the script, configure the reports.json file. It contains an array of objects with these parameters:
Most of the parameters map to Cost Explorer API parameters, so please read that documentation carefully before creating your reports.
- Report - The report name. Your report will be extracted as report.csv
- GroupBy - One or two dimensions or tags to group your report by (the AWS API allows at most two)
- Granularity - "DAILY"/"MONTHLY"
- Split - "true" (the script sends one request per day) / "false" (the script sends one request for the whole calculated start-to-end timeframe)
- Metric - Check the API documentation for possible metrics to be extracted
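As an illustration, a reports.json with two report definitions might look like this. The report names and the CostCenter tag key are hypothetical; check the API documentation for valid GroupBy keys and metrics:

```json
[
  {
    "Report": "service_monthly",
    "GroupBy": [{ "Type": "DIMENSION", "Key": "SERVICE" }],
    "Granularity": "MONTHLY",
    "Split": "false",
    "Metric": "UnblendedCost"
  },
  {
    "Report": "service_costcenter_daily",
    "GroupBy": [
      { "Type": "DIMENSION", "Key": "SERVICE" },
      { "Type": "TAG", "Key": "CostCenter" }
    ],
    "Granularity": "DAILY",
    "Split": "true",
    "Metric": "UnblendedCost"
  }
]
```

Each object produces one CSV file named after its Report value, e.g. service_monthly.csv.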
You need to set some environment variables before executing this script:
- S3_BUCKET - S3 bucket name used to store the extracted data in a csv format
- MONTHS - How many months prior to today to extract
- CURRENT_MONTH - "true" (only the current month is extracted) / "false" (the MONTHS parameter is used to calculate the date range)
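For a standalone run, the variables can be set in the shell before invoking the script. The bucket name and window below are placeholder values:

```shell
# Hypothetical values -- replace with your own bucket and extraction window.
export S3_BUCKET="my-cost-extracts"   # bucket that receives the CSV files
export MONTHS=3                       # look back three months from today
export CURRENT_MONTH=false            # use MONTHS to compute the date range
```

In a Lambda deployment, the same names would instead be set as environment variables in the function configuration.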
We provide a template app to consume the data extracted by the script. You can import it into your Qlik Sense environment and modify it to suit your needs. Some screenshots are shown below (intentionally blurred).
- An S3 bucket to store the final CSV result. If this is not present, the file will be stored in the TEMP directory
- An AWS access key pair for an IAM user with permission to call the Cost Explorer API and access the S3 bucket
- Python 3.10
- Pandas 1.4.2
- boto3
One efficient way of executing this script is with AWS Lambda's serverless architecture, which allows execution with a tiny footprint. Please refer to this documentation about how to create a Lambda function.
One important mention is that we are using a layer provided by this project
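A Lambda deployment needs a handler as its entry point. The skeleton below is only a sketch: the handler name and the idea of injecting the extraction function are assumptions for illustration, not the project's actual entry point.

```python
# Minimal Lambda handler sketch (names hypothetical; the real entry point
# depends on how the script's functions are organized).
import json


def handler(event, context, extract=None):
    """AWS Lambda entry point. `extract` is injectable to allow local testing."""
    if extract is None:
        # In a real deployment this would run the Cost Explorer extraction,
        # e.g. a function imported from the script itself.
        raise NotImplementedError("wire in the extraction function here")
    reports = extract()  # e.g. a list of report names written to S3
    return {"statusCode": 200, "body": json.dumps({"reports": reports})}
```

When configuring the function, the handler string would point at this function (e.g. `module_name.handler`), and memory can start at 128 MB as noted above.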
There are costs involved in data extraction from Cost Explorer API:
- Cost Explorer API calls
- AWS Lambda Invocation (if you choose this option)
- Usually Free
- Amazon S3
- Minimal usage
- Jun 2022 - Initial version
- Q3 2022 - AutoML integration (Planned)
Feel free to suggest any improvements. We will evaluate each suggestion and implement it if accepted. Please use the Feature Request