Data Generated Cube is a Python project for generating and analyzing Magic: The Gathering cubes from Cube Cobra and Scryfall data. It is particularly useful for data scientists, analysts, and quantitatively minded enthusiasts who want a deeper understanding of cube construction.
I maintain a cube generated using this pipeline, but the pipeline can create cubes of any size and category.
- Python 3.7 or higher
- Dependencies listed in requirements.txt
- Clone the repository:
  git clone https://github.com/l0gr1thm1k/data-generated-cube.git
- Navigate to the project directory. The precise path depends on where you cloned the repository. For example, if you cloned it to your home directory, you would use the following command:
  cd data-generated-cube
- Install the required dependencies:
  pip install -r requirements.txt
The pipeline works in two steps:
- Create a cube configuration file.
- Run the __main__.py script with the path to your configuration file as an argument.
The cube creation pipeline takes in a JSON configuration file of the following form.
{
    "cubeName": "example_data_generated_cube",
    "cardBlacklist": null,
    "cardCount": 360,
    "cubeCategory": "Vintage",
    "cubeIds": [
        "modovintage",
        "wtwlf123",
        "synergy",
        "LSVCubeInit",
        "AlphaFrog"
    ],
    "overwrite": true,
    "stages": [
        "scrape",
        "create",
        "analyze"
    ],
    "useCubeCobraBucket": true
}
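If you would rather build the configuration from Python than write it by hand, a minimal sketch follows. The key names and default values are taken from the example above; the helper name `write_cube_config` is hypothetical and not part of the project.

```python
import json
from pathlib import Path


def write_cube_config(path, cube_name, cube_ids, card_count=360,
                      category="Vintage",
                      stages=("scrape", "create", "analyze")):
    """Write a configuration file with the keys shown in the example above.

    Hypothetical helper for illustration; not part of the project.
    """
    config = {
        "cubeName": cube_name,
        "cardBlacklist": None,  # serialized as JSON null
        "cardCount": card_count,
        "cubeCategory": category,
        "cubeIds": list(cube_ids),
        "overwrite": True,
        "stages": list(stages),
        # Assumption: sample only the cube IDs listed above, not the bucket.
        "useCubeCobraBucket": False,
    }
    Path(path).write_text(json.dumps(config, indent=4), encoding="utf-8")
    return config
```

Calling `write_cube_config("my_cube.json", "my_cube", ["modovintage", "wtwlf123"])` would write a file with the same layout as the example, which you can then edit or pass to the pipeline.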
You can see full example configuration files here. Here is a table breaking down each key in the configuration file:
Key | Description | Example | Notes |
---|---|---|---|
cubeName | The name of the cube you are generating. | example_data_generated_cube | Any string will do. |
cardBlacklist | A list of card names that you do not want to include in the cube. | ["Black Lotus", "Mox Pearl"] | Can be null or a list of string values. |
cardCount | The number of cards you want in the cube. | 360 | This will generate the cube at the target size, but you may still sample cubes at +/- 10% of this size. |
cubeCategory | The cube category you want to generate. | Vintage | Options are Vintage, Powered, Unpowered, Pauper, Peasant, Budget, Silver-bordered, Commander, Battle Box, Multiplayer, Judge Tower. |
cubeIds | A list of cube IDs from Cube Cobra that you want to include in the cube. | ["modovintage", "wtwlf123", "synergy", "LSVCubeInit", "AlphaFrog"] | Each ID can be a short ID, which is generally human readable, or a long ID, which is a GUID value. |
overwrite | A boolean value indicating whether you want to overwrite the cube if it already exists. | true | If true, an existing cube is overwritten. If false, an existing cube is left untouched. |
stages | A list of stages you want to run in the pipeline. | ["scrape", "create", "analyze"] | Options are scrape, create, and analyze. You can skip scrape, for example, if you just want to regenerate the cube with previously crawled data. |
useCubeCobraBucket | A boolean value indicating whether you want to use the Cube Cobra bucket. | true | If true, the Cube Cobra bucket is used. If false, it is not. |
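As a rough illustration of the table, the sketch below checks a loaded configuration against the types and options listed there. The function `validate_config` is hypothetical; whether the pipeline performs any validation of its own is not stated here, so treat this as an optional pre-flight check rather than a description of the project's behavior.

```python
# Options copied from the table above.
ALLOWED_CATEGORIES = (
    "Vintage", "Powered", "Unpowered", "Pauper", "Peasant", "Budget",
    "Silver-bordered", "Commander", "Battle Box", "Multiplayer", "Judge Tower",
)
ALLOWED_STAGES = ("scrape", "create", "analyze")


def validate_config(config):
    """Return a list of problems found in a configuration dict (empty if none)."""
    problems = []
    if not isinstance(config.get("cubeName"), str):
        problems.append("cubeName must be a string")
    blacklist = config.get("cardBlacklist")
    if blacklist is not None and not (
        isinstance(blacklist, list)
        and all(isinstance(name, str) for name in blacklist)
    ):
        problems.append("cardBlacklist must be null or a list of strings")
    card_count = config.get("cardCount")
    # bool is a subclass of int in Python, so reject it explicitly.
    if (isinstance(card_count, bool) or not isinstance(card_count, int)
            or card_count <= 0):
        problems.append("cardCount must be a positive integer")
    if config.get("cubeCategory") not in ALLOWED_CATEGORIES:
        problems.append("cubeCategory is not one of the documented options")
    cube_ids = config.get("cubeIds")
    if not (isinstance(cube_ids, list)
            and all(isinstance(cube_id, str) for cube_id in cube_ids)):
        problems.append("cubeIds must be a list of strings")
    stages = config.get("stages")
    if not (isinstance(stages, list)
            and all(stage in ALLOWED_STAGES for stage in stages)):
        problems.append("stages must be a list drawn from scrape, create, analyze")
    for key in ("overwrite", "useCubeCobraBucket"):
        if not isinstance(config.get(key), bool):
            problems.append(f"{key} must be true or false")
    return problems
```

Load your file with `json.load` and pass the resulting dict to `validate_config`; an empty list means every key matched the table.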
The Cube Cobra bucket is an AWS S3 bucket that contains all the cube data. This project uses the bucket data to streamline the process of gathering cubes to sample for the data generated cube. Accessing the bucket requires two variables to be set in your environment.
You will need to contact the admin of Cube Cobra, Gwen Dekker, to get your own access keys if you would like to use the AWS data. For a quicker result, I recommend setting useCubeCobraBucket to false and supplying your own list of cube IDs in the configuration file.
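The names of the two environment variables are not listed here, so the sketch below takes them as a parameter instead of guessing. It shows one way to fall back to your own cube IDs when the credentials are absent; the function `resolve_bucket_flag` is hypothetical and not part of the project.

```python
import os


def resolve_bucket_flag(config, required_vars, environ=None):
    """Return (config copy, missing variable names).

    The copy has useCubeCobraBucket switched off when any required
    environment variable is missing or empty. Hypothetical helper.
    """
    environ = os.environ if environ is None else environ
    missing = [name for name in required_vars if not environ.get(name)]
    resolved = dict(config)  # shallow copy; the input dict is not modified
    if resolved.get("useCubeCobraBucket") and missing:
        resolved["useCubeCobraBucket"] = False
    return resolved, missing
```

For example, `resolve_bucket_flag(config, ["AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY"])` uses the standard AWS credential variable names, which is only an assumption about what this project reads; substitute the names the project actually expects.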
To run the pipeline, run the __main__.py script with the path to your configuration JSON set in the __main__.py file, as follows.
- Navigate to the project directory. Again, this depends on where you cloned the repository:
  cd data-generated-cube
- Update __main__.py to include the path to your configuration file:
  if __name__ == "__main__":
      main("path/to/your/configuration/file.json")
- Run the __main__.py script:
  python __main__.py
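If you prefer not to edit __main__.py for every run, the path could instead be read from the command line. The sketch below is one way to do that with the standard library; `parse_config_path` is a hypothetical helper, and it assumes `main()` accepts the path as a string, as in the step above.

```python
import argparse
from pathlib import Path


def parse_config_path(argv=None):
    """Read the configuration file path from the command line."""
    parser = argparse.ArgumentParser(description="Run the cube pipeline.")
    parser.add_argument("config", type=Path,
                        help="path to the configuration JSON")
    args = parser.parse_args(argv)
    if not args.config.is_file():
        # parser.error prints a usage message and exits with status 2.
        parser.error(f"configuration file not found: {args.config}")
    return args.config


# In __main__.py this could replace the hard-coded path (assumption:
# main() takes the path as a string):
#
# if __name__ == "__main__":
#     main(str(parse_config_path()))
```

With that change, the run step would become `python __main__.py path/to/your/configuration/file.json`.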
- If you have questions related to this pipeline you can reach out to me here on GitHub.
- Questions related to Cube Cobra can be directed to the Cube Cobra Discord.
- Questions related to Scryfall can be directed to the Scryfall GitHub.
Many thanks to all the folks in the cube community who have suggested and contributed improvements to this pipeline over the years. Special thanks to Gwen Dekker for access to the data and for his help in providing guidance on how to use the Cube Cobra bucket. Thanks to Keldan Campbell for suggesting updates to the sampling process based on cube frequency and recency of updates.