Add TBE data configuration reporter to TBE forward (v3) #4455

gchalump · 2025-07-08T18:31:36Z

Summary:
Re-land attempt of D75462895

Add TBE data configuration reporter to TBE forward call.

The reporter reports the TBE data configuration at the SplitTableBatchedEmbeddingBagsCodegen forward call. The output is a TBEDataConfig object, which is written to one or more JSON files. The environment variables that configure the reporter and an example of its usage are described below.

Just Knobs for enablement

fbgemm_gpu/features:TBE_REPORT_INPUT_PARAMS is added for enablement of the reporter (https://www.internalfb.com/intern/justknobs/?name=fbgemm_gpu%2Ffeatures)
- The default is False; turn this flag on to enable the reporter.
- To enable it locally use:
```
jk canary set fbgemm_gpu/features:TBE_REPORT_INPUT_PARAMS --on --ttl 600
```

Environment Variables

The Reporter relies on several environment variables to control its behavior. Below is a description of each variable:

FBGEMM_REPORT_INPUT_PARAMS_INTERVAL:
- Description: Determines the interval at which reports are generated. This is specified in terms of the number of iterations.
- Example Value: 1 (report every iteration)
FBGEMM_REPORT_INPUT_PARAMS_ITER_START:
- Description: Specifies the start of the iteration range to capture reports. Default 0.
- Example Value: 0 (start reporting from the first iteration)
FBGEMM_REPORT_INPUT_PARAMS_ITER_END:
- Description: Specifies the end of the iteration range to capture reports. Use `-1` to report until the last iteration. Default -1.
- Example Value: `-1` (report until the last iteration)

FBGEMM_REPORT_INPUT_PARAMS_BUCKET:
- Description: Specifies the name of the Manifold bucket where the report data will be saved.
- Example Value: tlparse_reports
FBGEMM_REPORT_INPUT_PARAMS_PATH_PREFIX:
- Description: Defines the path prefix where the report files will be stored. The path is created if it does not exist.
- Example Value: tree/tests/
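
As a rough illustration of how these variables fit together, the sketch below reads them from a mapping such as `os.environ`. The `ReporterEnv` dataclass and the `read_reporter_env` helper are hypothetical names made up for this example and are not part of FBGEMM. The defaults for ITER_START (0) and ITER_END (-1) come from the descriptions above; the default interval of 1 and treating the bucket and path prefix as optional are assumptions.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

PREFIX = "FBGEMM_REPORT_INPUT_PARAMS_"


@dataclass(frozen=True)
class ReporterEnv:
    """Hypothetical container for the reporter settings (not an FBGEMM class)."""

    interval: int
    iter_start: int
    iter_end: int
    bucket: Optional[str]
    path_prefix: Optional[str]


def read_reporter_env(env: Mapping[str, str] = os.environ) -> ReporterEnv:
    """Parse the reporter variables from an environment-like mapping."""
    return ReporterEnv(
        # Assumption: report every iteration when INTERVAL is unset.
        interval=int(env.get(PREFIX + "INTERVAL", "1")),
        # Documented default: 0.
        iter_start=int(env.get(PREFIX + "ITER_START", "0")),
        # Documented default: -1 (report until the last iteration).
        iter_end=int(env.get(PREFIX + "ITER_END", "-1")),
        # Assumption: unset bucket and path prefix are represented as None.
        bucket=env.get(PREFIX + "BUCKET"),
        path_prefix=env.get(PREFIX + "PATH_PREFIX"),
    )


example = read_reporter_env(
    {
        PREFIX + "INTERVAL": "2",
        PREFIX + "ITER_START": "3",
        PREFIX + "BUCKET": "tlparse_reports",
        PREFIX + "PATH_PREFIX": "tree/tests/",
    }
)
print(example)
```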

Example Usage

Below is an example command demonstrating how to use the FBGEMM Reporter with specific environment variable settings:

FBGEMM_REPORT_INPUT_PARAMS_INTERVAL=2 FBGEMM_REPORT_INPUT_PARAMS_ITER_START=3 FBGEMM_REPORT_INPUT_PARAMS_BUCKET=tlparse_reports FBGEMM_REPORT_INPUT_PARAMS_PATH_PREFIX=tree/tests/ buck2 run mode/opt //deeplearning/fbgemm/fbgemm_gpu/bench:split_table_batched_embeddings -- device --iters 2

Explanation

With the above settings, the reporter reports iterations 3 and 5.

FBGEMM_REPORT_INPUT_PARAMS_INTERVAL=2: The reporter will generate a report every 2 iterations.
FBGEMM_REPORT_INPUT_PARAMS_ITER_START=3: The reporter will start generating reports from iteration 3 (iterations are counted from 0).
FBGEMM_REPORT_INPUT_PARAMS_ITER_END=-1 (Default): The reporter will continue to generate reports until the last iteration.
FBGEMM_REPORT_INPUT_PARAMS_BUCKET=tlparse_reports: The reports will be saved in the tlparse_reports bucket.
FBGEMM_REPORT_INPUT_PARAMS_PATH_PREFIX=tree/tests/: The reports will be stored with the path prefix tree/tests/. For Manifold, make sure all folders within the path exist.

Note on Benchmark example

Note that with the --iters 2 option, the benchmark executes 6 forward calls in total: 3 calls (2 iterations plus 1 warmup) for the forward benchmark and another 3 calls (2 iterations plus 1 warmup) for the backward benchmark. Iteration numbering starts from 0, so the forward calls are iterations 0 through 5.
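
The arithmetic behind "iter 3 and iter 5" can be checked with a small sketch. The `reported_iterations` function below is a hypothetical helper, not the reporter's actual implementation. It assumes that intervals are counted from the start iteration and that the end iteration is inclusive, which is consistent with the example above but not confirmed by this diff.

```python
from typing import List


def reported_iterations(
    total_calls: int,
    interval: int,
    iter_start: int = 0,
    iter_end: int = -1,
) -> List[int]:
    """Return the 0-based forward-call indices that would be reported.

    Assumptions (not taken from the FBGEMM source): the interval is counted
    from iter_start, and iter_end is inclusive, with -1 meaning "no end".
    """
    last_call = total_calls - 1
    last = last_call if iter_end == -1 else min(iter_end, last_call)
    return list(range(iter_start, last + 1, interval))


# --iters 2 gives 6 forward calls (iterations 0..5) in the benchmark example.
print(reported_iterations(total_calls=6, interval=2, iter_start=3))  # [3, 5]
```

Under the same assumptions, setting the end iteration to 3 with a start of 0 would select iterations 0 and 2.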

Other changes included in this diff:

- Update the build dependency of tbe_data_config* files
- Remove the shutil and numpy.random libraries, as they cause incompatibility errors
- Add a non-OSS test that writes the extracted config data JSON file to Manifold

Differential Revision: D76992907

netlify · 2025-07-08T18:31:41Z

✅ Deploy Preview for pytorch-fbgemm-docs ready!

Name	Link
🔨 Latest commit	`e4fd0fe`
🔍 Latest deploy log	https://app.netlify.com/projects/pytorch-fbgemm-docs/deploys/68714b9881741300089b0b83
😎 Deploy Preview	https://deploy-preview-4455--pytorch-fbgemm-docs.netlify.app