171 upgrade tf 2190 #172

david-thrower · 2025-04-11T22:26:57Z

Summary of changes:

Replaced the text embedding base model with an interleaved Rotary Positional Embedding (iRoPE) in the phishing detection NLP proof of concept.
Proof of concept that the entire model's run time scales at roughly O(n) as sequence length increases (benchmarked from 1024 to 3072 tokens).

Benchmarks:

seq_len  val_binary_accuracy  min/model  total_min  timing_relative_to_1024  Commit_SHA
3072     0.955                65.942     329.715    2.817                    4bc217b36d1baf8b9a81fe53000e9e13e6e87801
1536     0.96                 37.27      186.36     1.591                    286ba81a1e51493d748ded727bd602a4398248a8
1024     0.952                23.42      117.08     1.0                      9893bfc55d4f7b753eff79e0c5c70e4992c61085
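
As a rough sanity check on the O(n) claim, the benchmark rows above can be compared against the growth in sequence length. The sketch below is illustrative only: the helper name `scaling_ratios` is hypothetical, and the numbers are copied from the min/model column of the table. Three data points are consistent with linear scaling but do not prove it.

```python
# Benchmark rows copied from the table above: (seq_len, minutes per model).
RUNS = [(1024, 23.42), (1536, 37.27), (3072, 65.942)]


def scaling_ratios(runs):
    """Return (seq_len, length_ratio, time_ratio, time_ratio / length_ratio)
    for each run, relative to the first (baseline) run."""
    base_len, base_min = runs[0]
    rows = []
    for seq_len, minutes in runs:
        length_ratio = seq_len / base_len
        time_ratio = minutes / base_min
        # A value close to 1.0 in the last column is consistent with O(n).
        rows.append((seq_len, length_ratio, time_ratio, time_ratio / length_ratio))
    return rows


if __name__ == "__main__":
    for seq_len, length_ratio, time_ratio, per_token in scaling_ratios(RUNS):
        print(f"{seq_len:>5}  length x{length_ratio:.2f}  "
              f"time x{time_ratio:.3f}  time/length {per_token:.3f}")
```

If the time/length column stays near 1.0 as sequence length grows, per-token cost is roughly constant, which is what linear scaling predicts; a quadratic model would show that column growing in proportion to the length ratio.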

Upgraded tf to 2.19.0
Upgraded Jax
Upgraded model architecture in both phishing CICD example and CIFAR10 example to accommodate tf 2.19.0.
Removed obsolete BERT embedding CICD test.
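
For readers unfamiliar with rotary positional embeddings, the following is a minimal NumPy sketch of the general technique, assuming the common formulation in which adjacent (even, odd) feature pairs are rotated by a position-dependent angle. It is not the code from this PR: the function name `rotary_embed`, the `base` value, and the pairing scheme are assumptions for illustration, and the "interleaved" variant used in the phishing example may differ.

```python
import numpy as np


def rotary_embed(x, base=10000.0):
    """Apply a rotary positional embedding to x of shape (seq_len, dim).

    Hypothetical helper: each adjacent (even, odd) feature pair is rotated
    by an angle proportional to the token position. dim must be even.
    """
    seq_len, dim = x.shape
    if dim % 2 != 0:
        raise ValueError("dim must be even")
    # One rotation frequency per feature pair, shape (dim / 2,).
    inv_freq = base ** (-np.arange(0, dim, 2) / dim)
    # Rotation angle for every (position, pair), shape (seq_len, dim / 2).
    angles = np.arange(seq_len)[:, None] * inv_freq[None, :]
    cos, sin = np.cos(angles), np.sin(angles)
    x_even, x_odd = x[:, 0::2], x[:, 1::2]
    out = np.empty_like(x, dtype=float)
    # Standard 2-D rotation applied pair by pair.
    out[:, 0::2] = x_even * cos - x_odd * sin
    out[:, 1::2] = x_even * sin + x_odd * cos
    return out
```

Because the operation is a pure rotation, it preserves each token vector's norm and costs O(n) in sequence length, and the dot product between two rotated vectors depends only on their relative offset.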

Temporarily comment out time-consuming workflows to disable them. Comment out the BERT-based text classification workflow, possibly permanently, as it is obsolete.

Add branch to workflow.

Added a baseline fine tuning of the full GPT2 to compare against Cerebros text classifier.

Forgot to add dropout.

Amendments to Cerebros model.

Reduce seq length to accelerate job completion.

Up timeout to 300 min.

Correct history indexing error.

Temporary test to fast forward to cerebros model.

Comment out an artifact of the GPT test so this can lint and run.

Fix errors from trying to work too fast ...

Re-corrected the metrics BinaryAccuracy to correct AI introduced error.

Correct metric to rank by (binary accuracy) ...

Uncomment out GPT test ...

Upped number of trials to 5.

Make seq len 750, fix typo.

Try 1024 seq len.

Added branch to the workflow...

Added a positional embedding and a LayerNorm to the text embedding.

Missed position embedding in copy and paste ...

Synchronize embedding dim across embeddings.

Corrected import of PositionEmbedding.

Remove layernorm, concat instead of add.

Try addition to merge embeddings without LayerNorm

Restore the optimal run with position embedding. Reduce max levels to fit the optimal run and reduce overhead. Test this to see if it works. If successful, add back the commented-out comparison and open the PR. Then open an issue to optimize the params around this new model. We may need to run this on Katib to optimize the hyperparameters, as the model is fundamentally different from the original and can probably be optimized considerably.

Hard set levels to the known optimum.

Corrected hard set on levels to correct optima.

Restore the best model yet.

Add back the CICD test for image CLS. Prepare for PR.

Added back baseline workflow in best trial thus far.

Added all CICD tests to be used back to best NLP configuration.

Upgrade tf

Upgrade tensorflow-text to v 2.19.0

Add branch to workflows.

Typo on requirements.txt

Test to fast forward to Cerebros NLP test and check for compatibility.

Attempt to correct issue with tf v 2.19.0 graph scope.

Another attempt to resolve tf v 2.19.0 graph scope compatibility...

Run a full CICD run.

AI suggested tf 2.15.0 -> 2.19.0 compat fix.

Add back the baseline GPT2 task.

Fix a typo in string termination ...

Uncommented out CICD test that was left commented out by error.

david-thrower · 2025-04-11T22:57:01Z

All checks have passed on this version of the code:

All tests except the standalone-val-set Ames test (accidentally omitted, because the tests commented out for dev were not uncommented): https://github.com/david-thrower/cerebros-core-algorithm-alpha/actions/runs/14403085349/job/40393037501
Ames (completed) + another repeat of all other tests (in progress at this time): https://github.com/david-thrower/cerebros-core-algorithm-alpha/actions/runs/14413200622/job/40425374243?pr=172

Aidyn-Lopez · 2025-04-11T23:34:25Z

Looks great, and I approve.

david-thrower added 30 commits March 22, 2025 14:16

Update automerge.yml

30164c7

Temporarily comment out time-consuming workflows to disable them. Comment out the BERT-based text classification workflow, possibly permanently, as it is obsolete.

Update automerge.yml

8904966

Add branch to workflow.

Update phishing_email_detection_gpt2.py

c7e8b30

Added a baseline fine tuning of the full GPT2 to compare against Cerebros text classifier.

Update phishing_email_detection_gpt2.py

b790e64

Update phishing_email_detection_gpt2.py

15ec9c2

Forgot to add dropout.

Update phishing_email_detection_gpt2.py

0cfb488

Amendments to Cerebros model.

Update phishing_email_detection_gpt2.py

6f86959

Reduce seq length to accelerate job completion.

Update automerge.yml

830a2dc

Up timeout to 300 min.

Update phishing_email_detection_gpt2.py

407f90c

Correct history indexing error.

Update phishing_email_detection_gpt2.py

d5bdbce

Temporary test to fast forward to cerebros model.

Update phishing_email_detection_gpt2.py

d8db0f1

Comment out an artifact of the GPT test so this can lint and run.

Update phishing_email_detection_gpt2.py

014b3c3

Fix errors from trying to work too fast ...

Update phishing_email_detection_gpt2.py

0b67f88

Re-corrected the metrics BinaryAccuracy to correct AI introduced error.

Update phishing_email_detection_gpt2.py

a480dfd

Correct metric to rank by (binary accuracy) ...

Update phishing_email_detection_gpt2.py

0e72e61

Uncomment out GPT test ...

Update phishing_email_detection_gpt2.py

3cd5945

Upped number of trials to 5.

Update phishing_email_detection_gpt2.py

6a9e88d

Make seq len 750, fix typo.

Update phishing_email_detection_gpt2.py

f24a858

Try 1024 seq len.

Update automerge.yml

4e15756

Added branch to the workflow...

Update phishing_email_detection_gpt2.py

9a4db15

Added a positional embedding and a LayerNorm to the text embedding.

Update phishing_email_detection_gpt2.py

59cfa23

Missed position embedding in copy and paste ...

Update phishing_email_detection_gpt2.py

d928a54

Synchronize embedding dim across embeddings.

Update phishing_email_detection_gpt2.py

3c25a22

Corrected import of PositionEmbedding.

Update phishing_email_detection_gpt2.py

88a1bd5

Remove layernorm, concat instead of add.

Update phishing_email_detection_gpt2.py

42d9c4f

Try addition to merge embeddings without LayerNorm

Update phishing_email_detection_gpt2.py

cdb4455

Hard set levels to the known optimum.

Update phishing_email_detection_gpt2.py

048eb1b

Corrected hard set on levels to correct optima.

Update phishing_email_detection_gpt2.py

b800cf7

Restore the best model yet.

Update automerge.yml

7930a2d

Add back the CICD test for image CLS. Prepare for PR.

david-thrower added 13 commits April 10, 2025 11:08

Update phishing_email_detection_gpt2.py

6df20aa

Added back baseline workflow in best trial thus far.

Update automerge.yml

794fc23

Added all CICD tests to be used back to best NLP configuration.

Update requirements.txt

3e467fe

Upgrade tf

Update cicd-requirements.txt

96897ae

Upgrade tensorflow-text to v 2.19.0

Update automerge.yml

a66e6a6

Add branch to workflows.

Update requirements.txt

5c58d65

Typo on requirements.txt

Update phishing_email_detection_gpt2.py

2c417fb

Test to fast forward to Cerebros NLP test and check for compatibility.

Update phishing_email_detection_gpt2.py

44854be

Attempt to correct issue with tf v 2.19.0 graph scope.

Update phishing_email_detection_gpt2.py

f46ad78

Another attempt to resolve tf v 2.19.0 graph scope compatibility...

Update automerge.yml

4806719

Run a full CICD run.

Update cifar10-example.py

783368e

AI suggested tf 2.15.0 -> 2.19.0 compat fix.

Update phishing_email_detection_gpt2.py

3b9ffc0

Add back the baseline GPT2 task.

Update phishing_email_detection_gpt2.py

97591d3

Fix a typo in string termination ...

david-thrower linked an issue Apr 11, 2025 that may be closed by this pull request

upgrade-tf-2.19.0 #171

Closed

david-thrower marked this pull request as draft April 11, 2025 22:38

Update automerge.yml

1de46fc

Uncommented out CICD test that was left commented out by error.

david-thrower marked this pull request as ready for review April 11, 2025 23:00

david-thrower requested a review from Aidyn-Lopez April 11, 2025 23:00

Aidyn-Lopez approved these changes Apr 11, 2025

david-thrower merged commit 6dfb427 into main Apr 12, 2025
1 check passed
