-
Notifications
You must be signed in to change notification settings - Fork 52
SQLAlchemy proof-of-concept #1152
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
OKAY, so I finally ran some performance tests for SQLAlchemy, for that I used DICOM archiving, which is a script in which most time is taken by file compression, but which still does a good number of SQL queries, notably a lot of small
* I ignore the first run because the files were not in the hard drive cache at this point. The results can vary by a few seconds on each run, nevertheless, there is a 1s slowdown in the SQLAlchemy runs compared to the raw SQL ones, but it still faster than our current script. Those are certainly very broad results so I can do more granular benchmarks if that is really really wanted. RAW results
|
DO NOT MERGE YET, I'll change the |
f65fc6b
to
f43ef19
Compare
Yay ! The PR is ready to be merged 🥳. Obviously SQLAlchemy is not called in the code yet, but it does define the namespace, and a few classes I want to use in the new DICOM archive PR, which I think is a good beginning now that the decision was taken. I also took the liberty or re-ordering the requirements to be alphabetical. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@maximemulder I understand that dicom_archive makes more sense than tarchive but I would stick to the table names in the ORM since I highly doubt those tables are going to be renamed.
Also, dicom_archive_file should be tarchive_files (missing the s) as there is an s at the end of the table.
@cmadjar I have a very high preference for not doing these changes. Given our messy database, I think one of the advantages of using an ORM is to abstract some of this messiness away to have a more consistent and coherent codebase without changing the schema itself. I obviously do not want to hide structural details like the association between phantoms and candidates, but I think we should go for more coherent and consistent naming where we can.
Other than clearer code (IMO), I think there are a few arguments to be made on why these occasional renamings are a good idea :
* BTW, I also added a comment on the |
Okay, if anyone wants to see SQLAlchemy in use in the codebase, I have this branch for which I will make a PR sometime soon. |
* remove trailing whitespaces (#1149) Co-authored-by: Maxime Mulder <maxime.mulder@mcgill.ca> * SQLAlchemy proof-of-concept (#1152) * add sqlalchemy * fix lints * add comment --------- Co-authored-by: Maxime Mulder <maxime.mulder@mcgill.ca> * fix (#1168) Co-authored-by: Maxime Mulder <maxime.mulder@mcgill.ca> * integration test (#1138) * test * main Loris here * ignore Loris python * clone Loris * rm loris * override test * override test * override RB data * fix folder * test php files * test * try * test * try install loris mri * os * test * rm modules * Update flake8_python_linter.yml * Small linting improvements (#1167) * better linting * remove tab in comment --------- Co-authored-by: Maxime Mulder <maxime.mulder@mcgill.ca> * Improve Python subject determination and validation error handling (#1146) * improve subjects determination and validation * use tuple for database query arguments * query on single line --------- Co-authored-by: Maxime Mulder <maxime.mulder@mcgill.ca> * Add literal regex leading r (#1176) Co-authored-by: Maxime Mulder <maxime.mulder@mcgill.ca> * fix (#1178) Co-authored-by: Maxime Mulder <maxime.mulder@mcgill.ca> * LORIS-MRI tooling configuration (#1170) * add pyproject.toml * keep flake8 (for now) * double type checking * Use Pyright as main type checker * remove comment * test * test * change configuration * add python environment variables * try factorization * test * test * test * test * remove temporary things * test ruff error * ruff output github * further test errors * test pyright output github * test * test * test * change comment * remove test errors * test --------- Co-authored-by: Maxime Mulder <maxime.mulder@mcgill.ca> * update docker install (#1171) Co-authored-by: Maxime Mulder <maxime.mulder@mcgill.ca> * add RB override tables for the MRI side (#1173) * Remove php 8.1 and 8.2 from the LORIS tests to keep only test on PHP 8.3 (#1175) * remove php 8.1 and 8.2 testing * update to 8.3 in Dockerfile.test.php8 * ADD minc tools install in gitactions (#1179) * todo install loris mri * test * run pytest in gitaction (#1172) * test * ttt * test * test * test * test * test * test * test * test * test * test * test * test * test * test * test * test * test * Delete test_example.py * rename * done * Delete test/Dockerfile.test.py * test (#1187) * unit tests poc (#1188) * add more pytest global rules (#1194) * Stricter linter configuration (#1193) * stricter ruff config * do not ignore N818 * lint for sorted imports * add python scripts directory (#1196) * Bunch of SQLAlchemy definitions (#1192) * bunch of sqlalchemy definitions * add project orm definition * fix candidate to site query function * optimize docker image (#1195) * fix pytest config (#1197) * fix python scripts (#1199) * remove mysql-connector (#1200) * Improve subject configuration structure (#1150) * refacor subject config * remove suspicious phantom code * change naming * fix rebase lints * cleaner database url * Cecile CandID path join bug fix Co-authored-by: Cécile Madjar <cecile.madjar@mcin.ca> --------- Co-authored-by: Cécile Madjar <cecile.madjar@mcin.ca> * Integration test to check that the ORM definitions match the LORIS SQL database (#1198) * orm sync integration test * add docker health check * use integration config file * test error * Revert "test error" This reverts commit d1a70e6. * Clean up Docker database scripts (#1202) * clean docker database * remove docker warning * remove unused attribute (#1201) * try fix perl ci * Add file, file_parameter and scanner SQLAlchemy models (#1207) * add tables * rename file model * rename parameter_file * fix lints (#1210) * Unify LORIS-MRI logging (#1191) * new env * change print order in database log * create notification type if needed * fix typing pyright update * Integration test for `run_dicom_archive_loader.py` (#1203) add integration test put keys in github repo debugging add database check add comments and debugging remove s3 keys change converter in test check file tree fix doc fix docs 2 rename variable go back to using s3 keys fix path try fix * rename database sub-namepaces (#1214) * fix warnings (#1215) * Factorize get subject session (#1190) * factorize subject session * re-add site log * Migrate "get all sites from the database" to the SQLAlchemy abstraction (#1216) * Fix new PEP585 Ruff lint (#1218) * Clean-up database scan type (#1139) Complementary PR of LORIS #9304 * Migrate DICOM archive files to new database abstraction (#1217) * wip * use self.dicom_archive.id instead of self.tarchive_id * fix todo * Fix SQLAlchemy file model (#1221) * Improve Python requirements order (#1222) * add python documentation (#1223) * Update SQLAlchemy bindings to match LORIS#9556 (#1229) * trigger_tests * try update * fix old queries for integration tests * Migrate `ImagingUpload` and `MriUploadDB` to the new database abstraction (#1224) * migrate mri upload to sqlalchemy * fix concurrency bug * fix clean up * display mri upload ids use dicom archive id instead of tarchiveid because it is prettier ruff check * migrate scanner to new abstraction (#1232) * Fix install script (#1234) * fix install script and move files related to install into the install directory * fix install script and move files related to install into the install directory * fix install script and move files related to install into the install directory * fix paths in imaging_install_test.sh * fix paths in imaging_install_test.sh * fix SQL error in install script * fix tests * fix tests * fix github action * add cpanfile * copy cpanfile in MRI dockerfile * add new dependency in order to be able to source the Digest::BLAKE2 in the cpanfile * fix Digest::BLAKE2 installation though cpanfile * move config templates to a templates directory and requirements files into a requirements directory * remove copy * fix test * fix test * fix test * fix test * Maxime's feedback * specify python version (#1237) * fix large file hash memory (#1236) * add lib.util module (#1238) * New DICOM study import script (#1117) * rewrite dicom archive * update * remove --year and --target options * get session from config file * remove unused upload argument * fix little oopsie for --session * handle large files hash * fix unhandled modalities * use new lib.util module * Add support to be able to read headers from enhanced DICOMs (#7) * print * print * print * print * add support for enhanced DICOMs * add support for enhanced DICOMs * rework associate files with series * sort dicom series and files before inserting in the database * handle null scanner --------- Co-authored-by: Cécile Madjar <cecile.madjar@mcin.ca> * refactoring of CandID on the non-ORM python side * tests * fix insertion in mri_protocol_violation_scans * alphabetical order is hard sometimes * FIx `get_candidate_id` documentation --------- Co-authored-by: Maxime Mulder <maxime-mulder@outlook.com> Co-authored-by: Maxime Mulder <maxime.mulder@mcgill.ca> Co-authored-by: Shen <kongtiaowangshen@gmail.com>
This PR introduces SQLAlchemy in LORIS-MRI, which will be discussed at the next imaging meeting. It builds on #1151, so diff against that branch. It does not contain queries yet, only some definition files for a few tables, but if the proposal is accepted there are a few queries I will want to convert fast !
Caveat for existing projects
In release 27.* of LORIS-MRI, we will support both the SQLAlchemy and the old direct SQL way of querying the database but as of 28.* of LORIS-MRI, all libraries under
database_lib
and the librarydatabase.py
will be removed so we encourage you to update your code accordingly.Ensure you install SQLAlchemy>=2.0.0 after upgrading the code using
pip install sqlalchemy>=2.0.0
orpip upgrade sqlalchemy
if SQLAlchemy is already listed in your pip packages but not using version >=2.0.0