Re-factored the scripts that fetch the data from the central Db and r… #713

DennisGibz · 2025-10-28T13:18:11Z

…emoved the sub-query which was eliminating some records from the ODS, especially in ct_Patient

Summary by Sourcery

Refactor the CT_DOCKET ODS scripts to replace nested subqueries with CTE-driven pipelines that leverage row_number for record deduplication and reintroduce previously omitted records by removing the faulty filtering subquery.

Bug Fixes:

Remove the subquery that was omitting records in CT_PatientVisits and other scripts to ensure complete data load

Enhancements:

Refactor MERGE logic in CT_DOCKET scripts to use CTE-based source, ordered, and max-ordered pipelines for deduplication across patient visits, ART patients, patients, pharmacy, and status loads
Standardise record filtering and deduplication by introducing row_number partitions instead of nested subqueries
Streamline and simplify SQL scripts by removing redundant subqueries and consolidating logic into reusable CTE patterns

sourcery-ai · 2025-10-28T13:18:18Z

Reviewer's Guide

This PR refactors the ODS CT_DOCKET SQL scripts by replacing embedded subqueries used for deduplication and max-date selection with more readable CTE-based pipelines that use row_number() partitions to identify and merge only the latest records.
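As a rough illustration of the source, ordered, and max-ordered pattern described above, here is a minimal sketch. It is not the PR's actual SQL: SQLite (via Python's `sqlite3`) stands in for SQL Server, the table is cut down to a few columns, and ordering by a `Created` column is an assumption about how "latest" is decided.

```python
import sqlite3

# SQLite stands in for SQL Server here; the table and column names are
# simplified assumptions, not the real DWAPICentral / ODS schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PatientArtExtract (
        SiteCode INTEGER, PatientPK INTEGER, StartARTDate TEXT, Created TEXT
    );
    INSERT INTO PatientArtExtract VALUES
        (100, 1, '2020-01-01', '2025-01-01'),
        (100, 1, '2020-01-05', '2025-06-01'),  -- newer copy of the same patient
        (100, 2, '2021-03-10', '2025-02-01'),
        (200, 1, '2019-07-20', '2025-03-01');
""")

DEDUP_SQL = """
WITH source AS (              -- raw extract rows
    SELECT SiteCode, PatientPK, StartARTDate, Created
    FROM PatientArtExtract
),
ordered AS (                  -- rank rows within each business key
    SELECT *,
           ROW_NUMBER() OVER (
               PARTITION BY SiteCode, PatientPK
               ORDER BY Created DESC   -- assumed ordering column
           ) AS rn
    FROM source
),
max_ordered AS (              -- keep only the top-ranked row per key
    SELECT * FROM ordered WHERE rn = 1
)
SELECT SiteCode, PatientPK, StartARTDate
FROM max_ordered
ORDER BY SiteCode, PatientPK
"""

latest_rows = conn.execute(DEDUP_SQL).fetchall()
for row in latest_rows:
    print(row)
```

In the real scripts the final `SELECT` would be the `USING` source of the `MERGE`; the point of the sketch is only that each (SiteCode, PatientPK) pair yields exactly one row.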

Sequence diagram for the new CTE-based ETL pipeline in CT_PatientVisits

sequenceDiagram
    participant DWAPICentral
    participant CTEs
    participant ODS_Care_CT_PatientVisits
    DWAPICentral->>CTEs: Extract PatientVisit data
    CTEs->>CTEs: Partition and rank records by SiteCode, PatientPK, VisitDate, VisitID
    CTEs->>CTEs: Select latest record per partition (rank=1)
    CTEs->>ODS_Care_CT_PatientVisits: Merge latest records into ODS table

Sequence diagram for the new CTE-based ETL pipeline in Load_CT_ARTPatient

sequenceDiagram
    participant DWAPICentral
    participant CTEs
    participant ODS_Care_CT_ARTPatients
    DWAPICentral->>CTEs: Extract PatientArt data
    CTEs->>CTEs: Partition and rank records by SiteCode, PatientPK
    CTEs->>CTEs: Select latest record per partition (rank=1)
    CTEs->>ODS_Care_CT_ARTPatients: Merge latest records into ODS table

Sequence diagram for the new CTE-based ETL pipeline in CT_PatientPharmacy

sequenceDiagram
    participant DWAPICentral
    participant CTEs
    participant ODS_Care_CT_PatientPharmacy
    DWAPICentral->>CTEs: Extract PatientPharmacy data
    CTEs->>CTEs: Partition and rank records by SiteCode, PatientPK, DispenseDate, Drug, VisitID
    CTEs->>CTEs: Select latest record per partition (rank=1)
    CTEs->>ODS_Care_CT_PatientPharmacy: Merge latest records into ODS table

Sequence diagram for the new CTE-based ETL pipeline in CT_PatientStatus

sequenceDiagram
    participant DWAPICentral
    participant CTEs
    participant ODS_Care_CT_PatientStatus
    DWAPICentral->>CTEs: Extract PatientStatus data
    CTEs->>CTEs: Partition and rank records by SiteCode, PatientPK, ExitDate, ExitReason
    CTEs->>CTEs: Select latest record per partition (rank=1)
    CTEs->>ODS_Care_CT_PatientStatus: Merge latest records into ODS table

Class diagram for refactored ETL CTE pipeline structure

classDiagram
    class DWAPICentral {
        +PatientExtract
        +PatientVisitExtract
        +PatientArtExtract
        +PatientPharmacyExtract
        +PatientStatusExtract
        +Facility
    }
    class CTE_Pipeline {
        +row_number() partitioning
        +rank selection
        +deduplication
    }
    class ODS_Care_CT_Patient {
        +PatientID
        +PatientPK
        +SiteCode
        +FacilityName
        +... (other patient attributes)
    }
    class ODS_Care_CT_PatientVisits {
        +PatientID
        +PatientPK
        +SiteCode
        +FacilityName
        +VisitID
        +VisitDate
        +... (other visit attributes)
    }
    class ODS_Care_CT_ARTPatients {
        +PatientID
        +PatientPK
        +SiteCode
        +FacilityName
        +... (other ART attributes)
    }
    class ODS_Care_CT_PatientPharmacy {
        +PatientID
        +PatientPK
        +SiteCode
        +FacilityName
        +Drug
        +DispenseDate
        +... (other pharmacy attributes)
    }
    class ODS_Care_CT_PatientStatus {
        +PatientID
        +PatientPK
        +SiteCode
        +FacilityName
        +ExitDate
        +ExitReason
        +... (other status attributes)
    }
    DWAPICentral --> CTE_Pipeline
    CTE_Pipeline --> ODS_Care_CT_Patient
    CTE_Pipeline --> ODS_Care_CT_PatientVisits
    CTE_Pipeline --> ODS_Care_CT_ARTPatients
    CTE_Pipeline --> ODS_Care_CT_PatientPharmacy
    CTE_Pipeline --> ODS_Care_CT_PatientStatus

File-Level Changes

| Change | Details | Files |
| --- | --- | --- |
| Refactor deduplication logic to CTE-based ranking | Introduce source, ordered, and max-ordered CTEs to isolate and rank raw extracts; remove nested subqueries for max IDs and created dates in favor of row_number() windows; standardize gender != 'Unknown' and facility code >0 filters across all scripts; update MERGE USING clauses to consume CTE results directly | `Scripts/ODS/CT_DOCKET/CT_PatientVisits.sql` `Scripts/ODS/CT_DOCKET/Load_CT_ARTPatient.sql` `Scripts/ODS/CT_DOCKET/Load_CT_Patient.sql` `Scripts/ODS/CT_DOCKET/CT_PatientPharmacy.sql` `Scripts/ODS/CT_DOCKET/CT_PatientStatus.sql` |

sourcery-ai

Hey there - I've reviewed your changes - here's some feedback:

- In Load_CT_ARTPatient.sql the final MERGE is using the Ordered_ct_PatientArt_source CTE instead of the MaxOrdered_ct_PatientArt_source CTE, so you’re merging all rows rather than only the latest ones.
- In CT_PatientPharmacy.sql the CTE names (e.g. Ordered_ct_patient_source) don’t clearly reflect the pharmacy context; consider renaming them to Ordered_ct_PatientPharmacy_source to avoid confusion or collision.
- All scripts repeat large column lists in multiple CTEs and the MERGE; consider centralizing or auto-generating the column list (e.g. via a view or script) to reduce duplication and maintenance overhead.
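
To make the first point concrete, here is a small sketch of why the choice of CTE matters. It is not taken from Load_CT_ARTPatient.sql: SQLite (via Python's `sqlite3`) stands in for SQL Server, and the CTE names `ordered` and `max_ordered`, the three-column table, and the `Created` ordering column are all simplified assumptions.

```python
import sqlite3

# Simplified stand-in data: one patient appears twice in the extract.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PatientArtExtract (SiteCode INTEGER, PatientPK INTEGER, Created TEXT);
    INSERT INTO PatientArtExtract VALUES
        (100, 1, '2025-01-01'),
        (100, 1, '2025-06-01'),
        (200, 7, '2025-03-01');
""")

# Hypothetical CTE names mirroring the "ordered" / "max-ordered" stages.
CTES = """
WITH ordered AS (
    SELECT SiteCode, PatientPK, Created,
           ROW_NUMBER() OVER (
               PARTITION BY SiteCode, PatientPK ORDER BY Created DESC
           ) AS rn
    FROM PatientArtExtract
),
max_ordered AS (
    SELECT * FROM ordered WHERE rn = 1
)
"""

def count_rows(cte_name: str) -> int:
    # cte_name is only ever one of the two literal names used below.
    return conn.execute(CTES + f"SELECT COUNT(*) FROM {cte_name}").fetchone()[0]

ordered_count = count_rows("ordered")          # ranked, but nothing filtered out
max_ordered_count = count_rows("max_ordered")  # one row per (SiteCode, PatientPK)
print(ordered_count, max_ordered_count)
```

The ranked CTE still carries every extract row, so using it as the MERGE source feeds duplicates for the same key into the merge; only the filtered CTE is deduplicated.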

nobert-mumo · 2025-10-30T16:26:20Z

@DennisGibz just to confirm, is the main change replacing the subqueries with the new CTEs? The sourcery-ai summary gives some overview, but it would be nice to provide context on what necessitated the change, for a better review.

Re-factored the scripts that fetch the data from the central Db and r…

d9f26bd

…emoved the sub-query which was eliminating some records from the ODS, especially in ct_Patient

DennisGibz requested review from Marymary-dev, nobert-mumo and nthusi-codes October 28, 2025 13:18

sourcery-ai bot reviewed Oct 28, 2025
