Skip to content

Conversation

pylipp
Copy link
Contributor

@pylipp pylipp commented Jun 27, 2025

@aerinsol feel free to check out the SQL query

@pylipp
Copy link
Contributor Author

pylipp commented Jun 27, 2025

@pylipp pylipp force-pushed the incoming-shipments-in-movedboxes-data branch from dac1879 to 27d89d6 Compare July 2, 2025 09:40
Copy link

codecov bot commented Jul 2, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 74.43%. Comparing base (8d1157e) to head (2c55f98).
Report is 41 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #2208   +/-   ##
=======================================
  Coverage   74.43%   74.43%           
=======================================
  Files         276      276           
  Lines       19067    19068    +1     
  Branches     1744     1744           
=======================================
+ Hits        14192    14193    +1     
  Misses       4831     4831           
  Partials       44       44           
Flag Coverage Δ
frontend 65.46% <ø> (ø)
sharedComponents 66.88% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@pylipp pylipp requested review from HaGuesto and aerinsol July 2, 2025 10:23
@pylipp pylipp marked this pull request as ready for review July 2, 2025 10:24
Copy link
Member

@aerinsol aerinsol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some clarifications requested.

shipment sh
ON
d.shipment_id = sh.id AND
d.removed_on IS NULL AND
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these should be on a WHERE statement for clarity - I would keep the JOINS together and the WHERE statement at the end as that's the convention. Unless you have a reason to do things that way?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought the operation would be more performant when the filtering of WHERE is already included in the JOIN conditions

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it doesn't work like that - it's actually an important conceptual difference that ties into indexing and Pk selection, and I recommend you look at the tech behind join optimization if you're going to try to write SQL that way.

sh.receiving_started_on IS NOT NULL AND
sh.canceled_on IS NULL AND
sh.state != %s
JOIN products sp ON sp.id = d.source_product_id
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why an inner join for source products and a left join for target products?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The goal is to show data of all boxes in the shipment that's being received, regardless of being reconciled or not.
While the shipment is in receiving state, boxes that have not yet been reconciled, have NULL target product. Using an inner join for target products would exclude those boxes.
The source product however is always not NULL because it's copied and set at the moment the box the added to the shipment on the source side (i.e. when a shipment detail is created)

-- Collect information about all boxes sent to the specified base as target, that
-- were not removed from the shipment during preparation
SELECT
DATE(dps.receiving_started_on) AS moved_on,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the shipment receive start date. Food for thought - should we use the shipment receive start date, or the reconciliation date? Because technically none of the contents of the shipment have been verified.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, you want to avoid that data is being displayed that is supposed to be in the shipment but has not been reconciled. I like the idea, it would also simplify the entire query quite a lot (no need for distinguished reconciled and not-yet reconciled boxes).

Just one scenario that could be a bit confusing: assume reconciliation is being start on April 30th and finished on May 1st. When users apply a time range filter on the statviz page to show only April data, they won't see the boxes reconciled in May included, and they might think "why is only half of the shipment displayed that I received in April"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed: use detail.received_on date, and only include reconciled boxes

UNION ALL
-- Collect information about all boxes sent to the specified base as target, that
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also think for the sankey to be coherent, we will need to see the created boxes / items for stuff registered. Is this already in the larger stats queries?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the same query for the Sankey, we've already included boxes that are created in a Donated location (we assume these boxes have been gone out to a different org, and are not part of the warehouse anymore).

Do you mean that in the Sankey you want to have an "incoming flow" of created boxes from the left? I agree that this is more consistent once we show boxes from incoming shipments, too. However then shouldn't we also include the count of boxes that are stationary to the warehouse? (I'm thinking of a simple Sankey showing 100 created InStock boxes coming from the left, and 10 shipped boxes leaving to the right. Either the user thinks "alright, this means, I have 90 InStock boxes currently", or they wonder "what's with the difference of 90 boxes?", or maybe they don't wonder at all 😄 )

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, exactly! I'm thinking we keep an inflows of already instock items as well. Though as I think of it, we probably need a visual sankey mockup to see what's effective.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed: think of data structure to include incoming shipments in CreatedBoxes query

JOIN products sp ON sp.id = d.source_product_id
LEFT JOIN products tp ON tp.id = d.target_product_id
) dps -- details/products/sizes
JOIN camps c ON c.id = dps.source_base_id
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. Currently when a base or org is deleted, we delete also the name in the DB right? So if a partner leaves, we'll simply have a null org in this data?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. When a partner org or base is deactivated, we set the deleted_on timestamp but keep the name.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case I want to bring this up in standup as well since we've had issues with the statistics in the past

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aerinsol I forgot to bring this up. I'm not sure which issues you mean since we do keep the base name after deactivating.

@pylipp
Copy link
Contributor Author

pylipp commented Jul 7, 2025

Some clarifications requested.

Thanks @aerinsol for the valuable feedback. Please confirm for 2. and 3. below

  1. I'll move a couple of conditions from JOIN to WHERE
  2. I can update the newest query part to include only boxes that were already reconciled
  3. I'll create another trello card to include created InStock boxes in the data

@aerinsol
Copy link
Member

aerinsol commented Jul 7, 2025

I confirm 2, I think for 3 I need to understand the current shape of the data first. We can discuss during standup.

@pylipp pylipp marked this pull request as draft July 29, 2025 13:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants