How to use batch_size and num_samples for HGTLoader ? #6707
LaurentBerder started this conversation in General
It's interesting that the number of shippers, consignees and countries is so low for this. Is every booking connected to one of each?
-
Hi,
I'm working with a `HeteroData` object that models bookings, looking roughly like the following for each booking:

Shippers, Consignees and Countries may be connected to different bookings, but a container is only linked to a single booking (though as you can see, a booking may have numerous containers).
I'd like to use the `HGTLoader` class to get subgraphs containing random bookings, but with all their direct connections (all their containers, their shipper, consignee and countries).

But I don't understand how the `num_samples` and `batch_size` arguments work in the class (I'm setting different values for the different node types and getting very different results, but none looking like what I want), and the documentation doesn't explain it.

I seem to understand that `batch_size` is roughly the number of nodes (excluding `input_nodes`) that we end up with. But as for `num_samples`, I have no idea how to set it.

Here's an example of results with the following arguments:
{'booking': 29, 'container': 28, 'shipper': 2, 'consignee': 2, 'country': 6}
That looks quite promising, except I've never been able to find a setting that would give me more than a single container per booking!
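To make the target concrete, here is a minimal sketch in plain Python of the batch I'm after: pick a few bookings, then keep every node directly connected to them. It does not use `HGTLoader` at all. The toy edge lists, the relation names (`has`, `sent_by`, `received_by`, `routed_via`) and the helper names `one_hop_batch` and `sample_bookings` are all made up for illustration and are not part of my real data or of PyTorch Geometric.

```python
import random
from collections import defaultdict

# Hypothetical toy graph: edge lists keyed by (source type, relation, target type).
# Relation names and node ids are invented for illustration only.
edges = {
    ("booking", "has", "container"): [(0, 0), (0, 1), (1, 2), (2, 3), (2, 4), (2, 5)],
    ("booking", "sent_by", "shipper"): [(0, 0), (1, 0), (2, 1)],
    ("booking", "received_by", "consignee"): [(0, 0), (1, 1), (2, 1)],
    ("booking", "routed_via", "country"): [(0, 0), (0, 1), (1, 1), (2, 2)],
}


def sample_bookings(num_bookings, batch_size, seed=None):
    """Draw `batch_size` distinct booking ids uniformly at random."""
    rng = random.Random(seed)
    return rng.sample(range(num_bookings), batch_size)


def one_hop_batch(edges, seed_bookings):
    """Return, per node type, the seed bookings plus all of their direct neighbours."""
    seeds = set(seed_bookings)
    nodes = defaultdict(set)
    nodes["booking"] = set(seeds)
    for (src_type, _relation, dst_type), pairs in edges.items():
        if src_type != "booking":
            continue  # only edges leaving a booking matter for a one-hop batch
        for src, dst in pairs:
            if src in seeds:
                nodes[dst_type].add(dst)
    return {node_type: sorted(ids) for node_type, ids in nodes.items()}


# Fixed seeds so the result is easy to check by hand.
batch = one_hop_batch(edges, seed_bookings=[0, 2])
print({node_type: len(ids) for node_type, ids in batch.items()})

# Random seeds, as a loader would pick them.
random_batch = one_hop_batch(edges, sample_bookings(num_bookings=3, batch_size=2, seed=0))
```

With bookings 0 and 2 as seeds, the batch keeps all five of their containers, which is the behaviour I can't reproduce with the loader: there I only ever get one container per booking.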
How should I approach this?