-
We just added the `LinkNeighborLoader`.
-
Hi, sorry, I'm extremely confused about the parameters to pass. I've looked into the source and debugged multiple variations, and I keep getting different assertion errors each time. Would you be able to help me frame the loader to my problem? For your convenience, I've updated the discussion above, under the Replicate section, with code for creating a small-scale data object.

```python
kwargs = {'batch_size': 1024, 'num_workers': 6, 'persistent_workers': True}
num_neigh = {key: [30] * 2 for key in data.edge_types}
loader = LinkNeighborLoader(
    data=g1,
    num_neighbors=???,
    edge_label_index=???,
    edge_label=???,
    replace=False,
    directed=True,
    is_sorted=False,
    neighbor_sampler=None,
    **kwargs)
```

How would these be framed if the data object is a graph like g1? (Editing this as a separate comment.) If possible, could you also take a look at the three questions in the Help Required section? I would love to get your feedback on the dataset construction and custom transform.
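To show what I mean, this is roughly how I'd guess the heterogeneous arguments are meant to be framed (a sketch only; the `('user', 'rates', 'movie')` edge type and the `edge_label` attribute are placeholder assumptions that won't match g1 exactly):

```python
from torch_geometric.loader import LinkNeighborLoader

# Placeholder supervision edge type -- would be one of g1.edge_types
edge_type = ('user', 'rates', 'movie')

loader = LinkNeighborLoader(
    data=g1,
    # 30 neighbors per hop for 2 hops, for every edge type in the graph
    num_neighbors={key: [30, 30] for key in g1.edge_types},
    # supervision edges of one type: (edge_type, edge indices of that type)
    edge_label_index=(edge_type, g1[edge_type].edge_index),
    # per-edge targets aligned with edge_label_index (assumed to exist on g1)
    edge_label=g1[edge_type].edge_label,
    replace=False,
    directed=True,
    batch_size=1024,
    num_workers=6,
    persistent_workers=True,
)
```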
-
Hi PyG Community,
I've been wracking my brains on something for too long and would like to request some help.
I recently built a HeteroData object for an edge-label prediction task and was able to train it successfully. Now that I've increased the amount of data and rebuilt the graph, I run into:
```
RuntimeError: CUDA out of memory. Tried to allocate 11.44 GiB (GPU 0; 15.78 GiB total capacity; 6.47 GiB already allocated; 5.47 GiB free; 8.90 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
```
I have tried almost every hack and suggestion posted on multiple forums for such an issue, like running evaluation under `with torch.no_grad():` after `model.eval()`, since annotations/decorators over methods might not have worked. The main issue, for many people running different models, seems to be `batch_size`: basically, the amount of data you put on the GPU is what makes it run out of memory. This is the current data size. I'm not sure whether it is big, or whether there are larger datasets that work just fine because of sampling or loading the data in batches. For now, my train loop doesn't have any batching (I followed some examples), so I can't directly vary `batch_size` to test. I am planning to add it, but I'm not sure that alone will solve the CUDA memory issue. Nothing else has changed, and I've tested the same setup on a larger VM with more memory and GPUs.
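For reference, the no_grad hack I tried looks essentially like this (a minimal sketch; the model and graph names are placeholders):

```python
import torch

# evaluate without building the autograd graph, to save GPU memory
model.eval()
with torch.no_grad():
    out = model(data.x_dict, data.edge_index_dict, data.edge_attr_dict)
```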
**Problem**
I haven't been able to sample or use any DataLoader for this HeteroData, since there was nothing that splits on edges; most loaders are node-based. To make a train/test split, I had to use indexing and slice `edge_index`, `edge_attr` and `edge_label` sequentially, since `RandomLinkSplit` wouldn't have worked in this case. I have temporal features on the edge attributes, and the goal was to learn from one time period and generalize to a future time period, so the sequential split works fine. I don't mind some sequential correlation, and I'd like to someday merge this with PyTorch Geometric Temporal to test a few theories.
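Roughly, the sequential split looks like this (a simplified sketch for a single edge type; the edge type name is a placeholder and the edges are assumed to already be in time order):

```python
# slice the first 80% of (time-ordered) edges into train, the rest into test
edge_type = ('user', 'rates', 'movie')   # placeholder edge type
store = data[edge_type]
cut = int(0.8 * store.edge_index.size(1))

train_data, test_data = data.clone(), data.clone()
train_data[edge_type].edge_index = store.edge_index[:, :cut]
train_data[edge_type].edge_attr = store.edge_attr[:cut]
train_data[edge_type].edge_label = store.edge_label[:cut]

test_data[edge_type].edge_index = store.edge_index[:, cut:]
test_data[edge_type].edge_attr = store.edge_attr[cut:]
test_data[edge_type].edge_label = store.edge_label[cut:]
```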
**Replicate**
If you want to replicate a smaller-scale graph:
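Something along these lines (a stand-in sketch with random features, not the exact snippet; node and edge type names are made up):

```python
import torch
from torch_geometric.data import HeteroData

g1 = HeteroData()
g1['user'].x = torch.randn(100, 16)     # 100 user nodes, 16 features
g1['movie'].x = torch.randn(50, 32)     # 50 movie nodes, 32 features

src = torch.randint(0, 100, (500,))     # 500 random user -> movie edges
dst = torch.randint(0, 50, (500,))
g1['user', 'rates', 'movie'].edge_index = torch.stack([src, dst], dim=0)
g1['user', 'rates', 'movie'].edge_attr = torch.randn(500, 8)
g1['user', 'rates', 'movie'].edge_label = torch.randint(0, 2, (500,)).float()
```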
**Solution?**
(See the Clarification section first to see how the data object was made, and please do correct me if I made any errors or missed something; I feel I did.)
I can see two ways to batch it. The model will still get the full `x_dict` and a smaller (`edge_index_dict`, `edge_attr_dict`).

a) I can maybe do it like this, and only use those indices to access and build a runtime edge_dict, but it seems inefficient.
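Roughly what I have in mind for (a), i.e. sampling edge ids each step and building the smaller dicts at runtime (a sketch; the model call and batch size are placeholders):

```python
import torch

def sample_edge_batch(data, batch_size):
    # pick a random subset of edges per edge type and slice the edge stores
    edge_index_dict, edge_attr_dict = {}, {}
    for edge_type in data.edge_types:
        store = data[edge_type]
        perm = torch.randperm(store.edge_index.size(1))[:batch_size]
        edge_index_dict[edge_type] = store.edge_index[:, perm]
        edge_attr_dict[edge_type] = store.edge_attr[perm]
    return edge_index_dict, edge_attr_dict

# the model still sees the full x_dict; only the edge dicts shrink
edge_index_dict, edge_attr_dict = sample_edge_batch(data, batch_size=1024)
out = model(data.x_dict, edge_index_dict, edge_attr_dict)
```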
b) Every time I try using a loader from the set of available and relevant options, or just the base one, I get an error like this, which makes me think there is something wrong with my graph creation class or some method I'm missing. Please let me know if that is the case. I have attached a template of how I made the Dataset, following all the tutorials and documentation.

This feels like the saner option, but it will require code changes and a rerun (not a big deal), and it will also help me tag train and test graphs easily. Each graph will be individually small enough to run without an issue.
**Help Required**
I would like to request some help in determining:
**Clarification**
Let me know if I'm missing anything. I am going by a list of CSV files, and instead of making a Data list object with multiple graphs, I've combined everything to make one big graph; think one yearly graph instead of 365 daily graphs. The reason behind this was that I wanted to standardize and use a min-max scaler for the feature values, and I couldn't see a transform that does this. Normalization is across row features, but I wanted to scale each feature value, and doing it on the whole data makes sense rather than on subsets.

Correct me if I'm wrong, and let me know if there is a way to transform a data object with features of varying size to standardize it after making the data object. Another reason for doing a yearly graph is to have node entities and indexes easily available.
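For what it's worth, this is the kind of per-feature scaling I mean, done once over the whole yearly graph after it's built (a sketch in plain torch rather than a PyG transform):

```python
import torch

# min-max scale each feature column to [0, 1], per node type
for node_type in data.node_types:
    x = data[node_type].x
    x_min = x.min(dim=0, keepdim=True).values
    x_max = x.max(dim=0, keepdim=True).values
    data[node_type].x = (x - x_min) / (x_max - x_min).clamp(min=1e-12)
```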
This is how I've constructed my data. Let me know if I'm missing any methods.
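The template roughly follows the standard InMemoryDataset pattern (a simplified skeleton, not my exact class; the raw file names and the build_hetero_graph() helper are placeholders):

```python
import torch
from torch_geometric.data import InMemoryDataset

class YearlyGraphDataset(InMemoryDataset):
    def __init__(self, root, transform=None, pre_transform=None):
        super().__init__(root, transform, pre_transform)
        self.data, self.slices = torch.load(self.processed_paths[0])

    @property
    def raw_file_names(self):
        return ['nodes.csv', 'edges.csv']      # placeholder csv names

    @property
    def processed_file_names(self):
        return ['data.pt']

    def process(self):
        # build_hetero_graph() is a placeholder that reads the csvs and
        # returns one big HeteroData object for the whole year
        data = build_hetero_graph(self.raw_paths)
        if self.pre_transform is not None:
            data = self.pre_transform(data)
        data, slices = self.collate([data])
        torch.save((data, slices), self.processed_paths[0])
```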
Please do let me know if you require any additional details or would like to discuss. In the meantime, I will be testing different solutions to see if I can make it work temporarily with the hacks.