-
Notifications
You must be signed in to change notification settings - Fork 4k
.Net: [MEVD] Updated Nullable handling in VectorStoreRecordModelBuilder #11543
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
.Net: [MEVD] Updated Nullable handling in VectorStoreRecordModelBuilder #11543
Conversation
@dmytrostruk to help me understand, what exact scenarios aren't working before this PR? The current logic "unwraps"
That's great, it's indeed good to start having test coverage for the model builder. |
@roji Sure, I described the scenarios in PR description, but I will provide more details here. What currently works:
What doesn't work:
Hope that provides more clarity. I was thinking if we actually should support all of these cases but taking into account that we are already doing some Please let me know if that makes sense. Thanks! |
Why would we need to allow that? In other words, why not just require connectors to specify non-nullable types in SupportedDataPropertyTypes, and then unwrap internally wherever needed? That was the intention in any case - SupportedDataPropertyTypes is just a way for connectors to specify to the model builder what types are allowed etc.
I see. Though this may be something we'd need to think about a bit more: if a connector supports |
I think that the only way how to require connectors to specify non-nullable types in
Today, Weaviate connector supports Maybe it would be better to remove nullable unwrapping and force developers to specify non-nullable and nullable types explicitly and perform validation strongly against specified types. A little bit more code will be required in connectors, but it's more flexible approach (you can control nullable/non-nullable single and enumerable properties separately) and also it will be clear which types are supported. |
Sure, I agree we can improve this - I think it makes sense to simply throw for
The reason I added unwrapping in the first place, is that nullability generally isn't part of the type in the database. In other words, either the database supports nulls or it doesn't, and if does, this doesn't vary on a type-by-type basis (e.g. the database supports null for string fields but not for ints). In that sense, I think null is part of the type on the .NET side but not on the DB side. I'm less sure what happens with collections. Not every type that's supported is supported as an element in a collection (that's why we have SupportedEnumerableDataPropertyElementTypes), and I'm not sure if it's safe to assume that if nullable int is supported as a scalar, it's necessarily also supported as an array element. For example, I can imagine a list-of-strings type (like Redis tags) not supporting nullable strings (though a scalar string is probably nullable). Another option here (which is what I think we should seriously consider): rather than accepting a list of types, connectors can simply have their own subclass of the ModelBuilder (some already do), and simply expose a hook for property type validation which they'd override. In other words, instead of allowing connectors to pass in a list of Types, just allow them to write validation in arbitrary code (possibly providing a bit of support around detecting lists etc.). This would free us from having to think about all these questions. In any case... Just pointing out that none of this is user-facing, and we can do any improvements here later without any breaking changes. |
Thanks!
Taking this into account, is it okay if we temporarily mention nullable types in Weaviate |
Absolutely! That sounds like a good bugfix. BTW I do think we need to think a bit about validation and how the connector and the model builder interact around this... That's one of the reasons everything is flagged as [Experimental] in that area. But yeah, I'd rather we did this a bit later when we're under less pressure. |
Motivation and Context
Currently
VectorStoreRecordModelBuilder
handles the case whenSupportedDataPropertyTypes
has a non-nullable value type (e.g.bool
), but actual model contains nullable value type (bool?
). This allows to simplifySupportedDataPropertyTypes
configuration by specifying only non-nullable version of a type.This PR contains changes for the following cases that are not supported at the moment:
bool?
, PropertyType =bool?
- this one is simple, if nullable type is listed as supported, it should be possible to have it in record model.bool?
, PropertyType =bool
- this one is opposite to the default case. If nullable type is listed as supported, we should allow non-nullable type in record model.bool?
, PropertyType =List<bool>
- this one is the same as previous case, but for enumerable types.bool
, PropertyType =List<bool?>
- this one is the same as default case, but for enumerable types.Also added a
VectorData.UnitTests
project in order to be able to test abstraction logic without integration tests.Contribution Checklist