Data Quality checks failing with weird error #15
I am trying out a very basic data transformation with data quality checks to evaluate the framework. However, the quality checks fail with an unexpected error when I run them. Can someone please guide me on what could be wrong? I am running the setup in a WSL2 environment and don't have S3 support. NOTE: I had to downgrade the Python dependencies for 3.9 compatibility, as that is the environment available in Databricks. But still using
The code works fine overall (hence no Python dependency issues), but I could not debug this error. My configuration is:
My CSV looks like:
When I run the library, I get the issue below:
Replies: 2 comments 3 replies
Hi :) I tested using the containerised environment in 1.24.0 and was not able to reproduce the issue. Here is the result log (it worked as expected):
Regarding running it in a Python 3.9 environment, I don't think we support that anymore: I tried to rebuild the requirements lockfiles and the Docker image using Python 3.9 and was not able to install the lakehouse_engine because the requirements did not match. Databricks Runtime has been using Python 3.10 for a while now (https://docs.databricks.com/aws/en/release-notes/runtime/13.3lts), so maybe you want to at least use that? We had used 3.10 for a long time, but I am not sure our current requirements still support it, as we have been building and testing on 3.11 for a while now. Aren't you able to run 3.11 on Databricks, or at least 3.10?
Found the issue. It was due to the "unexpected_rows_pk" key. I was passing a single column that was not unique across my dataset. However, when I passed a combination of two columns, everything worked fine: "unexpected_rows_pk": ["BELNR","BUZEI"].
The error threw me off, and I was focused on finding the issue in dq_functions.
Btw, I tested the solution with both Python 3.12 and my downgraded version of 3.9. In both cases, it works fine.
Thanks a lot.
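For anyone hitting the same thing, a quick way to check whether a candidate "unexpected_rows_pk" is actually unique before running the checks could look like the sketch below. It is plain Python and not part of lakehouse_engine; the helper name `duplicate_keys` and the sample rows are made up for illustration, and only the column names BELNR and BUZEI come from this thread.

```python
from collections import Counter


def duplicate_keys(rows, key_columns):
    """Return the key tuples that occur more than once in rows.

    Hypothetical helper for illustration; not a lakehouse_engine function.
    """
    counts = Counter(tuple(row[col] for col in key_columns) for row in rows)
    return {key: n for key, n in counts.items() if n > 1}


# Made-up sample rows: BELNR repeats, but (BELNR, BUZEI) does not.
rows = [
    {"BELNR": "100", "BUZEI": "001"},
    {"BELNR": "100", "BUZEI": "002"},
    {"BELNR": "200", "BUZEI": "001"},
]

single = duplicate_keys(rows, ["BELNR"])
composite = duplicate_keys(rows, ["BELNR", "BUZEI"])

print(single)     # {('100',): 2} -> BELNR alone is not a usable key
print(composite)  # {} -> the two-column combination is unique
```

An empty result means the chosen columns identify each row uniquely in the sample, which is what the two-column key above achieved in my case.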