Is there a huge gap in read performance between LanceVersion V2_0 and V2_1? #4421
Unanswered
ChongWei905
asked this question in
Q&A
Replies: 2 comments 3 replies
-
Hello, could you take a look at the performance issue above? @jackye1995 @majin1102 @westonpace
0 replies
-
@Xuanwo I think this is related to the perf gap we found recently. Could you confirm it's the same issue?
3 replies
-
I've run some experiments and found that files written with the Java SDK's LanceFileWriter read slowly, while files written with the Python SDK's dataset.write_batches read well.
Then I figured out that it's about the Lance file version. I wrote the same data (1,000,000 rows, 10 columns, all string data) as both V2_0 and V2_1, and there is a huge gap in read performance:
V2_0:
random access time: 63.767ms
full scan time: 103.8345ms
V2_1:
random access time: 19.836317208s
full scan time: 2.189263209s
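To put the gap in perspective, here is a minimal sketch that only divides the timings reported above, after converting them to milliseconds. The names `timings_ms`, `slowdown`, and `slowdowns` are made up for illustration, and the numbers come from a single run each, so treat the ratios as rough. They come out to roughly 311x for random access and 21x for a full scan.

```python
# Timings copied from the measurements above, converted to milliseconds.
# Each value is a single run, so the ratios are only indicative.
timings_ms = {
    "V2_0": {"random access": 63.767, "full scan": 103.8345},
    "V2_1": {"random access": 19_836.317208, "full scan": 2_189.263209},
}


def slowdown(workload: str) -> float:
    """Return how many times slower V2_1 is than V2_0 for one workload."""
    return timings_ms["V2_1"][workload] / timings_ms["V2_0"][workload]


# Hypothetical helper dict: slowdown factor per workload.
slowdowns = {workload: slowdown(workload) for workload in timings_ms["V2_0"]}

for workload, factor in slowdowns.items():
    print(f"{workload}: V2_1 is about {factor:.0f}x slower than V2_0")
```

The random access path is hit far harder than the full scan, which may help narrow down where the regression comes from.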
Below is my read test code:
And here's how I wrote the files: