LVEE: Lance Video Encoding Extension #4320
-
If the user already has the video stored in a GOP-aware format (e.g. MP4) and they know the keyframe offset they want to retrieve, wouldn't the blob API already allow them to do this using a video reader library?
-
I originally thought a video encoding was quite aligned with the principles of Lance V2 as you described them in the blog, serving as an example of adding more purpose-built encodings in a flexible way:
but maybe I misunderstood 😅 It sounds like pluggable encoding in your sense is meant for more "fundamental" features (if that is the right word) such as delta encoding, rather than video encodings like the ones proposed here?
-
This reminds me of the JSON/VARIANT discussion #3841, where we can either generate new columns or implement a self-contained variant type. My impression is that new columns are always clearer, since users can reference them easily and things like predicate pushdown happen naturally without special handling in the reader. Maybe that means we should have dedicated types like
-
Thank you @westonpace and @jackye1995 for joining the discussion. I will take a closer look at the video use case and put together a new proposal if needed.
-
I want to introduce Lance Video Encoding Extension (LVEE) as a pluggable module for the Lance columnar format. LVEE delivers two key capabilities:
With LVEE, Lance retains its column‑oriented strengths while offering sub‑second, frame‑level random access that modern multi‑modal applications require.
Motivation
Industry patterns
Video
torchvision.io.VideoReader
All adopt an “encode early → index offsets → external search” model. LVEE brings the same model inside Lance’s transactional, columnar ecosystem.
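To make the "index offsets" step concrete, here is a minimal sketch of a keyframe offset index and the lookup it enables. It is not part of the proposal: `Keyframe`, `locate_gop`, and the sample offsets are hypothetical names and values chosen for illustration, and the sketch assumes the index is sorted by timestamp.

```python
from bisect import bisect_right
from typing import NamedTuple


class Keyframe(NamedTuple):
    """One entry of a hypothetical keyframe index (illustrative only)."""
    pts_ms: int       # presentation timestamp of the keyframe
    byte_offset: int  # where this keyframe's GOP starts in the encoded stream


def locate_gop(index: list[Keyframe], target_ms: int, stream_len: int) -> tuple[int, int]:
    """Return the (start, end) byte range of the GOP that contains target_ms.

    Assumes `index` is sorted by pts_ms and that each GOP runs until the
    next keyframe (or the end of the stream for the last GOP).
    """
    if not index or target_ms < index[0].pts_ms:
        raise ValueError("target precedes the first keyframe")
    timestamps = [k.pts_ms for k in index]
    # Last keyframe whose timestamp is <= target_ms.
    i = bisect_right(timestamps, target_ms) - 1
    start = index[i].byte_offset
    end = index[i + 1].byte_offset if i + 1 < len(index) else stream_len
    return start, end


# Made-up index: one keyframe every two seconds.
index = [Keyframe(0, 0), Keyframe(2000, 48_000), Keyframe(4000, 97_500)]
byte_range = locate_gop(index, 2500, stream_len=150_000)
```

Only the returned byte range has to be fetched and handed to a decoder, which is what makes the random access cheap regardless of where the index itself is stored.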
Goals & Non‑Goals
Goals
lvee.blob and lvee.gop – and support write, read, and compaction conversion between them.
get_frames_in_range(start_ms, end_ms)
rewrite_video(codec="h265", gop_size=48)
lance‑vee
pylance‑vee as with an optional dependency for pylance
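As a rough illustration of what a range query such as get_frames_in_range(start_ms, end_ms) could do over GOP-granular rows, the sketch below selects the frames that fall inside a time window. Only the function name and its two parameters come from the goals above. The `GopRow` layout, the extra `rows` argument, the half-open `[start_ms, end_ms)` interval, and the return shape are all assumptions, and decoding is left out entirely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GopRow:
    """Hypothetical row layout: one GOP per row (an assumption, not the spec)."""
    gop_id: int
    frame_ts_ms: tuple[int, ...]  # ascending timestamps; the first is the keyframe


def get_frames_in_range(rows, start_ms, end_ms):
    """Return (gop_id, timestamp) pairs for frames with start_ms <= ts < end_ms.

    GOPs that cannot overlap the window are skipped without inspecting
    their frames, so only overlapping GOPs would need to be decoded.
    """
    hits = []
    for row in rows:
        if not row.frame_ts_ms:
            continue
        # Skip GOPs that end before the window or start at/after its end.
        if row.frame_ts_ms[-1] < start_ms or row.frame_ts_ms[0] >= end_ms:
            continue
        hits.extend((row.gop_id, ts) for ts in row.frame_ts_ms if start_ms <= ts < end_ms)
    return hits


# Made-up data: three GOPs of three frames each, 40 ms apart.
rows = [
    GopRow(0, (0, 40, 80)),
    GopRow(1, (120, 160, 200)),
    GopRow(2, (240, 280, 320)),
]
frames = get_frames_in_range(rows, 80, 200)
```

With this layout the per-GOP skip is a plain comparison on the first and last timestamps, so it could also be expressed as a filter on two scalar columns.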
Non‑Goals
Alternatives
1. Keep using the generic Lance Blob API
Pros
Cons
2. Ship only lvee.gop and skip lvee.blob
Pros
Cons
3. Dual-path lvee.blob → lvee.gop (this proposal)
Pros
lvee.blob; background compaction upgrades only hot data to lvee.gop.
Cons
What do you think?