
[VL] VeloxBatchResizer may cause OOM when there exists type of variable length #10892

@jiangjiangtian

Description


Backend

VL (Velox)

Bug description

A job in our tests failed with an OOM. Its memory usage statistics are as follows:

[Image: memory usage statistics of the failed job]

We can see that VeloxBatchResizer uses most of the memory. I found that most of the columns of the table are variable-length types, such as VARCHAR, ARRAY, and MAP.

I found that Velox's MemoryPool has a method called maybeReserve. Maybe we can call it in VeloxBatchResizer before calling the append method, to try to reserve the memory up front. If the reservation fails, we don't append the batch. This approach needs some code changes so that the memory allocation does not throw an OutOfMemoryException.
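To illustrate the idea, here is a minimal, self-contained sketch. It does not use the real Velox or Gluten classes: `MockPool`, `SketchResizer`, `tryAppend`, `flush`, and `estimateBytes` are hypothetical names made up for this example, and the mock's `maybeReserve` is assumed to return `false` instead of throwing when the reservation does not fit. A batch is modelled as a list of strings to stand in for variable-length columns.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a memory pool with a non-throwing reservation.
// Assumption: maybeReserve returns false on failure rather than throwing.
class MockPool {
 public:
  explicit MockPool(uint64_t capacity) : capacity_(capacity) {}

  bool maybeReserve(uint64_t bytes) {
    if (bytes > capacity_ - reserved_) {
      return false;  // Not enough room; the caller decides what to do.
    }
    reserved_ += bytes;
    return true;
  }

  void release(uint64_t bytes) {
    assert(bytes <= reserved_);
    reserved_ -= bytes;
  }

  uint64_t reserved() const { return reserved_; }

 private:
  uint64_t capacity_;
  uint64_t reserved_{0};
};

// A batch of variable-length values, modelled as strings.
using Batch = std::vector<std::string>;

// Rough size estimate: sum of the payload bytes of all values.
uint64_t estimateBytes(const Batch& batch) {
  uint64_t bytes = 0;
  for (const auto& value : batch) {
    bytes += value.size();
  }
  return bytes;
}

// Hypothetical resizer that reserves memory before it buffers a batch.
class SketchResizer {
 public:
  explicit SketchResizer(MockPool& pool) : pool_(pool) {}

  // Returns false and leaves the buffer untouched if the reservation fails.
  bool tryAppend(const Batch& batch) {
    const uint64_t bytes = estimateBytes(batch);
    if (!pool_.maybeReserve(bytes)) {
      return false;
    }
    buffered_.insert(buffered_.end(), batch.begin(), batch.end());
    bufferedBytes_ += bytes;
    return true;
  }

  // Hands out the buffered rows and gives the reservation back to the pool.
  Batch flush() {
    pool_.release(bufferedBytes_);
    bufferedBytes_ = 0;
    Batch out = std::move(buffered_);
    buffered_.clear();
    return out;
  }

 private:
  MockPool& pool_;
  Batch buffered_;
  uint64_t bufferedBytes_{0};
};
```

In this sketch, a caller that gets `false` from `tryAppend` would flush what is already buffered and then retry, instead of growing the buffer until the allocation fails.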

Gluten version

No response

Spark version

None

Spark configurations

No response

System information

No response

Relevant logs

Metadata

Labels: bug, triage
