-
Notifications
You must be signed in to change notification settings - Fork 701
Add Arena option to BulkUpsert, add hidden --send-format option #20061
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Because `TValue&& rows` is provided as rvalue reference, we can move its value into the request rather than making a copy of the data. Speed boost: ~130MiB/s -> ~168-200MiB/s.
…sertUnretryable (leaving BulkUpsert unchanged)
…na, 1-to-1 allocation)" This reverts commit b9bd7cd.
Implement additional APIs that work with arena-allocated proto messages. The implementation creates a new protobuf::Arena per batch request and builds Ydb::Value inside this arena inside TCsvParser::BuildListOnArena. Ydb::Value gets created once and then moved into Ydb::Table::BulkUpsertRequest, which is allocated on the same arena (allocation in the same arena prevents copying). Left undone/requires modifications: 1. TODO: I had to copy-paste RunDeferred/Run methods (see: RunDeferredOnArena/RunOnArena) because arena-allocated messages are returned as pointers, but the API expected rvalue/lvalue references on the request message. If the pointer is deferefenced, there will be a single copy of the message inside TGRpcConnectionsImpl::Run when WithServiceConnection is called and request is moved into capture parameter of a callback. In the best case, there must be no code duplication (the copy-pasted methods differ only in the type of accepted request: they accept TRequest*). 2. TODO: ensure correctness. Speed boost: ~130MiB/s -> ~250MiB/s
Notes: 1. Creates custom converter in `csv_parser.cpp` to build `Ydb::Value` on the arena. 2. Creates new implementer of `TValueBuilderBase` to build `TArenaAllocatedValue` in `value.cpp`/`value.h`. 3. Introduces `TValueHolder` interface with two implementations: - `StackAllocatedValueHolder` with stack-allocated `Ydb::Value`. - `TArenaAllocatedValueHolder` with arena-allocated `Ydb::Value`. Current drawbacks: - `TValueHolder` has virtual calls, which is costy; try to employ static polymorphism. - `FieldToValueOnArena` creates `TCsvToYdbOnArenaConverter`, which is a copy of `TCsvToYdbConverter`. Verdict: didn't produce any performance improvements (likely due to virtual calls in `TValueHolder`).
…ildListOnArena` This commit fixes the issue with virtual calls in `TValueHolder`, integrated in the commit 396d98f. Average speed: 254-257 MiB/s (33.1-33.6 sec spent) There seems to be no performance difference in terms of upload speed. The only difference is that the percental fraction of CPU load that `TCsvParser::BuildListOnArena` method takes: 1. this commit: 42.7% of the total CPU time. 2. commit 396d98f: 43.1% of the total CPU time. 3. without both commits: 49.3% of the total CPU time.
…hars counting Benchmarking results (see commit 6201814 for details): 1. Previous solution (at commit 6201814): Elapsed: 17.7 sec. Total read size: 8.33GiB. Average processing speed: 482MiB/s. 2. Current solution: Elapsed: 10.6 sec. Total read size: 8.33GiB. Average processing speed: 804MiB/s (up to 817MiB/s). Cherry-picked the commit: `773a3a5` See: Vladislav0Art@773a3a5
…Types with Protobuf Arena
… quote chars counting" This reverts commit d3961f2.
Requires testing via a real instance of YDBD to ensure that the refactoring is correct.
🟢 |
const TCallMeta& meta, | ||
IQueueClientContext* context) | ||
{ | ||
if (IsPointer_) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IsPointer_ кажется не нужен, можно std::get_if использовать
⚪ Test history | Ya make output | Test bloat
⚪ Test history | Ya make output | Test bloat | Test bloat
⚪ Test history | Ya make output | Test bloat | Test bloat | Test bloat
🟢
*please be aware that the difference is based on comparing your commit and the last completed build from the post-commit, check comparation |
⚪ Test history | Ya make output | Test bloat
⚪ Test history | Ya make output | Test bloat | Test bloat
⚪ Test history | Ya make output | Test bloat | Test bloat | Test bloat
🟢
*please be aware that the difference is based on comparing your commit and the last completed build from the post-commit, check comparation |
⚪ Test history | Ya make output | Test bloat
⚪ Test history | Ya make output | Test bloat | Test bloat
⚪ Test history | Ya make output | Test bloat | Test bloat | Test bloat
🟢
*please be aware that the difference is based on comparing your commit and the last completed build from the post-commit, check comparation |
⚪ |
⚪ Test history | Ya make output | Test bloat
🟢
*please be aware that the difference is based on comparing your commit and the last completed build from the post-commit, check comparation |
⚪ Test history | Ya make output | Test bloat
⚪ Test history | Ya make output | Test bloat | Test bloat
⚪ Test history | Ya make output | Test bloat | Test bloat | Test bloat
🟢
*please be aware that the difference is based on comparing your commit and the last completed build from the post-commit, check comparation |
Changelog entry
Changelog category
Description for reviewers
Collaboration based on PRs:
#19667
and
#20005