-
Notifications
You must be signed in to change notification settings - Fork 2
feat: Implement CLP type parser to support special chars besides _
in field name (addresses yscope/velox#13).
#67
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: release-0.293-clp-connector
Are you sure you want to change the base?
Conversation
WalkthroughAdds a CLP-specific type parser invocation in the CLP connector source, links the new Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant Client
participant Connector as ClpPrestoToVeloxConnector
participant Parser as fbclp::parseClpType
participant Velox as Velox Types
Client->>Connector: toVeloxColumnHandle(clpColumn)
Connector->>Parser: parseClpType(clpColumn.columnType)
Parser-->>Connector: velox::TypePtr
Connector->>Velox: construct Velox ColumnHandle with TypePtr
Connector-->>Client: VeloxColumnHandle
rect rgba(0,128,0,0.06)
note right of Parser: New CLP-specific parser invocation
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
presto-native-execution/presto_cpp/main/connectors/PrestoToVeloxConnector.cpp (1)
1575-1585
: Silence unused parameter to avoid -Wunused-parameter.
typeParser
is no longer used in the CLP path and may trigger warnings in stricter builds.Apply one of the following:
-std::unique_ptr<velox::connector::ColumnHandle> -ClpPrestoToVeloxConnector::toVeloxColumnHandle( - const protocol::ColumnHandle* column, - const TypeParser& typeParser) const { +std::unique_ptr<velox::connector::ColumnHandle> +ClpPrestoToVeloxConnector::toVeloxColumnHandle( + const protocol::ColumnHandle* column, + const TypeParser& /*typeParser*/) const {Or, if you prefer keeping the name:
auto clpColumn = dynamic_cast<const protocol::clp::ClpColumnHandle*>(column); VELOX_CHECK_NOT_NULL( clpColumn, "Unexpected column handle type {}", column->_type); + (void)typeParser; // unused in CLP path return std::make_unique<connector::clp::ClpColumnHandle>(
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (3)
presto-native-execution/presto_cpp/main/connectors/CMakeLists.txt
(1 hunks)presto-native-execution/presto_cpp/main/connectors/PrestoToVeloxConnector.cpp
(2 hunks)presto-native-execution/velox
(1 hunks)
🔇 Additional comments (2)
presto-native-execution/presto_cpp/main/connectors/PrestoToVeloxConnector.cpp (1)
23-23
: CLP parser include looks correct.Header path aligns with the new function usage below.
presto-native-execution/presto_cpp/main/connectors/CMakeLists.txt (1)
20-22
: Add of velox_type_fbclp_parser is appropriate; confirm target availability.Linking the CLP type parser is required for
parseClpType
. Please confirm the updated Velox submodule/export definesvelox_type_fbclp_parser
across all build variants used in CI/CD.
presto-native-execution/velox
Outdated
@@ -1 +1 @@ | |||
Subproject commit 7ae51198424559a7ed6d245ce4c4214ef7047d9d | |||
Subproject commit 028ab93bc1f344d9f9ab83d6e86a3ccd49b46efd |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧹 Nitpick (assertive)
Gate merge on Velox PR #31 and ensure CI updates submodules.
- Merge this only after the referenced Velox PR is on the target branch, then re-sync the submodule to that exact commit.
- Make sure CI/CD uses a recursive submodule update (git submodule update --init --recursive) to avoid stale pointers and flaky builds.
🤖 Prompt for AI Agents
In presto-native-execution/velox around lines 1 to 1: only merge this change
after Velox PR #31 has been landed to the target branch; after that, update the
velox submodule to the exact commit created by that PR (use git submodule
set-branch or checkout the specific SHA and commit the updated gitlink) so the
submodule pointer is deterministic; update CI configs (pipeline scripts or
.github/workflows) to run git submodule update --init --recursive before
building to ensure all nested submodules are populated and avoid stale pointers;
verify the CI run uses the updated submodule SHA and rerun CI to confirm no
flaky failures.
💡 Verification agent
🧩 Analysis chain
Verify the bumped Velox commit actually carries the CLP parser header and CMake target you depend on.
Run the script below to confirm:
- Submodule is on 028ab93bc1f344d9f9ab83d6e86a3ccd49b46efd
- Header velox/type/fbclp/ClpTypeParser.h exists
- CMake target velox_type_fbclp_parser is present upstream
- Presto code includes the new header, calls parseClpType, and links the new library
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "Init/verify velox submodule…"
git submodule update --init --recursive presto-native-execution/velox >/dev/null
echo "Expected commit: 028ab93bc1f344d9f9ab83d6e86a3ccd49b46efd"
actual_commit="$(cd presto-native-execution/velox && git rev-parse HEAD)"
echo "Actual commit: ${actual_commit}"
test "${actual_commit}" = "028ab93bc1f344d9f9ab83d6e86a3ccd49b46efd"
echo "Check for ClpTypeParser header…"
fd -a 'ClpTypeParser.h' presto-native-execution/velox | sed 's/^/ /'
rg -nP --glob 'presto-native-execution/velox/**' -m1 \
'\bvelox/type/fbclp/ClpTypeParser\.h\b' || { echo "Header path not found"; exit 1; }
echo "Check Velox CMake target…"
rg -nP --glob 'presto-native-execution/velox/**/CMakeLists.txt' -C2 \
'\bvelox_type_fbclp_parser\b'
echo "Verify Presto side wiring…"
rg -nP -C2 '\b#include\s*<\s*velox/type/fbclp/ClpTypeParser\.h\s*>' presto-native-execution
rg -nP -C2 '\bparseClpType\s*\(' presto-native-execution
rg -nP -C2 '\bvelox_type_fbclp_parser\b' presto-native-execution/presto_cpp/main/connectors/CMakeLists.txt
Length of output: 2339
Critical: Presto isn't wired to Velox's CLP parser — header include, parse call, and CMake link are missing
- Verified: Velox submodule is at 028ab93bc1f344d9f9ab83d6e86a3ccd49b46efd; header present at presto-native-execution/velox/velox/type/fbclp/ClpTypeParser.h and velox_type_fbclp_parser is defined in presto-native-execution/velox/velox/type/fbclp/CMakeLists.txt.
- Missing in Presto: no occurrences of #include <velox/type/fbclp/ClpTypeParser.h>, no parseClpType(…) calls, and presto-native-execution/presto_cpp/main/connectors/CMakeLists.txt does not link velox_type_fbclp_parser.
- Action: add the header include and call parseClpType where CLP types are parsed, and add velox_type_fbclp_parser to presto-native-execution/presto_cpp/main/connectors/CMakeLists.txt link list.
🤖 Prompt for AI Agents
In presto-native-execution/velox (header at velox/type/fbclp/ClpTypeParser.h)
and in the Presto code that performs CLP type parsing (the source file where CLP
types are currently parsed), add the include directive #include
<velox/type/fbclp/ClpTypeParser.h> and replace or augment the existing parsing
logic to call velox::type::fbclp::parseClpType(...) (passing the same input and
context used by other Velox parsers) so CLP types are actually parsed via
Velox's CLP parser; then update
presto-native-execution/presto_cpp/main/connectors/CMakeLists.txt by adding
velox_type_fbclp_parser to the target_link_libraries entry for the connector
binary so the new include/parse call links correctly.
… into xwei/handle-special-chars
10289ae
to
4dc433d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (1)
presto-native-execution/velox (1)
1-1
: Gate merge on Velox PR #31 and ensure CI updates submodules recursively.
- Merge only after Velox PR #31 lands on the target branch; then re-sync this submodule to that exact commit and re-run CI.
- In CI, run “git submodule update --init --recursive” before build to avoid stale pointers.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (1)
presto-native-execution/velox
(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (15)
- GitHub Check: test (17.0.13, :presto-tests -P ci-only-resource-manager)
- GitHub Check: test (17.0.13, :presto-main-base)
- GitHub Check: test (8.0.442, :presto-tests -P ci-only-distributed-non-hash-gen)
- GitHub Check: test (8.0.442, :presto-tests -P ci-only-local-queries)
- GitHub Check: test (17.0.13, :presto-tests -P ci-only-distributed-non-hash-gen)
- GitHub Check: test (8.0.442, :presto-tests -P ci-only-aggregation-queries)
- GitHub Check: test (8.0.442, :presto-main-base)
- GitHub Check: test (8.0.442, :presto-tests -P ci-only-plan-determinism)
- GitHub Check: test (8.0.442, :presto-tests -P ci-only-tpch-distributed-queries)
- GitHub Check: test (8.0.442, :presto-tests -P presto-tests-general)
- GitHub Check: test (8.0.442, :presto-tests -P presto-tests-execution-memory)
- GitHub Check: maven-checks (8.0.442)
- GitHub Check: maven-checks (17.0.13)
- GitHub Check: prestocpp-linux-build-for-test
- GitHub Check: prestissimo-worker-images-build
@@ -1 +1 @@ | |||
Subproject commit 7ae51198424559a7ed6d245ce4c4214ef7047d9d | |||
Subproject commit 1725ea8dba85015f9ae84e38e3993c8cd8f6ef5e |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
Verify Velox SHA 1725ea8… has CLP parser and Presto is fully wired (include, parse call, link).
Please confirm the submodule points to the intended Velox commit and that Presto includes/uses the new parser and links the library.
Run from repo root:
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "Init submodules…"
git submodule update --init --recursive presto-native-execution/velox >/dev/null
expected_sha="1725ea8dba85015f9ae84e38e3993c8cd8f6ef5e"
actual_sha="$(cd presto-native-execution/velox && git rev-parse HEAD)"
echo "Velox SHA: ${actual_sha}"
test "${actual_sha}" = "${expected_sha}"
echo "Check Velox CLP parser header and CMake target…"
rg -nP --glob 'presto-native-execution/velox/**' \
'\bvelox/type/fbclp/ClpTypeParser\.h\b' -m1
rg -nP --glob 'presto-native-execution/velox/**/CMakeLists.txt' -n \
'\bvelox_type_fbclp_parser\b' -m1
echo "Verify Presto side wiring…"
rg -nP -C2 '\b#include\s*<\s*velox/type/fbclp/ClpTypeParser\.h\s*>' presto-native-execution || { echo "Missing include"; exit 1; }
rg -nP -C2 '\bparseClpType\s*\(' presto-native-execution || { echo "Missing parseClpType call"; exit 1; }
rg -nP -n 'presto_cpp/main/connectors/CMakeLists\.txt' -C2 \
'\bvelox_type_fbclp_parser\b' || { echo "Missing link to velox_type_fbclp_parser"; exit 1; }
echo "OK."
Length of output: 1009
Velox submodule at expected SHA; Presto not wired — missing include, parse call, and link
Velox: HEAD = 1725ea8dba85015f9ae84e38e3993c8cd8f6ef5e; ClpTypeParser header and CMake target exist under presto-native-execution/velox (velox/type/fbclp/ClpTypeParser.h referenced from ClpTypeParser.ll/tests; velox_type_fbclp_parser present in velox/type/fbclp/CMakeLists.txt).
Presto wiring (failures from verification):
- No occurrences of #include <velox/type/fbclp/ClpTypeParser.h> under presto-native-execution.
- No occurrences of parseClpType( under presto-native-execution.
- presto_cpp/main/connectors/CMakeLists.txt does not reference velox_type_fbclp_parser.
Required actions:
- Add the include and call parseClpType(...) in the Presto source that parses Velox types.
- Add velox_type_fbclp_parser to the appropriate target_link_libraries in presto_cpp/main/connectors/CMakeLists.txt.
🤖 Prompt for AI Agents
In presto-native-execution/velox (around the Velox-type parsing code in
presto-native-execution) add the missing include and parser call: insert
#include <velox/type/fbclp/ClpTypeParser.h> into the source file that performs
Velox type parsing and invoke parseClpType(...) where Velox CLP-encoded types
are decoded (pass the existing input buffer/context and propagate the returned
velox type or error). Then edit presto_cpp/main/connectors/CMakeLists.txt and
add velox_type_fbclp_parser to the appropriate target_link_libraries line for
the connector binary so the ClpTypeParser implementation is linked into the
build.
Description
This PR used the new type parser to parse the type, to catch those special chars.
This PR need to be merged after y-scope/velox#31 gets merged, and also sync the submodule version.
Checklist
breaking change.
Validation performed
Passed the CI.
Summary by CodeRabbit
New Features
Chores