Skip to content

Conversation

anlowee
Copy link

@anlowee anlowee commented Sep 12, 2025

Description

This PR used the new type parser to parse the type, to catch those special chars.

This PR need to be merged after y-scope/velox#31 gets merged, and also sync the submodule version.

Checklist

  • The PR satisfies the contribution guidelines.
  • This is a breaking change and that has been indicated in the PR title, OR this isn't a
    breaking change.
  • Necessary docs have been updated, OR no docs need to be updated.

Validation performed

Passed the CI.

Summary by CodeRabbit

  • New Features

    • Improved handling of CLP column types in the Presto-to-Velox connector via a dedicated parser, improving type-parsing accuracy.
  • Chores

    • Updated the Velox dependency to a newer commit for stability and improvements.
    • Adjusted connector link configuration to include an additional required component to ensure runtime completeness.

Copy link

coderabbitai bot commented Sep 12, 2025

Walkthrough

Adds a CLP-specific type parser invocation in the CLP connector source, links the new velox_type_fbclp_parser library in the connector CMakeLists, and advances the Velox submodule pointer. No public APIs or signatures changed.

Changes

Cohort / File(s) Change Summary
Build system: connector linking
presto-native-execution/presto_cpp/main/connectors/CMakeLists.txt
Reformatted the target_link_libraries call and added velox_type_fbclp_parser to the link list.
CLP type parsing in connector
presto-native-execution/presto_cpp/main/connectors/PrestoToVeloxConnector.cpp
Added include velox/type/fbclp/ClpTypeParser.h and replaced typeParser.parse(...) with facebook::velox::type::fbclp::parseClpType(...) in toVeloxColumnHandle.
Submodule update
presto-native-execution/velox
Updated Velox submodule pointer from 7ae51198424559a7ed6d245ce4c4214ef7047d9d to 1725ea8dba85015f9ae84e38e3993c8cd8f6ef5e.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant Client
  participant Connector as ClpPrestoToVeloxConnector
  participant Parser as fbclp::parseClpType
  participant Velox as Velox Types

  Client->>Connector: toVeloxColumnHandle(clpColumn)
  Connector->>Parser: parseClpType(clpColumn.columnType)
  Parser-->>Connector: velox::TypePtr
  Connector->>Velox: construct Velox ColumnHandle with TypePtr
  Connector-->>Client: VeloxColumnHandle

  rect rgba(0,128,0,0.06)
    note right of Parser: New CLP-specific parser invocation
  end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Title Check ✅ Passed The title clearly and concisely states the primary change—implementing a CLP type parser to support special characters in field names—and references the related issue (yscope/velox#13). It is in imperative form, uses the Conventional Commits "feat:" prefix, and matches the changes in the raw summary (switch to CLP parser and add velox_type_fbclp_parser). Therefore the title accurately summarizes the main change and is appropriate for the PR.
Description Check ✅ Passed The PR description includes a Description section, a checklist, and a "Validation performed" note stating CI passed, and it explicitly calls out the Velox submodule dependency and required follow-up (yscope/velox#31 and submodule sync). However, it does not include explicit "Motivation and Context", "Impact", and "Test Plan" headings as required by the repository template and lacks detailed test steps or examples. Because the core intent and dependency are documented and CI passed, the description is mostly complete but missing some template-required detail.
Docstring Coverage ✅ Passed No functions found in the changes. Docstring coverage check skipped.
✨ Finishing touches
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@anlowee anlowee marked this pull request as ready for review September 15, 2025 16:02
@anlowee anlowee requested a review from wraymo September 15, 2025 16:02
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
presto-native-execution/presto_cpp/main/connectors/PrestoToVeloxConnector.cpp (1)

1575-1585: Silence unused parameter to avoid -Wunused-parameter.

typeParser is no longer used in the CLP path and may trigger warnings in stricter builds.

Apply one of the following:

-std::unique_ptr<velox::connector::ColumnHandle>
-ClpPrestoToVeloxConnector::toVeloxColumnHandle(
-    const protocol::ColumnHandle* column,
-    const TypeParser& typeParser) const {
+std::unique_ptr<velox::connector::ColumnHandle>
+ClpPrestoToVeloxConnector::toVeloxColumnHandle(
+    const protocol::ColumnHandle* column,
+    const TypeParser& /*typeParser*/) const {

Or, if you prefer keeping the name:

   auto clpColumn = dynamic_cast<const protocol::clp::ClpColumnHandle*>(column);
   VELOX_CHECK_NOT_NULL(
       clpColumn, "Unexpected column handle type {}", column->_type);
+  (void)typeParser; // unused in CLP path
   return std::make_unique<connector::clp::ClpColumnHandle>(
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 02fe345 and 1c38e95.

📒 Files selected for processing (3)
  • presto-native-execution/presto_cpp/main/connectors/CMakeLists.txt (1 hunks)
  • presto-native-execution/presto_cpp/main/connectors/PrestoToVeloxConnector.cpp (2 hunks)
  • presto-native-execution/velox (1 hunks)
🔇 Additional comments (2)
presto-native-execution/presto_cpp/main/connectors/PrestoToVeloxConnector.cpp (1)

23-23: CLP parser include looks correct.

Header path aligns with the new function usage below.

presto-native-execution/presto_cpp/main/connectors/CMakeLists.txt (1)

20-22: Add of velox_type_fbclp_parser is appropriate; confirm target availability.

Linking the CLP type parser is required for parseClpType. Please confirm the updated Velox submodule/export defines velox_type_fbclp_parser across all build variants used in CI/CD.

@@ -1 +1 @@
Subproject commit 7ae51198424559a7ed6d245ce4c4214ef7047d9d
Subproject commit 028ab93bc1f344d9f9ab83d6e86a3ccd49b46efd
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Gate merge on Velox PR #31 and ensure CI updates submodules.

  • Merge this only after the referenced Velox PR is on the target branch, then re-sync the submodule to that exact commit.
  • Make sure CI/CD uses a recursive submodule update (git submodule update --init --recursive) to avoid stale pointers and flaky builds.
🤖 Prompt for AI Agents
In presto-native-execution/velox around lines 1 to 1: only merge this change
after Velox PR #31 has been landed to the target branch; after that, update the
velox submodule to the exact commit created by that PR (use git submodule
set-branch or checkout the specific SHA and commit the updated gitlink) so the
submodule pointer is deterministic; update CI configs (pipeline scripts or
.github/workflows) to run git submodule update --init --recursive before
building to ensure all nested submodules are populated and avoid stale pointers;
verify the CI run uses the updated submodule SHA and rerun CI to confirm no
flaky failures.

💡 Verification agent

🧩 Analysis chain

Verify the bumped Velox commit actually carries the CLP parser header and CMake target you depend on.

Run the script below to confirm:

  • Submodule is on 028ab93bc1f344d9f9ab83d6e86a3ccd49b46efd
  • Header velox/type/fbclp/ClpTypeParser.h exists
  • CMake target velox_type_fbclp_parser is present upstream
  • Presto code includes the new header, calls parseClpType, and links the new library

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "Init/verify velox submodule…"
git submodule update --init --recursive presto-native-execution/velox >/dev/null

echo "Expected commit: 028ab93bc1f344d9f9ab83d6e86a3ccd49b46efd"
actual_commit="$(cd presto-native-execution/velox && git rev-parse HEAD)"
echo "Actual commit:   ${actual_commit}"
test "${actual_commit}" = "028ab93bc1f344d9f9ab83d6e86a3ccd49b46efd"

echo "Check for ClpTypeParser header…"
fd -a 'ClpTypeParser.h' presto-native-execution/velox | sed 's/^/  /'
rg -nP --glob 'presto-native-execution/velox/**' -m1 \
   '\bvelox/type/fbclp/ClpTypeParser\.h\b' || { echo "Header path not found"; exit 1; }

echo "Check Velox CMake target…"
rg -nP --glob 'presto-native-execution/velox/**/CMakeLists.txt' -C2 \
   '\bvelox_type_fbclp_parser\b'

echo "Verify Presto side wiring…"
rg -nP -C2 '\b#include\s*<\s*velox/type/fbclp/ClpTypeParser\.h\s*>' presto-native-execution
rg -nP -C2 '\bparseClpType\s*\(' presto-native-execution
rg -nP -C2 '\bvelox_type_fbclp_parser\b' presto-native-execution/presto_cpp/main/connectors/CMakeLists.txt

Length of output: 2339


Critical: Presto isn't wired to Velox's CLP parser — header include, parse call, and CMake link are missing

  • Verified: Velox submodule is at 028ab93bc1f344d9f9ab83d6e86a3ccd49b46efd; header present at presto-native-execution/velox/velox/type/fbclp/ClpTypeParser.h and velox_type_fbclp_parser is defined in presto-native-execution/velox/velox/type/fbclp/CMakeLists.txt.
  • Missing in Presto: no occurrences of #include <velox/type/fbclp/ClpTypeParser.h>, no parseClpType(…) calls, and presto-native-execution/presto_cpp/main/connectors/CMakeLists.txt does not link velox_type_fbclp_parser.
  • Action: add the header include and call parseClpType where CLP types are parsed, and add velox_type_fbclp_parser to presto-native-execution/presto_cpp/main/connectors/CMakeLists.txt link list.
🤖 Prompt for AI Agents
In presto-native-execution/velox (header at velox/type/fbclp/ClpTypeParser.h)
and in the Presto code that performs CLP type parsing (the source file where CLP
types are currently parsed), add the include directive #include
<velox/type/fbclp/ClpTypeParser.h> and replace or augment the existing parsing
logic to call velox::type::fbclp::parseClpType(...) (passing the same input and
context used by other Velox parsers) so CLP types are actually parsed via
Velox's CLP parser; then update
presto-native-execution/presto_cpp/main/connectors/CMakeLists.txt by adding
velox_type_fbclp_parser to the target_link_libraries entry for the connector
binary so the new include/parse call links correctly.

@anlowee anlowee force-pushed the xwei/handle-special-chars branch from 10289ae to 4dc433d Compare September 16, 2025 17:10
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
presto-native-execution/velox (1)

1-1: Gate merge on Velox PR #31 and ensure CI updates submodules recursively.

  • Merge only after Velox PR #31 lands on the target branch; then re-sync this submodule to that exact commit and re-run CI.
  • In CI, run “git submodule update --init --recursive” before build to avoid stale pointers.
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 10289ae and 4dc433d.

📒 Files selected for processing (1)
  • presto-native-execution/velox (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (15)
  • GitHub Check: test (17.0.13, :presto-tests -P ci-only-resource-manager)
  • GitHub Check: test (17.0.13, :presto-main-base)
  • GitHub Check: test (8.0.442, :presto-tests -P ci-only-distributed-non-hash-gen)
  • GitHub Check: test (8.0.442, :presto-tests -P ci-only-local-queries)
  • GitHub Check: test (17.0.13, :presto-tests -P ci-only-distributed-non-hash-gen)
  • GitHub Check: test (8.0.442, :presto-tests -P ci-only-aggregation-queries)
  • GitHub Check: test (8.0.442, :presto-main-base)
  • GitHub Check: test (8.0.442, :presto-tests -P ci-only-plan-determinism)
  • GitHub Check: test (8.0.442, :presto-tests -P ci-only-tpch-distributed-queries)
  • GitHub Check: test (8.0.442, :presto-tests -P presto-tests-general)
  • GitHub Check: test (8.0.442, :presto-tests -P presto-tests-execution-memory)
  • GitHub Check: maven-checks (8.0.442)
  • GitHub Check: maven-checks (17.0.13)
  • GitHub Check: prestocpp-linux-build-for-test
  • GitHub Check: prestissimo-worker-images-build

@@ -1 +1 @@
Subproject commit 7ae51198424559a7ed6d245ce4c4214ef7047d9d
Subproject commit 1725ea8dba85015f9ae84e38e3993c8cd8f6ef5e
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

🧩 Analysis chain

Verify Velox SHA 1725ea8… has CLP parser and Presto is fully wired (include, parse call, link).

Please confirm the submodule points to the intended Velox commit and that Presto includes/uses the new parser and links the library.

Run from repo root:


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "Init submodules…"
git submodule update --init --recursive presto-native-execution/velox >/dev/null

expected_sha="1725ea8dba85015f9ae84e38e3993c8cd8f6ef5e"
actual_sha="$(cd presto-native-execution/velox && git rev-parse HEAD)"
echo "Velox SHA: ${actual_sha}"
test "${actual_sha}" = "${expected_sha}"

echo "Check Velox CLP parser header and CMake target…"
rg -nP --glob 'presto-native-execution/velox/**' \
  '\bvelox/type/fbclp/ClpTypeParser\.h\b' -m1
rg -nP --glob 'presto-native-execution/velox/**/CMakeLists.txt' -n \
  '\bvelox_type_fbclp_parser\b' -m1

echo "Verify Presto side wiring…"
rg -nP -C2 '\b#include\s*<\s*velox/type/fbclp/ClpTypeParser\.h\s*>' presto-native-execution || { echo "Missing include"; exit 1; }
rg -nP -C2 '\bparseClpType\s*\(' presto-native-execution || { echo "Missing parseClpType call"; exit 1; }
rg -nP -n 'presto_cpp/main/connectors/CMakeLists\.txt' -C2 \
  '\bvelox_type_fbclp_parser\b' || { echo "Missing link to velox_type_fbclp_parser"; exit 1; }

echo "OK."

Length of output: 1009


Velox submodule at expected SHA; Presto not wired — missing include, parse call, and link

Velox: HEAD = 1725ea8dba85015f9ae84e38e3993c8cd8f6ef5e; ClpTypeParser header and CMake target exist under presto-native-execution/velox (velox/type/fbclp/ClpTypeParser.h referenced from ClpTypeParser.ll/tests; velox_type_fbclp_parser present in velox/type/fbclp/CMakeLists.txt).

Presto wiring (failures from verification):

  • No occurrences of #include <velox/type/fbclp/ClpTypeParser.h> under presto-native-execution.
  • No occurrences of parseClpType( under presto-native-execution.
  • presto_cpp/main/connectors/CMakeLists.txt does not reference velox_type_fbclp_parser.

Required actions:

  • Add the include and call parseClpType(...) in the Presto source that parses Velox types.
  • Add velox_type_fbclp_parser to the appropriate target_link_libraries in presto_cpp/main/connectors/CMakeLists.txt.
🤖 Prompt for AI Agents
In presto-native-execution/velox (around the Velox-type parsing code in
presto-native-execution) add the missing include and parser call: insert
#include <velox/type/fbclp/ClpTypeParser.h> into the source file that performs
Velox type parsing and invoke parseClpType(...) where Velox CLP-encoded types
are decoded (pass the existing input buffer/context and propagate the returned
velox type or error). Then edit presto_cpp/main/connectors/CMakeLists.txt and
add velox_type_fbclp_parser to the appropriate target_link_libraries line for
the connector binary so the ClpTypeParser implementation is linked into the
build.

@anlowee anlowee marked this pull request as draft October 6, 2025 13:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant