Skip to content

Conversation

wraymo
Copy link
Member

@wraymo wraymo commented Jun 25, 2025

Description

Overview

The current Presto–CLP connector PR introduces the coordinator-side implementation, along with a placeholder (dummy) worker implementation. Detailed information about the overall design is available in the corresponding RFC. This Velox PR focuses on the worker-side logic.

The Velox CLP connector enables query execution on CLP archives. The Velox worker receives split information and the associated KQL query from the Presto coordinator. For each split, it executes the KQL query against the relevant CLP archive to find matching messages and stores their indices.

To support lazy evaluation, the implementation creates lazy vectors that wrap a CLP column reader and the list of matching indices. When accessed during query execution, these vectors load and decode only the necessary data on demand.

Core Classes

ClpDataSource

This class extends DataSource and implements the addSplit and next methods. During initialization, it records the KQL query and archive source (S3 or local), then traverses the output type to map Presto fields to CLP projection fields. Only ARRAY(VARCHAR) and primitive leaf fields like BIGINT, DOUBLE, BOOLEAN and VARCHAR are projected.

When a split is added, a ClpCursor is created with the archive path and input source. The query is parsed and simplified into an AST. On next, the cursor finds matching row indices and, if any exist, returns a row vector composed of lazy vectors, which load data as needed during execution.

ClpCursor

This class manages the execution of a query over a CLP-S archive. It handles parsing and validation, loading schemas and archives, setting up projection fields, and filtering results. In CLP-S, records are partitioned by schemas. ClpCursor uses ClpQueryRunner to initialize the execution context for each schema and evaluate the filters. It will skip archives where dictionary lookups for string filters return no matches and only scan the relevant schemas of a specific archive. For example, consider a log dataset with the following records.

{"a": "Hello", "b": 2}
{"a": "World", "b": 0, "c": false}
{"a": "World", "c": true}

The three log messages have varying schemas. If we run a KQL query a: World AND b: 0, it will skip loading the third message because it's schema does not match the query (there's no b field). And if the query is a: random AND b: 0, it will even skip scanning the first two records, because random cannot be found in the dictionary.

ClpQueryRunner

This class extends the generic CLP QueryRunner to support ordered projection and row filtering. It initializes projected column readers and returns filtered row indices for each batch.

ClpVectorLoader

In CLP, values are decoded and read from a BaseColumnReader. The ClpVectorLoader is custom Velox VectorLoader that loads vectors from CLP column readers. It supports integers, floats, booleans, strings, and arrays of strings. It's used by lazy vectors to load data on demand using the previously stored row indices.

Checklist

  • The PR satisfies the contribution guidelines.
  • This is a breaking change and that has been indicated in the PR title, OR this isn't a
    breaking change.
  • Necessary docs have been updated, OR no docs need to be updated.

Validation performed

Summary by CodeRabbit

Release Notes

en-CA

New Features

  • Added a CLP connector for reading and querying CLP archives from local file systems or S3.
  • Introduced CLP-specific configuration, including storage type selection (FS or S3).
  • Bundled new dependencies required for the CLP connector, managed via CMake.
  • Implemented lazy vector loading for efficient data access in CLP queries.
  • Added support for nested row types and predicate pushdown in CLP queries.

Tests

  • Added comprehensive test suite with example CLP archive data to validate connector functionality and predicate pushdown behavior.

Documentation

  • Updated connector documentation to include the CLP connector and its usage.
  • Added configuration documentation for CLP connector options.

Copy link

coderabbitai bot commented Jun 25, 2025

## Walkthrough

A new CLP connector is introduced into the Velox codebase, including its source, build, and test files. The build system is extended to support this connector and its dependencies, with new CMake modules for external libraries. Documentation is updated to describe the CLP connector and its configuration.

## Changes

| File(s)                                                                                           | Change Summary                                                                                                         |
|-------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------|
| CMakeLists.txt, velox/connectors/CMakeLists.txt, velox/connectors/clp/CMakeLists.txt              | Added build logic and conditional inclusion for the CLP connector and its dependencies.                                |
| CMake/resolve_dependency_modules/*.cmake                                                        | New CMake modules added for dependency resolution: antlr4-runtime, clp, date, msgpack-cxx, nlohmann_json, spdlog, ystdlib_cpp. |
| CMake/resolve_dependency_modules/simdjson.cmake                                                 | Modified to use OVERRIDE_FIND_PACKAGE for simdjson dependency.                                                         |
| velox/connectors/clp/ClpColumnHandle.h                                                          | Added new ClpColumnHandle class for column metadata.                                                                   |
| velox/connectors/clp/ClpConfig.h, velox/connectors/clp/ClpConfig.cpp                             | Added ClpConfig class for connector configuration and storage type handling.                                           |
| velox/connectors/clp/ClpConnector.h, velox/connectors/clp/ClpConnector.cpp                       | Implemented ClpConnector and ClpConnectorFactory classes for connector creation and configuration.                     |
| velox/connectors/clp/ClpConnectorSplit.h                                                        | Introduced ClpConnectorSplit struct for representing data splits.                                                      |
| velox/connectors/clp/ClpDataSource.h, velox/connectors/clp/ClpDataSource.cpp                     | Implemented ClpDataSource for data reading, batch processing, and integration with CLP archives.                       |
| velox/connectors/clp/ClpTableHandle.h, velox/connectors/clp/ClpTableHandle.cpp                   | Added ClpTableHandle class for table and query metadata.                                                               |
| velox/connectors/clp/search_lib/CMakeLists.txt                                                  | Added static library target clp-s-search for CLP search functionality.                                                 |
| velox/connectors/clp/search_lib/ClpCursor.h, velox/connectors/clp/search_lib/ClpCursor.cpp       | Implemented ClpCursor for query execution and row retrieval from CLP archives.                                         |
| velox/connectors/clp/search_lib/ClpQueryRunner.h, velox/connectors/clp/search_lib/ClpQueryRunner.cpp | Implemented ClpQueryRunner for schema-based filtering and projected column reading.                                    |
| velox/connectors/clp/search_lib/ClpVectorLoader.h, velox/connectors/clp/search_lib/ClpVectorLoader.cpp | Added ClpVectorLoader for loading Velox vectors from CLP column readers, including array and primitive types.          |
| velox/connectors/clp/tests/CMakeLists.txt                                                       | Added test build configuration for CLP connector tests.                                                                |
| velox/connectors/clp/tests/ClpConnectorTest.cpp                                                 | Added unit tests for CLP connector, covering pushdown, hybrid, and nested filtering scenarios.                         |
| velox/connectors/clp/tests/examples/test_1.ndjson, velox/connectors/clp/tests/examples/test_2.ndjson | Added example NDJSON data files for test cases.                                                                        |
| velox/docs/develop/connectors.rst                                                               | Updated documentation to describe the CLP connector, its architecture, and usage.                                      |
| velox/docs/configs.rst                                                                          | Added documentation for the CLP connector configuration property `clp.storage-type`.                                   |

## Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant QueryEngine
    participant ClpConnectorFactory
    participant ClpConnector
    participant ClpDataSource
    participant ClpCursor
    participant ClpVectorLoader

    QueryEngine->>ClpConnectorFactory: newConnector(id, config)
    ClpConnectorFactory->>ClpConnector: Construct with id, config
    QueryEngine->>ClpConnector: createDataSource(outputType, tableHandle, columnHandles, ctx)
    ClpConnector->>ClpDataSource: Construct with schema, handles, pool, config
    QueryEngine->>ClpDataSource: addSplit(split)
    ClpDataSource->>ClpCursor: Construct with inputSource, archivePath
    ClpDataSource->>ClpCursor: executeQuery(query, outputColumns)
    loop For each batch
        QueryEngine->>ClpDataSource: next(batchSize, future)
        ClpDataSource->>ClpCursor: fetchNext(batchSize, filteredRowIndices)
        alt rows found
            ClpDataSource->>ClpVectorLoader: createVector for each projected column
            ClpVectorLoader-->>ClpDataSource: Loaded Velox vector
            ClpDataSource-->>QueryEngine: RowVectorPtr
        else no rows
            ClpDataSource-->>QueryEngine: std::nullopt
        end
    end
sequenceDiagram
    participant ClpDataSource
    participant ClpCursor
    participant ClpQueryRunner

    ClpDataSource->>ClpCursor: executeQuery(query, outputColumns)
    ClpCursor->>ClpQueryRunner: init(schemaReader, columnMap)
    loop For each fetchNext call
        ClpDataSource->>ClpCursor: fetchNext(numRows, filteredRowIndices)
        ClpCursor->>ClpQueryRunner: fetchNext(numRows, filteredRowIndices)
        ClpQueryRunner-->>ClpCursor: filteredRowIndices
        ClpCursor-->>ClpDataSource: projectedColumns
    end
Loading

Suggested reviewers

  • kirkrodrigues

<!-- walkthrough_end -->


---

<details>
<summary>📜 Recent review details</summary>

**Configuration used: .coderabbit.yaml**
**Review profile: ASSERTIVE**
**Plan: Pro**


<details>
<summary>📥 Commits</summary>

Reviewing files that changed from the base of the PR and between b10d1aa07629c3ecb108342c78c3ce463735867d and a65d8186beeb502bc181c081552cde4ced5a38ac.

</details>

<details>
<summary>📒 Files selected for processing (1)</summary>

* `velox/connectors/clp/search_lib/ClpQueryRunner.h` (1 hunks)

</details>

<details>
<summary>⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)</summary>

* GitHub Check: Build with GCC / Ubuntu release with resolve_dependency

</details>

</details>
<!-- internal state start -->


<!-- DwQgtGAEAqAWCWBnSTIEMB26CuAXA9mAOYCmGJATmriQCaQDG+Ats2bgFyQAOFk+AIwBWJBrngA3EsgEBPRvlqU0AgfFwA6NPEgQAfACgjoCEYDEZyAAUASpETZWaCrKNxU3bABsvkCiQBHbGlcABpIcVwvOkgAIgAzEmouAEFaegBhABkrBQxyMXwKUNiUDFwKRWwGaQjYEkgAdyKAa0owRHglFGZuaLZy6nh8LHx4uoaANRIvfAAPSGzcpnzRAmLIMhUveAwiSCDKeRI50Txh0awl9AoGBClEDRh66zsmXv72ZFwXk6RxPbWfyIAiAZAJrisCusFEVaLtqEUOl0GvAPiQBrghiN0Bh6GgeF40DVYPgvEo+M0KG0+Ki+uj2FiME84FMZvMmq1KH5RCRJLVEH11GV4kVmIycXjIABpACKWQOwRckHilWYEyBIXwMIocIwCI2JzONG+L0OSrQRG0GBB6v80QkmFwixyNzufOVRU2RNg9kFYQlKCU5Xg8XgtTFuDdgLYiEQFtqcns612+x+vJpuPgNUeRilJHk724Iy+ZQqVRq9F2DC82CUHCMUAAVI2Ml5uAARahoADK+GwtxIza4AFE5jRcSaGgADTuY3v9mpTxiE2PhWmfYOAqdodLdv1LzD0KfkMdLtg/RSPSAASSdu3U8DQOwAXrVGuofWnpXKFUcA843Skew+wHSAAApuwAZn4PhZgYJ8AEpwjFbhkCsYECGVMMyW+LVrl4fARDEC4sJmWhkDAhxuCLCgAX2KcUhsGwUgATTAyZGIyAAJRiEIPXEeAoVEHyA3BZG4WodjaSApwAIWvABxa8ADloCncIZwAeQAVVkrJh3UmTZM0zT9JSZTDMPGSOJsbjGKnJCAwYfxqFqfEp1bbgMn7RAiiXEU+CSO5fR2TRIE0rB4J8FMZJPXBDKFfxcH7a10D8fBGkgKRCj4Qt8EQGIxkgQln3kbL1hNahivwHdIFoLt+CwJQxVxDQG0gZtPO8ihfIoIdIAAWUweNkDNY5TgYc5sXwKQ+CWMBu1de4E3kbhnE6QErIdHZ6ro38hOkcJZh3GLEDudE0GQKyAOWxBwgK3A9uwbhBMItYLjugNQy8GghMBYFvFwK9b0gbACuQDy2xlRVZBsbBVgoJdMJIbbsFcrCfsoZAJL4M76jFAN8G4cRmHgV9kDOzAMBixNEBaeBqJi+AKDtFHHSWvlkE9PGLpkS7CqarNxBGZx5FmfAWmeq6BJ5gmIyjIgngAMU9E40DRNcnTQCR8C6Cn4PyGL/CYHVkEaEkCvsc6xWQWgtQwfAnXlz9TRhmCmnqfw6qFi5ReqiWpcgWRsPoB3IGd6Q2oMJsWyhmG4YR/rR3Hcj1VIcghIYZ1cinaGjgTjOka1KiaKdWFKBiAiiOFrArMqTLvt+lM12p8QnzJmKq7WGImBrZgsBc8lpfoJKUpixuK5HjKyjhbMeC5ARqDuKOY886YcqyGryX6lJGDBgg1WmWYFindf1k3ndKCR2AquO1PdvxFUWGzhQ+4HpIh/u57S6Z8oSFIHq4R4jHSBuEAQ+BSRJGtPdCoKZPrXRZmgWQXNxggj+kQYGTotgCGiMgEqZU1hFFwtVWqD9Gp1QugJMGp11gxHrjPLMkc3AvHXPSQYNd7DfyKEDYqaBSqbFRuKRMzkkh7XwVlQhPU6hVUaFQF6EJSSOHfpfKRVlna/1nodEhup9hWSUEwHRdUGojC8PIc2ZB0AMGzAVegtB+wxTGpsCaU0mTPAaLYbkiR/AYDnpgSANglYZA9HwGaygfAUM6EQJqJBMTwC8MPfgeAdjkEnAoL21ZLrgwDGmZm3IBQjE6GoUKYZkC7E2HqHBMUSDxFDAwMM5R9rjSNCRbE1wbqcyjhYRYLAMQU0cGKFwzCGgmILD09g5CfgeG8L4fwhwbSoCfmqL8YMuSxEwLMRoJASAlA9lqWZwQQQxB+FVL850GAtHAacUpWAvweKUGdISxMSICAaGctotBmQIGQO8DE9hKAPHQFdbkpNcRckwoWaINB1RvJ2PMmgzBrnqjudIZyDMa7hHNlmH0k0WbsFMcCq0oMfH1HOXQT5rySUtFhXeHxNZ7l5FDBQEFujaCezIDUCIWoViwIEC4yARBsDImSVo3Ys8hiAnMWmPgpyb57BRDIFy9NASegdv6Kyz1doxQKNIOMSo7aTQxIyKOyktQ7jhDXJ8ChWBfHCHCM6YNOgFPCJ6fwEgwyNCul7Xg0h2Dkqtb81AAgDrjCsiKSaNjgLRHxdiMgDh0HSKdDK0QVL/goGxpUHG+KBnvM5Z46IYgkU2AAOTfJYIKTAHL3w/DyDyvlAqhW7DcgJfVjgGQcI1a5HM0cwCGAMCYKAZB6BFTQHgQg6dlA0HoD89gXBeD8GEO9AFwjFDKFUOoLQOh9B9vAFAdwpTpY4AIMQMgE6e5jPKFwKgmUHBOCVMu8kKg1CaG0LoHtRht2mAMBkIabQAD0wJSRSAAPpKAkqCnxsggPMCqLg39jovAUAACxgAoPDEmJANAMDFG0essQ8MGC6Ska8x6M6uXoDegZ8gip3EwKQRARhd7kEyt+tA0loN2OiBjac8GkMobQ6iDDWHWMkCXKgXYZY7EVlzbK2gnGvwAZrBwqyvK4mGKKl+cy0Asg2EQ34fjbAKFgaDAwAhPUSKIY0AARkghoAATE8EGD1kAOiEtsWoAV1SzUdRgcI3YeK2YAKwADZGCUpveEKyvkFwNG0jYeU6mXiae07p1DwY2B+vY94BoKygX1DbFyeI8NiLYinFIY+QGFPAdA4O9lkH+xeCLlhPUL4Ghfjto0DAd9ICxayA5xN9RokZOBGlKcSsYl3AyCMcc8UIjOFIE6PUbAjw8eQ6l9DS4q0uwaAKUQIYwz0B6+EN54WAxRdAg4AQcJjbrHkA9XNU41sCd/RkaiU4MswYaAFOeX5dggifISDhRMOEMVUsloDKluzQBSFkLIjXNLKQDLsHWbQIZjcjLASbf9yhAZ/SQFIDo4luca+1zrW9wizRDLICLAkVNknVElnTem0ufdVMBaLfrvoolpbWJtZRqw8/5WjHUubvVSAacwQGDNONVhrN5hJfZcCeB4fiP7yVkAxjjKQBh8E9pflp/QAi1jOmWBSJjKgNdiFtdEISc3H1+DjBOKXQqfBPCVKzuwB80h2rKRIMx3H4cPtcZkixv9lWSAgZIEZ2rUGPuIDg+UBDq39OCewyJ9A6QYibcgGHR33CYiFZ8Rb92LnHw4ITPmEYmR/cl7cwktHE2pvjP0TbxkOY8OxAbJ+kPJB/3SEA+H6r4GTMx449IX91ZuCYdT7h/DhHiPjvNzECjftqOyro8w1ATHFj+8y5xh5aKKGhhSeqQ3Orc2JHRwGfX6pYgT9KIP4z8hFkoB4QpdQXFsACG5EWToN2+ugyyVG3Gwx0b2x3bGt2cDT0wh2zqXiHkHk0j3ynUCKHkB63/CwlOCnR6SFBvkQB9A838AIkk3gCKXUCpyaA/HsBvh8GnmrBGCqQqWiA+UgG7BRjCUgGvyBztztDI1zWNFAScSNFr2p3oBLz7GQHmkWjynIHKE9QaEdx2DqSiFuxiVzU0iViVgDFzwjUukYG9AaBrzL2IX+2nhWRCVmiEiUCeBSHiF+mVGAJii/DOxqHCCcNRWJnQGokHSBSu0kSf1Zy/Av3OnIxAg5Uwi/G7wD1Hx4GoB9GunSCBQu18JyngK1Gv1OSKFeTLWLAaQS1eTbFemrj9X3w8NQAFWcCUHoEwlF3GQlx+ily535zl2N0gFN1+lbz4JeGb2cA6KKlz1omdx4E/0UPKUiBKW919232EyiKyyDw8lx1718i8Cq0jxqwgxHyyzjwnyn2EwPAz3oCzxzzmCd3oAL2K1Sk9EMNwQ4LwEP3vEBHgnOiymcFL2uI804KeQKVOxiSehenrxAKxydA81OQKIf1qzanb07zACMG70WP7wjyj3WN3zH12hT2Exnw7znxI1PXI36RX3GBozlXowMEY0mMiORLmNRJ2LaFEyun2NzRanjFtD70UxImU0FTJBijyJklRKXB2CDT9jBIgwkTM2xBswAAYrN/8lAj9agrj3NPQvwvMLhfN/NgtQtk0TtItQiYs4t7d1QpxeTioSCqAXAjsCkKhqhdcXg0DaBZBFssx/t5BqFAQlSsYSJ5T3toikcJZahSs2Q5gKsWSVjETh96swJ2wUhoBhw+IRdHYPcnT08hB95i9KBKd1Qet/8nN1RIjftMRwkNcmT2NWstQpwsgUgAAtFiRrblSoXwWacBX/eQOxeNPM/7I1VopMagMGcOHVJk0VR040BNA0o0/k00sxOJXwF5G4uJIE1nZwjDNxGYzjHLKYtoEtGSf4zHabM8UQWVJAJZLUbor2L8Q01yJcIUkzCg6tJwiSWA/bbrPUqyY7RwZ1CwroGKTAcaf4ceUVMANac5Jkl5G+N1IoEQiYLAVPVJblcZbWbQQkMvYJAArndUfXFotoidO3cIroiA23L4vo44vPIdF3YYrMUYz3YkqAH3P3aYikznHk1yaktPM1GIMpeY4TOE5YgfVYofSDZEuPC86pe4ogesSAXQSASYF42vLgU+YcLITSAADSA0jOjKA1km0mvCyHbCA0mGHBsG7GvHh0MlkvkqUpUuHDUo0q0qAz8xSECyCyA24mHAyClG7G0gGmMt0tMuUqjIsu7B0lsgsp6ynAMDEqgAyH+3uzK3mCDKWJDLWLDIoC8AjN8r4lCvEu0ktiKiAPR23PYGUtwuYoEhyob0BJx2E3x3gqJyQtPKNMvInOrWVLFI0ElMs1OzvL2x7jC0cAhPw27WhK/QWLDwRISr4tj1/QRSIAApaDAAYDmDmCYoxIIxN3nxPUX1xNvSowJLXy9xJOzzJJ30D3oqnEmumtmvmqYo210JYqqK1EZK1y/FOqJBmrmoWHqryUUSUwEiCIQD2CzJiWcykqMJqpeCevOXOoWCaqwDAinAYGojAAAHYWqWqHJEceFbL7KNTzkb1iidTHz5RgTQbEAprnqIbZsKBF5wlUAVg0ErSYgXTUwXgobnjXMy8vTZiop4kZJorAzhr6qgN6sNtKCvw5tW0GkpwBpuwFIrAUgXLGsixptHwfB5AdxkybRLi0y4CMy4t/8FcldEjMQ1dezYx+zMwddHCb4nZiazrXqKDwlpzac5zn4FynhMqYpu8NySqASdzlztkX8KEhtahHrrbSbbb3qs9bzdtQwYg0DnzurmBkJnAWhSktYD0Th+dKj7BkF4VDNRrwK7Va9IBZIIEbQwYgKVDHoYoJapaZaXKgNtJuwLLjJNJIc4cNDlZ4RlbNZw5hNUkwaXr5qa1YKCcELONCaGg0L31lrWizcOjsKGhjzeiHdCKBjiKhi3dyLxBdqqKDraKjq4kGgbqZLYS+aeLH8NjYN+6IbLrryfQjiTjlQisi9LigbriXlZBK81yDDX6m0sCnAJwrYiZM8PwylAiD7eqO9+qYShrgzuLQyxrR849SdYAWoMAgMhBfImQhMcMDBISsSF9eDl8lRV9aNdrSSaLpISigSD601SxKhJMjktQZM5MXgFM+V2TVMuTxgvxkHUH0HMHc7eL/rlcMCYhmawIbNLNLMNBIIEJskXgTZgQixMxAQMb1SXy1Rx62dQJ2kpBij3CnQabLSxBUkScus0CtGYDOrRD3TsRn8vwxhaklbIA39cAP8v9CCkC/8bwnQC7gb9cuGIgQhkAtHeGqZ+HsQxy/ZaYfiLbD7qIM1XMoVIjHimaf7SxKBmtTF9HHknRVyPbkAtzQCnZA9MJjzD6BI0xILpi4LCdEKeHZgUHwmMHsR6rwhY17FAQhRUAzGt4YgOGfplR5y8aWy4mZy6dT9YwcmD8ZcecgUiBZhKa+c6UGhyjhdqjXVajJc6RlnqEvirI9a8ADbuz1c+ytcBzzbXTLbs9Gm+GWmsBA1eRARHahnnadT0LZ6i956A6W9n7l6H7PRXcRiPct7KLIBqKv6rZcm5iT7YGRreKL6x8wn8gImsHU89jM7DitR+jJ13YgWyKzjAc+B5SfmeiLcIGoToGOLT74HEW49OhmBaB7nFrcHZ8VrsT1rOFNr9TCT18DAWRQtSGojOrUkva8qwDCqlxOaQbpwGWmXMHBL4GOCVb0gxmpxNJdKmJrx2wLKlYVJtLa6pQUgFIDJ+BPja5Zg9hOhugvw/gQQq6esgMeI/MDwKABUMRPlUAzUHxsQ8zUMTHIXMJQkWZkQcRvz7Wtwj9aAgNpr4wpX/sQnPQpw5X7mhabyFHinSX/B6A2V2nrROnGaTkXgfql88bUAVlKxrQaBaph0MBVp/AwA2yfBK5nr4wngTV+ApVqoiAyLPRazSRlQNlaGnxfoyUp7CNPmsKtQrchsl6nEAWSKN6QXxjo5wWuUdriEcXBj8X3dgwxIs2OjNkvYxRuhQGvlvYakKWoHBrqW4X+b+Lf0BRZN8BFZsGSAlr8G1rCG8TiHtrSHiTyHIWqGg8xNyh6HqhGHIBmHWtWGWT2GacOS1NuGXgn3Zh9h3rn8XafG/kRHQxMDfRn39hxHpHLN7NxS5GrJ1BS1cV8kVH9g1GQsNGkLxN/4hJ92KdakOyBXgOjH/WeE2sMpSdaoLHFSUPuBCPtGOVdGJ7+Yh0bkmbbHXEQYfSUc0o8scZH7C8SJMIw91R3rDsILu3jmjbCzzmzbxVC2nRUOX27apyJ6OSna1QsO3bAQCnIAinATfbu7ynUlrP9gYKGkI6xOo6HzY6BINHE6qQmZU60pEBs70RBHH8nEM788AjQbCV0i5sYlhGD0uDsRMIVkcQnFfpmtlRmAnQomlQPM/OvpO6s1e71QauAutYR63MZXxmPlx2TdJ2vjvnF6/n52iK8XSLd2xjt7wXd7KGDGYWYG4q4HRq6XH3xO0Ob6brb7s9sWV7cXCW7cX7Wa36K8ZZYnXPq8f6EkVOYoPPpt/VDw28+qIABrYW5v4Xz6H3s7ZMSCgM4bJ832P32WCHcWiGtrBWiSGN9qKHttpv6LQOJMIPbqe69QHqcKlW2GOEtGpx3v+SvvXtjSBSXB/9Zn6U/EUgAKXgyPjSahrQGh6gVEAx8ncdParv2AzxSmjzCqAwixPAAcYPpxMfPvvvFXRrXm1QVcOrakXHEpEDf8UCnJsCnRcDYA/V+uSI/HriLs6KHDAQFCsx1BsnWjbCuROfvALPoVinu7sz8RIiSWPIcggMWJId2wshrxZIbKAqMgLL2xrwbBnLoBNIbBqzOjtsRnmY/D9TAjgCYgMeQQPuBBsfuApXinpmPCnxfJkKD1CfVmhd4eajxdtnpdudmjzBuv2ivnp2cLZ2But216d3N6V2d6IeoWD9jq+fY/vvVv6SgvPtSQNkqktuYhl3pBRLxKxXimCrA7/Ik2W+4+02fRPHpelQwu/7SZ5fLpFf0qoBk2YkE//7bEhLqYtwlg7eHeneXf/LYt3flKvefe/eA+SWixxNHCy/PsI+QjotvZrsUCr2HuqW2gsh/hHhcAY4P7q0VWqkZAeP7YHryzIbg9IWeXLACZUUpAZhwykFIHpAsqH8Mg8OZSNfxsBXU6SmLSgviBlIjpBmqMYZOMHhyesuY5rWhtryUL4pVemeAbB1xihZ5eQXbBAUpXUqaVtKA0FSNeAGjQ4lwSbLyogO4HWU+BykAQdDiAwAB1a8NAC4jKV5BmkN7EuWLr5QKuJpF4pJFTSNBdCJwFOJBwz4yRhgaCJIAin4hHhBaTwWQcwM4FICUBaAxyrb0wHKRsBGQX3rgNoGME6AEWHDvqUvIlIbgWRDAKGAFTZtc005MPHiAEgZ88QMgeGLJlLbRZEAMlKfgLw0gqBEADWDSMizQapsNIKbBVhpF5JFDluL7YylfVerGUVsfGZnJZGKoT41BLIS2EELcjpJK8PrLJvIHiGNR8UIYHMi6EhCSIfBbmZgh20diewOu3bLMO0xZiehmGzcd2P218DAJp4+g9XIoGsZRwi+M9EvlOz064U52VfYbkuz3Z19Wi9JLfJETgEyU4B7EOSogOQGoD9ILgqwI5SwE4C4gskRDi/BGE5QNApQeHLGTYrd4/+IIAAaeFZaQNv+BgHmuPhGBQgiEz2XHBCKBgaBABnAGEdPSIwctv23LEhqD35YvBwR//TEWOCDzo8ERAIiqBeRD4pEmguhdjNY1zQrBzUvsZWrsxtYvBYabYJcEkQZE3YOCsnCZLyNEFKUXhzgjAV8M8F+9/IhIRWCwTF6Oku6RXKcCGCXA4J8A5yJkfgKOTXNjsKAcYA4KlFvCZR7gnAbSQiCoY/alHVBPgG7pmpRWzQzhJdiFEy9vmaFZ4F61VYcIPAhIKTGgAN7SoWEBfFpMh2nCFZnwr4RGG6OSLCjlM1STIuqDtZ7R2R3Qy1LM28xh9eRy0AUZ/gTGf812nbaYVAMtwvBr8aHXtnwHaFmwK4PdKwl132GYVeuT/A9pXz77V8RutfXaianIBf9P0NIpEZIi2JthURwmdEVCOxF4N/uX7cAYSL/bEjAOZIyERSOobS5rksPKTNSIDKIjVgOUMcfH3f5+EqBMw4DtmWeioU/hKwcIf2HFBaMFEB46ENIV9TYdnRsXQsR6Kq5JsCoN0IDPyUMhVM7ix+PxIIGri49xy5NebNnnViR8eaX3NsF9xHE5Rt+P+QqOMBFD9hJOn2A+mkODxthMc4QzDK9g0hdQUJ6wEifHzIlthZwPYd5pkIDCQxuA0ANzFxEPDRAqJLQl4JiDdaqFkkydQSJIFcj4oLQVoG0KeQnwdAOgSQW4LAD5LaCzSBpYoSMAUl49yCVkU8ghNpF+RIJosP1BkAADUhkrsriAqLcgggIfANBTFUKYQjJhk2zOKRBqoBeJ82awn6M5GmI1wkYmYdTRGB3jIhEdEIAwXGHgQHB4g7StGUhwqQFIDkJ0QkU/HuiP+P46VMEyJTkg6gfk58UUF2G4ieuqUPrscM7ELt16wLC4X2OLCDiBqw47KT1HHzjjyJb8dibiE4mwBgBeIgHkvggE8sN2YPLfDTwynHVGpSiZqckI0DyTaG4mcDlJhlLCUDSw0/uKNOiBSsVwZsEBvJyjFEgSA4CCWBwA4A809pOkigIdP5GwSYwAFRcnumXCZIygnsKjsL2DyKJFpHEoqvQHZRoBUIxvIcueB3ANQVACuNKL3CUQtwM60Xb5P2G8QLY4JzqISD21K6LY/aZCMSBJHAoYBpkKgOJGQX/wEQ3URPPILTRyi5p7wbcFrBMEtjUBa0Q5KyPNjsLngSQqcTCESGsRehgoeRZgH6gySxgkwmRaCk9KwAIyYZ8AOGZaiBn9wzpftUWTcnEgNAwI11GSNAGllWAKgcU9AjtOiB+ILmJvSVNMIyJvxaGaMnwG5gizhJdCvAYSVCiMbsw2AzAF5D1A+YHC2xRwivrt3+ZDdAWPYgfmCwhacz9RtAY+oRL5lLS08ZSBGQKC2kyR4gW0naS0D2kHSOAR0k6ceM2xD9wqFpPjkUH9leRA5L0sCEYyTB+yOAaCFMAADJX4SiZSNDPxlWdcABcouXsFLlFAhZndSbG/ArlsBzSVbGAIrIqClzJZCslGRwQgS+ADZo9EgGlTCrWAexdMy8EPwnmw0LS+cvaXXKIC9y+ZbckgGBDkZGMQqYlOeXnOj5LzYE9cmCE3Oawtzy5cEzeVXJ3m7yN+ec/uSQCVkUBV5b8B+VfO3npU55as7POjLLzvyLSIVSEtexqnIi6pE+Z7AHLCFCyqJ7U0ATiS5aUYep/7DfMgC3zQ80QvSeaZAvCErSbpO3EWKFHkA8FcW0BdYEyWRmhCAp4oTbGe02k1Bo5scgModIokZz45p00ORdOlJ783IQkh0FCnU4FYn6JEZNkfKIDQB8A3YMhaQAfnXwqoKwWaCI2Xnf5gQbaCMbFykWtZpZgfNJEoy6Fa80ZaoMgWFJaBKxEAwgvgFOBaBQQHIHMkYMzWpr8xG2VPa0CJAnqjIMACisZrsCVxJh40BXLwrcH5gcFkxnQ3oC8UwYOYfJ3i24korti1BVUPdS/OwOmElwhuIIIoOQulmuFi2Qi/LrDIAQHp4YqS1ema1bFYBKAlQCkIQIiQGN3QpnS6S8Br7Tyjw5EqBSJULkaK35sZJKAdABROENFEQLRUovsYZs2l94jhGBGIGS5XSpZWIErG7CxBRM4wBJQ9Ao4CRR4FAAqRmxo7KNDEZARwM8RrBuKygyOMZgIr4D4KmQQyBQKe25z0o/sZk4XCT30KQByeihWNNTw/hYw6eWSU8hKPrrKQ3KVgKwH72jLtgzwRISoEhQqWLCOJrA5gX4iKVcISl6SqgA9S0WoB2UfYP+Nm3tllKKxC9IqS7MG4lL3Z5wsbmCwGgxJ6Z6eSopnKIlCy9pkijJdIq0WtLwhh8llSQC6XXyyg3NPcUdKPEQKs5bSriTiOAUCqWFYChqdgugVtScRn7MAV1MXEg8+WgHAaQVhobMSGVisCaTD2mn01QUoY6cCAtHH1TjxRY/Hth1mmgSCJIqnBddK5k0KNpEcqORAhjn7SmF8cqVYnKXAcKtpZ4n2eUngifTuepaMZXhSwBLDAQj44YVKvCB2hg4yqWuFQUgIG5dYOK3NPiEtni0dVskfmMIIXQFoRldC7ae6sYXHxmFbSv1XBLDk1AOZq0kCbzn2Vqgg1U4ZlWiu5XSyZ+EQZoDwHyiFJOMZA/CZYtMVWCZIViyCIZEIIqLNw+wVFUBWereFrCXZcQFnCMaO5cYoiyAG0HkCWKO18YGRbQ3KanE+2/koWeMpIjixJYk+JcrxytIupAggqYbPiAdgYAwAI81NZEPv6ZrGZj0tpfmoKiiYq2FaCpuRloQp0Mmmyp0nmy+xaoRgH66ZGQXW74gxQzkddsmgbU3ScZyIIFHnKZk6pPQzS3ND0rDCiQUOtCLAhGvFA/q7CeiQqkCmI0iKuVXSxrIIFiQurbxF6yIQuvRUSQOcNDDPqgp5Am0lQGq1RFQiySobIV07bFvmxPKjKqFHCWjVyE3yOwf54SUZg8TTlWkLguUidg7IKntjle+FV2aSsXZlSKVExTKEGqPp2qdVwG8WXWtLUMLPVla71bVN9W30U5kAYcAYsdUQwD1rKiSAn0GBzTR1ZipiZOrezr8WChtMihuuOJ8BrZtsidUFq7Uhaw+Wi+eZ3Jox9QYtE8zHATPWAyU6BQodlULMog1yl5N8bNjGwqDAA85XG9pXmv5h6AGUQsviG1AnlUqLwfsmSAAG0BtDsO1PBB1AABdcbVXMXmFzatdAerRQEa0Lzmte01rQVD0Cry2l/8qtgVvEq9aaVJmjADJXS0PzuZnanldvKqlGBTVh481cKsxygKYFCqucUqo2qIKiRfLK6WgsE0YKSwp5VpaAvHXaqpVSsIkDdlwWxg3w60g0pHPoXlq3N8wKtaAq83+r61S5YHbVNC3FbPQxMpWmTCDqjLQFKG/lHyCwDatZe1G9FE0DkTcAxmzWy9b6xTUY6HVeBNNf2pY4UBsZlQXGTwthouQaAdE+cAOF3J9bhy96gNnXBiQpQgUzEwXTqWA0PKOUDNE/NzuRBDo8APiiha4Ta7QdzSes6Dp9Gtky8iwA7EtY0iHpjhwK/SzIlRqU16b0dIiVyILt2AtARdNKnpoVXoC8oFs6m1hBiD8HztFCuveQNmn6aoL1NsgVQn7vYDANq0+IR7AZhhWc70dAOyRKDpSIQ7rkd0nhM/g8gg6wdKBcdbjvbjkx1Q6GP7L0BiKPRMmbouLocjVDCJ/sVdCfEBjSEcBoAAmCvdwCsAUzMme04mZvMc3gJq090qZYMyshrQqA54NMq+Co3Y67ZS5aHncviUibdU8gcTedzwSMIqe9AJIWCkSyk8Gg5PP/pTwKj6bi++K7RYdpQQkrcWZKyzRRXahpBM6ba1PahO82xaitxjaEBjtAW5yF5B8jpeglLldAYEtcubdG2JiLamt56lrbAcA0kB2tzW8eXtupWKAZIgB+GPAEOALbgAcu6LEgf50kB8DA4f/Z3JsAZQH5T8huRrrwAPyO5NoQA6zrq1QHgAD2yRKxLLxByNts2Lgy9IYPVyC58McuHVpQjAAmDoisAzVrTW4GE5HAc+c9JamIGeDksoOZ9HYM5Q84LgDIIAMbB5BQFWh2QDobmAoGoA+29AyIqEPUwcDrB53RgBaCEHRExBrsN2Bd1gQKDjQKgxUDKBK56DM25g/NtYMaH1g14KnrRE4PRBuD+hyRKEYKjhG2J/B7pLVMMPGG9DR0lI4APCCTZWA6gSRYviICjIcjuAPI65AKOxlJlnum4j7rvC/a/4tABCE/vpKv7IFoC9PeDo/0TzwCkcwGFXPTkWK396wNoygUH2xae9U+mJDPrPRVs+j9m/PSkTIM2g8taRqVevLSpAK4RN2iqHdoGM5T5Vs4kAfiIXHvalxn289t639FgdywPCrfEdJ6B0hDUHCBGS0paOSIZ+tCiSgGSGZwTKQLQdKfvpc3w645ch10ajoaXTgdjcYoNbsGz0hM0uMnS2Hnsx1QbYdYG4NR9IcBhqOtEQ8UG6nxAq4ID7O39aalmPVr50RRbDsG0sL8g2CVAesszGSiWpmlI6o6TqsH3chkomy3zpRqxMM6MAAAbhkgGxn97Ye0urCzBKxZylANkxstSiRz4kXOMVHtDDjFKgSnoO0g6SzgTweoApvnU4ZIM1A2TVkXU07pcNuHYyOG7oLDuFFMmkKjuvaGQgXKnYXdZJtYMgCUaLoZg5BC8NgCIBbZck0e9hHbkgIbdfGMSeCjEDZQ6mVTiAPcKFHQhsgdw0pqXZyeVAp8/amspU8XGRWqncYfoQSAmZ0RRxn9mYtUY/AL1KhmjIq1o+WdkDWibVAe6E2mT2i57gjRQIYy4DUEgwLTPC1XAlotLswtGtxhGTJFvxthFlAhvo4m1rHVISBWsASE8bKAPK24VOg5u+XpTHhfcrZigGycwhtlgw6MSXRyZTUBH01HOrNSScB0AExm3Z+gOTufKwHeTHOGs72T61XQrEkeLBM4mhCT64Jv0RIb43tjqbzgpMtMGqDPZiZaj8WkYFHAFbibuRK+6xIKTPrspgh6qLJLwAtAEwRgHKDzCYLWbkRbFp7EJtMnkCgnYhp6mgh6lMn1RhcHOlE1XpK5TnojhMu02yXC4PnI17J3pU+HP0tjI1BKjscStOH36yKnsxoy/sbW/7XjfK0E66rh27SEdcwJHZIi83JzP9OmnKPSqlULHBDh84A4GCkOzaZDrBmA20tW3wG2tWJ0w+FDXMNBmlMlfedVqMssGGtpljlfIYsvrbNtoC1k1vIAWxbNItll8/TJkrfyhT6QEUxqYlNm5ttIIG+VAECsVLQ29ljA05awM2GGt+pxA4wCINZWdL/iSg93OfkCH/DEB3A1uYiMkBuDxckq5gYdg6hJ4UGD6eIacvLzDLJ52Qz6o8tNSXpegDbVkalUZG5gjYay4lZDbdAUrU4b+dGdjPqB4zd8EYxPLGtUngrigGSnVewPBBcDdhhwzlb1Omn7D7hwqxJCfntWyrQRqVbEcoC4BKr3Bga8kZhipGBrRRkozQDKMhVV2z+nuJJYhPtnazMl2tRdLktlqFLQJrq66LUsTzJFQwddRaU3W9kbZlABy8ttq16GWgEJ9eZAAAC8cQO/PFaSPz6R1v1ms4aeKplanQxN+Y01tRvMX1gqx/G8teStoH+tlh6Qy5cW1bn2tTGLc/lYkP6XQDpV4y65eW2wHzLAGyy81qATd9ZAe00cEaCKCNgpbytWW1+YVtrH7uQ4yVbVKFUQnZrmgPY2ywOOdS3t+JVVdAP6lfKLlWq3W36HGnWippVx3fkfk/IXnJEetgUd/tvrvGpwKJ1zWDc81sLjxoJs8cVtulNnYTz8HLcjo8u1T3b/4BKdmo0vQhjkNKB8CXoJ3wn8iN0rPIndqk3h2wp2CDWlHoprRq0y83WrZdFYEA8jKYNk8Rp07JmU1AUCMID23XIhgwcBMZgKFCjrd7ppdxXkuVDvvTQ1naWLimGlysA8AbXFLVyCnD92gMS4GdT6jnV6cGocEB8aJ22y229hGFfi1fqJWmbb927D2eVLBZfXwNfHV2zlDjuNm2O48VnFHZUsx3QFN9+AZsZRHgKbboUO2x0fEpf6+jWl2O36F5utXRF3lmI7QFqugP9L/d0a2uaDCrWWbfNx/jXb2CxWy4tl/G1YCnnohbZyNzucg8BDz3AFGt6qVrdAU63aJLhhidRFgWHHlVxxs28SS+2TF0F9xv7byM8hZXM9Xtl1T7bdWg2vVwJ9hYDa2nhALTLth02WxxX0XY1ywKVbYumMPqMwadlrKYwaiOms8hzKztbDQDa7EK95/XS9MN3ohjdQ88CvTsjXtMxwVAYiFcwaCyh5QjiSXTik6BlYQ9H02ncqloOGM+ZxCa4NHTpwUKrwArbcOkAlNkREANgM4GZncdu70DFQbWFjBX2HIp4mUYJ+fgBzjgAw/gHtocituRP1uijaQLsvdpxq9ZwTr+NRG4SdwhIy/d0Bk4QRUBkE4FSOZjOTVEoVTRyaWV2lCdmo47xGgjcTDw1SqQo6gI7JdD2hCg/180KxuL27vjPEcrcPHaXvchdQfIukzvp4QIhmz0Y0SqznjTAic469OdFVDEh+NyNF4EabELxs0UDzgJhoSaEOWt2RDHH5urZ7k/+CTxSIOEGC7yLtMkAz4uk4jeLp4RHx2Q5UIhO120dGJMQCamJ2489MsD/o08SF1Ig8wpISFPTpyPzpijiI0XTFs2fU6AiSzMEafEbGvEkQXxyQjWLrGQlaaUJTicJ/4XzO5AqJDdw0OGfsBHkJIOm8aQeKprNpuQdgUSdbl3DEBno34vT3kXFHienEI+qSOKCKMvxFQyEV4v9fshCBL58dD005Bs42C9x80zZyU5EPoQXNI4N4ZZXsgyhyF2TKUBs1ZyFnNZkAg6QIV2H/wcdghX4cV7i1JeJLgotrY4t3FOIhwEkwEgFwkhlMy6PDQLuMUvYKgr29cS8WAOEA7TWksiaGHlmWkhR0IbXUcMa/9myVZ2uZLHIgIvgkIuhHEjzvlFnnBeQ1JE+O2xBo+/39gtE0a+dbo57qMw9gtqNe1vGWHKZk3L+Mpbxd3tz1jNB9i4mZrv0WbRLp98S99Zumy7qH0WHtYEW74ZRO4OD183yv9LHx9xFD7Y1Q7nA0P4+Pm/+8o8zl5W85Hhrw8/MSSK46DWS6bUwfOvC2A7FVhI0oZ4OYg+DShqB1YZEPzaxDhDogGdaFvQGurChjANwZUPZylDn0OOUbpcB7SqV0GFwFYCHl6GTdXgcD+zaW2dyKtRAFQ7KqIDWXSsusPELuGAevuIPbBqVXrfa0LPcA5HwA3AKfDAAY3kiJ+VzcMFgRBU5QILIhiAyOvXw91gEMECVh4BW3pcwrBybHn43T43H7wwC9jf5W73pctF34c6CvgRPEia+/joA+MKcojWpCW3oQPQfonKiRsDwZ9d0BoPn0Ry+Abo+AG0XwAAT7gCE8ie+rsnk13QA8OfQdP4eXAKXIFcUBQjSgEw4p51hdBaVETnCNE9cd8hTE6noqy/KUR+GnPelkuTEUhn03xVGx8h2as/vHv6J7OA25iRe3wKgeSC4kSw8yhwX6K9Z0Vlw5XfC6AtvDmHQI49X+3o7IJ0Ry4TuMbh03Mkbh8iYDV3rG1oLz0EM+VyjAfH6UdJ8+9ztE7f3nGXXWlBQhZujHCH/wch9WhDzCTv0cCvibZ0qbpUxJ5nULNC127poRasKI5mzP4DTofoZzI+ANL9PgHsZQZ99U16M0GgSrxeCq/GD1wTQlQH01tlih8fvvzNjFAgGChRuiu7H3wFOC485Qn5462byM83Bow0e0nr2B5kuj2k7glQB2D2UmYbRFYRgSK2Kc1MmumYLr44kHuUKdPszAerZ2E9oA0/SYDAaK79DrvM2TlEsF22hqhW7nzOUKSZCE4UaNrEn2NV6GW7PxZ4pw82bI3SEnSyRZAxoUm0eFV/ZuJjtAAL4PvhdHmxmBi1LUVDkBUz1lNr16NYlLapnk9CcdDKup/IMBbX2hA0eD99MyQE9JAaG0DGN8ewsUe8XFAniIWN206vQfdihCjihGSuMG6FI2rFDiZxJsXVQkVECcMy9kMSXpSzmfhtYNHOpCc3TRHgIvkv8gRMN66Eh8K7Lgvzn/F/IiJefIFftk1c4Fim8KnWiifZAQaRZ/xZ/+FPlqG7NAozZtfwU0QbU8w/RdKd7kEl/cd7XR7db/TxVFtP6+FgGTyiNUDiLIAbAmkWQXI09D4vRxSFIl64sGUSRPoyuuz1gT1lheEkWp3N5lHNddo0PqW+UtchS6NcBlWutKG88cRtW6oPt79qA7Od5FuD7j4r0Iv/irgZ+4wGhjSyt/kohOa5pGhjfKHmBCgG+i3gkhW+WiPiBou+pFn7wIAkPiDpWwQEd5gol3us49Qukh5hkI6gCO7NiY7qXxOyvzEJZdiZwg/qgsC7lOg/WpXkLqLgANudLhy/DvJY9eQjuDanSWziKCUWW7hvRMm57knYZydqte4Lyt7ml5GezloEbvu0dp+5/uyQv1YvuaVvVbkgwHs1ageeHloGQeH7vB76BBgUh6mOKHhwBoeKBJh6kgitoLb4eeckR4ba5HjF5UetAHrZVaznvh5bmjHof6YOsWqzYcAyPpx4ZQsbjx77UY4Px7iYXnv6Bxy25LsCSe+PiQDFy4QUlZKA0XpR60q3PuKYmu/+m/BAYoqCcAieGgR1YmWPSCMB7SfPpQA+B/ABEFzyHnqkH8oMSGr45utAJr7a+eQeNYiYkQZ0HCeToHr5ogk6Eb5DBVJop51WQHtGwge0DssLO+AmJNioBFAO1p++AfogBXylJsiAfWUAOfa8K6MPIGRBfgXF6huzfrE6emqXidY9ytQWA7ke8QSp6T+kiPcGPyjwfYD46NQYYEFybni3rme/MJZ6W2NnjVb/BbNpYEtWAIZIjueKQeME+e90L8Ehe6trCKa2+7oKpHuLEl+5jS33HQ7G2CCqbblifUpMSOmQ0m2C3WL0lxKTSlxgwy76RqgaTv20qharfisgGeJsOw3t8B9qM9hcq5KU7lgo4hegctIdezql16iBFaojoeafXiI5CBaOqE7V2oigL6i6iPv0r8ukeLOpQWWAImBKA0QJaC647Yu34deiJqApUhShlOCe0ioegjKh9Mk8CuGpMDbheSykmmTp2NoegaI+7kNIGmIe0uqa0+hahBKN60UPY4iilsK/pSqZockIWhEMHEYrOG8tP62hRdFMKIO6aDNChsDLDswBQ+gjqBMwkFoewfgAMucaciswlnCqwCwnwDtuo7vlICW1+vqTCWs7qNyP6q7NeC5hSmPSQeYUQUoqeQEYdEB7SVobXZ+WO2nyqVm3YcHJv2RXrdoleQoZEbUhAvE2EthbJG2FJsXoTLYcAvoTz6LAlIbiE9hhci6EtY6Do5rDhW4aOH8qmIVKqUOU4VVYzhr2AV4Yh8wAe7FeMqheFBy40gSHziDDsSG9Se1Bba08FIU+HUheqluIGqTtnNL/am4cKFp4QamKGnkvtoCbiBAdv15yhDStTSNqt9vdItm4YUeHjqw9hiaj2P0mQiE+vBut4vSvoCFxZw3zE+KPa2HKioZ2tNp6Dk6uJqhRBKPsv4JreAPnBKAogMv2YNIgAddBfqlcBmp2Ef6korxuHuJtDfgTjjDCKO8+mUCqO+OpOCWwvIdL7FuKYTzpvm1iERrM2TFl+CsR4svIwOOP4GNBgISYaHoJCg2sNrbClMBNrjajWFir3iWuGXRa4eRLkgymhygchcKgdFzCV2SFFOB9haDrGRGmMYa6Hxhl4OI4AwYoDghP43MNuoiRgZtiCRYu4WTDigf6muEEsooNQDwuMBOIDuOWGlzL3ITrpsC9AuBHJE9AEuL+6Yy+7Fow6Rw5pFgEmgnFjAIAL0AVAtQa6lpGuwf4OXY72VYfvbOyh9nWGlSc7lZqrs3so2p2azEiOGOasliIEg2Yge5rCOSch+AKBSjppZ2qI4SA61yYDrRHhekDpCFAGOXqxHryFgZAYNaoHu1otAAQF4CGG1ltg5yBmkbPLD8WXrtHHy+0ZfIDhcVp/L3RABq1Zvui2idGlyZ0RdEww+4e9Eb8oHpyioORAOg5tB+QSMG7yw/MuE+hopuuGBRe4a9EYO0MSQ7ohZDqeHa2d2n+JySAEiQQTiv/OSJYir4a9pEhv7Ew6khEPFOLriIHIBGO27XHu53hWIeAp4xdwATECA9IslLshPoiEw0MzXuoow2ekkqCuSqhAuZ8i3ANJLsxE0rP5CaOElxgQwVAb1BcSNEl5D6udtmrGGGBcJQCqxdqtrHwwGcJrF2qsbtS66xjEkaaUuG8H0yc68kn6iVcYsC7rYwI3PiiYQksa3p7S0nEGQqI46mP4iSqRH8izQlqFMKCuCfr4CVcJSA5ZmensXJJ8gQGLIj0BiMBpBAhici9woW0gMZQpxHAMMCZx0cTuE3QucdwAex+cXJJ7S/0Y0JHgCEmi7cSrWFlx3g4agFKQchYDQxZ49ko5LUW5kokDdkw2NghMElYYZrVhk7jfp9RNfGJZDRk3LXF8SWsPsRcAvZlnAOxgpm2DSxskncCOazMXMD3hE4eOIyxnMUTEkAtMViI9qczg+TkheEgGACSMQHWJXa8IuOFbGbMSvGwAu8crE5S+Ic9pG2b4SbaUxJISSKb4rDj9rsOshIKHdQvUDw7e2MEYI6zREgdwBLyD8ZzE1qiEaFGphhiJcrtw+7MjBfm+kRJFpkbkJaB5kaUG0ixxQEKGjAE/rr9T7AoPlJHf6OOss7p20FPq4oaDzBgA+KjphQr/ghCQ0D9290Mdz7AV4hrI4qpXAxDsJVnjS5/O04NW40AhhnK5naNEQAHbqBzAt6SyJEfeTm0TqIZyc0EMN6gU+kicH7BQk+r8ovA//m7DiYpqOUpJaOqNpz+AqJp447APZk+rso5Aq0SQ4NopgCIAzdq3izxmII8pwgz4I/zhQFAGABjAYACkDfUooOEB6gCwu+CukPThY52K11r4lpi9icEjsylruqCOIqAHFyDAxEJzS9CHiu3Br01Ytkmpm8pugBzmt2JiCkA1ONs6JMj4FChJ6NyvKg4coiRHLAEPuKeCIOdrimbquxJsfExA5vlyBFQ6iCi4eoTwBFDqgjKIsaFu/tHfCpI0nMqylK+/PsBCgOdngADYa6kIiii+XGGLMJxfmy4MynepiCV6s8BagHQgXp254RXYCd7UQNid8D7J6sC9CRYnblqYCWVbmYmxg9uiDCguo/pUDVwJEGbqauqThAG3EpLhlGAYMUIon3IBjFC6aS0Sd8T8c57D64tILqrMlhe/+MBKDwgNEJDiE3sOcQ6C53EGDiA0dOrjJuS+Lo5do7YAWz2E6OMsLdMJfLUDBspCSSn4wl0C3CyRPiVrzegVsEyl8qX4I4ipYGcDk4fwdOkUC6KdHJykXQhEQ0BZ4Tyf/glsbprb7K6PKW7B8plACb5jwwYUwCZuRUKD5WwVMGSgpJYcNqnBmWEnEKcaEMuMiywejv7Q7gDoD4hB07Ykq4WpRKOIC+AkVMGYnAN8PvBjsJIspGvQqkZYjMy3zDf5ly/cG97uQ82OhBvQErrQAOeNcREhUJuULMD6Jh9Owm7JqmiGi1wKaZsLaIeqXADe+JIHgC2oYZgfT0AaHFy5aEZYfUnIA0HKmZxIrbn0jBQuhAAH2kmIAsBJ6n0AUmWobachBIAlPtinHJJSC3DbQsXuXoHJL0FQBEk4FI6mkwiABHBdoBmpfqFSPUQKEjxJ9oNHHBTRrwHqx1AXGIhyA3gCYQJUoXNEwJ/4oBJiu27vTJpCkQc/HbmoRkrj8BXOHek7JoMdJw96PwOR4AAftemLWw/JcHiJJAIYZrR2XsfKGRO0W56N+PBto4Oe5HmMF6eP1K0m4AyQYJ7jBP8swABetQd9EwhxnusDwhSGd548Gj/ob4ZQEXowiIA5Hg9GAhecRZ58ywiZQDgh3QbgARp1cPZ5+OQMZ9b0kfsVCjnBc8sOBlhk2N0CaJlQNYgAZ5HjxmVKfGQ0B3wKQOwkjG6xreEbxrMdvGwJ/JPdoaxFXriJwKnLDV4fau1AKzfanGNqqqZwGtuI0RBmduk8OUEbyLgJM0YelQJx6fjGnpwdnzEdeI/kVx0W4ch5jiJDiIqDBCcUdnALQHMHozYc91DRGFY4SDsCJAJmNWAOJ+ICBmE8ncOtDLCQ6ZqgjA90GcnJuA7iml3wywlqYDuNOEO6kabML4Bap0gIDBKRHXtGZKJlOFyQKJfjqf7fJ70HFECQnvqcw7u6CUaCH05ujDR/pkiY5A/U6oID5Dumfn55pOFED7YtJhgirKHmefoUTBuwaalAw04afVlRpMaQhAOY/CYn5CgKfmFqpQquNtjb+HET9hMJMSjsmzJnCebqABEcORikprToqLXIipvCrUq/xqH6QyYqQTAIIgEMmkSZfTMwQgwQ/lBykgqcCebkBUiOCjZEMhDwhSQPPLl4Rom6uYkpZr2dJro4KqebrKpGwAinw572ctCppyeqJmwqLUgz5omI9riwMRfiFOC45FAOJlLgLahiiUEHGXX47uHmI4gCZOouYmbQxAZln9uf1JN43S2QnL4Q5GUGAD2gMwBQixIXNMOgppBGlzIPJXKYMlgevqR+RiRCAL6btAQub4AvJzSIzq/QKJv3ELpE7kunDx7ASJYNhXAePE2am6cAm6SJ6ru7MhQqjvHKZ16T/aQ2f9ooHHSMkOTYbhW6b1C5yecbel4A96T4Z+5T6SsGbQ7Ca+mwA1luATz6MlJ+mES+rt+lQA10SMRcZt8tzRFB3WYDEPRy8qXIgZ5GXCHgZNBo+64AUGTfIdBCIbBnjZSQTBkLYjgGhk7RdQcdFOWbnlXk+eGMKOyEZjQMRnZgKBnvKfRsISZ5ZxVGW/A0ZfUDwaLZkaZOgOeQMfXzlIBym2rk5lOetzDqMlO2A+wIsC4AmouACrBJCxlPH6ZMT4OTk75uSTtCGG3YM2loAcwIflJZHejGCjpNgKQzGUm8D2xRQB+RpCGGG+aEaspdAMZTdgujhvlb5uIN/nVA1iCFTT5LasaHQeR6lnhL5MkODGH5b1lfAaQSsCAjGUUmc07GUGgurIYAxlNpD2GDsB1jY2ugJZggFE3Neie2o2CHA9qikT5p56r8looUK8wcHncucEhjGUsN8djGHu98SemExnkAbEIwT2vsYdSH8RTGQCn4YBynx+mTwXxwhsebE48+qozEBmI3sxK8FhcKKHQ60Ed16ShSltKFP2E+HZkcxDmXukCanGILG4AfakS5+w8gaYKtwQOnBkTZfqBqLWF7ScXpqOqSUqnSFfAImAFZbqMGEEA+ZChkW+4wPUoR2IvAjnY5GUdwnQoZqeLhnMCpicAWO6svGhfkTiD+T/QltmS7huSdkCj4BRUEGl+uiYInEm89KYsGkJ8Gt0Dk6Zbn2CoQD0viDo5/JtCocpFRc9BOi1etH7EIUbJxE7Keiv5ysuYXnyo1Fqui/qsuKEBFiqswYb0UzeGmr4DnexohtzkAtDManME3YJ/gFQhwOH7d0zhcVGRUiiff66uGbHrKx+6OrYVtJxGgUVDk9KUEVdktEPfYF+CjFEVOwMRTPBxF2zqYgu2WpvmaKmJZCzI+g9SpEpjJJrjESQ6n0Hq5h+9xSbRmckXsOxeEoKPDynekQtkWYSw2Yt7a4FrgKynFJENyiZBtQPxi+ACBHMi4sfSSEgg+tvgsiIlM3gWQxFtrpTCrAzBAKzEaqoTxKOwlqASX6k2qVSXkAtiBSlS+lKazI4wXHCwjzh2IKuQPKNFpkDGSNaFaBYw90AMhOg53kQGVsb1uW7rckkmZ4CiD8SLGFGyjF8A65e9oumsBvUYbn1hvYl7KTEgzjPGp5sXpIX5w7hf3qtw3uUXFt6P+UynD5ehhanD5GgcIYNWohs1biYkELZh/BA+SCHUZYIXB5vwQ0NwCgi8ApaUuAOsZCYoRY4ewUPh3AI+xKZ3BXHBWlfBbOHT5ppXSoyQVeR7nKFSNhwBHFCGXmUGKteZnkYZrnnCHN5+GcNkeGnedIARl+sVIUIwoCfGUsxZ4bjEplAgMKoFlnOrOGyZWMR2U4xnBfZmpl3AH2Uvhb8YIXkxWmScbm2kxI14Cxe/GqxRlsMO4U8OuBMEoWIDtvSGde6hRKGKWylppaB2ehY/EGFiEYGqNqhgt4Rpwa1GRS5wLZSoXQE2ZnNkzZ5xDk7TwOWYCDHxhSR5iSEAWRa7KctsNIBOukHD8brcdbhuQURowsjA1IOvHii9C+KemSy5WARzxLZToKzA2pvjlK5r++bNH7PI8BJw5tgpsTbExa3qR17TefAFj58Rp5jirEIU6UO67AbiSuamJ3qG8l2MnoN+Vy5fiMimW2/gvBCWwCbr/iSAyGsAgWgJ3u+WA493hXbQxB6A4XqAUieL6f5fxWbjrcjqcxVpRX1JKBbe1GD0VpF9hcWVSJsqf1mGCyrqzJElVFl0mxcKojEA7Ay/IXY8y6oC/4sltvqhVgMalaihNwaADlFXQ8SMP4DFbkO+Zcy3zPCVSVkrkgE7FPpFxSVsjXAbBx+62WqKbZVoBxoUwBJrKXaKsORxU+YoRRUBbI4FLUURY9YibJZFowjkUCR3yoGkYVEVSGmoptjoJLAlL2fUqPFCwJpKMlw8o4ABFxtJri1AFPjmky+N0ukktukQuL4Kl6MOIC+mLPls4AVMscKBUAxWq24Bgy/vgFZZ0YHuSYAB5F2idRA8d1F6ly6QaX9RxuZcLDRPOWaVKFT5VfCCBJTsIFWZmhceUlap5SXH6FJBPjYXumlu9FTgn1d9zvRa5TGVgQ70WJQVldHlnEyxTKro5DQ6ODwbOwoQADV15GGSDUPxnsSCAq27Fd5g8Gm6jDWwxgNb3lQhR0dAZ5xUmR9nD5PBtJxulsNd/JOumRBFQFQmNVjVA1ngXnGg1HAIxkNZGALZ4YVFwAhB8mH1Z9WjG56TPLAx5pZWzqAdpcXGOlF0M6UI5bpTNoelJgUsHel5QL6X+llGYGVD5wZVXLpe/cGGWzByIHyYl5w/HmXFliGZ57IZZZTa7oZLno3nVlZebgAt5BGfWVCupGXrWC1OWowZW1/eSrUFQoIdZ6j5MSCzXLZLGVzUsFEqgmVbxSZXbnjlpFSoj8FhtjOXVe3UtpnMO57HpkogkFi14kVVLmRUBacPiH7TJsLo/Cs47kAGVe1QZT7GlgWoMtUn+TMrCC+JovCFw1VUsvxp3qLuTJEkyxUV+B3wXIDnbA52itsX8VnfkogbkSMst7EBowOayWoBGUiVVBCwChBeOVPhRUmwX2U/m0hZeuiB9AvBHPZEwX0s4aYgcrjnXBQpxXSmzQ6fmXCWVyJZ9BeFcpE+AHIuxVkR3+ltlebBhk9S/6TpANBMBgW5QO2I3mK/jlLYcBupMU/1UiImDZon5OEiGpqCtMjKwUUX4owpQoPIrXWjPnzliM19XakwFoipMDuoWoiErbY3CU0lTgd8LvnQaDWO0l2opdudB4InOfsDt+cnAPViyFClA0u4dTuf6b+pbsjnrC6UYPKQImADAjoIjkHA2TOqSPgHfMnjtUnow4lU6BCN3fuFwJsMkEbzc8dEmoIqw1FYgjwEsDYYwCNnmLBUmJMkKgVIIsboZB6JtQAABS/lAjjLywRbQ3KIGUsrrJsqIPKyqS1jnzmeozTpsBchHESY3w4vinAgFVW9ZiYi0KjYA0bkMwGwhzeQ6DUgW8ViDXV7AevCDDAg7db8BuN+ASlGFJgwuQB0A/TEdw8In6uJUYIVSUTBJMnpk8AeNCOHolVIZYQkhIq1TiUqKJGTrAhEAACDWk1gwIFHCjggoEoRBMHwHs4ga+5kXjthcjSaa71IQvLmZ07YTbXGUdsMMQiYGkB54AAHCJ5LgRzp6BqyUCKRngUHYaIqNYKpsaT2GbbNtW65LAWSxsBJUqPHzubGRJZLuVsefBZ1TuanJLRJWibGZ1KiKLXAhJdWrXWec2dLUQFWiiNoZaftLR74eVZSZ41lreZPAO1miKRn420AGvXc8iDjJTwo69VCjAAFCsObQArhNLLDmrwVsFZQRQf00C6XYEdaNArBJhXm1Hxuj7eGaLtZbmGLNpcEENiVQS1EtWAeECSURylxDuqehiSASw5OJIit6KIXkiAwrhmJ6kt6wE/J6GAMD9BohrBTbldlXBT2WXNRQGbG2xZMfHUqq38fV5QcD9U17cK6ddwBR1NLtnWAy+8M/DL+SntbFl1CheKB5RUSJBy4trIMfCANFjZbw5AYANQ1vlKKUuSd1uME97hwefHfVIBw9Zf4gydKL/DwFgCIOw1QAhMs0uJPDd41sJzTjfrmN/+CSA4QaUNMV/q7sS80kA3tXq0EEgqdpp6yD8OAGSyG5Kwkw0XzSFqOQvEUDmptxJmFWT1Zro7W5oBUEa4V+2iPnW2KBPvyGoJvQqUiQWrFC6oHBGeFgB057Sfg1bwhDc1iGQmKMFB51+ERMU9sYuAEJapxJZ/VpQldTlDOo49VyLK67kMy3BArLRLBqCu8PC3c8ObDMAacTGja0KNx6pUaMybyfMhDtvjSbwEu+pGIQ9kwTu5IciBCk6H4gPwIPCC5OopahzxRKJtYcJFVRd4jYKkkdrRBoKC1AFyhjTulEpqfo2inqgUPBV1I4yMU3Q5ywpDn4oFbEhTtuNwC43+tKJQNUluLrj4jom29QcRqFlmRoVHl2hSeW6FT1eeUvVTmpwqMBXUbqWHN+pcc2rpjYeunnNXMsxK6tl1bumIRwNn7ZwRModAnMdcCb/YJ5PYlRWZyIndubF1mbaXXbwHzf3VltAPiujaeX0ZbV952GcC321RGY7VXRNfujADt9OSFZC12iOO1Pg9LaoSg+TLSg37tLQOy3uqXLTlA8tunphUlZP0AK1+0mLaK0BdLHlDHDBWDpZ2S+0LbwSTWx7ejBIt6LexGotF/lsAGYmLe1qXBl7fi0eGDLS51CtRQE/KANFnebJ1+iNj1BRx9pXtKD5SiJLWSyw+QvZkSfMqdo/ND8k12pWwQdCGAtxnTbV9WILdmxgtJGQvajG0XdtjxaZEbAB/t69r4C8h61kYEgduBpB17SleIy57S8He1rOAzTmMZxGI3YOXXat8R/bji/BHHiriGIqTHTlGmQSKMO38SuL+4/BDyacW0PL7JMxUreAond3MaeLWqWrUVziJbXGLFQyS2CeExULekdIieIQFqIOct9QrG/hFViEBcSfqA91/diFKgCfO+TrHo+gngiEB/GxqvYDsRCMoj11x2zYJJiSeCQVBBx4cYpKRxwPYGSg9UqsZRVx3LfwRwJGkAhKPO4PSCAs9AYApAQICzA0BQtNoBHF+kCkAL2cAHAEQD8EQOiL0hAe0hL0hATVrsCHtHkl+1emvImrBogkWpar+ENxUHxv8eVQ0lMAtOn21eifwlr1RCh9K1yIUWmqmDBSXOQKxL6KXBTCeJIpa0SH6ryi1QU8Hyuq3su2peO4HNe9gRS8d5Ko2H9iGGDeFDl8mZ2XvdwTPdrhh8Pa/ECFV3UcYfhyCj/HCa6Ttj2w9cfSCA0hu5RByJEgqFChFQWBomhpScjuBBbmkAI/lAgLOW8l7AA4UTrvGy/iqDfGrQFwqykQKA914c8nuAUvGOUKL2CUNfqlyR2mkLyXrAovQgayKVnG/W8JlTLJK2wgnEMLyOedpRw56z5i8gBQViSGJfFQTDn2L68AGOCLVLmecrJhbIkQYvwzHhY0l2sRJ9B/p5uuvWpQWePO0WIl/eBS6c6vRmE0M/dtK6tYaUklnjVpInGq1SG5BjKEKJGh/C1calfhGs4H/ZxgBOZ8crpQV2MISDahiHFyBWQnmXYzjGPxuSijIx9WdBbAmKWtJx6AkJtgAy3qB8VDEeBCTiBttYL+Rm4XJFgDiNAYJi5P+aXYF7bqsueBSwAsgEGixe3FVahqACyVQOwAJOEtUBkguWwRrCJrl2h7gpEYn7aRaUhnz1gq7D5EhAlmCajYO1A4JxTgXAD/kuJxdgDgSpt9OQNgwog4v1PJEWFcmoWHxna1vFmyQ0CxAzSqUCSy4FJ640RMLrOnzss2RQmqD/BJZhaD5gx1i6D3WFJoiDYg28W4OdAJnTvGOkdWnYgTg8zalAVkLECq4YMKUCEB5OHkkm8s/iMglFQyba4ymY7Bvz8EtmJoNmDJOCEPCJwmgCn0IZCGQO3EngNoMdYVg4KA8KGAQsDN9/xdiCsDIboU5xZP5TSbIa0KRJCPsn+KwkZiFuBY5hYqlW3l8tP0DmAlDIQLZiBDlQ6kDWDPCk0NBDTA/8WHEWrn0N04kw3bhYqMQy6pxDxEVtCa0XrsWyklPjm1BLDIILZhcQvA5YQhD2RkIO9VFQxYP0+HTr0M/OqcN3VdDalQ4OgwASoJU4NFOBVEhwrhLb37APA3wMhuDA88wkAIFMMCc6RgMOAcpD3frhAoy/g/1KlbXFSUbc9KP4BeIKFq5wug5wwh6y8O3skK39GCakj4jWeJf1He8SXMB3JcAy6DHOJ3pDq0QlUP1gNARQ2k72tpCd4NRpbbUuQndCgMfXD6+BEYNWQfw1AEwpwyj8Ny5FA4OQcJXw/gVNRjoFmBylhHUgj/DiKIqN5u1yvRQrAjvcTpbZmnB+VyxNCS4U89L7NECi9AYCrBcidcEUroA4SCd2+9zATOx7VBuUH2cBlwicGVmcPXFYUIQ/Uy6R2jzr2Ey9HAKP0ToRQBP0Fq1uYd0shv6Cd2x9tUqL1iqAnZXD812flfbj9EPdQWXBRLdpDcA+we0E/pRQVC3OA7YIJzVj6MZEGxjHAHrY90bQJ5CBBmeRtHMeYeeR5o+wrd4bzY0ThiaB+MBv4B7SVgCgMmoSgNQYEgmAHON/N7tdhltjjHsXKlyl/ax4Td/hturzYo4ByMkA0Vo/KxEgGY9EryXGAOPEFJwXUTiAOzFOBRS0AEBhKwSMNj3gjyADY0QmA/RgTye1Bf4PlDzQ1gWRB/g6sM6DIE8sOAT2w4p6lDYE8EMQTjw88MIjN4/SQOFTsFaBgQ4mDcBEADAEdio2ehnNgSAsZJcoka8MF6M4lSg+2WR9I5cd0x934/H3Xh+3WwXDlHBbROQiv6LANj4zPdIy4g9zEq2aZCdfOUAcMAspDtgmHQ934RLcddRNGU2FaAxQlmE5KYdaHKMQnJyisvZ7QXENADQAuQP8n+dJsARa+aHKewBKg8sYQGcIDaTck35dyfC54l14NtFmEdk+BAjybmI5CaT2k4g42TByDKWxEGUcoyWwLvmUgOhsKKICV4+o25O5AaQ6WhWES5MZPBCTAMfWvtyAOFOn9YECaypdIKpDjhAVgNpCpdOrPpDRkWU1GTcQayrYghgZIw0gpAVgNeDlItALRqfQAwzwlxGMMuSCIJkmP6Cqoe2Cok5V9UHgTgIFRBFgrJHuF1Ov1ldFaxV6sAGVl0By+oxoH9+pCOhVMaybizqoeoINPBgOuHQh2JkIuBAOTBuJqHuwI8iVMs0wwD2QipQlYbSrkYEI5Lik4QI5KWYN0+KSIYq2U5n0UcRgCgGC7I2iAB5vjA1DAksI0MQUA6EltV5SO1Vx0B907sfbB9JuaH3Xxb3WxNAwHEx9N0gceMz32YvE5gz8T13Sn3LiIk2JOmNu/d9OYgcxCjMaAaM6pJ6itKoaoZS68ZvF3xcM3HicTAlByE0M3KGn4KTkAMU1gALFbixsEDSMbCwgt/Rynd1I6Xcn/g5SmLhlw93jUpCQQg0RwvEWKSc4JcCo+c6tAliFlEUUfqDzM8I8U1yB3jDRHoRvWjcrUANT+1KYUqztIh9AwIXKt7B0wABEyRZhCyZ9AU+lxQITzTqyYOS/J8FHWlcJFAG6gcokU6COokgXjE7DD0QHyPIQDgc6Q9VHcrd6RQG7EAg0MkuWbDOADs+BSLw5yFeJCAggKuo5YTwJiPBQms/BZumwTdhUvmv0oTMOAFk4HFNmYsFINX+cRsKBaVKpDeC5AZqMCDgwIKW/wOR9OZnD6jQDJxZKAouZNMNQQ1Xxytu8PAyzb13sF5jbY8XGqCazxAz6AuYSavOpDD+7PeZ/wlIr9MRs+wJ4AAz+UEwjAz+zf6Pcd+1UGMDRIfZVLh9B3fu76oceEoDRUSZYKoaAPUDOKx1Sfe+Ffxn4QKwtoDxiRBZpabsYLL6S/UkZE6T4CMAYIobIG4pF+wFxDugVkNpMZAYAFxAgLVddjDOAp9bYPzAG5ArgBJ8QGABpgzreyBPzS5Hn2MiNAJSJ4Fq6gTpVQso5ACwLRCQJDXAW5ra7f1700eNMW5rRwgDz8FF2hiFrNVLMkEYzEwujOWaTdT/4ChGn6z+ERC6CV96KS/DScFjTN1B4CswijuwUEIjhp1PwQ6HOA43r4ipI9Cw0CV96dB+b+0cSuHo1GACQKOQAO1rotgmdUDqJi0WoR6DhodKS6pZWAYKEF5mbmXPDl9W5mEmOwFtFVCV9VkFuZ/WQzQyyE44HQYsloTSV/bdMIFfvgvIpkTFF7QRAg1B3A8ML8aJgKuOPbTg/do1j+NH2VxixLfAWWwgVouWHpDeITWqyfeoUEuCVT1U6Dn4VHWXEtzmVRDK4TZrRFVMi4gmWfi3kPdoeYpQ5lfUA36FCdYsUpGxc+CMg3dHzOmw6oIYl/gmOe6ALkmsKehg+fYD74wufrkGwCAHGhMC5IjTgJCNywssPK1q8LvP5IuBHSaNLeAbQGAoQyAMv41N2LnZLlOSARk4YuCGkqOX+UcHYIWIovD3a+ilRP4LXptDAC7Ud6bMmlFL/dks7bJ7OEuRpJaCz1AZN5GOuCsixiUVwpAkOCMnwCsroW30JUbESlUpQyefUSg3kmGzJFIIOAGteJ7m/zGwLfgv4RuaUPQj4BeUNc7jAx/oeLDk+iCujnxW8PnXkIzUGhZlO8jvfU+9RgB2zliTQA2InsnxV+AGLKC4TKegIcYFBzA0C/YsGoqijc6s1QM/Ok6leuQGO1hB1Sc2DREqxuxSrx7JfC7VJ84GNuyhpZ7LcpGbEoDt+ApnkM/zmq1gCiL9JOqjicZGDDN7it8/uLhCjwC/MYzyfZ/Op9gHJY7igBULaO68MQLEDCLtUqUCermdH1wOLv89BZOZMaxcbGZiRLkuPdNGhmjXWpFnBJHg2xLc74L3alloSQXjRKjVKY+hI0oN+pKNjdgsaRCm5MJED0n8jmUimE4w+7KsJcmVs6wlFQBCR9mRuIQJnDK4lFsg1HKxCC2vmKMBVOp+oW+NmskQoHHEa4sqIEtg1JSLhv0piUC9vNxA8ZukodAKolnA6qvJqUCrr0Fhx0gz+q9auGrZ80dUVSA4lfMGAGgD2w/An+L+g/G6wh6i/oySNgBzAzrYhzOt/MBoCyAzAF4ChrH8yIWp9ArGQBuopPr8jW8Ggpz1n8gVGu5pMRyjAX280ZB5Tkza3N8xTgiG6qUfl3JLEC/CqmK0RXFVples0AL0O8auM7jK0TnEZsK0B/rMkB+sfg367+sbIceIBvAb+uGBtn6kGw1iBqO1I900R1+E7MFc/kyhyzzTin9jNs9AGhtaCePKhaWsECzyLyEqq4euIbzMCMAobZ3EhTvUC5FeCTCXbCyKccfo6MplgawhsjzClSlBxwqPbu7AFJ5q3ZaXweKnqv+9Jwkat8dUM8WA3hvaP2g1Tc06Oj0OS/DOh+AaANejdS96KuhPoG6K+iGAYWz8jqAlQeRBBkbqL7jzaf2LRCpbxgDuibAiGGgABYZW0Fi2YiGJBC0AlmAFjikAWAIAMA8QJBAAAnAjQMAQWG1uIYAgAjQKTCNGgCWYNSNZiQQAWAFj0AW6GFsCA4pCQABYtmFIwBYlmAwAzNkENUjjbkEOKTxAtmESDBYCNAICQQlmCoCbbCNOKRoAtmHNuHbRW9NuNbqgOVskAtmLQBBY/W7QAzNaAHQC2YDAG1uiAiGCQDLbbW1tvDbE25BCbbdWwFjXbJW4hjPbFW0NsCAbW0oD1bJAOKRBYB299vjb1W4hhLbaAJBBCe8QMNvPbCNJBDwQEOxACQAMzQFhEgiQOKQzNtAAFgzNeO4hgA7tmC8gzNJADjszN0OwjTxAXW59uBYvpQpNtUU2yVvI7HO0juJAMzV9tQ7kEO9uNbkEDM1tbAgAFhjbaAG1u1brO5HICAjkmgCIYjOyTtQAAgPTsMA4pJrtBYwYvECIYO24hhSMMzTM0W78QLQAI0dW0Nt07zOyoBG7AO5Zi6YQu6TskAbW/EAA7UO7Zi2Y7W+VudbvW2ztBYlmOkB+7iQBdsVbJADTunbtmFVu2Yeu5AAI0Vuwnuo76QAwBw7bWxTtBY3WwFgMAtmITuDbNW8zsCAQWK1vik4pGNs07aey1t07CO4hgO7iGOKTM7fu/nu2Y6u7QA9b20itudbDAJBD7bNu9zszNae9Zhg75O/Nul79CttKF7ou7jtBY4pEbvG7AgIhgS71SEkB9bae2gAS7tWx/AI0CNBTsMAVu1LuQQVu6bus7lmL7u2YGu1sgXb4pEDtp7JAM9vxArOzVsI0ke6RxtbR27nsPbHe1vtV7a2zuAzNAgPbsZ7tAETtp7tAMPt97H+wjRzb42wFh/7bW/vvV7Ru7QAPTNQPVvpAxe51uDbCe/iDe7+u3Vs1A3O5ftnbtANVutbm+8butbVezTtn7j2/Nvtbmu2zvM7ae93vlbNO9ts1IKgArt9biQBNto7he4dufbD0w7szNte/1vQQpB5ACY70u4gcl7fW2fs1Is26RzzbkcmttBYtu19vG7cB3VszNGexPsKHGe/EBBYRO0kCF7QWFYiQQM2+KT1QtmKrty7f+19sQHPW29t47lh1dsKHNB4zsI0bW21vdbju7ftoAx+6RwHblmFVszb8QM1sn7ue9TtB7Dhw1vcHVu9ZgmHK2+TuNb1SDM2OS820SDG72u2zsMAtAIhgI0b24rtBYwWILtvoYW4weNb3+y1vD775pUfdbKuwNtzbBuzruXYcByvsMAt+47uv7iGPEBNb7W41ujH3+6tsVb+R04fxAde39us7RuzEfV7rO7bsBY4wAofP7iu1HvYHR2yLtHbQWLTts7lhzUgPbiGCXvbS8O94dBHlx2nvP7r2zOYj7AWNtLNbxezNsXH423QBV7b+3/va7pHKzvoHTkgoeWYXO11skAZWw9OYHUexds07++0dvlbw22gAq7Ve4NttbK237te79RyVsCACk07toA4pN/suHw+6IB4n1O5fufblR60eQnVh4Tt07z25Ns4npO2gA1Hr25Zgc7LyNtINbzO4MczNy29Tv1b822UeQnFYBTty7RIEVsfoUABlu4AWW4gA5b7qPNquu+gEAA== -->

<!-- internal state end -->
<!-- finishing_touch_checkbox_start -->

<details open="true">
<summary>✨ Finishing Touches</summary>

- [ ] <!-- {"checkboxId": "7962f53c-55bc-4827-bfbf-6a18da830691"} --> 📝 Generate Docstrings

</details>

<!-- finishing_touch_checkbox_end -->
<!-- tips_start -->

---

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

<details>
<summary>❤️ Share</summary>

- [X](https://twitter.com/intent/tweet?text=I%20just%20used%20%40coderabbitai%20for%20my%20code%20review%2C%20and%20it%27s%20fantastic%21%20It%27s%20free%20for%20OSS%20and%20offers%20a%20free%20trial%20for%20the%20proprietary%20code.%20Check%20it%20out%3A&url=https%3A//coderabbit.ai)
- [Mastodon](https://mastodon.social/share?text=I%20just%20used%20%40coderabbitai%20for%20my%20code%20review%2C%20and%20it%27s%20fantastic%21%20It%27s%20free%20for%20OSS%20and%20offers%20a%20free%20trial%20for%20the%20proprietary%20code.%20Check%20it%20out%3A%20https%3A%2F%2Fcoderabbit.ai)
- [Reddit](https://www.reddit.com/submit?title=Great%20tool%20for%20code%20review%20-%20CodeRabbit&text=I%20just%20used%20CodeRabbit%20for%20my%20code%20review%2C%20and%20it%27s%20fantastic%21%20It%27s%20free%20for%20OSS%20and%20offers%20a%20free%20trial%20for%20proprietary%20code.%20Check%20it%20out%3A%20https%3A//coderabbit.ai)
- [LinkedIn](https://www.linkedin.com/sharing/share-offsite/?url=https%3A%2F%2Fcoderabbit.ai&mini=true&title=Great%20tool%20for%20code%20review%20-%20CodeRabbit&summary=I%20just%20used%20CodeRabbit%20for%20my%20code%20review%2C%20and%20it%27s%20fantastic%21%20It%27s%20free%20for%20OSS%20and%20offers%20a%20free%20trial%20for%20proprietary%20code)

</details>

<details>
<summary>🪧 Tips</summary>

### Chat

There are 3 ways to chat with [CodeRabbit](https://coderabbit.ai?utm_source=oss&utm_medium=github&utm_campaign=y-scope/velox&utm_content=14):

- Review comments: Directly reply to a review comment made by CodeRabbit. Example:
  - `I pushed a fix in commit <commit_id>, please review it.`
  - `Explain this complex logic.`
  - `Open a follow-up GitHub issue for this discussion.`
- Files and specific lines of code (under the "Files changed" tab): Tag `@coderabbitai` in a new review comment at the desired location with your query. Examples:
  - `@coderabbitai explain this code block.`
  -	`@coderabbitai modularize this function.`
- PR comments: Tag `@coderabbitai` in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
  - `@coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.`
  - `@coderabbitai read src/utils.ts and explain its main purpose.`
  - `@coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.`
  - `@coderabbitai help me debug CodeRabbit configuration file.`

### Support

Need help? Create a ticket on our [support page](https://www.coderabbit.ai/contact-us/support) for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

### CodeRabbit Commands (Invoked using PR comments)

- `@coderabbitai pause` to pause the reviews on a PR.
- `@coderabbitai resume` to resume the paused reviews.
- `@coderabbitai review` to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
- `@coderabbitai full review` to do a full review from scratch and review all the files again.
- `@coderabbitai summary` to regenerate the summary of the PR.
- `@coderabbitai generate docstrings` to [generate docstrings](https://docs.coderabbit.ai/finishing-touches/docstrings) for this PR.
- `@coderabbitai generate sequence diagram` to generate a sequence diagram of the changes in this PR.
- `@coderabbitai resolve` resolve all the CodeRabbit review comments.
- `@coderabbitai configuration` to show the current CodeRabbit configuration for the repository.
- `@coderabbitai help` to get help.

### Other keywords and placeholders

- Add `@coderabbitai ignore` anywhere in the PR description to prevent this PR from being reviewed.
- Add `@coderabbitai summary` to generate the high-level summary at a specific location in the PR description.
- Add `@coderabbitai` anywhere in the PR title to generate the title automatically.

### Documentation and Community

- Visit our [Documentation](https://docs.coderabbit.ai) for detailed information on how to use CodeRabbit.
- Join our [Discord Community](http://discord.gg/coderabbit) to get help, request features, and share feedback.
- Follow us on [X/Twitter](https://twitter.com/coderabbitai) for updates and announcements.

</details>

<!-- tips_end -->

@wraymo wraymo force-pushed the clp_integration_new branch from c5bba79 to b0e5211 Compare June 25, 2025 19:59
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 31

🔭 Outside diff range comments (3)
velox/connectors/clp/CMakeLists.txt (1)

32-32: Add missing newline at end of file.

The pipeline failure indicates a missing newline at the end of the file.

   add_subdirectory(tests)
 endif()
+
CMake/resolve_dependency_modules/clp.cmake (1)

48-48: Add missing newline at end of file.

Based on the pipeline failure pattern, this file is likely missing a newline at the end.

 add_subdirectory(${clp_SOURCE_DIR}/components/core
                  ${clp_BINARY_DIR}/components/core)
+
CMake/resolve_dependency_modules/ystdlib_cpp.cmake (1)

25-25: Add missing newline at end of file.

Based on the pipeline failure pattern, this file is likely missing a newline at the end.

 set(CLP_YSTDLIB_SOURCE_DIRECTORY "${ystdlib_cpp_SOURCE_DIR}")
+
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ec15456 and e4a54a6.

📒 Files selected for processing (33)
  • CMake/resolve_dependency_modules/antlr4-runtime.cmake (1 hunks)
  • CMake/resolve_dependency_modules/clp.cmake (1 hunks)
  • CMake/resolve_dependency_modules/date.cmake (1 hunks)
  • CMake/resolve_dependency_modules/msgpack-cxx.cmake (1 hunks)
  • CMake/resolve_dependency_modules/nlohmann_json.cmake (1 hunks)
  • CMake/resolve_dependency_modules/simdjson.cmake (1 hunks)
  • CMake/resolve_dependency_modules/spdlog.cmake (1 hunks)
  • CMake/resolve_dependency_modules/ystdlib_cpp.cmake (1 hunks)
  • CMakeLists.txt (4 hunks)
  • velox/connectors/CMakeLists.txt (1 hunks)
  • velox/connectors/clp/CMakeLists.txt (1 hunks)
  • velox/connectors/clp/ClpColumnHandle.h (1 hunks)
  • velox/connectors/clp/ClpConfig.cpp (1 hunks)
  • velox/connectors/clp/ClpConfig.h (1 hunks)
  • velox/connectors/clp/ClpConnector.cpp (1 hunks)
  • velox/connectors/clp/ClpConnector.h (1 hunks)
  • velox/connectors/clp/ClpConnectorSplit.h (1 hunks)
  • velox/connectors/clp/ClpDataSource.cpp (1 hunks)
  • velox/connectors/clp/ClpDataSource.h (1 hunks)
  • velox/connectors/clp/ClpTableHandle.cpp (1 hunks)
  • velox/connectors/clp/ClpTableHandle.h (1 hunks)
  • velox/connectors/clp/search_lib/CMakeLists.txt (1 hunks)
  • velox/connectors/clp/search_lib/ClpCursor.cpp (1 hunks)
  • velox/connectors/clp/search_lib/ClpCursor.h (1 hunks)
  • velox/connectors/clp/search_lib/ClpQueryRunner.cpp (1 hunks)
  • velox/connectors/clp/search_lib/ClpQueryRunner.h (1 hunks)
  • velox/connectors/clp/search_lib/ClpVectorLoader.cpp (1 hunks)
  • velox/connectors/clp/search_lib/ClpVectorLoader.h (1 hunks)
  • velox/connectors/clp/tests/CMakeLists.txt (1 hunks)
  • velox/connectors/clp/tests/ClpConnectorTest.cpp (1 hunks)
  • velox/connectors/clp/tests/examples/test_1.ndjson (1 hunks)
  • velox/connectors/clp/tests/examples/test_2.ndjson (1 hunks)
  • velox/docs/develop/connectors.rst (2 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (5)
velox/connectors/clp/ClpConnectorSplit.h (3)
velox/connectors/clp/ClpConfig.h (1)
  • nodiscard (42-45)
velox/connectors/clp/ClpConnector.h (2)
  • nodiscard (29-32)
  • nodiscard (34-36)
velox/connectors/clp/ClpTableHandle.h (2)
  • nodiscard (32-34)
  • nodiscard (36-38)
velox/connectors/clp/ClpTableHandle.h (4)
velox/connectors/clp/ClpConfig.h (1)
  • nodiscard (42-45)
velox/connectors/clp/ClpConnector.h (2)
  • nodiscard (29-32)
  • nodiscard (34-36)
velox/connectors/clp/ClpConnectorSplit.h (1)
  • nodiscard (26-28)
velox/connectors/hive/TableHandle.h (1)
  • tableName_ (146-148)
velox/connectors/clp/ClpConfig.h (3)
velox/connectors/clp/ClpConnector.h (2)
  • nodiscard (29-32)
  • nodiscard (34-36)
velox/connectors/clp/ClpTableHandle.h (2)
  • nodiscard (32-34)
  • nodiscard (36-38)
velox/connectors/clp/ClpConnectorSplit.h (1)
  • nodiscard (26-28)
velox/connectors/clp/ClpConnector.h (3)
velox/connectors/clp/ClpConfig.h (1)
  • nodiscard (42-45)
velox/connectors/clp/ClpTableHandle.h (2)
  • nodiscard (32-34)
  • nodiscard (36-38)
velox/connectors/clp/ClpConnectorSplit.h (1)
  • nodiscard (26-28)
velox/connectors/clp/ClpDataSource.h (1)
velox/connectors/clp/ClpConnector.h (1)
  • outputType (38-44)
🪛 GitHub Actions: Linux Build using GCC
CMake/resolve_dependency_modules/nlohmann_json.cmake

[error] 34-34: CMake Error: URL_HASH is set to a raw hash string but must be in the format ALGO=value where ALGO is one of MD5, SHA1, SHA224, SHA256, SHA384, SHA512, SHA3_224, SHA3_256, SHA3_384, SHA3_512. This caused the nlohmann_json dependency fetch to fail.

🪛 GitHub Actions: Run Checks
CMakeLists.txt

[error] 1-1: end-of-file-fixer hook failed and modified this file to fix missing newline at end of file.

velox/connectors/clp/search_lib/CMakeLists.txt

[error] 1-1: end-of-file-fixer hook failed and modified this file to fix missing newline at end of file.

velox/docs/develop/connectors.rst

[error] 147-147: end-of-file-fixer hook failed and fixed missing newline at end of file.

🔇 Additional comments (35)
CMake/resolve_dependency_modules/date.cmake (1)

32-38: LGTM - FetchContent configuration is properly structured.

The FetchContent declaration correctly uses the resolved URL, includes SHA256 verification, and sets OVERRIDE_FIND_PACKAGE for consistent dependency resolution.

CMake/resolve_dependency_modules/simdjson.cmake (1)

30-31: LGTM - Proper dependency override configuration.

Adding OVERRIDE_FIND_PACKAGE ensures consistent usage of the fetched simdjson version over any system-installed packages, which is the correct approach for reproducible builds.

velox/connectors/CMakeLists.txt (1)

20-23: LGTM - Consistent conditional build pattern.

The CLP connector integration follows the established pattern used for other connectors (Hive, TPC-H), ensuring consistent build behaviour.

velox/connectors/clp/tests/CMakeLists.txt (2)

14-25: LGTM - Well-structured test configuration.

The test setup correctly creates the executable, links all necessary dependencies (connector, test libraries, GTest), and registers with CTest for automated testing.


26-27: LGTM - Proper test data setup.

Copying example data files to the build directory ensures test data is available at runtime for the CLP connector tests.

velox/connectors/clp/tests/examples/test_1.ndjson (1)

1-10: LGTM - Comprehensive test data coverage.

The NDJSON test data provides excellent coverage with diverse HTTP request scenarios, including various methods, authentication states, and realistic response characteristics. The consistent schema and mix of null/non-null values will effectively test the CLP connector's data handling capabilities.

CMakeLists.txt (2)

130-130: LGTM! CLP connector option added correctly.

The new build option follows the established pattern and is properly integrated with the minimal build configuration.


525-552: LGTM! Dependency resolution follows established patterns.

The conditional dependency resolution for the CLP connector is well-structured and follows the existing patterns in the codebase. The dependencies appear comprehensive for a log processing connector.

velox/connectors/clp/tests/examples/test_2.ndjson (1)

1-10: LGTM! Comprehensive test data for CLP connector.

The test data provides good coverage of different event types and JSON structures, which will be valuable for testing the CLP connector functionality.

CMake/resolve_dependency_modules/clp.cmake (1)

16-47: LGTM! Well-structured dependency resolution.

The CLP dependency is properly configured with specific commit hash for reproducibility and appropriate build options disabled for the connector use case.

CMake/resolve_dependency_modules/ystdlib_cpp.cmake (1)

16-24: LGTM! Proper dependency resolution structure.

The ystdlib_cpp dependency is correctly configured with specific commit hash and appropriate variable setting for the CLP connector.

velox/connectors/clp/search_lib/CMakeLists.txt (1)

17-17: Verify that the CLP_SRC_FILES variable is defined.

The variable ${CLP_SRC_FILES} is referenced but not defined in this file. Ensure this variable is properly set in a parent CMake file or consider removing it if unused.

#!/bin/bash
# Description: Search for CLP_SRC_FILES definition in CMake files
# Expected: Find where CLP_SRC_FILES is defined

rg -A 3 -B 3 "CLP_SRC_FILES" --type cmake
velox/connectors/clp/ClpConnectorSplit.h (1)

22-31: Well-designed split implementation.

The struct follows good design principles with immutable data members, proper inheritance, and clear string representation.

CMake/resolve_dependency_modules/msgpack-cxx.cmake (1)

14-33: Well-structured dependency module.

This CMake module follows excellent practices with version pinning, SHA256 checksum verification, proper FetchContent usage, and include guards. The dependency resolution is secure and reproducible.

CMake/resolve_dependency_modules/nlohmann_json.cmake (1)

30-32: Good practice disabling tests for external dependency.

Properly disabling JSON tests to avoid building unnecessary components for the external dependency.

CMake/resolve_dependency_modules/spdlog.cmake (2)

33-34: LGTM: External fmt configuration is appropriate.

Setting SPDLOG_FMT_EXTERNAL ON before making the content available is a good practice to use an external fmt library rather than spdlog's bundled version, which helps avoid potential conflicts.


16-21: Verify spdlog v1.12.0 checksum using Python

Since the sandbox lacks both sha256sum and openssl, let’s compute the SHA-256 via Python:

#!/bin/bash
URL="https://github.com/gabime/spdlog/archive/refs/tags/v1.12.0.tar.gz"
EXPECTED="4dccf2d10f410c1e2feaff89966bfc49a1abb29ef6f08246335b110e001e09a9"

echo "Downloading and verifying spdlog v1.12.0 checksum via Python..."
ACTUAL=$(curl -sL "$URL" | python3 - << 'EOF'
import sys, hashlib
data = sys.stdin.buffer.read()
print(hashlib.sha256(data).hexdigest())
EOF
)

echo "Expected: $EXPECTED"
echo "Actual:   $ACTUAL"

if [ "$ACTUAL" = "$EXPECTED" ]; then
  echo "✓ Checksum matches"
else
  echo "✗ Checksum mismatch"
fi
CMake/resolve_dependency_modules/antlr4-runtime.cmake (1)

16-21: I noticed sha256sum isn’t available in the sandbox, so let’s re-run the checksum verification using Python’s hashlib:

#!/bin/bash
# Description: Verify the SHA256 checksum for ANTLR4 v4.13.2 using Python
URL="https://github.com/antlr/antlr4/archive/refs/tags/4.13.2.tar.gz"
EXPECTED_CHECKSUM="9f18272a9b32b622835a3365f850dd1063d60f5045fb1e12ce475ae6e18a35bb"

echo "Downloading ANTLR4 v4.13.2 and computing SHA256..."
ACTUAL_CHECKSUM=$(curl -sL "$URL" | python3 - << 'EOF'
import sys, hashlib
data = sys.stdin.buffer.read()
print(hashlib.sha256(data).hexdigest())
EOF
)

echo "Expected: $EXPECTED_CHECKSUM"
echo "Actual:   $ACTUAL_CHECKSUM"

if [ "$ACTUAL_CHECKSUM" = "$EXPECTED_CHECKSUM" ]; then
  echo "✓ Checksum matches"
else
  echo "✗ Checksum mismatch"
fi
velox/connectors/clp/ClpColumnHandle.h (1)

34-48: LGTM: Well-designed getter methods.

The getter methods follow best practices:

  • Return const references for strings and TypePtr to avoid unnecessary copies
  • Return by value for the boolean (which is efficient)
  • Marked as const methods
  • Use [[nodiscard]] attributes would be beneficial but not strictly necessary
velox/connectors/clp/ClpConfig.cpp (3)

23-32: LGTM: Robust string-to-enum conversion.

The stringToStorageType function correctly handles:

  • Case-insensitive comparison using boost::algorithm::to_upper_copy
  • Clear error messaging with the original input value
  • Support for both "FS" and "S3" storage types

36-38: LGTM: Appropriate default value handling.

The storageType() method properly uses "FS" as the default storage type when the configuration key is not present, which is a sensible default for filesystem-based storage.


17-17: Verify boost algorithm header inclusion.

The code uses boost::algorithm::to_upper_copy but doesn't explicitly include the required header. Ensure that ClpConfig.h or a transitive dependency includes <boost/algorithm/string.hpp>.

#!/bin/bash
# Description: Check if boost algorithm header is included in ClpConfig.h or its dependencies
# Expected: The header should be included somewhere in the dependency chain

echo "Checking for boost algorithm includes in ClpConfig.h..."
if [ -f "velox/connectors/clp/ClpConfig.h" ]; then
  grep -n "boost.*algorithm" velox/connectors/clp/ClpConfig.h || echo "Not found in ClpConfig.h"
fi

echo "Checking for boost algorithm includes in related connector headers..."
fd -e h . velox/connectors/clp/ --exec grep -l "boost.*algorithm" {} \; 2>/dev/null || echo "No boost algorithm includes found"
velox/connectors/clp/ClpTableHandle.h (2)

24-30: LGTM: Well-designed constructor with move semantics.

The constructor properly:

  • Initialises the base class with connectorId
  • Uses move semantics for the shared_ptr parameter to avoid unnecessary copying
  • Initialises all members in the member initialiser list

32-38: LGTM: Consistent getter design.

Both getter methods follow good practices:

  • Marked with [[nodiscard]] to prevent accidental ignoring of return values
  • Return const references to avoid unnecessary copying
  • Marked as const methods
velox/connectors/clp/ClpConfig.h (1)

47-47: ```shell
#!/bin/bash

Verify storageType() definition in ClpConfig.cpp

grep -R "StorageType\s+ClpConfig::storageType" -n velox/connectors/clp/ClpConfig.cpp
grep -R "storageType() *{" -n velox/connectors/clp/ClpConfig.cpp


</details>
<details>
<summary>velox/docs/develop/connectors.rst (1)</summary>

`125-150`: **LGTM - Comprehensive CLP connector documentation.**

The documentation provides excellent coverage of the CLP connector architecture, including detailed explanations of ClpConnectorSplit, ClpDataSource initialization, query processing, and the lazy vector loading mechanism. The content is well-structured and informative.

</details>
<details>
<summary>velox/connectors/clp/search_lib/ClpQueryRunner.cpp (2)</summary>

`26-60`: **LGTM - Well-structured initialization logic.**

The `init` method correctly handles projection mapping by iterating through ordered matching nodes and finding corresponding column readers. The fallback to nullptr for missing readers is appropriate and allows for robust handling of incomplete projections.

---

`62-80`: **LGTM - Efficient message filtering implementation.**

The `fetchNext` method implements an efficient filtering loop that respects the requested row count while scanning messages. The logic correctly accumulates filtered row indices and tracks the number of rows scanned for performance monitoring.

</details>
<details>
<summary>velox/connectors/clp/ClpConnector.cpp (3)</summary>

`23-26`: **LGTM - Clean connector initialization.**

The constructor correctly initializes the base class and wraps the provided config in a ClpConfig instance for type-safe configuration access.

---

`28-41`: **LGTM - Proper data source creation.**

The `createDataSource` method correctly instantiates ClpDataSource with all necessary parameters including output type, table handle, column handles, memory pool, and configuration.

---

`43-49`: **LGTM - Appropriate handling of unsupported functionality.**

The `createDataSink` method correctly throws a NYI exception since the CLP connector is designed for read-only operations. This is consistent with the documentation stating that CLP connector doesn't implement the DataSink interface.

</details>
<details>
<summary>velox/connectors/clp/search_lib/ClpVectorLoader.h (3)</summary>

`24-34`: **LGTM - Well-designed VectorLoader specialization.**

The class provides a clean interface for loading CLP data into Velox vectors. The constructor parameters appropriately capture the column reader, type information, and filtered indices needed for selective data loading.

---

`50-52`: **LGTM - Efficient thread-local JSON parser.**

The thread-local static parser is a good optimization for JSON parsing, avoiding the overhead of creating parser instances for each array operation while maintaining thread safety.

---

`43-44`: **LGTM - Flexible templated data population.**

The templated `populateData` method provides a flexible approach for handling different vector types while maintaining type safety. This design supports the various data types mentioned in the class documentation.

</details>
<details>
<summary>velox/connectors/clp/search_lib/ClpVectorLoader.cpp (1)</summary>

`31-54`: **Fix potential memory safety issue with StringView**

The code creates a `StringView` from a temporary `std::string` returned by `extract_value`. This is unsafe because `StringView` doesn't own the data and the temporary string will be destroyed after line 46, leaving the `StringView` pointing to deallocated memory.



```diff
 if constexpr (std::is_same_v<T, std::string>) {
-  auto string_value =
-      std::get<std::string>(columnReader_->extract_value(messageIndex));
-  vector->set(vectorIndex, StringView(string_value));
+  // StringView requires the underlying string to remain valid.
+  // The FlatVector should handle string storage internally.
+  vector->set(vectorIndex, std::get<std::string>(columnReader_->extract_value(messageIndex)));
 } else {

Likely an incorrect or invalid review comment.

# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#set(CLP_SRC_DIR ${clp_SOURCE_DIR}/components/core/src)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Remove commented code.

This commented line appears to be leftover code and should be removed to keep the build file clean.

-#set(CLP_SRC_DIR ${clp_SOURCE_DIR}/components/core/src)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
#set(CLP_SRC_DIR ${clp_SOURCE_DIR}/components/core/src)
🤖 Prompt for AI Agents
In velox/connectors/clp/CMakeLists.txt at line 14, remove the commented out line
setting CLP_SRC_DIR as it is leftover code and not needed, to keep the build
file clean and maintainable.

clp_s::search
clp_s::search::kql
velox_vector)
target_compile_features(clp-s-search PRIVATE cxx_std_20) No newline at end of file
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Fix the missing newline at end of file.

The pipeline failure indicates a missing newline at the end of the file.

-target_compile_features(clp-s-search PRIVATE cxx_std_20)
+target_compile_features(clp-s-search PRIVATE cxx_std_20)
+
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
target_compile_features(clp-s-search PRIVATE cxx_std_20)
target_compile_features(clp-s-search PRIVATE cxx_std_20)
🤖 Prompt for AI Agents
In velox/connectors/clp/search_lib/CMakeLists.txt at line 34, the file is
missing a newline at the end, causing pipeline failure. Add a newline character
after the last line to ensure the file ends properly.

: connector::ConnectorSplit(connectorId), path_(path) {}

[[nodiscard]] std::string toString() const override {
return fmt::format("CLP Split: {}", path_);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Add missing include for fmt::format.

The fmt::format function is used but the required header is not included.

 #pragma once
 
 #include "velox/connectors/Connector.h"
+#include <fmt/format.h>
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
return fmt::format("CLP Split: {}", path_);
#pragma once
#include "velox/connectors/Connector.h"
#include <fmt/format.h>
// ... rest of ClpConnectorSplit.h remains unchanged ...
🤖 Prompt for AI Agents
In velox/connectors/clp/ClpConnectorSplit.h at line 27, the code uses
fmt::format but does not include the necessary header. Add the appropriate
include directive for the fmt library, typically #include <fmt/format.h>, at the
top of the file to ensure fmt::format is recognized and compiles correctly.

Comment on lines +76 to +79
static std::string getExampleFilePath(const std::string& filePath) {
std::string current_path = fs::current_path().string();
return current_path + "/examples/" + filePath;
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

❓ Verification inconclusive

Avoid using fs::current_path() in tests

Using fs::current_path() makes tests dependent on the working directory, which can lead to flaky tests when run from different locations. Consider using a more robust approach to locate test files.

Consider these alternatives:

  1. Use an environment variable or CMake-defined path
  2. Use __FILE__ to determine the test file location and construct relative paths from there
  3. Embed test data as string literals in the test file

Here's an example using __FILE__:

static std::string getExampleFilePath(const std::string& filePath) {
  // Get the directory containing this test file
  std::string testFilePath = __FILE__;
  auto lastSlash = testFilePath.find_last_of("/\\");
  std::string testDir = testFilePath.substr(0, lastSlash);
  // Construct path relative to test directory
  return testDir + "/examples/" + filePath;
}

Avoid using fs::current_path() in tests

Using fs::current_path() makes tests dependent on the working directory, which can lead to flaky tests when run from different locations. Consider using a more robust approach to locate test files.

Consider these alternatives:

  1. Use an environment variable or CMake-defined path
  2. Use __FILE__ to determine the test file location and construct relative paths from there
  3. Embed test data as string literals in the test file

Here's an example using __FILE__:

static std::string getExampleFilePath(const std::string& filePath) {
  // Get the directory containing this test file
  std::string testFilePath = __FILE__;
  auto lastSlash = testFilePath.find_last_of("/\\");
  std::string testDir = testFilePath.substr(0, lastSlash);
  // Construct path relative to test directory
  return testDir + "/examples/" + filePath;
}
🤖 Prompt for AI Agents
In velox/connectors/clp/tests/ClpConnectorTest.cpp around lines 76 to 79,
replace the use of fs::current_path() with a method that derives the path
relative to the test file location using __FILE__. Extract the directory path
from __FILE__ by finding the last slash, then construct the example file path
relative to that directory to avoid dependency on the working directory and make
the tests more robust.

Comment on lines +150 to +252
ErrorCode ClpCursor::loadArchive() {
auto networkAuthOption = inputSource_ == InputSource::Filesystem
? NetworkAuthOption{.method = AuthMethod::None}
: NetworkAuthOption{.method = AuthMethod::S3PresignedUrlV4};

try {
archiveReader_->open(
get_path_object_for_raw_path(archivePath_), networkAuthOption);
} catch (std::exception& e) {
VLOG(2) << "Failed to open archive file: " << e.what();
return ErrorCode::InternalError;
}

auto timestampDict = archiveReader_->get_timestamp_dictionary();
auto schemaTree = archiveReader_->get_schema_tree();
auto schemaMap = archiveReader_->get_schema_map();

EvaluateTimestampIndex timestampIndex(timestampDict);
if (clp_s::EvaluatedValue::False == timestampIndex.run(expr_)) {
VLOG(2) << "No matching timestamp ranges for query '" << query_ << "'";
return ErrorCode::InvalidTimestampRange;
}

schemaMatch_ = std::make_shared<SchemaMatch>(schemaTree, schemaMap);
if (expr_ = schemaMatch_->run(expr_);
std::dynamic_pointer_cast<EmptyExpr>(expr_)) {
VLOG(2) << "No matching schemas for query '" << query_ << "'";
return ErrorCode::SchemaNotFound;
}

projection_ = std::make_shared<Projection>(
outputColumns_.empty() ? ReturnAllColumns : ReturnSelectedColumns);
try {
for (auto const& column : outputColumns_) {
std::vector<std::string> descriptorTokens;
std::string descriptorNamespace;
if (false ==
tokenize_column_descriptor(
column.name, descriptorTokens, descriptorNamespace)) {
VLOG(2) << "Can not tokenize invalid column: '" << column.name << "'";
return ErrorCode::InternalError;
}

auto columnDescriptor = ColumnDescriptor::create_from_escaped_tokens(
descriptorTokens, descriptorNamespace);
switch (column.type) {
case ColumnType::String:
columnDescriptor->set_matching_types(
LiteralType::ClpStringT | LiteralType::VarStringT |
LiteralType::EpochDateT);
break;
case ColumnType::Integer:
columnDescriptor->set_matching_types(LiteralType::IntegerT);
break;
case ColumnType::Float:
columnDescriptor->set_matching_types(LiteralType::FloatT);
break;
case ColumnType::Boolean:
columnDescriptor->set_matching_types(LiteralType::BooleanT);
break;
case ColumnType::Array:
columnDescriptor->set_matching_types(LiteralType::ArrayT);
break;
default:
break;
}

projection_->add_column(columnDescriptor);
}
} catch (TraceableException& e) {
VLOG(2) << e.what();
return ErrorCode::InternalError;
}
projection_->resolve_columns(schemaTree);
archiveReader_->set_projection(projection_);

archiveReader_->read_metadata();

matchedSchemas_.clear();
for (auto schemaId : archiveReader_->get_schema_ids()) {
if (schemaMatch_->schema_matched(schemaId)) {
matchedSchemas_.push_back(schemaId);
}
}

if (matchedSchemas_.empty()) {
return ErrorCode::SchemaNotFound;
}

EvaluateTimestampIndex timestamp_index(timestampDict);
if (EvaluatedValue::False == timestamp_index.run(expr_)) {
VLOG(2) << "No matching timestamp ranges for query '" << query_ << "'";
return ErrorCode::InvalidTimestampRange;
}

archiveReader_->read_variable_dictionary();
archiveReader_->read_log_type_dictionary();
archiveReader_->read_array_dictionary();

currentSchemaIndex_ = 0;
currentSchemaTableLoaded_ = false;
return ErrorCode::Success;
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Consider refactoring loadArchive into smaller methods

This method is over 100 lines long and handles multiple responsibilities. Consider breaking it down into smaller, focused methods for better maintainability and testability:

  • openArchiveWithAuth()
  • evaluateTimestampConstraints()
  • matchSchemas()
  • setupProjection()
  • loadDictionaries()
🤖 Prompt for AI Agents
In velox/connectors/clp/search_lib/ClpCursor.cpp around lines 150 to 252, the
loadArchive method is too long and handles multiple responsibilities. Refactor
this method by extracting logical sections into smaller private methods: create
openArchiveWithAuth() to handle archive opening with authentication,
evaluateTimestampConstraints() to run timestamp index evaluation, matchSchemas()
to perform schema matching and filtering, setupProjection() to configure the
projection based on output columns, and loadDictionaries() to read variable, log
type, and array dictionaries. Then update loadArchive to call these new methods
sequentially, improving readability and maintainability.

Comment on lines +239 to +243
EvaluateTimestampIndex timestamp_index(timestampDict);
if (EvaluatedValue::False == timestamp_index.run(expr_)) {
VLOG(2) << "No matching timestamp ranges for query '" << query_ << "'";
return ErrorCode::InvalidTimestampRange;
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Remove duplicate timestamp evaluation

This timestamp evaluation is redundant as the same check is already performed at lines 167-171. The expression and timestamp dictionary haven't changed between these two checks.

-  EvaluateTimestampIndex timestamp_index(timestampDict);
-  if (EvaluatedValue::False == timestamp_index.run(expr_)) {
-    VLOG(2) << "No matching timestamp ranges for query '" << query_ << "'";
-    return ErrorCode::InvalidTimestampRange;
-  }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
EvaluateTimestampIndex timestamp_index(timestampDict);
if (EvaluatedValue::False == timestamp_index.run(expr_)) {
VLOG(2) << "No matching timestamp ranges for query '" << query_ << "'";
return ErrorCode::InvalidTimestampRange;
}
🤖 Prompt for AI Agents
In velox/connectors/clp/search_lib/ClpCursor.cpp between lines 239 and 243,
remove the redundant evaluation of the timestamp index since the same check is
already done at lines 167-171 with unchanged expression and timestamp
dictionary. Simply delete the block that creates EvaluateTimestampIndex and runs
the check to avoid duplicate evaluation.

Comment on lines +76 to +82
if (EvaluatedValue::False ==
queryRunner_->schema_init(currentSchemaId_)) {
currentSchemaIndex_ += 1;
currentSchemaTableLoaded_ = false;
errorCode_ = ErrorCode::DictionaryNotFound;
continue;
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Fix error code handling in schema iteration

Setting errorCode_ to DictionaryNotFound at line 80 is problematic because it gets overwritten in the next iteration. If all schemas fail, the final error code might not reflect the actual issues encountered.

Consider tracking schema-specific errors or logging them without overwriting the error code:

       if (EvaluatedValue::False ==
           queryRunner_->schema_init(currentSchemaId_)) {
+        LOG(WARNING) << "Dictionary not found for schema " << currentSchemaId_;
         currentSchemaIndex_ += 1;
         currentSchemaTableLoaded_ = false;
-        errorCode_ = ErrorCode::DictionaryNotFound;
         continue;
       }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if (EvaluatedValue::False ==
queryRunner_->schema_init(currentSchemaId_)) {
currentSchemaIndex_ += 1;
currentSchemaTableLoaded_ = false;
errorCode_ = ErrorCode::DictionaryNotFound;
continue;
}
if (EvaluatedValue::False ==
queryRunner_->schema_init(currentSchemaId_)) {
LOG(WARNING) << "Dictionary not found for schema " << currentSchemaId_;
currentSchemaIndex_ += 1;
currentSchemaTableLoaded_ = false;
continue;
}
🤖 Prompt for AI Agents
In velox/connectors/clp/search_lib/ClpCursor.cpp around lines 76 to 82, the
errorCode_ is set to DictionaryNotFound inside a loop but gets overwritten in
subsequent iterations, causing loss of error context if all schemas fail. Modify
the code to track errors per schema or log each failure without overwriting
errorCode_, and after the loop completes, set errorCode_ to reflect the most
relevant or aggregated error status.

Comment on lines +127 to +145
if (expr_ = standardizePass.run(expr_);
std::dynamic_pointer_cast<EmptyExpr>(expr_)) {
VLOG(2) << "Query '" << query_ << "' is logically false";
return ErrorCode::LogicalError;
}

NarrowTypes narrowPass;
if (expr_ = narrowPass.run(expr_);
std::dynamic_pointer_cast<EmptyExpr>(expr_)) {
VLOG(2) << "Query '" << query_ << "' is logically false";
return ErrorCode::LogicalError;
}

ConvertToExists convertPass;
if (expr_ = convertPass.run(expr_);
std::dynamic_pointer_cast<EmptyExpr>(expr_)) {
VLOG(2) << "Query '" << query_ << "' is logically false";
return ErrorCode::LogicalError;
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Refactor assignments inside if conditions for clarity

The assignments inside if conditions (lines 127, 134, 141) are valid but can be confusing. Consider separating the assignment from the condition check for better readability.

-  if (expr_ = standardizePass.run(expr_);
-      std::dynamic_pointer_cast<EmptyExpr>(expr_)) {
+  expr_ = standardizePass.run(expr_);
+  if (std::dynamic_pointer_cast<EmptyExpr>(expr_)) {
     VLOG(2) << "Query '" << query_ << "' is logically false";
     return ErrorCode::LogicalError;
   }

   NarrowTypes narrowPass;
-  if (expr_ = narrowPass.run(expr_);
-      std::dynamic_pointer_cast<EmptyExpr>(expr_)) {
+  expr_ = narrowPass.run(expr_);
+  if (std::dynamic_pointer_cast<EmptyExpr>(expr_)) {
     VLOG(2) << "Query '" << query_ << "' is logically false";
     return ErrorCode::LogicalError;
   }

   ConvertToExists convertPass;
-  if (expr_ = convertPass.run(expr_);
-      std::dynamic_pointer_cast<EmptyExpr>(expr_)) {
+  expr_ = convertPass.run(expr_);
+  if (std::dynamic_pointer_cast<EmptyExpr>(expr_)) {
     VLOG(2) << "Query '" << query_ << "' is logically false";
     return ErrorCode::LogicalError;
   }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if (expr_ = standardizePass.run(expr_);
std::dynamic_pointer_cast<EmptyExpr>(expr_)) {
VLOG(2) << "Query '" << query_ << "' is logically false";
return ErrorCode::LogicalError;
}
NarrowTypes narrowPass;
if (expr_ = narrowPass.run(expr_);
std::dynamic_pointer_cast<EmptyExpr>(expr_)) {
VLOG(2) << "Query '" << query_ << "' is logically false";
return ErrorCode::LogicalError;
}
ConvertToExists convertPass;
if (expr_ = convertPass.run(expr_);
std::dynamic_pointer_cast<EmptyExpr>(expr_)) {
VLOG(2) << "Query '" << query_ << "' is logically false";
return ErrorCode::LogicalError;
}
expr_ = standardizePass.run(expr_);
if (std::dynamic_pointer_cast<EmptyExpr>(expr_)) {
VLOG(2) << "Query '" << query_ << "' is logically false";
return ErrorCode::LogicalError;
}
NarrowTypes narrowPass;
expr_ = narrowPass.run(expr_);
if (std::dynamic_pointer_cast<EmptyExpr>(expr_)) {
VLOG(2) << "Query '" << query_ << "' is logically false";
return ErrorCode::LogicalError;
}
ConvertToExists convertPass;
expr_ = convertPass.run(expr_);
if (std::dynamic_pointer_cast<EmptyExpr>(expr_)) {
VLOG(2) << "Query '" << query_ << "' is logically false";
return ErrorCode::LogicalError;
}
🤖 Prompt for AI Agents
In velox/connectors/clp/search_lib/ClpCursor.cpp around lines 127 to 145, the
code assigns values to expr_ inside if conditions, which reduces clarity.
Refactor by first assigning expr_ the result of each pass.run(expr_) call on
separate lines, then use a separate if statement to check if expr_ is a
dynamic_pointer_cast to EmptyExpr. This separation improves readability and
makes the code easier to understand.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

♻️ Duplicate comments (1)
velox/connectors/clp/search_lib/CMakeLists.txt (1)

34-35: Add the missing newline at end-of-file
The earlier review already flagged this; the file still ends without a newline, which triggers some linters/CI checks.

 target_compile_features(clp-s-search PRIVATE cxx_std_20)
+
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e4a54a6 and b0e5211.

📒 Files selected for processing (3)
  • CMake/resolve_dependency_modules/ystdlib_cpp.cmake (1 hunks)
  • velox/connectors/clp/search_lib/CMakeLists.txt (1 hunks)
  • velox/docs/develop/connectors.rst (2 hunks)
🧰 Additional context used
🪛 GitHub Actions: Linux Build using GCC
velox/connectors/clp/search_lib/CMakeLists.txt

[error] 22-22: ExternalProject_Add failed due to invalid URL_HASH format for nlohmann_json dependency.

🔇 Additional comments (1)
CMake/resolve_dependency_modules/ystdlib_cpp.cmake (1)

1-1: Header cleanup looks good.

The stray UI artefact reported earlier is gone – thanks for fixing it.

Comment on lines 16 to 23
FetchContent_Declare(
ystdlib_cpp
GIT_REPOSITORY https://github.com/y-scope/ystdlib-cpp.git
GIT_TAG d80cf86e1a1f2dae6421978c8ee353408368f424)

FetchContent_Populate(ystdlib_cpp)

set(CLP_YSTDLIB_SOURCE_DIRECTORY "${ystdlib_cpp_SOURCE_DIR}")
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Shallow-clone the dependency to speed up CMake configure time.

FetchContent_Declare defaults to a full clone. For libraries that are used in a header-only fashion (as appears to be the case here), fetching history is unnecessary and slows CI builds.

 FetchContent_Declare(
   ystdlib_cpp
   GIT_REPOSITORY https://github.com/y-scope/ystdlib-cpp.git
   GIT_TAG d80cf86e1a1f2dae6421978c8ee353408368f424
+  GIT_SHALLOW TRUE            # Only fetch the required commit
+  GIT_PROGRESS TRUE           # Show progress in verbose builds
+  FETCHCONTENT_UPDATES_DISCONNECTED ON  # Reproducible builds
 )

These flags are entirely optional but cheap to adopt and improve build ergonomics.

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In CMake/resolve_dependency_modules/ystdlib_cpp.cmake around lines 16 to 23, the
FetchContent_Declare call performs a full git clone by default, which slows down
CMake configure time. Modify the FetchContent_Declare command to include shallow
clone options such as GIT_SHALLOW TRUE or GIT_DEPTH 1 to fetch only the latest
commit, speeding up the clone process for this header-only library.

Comment on lines 125 to 131
CLP Connector
-------------
The CLP Connector is used to read CLP archives stored in a local file system or S3. It implements similar
interfaces as the Hive Connector except for the `DataSink` interface. Here we only describe the `DataSource`
interface and the `ConnectorSplit` interface implementation since `Connector` and `ConnectorFactory` are
similar to the Hive Connector.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Clarify “both” vs. “three” connectors.

The preceding section lists three connectors, but the sentence begins with “both”. Consider re-phrasing to avoid possible confusion.

E.g.:

-similar to the Hive Connector.
+similar to the Hive connector.

…or explicitly state that only the DataSource side of Hive and CLP are discussed here.

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In velox/docs/develop/connectors.rst around lines 125 to 131, the text
incorrectly uses "both" when referring to connectors, causing confusion since
three connectors are listed earlier. Update the sentence to clarify that only
the DataSource side of the Hive and CLP connectors are discussed here, or
rephrase to explicitly mention the two connectors being referenced instead of
"both."

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (1)
CMake/resolve_dependency_modules/date.cmake (1)

19-21: Fix variable reference inconsistency.

The URL construction uses DATE_BUILD_VERSION but the variable is defined as VELOX_DATE_BUILD_VERSION on line 16.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b0e5211 and b05bba5.

📒 Files selected for processing (2)
  • CMake/resolve_dependency_modules/date.cmake (1 hunks)
  • velox/docs/configs.rst (1 hunks)

@wraymo wraymo force-pushed the clp_integration_new branch from 9d0e735 to 46754a1 Compare June 25, 2025 20:30
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (1)
CMake/resolve_dependency_modules/nlohmann_json.cmake (1)

27-31: Missing SHA256= prefix in URL_HASH.

Same concern as previously raised: CMake requires the algorithm prefix.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b05bba5 and 46754a1.

📒 Files selected for processing (2)
  • CMake/resolve_dependency_modules/date.cmake (1 hunks)
  • CMake/resolve_dependency_modules/nlohmann_json.cmake (1 hunks)

Comment on lines +26 to +28
set(CMAKE_INSTALL_MESSAGE
LAZY
CACHE STRING "" FORCE)
Copy link

@coderabbitai coderabbitai bot Jun 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Global override of CMAKE_INSTALL_MESSAGE may have unintended side-effects.

Forcing CMAKE_INSTALL_MESSAGE=LAZY with CACHE ... FORCE changes behaviour for all subsequent projects, not just date.
Consider setting it locally (e.g., via set_property(TARGET date PROPERTY ...)) or omitting the FORCE flag.

🤖 Prompt for AI Agents
In CMake/resolve_dependency_modules/date.cmake at lines 26 to 28, avoid globally
overriding CMAKE_INSTALL_MESSAGE with CACHE and FORCE as it affects all
projects. Instead, remove the FORCE flag to prevent forcing the value globally
or set the install message property locally on the date target using
set_property(TARGET date PROPERTY ...). This confines the setting to the date
module only and avoids side effects on other projects.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@anlowee Thanks for confirming—glad we’re aligned. Let me know if you’d like any help scoping that CMAKE_INSTALL_MESSAGE change.


🧠 Learnings used
Learnt from: anlowee
PR: y-scope/velox#14
File: CMake/resolve_dependency_modules/date.cmake:32-36
Timestamp: 2025-07-03T17:36:47.311Z
Learning: In CMake FetchContent_Declare, the URL_HASH argument requires an algorithm prefix (e.g., SHA256=) followed by the checksum value. Without this prefix, CMake silently skips checksum validation, which defeats supply-chain protection.

Comment on lines +32 to +36
FetchContent_Declare(
date
URL ${VELOX_DATE_SOURCE_URL}
URL_HASH ${VELOX_DATE_BUILD_SHA256_CHECKSUM}
OVERRIDE_FIND_PACKAGE)
Copy link

@coderabbitai coderabbitai bot Jun 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

URL_HASH is missing the required algorithm prefix (SHA256=).

CMake expects the URL_HASH argument in the form SHA256=<hash>.
Without the prefix, checksum validation is silently skipped, defeating supply-chain protection.

 FetchContent_Declare(
   date
   URL ${VELOX_DATE_SOURCE_URL}
-  URL_HASH ${VELOX_DATE_BUILD_SHA256_CHECKSUM}
+  URL_HASH SHA256=${VELOX_DATE_BUILD_SHA256_CHECKSUM}
   OVERRIDE_FIND_PACKAGE)
🤖 Prompt for AI Agents
In CMake/resolve_dependency_modules/date.cmake around lines 32 to 36, the
URL_HASH argument is missing the required algorithm prefix. Update the URL_HASH
value to include the prefix "SHA256=" followed by the checksum, so it reads as
SHA256=${VELOX_DATE_BUILD_SHA256_CHECKSUM}. This ensures CMake performs checksum
validation correctly.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@anlowee Thanks for confirming! You're absolutely right - the URL_HASH does need the SHA256= prefix for proper checksum validation. This will help ensure the integrity of the downloaded date library dependency.


✏️ Learnings added
Learnt from: anlowee
PR: y-scope/velox#14
File: CMake/resolve_dependency_modules/date.cmake:32-36
Timestamp: 2025-07-03T17:36:47.311Z
Learning: In CMake FetchContent_Declare, the URL_HASH argument requires an algorithm prefix (e.g., SHA256=) followed by the checksum value. Without this prefix, CMake silently skips checksum validation, which defeats supply-chain protection.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.


If you found this review helpful, would you consider giving us a shout-out on X?

Thank you for using CodeRabbit!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

velox_resolve_dependency adds SHA256. Manually adding it will result in errors

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
CMake/resolve_dependency_modules/nlohmann_json.cmake (1)

27-31: URL_HASH still lacks algorithm prefix

Same issue as previously reported; the checksum must be prefixed with SHA256= to satisfy CMake’s URL_HASH format.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 46754a1 and 85acfe0.

📒 Files selected for processing (2)
  • CMake/resolve_dependency_modules/antlr4-runtime.cmake (1 hunks)
  • CMake/resolve_dependency_modules/nlohmann_json.cmake (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: Build with GCC / Ubuntu release with resolve_dependency
🔇 Additional comments (1)
CMake/resolve_dependency_modules/antlr4-runtime.cmake (1)

19-24: Double-check variable-name consistency

velox_resolve_dependency_url(ANTLR4_RUNTIME) traditionally sets/expects VELOX_ANTLR4_RUNTIME_URL, yet this script later references VELOX_ANTLR4_RUNTIME_SOURCE_URL. If the helper macro has not been updated accordingly, build-time substitution will fail. Please verify that both the macro and this file use the same canonical variable name across all dependency modules.

Comment on lines +28 to +31
antlr4-runtime
URL ${VELOX_ANTLR4_RUNTIME_SOURCE_URL}
URL_HASH ${VELOX_ANTLR4_RUNTIME_BUILD_SHA256_CHECKSUM}
SOURCE_SUBDIR runtime/Cpp OVERRIDE_FIND_PACKAGE)
Copy link

@coderabbitai coderabbitai bot Jun 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

URL_HASH missing required algorithm prefix

CMake ≥ 3.24 requires the hash value to be prefixed with the algorithm (e.g., SHA256=). Without it the download fails the integrity check.

 FetchContent_Declare(
   antlr4-runtime
   URL ${VELOX_ANTLR4_RUNTIME_SOURCE_URL}
-  URL_HASH ${VELOX_ANTLR4_RUNTIME_BUILD_SHA256_CHECKSUM}
+  URL_HASH SHA256=${VELOX_ANTLR4_RUNTIME_BUILD_SHA256_CHECKSUM}
   SOURCE_SUBDIR runtime/Cpp OVERRIDE_FIND_PACKAGE)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
antlr4-runtime
URL ${VELOX_ANTLR4_RUNTIME_SOURCE_URL}
URL_HASH ${VELOX_ANTLR4_RUNTIME_BUILD_SHA256_CHECKSUM}
SOURCE_SUBDIR runtime/Cpp OVERRIDE_FIND_PACKAGE)
FetchContent_Declare(
antlr4-runtime
URL ${VELOX_ANTLR4_RUNTIME_SOURCE_URL}
URL_HASH SHA256=${VELOX_ANTLR4_RUNTIME_BUILD_SHA256_CHECKSUM}
SOURCE_SUBDIR runtime/Cpp OVERRIDE_FIND_PACKAGE)
🤖 Prompt for AI Agents
In CMake/resolve_dependency_modules/antlr4-runtime.cmake around lines 28 to 31,
the URL_HASH value is missing the required algorithm prefix. Update the URL_HASH
to include the prefix "SHA256=" before the checksum value to comply with CMake ≥
3.24 requirements and ensure the integrity check passes.

Copy link

@anlowee anlowee Jul 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree. Our min CMaker version is 3.28, so this might be an issue

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@anlowee Thank you for confirming! The SHA256= prefix will ensure the integrity check works correctly across different CMake versions. This fix will prevent potential build failures when using newer CMake versions.

    /|   /|  
   ( :v:  )
    |(_)|
    |   |
   /     \
  /       \

If you found this review helpful, would you consider giving us a shout-out on X?

Thank you for using CodeRabbit!

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 85acfe0 and 0686e0f.

📒 Files selected for processing (2)
  • CMake/resolve_dependency_modules/msgpack-cxx.cmake (1 hunks)
  • velox/docs/configs.rst (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: Build with GCC / Ubuntu release with resolve_dependency
🔇 Additional comments (1)
CMake/resolve_dependency_modules/msgpack-cxx.cmake (1)

16-21: ```shell
#!/bin/bash

Download the tarball and compute SHA256 using available tools

curl -sL "https://github.com/msgpack/msgpack-c/archive/refs/tags/cpp-7.0.0.tar.gz" -o /tmp/msgpack.tar.gz
if command -v openssl >/dev/null; then
openssl dgst -sha256 /tmp/msgpack.tar.gz
elif command -v shasum >/dev/null; then
shasum -a 256 /tmp/msgpack.tar.gz
elif command -v python3 >/dev/null; then
python3 - << 'EOF'
import hashlib
with open('/tmp/msgpack.tar.gz', 'rb') as f:
print(hashlib.sha256(f.read()).hexdigest())
EOF
else
echo "Error: No tool available to compute SHA256"
exit 1
fi


</details>

</blockquote></details>

</details>

<!-- This is an auto-generated comment by CodeRabbit for review status -->

Comment on lines +913 to +916
* - clp.storage-type
- string
- FS
- The storage type of CLP archives, **Allowed values:** ``FS``, ``S3``.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Clarify value semantics and case-sensitivity

To avoid user confusion, please add a brief note on whether the clp.storage-type values are case-sensitive and what behaviour to expect when an unrecognised value is supplied (e.g., fallback to default vs. startup failure). A single sentence in the Description cell would suffice.

🤖 Prompt for AI Agents
In velox/docs/configs.rst around lines 913 to 916, add a sentence in the
Description cell for clp.storage-type clarifying whether the allowed values "FS"
and "S3" are case-sensitive and specify the behavior if an unrecognized value is
provided, such as whether the system falls back to a default or fails to start.
This will help users understand the expected input format and error handling.

Comment on lines +23 to +25
velox_resolve_dependency_url(MSGPACK)

message(STATUS "Building msgpack-cxx from source")
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Status message may be misleading after URL override.

velox_resolve_dependency_url(MSGPACK) can rewrite the URL to a mirror or
pre-downloaded cache, yet the unconditional
message(STATUS "Building msgpack-cxx from source")
follows immediately. If a binary package is later introduced (or a system
package is found), the log would still announce a source build, confusing
developers.

Consider printing the message only when the resolver decided to build from
source, or phrase it more generically (“Using msgpack-cxx from
${VELOX_MSGPACK_SOURCE_URL}”).

🤖 Prompt for AI Agents
In CMake/resolve_dependency_modules/msgpack-cxx.cmake around lines 23 to 25, the
status message always states "Building msgpack-cxx from source" immediately
after calling velox_resolve_dependency_url(MSGPACK), which may be misleading if
the URL was overridden or a binary package is used later. Modify the code to
print the message conditionally only when the build is actually from source, or
change the message to a more generic form like "Using msgpack-cxx from
${VELOX_MSGPACK_SOURCE_URL}" to accurately reflect the source or binary usage.

Comment on lines +27 to +35
FetchContent_Declare(
msgpack-cxx
URL ${VELOX_MSGPACK_SOURCE_URL}
URL_HASH ${VELOX_MSGPACK_BUILD_SHA256_CHECKSUM}
OVERRIDE_FIND_PACKAGE EXCLUDE_FROM_ALL SYSTEM)

set(MSGPACK_USE_BOOST OFF)

FetchContent_MakeAvailable(msgpack-cxx)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Expose an imported target for downstream linking convenience.

Many third-party projects expect to link against an msgpackc or
msgpack-cxx target created by the upstream CMake. After
FetchContent_MakeAvailable, that target lives inside the
msgpack-cxx sub-build and is not guaranteed to be globally visible.

Add:

# Ensure an ALIAS exists at the top level.
if (NOT TARGET msgpackc::msgpackc AND TARGET msgpackc)
  add_library(msgpackc::msgpackc ALIAS msgpackc)
endif()

This small alias spares every dependent Velox component from writing custom
find_package_handle code.

🤖 Prompt for AI Agents
In CMake/resolve_dependency_modules/msgpack-cxx.cmake around lines 27 to 35, the
imported target for msgpack-cxx is not globally visible after
FetchContent_MakeAvailable, which complicates downstream linking. To fix this,
add an alias target at the top level by checking if msgpackc::msgpackc does not
exist but msgpackc does, then create an ALIAS target named msgpackc::msgpackc
pointing to msgpackc. This will make the target globally accessible for
dependent projects.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🔭 Outside diff range comments (1)
CMakeLists.txt (1)

178-187: Flag override is OK but brittle

The manual reset to OFF works, yet it breaks as soon as a third “minimal” preset is added and someone forgets to touch this line.
Consider centralising connector-option resets (e.g., iterate over a list) to avoid future drift.
No immediate change required.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0686e0f and b8fc0b2.

📒 Files selected for processing (1)
  • CMakeLists.txt (4 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: Build with GCC / Ubuntu release with resolve_dependency
🔇 Additional comments (1)
CMakeLists.txt (1)

464-470: Boost component additions look good

headers, iostreams, and url are harmless header-only (or thin) targets. Nothing else to flag.

Comment on lines +130 to 131
option(VELOX_ENABLE_CLP_CONNECTOR "Build CLP connector." ON)
option(VELOX_ENABLE_HIVE_CONNECTOR "Build Hive connector." ON)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Missing compile-time define for the new flag

Other feature toggles (S3, ABFS, PARQUET, …) inject a -DVELOX_ENABLE_* definition immediately after the option is parsed.
VELOX_ENABLE_CLP_CONNECTOR never sets such a macro, so C++ sources cannot conditionally compile on it.

 option(VELOX_ENABLE_CLP_CONNECTOR "Build CLP connector." ON)
+
+if(VELOX_ENABLE_CLP_CONNECTOR)
+  add_definitions(-DVELOX_ENABLE_CLP_CONNECTOR)
+endif()
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
option(VELOX_ENABLE_CLP_CONNECTOR "Build CLP connector." ON)
option(VELOX_ENABLE_HIVE_CONNECTOR "Build Hive connector." ON)
option(VELOX_ENABLE_CLP_CONNECTOR "Build CLP connector." ON)
if(VELOX_ENABLE_CLP_CONNECTOR)
add_definitions(-DVELOX_ENABLE_CLP_CONNECTOR)
endif()
option(VELOX_ENABLE_HIVE_CONNECTOR "Build Hive connector." ON)
🤖 Prompt for AI Agents
In CMakeLists.txt around lines 130 to 131, the option VELOX_ENABLE_CLP_CONNECTOR
is defined but does not set a corresponding compile-time macro. To fix this, add
a check after the option definition that appends -DVELOX_ENABLE_CLP_CONNECTOR to
the compiler flags if the option is ON, similar to how other feature toggles
like S3 or ABFS do it. This will enable conditional compilation in the C++
sources based on this flag.

Comment on lines 526 to 552
if(${VELOX_ENABLE_CLP_CONNECTOR})
set(ystdlib_cpp_SOURCE BUNDLED)
velox_resolve_dependency(ystdlib_cpp)

velox_set_source(absl)
velox_resolve_dependency(absl)

velox_set_source(nlohmann_json)
velox_resolve_dependency(nlohmann_json)

velox_set_source(simdjson)
velox_resolve_dependency(simdjson)

velox_set_source(date)
velox_resolve_dependency(date)

velox_set_source(spdlog)
velox_resolve_dependency(spdlog)

velox_set_source(msgpack-cxx)
velox_resolve_dependency(msgpack-cxx)

velox_set_source(antlr4-runtime)
velox_resolve_dependency(antlr4-runtime)

set(clp_SOURCE BUNDLED)
velox_resolve_dependency(clp)
endif()
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Duplicate dependency resolution may lead to version skew

absl and simdjson are resolved twice: once in this CLP block and again later in the global section (lines 576-586).
If the two calls pick different default versions or options, CMake will error or, worse, silently choose one, causing ODR/ABI issues.

At minimum, guard the re-resolves:

-  velox_set_source(absl)
-  velox_resolve_dependency(absl)
+  if(NOT TARGET absl::absl)
+    velox_set_source(absl)
+    velox_resolve_dependency(absl)
+  endif()

-  velox_set_source(simdjson)
-  velox_resolve_dependency(simdjson)
+  if(NOT TARGET simdjson::simdjson)
+    velox_set_source(simdjson)
+    velox_resolve_dependency(simdjson)
+  endif()

The same pattern applies to every library already handled elsewhere.
Alternatively, remove them here and rely on the later unconditional resolve block.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if(${VELOX_ENABLE_CLP_CONNECTOR})
set(ystdlib_cpp_SOURCE BUNDLED)
velox_resolve_dependency(ystdlib_cpp)
velox_set_source(absl)
velox_resolve_dependency(absl)
velox_set_source(nlohmann_json)
velox_resolve_dependency(nlohmann_json)
velox_set_source(simdjson)
velox_resolve_dependency(simdjson)
velox_set_source(date)
velox_resolve_dependency(date)
velox_set_source(spdlog)
velox_resolve_dependency(spdlog)
velox_set_source(msgpack-cxx)
velox_resolve_dependency(msgpack-cxx)
velox_set_source(antlr4-runtime)
velox_resolve_dependency(antlr4-runtime)
set(clp_SOURCE BUNDLED)
velox_resolve_dependency(clp)
endif()
if(${VELOX_ENABLE_CLP_CONNECTOR})
set(ystdlib_cpp_SOURCE BUNDLED)
velox_resolve_dependency(ystdlib_cpp)
if(NOT TARGET absl::absl)
velox_set_source(absl)
velox_resolve_dependency(absl)
endif()
velox_set_source(nlohmann_json)
velox_resolve_dependency(nlohmann_json)
if(NOT TARGET simdjson::simdjson)
velox_set_source(simdjson)
velox_resolve_dependency(simdjson)
endif()
velox_set_source(date)
velox_resolve_dependency(date)
velox_set_source(spdlog)
velox_resolve_dependency(spdlog)
velox_set_source(msgpack-cxx)
velox_resolve_dependency(msgpack-cxx)
velox_set_source(antlr4-runtime)
velox_resolve_dependency(antlr4-runtime)
set(clp_SOURCE BUNDLED)
velox_resolve_dependency(clp)
endif()
🤖 Prompt for AI Agents
In CMakeLists.txt between lines 526 and 553, the dependencies absl and simdjson
(and others) are resolved again although they are already resolved later in
lines 576-586, which risks version conflicts. To fix this, remove the duplicate
velox_set_source and velox_resolve_dependency calls for libraries already
resolved globally, or add conditional guards to skip resolving them again in
this block. This ensures each dependency is resolved only once to avoid CMake
errors or runtime issues.

@wraymo wraymo force-pushed the clp_integration_new branch from b8fc0b2 to e9f9046 Compare June 25, 2025 22:48
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b8fc0b2 and e9f9046.

📒 Files selected for processing (1)
  • CMake/resolve_dependency_modules/boost/CMakeLists.txt (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: Build with GCC / Ubuntu release with resolve_dependency

Comment on lines 45 to 46
URL_HASH ${VELOX_BOOST_BUILD_SHA256_CHECKSUM}
OVERRIDE_FIND_PACKAGE)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Add the algorithm prefix to URL_HASH to avoid CMake fetch failures

FetchContent_Declare() expects URL_HASH in the form SHA256=<hash> (or MD5=<hash>, etc.).
Passing only the raw hex string causes CMake to error out with “Invalid hash string” on 3.14+.

-set(VELOX_BOOST_BUILD_SHA256_CHECKSUM
-    4d27e9efed0f6f152dc28db6430b9d3dfb40c0345da7342eaa5a987dde57bd95)
+set(VELOX_BOOST_BUILD_SHA256_CHECKSUM
+    SHA256=4d27e9efed0f6f152dc28db6430b9d3dfb40c0345da7342eaa5a987dde57bd95)

With that change the subsequent URL_HASH ${VELOX_BOOST_BUILD_SHA256_CHECKSUM} line remains valid.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
URL_HASH ${VELOX_BOOST_BUILD_SHA256_CHECKSUM}
OVERRIDE_FIND_PACKAGE)
# CMake/resolve_dependency_modules/boost/CMakeLists.txt
# Ensure the checksum is prefixed with the algorithm name
set(VELOX_BOOST_BUILD_SHA256_CHECKSUM
SHA256=4d27e9efed0f6f152dc28db6430b9d3dfb40c0345da7342eaa5a987dde57bd95)
FetchContent_Declare(
VELOX_BOOST_BUILD
URL ${VELOX_BOOST_BUILD_URL}
URL_HASH ${VELOX_BOOST_BUILD_SHA256_CHECKSUM}
OVERRIDE_FIND_PACKAGE)
🤖 Prompt for AI Agents
In CMake/resolve_dependency_modules/boost/CMakeLists.txt around lines 45 to 46,
the URL_HASH value is set without specifying the hash algorithm prefix, causing
CMake fetch failures. Fix this by prefixing the hash string with "SHA256=" so
that URL_HASH is set as "SHA256=${VELOX_BOOST_BUILD_SHA256_CHECKSUM}", ensuring
compatibility with CMake 3.14 and later.

wraymo pushed a commit that referenced this pull request Jun 26, 2025
…ger-overflow (facebookincubator#13831)

Summary:
Pull Request resolved: facebookincubator#13831

This avoids the following errors:

```
fbcode/third-party-buck/platform010/build/libgcc/include/c++/trunk/bits/std_abs.h:56:41: runtime error: negation of -9223372036854775808 cannot be represented in type 'long'; cast to an unsigned type to negate this value to itself
    #0 0x000000346ce5 in std::abs(long) fbcode/third-party-buck/platform010/build/libgcc/include/c++/trunk/bits/std_abs.h:56
    #1 0x000000345879 in std::shared_ptr<facebook::velox::BiasVector<facebook::velox::test::EvalTypeHelper<long>::Type>> facebook::velox::test::VectorMaker::biasVector<long>(std::vector<std::optional<long>, std::allocator<std::optional<long>>> const&) fbcode/velox/vector/tests/utils/VectorMaker-inl.h:58
    #2 0x000000344d34 in facebook::velox::test::BiasVectorErrorTest::errorTest(std::vector<std::optional<long>, std::allocator<std::optional<long>>>) fbcode/velox/vector/tests/BiasVectorTest.cpp:39
    #3 0x00000033ec99 in facebook::velox::test::BiasVectorErrorTest_checkRangeTooLargeError_Test::TestBody() fbcode/velox/vector/tests/BiasVectorTest.cpp:44
    #4 0x7fe0a2342c46 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) fbsource/src/gtest.cc:2727
    #5 0x7fe0a234275d in testing::Test::Run() fbsource/src/gtest.cc:2744
    #6 0x7fe0a2345fb3 in testing::TestInfo::Run() fbsource/src/gtest.cc:2890
    #7 0x7fe0a234c8eb in testing::TestSuite::Run() fbsource/src/gtest.cc:3068
    #8 0x7fe0a237b52b in testing::internal::UnitTestImpl::RunAllTests() fbsource/src/gtest.cc:6059
    #9 0x7fe0a237a0a2 in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) fbsource/src/gtest.cc:2727
    #10 0x7fe0a23797f5 in testing::UnitTest::Run() fbsource/src/gtest.cc:5599
    #11 0x7fe0a2239800 in RUN_ALL_TESTS() fbsource/gtest/gtest.h:2334
    #12 0x7fe0a223952c in main fbcode/common/gtest/LightMain.cpp:20
    #13 0x7fe09ec2c656 in __libc_start_call_main /home/engshare/third-party2/glibc/2.34/src/glibc-2.34/csu/../sysdeps/nptl/libc_start_call_main.h:58:16
    #14 0x7fe09ec2c717 in __libc_start_main@GLIBC_2.2.5 /home/engshare/third-party2/glibc/2.34/src/glibc-2.34/csu/../csu/libc-start.c:409:3
    #15 0x00000033d8b0 in _start /home/engshare/third-party2/glibc/2.34/src/glibc-2.34/csu/../sysdeps/x86_64/start.S:116

UndefinedBehaviorSanitizer: signed-integer-overflow fbcode/third-party-buck/platform010/build/libgcc/include/c++/trunk/bits/std_abs.h:56:41
```
Avoid overflow by using the expression (static_cast<uint64_t>(1) + ~static_cast<uint64_t>(min)) to calculate the absolute value of min without using std::abs

Reviewed By: dmm-fb, peterenescu

Differential Revision: D76901449

fbshipit-source-id: 7eb3bd0f83e42f44cdf34ea1759f3aa9e1042dae
@wraymo wraymo force-pushed the clp_integration_new branch from 13d1585 to a8c3dea Compare June 27, 2025 15:33
GIT_REPOSITORY https://github.com/y-scope/ystdlib-cpp.git
GIT_TAG 0ae886c6a7ee706a3c6e1950262b63d72f71fe63)

FetchContent_Populate(ystdlib_cpp)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once we upgrade clp to install ystdlib as a package, we should be able to use MakeAvailable instead.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (10)
velox/connectors/clp/ClpConfig.h (2)

34-34: Add documentation for the configuration key.

Consider documenting what this configuration key controls for better maintainability.

+  // Configuration key to specify the storage backend type (FS or S3)
   static constexpr const char* kStorageType = "clp.storage-type";

36-39: Optimize constructor with member initializer list.

The constructor can be more efficient by using a member initializer list.

-  explicit ClpConfig(std::shared_ptr<const config::ConfigBase> config) {
-    VELOX_CHECK_NOT_NULL(config, "Config is null for CLP initialization");
-    config_ = std::move(config);
-  }
+  explicit ClpConfig(std::shared_ptr<const config::ConfigBase> config)
+      : config_(std::move(config)) {
+    VELOX_CHECK_NOT_NULL(config_, "Config is null for CLP initialization");
+  }
velox/docs/develop/connectors.rst (1)

146-146: Fix missing newline at end of file.

The pipeline failure indicates a missing newline character.

 decode and load data as needed during execution.
+
velox/connectors/clp/search_lib/ClpVectorLoader.h (3)

19-21: Add missing include headers.

The file uses standard library types and simdjson without explicitly including their headers.

 #include "velox/connectors/clp/search_lib/ClpCursor.h"
 #include "velox/vector/LazyVector.h"
+#include <memory>
+#include <vector>
+#include <simdjson.h>

33-36: Document ownership semantics for the raw pointer parameter.

The raw pointer parameter needs clear documentation about ownership and lifetime requirements.

+  /// @param columnReader Non-owning pointer to a CLP column reader. The caller
+  ///                     must ensure the reader remains valid for the lifetime
+  ///                     of this ClpVectorLoader instance.
+  /// @param nodeType The column type being loaded
+  /// @param filteredRowIndices Shared pointer to filtered row indices
   ClpVectorLoader(
       clp_s::BaseColumnReader* columnReader,
       ColumnType nodeType,
       std::shared_ptr<std::vector<uint64_t>> filteredRowIndices);

39-46: Add documentation for private methods.

The private methods should be documented to improve code maintainability.

+  /// Implements the VectorLoader interface to load data from CLP column readers
+  /// into Velox vectors based on the filtered row indices.
   void loadInternal(
       RowSet rows,
       ValueHook* hook,
       vector_size_t resultSize,
       VectorPtr* result) override;

+  /// Template helper to populate typed vector data from the column reader
+  /// @tparam T The data type being populated
+  /// @tparam VectorPtr The specific vector type (e.g., FlatVector<T>)
   template <typename T, typename VectorPtr>
   void populateData(RowSet rows, VectorPtr vector);
velox/connectors/clp/search_lib/ClpQueryRunner.h (1)

71-73: Use const shared_ptr for filteredRowIndices parameter

Since the ownership of filteredRowIndices is transferred to ClpVectorLoader, the parameter should be const to clarify ownership semantics and prevent accidental modifications.

Apply this diff:

 uint64_t fetchNext(
     uint64_t numRows,
-    const std::shared_ptr<std::vector<uint64_t>>& filteredRowIndices);
+    const std::shared_ptr<std::vector<uint64_t>> filteredRowIndices);
velox/connectors/clp/ClpDataSource.cpp (2)

121-121: Remove reference from const shared_ptr parameter

Since the ownership of filteredRows is transferred to ClpVectorLoader, the parameter should be passed by value.

Apply this diff:

-    const std::shared_ptr<std::vector<uint64_t>>& filteredRows,
+    const std::shared_ptr<std::vector<uint64_t>> filteredRows,

103-115: Add error handling for cursor initialization and query execution

The method creates a cursor and executes a query without checking for errors. This could lead to undefined behaviour if either operation fails.

Apply this diff to add proper error handling:

 void ClpDataSource::addSplit(std::shared_ptr<ConnectorSplit> split) {
   auto clpSplit = std::dynamic_pointer_cast<ClpConnectorSplit>(split);
+  VELOX_CHECK_NOT_NULL(clpSplit, "Invalid split type for CLP connector");

   if (storageType_ == ClpConfig::StorageType::kFs) {
     cursor_ = std::make_unique<search_lib::ClpCursor>(
         clp_s::InputSource::Filesystem, clpSplit->path_);
   } else if (storageType_ == ClpConfig::StorageType::kS3) {
     cursor_ = std::make_unique<search_lib::ClpCursor>(
         clp_s::InputSource::Network, clpSplit->path_);
+  } else {
+    VELOX_USER_FAIL("Unsupported storage type");
   }

+  VELOX_CHECK_NOT_NULL(cursor_, "Failed to create CLP cursor");
+  try {
     cursor_->executeQuery(kqlQuery_, fields_);
+  } catch (const std::exception& e) {
+    VELOX_USER_FAIL("Failed to execute query '{}': {}", kqlQuery_, e.what());
+  }
 }
velox/connectors/clp/ClpDataSource.h (1)

98-99: Remove unused member variable

The columnUntypedNames_ member variable is declared but never used in the implementation.

Apply this diff:

   velox::memory::MemoryPool* pool_;
   RowTypePtr outputType_;
-  std::set<std::string> columnUntypedNames_;
   uint64_t completedRows_{0};
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9411387 and f3b0576.

📒 Files selected for processing (9)
  • .github/workflows/linux-build-base.yml (1 hunks)
  • velox/connectors/clp/ClpConfig.cpp (1 hunks)
  • velox/connectors/clp/ClpConfig.h (1 hunks)
  • velox/connectors/clp/ClpDataSource.cpp (1 hunks)
  • velox/connectors/clp/ClpDataSource.h (1 hunks)
  • velox/connectors/clp/search_lib/ClpCursor.h (1 hunks)
  • velox/connectors/clp/search_lib/ClpQueryRunner.h (1 hunks)
  • velox/connectors/clp/search_lib/ClpVectorLoader.h (1 hunks)
  • velox/docs/develop/connectors.rst (2 hunks)
🧰 Additional context used
🧠 Learnings (1)
velox/connectors/clp/ClpDataSource.h (1)
Learnt from: anlowee
PR: y-scope/velox#1
File: CMakeLists.txt:458-483
Timestamp: 2025-03-10T05:55:18.722Z
Learning: When contributing to the Velox project, prefer adding new code on top of existing scripts rather than restructuring the original code organization, especially for dependency management.
🧬 Code Graph Analysis (1)
velox/connectors/clp/ClpDataSource.h (1)
velox/connectors/clp/ClpConnector.h (1)
  • outputType (39-45)
🪛 GitHub Actions: Run Checks
velox/docs/develop/connectors.rst

[error] 146-146: end-of-file-fixer hook failed: missing newline at end of file. The hook fixed the file by adding a newline.

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Build with GCC / Ubuntu release with resolve_dependency
  • GitHub Check: Build
🔇 Additional comments (1)
velox/connectors/clp/ClpConfig.cpp (1)

25-40: LGTM!

The implementation correctly handles case-insensitive string to enum conversion and provides appropriate error handling for unsupported storage types.

- name: Build Artifact
env:
VELOX_DEPENDENCY_SOURCE: BUNDLED
Boost_SOURCE: SYSTEM
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Environment variable likely ignored – use all-caps BOOST_SOURCE instead of Boost_SOURCE.

resolve_dependency.cmake converts package names to upper-case (e.g., BoostBOOST) and then looks for BOOST_SOURCE.
With the current mixed-case key, Boost will silently fall back to the default (BUNDLED), negating the intent of using the system-installed libraries and increasing build time.

-          Boost_SOURCE: SYSTEM
+          BOOST_SOURCE: SYSTEM
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
Boost_SOURCE: SYSTEM
BOOST_SOURCE: SYSTEM
🤖 Prompt for AI Agents
In .github/workflows/linux-build-base.yml at line 55, the environment variable
key is incorrectly cased as Boost_SOURCE, which is ignored by the build system.
Change the key to all uppercase BOOST_SOURCE to ensure the system-installed
Boost libraries are used instead of the default bundled ones, preventing
unnecessary build time increases.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f3b0576 and e4f5b39.

📒 Files selected for processing (1)
  • velox/docs/develop/connectors.rst (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Build with GCC / Ubuntu release with resolve_dependency

Comment on lines +32 to +33
Velox provides Hive, CLP, and TPC-H Connectors out of the box.
Let's examine the implementation details of both the Hive and CLP Connectors as examples.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Prefer lowercase “connectors” for consistency with doc style

Throughout the document, common nouns aren’t capitalised (e.g., “storage adapters”). Sticking to that pattern improves consistency and helps avoid accidental emphasis.

-Velox provides Hive, CLP, and TPC-H Connectors out of the box.
-Let's examine the implementation details of both the Hive and CLP Connectors as examples.
+Velox provides Hive, CLP, and TPC-H connectors out of the box.
+Let's examine the implementation details of the Hive and CLP connectors as examples.
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
Velox provides Hive, CLP, and TPC-H Connectors out of the box.
Let's examine the implementation details of both the Hive and CLP Connectors as examples.
Velox provides Hive, CLP, and TPC-H connectors out of the box.
Let's examine the implementation details of the Hive and CLP connectors as examples.
🤖 Prompt for AI Agents
In velox/docs/develop/connectors.rst around lines 32 to 33, the word
"Connectors" is capitalized inconsistently with the document style. Change
"Connectors" to lowercase "connectors" in both instances to maintain consistency
with the style used for common nouns throughout the document.

Comment on lines +127 to +130
The CLP Connector is used to read CLP archives stored on a local file system or S3. It implements similar
interfaces as the Hive Connector except for the ``DataSink`` interface. Here we only describe the ``DataSource``
interface and the ``ConnectorSplit`` interface implementation since `Connector` and ``ConnectorFactory`` are
similar to the Hive Connector.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Tighten grammar and unify inline-code markup

Small wording tweak plus consistent use of double back-ticks for inline class names.

-The CLP Connector is used to read CLP archives stored on a local file system or S3. It implements similar
-interfaces as the Hive Connector except for the ``DataSink`` interface. Here we only describe the ``DataSource``
-interface and the ``ConnectorSplit`` interface implementation since `Connector` and ``ConnectorFactory`` are
-similar to the Hive Connector.
+The CLP connector reads CLP archives stored on a local file system or S3. It implements interfaces similar
+to those of the Hive connector, except for the ``DataSink`` interface.  Here we describe only the ``DataSource``
+and ``ConnectorSplit`` interface implementations, since ``Connector`` and ``ConnectorFactory`` behave
+similarly to their Hive counterparts.
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
The CLP Connector is used to read CLP archives stored on a local file system or S3. It implements similar
interfaces as the Hive Connector except for the ``DataSink`` interface. Here we only describe the ``DataSource``
interface and the ``ConnectorSplit`` interface implementation since `Connector` and ``ConnectorFactory`` are
similar to the Hive Connector.
The CLP connector reads CLP archives stored on a local file system or S3. It implements interfaces similar
to those of the Hive connector, except for the ``DataSink`` interface. Here we describe only the ``DataSource``
and ``ConnectorSplit`` interface implementations, since ``Connector`` and ``ConnectorFactory`` behave
similarly to their Hive counterparts.
🤖 Prompt for AI Agents
In velox/docs/develop/connectors.rst around lines 127 to 130, improve the
grammar and ensure consistent use of double back-ticks for inline class names.
Adjust the wording slightly for clarity and replace any single back-ticks around
class names with double back-ticks to unify the markup style throughout the
paragraph.

@wraymo wraymo requested review from anlowee and kirkrodrigues July 7, 2025 17:33
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (3)
velox/connectors/clp/search_lib/ClpQueryRunner.cpp (2)

24-26: Add blank line before method definition.

 namespace facebook::velox::connector::clp::search_lib {

+
 void ClpQueryRunner::init(

64-74: Fix variable naming consistency.

The variable rowsfiltered should follow camelCase naming convention to be consistent with the codebase style.

-  size_t rowsfiltered{0};
+  size_t rowsFiltered{0};
   size_t rowsScanned{0};
   while (curMessage_ < numMessages_) {
     if (filter(curMessage_)) {
       filteredRowIndices->emplace_back(curMessage_);
-      rowsfiltered += 1;
+      rowsFiltered += 1;
     }

     curMessage_ += 1;
     rowsScanned += 1;
-    if (rowsfiltered >= numRows) {
+    if (rowsFiltered >= numRows) {
velox/connectors/clp/ClpConnector.cpp (1)

52-60: Consolidate TimestampPattern initialization to avoid duplication.

Both constructors call clp_s::TimestampPattern::init(). Consider using constructor delegation or extracting the initialization to a private method.

Using constructor delegation:

 ClpConnectorFactory::ClpConnectorFactory()
     : ConnectorFactory(kClpConnectorName) {
   clp_s::TimestampPattern::init();
 }

 ClpConnectorFactory::ClpConnectorFactory(const char* connectorName)
-    : ConnectorFactory(connectorName) {
-  clp_s::TimestampPattern::init();
-}
+    : ConnectorFactory(connectorName) {
+  // Delegate initialization to a common method
+  initializeTimestampPattern();
+}
+
+void ClpConnectorFactory::initializeTimestampPattern() {
+  static std::once_flag initFlag;
+  std::call_once(initFlag, []() {
+    clp_s::TimestampPattern::init();
+  });
+}

Note: Using std::once_flag ensures thread-safe initialization if TimestampPattern::init() should only be called once globally.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e4f5b39 and 01b51dd.

📒 Files selected for processing (2)
  • velox/connectors/clp/ClpConnector.cpp (1 hunks)
  • velox/connectors/clp/search_lib/ClpQueryRunner.cpp (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Build with GCC / Ubuntu release with resolve_dependency
🔇 Additional comments (2)
velox/connectors/clp/ClpConnector.cpp (2)

24-28: Constructor implementation looks good!

The use of std::move for the config parameter is appropriate since it's passed by value. This follows C++ best practices for efficient ownership transfer.


29-42: Data source creation is properly implemented.

The method correctly instantiates ClpDataSource with all required parameters and returns it as a unique pointer.

Comment on lines +44 to +50
std::unique_ptr<DataSink> ClpConnector::createDataSink(
RowTypePtr inputType,
std::shared_ptr<ConnectorInsertTableHandle> connectorInsertTableHandle,
ConnectorQueryCtx* connectorQueryCtx,
CommitStrategy commitStrategy) {
VELOX_NYI("createDataSink for ClpConnector is not implemented!");
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Mark unused parameters to avoid compiler warnings.

Since this method is not yet implemented, consider marking the parameters as unused to avoid compiler warnings.

 std::unique_ptr<DataSink> ClpConnector::createDataSink(
-    RowTypePtr inputType,
-    std::shared_ptr<ConnectorInsertTableHandle> connectorInsertTableHandle,
-    ConnectorQueryCtx* connectorQueryCtx,
-    CommitStrategy commitStrategy) {
+    RowTypePtr /*inputType*/,
+    std::shared_ptr<ConnectorInsertTableHandle> /*connectorInsertTableHandle*/,
+    ConnectorQueryCtx* /*connectorQueryCtx*/,
+    CommitStrategy /*commitStrategy*/) {
   VELOX_NYI("createDataSink for ClpConnector is not implemented!");
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
std::unique_ptr<DataSink> ClpConnector::createDataSink(
RowTypePtr inputType,
std::shared_ptr<ConnectorInsertTableHandle> connectorInsertTableHandle,
ConnectorQueryCtx* connectorQueryCtx,
CommitStrategy commitStrategy) {
VELOX_NYI("createDataSink for ClpConnector is not implemented!");
}
std::unique_ptr<DataSink> ClpConnector::createDataSink(
RowTypePtr /*inputType*/,
std::shared_ptr<ConnectorInsertTableHandle> /*connectorInsertTableHandle*/,
ConnectorQueryCtx* /*connectorQueryCtx*/,
CommitStrategy /*commitStrategy*/) {
VELOX_NYI("createDataSink for ClpConnector is not implemented!");
}
🤖 Prompt for AI Agents
In velox/connectors/clp/ClpConnector.cpp around lines 44 to 50, the
createDataSink method is not implemented and its parameters are unused, which
can cause compiler warnings. To fix this, mark all the parameters as unused by
casting them to void or using an equivalent method to explicitly indicate they
are unused, preventing compiler warnings while keeping the method stub.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (6)
velox/connectors/clp/search_lib/ClpCursor.h (1)

85-87: Consider using non-const reference instead of shared_ptr for filteredRowIndices.

Since filteredRowIndices is an in/out parameter that gets modified, using std::vector<uint64_t>& would be more appropriate than shared_ptr. This avoids unnecessary shared ownership semantics.

   uint64_t fetchNext(
       uint64_t numRows,
-      const std::shared_ptr<std::vector<uint64_t>>& filteredRowIndices);
+      std::vector<uint64_t>& filteredRowIndices);
velox/connectors/clp/search_lib/ClpQueryRunner.h (1)

70-72: Consider using non-const reference instead of shared_ptr for filteredRowIndices.

Consistent with the comment on ClpCursor::fetchNext, using std::vector<uint64_t>& would be more appropriate for this in/out parameter.

   uint64_t fetchNext(
       uint64_t numRows,
-      const std::shared_ptr<std::vector<uint64_t>>& filteredRowIndices);
+      std::vector<uint64_t>& filteredRowIndices);
velox/connectors/clp/search_lib/ClpCursor.cpp (4)

76-85: Fix error code handling in schema iteration.

Setting errorCode_ inside the loop gets overwritten in subsequent iterations. Consider tracking schema-specific errors without overwriting the global error state.

       if (EvaluatedValue::False ==
           queryRunner_->schema_init(currentSchemaId_)) {
+        LOG(WARNING) << "Dictionary not found for schema " << currentSchemaId_;
         currentSchemaIndex_ += 1;
         currentSchemaTableLoaded_ = false;
-        errorCode_ = ErrorCode::DictionaryNotFound;
         continue;
       }

130-148: Refactor assignments inside if conditions for clarity.

Separate the assignment from the condition check to improve readability.

   OrOfAndForm standardizePass;
-  if (expr_ = standardizePass.run(expr_);
-      std::dynamic_pointer_cast<EmptyExpr>(expr_)) {
+  expr_ = standardizePass.run(expr_);
+  if (std::dynamic_pointer_cast<EmptyExpr>(expr_)) {
     VLOG(2) << "Query '" << query_ << "' is logically false";
     return ErrorCode::LogicalError;
   }

   NarrowTypes narrowPass;
-  if (expr_ = narrowPass.run(expr_);
-      std::dynamic_pointer_cast<EmptyExpr>(expr_)) {
+  expr_ = narrowPass.run(expr_);
+  if (std::dynamic_pointer_cast<EmptyExpr>(expr_)) {
     VLOG(2) << "Query '" << query_ << "' is logically false";
     return ErrorCode::LogicalError;
   }

   ConvertToExists convertPass;
-  if (expr_ = convertPass.run(expr_);
-      std::dynamic_pointer_cast<EmptyExpr>(expr_)) {
+  expr_ = convertPass.run(expr_);
+  if (std::dynamic_pointer_cast<EmptyExpr>(expr_)) {
     VLOG(2) << "Query '" << query_ << "' is logically false";
     return ErrorCode::LogicalError;
   }

153-255: Consider refactoring loadArchive into smaller methods.

This method handles multiple responsibilities and is over 100 lines long. Consider breaking it down into focused methods.

Extract logical sections into private methods:

  • openArchiveWithAuth() - lines 154-164
  • evaluateTimestampConstraints() - lines 170-174
  • matchSchemas() - lines 176-181, 232-240
  • setupProjection() - lines 183-227
  • loadDictionaries() - lines 248-250

242-246: Remove duplicate timestamp evaluation.

This timestamp evaluation is redundant as the same check is already performed at lines 170-174.

-  EvaluateTimestampIndex timestamp_index(timestampDict);
-  if (EvaluatedValue::False == timestamp_index.run(expr_)) {
-    VLOG(2) << "No matching timestamp ranges for query '" << query_ << "'";
-    return ErrorCode::InvalidTimestampRange;
-  }
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 01b51dd and 01d8efa.

📒 Files selected for processing (3)
  • velox/connectors/clp/search_lib/ClpCursor.cpp (1 hunks)
  • velox/connectors/clp/search_lib/ClpCursor.h (1 hunks)
  • velox/connectors/clp/search_lib/ClpQueryRunner.h (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Build with GCC / Ubuntu release with resolve_dependency
🔇 Additional comments (5)
velox/connectors/clp/search_lib/ClpCursor.h (2)

41-50: ErrorCode enum is well-structured.

The error codes cover the necessary states for query execution and follow consistent naming conventions.


52-58: ColumnType enum and Field struct are well-designed.

Good use of explicit value for Unknown state to clearly distinguish it from valid types.

velox/connectors/clp/search_lib/ClpQueryRunner.h (1)

45-52: Constructor is properly implemented.

The constructor correctly initializes the base class and stores the projection for later use.

velox/connectors/clp/search_lib/ClpCursor.cpp (2)

36-46: Constructor and destructor are properly implemented.

Good practice to close the archive in the destructor and initialize error code to QueryNotInitialized.


107-114: Well-implemented getter with proper null handling.

Good use of static const empty vector for the null case to avoid creating temporary objects.

Comment on lines +198 to +218
switch (column.type) {
case ColumnType::String:
columnDescriptor->set_matching_types(
LiteralType::ClpStringT | LiteralType::VarStringT |
LiteralType::EpochDateT);
break;
case ColumnType::Integer:
columnDescriptor->set_matching_types(LiteralType::IntegerT);
break;
case ColumnType::Float:
columnDescriptor->set_matching_types(LiteralType::FloatT);
break;
case ColumnType::Boolean:
columnDescriptor->set_matching_types(LiteralType::BooleanT);
break;
case ColumnType::Array:
columnDescriptor->set_matching_types(LiteralType::ArrayT);
break;
default:
break;
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Add error handling for unknown column types.

The default case in the switch statement silently ignores unknown column types. Consider logging or returning an error.

         case ColumnType::Array:
           columnDescriptor->set_matching_types(LiteralType::ArrayT);
           break;
         default:
+          VLOG(2) << "Unknown column type for column: '" << column.name << "'";
+          return ErrorCode::InternalError;
-          break;
       }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
switch (column.type) {
case ColumnType::String:
columnDescriptor->set_matching_types(
LiteralType::ClpStringT | LiteralType::VarStringT |
LiteralType::EpochDateT);
break;
case ColumnType::Integer:
columnDescriptor->set_matching_types(LiteralType::IntegerT);
break;
case ColumnType::Float:
columnDescriptor->set_matching_types(LiteralType::FloatT);
break;
case ColumnType::Boolean:
columnDescriptor->set_matching_types(LiteralType::BooleanT);
break;
case ColumnType::Array:
columnDescriptor->set_matching_types(LiteralType::ArrayT);
break;
default:
break;
}
switch (column.type) {
case ColumnType::String:
columnDescriptor->set_matching_types(
LiteralType::ClpStringT | LiteralType::VarStringT |
LiteralType::EpochDateT);
break;
case ColumnType::Integer:
columnDescriptor->set_matching_types(LiteralType::IntegerT);
break;
case ColumnType::Float:
columnDescriptor->set_matching_types(LiteralType::FloatT);
break;
case ColumnType::Boolean:
columnDescriptor->set_matching_types(LiteralType::BooleanT);
break;
case ColumnType::Array:
columnDescriptor->set_matching_types(LiteralType::ArrayT);
break;
default:
VLOG(2) << "Unknown column type for column: '" << column.name << "'";
return ErrorCode::InternalError;
}
🤖 Prompt for AI Agents
In velox/connectors/clp/search_lib/ClpCursor.cpp around lines 198 to 218, the
switch statement on column.type lacks error handling for unknown types in the
default case. Modify the default case to log an error message or return an error
indicating an unrecognized column type to ensure such cases are not silently
ignored.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

♻️ Duplicate comments (4)
velox/connectors/clp/search_lib/ClpVectorLoader.cpp (2)

44-54: Add error handling for variant access and bounds checking


122-139: Fix memory safety and error handling in array processing

velox/connectors/clp/search_lib/ClpCursor.h (2)

66-66: Use std::string_view for the archivePath parameter


85-87: Consider returning error status from fetchNext method

The method currently returns only the number of rows scanned without indicating if an error occurred during execution. This makes error handling difficult for callers.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 01d8efa and 17fc6e4.

📒 Files selected for processing (2)
  • velox/connectors/clp/search_lib/ClpCursor.h (1 hunks)
  • velox/connectors/clp/search_lib/ClpVectorLoader.cpp (1 hunks)

Comment on lines +47 to +50
if constexpr (std::is_same_v<T, std::string>) {
auto string_value =
std::get<std::string>(columnReader_->extract_value(messageIndex));
vector->set(vectorIndex, StringView(string_value));
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Fix potential use-after-free with StringView

The string_value variable goes out of scope while the StringView still references its data, leading to undefined behavior.

 if constexpr (std::is_same_v<T, std::string>) {
-  auto string_value =
-      std::get<std::string>(columnReader_->extract_value(messageIndex));
-  vector->set(vectorIndex, StringView(string_value));
+  vector->set(vectorIndex, std::get<std::string>(columnReader_->extract_value(messageIndex)));
 } else {
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if constexpr (std::is_same_v<T, std::string>) {
auto string_value =
std::get<std::string>(columnReader_->extract_value(messageIndex));
vector->set(vectorIndex, StringView(string_value));
if constexpr (std::is_same_v<T, std::string>) {
vector->set(
vectorIndex, std::get<std::string>(columnReader_->extract_value(messageIndex)));
} else {
vector->set(
vectorIndex, std::get<T>(columnReader_->extract_value(messageIndex)));
}
🤖 Prompt for AI Agents
In velox/connectors/clp/search_lib/ClpVectorLoader.cpp around lines 47 to 50,
the StringView is constructed from a local std::string variable that goes out of
scope immediately after, causing a use-after-free. To fix this, ensure the
string data outlives the StringView by either storing the string persistently or
copying the string data into a buffer that remains valid for the lifetime of the
StringView before setting it on the vector.

Success
};

enum class ColumnType { String, Integer, Float, Array, Boolean, Unknown = -1 };
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Consider making Unknown the last enum value without explicit assignment

Having Unknown = -1 in an enum that otherwise uses default unsigned values can lead to issues when used as array indices or in switch statements. Consider moving Unknown to the end without an explicit value assignment.

-enum class ColumnType { String, Integer, Float, Array, Boolean, Unknown = -1 };
+enum class ColumnType { String, Integer, Float, Array, Boolean, Unknown };
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
enum class ColumnType { String, Integer, Float, Array, Boolean, Unknown = -1 };
enum class ColumnType { String, Integer, Float, Array, Boolean, Unknown };
🤖 Prompt for AI Agents
In velox/connectors/clp/search_lib/ClpCursor.h at line 52, the enum class
ColumnType assigns Unknown the value -1, which can cause issues when used as
array indices or in switch statements. To fix this, move Unknown to be the last
enum value and remove the explicit assignment so it uses the default incremented
value, ensuring consistent and safe usage.


ErrorCode errorCode_;

clp_s::InputSource inputSource_{clp_s::InputSource::Filesystem};
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Remove redundant default initialization of inputSource_

The member variable inputSource_ is initialized with a default value here but is also set from the constructor parameter. This redundant initialization could be confusing and should be removed.

-  clp_s::InputSource inputSource_{clp_s::InputSource::Filesystem};
+  clp_s::InputSource inputSource_;
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
clp_s::InputSource inputSource_{clp_s::InputSource::Filesystem};
clp_s::InputSource inputSource_;
🤖 Prompt for AI Agents
In the file velox/connectors/clp/search_lib/ClpCursor.h around line 108, the
member variable `inputSource_` is being initialized with a default value of
`clp_s::InputSource::Filesystem`, but it is also set from a constructor
parameter. This redundant initialization is unnecessary and could be confusing.
Remove the default initialization of `inputSource_` to simplify the code and
avoid potential confusion.

anlowee
anlowee previously approved these changes Jul 9, 2025
class Projection;
} // namespace clp_s::search

namespace lp_s::search::ast {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this supposed to be clp_s?

Copy link

@anlowee anlowee Jul 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

♻️ Duplicate comments (1)
velox/connectors/clp/search_lib/ClpQueryRunner.h (1)

33-35: Namespace comment typo still present.

The closing comment says lp_s::search::ast; it should be clp_s::search::ast.
This is the exact issue flagged in earlier reviews and by the clang-format hook.

-} // namespace lp_s::search::ast
+} // namespace clp_s::search::ast
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 17fc6e4 and b10d1aa.

📒 Files selected for processing (1)
  • velox/connectors/clp/search_lib/ClpQueryRunner.h (1 hunks)
🧰 Additional context used
🪛 GitHub Actions: Run Checks
velox/connectors/clp/search_lib/ClpQueryRunner.h

[error] 32-32: clang-format formatting error: incorrect namespace comment 'lp_s::search::ast' should be 'clp_s::search::ast'. The hook modified the file to fix this.

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Build with GCC / Ubuntu release with resolve_dependency

Comment on lines +81 to +84
std::shared_ptr<clp_s::search::ast::Expression> expr_;
std::shared_ptr<clp_s::SchemaTree> schemaTree_;
std::shared_ptr<clp_s::search::Projection> projection_;
std::vector<clp_s::BaseColumnReader*> projectedColumns_;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Member expr_ is never initialised or used.

expr_ is declared but not set in the constructor and appears unused, leading to dead code and potential confusion. Either remove it or initialise it and use it.

-        projection_(projection) {}
+        expr_(expr),
+        projection_(projection) {}

If the member is genuinely unnecessary, delete it instead.

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In velox/connectors/clp/search_lib/ClpQueryRunner.h around lines 81 to 84, the
member variable expr_ is declared but never initialized or used, resulting in
dead code. Review whether expr_ is needed; if it is, initialize it properly in
the constructor and use it as intended. Otherwise, remove the declaration of
expr_ entirely to clean up the code.

Comment on lines +75 to +78
/// represent the columns involved in the scanning operation.
const std::vector<clp_s::BaseColumnReader*>& getProjectedColumns() {
return projectedColumns_;
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Make accessor const-correct.

getProjectedColumns does not modify the object; mark the method const to allow calls on const instances and signal intent.

-  const std::vector<clp_s::BaseColumnReader*>& getProjectedColumns() {
+  const std::vector<clp_s::BaseColumnReader*>& getProjectedColumns() const {
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
/// represent the columns involved in the scanning operation.
const std::vector<clp_s::BaseColumnReader*>& getProjectedColumns() {
return projectedColumns_;
}
/// represent the columns involved in the scanning operation.
const std::vector<clp_s::BaseColumnReader*>& getProjectedColumns() const {
return projectedColumns_;
}
🤖 Prompt for AI Agents
In velox/connectors/clp/search_lib/ClpQueryRunner.h around lines 75 to 78, the
method getProjectedColumns should be marked as const since it does not modify
the object. Add the const qualifier at the end of the method declaration to make
it a const method, enabling calls on const instances and clarifying that it does
not alter object state.

Comment on lines +19 to +20
#include "clp_s/search/QueryRunner.hpp"

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Include missing standard headers.

std::unordered_map, std::vector, and std::shared_ptr are used in this header but the corresponding headers are not included. Relying on transitive includes makes the file fragile and breaks IWYU guidelines.

+#include <memory>
+#include <unordered_map>
+#include <vector>
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
#include "clp_s/search/QueryRunner.hpp"
#include "clp_s/search/QueryRunner.hpp"
#include <memory>
#include <unordered_map>
#include <vector>
🤖 Prompt for AI Agents
In velox/connectors/clp/search_lib/ClpQueryRunner.h around lines 19 to 20, the
header uses std::unordered_map, std::vector, and std::shared_ptr but does not
include their respective standard headers. Add #include directives for
<unordered_map>, <vector>, and <memory> to explicitly include these standard
headers and comply with IWYU guidelines.

Comment on lines +70 to +72
uint64_t fetchNext(
uint64_t numRows,
const std::shared_ptr<std::vector<uint64_t>>& filteredRowIndices);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

fetchNext should use override and avoid heap-allocating the index vector.

  1. If this method is meant to override the base QueryRunner::fetchNext, mark it with override to catch signature mismatches at compile-time.
  2. Passing a shared_ptr<std::vector<uint64_t>> forces callers to heap-allocate and share ownership of a simple scratch buffer, which is unnecessary and hurts performance. A non-owning reference (e.g., std::vector<uint64_t>&) is sufficient.
-  uint64_t fetchNext(
-      uint64_t numRows,
-      const std::shared_ptr<std::vector<uint64_t>>& filteredRowIndices);
+  uint64_t fetchNext(
+      uint64_t numRows,
+      std::vector<uint64_t>& filteredRowIndices) override;
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
uint64_t fetchNext(
uint64_t numRows,
const std::shared_ptr<std::vector<uint64_t>>& filteredRowIndices);
uint64_t fetchNext(
uint64_t numRows,
std::vector<uint64_t>& filteredRowIndices) override;
🤖 Prompt for AI Agents
In velox/connectors/clp/search_lib/ClpQueryRunner.h around lines 70 to 72, the
fetchNext method should be marked with override to ensure it correctly overrides
the base class method and catch signature mismatches at compile time.
Additionally, change the parameter from a shared_ptr to a non-owning reference
to a vector (std::vector<uint64_t>&) to avoid unnecessary heap allocation and
shared ownership, improving performance and simplifying usage.


namespace clp_s::search::ast {
class Expression;
} // namespace lp_s::search::ast
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
} // namespace lp_s::search::ast
} // namespace clp_s::search::ast

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-format failed due to this

@kirkrodrigues kirkrodrigues changed the title feat: Add CLP connector feat: Add CLP connector. Jul 10, 2025
@wraymo wraymo merged commit 749d289 into y-scope:presto-0.293-clp-connector Jul 10, 2025
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants