feat: implement Primitive type Literal #117
base: main
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,103 @@ | ||
/* | ||
* Licensed to the Apache Software Foundation (ASF) under one | ||
* or more contributor license agreements. See the NOTICE file | ||
* distributed with this work for additional information | ||
* regarding copyright ownership. The ASF licenses this file | ||
* to you under the Apache License, Version 2.0 (the | ||
* "License"); you may not use this file except in compliance | ||
* with the License. You may obtain a copy of the License at | ||
* | ||
* http://www.apache.org/licenses/LICENSE-2.0 | ||
* | ||
* Unless required by applicable law or agreed to in writing, | ||
* software distributed under the License is distributed on an | ||
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | ||
* KIND, either express or implied. See the License for the | ||
* specific language governing permissions and limitations | ||
* under the License. | ||
*/ | ||
|
||
#include "iceberg/datum.h" | ||
|
||
#include <sstream> | ||
|
||
#include "iceberg/exception.h" | ||
|
||
namespace iceberg { | ||
|
||
// Constructor | ||
PrimitiveLiteral::PrimitiveLiteral(PrimitiveLiteralValue value, | ||
std::shared_ptr<PrimitiveType> type) | ||
: value_(std::move(value)), type_(std::move(type)) {} | ||
|
||
// Factory methods | ||
PrimitiveLiteral PrimitiveLiteral::Boolean(bool value) { | ||
return PrimitiveLiteral(value, std::make_shared<BooleanType>()); | ||
} | ||
|
||
PrimitiveLiteral PrimitiveLiteral::Integer(int32_t value) { | ||
return PrimitiveLiteral(value, std::make_shared<IntType>()); | ||
} | ||
|
||
PrimitiveLiteral PrimitiveLiteral::Long(int64_t value) { | ||
return PrimitiveLiteral(value, std::make_shared<LongType>()); | ||
} | ||
|
||
PrimitiveLiteral PrimitiveLiteral::Float(float value) { | ||
return PrimitiveLiteral(value, std::make_shared<FloatType>()); | ||
} | ||
|
||
PrimitiveLiteral PrimitiveLiteral::Double(double value) { | ||
return PrimitiveLiteral(value, std::make_shared<DoubleType>()); | ||
} | ||
|
||
PrimitiveLiteral PrimitiveLiteral::String(std::string value) { | ||
return PrimitiveLiteral(std::move(value), std::make_shared<StringType>()); | ||
} | ||
|
||
PrimitiveLiteral PrimitiveLiteral::Binary(std::vector<uint8_t> value) { | ||
return PrimitiveLiteral(std::move(value), std::make_shared<BinaryType>()); | ||
} | ||
|
||
Result<PrimitiveLiteral> PrimitiveLiteral::Deserialize(std::span<const uint8_t> data) { | ||
return NotImplemented("Deserialization of PrimitiveLiteral is not implemented yet"); | ||
} | ||
|
||
Result<std::vector<uint8_t>> PrimitiveLiteral::Serialize() const { | ||
return NotImplemented("Serialization of PrimitiveLiteral is not implemented yet"); | ||
} | ||
|
||
// Getters | ||
const PrimitiveLiteralValue& PrimitiveLiteral::value() const { return value_; } | ||
|
||
const std::shared_ptr<PrimitiveType>& PrimitiveLiteral::type() const { return type_; } | ||
|
||
// Cast method | ||
Result<PrimitiveLiteral> PrimitiveLiteral::CastTo( | ||
const std::shared_ptr<PrimitiveType>& target_type) const { | ||
if (*type_ == *target_type) { | ||
// If types are the same, return a copy of the current literal | ||
return PrimitiveLiteral(value_, target_type); | ||
} | ||
|
||
return NotImplemented("Cast from {} to {} is not implemented", type_->ToString(), | ||
target_type->ToString()); | ||
} | ||
|
||
// Three-way comparison operator | ||
std::partial_ordering PrimitiveLiteral::operator<=>(const PrimitiveLiteral& other) const { | ||
// If types are different, comparison is unordered | ||
if (type_->type_id() != other.type_->type_id()) { | ||
return std::partial_ordering::unordered; | ||
} | ||
if (value_ == other.value_) { | ||
return std::partial_ordering::equivalent; | ||
} | ||
// Types match but values differ: ordering within a type is not implemented yet | ||
throw IcebergError("Not implemented: ordering of unequal values of the same primitive type"); | ||
} | ||
|
||
std::string PrimitiveLiteral::ToString() const { | ||
throw IcebergError("ToString for PrimitiveLiteral is not implemented yet"); | ||
} | ||
|
||
} // namespace iceberg |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,108 @@ | ||
/* | ||
* Licensed to the Apache Software Foundation (ASF) under one | ||
* or more contributor license agreements. See the NOTICE file | ||
* distributed with this work for additional information | ||
* regarding copyright ownership. The ASF licenses this file | ||
* to you under the Apache License, Version 2.0 (the | ||
* "License"); you may not use this file except in compliance | ||
* with the License. You may obtain a copy of the License at | ||
* | ||
* http://www.apache.org/licenses/LICENSE-2.0 | ||
* | ||
* Unless required by applicable law or agreed to in writing, | ||
* software distributed under the License is distributed on an | ||
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | ||
* KIND, either express or implied. See the License for the | ||
* specific language governing permissions and limitations | ||
* under the License. | ||
*/ | ||
|
||
#pragma once | ||
|
||
#include <compare> | ||
#include <cstdint> | ||
#include <memory> | ||
#include <span> | ||
#include <string> | ||
#include <variant> | ||
#include <vector> | ||
|
||
#include "iceberg/result.h" | ||
#include "iceberg/type.h" | ||
|
||
namespace iceberg { | ||
|
||
/// \brief Sentinel type for values that are below the minimum allowed value for a | ||
/// primitive type. | ||
/// | ||
/// When casting a value to a narrower primitive type, a value smaller than the minimum | ||
/// of the destination type is represented by this type. | ||
struct BelowMin { | ||
bool operator==(const BelowMin&) const = default; | ||
std::strong_ordering operator<=>(const BelowMin&) const = default; | ||
}; | ||
|
||
/// \brief Sentinel type for values that are above the maximum allowed value for a | ||
/// primitive type. | ||
/// | ||
/// When casting a value to a narrower primitive type, a value larger than the maximum | ||
/// of the destination type is represented by this type. | ||
struct AboveMax { | ||
bool operator==(const AboveMax&) const = default; | ||
std::strong_ordering operator<=>(const AboveMax&) const = default; | ||
}; | ||
|
||
using PrimitiveLiteralValue = | ||
std::variant<bool, // for boolean | ||
int32_t, // for int, date | ||
int64_t, // for long, timestamp, timestamp_tz, time | ||
float, // for float | ||
double, // for double | ||
std::string, // for string | ||
std::vector<uint8_t>, // for binary, fixed, decimal and uuid | ||
|
||
BelowMin, AboveMax>; | ||
|
||
/// \brief PrimitiveLiteral owns a literal value of a primitive type. | ||
class PrimitiveLiteral { | ||
Review comment: Rename to Datum since you are using datum.h/cc as the filename?
|
||
public: | ||
explicit PrimitiveLiteral(PrimitiveLiteralValue value, | ||
std::shared_ptr<PrimitiveType> type); | ||
|
||
// Factory methods for primitive types | ||
static PrimitiveLiteral Boolean(bool value); | ||
static PrimitiveLiteral Integer(int32_t value); | ||
static PrimitiveLiteral Long(int64_t value); | ||
static PrimitiveLiteral Float(float value); | ||
static PrimitiveLiteral Double(double value); | ||
static PrimitiveLiteral String(std::string value); | ||
static PrimitiveLiteral Binary(std::vector<uint8_t> value); | ||
|
||
/// Create iceberg value from bytes. | ||
/// | ||
/// See [this spec](https://iceberg.apache.org/spec/#binary-single-value-serialization) | ||
/// for reference. | ||
static Result<PrimitiveLiteral> Deserialize(std::span<const uint8_t> data); | ||
/// Serialize iceberg value to bytes. | ||
/// | ||
/// See [this spec](https://iceberg.apache.org/spec/#binary-single-value-serialization) | ||
/// for reference. | ||
Result<std::vector<uint8_t>> Serialize() const; | ||
|
||
/// Get the value as a variant | ||
const PrimitiveLiteralValue& value() const; | ||
|
||
/// Get the Iceberg Type of the literal | ||
const std::shared_ptr<PrimitiveType>& type() const; | ||
|
||
/// Cast the literal to a specific type | ||
Result<PrimitiveLiteral> CastTo( | ||
const std::shared_ptr<PrimitiveType>& target_type) const; | ||
|
||
std::partial_ordering operator<=>(const PrimitiveLiteral& other) const; | ||
|
||
std::string ToString() const; | ||
|
||
private: | ||
PrimitiveLiteralValue value_; | ||
std::shared_ptr<PrimitiveType> type_; | ||
Review comment: don't know
||
}; | ||
|
||
} // namespace iceberg |
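CastTo is not implemented in this PR, so the following is only a sketch of how the BelowMin/AboveMax sentinels described in the header comments might be produced by a narrowing cast from long to int. BelowMinSketch, AboveMaxSketch, NarrowedInt and NarrowLongToInt are hypothetical names, not part of the PR.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <variant>

// Hypothetical stand-ins for the BelowMin / AboveMax sentinel types.
struct BelowMinSketch {};
struct AboveMaxSketch {};

using NarrowedInt = std::variant<int32_t, BelowMinSketch, AboveMaxSketch>;

// Narrows a 64-bit value to 32 bits. Out-of-range inputs yield a sentinel
// instead of a wrapped or truncated integer.
NarrowedInt NarrowLongToInt(int64_t value) {
  if (value < std::numeric_limits<int32_t>::min()) {
    return BelowMinSketch{};
  }
  if (value > std::numeric_limits<int32_t>::max()) {
    return AboveMaxSketch{};
  }
  return static_cast<int32_t>(value);
}
```

Callers can then branch with std::holds_alternative to decide how a predicate on the narrowed column should behave when the literal is out of range.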
C++ std::string does not require UTF-8, so I wonder whether using std::vector<uint8_t> for strings is possible.
Small String Optimization (SSO) makes PrimitiveLiteralValue 40 bytes; maybe use a unique_ptr to own the string?
LGTM
Just a nit: Literal might be stored in a vector, so a few more bytes are not critical?
I think the data part of vector will not be in Literal's layout?
If we are not going to use something like std::vector, I think it's fine with the current design.
Emmm, a tiny wrapper like this should be defined if unique_ptr is to be added. I'm also OK with that.
With libc++, sizeof(PrimitiveLiteralValue) == 48. Since std::vector is large, it is still 48 B after changing to StringLiteralValue.
After changing to the below, sizeof(PrimitiveLiteralValue) goes from 48 B to 40 B, which only eliminates 8 B 😅. I think this is because C++ doesn't do any niche optimization for std::variant... If the largest element in PrimitiveLiteralValue is 8 B, sizeof(PrimitiveLiteralValue) can be 32 B. So I think std::string is enough here for now?
Thanks for the analysis, let's keep std::string :)
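For anyone who wants to reproduce the size comparison discussed above, here is a minimal sketch. WithString, WithBoxedString and PrintSizes are hypothetical names; the variants omit the sentinel types, and the exact numbers depend on the standard library and platform, so none are asserted here.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <memory>
#include <string>
#include <variant>
#include <vector>

// Hypothetical variant holding the string inline.
using WithString = std::variant<bool, int32_t, int64_t, float, double, std::string,
                                std::vector<uint8_t>>;

// Hypothetical variant holding the string behind a pointer instead.
using WithBoxedString =
    std::variant<bool, int32_t, int64_t, float, double, std::unique_ptr<std::string>,
                 std::vector<uint8_t>>;

// std::variant stores its largest alternative plus a discriminator, so boxing the
// string only helps while std::string is the largest alternative; after that,
// std::vector sets the floor.
static_assert(sizeof(WithString) > sizeof(std::string));
static_assert(sizeof(WithBoxedString) > sizeof(std::vector<uint8_t>));

// Prints the implementation-specific sizes for inspection.
void PrintSizes() {
  std::printf("std::string: %zu\n", sizeof(std::string));
  std::printf("std::vector<uint8_t>: %zu\n", sizeof(std::vector<uint8_t>));
  std::printf("variant with std::string: %zu\n", sizeof(WithString));
  std::printf("variant with unique_ptr<std::string>: %zu\n", sizeof(WithBoxedString));
}
```

Running PrintSizes on each target toolchain shows whether the extra indirection buys anything there, which is the trade-off the thread settled by keeping std::string.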