-
Notifications
You must be signed in to change notification settings - Fork 1.9k
feat: to_time function
#19540
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
feat: to_time function
#19540
Conversation
409e183 to
118f1ed
Compare
118f1ed to
9917de9
Compare
| let seconds = time.second() as i64; | ||
| let nanos = time.nanosecond() as i64; | ||
|
|
||
| hours * 3_600_000_000_000 + minutes * 60_000_000_000 + seconds * 1_000_000_000 + nanos |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice to extract these out to constants
const NANOS_PER_SECOND: i64 = 1_000_000_000;
const NANOS_PER_MINUTE: i64 = 60 * NANOS_PER_SECOND;
const NANOS_PER_HOUR: i64 = 60 * NANOS_PER_MINUTE;I think there is existing constants for some of these already
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, much better.
|
Thank you for this PR. I would like to see some additional slt tests
Support timestamp as input, extract time from it. |
Sure, I will add those. Thanks for feedback. |
- Extract magic numbers to NANOS_PER_SECOND, NANOS_PER_MINUTE, NANOS_PER_HOUR constants - Use Arrow's NANOSECONDS constant as base - Add timestamp input support (extracts time portion) - Add SLT tests for: - HH:MM default parsing (no seconds) - Out of range minutes/seconds - Timestamp input (various timestamp types) - Null timestamp handling
| } | ||
| } | ||
|
|
||
| fn to_time(&self, args: &[ColumnarValue]) -> Result<ColumnarValue> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be moved out of ToTimeFunc if it doesn't use self
| if let ColumnarValue::Scalar(ScalarValue::Utf8(Some(s))) | ||
| | ColumnarValue::Scalar(ScalarValue::LargeUtf8(Some(s))) | ||
| | ColumnarValue::Scalar(ScalarValue::Utf8View(Some(s))) = arg | ||
| { | ||
| Some(s.as_str()) | ||
| } else { | ||
| None | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This ignores array inputs without erroring?
| } | ||
|
|
||
| /// Convert NaiveTime to nanoseconds since midnight | ||
| fn time_to_nanos(time: NaiveTime) -> i64 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reuse arrow function instead: https://docs.rs/arrow/latest/arrow/array/temporal_conversions/fn.time_to_time64ns.html
| # Timestamp with timezone | ||
|
|
||
| query D | ||
| select to_time(to_timestamp('2024-01-15T14:30:45+00:00')); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we have a timezone that isn't UTC?
| } | ||
|
|
||
| #[cfg(test)] | ||
| mod tests { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How much of these tests are cases that can be done in SLT instead?
| ---- | ||
| 00:00:00 | ||
|
|
||
| # Timestamp input - extract time portion |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should have a test case for timestamps before the epoch
| ColumnarValue::Array(array) => { | ||
| let len = array.len(); | ||
| let mut builder: PrimitiveBuilder<Time64NanosecondType> = | ||
| PrimitiveArray::builder(len); | ||
|
|
||
| match array.data_type() { | ||
| Timestamp(arrow::datatypes::TimeUnit::Nanosecond, _) => { | ||
| let ts_array = | ||
| array.as_primitive::<arrow::datatypes::TimestampNanosecondType>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can replace all this logic using arrow cast kernel
| ColumnarValue::Scalar(scalar) => { | ||
| let nanos = match scalar { | ||
| ScalarValue::TimestampNanosecond(Some(ts), _) => *ts % NANOS_PER_DAY, | ||
| ScalarValue::TimestampMicrosecond(Some(ts), _) => { | ||
| (*ts * 1_000) % NANOS_PER_DAY | ||
| } | ||
| ScalarValue::TimestampMillisecond(Some(ts), _) => { | ||
| (*ts * 1_000_000) % NANOS_PER_DAY | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here; in this case we can create arrays of size 1 to pass into the arrow cast kernel and recreate a scalar value from the output, to reuse it's code
Which issue does this PR close?
Rationale for this change
Previously we needed to create timestamp and then extract the time component from that.
What changes are included in this PR?
to_timeAre these changes tested?
Are there any user-facing changes?