@andygrove

Which issue does this PR close?

Part of #2955

Rationale for this change

Improve coverage of string expression benchmarks.

What changes are included in this PR?

How are these changes tested?

OpenJDK 64-Bit Server VM 17.0.17+10-Ubuntu-122.04 on Linux 6.8.0-90-generic
AMD Ryzen 9 7950X3D 16-Core Processor
ascii:                                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Spark                                               223            230          15          4.7         212.3       1.0X
Comet (Scan)                                        225            253          48          4.7         214.7       1.0X
Comet (Scan + Exec)                                 264            268           4          4.0         251.7       0.8X
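
As a reading aid, the Rate(M/s) and Relative columns in each table appear to follow directly from Per Row(ns). The sketch below reproduces the ascii figures above from the per-row times alone. It is not the benchmark harness's code: the helper names `rate_m_per_s` and `relative` are hypothetical, and the formulas are an assumption checked only against the rounded values printed here.

```python
# Per Row(ns) values copied from the ascii table above.
per_row_ns = {
    "Spark": 212.3,
    "Comet (Scan)": 214.7,
    "Comet (Scan + Exec)": 251.7,
}


def rate_m_per_s(ns_per_row: float) -> float:
    """Millions of rows per second: (1e9 ns/s / ns_per_row) / 1e6."""
    return 1_000.0 / ns_per_row


def relative(baseline_ns: float, case_ns: float) -> float:
    """Speed relative to the baseline; values below 1.0 mean slower than Spark."""
    return baseline_ns / case_ns


baseline = per_row_ns["Spark"]  # the first row of each table is the baseline
summary = {
    name: (round(rate_m_per_s(ns), 1), round(relative(baseline, ns), 1))
    for name, ns in per_row_ns.items()
}

for name, (rate, rel) in summary.items():
    print(f"{name:<22} {rate:>5.1f} M/s  {rel:.1f}X")
```

Under these assumptions, a Relative value such as 0.8X means the case takes roughly 1/0.8 = 1.25 times as long per row as the Spark baseline.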

OpenJDK 64-Bit Server VM 17.0.17+10-Ubuntu-122.04 on Linux 6.8.0-90-generic
AMD Ryzen 9 7950X3D 16-Core Processor
bit_length:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Spark                                               198            228          52          5.3         188.7       1.0X
Comet (Scan)                                        202            204           1          5.2         192.8       1.0X
Comet (Scan + Exec)                                 249            252           3          4.2         237.7       0.8X

OpenJDK 64-Bit Server VM 17.0.17+10-Ubuntu-122.04 on Linux 6.8.0-90-generic
AMD Ryzen 9 7950X3D 16-Core Processor
chr:                                      Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Spark                                               217            226          11          4.8         207.3       1.0X
Comet (Scan)                                        227            235          14          4.6         216.2       1.0X
Comet (Scan + Exec)                                 342            349           7          3.1         326.2       0.6X

OpenJDK 64-Bit Server VM 17.0.17+10-Ubuntu-122.04 on Linux 6.8.0-90-generic
AMD Ryzen 9 7950X3D 16-Core Processor
concat:                                   Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Spark                                               239            314         150          4.4         227.9       1.0X
Comet (Scan)                                        249            295          99          4.2         237.7       1.0X
Comet (Scan + Exec)                                 288            299           7          3.6         274.8       0.8X

OpenJDK 64-Bit Server VM 17.0.17+10-Ubuntu-122.04 on Linux 6.8.0-90-generic
AMD Ryzen 9 7950X3D 16-Core Processor
concat_ws:                                Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Spark                                               246            248           2          4.3         234.9       1.0X
Comet (Scan)                                        255            270          34          4.1         243.1       1.0X
Comet (Scan + Exec)                                 288            302          29          3.6         274.4       0.9X

OpenJDK 64-Bit Server VM 17.0.17+10-Ubuntu-122.04 on Linux 6.8.0-90-generic
AMD Ryzen 9 7950X3D 16-Core Processor
contains:                                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Spark                                               656            677          19          1.6         626.0       1.0X
Comet (Scan)                                        447            458          12          2.3         426.3       1.5X
Comet (Scan + Exec)                                 271            285          14          3.9         258.7       2.4X

OpenJDK 64-Bit Server VM 17.0.17+10-Ubuntu-122.04 on Linux 6.8.0-90-generic
AMD Ryzen 9 7950X3D 16-Core Processor
endswith:                                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Spark                                               195            202          11          5.4         185.8       1.0X
Comet (Scan)                                        202            203           1          5.2         192.5       1.0X
Comet (Scan + Exec)                                 278            280           2          3.8         265.4       0.7X

OpenJDK 64-Bit Server VM 17.0.17+10-Ubuntu-122.04 on Linux 6.8.0-90-generic
AMD Ryzen 9 7950X3D 16-Core Processor
initCap:                                  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Spark                                              1007           1022          20          1.0         960.8       1.0X
Comet (Scan)                                       1016           1033          23          1.0         969.3       1.0X
Comet (Scan + Exec)                                1163           1164           2          0.9        1109.0       0.9X

OpenJDK 64-Bit Server VM 17.0.17+10-Ubuntu-122.04 on Linux 6.8.0-90-generic
AMD Ryzen 9 7950X3D 16-Core Processor
instr:                                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Spark                                              1258           1270          18          0.8        1199.5       1.0X
Comet (Scan)                                       1373           1406          47          0.8        1309.5       0.9X
Comet (Scan + Exec)                                 839            846           7          1.2         800.5       1.5X

OpenJDK 64-Bit Server VM 17.0.17+10-Ubuntu-122.04 on Linux 6.8.0-90-generic
AMD Ryzen 9 7950X3D 16-Core Processor
length:                                   Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Spark                                              1217           1220           5          0.9        1160.2       1.0X
Comet (Scan)                                       1135           1138           3          0.9        1082.8       1.1X
Comet (Scan + Exec)                                 371            375           2          2.8         353.8       3.3X

OpenJDK 64-Bit Server VM 17.0.17+10-Ubuntu-122.04 on Linux 6.8.0-90-generic
AMD Ryzen 9 7950X3D 16-Core Processor
like:                                     Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Spark                                               694            741          57          1.5         661.9       1.0X
Comet (Scan)                                        448            452           6          2.3         426.8       1.6X
Comet (Scan + Exec)                                 277            282           3          3.8         264.3       2.5X

OpenJDK 64-Bit Server VM 17.0.17+10-Ubuntu-122.04 on Linux 6.8.0-90-generic
AMD Ryzen 9 7950X3D 16-Core Processor
lower:                                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Spark                                               731            755          21          1.4         697.6       1.0X
Comet (Scan)                                        693            696           3          1.5         661.1       1.1X
Comet (Scan + Exec)                                 389            394           5          2.7         371.2       1.9X

OpenJDK 64-Bit Server VM 17.0.17+10-Ubuntu-122.04 on Linux 6.8.0-90-generic
AMD Ryzen 9 7950X3D 16-Core Processor
lpad:                                     Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Spark                                              1606           1686         113          0.7        1532.0       1.0X
Comet (Scan)                                       1479           1496          23          0.7        1410.4       1.1X
Comet (Scan + Exec)                                 336            345           9          3.1         320.2       4.8X

OpenJDK 64-Bit Server VM 17.0.17+10-Ubuntu-122.04 on Linux 6.8.0-90-generic
AMD Ryzen 9 7950X3D 16-Core Processor
ltrim:                                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Spark                                               204            205           2          5.2         194.2       1.0X
Comet (Scan)                                        200            215          22          5.3         190.4       1.0X
Comet (Scan + Exec)                                 303            306           4          3.5         288.6       0.7X

OpenJDK 64-Bit Server VM 17.0.17+10-Ubuntu-122.04 on Linux 6.8.0-90-generic
AMD Ryzen 9 7950X3D 16-Core Processor
octet_length:                             Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Spark                                               192            194           1          5.5         182.9       1.0X
Comet (Scan)                                        196            201           6          5.4         186.9       1.0X
Comet (Scan + Exec)                                 248            250           2          4.2         236.5       0.8X

OpenJDK 64-Bit Server VM 17.0.17+10-Ubuntu-122.04 on Linux 6.8.0-90-generic
AMD Ryzen 9 7950X3D 16-Core Processor
regexp_replace:                           Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Spark                                             12072          12084          17          0.1       11512.9       1.0X
Comet (Scan)                                      11694          11708          20          0.1       11152.0       1.0X
Comet (Scan + Exec)                               11521          11648         179          0.1       10987.5       1.0X

OpenJDK 64-Bit Server VM 17.0.17+10-Ubuntu-122.04 on Linux 6.8.0-90-generic
AMD Ryzen 9 7950X3D 16-Core Processor
repeat:                                   Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Spark                                               241            260          37          4.4         229.6       1.0X
Comet (Scan)                                        264            267           2          4.0         251.8       0.9X
Comet (Scan + Exec)                                 342            356          10          3.1         326.4       0.7X

OpenJDK 64-Bit Server VM 17.0.17+10-Ubuntu-122.04 on Linux 6.8.0-90-generic
AMD Ryzen 9 7950X3D 16-Core Processor
replace:                                  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Spark                                               895            911          20          1.2         853.8       1.0X
Comet (Scan)                                        907            910           4          1.2         865.2       1.0X
Comet (Scan + Exec)                                 526            534           5          2.0         502.0       1.7X

OpenJDK 64-Bit Server VM 17.0.17+10-Ubuntu-122.04 on Linux 6.8.0-90-generic
AMD Ryzen 9 7950X3D 16-Core Processor
reverse:                                  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Spark                                              3048           3051           5          0.3        2906.5       1.0X
Comet (Scan)                                       2260           2266           8          0.5        2155.3       1.3X
Comet (Scan + Exec)                                 421            425           2          2.5         401.6       7.2X

OpenJDK 64-Bit Server VM 17.0.17+10-Ubuntu-122.04 on Linux 6.8.0-90-generic
AMD Ryzen 9 7950X3D 16-Core Processor
rlike:                                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Spark                                               867            875           9          1.2         826.7       1.0X
Comet (Scan)                                        879            906          32          1.2         837.9       1.0X
Comet (Scan + Exec)                                 876            936          53          1.2         835.1       1.0X

OpenJDK 64-Bit Server VM 17.0.17+10-Ubuntu-122.04 on Linux 6.8.0-90-generic
AMD Ryzen 9 7950X3D 16-Core Processor
rpad:                                     Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Spark                                              1559           1564           7          0.7        1487.2       1.0X
Comet (Scan)                                       1504           1504           1          0.7        1434.0       1.0X
Comet (Scan + Exec)                                 324            337          12          3.2         308.7       4.8X

OpenJDK 64-Bit Server VM 17.0.17+10-Ubuntu-122.04 on Linux 6.8.0-90-generic
AMD Ryzen 9 7950X3D 16-Core Processor
rtrim:                                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Spark                                               188            195           5          5.6         179.1       1.0X
Comet (Scan)                                        196            204           9          5.4         186.9       1.0X
Comet (Scan + Exec)                                 326            342          12          3.2         311.2       0.6X

OpenJDK 64-Bit Server VM 17.0.17+10-Ubuntu-122.04 on Linux 6.8.0-90-generic
AMD Ryzen 9 7950X3D 16-Core Processor
space:                                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Spark                                                28             29           1         37.5          26.7       1.0X
Comet (Scan)                                         27             29           1         38.4          26.1       1.0X
Comet (Scan + Exec)                                  34             35           1         31.3          32.0       0.8X

OpenJDK 64-Bit Server VM 17.0.17+10-Ubuntu-122.04 on Linux 6.8.0-90-generic
AMD Ryzen 9 7950X3D 16-Core Processor
startswith:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Spark                                               178            180           2          5.9         169.3       1.0X
Comet (Scan)                                        184            186           3          5.7         175.2       1.0X
Comet (Scan + Exec)                                 269            277          13          3.9         256.5       0.7X

OpenJDK 64-Bit Server VM 17.0.17+10-Ubuntu-122.04 on Linux 6.8.0-90-generic
AMD Ryzen 9 7950X3D 16-Core Processor
substring:                                Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Spark                                              1408           1410           3          0.7        1342.7       1.0X
Comet (Scan)                                       1330           1332           4          0.8        1268.0       1.1X
Comet (Scan + Exec)                                 313            326           9          3.4         298.4       4.5X

OpenJDK 64-Bit Server VM 17.0.17+10-Ubuntu-122.04 on Linux 6.8.0-90-generic
AMD Ryzen 9 7950X3D 16-Core Processor
translate:                                Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Spark                                              6609           6779         241          0.2        6302.5       1.0X
Comet (Scan)                                       6794           6957         230          0.2        6479.5       1.0X
Comet (Scan + Exec)                                8924           8958          49          0.1        8510.3       0.7X

OpenJDK 64-Bit Server VM 17.0.17+10-Ubuntu-122.04 on Linux 6.8.0-90-generic
AMD Ryzen 9 7950X3D 16-Core Processor
trim:                                     Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Spark                                               188            193           9          5.6         179.0       1.0X
Comet (Scan)                                        197            199           3          5.3         187.5       1.0X
Comet (Scan + Exec)                                 330            334           2          3.2         315.0       0.6X

OpenJDK 64-Bit Server VM 17.0.17+10-Ubuntu-122.04 on Linux 6.8.0-90-generic
AMD Ryzen 9 7950X3D 16-Core Processor
upper:                                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Spark                                               748            758          17          1.4         712.9       1.0X
Comet (Scan)                                        706            707           1          1.5         673.7       1.1X
Comet (Scan + Exec)                                 397            404           5          2.6         378.8       1.9X
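
With this many tables, it can help to scan the output programmatically for expressions where Comet (Scan + Exec) is slower than Spark. The following is a minimal sketch that assumes the output keeps the column layout shown above. The `parse` and `slower_with_exec` helpers are hypothetical and are not part of Comet or Spark. `SAMPLE` holds only two of the tables from this PR; paste the full output to cover every expression.

```python
import re

# Two tables copied from the results above (chr and reverse).
SAMPLE = """\
chr:                                      Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Spark                                               217            226          11          4.8         207.3       1.0X
Comet (Scan)                                        227            235          14          4.6         216.2       1.0X
Comet (Scan + Exec)                                 342            349           7          3.1         326.2       0.6X

reverse:                                  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Spark                                              3048           3051           5          0.3        2906.5       1.0X
Comet (Scan)                                       2260           2266           8          0.5        2155.3       1.3X
Comet (Scan + Exec)                                 421            425           2          2.5         401.6       7.2X
"""

# "<expression>:   Best Time(ms) ..." starts a new table.
HEADER = re.compile(r"^(\w+):\s+Best Time\(ms\)")
# "<case name>   best avg stdev rate per_row relativeX"
ROW = re.compile(
    r"^(.+?)\s{2,}\d+\s+\d+\s+\d+\s+[\d.]+\s+[\d.]+\s+([\d.]+)X\s*$"
)


def parse(text: str) -> dict[str, dict[str, float]]:
    """Map expression name -> {case name -> Relative value}."""
    results: dict[str, dict[str, float]] = {}
    current = None
    for line in text.splitlines():
        header = HEADER.match(line)
        if header:
            current = header.group(1)
            results[current] = {}
            continue
        row = ROW.match(line)
        if row and current is not None:
            results[current][row.group(1)] = float(row.group(2))
    return results


def slower_with_exec(results, case="Comet (Scan + Exec)"):
    """Expressions whose Relative value for `case` is below 1.0."""
    return sorted(
        name for name, rows in results.items() if rows.get(case, 1.0) < 1.0
    )


parsed = parse(SAMPLE)
print(slower_with_exec(parsed))
```

Because Relative is rounded to one decimal place, small differences from the baseline all print as 1.0X; comparing the Per Row(ns) column instead would give a finer-grained picture.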

@codecov-commenter
codecov-commenter commented Dec 29, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 59.55%. Comparing base (f09f8af) to head (b471c32).
⚠️ Report is 806 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff              @@
##               main    #3012      +/-   ##
============================================
+ Coverage     56.12%   59.55%   +3.42%     
- Complexity      976     1368     +392     
============================================
  Files           119      167      +48     
  Lines         11743    15495    +3752     
  Branches       2251     2568     +317     
============================================
+ Hits           6591     9228    +2637     
- Misses         4012     4971     +959     
- Partials       1140     1296     +156     


@andygrove changed the title from "improve string benchmarks" to "chore: Improve string benchmarks" on Dec 29, 2025
@andygrove changed the title from "chore: Improve string benchmarks" to "chore: Improve string expression microbenchmarks" on Dec 29, 2025