Skip to content

Conversation

@stenlarsson
Copy link
Contributor

@stenlarsson stenlarsson commented Dec 14, 2025

Rationale for this change

The MakeStructOptions class is not available in GLib/Ruby, and it is used together with the make_struct compute function.

What changes are included in this PR?

This adds the MakeStructOptions class to GLib.

Are these changes tested?

Yes, with Ruby unit tests.

Are there any user-facing changes?

Yes, a new class.

@stenlarsson stenlarsson requested a review from kou as a code owner December 14, 2025 14:46
@github-actions
Copy link

⚠️ GitHub issue #48490 has been automatically assigned in GitHub to PR creator.

@stenlarsson
Copy link
Contributor Author

The class is missing the field_nullability and field_metadata properties because I ran into limitations of the gobject-introspection gem, e.g.: NotImplementedError: TODO: GIArgument(GList)[ghash] -> Ruby. I couldn't figure out what types the properties should have for it to work.

@kou
Copy link
Member

kou commented Dec 20, 2025

Could you push the code that raises NotImplementedError: TODO: GIArgument(GList)[ghash] -> Ruby? I'll improve gobject-introspection gem for the case.

@stenlarsson stenlarsson force-pushed the glib-make-struct-options branch from 85e630c to 730c87a Compare December 20, 2025 14:01
@stenlarsson
Copy link
Contributor Author

I pushed some code. I couldn't figure out what types the properties should have, so I added getters and setters instead, but it doesn't work because of the limitations in gobject-introspection. I'm sure there are more problems with this code, since I can't test it. Perhaps there is a better way to do it?

@kou
Copy link
Member

kou commented Dec 21, 2025

Thanks. I understand this situation and I also consider API.

How about the following API?

GArrowMakeStructOptions *
garrow_make_struct_options_new(void);
void
garrow_make_struct_options_add_field(GArrowMakeStructOptions *options, const char *name, gboolean nullability, GHashTable *metadata);

// no getter or
const char *
garrow_make_struct_options_get_field_name(GArrowMakeStructOptions *options, gsize i);
gboolean
garrow_make_struct_options_get_field_nullability(GArrowMakeStructOptions *options, gsize i);
GHashTable *
garrow_make_struct_options_get_field_metadta(GArrowMakeStructOptions *options, gsize i);
GArrowMakeStructOptions *options = garrow_make_struct_options_new();
garrow_make_struct_options_add_field(options, name1, nullability1, metadata1);
garrow_make_struct_options_add_field(options, name2, nullability2, metadata2);
...

@stenlarsson
Copy link
Contributor Author

I implemented your suggested API. I have some code in my own project that automatically generates Ruby code from the function doc, and it wouldn't work with this API. Still, I guess it is the best we can do.

Copy link
Member

@kou kou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have some code in my own project that automatically generates Ruby code from the function doc

Wow! How do you implement it?

BTW, we can provide convenient converter in Ruby like https://github.com/apache/arrow/blob/main/ruby/red-arrow/lib/arrow/equal-options.rb does.

}

enum {
PROP_MAKE_STRUCT_OPTIONS_FIELD_NAMES = 1,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we keep this? I think that we should not provided this because mixing field_names and add_field() isn't expected. For example, add_field(); field_names = [...] removes information provided by the first add_field().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree it can be confusing, but I think this is more convenient if you don't care about nullability or metadata.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. How about providing convenient APIs like this in Ruby something like the following?

module Arrow
  class MakeStructOptions
    def field_names=(names)
      raise ArgumentError, "mixing #add_field and #field_names= is prohibited" unless n_fields.zero?
      names.each do |name|
        add_field(name, true, nil)
      end
    end
  end
end

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if you call #field_names= twice?

@github-actions github-actions bot added awaiting review Awaiting review awaiting changes Awaiting changes and removed awaiting review Awaiting review labels Dec 22, 2025
@stenlarsson
Copy link
Contributor Author

Wow! How do you implement it?

The project is not open source, but I extracted the relevant parts and put together an example: https://github.com/stenlarsson/arrow-plan/blob/main/test.rb

It provides syntax sugar to make it easier to create and execute Acero plans. Since we generate code for the compute functions, your IDE can show you which functions exist, and what arguments and options they take. Symbols are used to reference fields, and some common Ruby types are automatically wrapped as well. In my opinion it makes it much easier to read the code.

However, this only works if the options classes uses properties. If they use any other API, it becomes inconvenient.

@stenlarsson stenlarsson force-pushed the glib-make-struct-options branch from d665a82 to 8e58c8b Compare December 22, 2025 12:35
@github-actions github-actions bot added awaiting review Awaiting review and removed awaiting review Awaiting review awaiting changes Awaiting changes labels Dec 22, 2025
Copy link
Member

@kou kou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The project is not open source, but I extracted the relevant parts and put together an example: https://github.com/stenlarsson/arrow-plan/blob/main/test.rb

Thanks. It's very helpful!

It provides syntax sugar to make it easier to create and execute Acero plans. Since we generate code for the compute functions, your IDE can show you which functions exist, and what arguments and options they take. Symbols are used to reference fields, and some common Ruby types are automatically wrapped as well. In my opinion it makes it much easier to read the code.

Good point. I want to add similar features to Red Arrow but I want to use dynamic method definitions instead of code generation because Acero functions can be added dynamically.

Does the dynamic method definition work with IDE integration?

However, this only works if the options classes uses properties. If they use any other API, it becomes inconvenient.

We can add convenient API in Ruby.

For example, Arrow::Function#resolve_options provides a try_convert hook:

def resolve_options(options)
return nil if options.nil?
return options if options.is_a?(FunctionOptions)
arrow_options_class = options_type&.to_class
if arrow_options_class
if arrow_options_class.respond_to?(:try_convert)
arrow_options = arrow_options_class.try_convert(options)
return arrow_options if arrow_options
end
arrow_options = (default_options || arrow_options_class.new)
else
arrow_options = default_options
end
return arrow_options if arrow_options.nil?
options.each do |key, value|
setter = :"#{key}="
next unless arrow_options.respond_to?(setter)
arrow_options.__send__(setter, value)
end
arrow_options
end

Arrow::MakeStructOptions.try_convert can accept {field_names:, field_nullability:, field_metadata:}:

module Arrow
  class MakeStructOptions
    class << self
      def try_convert(value)
        case value
        when Hash
          options = new
          field_names = value[:field_names]
          field_nullability = value[:field_nullability] || []
          field_metadta = value[:field_metadata] || []
          field_names.zip(field_nullability, field_metadata) do |name, nullability, metadata|
            options.add_field(name, nullability, metadata)
          end
          options
        else
          nil
        end
      end
    end
  end
end

[field, ...] can also be acceptable:

module Arrow
  class MakeStructOptions
    class << self
      def try_convert(value)
        case value
        when ::Array
          options = new
          value.each do |name, nullability, metadata|
            options.add_field(name, nullability, metadata)
          end
          options
        when Hash
          options = new
          field_names = value[:field_names]
          field_nullability = value[:field_nullability] || []
          field_metadta = value[:field_metadata] || []
          field_names.zip(field_nullability, field_metadata) do |name, nullability, metadata|
            options.add_field(name, nullability, metadata)
          end
          options
        else
          nil
        end
      end
    end
  end
end

}

enum {
PROP_MAKE_STRUCT_OPTIONS_FIELD_NAMES = 1,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. How about providing convenient APIs like this in Ruby something like the following?

module Arrow
  class MakeStructOptions
    def field_names=(names)
      raise ArgumentError, "mixing #add_field and #field_names= is prohibited" unless n_fields.zero?
      names.each do |name|
        add_field(name, true, nil)
      end
    end
  end
end

@github-actions github-actions bot added awaiting review Awaiting review awaiting changes Awaiting changes and removed awaiting review Awaiting review labels Dec 23, 2025
@github-actions github-actions bot added awaiting review Awaiting review awaiting committer review Awaiting committer review and removed awaiting review Awaiting review awaiting changes Awaiting changes labels Dec 29, 2025
@stenlarsson stenlarsson force-pushed the glib-make-struct-options branch from 83e908a to 323eef7 Compare December 29, 2025 11:59
@stenlarsson
Copy link
Contributor Author

Good point. I want to add similar features to Red Arrow but I want to use dynamic method definitions instead of code generation because Acero functions can be added dynamically.

Yes, that makes sense for something included with Arrow. In my case it doesn't matter since I can generate the code afterwards.

Does the dynamic method definition work with IDE integration?

No, unfortunately IDE integration for Ruby is pretty bad.

We can add convenient API in Ruby.

This could work, but I implemented another solution. This adds field-nullability and field-metadata as boxed properties of type. These aren't automatically converted by gobject-introspection, so to work around this, I implemented the getter and setter methods to the Ruby extension.

If support is added to gobject-introspection in the future, we can just remove the code from the extension without changing the API. I'm not sure what the type annotations for such properties are supposed to look like however.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants