Add CLI commands for browsing and searching OpenML tasks #1509
Metadata
`openml tasks list`, `openml tasks info`, and `openml tasks search`
Details
What does this PR implement/fix?
This PR adds three new CLI subcommands under `openml tasks` to improve the user experience of the task catalogue:
`openml tasks list`: list tasks with optional filtering (tag, task-type, status, data-name, pagination, output format)
`openml tasks info <task_id>`: display detailed information about a specific task, including task type, dataset, target feature, evaluation measure, and class labels
`openml tasks search <query>`: search tasks by associated dataset name with case-insensitive matching
Why is this change necessary? What is the problem it solves?
Currently, users must write Python code to browse or search OpenML tasks, even for simple operations like listing available tasks or finding tasks for a specific dataset. This creates a barrier to entry and makes the task catalogue less accessible. Adding CLI commands allows users to interact with the task catalogue directly from the command line without writing code.
This directly addresses the ESoC 2025 goal of "Improving user experience of the task catalogue in AIoD and OpenML".
How can I reproduce the issue this PR is solving and its solution?
Before this PR, these operations required writing Python code. With this PR, they can be run directly from the command line:
```bash
# List the first 10 tasks
openml tasks list --size 10

# Search for tasks related to the iris dataset
openml tasks search iris

# Get detailed info about a task
openml tasks info 1

# List classification tasks with a specific tag
openml tasks list --task-type "Supervised Classification" --tag study_14 --format table

# List tasks with pagination
openml tasks list --offset 20 --size 10
```
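For reviewers who want to see how commands of this shape are typically wired up, here is a minimal `argparse` sketch. It is not the code from this PR: `build_parser` and `route_tasks_command` are hypothetical names, only the flags that appear in the examples above are declared, and the handlers are placeholders that return a string instead of calling the OpenML API.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a nested parser: openml -> tasks -> {list, info, search}."""
    parser = argparse.ArgumentParser(prog="openml")
    commands = parser.add_subparsers(dest="command", required=True)

    tasks = commands.add_parser("tasks", help="Browse and search tasks")
    tasks_commands = tasks.add_subparsers(dest="tasks_command", required=True)

    # Flags below are limited to the ones used in the example commands.
    list_parser = tasks_commands.add_parser("list")
    list_parser.add_argument("--size", type=int, default=None)
    list_parser.add_argument("--offset", type=int, default=None)
    list_parser.add_argument("--tag", default=None)
    list_parser.add_argument("--task-type", dest="task_type", default=None)
    list_parser.add_argument("--format", dest="output_format", default=None)

    info_parser = tasks_commands.add_parser("info")
    info_parser.add_argument("task_id", type=int)

    search_parser = tasks_commands.add_parser("search")
    search_parser.add_argument("query")
    return parser


def route_tasks_command(args: argparse.Namespace) -> str:
    """Dispatch to a placeholder handler based on the chosen subcommand."""
    routes = {
        "list": lambda a: f"list offset={a.offset} size={a.size}",
        "info": lambda a: f"info task_id={a.task_id}",
        "search": lambda a: f"search query={a.query}",
    }
    return routes[args.tasks_command](args)


if __name__ == "__main__":
    parsed = build_parser().parse_args(["tasks", "list", "--offset", "20", "--size", "10"])
    print(route_tasks_command(parsed))
```

A dictionary of handlers keeps the routing in one place, so adding a subcommand means adding one parser and one entry.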
Implementation Details:
Added three new functions in `openml/cli.py`: `tasks_list()`, `tasks_info()`, and `tasks_search()`
Added handler function `tasks_handler()` to route subcommands
Integrated into main CLI parser with proper argument handling
Added comprehensive test suite in `tests/test_openml/test_cli.py`
Uses existing `openml.tasks.list_tasks()` and `openml.tasks.get_task()` functions, with no changes to the core API
Follows existing CLI patterns (similar to the `configure` command)
All tests use mocked API calls to avoid requiring server connections
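As a rough illustration of the last two points (case-insensitive search and mocked tests), the sketch below filters task rows by dataset name and tests the filter without any server connection. It is an assumption-laden example, not the PR's test suite: `search_tasks` is a hypothetical helper, the lister is injected as a parameter rather than patched on `openml.tasks`, the row keys `tid` and `name` and the sample rows are made up for illustration, and substring matching is assumed.

```python
from unittest import mock


def search_tasks(query, list_tasks):
    """Return rows whose dataset name contains `query`, ignoring case.

    `list_tasks` is any zero-argument callable returning a list of dicts;
    in a real CLI it would wrap the OpenML task listing call.
    """
    needle = query.casefold()
    return [
        row for row in list_tasks()
        if needle in str(row.get("name", "")).casefold()
    ]


# Hypothetical rows standing in for a server response.
FAKE_ROWS = [
    {"tid": 1, "name": "Iris"},
    {"tid": 2, "name": "credit-g"},
    {"tid": 3, "name": "iris-binary"},
]


def test_search_is_case_insensitive():
    # The mock replaces the network-backed lister, so no server is needed.
    fake_list_tasks = mock.Mock(return_value=FAKE_ROWS)
    matches = search_tasks("IRIS", fake_list_tasks)
    assert [row["tid"] for row in matches] == [1, 3]
    fake_list_tasks.assert_called_once_with()


if __name__ == "__main__":
    test_search_is_case_insensitive()
    print("ok")
```

Injecting the lister keeps the filter a pure function, which is one way to make CLI logic testable offline; patching the library call with `mock.patch` would be an equally valid choice.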
Any other comments?
All pre-commit hooks pass (ruff, mypy, formatting)
No breaking changes
Follows project code style and patterns
Ready for review