@pankajbaid567
Metadata

Details

What does this PR implement/fix?

This PR adds three new CLI subcommands under openml tasks to improve the user experience of the task catalogue:

  • openml tasks list - List tasks with optional filtering (tag, task-type, status, data-name, pagination, output format)
  • openml tasks info <task_id> - Display detailed information about a specific task including task type, dataset, target feature, evaluation measure, and class labels
  • openml tasks search <query> - Search tasks by associated dataset name with case-insensitive matching
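As a rough illustration of how the three subcommands above could be laid out, here is a minimal `argparse` sketch. It is not the code from this PR: the `build_parser` name, the default values, and the exact `dest` names are assumptions made for the example. Only the subcommand names and the option names come from the description.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a hypothetical parser for the `openml tasks` subcommands."""
    parser = argparse.ArgumentParser(prog="openml")
    top = parser.add_subparsers(dest="command", required=True)

    tasks = top.add_parser("tasks", help="Browse the task catalogue")
    sub = tasks.add_subparsers(dest="tasks_command", required=True)

    # `openml tasks list`: filters and pagination named in the PR description.
    # The defaults below are assumptions, not the PR's actual defaults.
    list_p = sub.add_parser("list", help="List tasks")
    list_p.add_argument("--tag")
    list_p.add_argument("--task-type")
    list_p.add_argument("--status")
    list_p.add_argument("--data-name")
    list_p.add_argument("--offset", type=int, default=0)
    list_p.add_argument("--size", type=int, default=None)
    list_p.add_argument("--format", choices=["table", "json"], default="table")

    # `openml tasks info <task_id>`
    info_p = sub.add_parser("info", help="Show details for one task")
    info_p.add_argument("task_id", type=int)

    # `openml tasks search <query>`
    search_p = sub.add_parser("search", help="Search tasks by dataset name")
    search_p.add_argument("query")

    return parser


args = build_parser().parse_args(
    ["tasks", "list", "--task-type", "Supervised Classification", "--size", "10"]
)
print(args.tasks_command, args.task_type, args.size)
```

Nesting a second `add_subparsers` call under `tasks` keeps `list`, `info`, and `search` in their own namespace, so a handler can dispatch on `args.tasks_command`.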

Why is this change necessary? What is the problem it solves?

Currently, users must write Python code to browse or search OpenML tasks, even for simple operations like listing available tasks or finding tasks for a specific dataset. This creates a barrier to entry and makes the task catalogue less accessible. Adding CLI commands allows users to interact with the task catalogue directly from the command line without writing code.

This directly addresses the ESoC 2025 goal of "Improving user experience of the task catalogue in AIoD and OpenML".

How can I reproduce the issue this PR is solving and its solution?

Before (requires Python code):

```python
import openml

tasks = openml.tasks.list_tasks(size=10)
for tid in tasks:
    task = openml.tasks.get_task(tid)
    print(f"{tid}: {task.task_type}")
```
After (CLI commands):

```bash
# List first 10 tasks
openml tasks list --size 10

# Search for tasks related to iris dataset
openml tasks search iris

# Get detailed info about a task
openml tasks info 1

# List classification tasks with a specific tag
openml tasks list --task-type "Supervised Classification" --tag study_14 --format table

# List tasks with pagination
openml tasks list --offset 20 --size 10
```
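To make the `--format table` and JSON output options more concrete, the following is a minimal sketch of how a list of task rows might be rendered in either format. The `render` function, the row layout (a list of dicts), and the "No tasks found." message are assumptions for illustration and are not taken from the PR's code. The sample rows are placeholders, not real catalogue entries.

```python
import json
from typing import Dict, List


def render(rows: List[Dict[str, object]], fmt: str = "table") -> str:
    """Render task rows as JSON or as a plain-text, column-aligned table."""
    if fmt == "json":
        return json.dumps(rows, indent=2)
    if not rows:
        return "No tasks found."

    # Use the keys of the first row as the column order.
    columns = list(rows[0])
    # Each column is as wide as its header or its longest cell.
    widths = {
        c: max(len(c), *(len(str(r.get(c, ""))) for r in rows)) for c in columns
    }
    header = "  ".join(c.ljust(widths[c]) for c in columns)
    rule = "  ".join("-" * widths[c] for c in columns)
    body = [
        "  ".join(str(r.get(c, "")).ljust(widths[c]) for c in columns)
        for r in rows
    ]
    return "\n".join([header, rule, *body])


# Placeholder rows for illustration only.
sample = [
    {"tid": 1, "name": "dataset_a"},
    {"tid": 22, "name": "dataset_b"},
]
print(render(sample, "table"))
print(render(sample, "json"))
```

Keeping rendering separate from fetching means the same rows can feed either output format, and the formatter can be tested without any server access.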
Implementation Details:

  • Added three new functions in openml/cli.py: tasks_list(), tasks_info(), and tasks_search()
  • Added handler function tasks_handler() to route subcommands
  • Integrated into main CLI parser with proper argument handling
  • Added comprehensive test suite in tests/test_openml/test_cli.py

  • Uses existing openml.tasks.list_tasks() and openml.tasks.get_task() functions - no changes to core API
  • Follows existing CLI patterns (similar to the configure command)
  • All tests use mocked API calls to avoid requiring server connections
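The mocked-test approach can be sketched as follows. This is not the PR's test suite: `search_tasks` is a hypothetical stand-in for the search logic, the listing call is injected as a parameter so the example runs without `openml` installed, and substring matching is an assumption (the description only says matching is case-insensitive). The rows are placeholder data.

```python
from unittest import mock


def search_tasks(query, list_tasks):
    """Return rows whose dataset name contains `query`, ignoring case.

    `list_tasks` is a stand-in for the real listing call. Rows are assumed
    to be dicts with a "name" key holding the dataset name.
    """
    needle = query.casefold()
    return [
        row
        for row in list_tasks()
        if needle in str(row.get("name", "")).casefold()
    ]


def test_search_is_case_insensitive():
    # The mock replaces the server call, so no network access is needed.
    fake_list_tasks = mock.Mock(
        return_value=[
            {"tid": 1, "name": "Iris"},
            {"tid": 2, "name": "placeholder"},
        ]
    )
    hits = search_tasks("IRIS", fake_list_tasks)
    assert [row["tid"] for row in hits] == [1]
    fake_list_tasks.assert_called_once_with()


test_search_is_case_insensitive()
```

In a real test module the same idea would more likely use `mock.patch` on the listing function rather than passing it in, but the assertion pattern is the same.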
Any other comments?

  • All pre-commit hooks pass (ruff, mypy, formatting)
  • No breaking changes
  • Follows project code style and patterns
  • Ready for review

Add three new CLI subcommands under 'openml tasks':
- openml tasks list: List tasks with optional filtering
- openml tasks info: Display detailed task information
- openml tasks search: Search tasks by dataset name (case-insensitive)

Features:
- Support for multiple filter options (tag, task-type, status, data-name)
- Output formatting (table/json) with verbose mode
- Pagination support (offset, size)
- Comprehensive test suite with mocked API calls
- Proper error handling

Addresses ESoC 2025 goal of improving user experience of the task catalogue.

Related to issue openml#1486