
Conversation

@JATAYU000 (Contributor) commented Dec 29, 2025

Metadata

@codecov-commenter commented Dec 31, 2025

Codecov Report

❌ Patch coverage is 77.27273% with 10 lines in your changes missing coverage. Please review.
✅ Project coverage is 52.91%. Comparing base (c5f68bf) to head (8450715).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| openml/setups/setup.py | 60.00% | 4 Missing ⚠️ |
| openml/datasets/data_feature.py | 50.00% | 3 Missing ⚠️ |
| openml/tasks/split.py | 57.14% | 3 Missing ⚠️ |
Additional details and impacted files
```diff
@@            Coverage Diff             @@
##             main    #1567      +/-   ##
==========================================
- Coverage   53.02%   52.91%   -0.12%
==========================================
  Files          36       36
  Lines        4326     4328       +2
==========================================
- Hits         2294     2290       -4
- Misses       2032     2038       +6
```


@fkiraly (Collaborator) left a comment


Question: have you considered inheriting from OpenMLBase, or at least move the __repr__ related logic to a common place, instead of writing a new one?

Not saying that this is how it has to be done, I would like to hear your rationale.

@JATAYU000 (Contributor, Author)

> Question: have you considered inheriting from OpenMLBase, or at least move the __repr__ related logic to a common place, instead of writing a new one?

OpenMLSplit is independent, so making it inherit from OpenMLBase just to reuse two methods did not feel right. I hadn't considered moving `__repr__` to a common place before; now that I think about it, would a small mixin class be better? For example, a `ReprMixin` that defines `__repr__` and `_apply_repr_template`, and expects the inheriting class to implement `_get_repr_body_fields`.
What is your opinion on that?
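A minimal sketch of what such a `ReprMixin` could look like. The method names come from this discussion; the example subclass and the exact output format are hypothetical, not the actual openml-python implementation:

```python
class ReprMixin:
    """Provides __repr__ for classes that implement _get_repr_body_fields."""

    def _get_repr_body_fields(self) -> list[tuple[str, str]]:
        # Subclasses return (field name, field value) pairs to display.
        raise NotImplementedError

    def _apply_repr_template(self, fields: list[tuple[str, str]]) -> str:
        # Render a header plus one aligned "name: value" line per field.
        header = type(self).__name__
        longest = max(len(name) for name, _ in fields)
        body = "\n".join(f"{name.ljust(longest)}: {value}" for name, value in fields)
        return f"{header}\n{'=' * len(header)}\n{body}"

    def __repr__(self) -> str:
        return self._apply_repr_template(self._get_repr_body_fields())


# Hypothetical subclass, just to show the contract a class must fulfil.
class OpenMLSplitExample(ReprMixin):
    def __init__(self, name: str, folds: int):
        self.name = name
        self.folds = folds

    def _get_repr_body_fields(self) -> list[tuple[str, str]]:
        return [("Name", self.name), ("Folds", str(self.folds))]
```

This keeps the DRY benefit without forcing OpenMLSplit to inherit server-entity behavior from OpenMLBase.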

@fkiraly (Collaborator) commented Jan 2, 2026

> What is your opinion on that?

I think it might be a good idea, simply because of the DRY principle.
Although I am not 100% sure how it works out.

@geetu040 (Contributor) commented Jan 2, 2026

Can we put this in utility functions? I don't think that would be clean, so the `ReprMixin` class sounds good.

@SimonBlanke (Collaborator) left a comment

> Question: have you considered inheriting from OpenMLBase, or at least move the __repr__ related logic to a common place, instead of writing a new one?
>
> Not saying that this is how it has to be done, I would like to hear your rationale.

OpenMLSplit should not inherit from OpenMLBase, because it is not a server entity. Moving the __repr__ logic can make sense, but I would not do it in this PR. I would rather do quick PRs and follow-up issues/PRs for refactoring.

@JATAYU000 changed the title from "Added __repr__ method from OpenMLSplit" to "Make all OpenML classes to inherit ReprMixin" on Jan 7, 2026
EmanAbdelhaleem and others added 8 commits January 8, 2026 10:51
…as non-strict expected fail. (openml#1587)

#### Metadata
* Reference Issue: Temporarily fix issue openml#1586

#### Details 
- Running the pytest locally, I found only one failed test which is: `tests/test_runs/test_run_functions.py::test__run_task_get_arffcontent_2`
- However, when going through the failed tests in recently run jobs across different recent PRs, I found many other failures. I picked some of them and did a brief analysis; here are my findings:

##### Primary Failure Patterns
1. OpenML Test Server Issues (Most Common)
The majority of failures are caused by:
  - `OpenMLServerError: Unexpected server error when calling https://test.openml.org/... with Status code: 500`
  - Database connection errors: `Database connection error. Usually due to high server load. Please wait N seconds and try again.`
  - Timeout errors: `TIMEOUT: Failed to fetch uploaded dataset`

2. Cache/Filesystem Issues
  - `ValueError: Cannot remove faulty tasks cache directory ... Please do this manually!`
  - `FileNotFoundError: No such file or directory`

3. Data Format Issues 
  - `KeyError: ['type'] not found in axis`
  - `KeyError: ['class'] not found in axis`
  - `KeyError: ['Class'] not found in axis`
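The temporary fix referenced in the commit subject marks such flaky, server-dependent tests as non-strict expected failures. A minimal sketch of that marker (the test name here is hypothetical; the `reason` string matches the one used in this repository):

```python
import pytest


# strict=False means the test is allowed to fail (reported as XFAIL),
# but an unexpected pass (XPASS) does NOT fail the suite either --
# appropriate for failures caused by transient server problems.
@pytest.mark.xfail(reason="failures_issue_1544", strict=False)
def test_fetch_from_test_server():
    ...  # would talk to https://test.openml.org
```

With `strict=True` the same marker would turn an unexpected pass into a hard failure, which is undesirable for tests that only fail intermittently.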
…openml#1556)

#### Metadata

* Reference Issue: Fixes openml#1542
 
#### Details 
Fixed sklearn models detection by safely importing openml-sklearn at `openml/runs/__init__.py`
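One common pattern for such a "safe import" of an optional extension is a guarded try/except. This is a sketch under the assumption that the package imports as `openml_sklearn`; the actual code in `openml/runs/__init__.py` may differ:

```python
# Importing the optional openml-sklearn extension must not break
# environments where it is not installed, so guard the import.
try:
    import openml_sklearn  # noqa: F401  -- import registers the sklearn extension

    SKLEARN_EXTENSION_AVAILABLE = True
except ImportError:
    # scikit-learn support is simply unavailable; core openml still works.
    SKLEARN_EXTENSION_AVAILABLE = False
```

Code that needs the extension can then check the flag (or re-attempt the import) instead of failing at package import time.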
…#1559)

I have refactored the `OpenMLEvaluation` class from a traditional Python class to one using the `@dataclass` decorator, to reduce boilerplate code and improve maintainability.

#### Metadata
* Reference Issue: openml#1540 
* New Tests Added: No
* Documentation Updated: No
* Change Log Entry: Refactored the `OpenMLEvaluation` class to use the `@dataclass`

#### Details 
Edited the `OpenMLEvaluation` class in `openml/evaluations/evaluation.py` to use the `@dataclass` decorator. This significantly reduces boilerplate in the following places:

- Instance Variable Definitions

**Before:**
```python
def __init__(
    self,
    run_id: int,
    task_id: int,
    setup_id: int,
    flow_id: int,
    flow_name: str,
    data_id: int,
    data_name: str,
    function: str,
    upload_time: str,
    uploader: int,
    uploader_name: str,
    value: float | None,
    values: list[float] | None,
    array_data: str | None = None,
):
    self.run_id = run_id
    self.task_id = task_id
    self.setup_id = setup_id
    self.flow_id = flow_id
    self.flow_name = flow_name
    self.data_id = data_id
    self.data_name = data_name
    self.function = function
    self.upload_time = upload_time
    self.uploader = uploader
    self.uploader_name = uploader_name
    self.value = value
    self.values = values
    self.array_data = array_data
```

**After:**
```python
run_id: int
task_id: int
setup_id: int
flow_id: int
flow_name: str
data_id: int
data_name: str
function: str
upload_time: str
uploader: int
uploader_name: str
value: float | None
values: list[float] | None
array_data: str | None = None
```

- `_to_dict` Method Simplification

**Before:**
```python
def _to_dict(self) -> dict:
    return {
        "run_id": self.run_id,
        "task_id": self.task_id,
        "setup_id": self.setup_id,
        "flow_id": self.flow_id,
        "flow_name": self.flow_name,
        "data_id": self.data_id,
        "data_name": self.data_name,
        "function": self.function,
        "upload_time": self.upload_time,
        "uploader": self.uploader,
        "uploader_name": self.uploader_name,
        "value": self.value,
        "values": self.values,
        "array_data": self.array_data,
    }
```

**After:**
```python
def _to_dict(self) -> dict:
    return asdict(self)
```
All tests pass with these changes:

```bash
PS C:\Users\ASUS\Documents\work\opensource\openml-python> pytest tests/test_evaluations/
======================================= test session starts =======================================
platform win32 -- Python 3.14.0, pytest-9.0.2, pluggy-1.6.0
rootdir: C:\Users\ASUS\Documents\work\opensource\openml-python
configfile: pyproject.toml
plugins: anyio-4.12.0, flaky-3.8.1, asyncio-1.3.0, cov-7.0.0, mock-3.15.1, rerunfailures-16.1, timeout-2.4.0, xdist-3.8.0, requests-mock-1.12.1
asyncio: mode=Mode.STRICT, debug=False, asyncio_default_fixture_loop_scope=None, asyncio_default_test_loop_scope=function
collected 13 items                                                                                 

tests\test_evaluations\test_evaluation_functions.py ............                             [ 92%]
tests\test_evaluations\test_evaluations_example.py .                                         [100%]

================================= 13 passed in 274.80s (0:04:34) ================================== 
```
…enml#1566)

#### Metadata
* Reference Issue: Fixes openml#1531
* New Tests Added: No
* Documentation Updated: Yes
* Change Log Entry: Update supported Python version range to 3.10–3.14 and extend CI testing to Python 3.14


#### Details 
This pull request updates the officially supported Python version range for openml-python from 3.8–3.13 to 3.10–3.14, in line with currently supported Python releases.

The following changes were made:

Updated pyproject.toml to reflect the new supported Python range (3.10–3.14).

Extended GitHub Actions CI workflows (test.yml, dist.yaml, docs.yaml) to include Python 3.14.

Updated documentation (README.md) wherever Python version support is mentioned.

No new functionality or tests were introduced; this is a maintenance update to keep Python version support and CI configuration up to date.

This change ensures that users and contributors can use and test openml-python on the latest supported Python versions.
Fixes openml#1598

This PR adds the `@pytest.mark.uses_test_server()` marker to tests that depend on the OpenML test server.

Changes
* Added the `uses_test_server` marker to the relevant test sets.
* Replaced all `server` markers with the `uses_test_server` marker.
* Removed all `@pytest.mark.xfail(reason="failures_issue_1544", strict=False)` markers where the failure was due to race conditions or server connectivity.
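For reference, a sketch of how a custom marker like `uses_test_server` is typically registered and applied; the registration details in openml-python's own `conftest.py` may differ, and the test shown is hypothetical:

```python
import pytest


# In conftest.py: register the marker so `pytest --strict-markers`
# accepts it and `pytest --markers` documents it.
def pytest_configure(config):
    config.addinivalue_line(
        "markers", "uses_test_server: test talks to the OpenML test server"
    )


# In a test module: tag tests that need https://test.openml.org.
@pytest.mark.uses_test_server()
def test_upload_dataset():
    ...  # would contact the OpenML test server
```

Offline runs can then skip every server-dependent test with `pytest -m "not uses_test_server"`.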


Development

Successfully merging this pull request may close these issues.

Add __repr__ Methods to OpenMLSplit

8 participants