Dataset/upload #209
base: main
Some of those aspects are not server-defined; they simply live outside the dataset table and are not expected during upload.
Important: Review skipped. Draft detected.
Reviewer's Guide
This PR reorganizes dataset metadata schemas by introducing optional fields and a dedicated view model, implements a new upload endpoint with authentication and parquet validation, updates retrieval and conversion logic to use the new view, adjusts tests accordingly, and adds the multipart dependency.
Sequence diagram for the new dataset upload endpoint with authentication and file validation
sequenceDiagram
actor User
participant API as /datasets (upload_data)
participant Auth as fetch_user
User->>API: POST /datasets (file, metadata)
API->>Auth: fetch_user
Auth-->>API: User or None
alt User is None
API-->>User: 401 Unauthorized
else User is authenticated
API->>API: Check file extension == .pq
alt Not .pq
API-->>User: 400 Bad Request
else Valid parquet file
API->>API: (TODO: async file handling)
API-->>User: 200 OK (or success response)
end
end
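The branching in the sequence diagram above can be sketched in plain Python. This is only an illustration of the decision order (authentication first, then the `.pq` extension check), not the PR's actual handler: the `User` class and the `upload_status` function are hypothetical names, and the real endpoint presumably lives in a web framework route rather than a bare function.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class User:
    """Hypothetical stand-in for whatever fetch_user returns."""
    user_id: int


def upload_status(user: Optional[User], filename: str) -> int:
    """Return the HTTP status the diagram assigns to an upload request."""
    # Authentication is checked before anything about the file.
    if user is None:
        return 401  # Unauthorized
    # The diagram only checks the extension, not the file contents.
    if Path(filename).suffix != ".pq":
        return 400  # Bad Request
    # Actual file handling is marked TODO in the diagram.
    return 200
```

Note that an extension check alone does not prove the payload is valid parquet; the diagram leaves file handling as a TODO.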
Class diagram for updated DatasetMetadata and new DatasetMetadataView
classDiagram
class DatasetMetadata {
+str name
+str licence
+int version
+str|None version_label
+str|None language
+list~str~ creators
+list~str~ contributors
+str|None citation
+HttpUrl|None paper_url
+str|None collection_date
+str description
+list~str~ default_target_attribute
+list~str~ ignore_attribute
+list~str~ row_id_attribute
+DatasetFileFormat format_
+list~HttpUrl~|None original_data_url
}
class DatasetMetadataView {
+int id_
+Visibility visibility
+DatasetStatus status
+int description_version
+list~str~ tags
+datetime upload_date
+str|None processing_error
+str|None processing_warning
+int file_id
+HttpUrl url
+HttpUrl|None parquet_url
+str md5_checksum
}
DatasetMetadataView --|> DatasetMetadata
Class diagram for updated openml_dataset_to_dcat conversion
classDiagram
class openml_dataset_to_dcat {
+DcatApWrapper openml_dataset_to_dcat(DatasetMetadataView metadata)
}
File-Level Changes
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #209 +/- ##
=======================================
Coverage ? 77.74%
=======================================
Files ? 51
Lines ? 1865
Branches ? 146
=======================================
Hits ? 1450
Misses ? 379
Partials ? 36
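The figures in the coverage table are internally consistent, which the short check below illustrates. It assumes coverage is computed as hits divided by total lines and that the percentage is rounded down to two decimals; the rounding mode is inferred from the numbers and is not stated in the report. The function name `coverage_percent` is hypothetical.

```python
import math

# Values copied from the coverage table.
hits, misses, partials = 1450, 379, 36
lines = hits + misses + partials  # should equal the reported line count


def coverage_percent(hits: int, lines: int) -> float:
    """Hits over lines as a percentage, rounded down to two decimals."""
    return math.floor(hits / lines * 10_000) / 100
```

With these inputs the line total comes to 1865 and the percentage to 77.74, matching the table; rounding to nearest would give 77.75 instead.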
Summary by Sourcery
Introduce a dataset upload endpoint with parquet validation, refactor the OpenML dataset metadata into separate request and view models with additional fields, update the GET handler to return the new view model, and update tests and dependencies accordingly
New Features:
Enhancements:
Build:
Tests:
Chores: