-
-
Notifications
You must be signed in to change notification settings - Fork 13
feat: Migrate from URI-encoded API keys to Authorization header #210
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
feat: Migrate from URI-encoded API keys to Authorization header #210
Conversation
- Replace query parameter-based API key authentication with Authorization header
- Implement FastAPI's APIKeyHeader security scheme in dependencies.py
- Update fetch_user() to extract API key from Authorization header using Depends(api_key_header)
- Update all test files to use headers={'Authorization': key} instead of ?api_key= query params
- Set auto_error=False to maintain backward compatibility for public endpoints
This change improves security by preventing API keys from appearing in:
- Server logs
- Browser history
- Proxy caches
- Referrer headers
Follows FastAPI security best practices and RESTful API standards.
Reviewer's GuideThis PR refactors API authentication to leverage FastAPI’s APIKeyHeader mechanism, migrating from URI-encoded API keys in query parameters to Authorization headers and updating dependant router logic and tests accordingly. Sequence diagram for API key authentication via Authorization headersequenceDiagram
participant Client
participant API
participant "fetch_user()"
Client->>API: GET /datasets/list (with Authorization header)
API->>"fetch_user()": Depends(api_key_header)
"fetch_user()"->>API: Returns User or None
API-->>Client: Response
Class diagram for updated fetch_user dependency injectionclassDiagram
class APIKeyHeader {
+name: str
+auto_error: bool
}
class fetch_user {
+api_key: APIKey | None (Injected via Depends(api_key_header))
+user_data: Connection (Injected via Depends(userdb_connection))
+return: User | None
}
APIKeyHeader <.. fetch_user: injects api_key
File-Level Changes
Possibly linked issues
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
|
Note Other AI code review bot(s) detectedCodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review. WalkthroughMigrates API authentication in tests from URL query parameters to the HTTP Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant Client
participant Tests
participant Router
participant APIKeyHeader as api_key_header
participant FetchUser as fetch_user
participant DB as userdb_connection
participant User as User.fetch
Client->>Tests: send test request with Authorization header
Tests->>Router: HTTP request (Authorization header)
Router->>APIKeyHeader: invoke Depends(api_key_header)
APIKeyHeader-->>Router: APIKey | None
Router->>FetchUser: call fetch_user(api_key=APIKey)
FetchUser->>DB: invoke Depends(userdb_connection)
DB-->>FetchUser: Connection (user_data)
FetchUser->>User: User.fetch(APIKey, user_data)
User-->>FetchUser: User | None
FetchUser-->>Router: return User | None
Router-->>Tests: HTTP response
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Tip 📝 Customizable high-level summaries are now available in beta!You can now customize how CodeRabbit generates the high-level summary in your pull requests — including its content, structure, tone, and formatting.
Example instruction:
Note: This feature is currently in beta for Pro-tier users, and pricing will be announced later. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey there - I've reviewed your changes - here's some feedback:
- Consider temporarily supporting the old api_key query parameter alongside the new Authorization header (e.g., via Depends) and logging a deprecation warning to smooth client migration.
- Enforce or validate a standard Authorization header format (such as 'Bearer ') to align with common API conventions and avoid ambiguous tokens.
- Clean up leftover query‐param logic (e.g. unused api_key_query variables in migration tests) to keep the codebase concise and clear.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- Consider temporarily supporting the old api_key query parameter alongside the new Authorization header (e.g., via Depends) and logging a deprecation warning to smooth client migration.
- Enforce or validate a standard Authorization header format (such as 'Bearer <token>') to align with common API conventions and avoid ambiguous tokens.
- Clean up leftover query‐param logic (e.g. unused api_key_query variables in migration tests) to keep the codebase concise and clear.
## Individual Comments
### Comment 1
<location> `tests/routers/openml/datasets_list_datasets_test.py:55-63` </location>
<code_context>
)
def test_list_accounts_privacy(api_key: ApiKey | None, amount: int, py_api: TestClient) -> None:
- key = f"?api_key={api_key}" if api_key else ""
+ headers = {"Authorization": api_key} if api_key else {}
response = py_api.post(
- f"/datasets/list{key}",
</code_context>
<issue_to_address>
**suggestion (testing):** Consider adding tests for malformed or missing Authorization headers.
Tests should also cover cases like empty, incorrectly formatted, or multiple Authorization headers to verify error handling and security.
```suggestion
def test_list_accounts_privacy(api_key: ApiKey | None, amount: int, py_api: TestClient) -> None:
headers = {"Authorization": api_key} if api_key else {}
response = py_api.post(
"/datasets/list",
json={"status": "all", "pagination": {"limit": 1000}},
headers=headers,
)
assert response.status_code == HTTPStatus.OK, response.json()
assert len(response.json()) == amount
def test_list_accounts_privacy_missing_authorization(py_api: TestClient) -> None:
response = py_api.post(
"/datasets/list",
json={"status": "all", "pagination": {"limit": 1000}},
)
assert response.status_code in (HTTPStatus.UNAUTHORIZED, HTTPStatus.FORBIDDEN)
def test_list_accounts_privacy_empty_authorization(py_api: TestClient) -> None:
headers = {"Authorization": ""}
response = py_api.post(
"/datasets/list",
json={"status": "all", "pagination": {"limit": 1000}},
headers=headers,
)
assert response.status_code in (HTTPStatus.UNAUTHORIZED, HTTPStatus.FORBIDDEN)
def test_list_accounts_privacy_malformed_authorization(py_api: TestClient) -> None:
headers = {"Authorization": "Bearer"} # Missing token
response = py_api.post(
"/datasets/list",
json={"status": "all", "pagination": {"limit": 1000}},
headers=headers,
)
assert response.status_code in (HTTPStatus.UNAUTHORIZED, HTTPStatus.FORBIDDEN)
def test_list_accounts_privacy_incorrect_format_authorization(py_api: TestClient) -> None:
headers = {"Authorization": "Basic not_a_token"}
response = py_api.post(
"/datasets/list",
json={"status": "all", "pagination": {"limit": 1000}},
headers=headers,
)
assert response.status_code in (HTTPStatus.UNAUTHORIZED, HTTPStatus.FORBIDDEN)
def test_list_accounts_privacy_multiple_authorization_headers(py_api: TestClient) -> None:
# TestClient does not support multiple headers with the same name directly,
# but we can simulate by passing a comma-separated value.
headers = {"Authorization": "Bearer valid_token, Bearer another_token"}
response = py_api.post(
"/datasets/list",
json={"status": "all", "pagination": {"limit": 1000}},
headers=headers,
)
assert response.status_code in (HTTPStatus.UNAUTHORIZED, HTTPStatus.FORBIDDEN)
```
</issue_to_address>
### Comment 2
<location> `tests/routers/openml/dataset_tag_test.py:52` </location>
<code_context>
- f"/datasets/tag?api_key={ApiKey.ADMIN}",
+ "/datasets/tag",
json={"data_id": dataset_id, "tag": tag},
+ headers={"Authorization": ApiKey.ADMIN},
)
assert response.status_code == HTTPStatus.OK
</code_context>
<issue_to_address>
**suggestion (testing):** Test for missing Authorization header is present, but consider testing for extra/invalid headers.
Consider adding a test where the Authorization header contains an unexpected or malformed value to verify the endpoint correctly rejects such requests.
</issue_to_address>
### Comment 3
<location> `tests/routers/openml/migration/datasets_migration_test.py:113` </location>
<code_context>
)
def test_list_accounts_privacy(api_key: ApiKey | None, amount: int, py_api: TestClient) -> None:
- key = f"?api_key={api_key}" if api_key else ""
+ headers = {"Authorization": api_key} if api_key else {}
response = py_api.post(
- f"/datasets/list{key}",
</code_context>
<issue_to_address>
**suggestion (testing):** Consider adding a test for both query parameter and header present.
Add a test case with both the query parameter and Authorization header to confirm the correct precedence and maintain backward compatibility.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| def test_list_accounts_privacy(api_key: ApiKey | None, amount: int, py_api: TestClient) -> None: | ||
| key = f"?api_key={api_key}" if api_key else "" | ||
| headers = {"Authorization": api_key} if api_key else {} | ||
| response = py_api.post( | ||
| f"/datasets/list{key}", | ||
| "/datasets/list", | ||
| json={"status": "all", "pagination": {"limit": 1000}}, | ||
| headers=headers, | ||
| ) | ||
| assert response.status_code == HTTPStatus.OK, response.json() | ||
| assert len(response.json()) == amount |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion (testing): Consider adding tests for malformed or missing Authorization headers.
Tests should also cover cases like empty, incorrectly formatted, or multiple Authorization headers to verify error handling and security.
| def test_list_accounts_privacy(api_key: ApiKey | None, amount: int, py_api: TestClient) -> None: | |
| key = f"?api_key={api_key}" if api_key else "" | |
| headers = {"Authorization": api_key} if api_key else {} | |
| response = py_api.post( | |
| f"/datasets/list{key}", | |
| "/datasets/list", | |
| json={"status": "all", "pagination": {"limit": 1000}}, | |
| headers=headers, | |
| ) | |
| assert response.status_code == HTTPStatus.OK, response.json() | |
| assert len(response.json()) == amount | |
| def test_list_accounts_privacy(api_key: ApiKey | None, amount: int, py_api: TestClient) -> None: | |
| headers = {"Authorization": api_key} if api_key else {} | |
| response = py_api.post( | |
| "/datasets/list", | |
| json={"status": "all", "pagination": {"limit": 1000}}, | |
| headers=headers, | |
| ) | |
| assert response.status_code == HTTPStatus.OK, response.json() | |
| assert len(response.json()) == amount | |
| def test_list_accounts_privacy_missing_authorization(py_api: TestClient) -> None: | |
| response = py_api.post( | |
| "/datasets/list", | |
| json={"status": "all", "pagination": {"limit": 1000}}, | |
| ) | |
| assert response.status_code in (HTTPStatus.UNAUTHORIZED, HTTPStatus.FORBIDDEN) | |
| def test_list_accounts_privacy_empty_authorization(py_api: TestClient) -> None: | |
| headers = {"Authorization": ""} | |
| response = py_api.post( | |
| "/datasets/list", | |
| json={"status": "all", "pagination": {"limit": 1000}}, | |
| headers=headers, | |
| ) | |
| assert response.status_code in (HTTPStatus.UNAUTHORIZED, HTTPStatus.FORBIDDEN) | |
| def test_list_accounts_privacy_malformed_authorization(py_api: TestClient) -> None: | |
| headers = {"Authorization": "Bearer"} # Missing token | |
| response = py_api.post( | |
| "/datasets/list", | |
| json={"status": "all", "pagination": {"limit": 1000}}, | |
| headers=headers, | |
| ) | |
| assert response.status_code in (HTTPStatus.UNAUTHORIZED, HTTPStatus.FORBIDDEN) | |
| def test_list_accounts_privacy_incorrect_format_authorization(py_api: TestClient) -> None: | |
| headers = {"Authorization": "Basic not_a_token"} | |
| response = py_api.post( | |
| "/datasets/list", | |
| json={"status": "all", "pagination": {"limit": 1000}}, | |
| headers=headers, | |
| ) | |
| assert response.status_code in (HTTPStatus.UNAUTHORIZED, HTTPStatus.FORBIDDEN) | |
| def test_list_accounts_privacy_multiple_authorization_headers(py_api: TestClient) -> None: | |
| # TestClient does not support multiple headers with the same name directly, | |
| # but we can simulate by passing a comma-separated value. | |
| headers = {"Authorization": "Bearer valid_token, Bearer another_token"} | |
| response = py_api.post( | |
| "/datasets/list", | |
| json={"status": "all", "pagination": {"limit": 1000}}, | |
| headers=headers, | |
| ) | |
| assert response.status_code in (HTTPStatus.UNAUTHORIZED, HTTPStatus.FORBIDDEN) |
| f"/datasets/tag?api_key={ApiKey.ADMIN}", | ||
| "/datasets/tag", | ||
| json={"data_id": dataset_id, "tag": tag}, | ||
| headers={"Authorization": ApiKey.ADMIN}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion (testing): Test for missing Authorization header is present, but consider testing for extra/invalid headers.
Consider adding a test where the Authorization header contains an unexpected or malformed value to verify the endpoint correctly rejects such requests.
| ) -> None: | ||
| query = f"?api_key={api_key}" if api_key else "" | ||
| response = py_api.get(f"/datasets/130{query}") | ||
| headers = {"Authorization": api_key} if api_key else {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion (testing): Consider adding a test for both query parameter and header present.
Add a test case with both the query parameter and Authorization header to confirm the correct precedence and maintain backward compatibility.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
src/routers/dependencies.py(2 hunks)tests/routers/openml/dataset_tag_test.py(5 hunks)tests/routers/openml/datasets_list_datasets_test.py(9 hunks)tests/routers/openml/datasets_test.py(3 hunks)tests/routers/openml/migration/datasets_migration_test.py(3 hunks)tests/routers/openml/study_test.py(3 hunks)
🧰 Additional context used
🧬 Code graph analysis (5)
tests/routers/openml/datasets_test.py (1)
tests/conftest.py (1)
py_api(48-53)
tests/routers/openml/dataset_tag_test.py (3)
tests/conftest.py (1)
py_api(48-53)src/database/datasets.py (1)
tag(53-66)tests/users.py (1)
ApiKey(11-15)
src/routers/dependencies.py (1)
src/database/users.py (1)
User(47-63)
tests/routers/openml/migration/datasets_migration_test.py (1)
tests/conftest.py (1)
py_api(48-53)
tests/routers/openml/datasets_list_datasets_test.py (2)
tests/conftest.py (1)
py_api(48-53)tests/users.py (1)
ApiKey(11-15)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Sourcery review
🔇 Additional comments (14)
tests/routers/openml/study_test.py (2)
460-471: Test migration to header-based authentication looks good.The test correctly migrates from query parameter authentication (
?api_key=...) to Authorization header (headers={"Authorization": ...}), maintaining the same test logic and assertions.
515-519: Helper function properly updated for header-based authentication.The
_attach_tasks_to_studyhelper correctly usesheaders={"Authorization": api_key}instead of query parameters, ensuring consistent authentication across all study-related tests.tests/routers/openml/dataset_tag_test.py (2)
17-25: Proper handling of authentication states in tests.The test correctly handles both authenticated and unauthenticated scenarios by conditionally constructing the headers dictionary. The use of an empty dict for
Nonecases and{"Authorization": key}for authenticated cases is clean and appropriate.
33-44: Comprehensive test coverage maintained with header-based authentication.All dataset tagging tests have been consistently updated to use Authorization headers while maintaining the same test logic, assertions, and coverage for different authentication scenarios (admin, non-owner, owner, invalid keys).
Also applies to: 47-55, 58-73, 81-92
tests/routers/openml/migration/datasets_migration_test.py (3)
109-118: Proper authentication handling for private dataset access tests.The test correctly constructs headers conditionally based on whether an API key is provided, testing both authenticated and unauthenticated access scenarios for private datasets.
125-137: Migration parity test correctly compares old and new authentication methods.The test appropriately demonstrates the API migration by having the Python API use header-based authentication (
headers={"Authorization": api_key}) while the PHP API comparison continues using query parameters (?api_key=...), which is expected for parity testing during the migration period.
154-195: Dataset tag parity test properly updated for header-based authentication.The test maintains comprehensive parity checking between old (PHP with query params) and new (Python with headers) authentication methods, ensuring the migration doesn't break existing functionality.
tests/routers/openml/datasets_test.py (3)
169-171: Private dataset feature access test properly authenticates via headers.The parameterized test correctly uses header-based authentication for accessing features of private datasets, maintaining proper test coverage for admin and owner access scenarios.
190-205: Helper function cleanly refactored for header-based authentication.The
_assert_status_update_is_successfulhelper properly encapsulates the authentication logic usingheaders={"Authorization": apikey}, promoting code reuse across multiple status update test cases.
262-273: Unauthorized status update tests maintain proper authentication verification.The parameterized test correctly passes API keys via Authorization headers, ensuring proper testing of authorization failures for non-admin users attempting dataset status updates.
tests/routers/openml/datasets_list_datasets_test.py (3)
55-63: Privacy-aware listing properly implements header-based authentication.The test correctly handles authentication for different user roles (admin, owner, regular user, anonymous) using conditional headers, verifying that private datasets are filtered appropriately based on user permissions.
70-80: Dataset filtering tests comprehensively updated for header authentication.All dataset list filtering tests (by name, version, uploader, etc.) consistently use
headers={"Authorization": api_key}for authenticated requests, maintaining thorough test coverage across different filter combinations.Also applies to: 87-93, 127-136, 139-145, 156-169
253-305: Property-based parity test properly handles both authentication methods.The hypothesis-driven test correctly separates authentication methods: Python API uses
headers={"Authorization": api_key}while the PHP API comparison uses?api_key=...in the query string, ensuring comprehensive parity validation during the migration period.src/routers/dependencies.py (1)
29-32: Query parameter authentication does not exist in the codebase; the review comment's concern is not applicable.The codebase implements authentication exclusively via the
Authorizationheader usingAPIKeyHeader(name="Authorization", auto_error=False). A comprehensive search reveals:
- No
Queryparameter imports or usage for authentication anywhere- All test cases uniformly use header-based authentication (
headers={"Authorization": api_key})- No commented-out query parameter code or legacy patterns
- No migration logic or fallback mechanisms
Since query parameter authentication was never implemented, there is no backward compatibility concern to address or migration path to document.
Likely an incorrect or invalid review comment.
| from typing import Annotated | ||
|
|
||
| from fastapi import Depends | ||
| from fastapi.security import APIKeyHeader |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
Non-standard Authorization header usage without scheme prefix.
The implementation uses the Authorization header to carry raw API keys without a scheme prefix (e.g., Authorization: <api_key>). This deviates from RFC 7235 and common REST API practices, where the Authorization header should follow the format Authorization: <scheme> <credentials> (e.g., Authorization: Bearer <token> or Authorization: ApiKey <key>).
Recommended alternatives:
- Use a custom header (preferred for API keys):
X-API-KeyorX-Auth-Token - Use a proper scheme:
Authorization: Bearer <api_key>orAuthorization: ApiKey <api_key>
Using a non-standard format may cause confusion for API consumers and compatibility issues with HTTP clients, proxies, or API gateways that expect standard Authorization header formats.
Also applies to: 11-12
🌐 Web query:
What is the standard format for API key authentication in HTTP Authorization headers according to RFC 7235 and REST API best practices?
💡 Result:
Short answer
- RFC 7235 defines the Authorization header as: Authorization: (credentials may be a token68 or auth-params). [1]
- In practice for API keys you should put the key in the Authorization header using a clear auth-scheme token, e.g. Authorization: Bearer (recommended for opaque/JWT tokens per RFC 6750) or Authorization: ApiKey <your_api_key> / Authorization: Api-Key <your_api_key> for a custom scheme. Do not put keys in the URL. [1][2][3][4]
Recommended minimal rules / best practices
- Use the Authorization header (not query string) and always use HTTPS/TLS. [2][3]
- Use a well‑known scheme if available (Bearer for OAuth2 tokens); if you define a custom scheme use a token-style credential (no colon/illegal chars) consistent with RFC 7235. [1][2]
- Make keys short‑lived or rotate them, scope and least‑privilege, log and monitor use, and allow revocation. Avoid storing keys in client-side places that can be leaked. [2][3]
References
- RFC 7235 — HTTP/1.1: Authentication (Authorization header syntax). [1]
- RFC 6750 — OAuth 2.0 Bearer Token Usage (Authorization: Bearer , security recommendations). [2]
- MDN Web Docs — Authorization / WWW-Authenticate header (auth-scheme, token68, auth-param syntax). [3]
- Common REST practice / libraries showing Authorization: Api-Key / ApiKey usage (example implementations). [4]
🏁 Script executed:
#!/bin/bash
# Examine the dependencies.py file to understand the actual implementation
cat src/routers/dependencies.pyLength of output: 976
🏁 Script executed:
#!/bin/bash
# Also search for APIKeyHeader usage patterns in the codebase
rg -n "APIKeyHeader" -A 5 -B 2Length of output: 872
🌐 Web query:
FastAPI APIKeyHeader Authorization header scheme format
💡 Result:
Short answer
- APIKeyHeader just extracts the raw header value (by header name) and returns it as a string — it does NOT parse an "Authorization: Scheme credentials" value. [1]
- If you need an Authorization header with a scheme (e.g. "Authorization: Bearer ") use HTTPBearer / OAuth2 classes (they split scheme vs credentials) or parse the header yourself with get_authorization_scheme_param. [2][3]
Minimal examples
- APIKeyHeader (reads header value directly)
from fastapi import Depends, FastAPI
from fastapi.security import APIKeyHeader
app = FastAPI()
api_key_header = APIKeyHeader(name="X-API-Key")
@app.get("/items/")
async def read_items(key: str = Depends(api_key_header)):
return {"key": key}
(returns the header string exactly as sent). [1]
- Use HTTPBearer to get scheme + credentials
from fastapi.security import HTTPBearer, HTTPAuthorizationCredentials
bearer = HTTPBearer()
@app.get("/secure/")
async def secure(creds: HTTPAuthorizationCredentials = Depends(bearer)):
# creds.scheme == "Bearer", creds.credentials == ""
return {"scheme": creds.scheme, "token": creds.credentials}
(HTTPBearer splits "Authorization: Bearer " for you). [2]
- Manual parsing with get_authorization_scheme_param
from fastapi.security.utils import get_authorization_scheme_param
authorization = request.headers.get("Authorization")
scheme, param = get_authorization_scheme_param(authorization)
if scheme.lower() == "bearer":
token = param
References
[1] FastAPI — APIKeyHeader docs.
[2] FastAPI — HTTP auth / HTTPAuthorizationCredentials (HTTPBearer) docs.
[3] Example usage of get_authorization_scheme_param in FastAPI (parsing Authorization header).
🌐 Web query:
FastAPI APIKeyHeader documentation how to use with Authorization header
💡 Result:
Short answer: use APIKeyHeader with name="Authorization" as a Security dependency (not plain Depends), then parse/strip the "Bearer " prefix yourself. Example:
from fastapi import FastAPI, Security, HTTPException, status
from fastapi.security.api_key import APIKeyHeader
app = FastAPI()
api_key_header = APIKeyHeader(name="Authorization", auto_error=False)
async def get_api_key(authorization: str = Security(api_key_header)):
if not authorization:
raise HTTPException(status_code=status.HTTP_401_UNAUTHORIZED, detail="Missing auth")
# if using Bearer tokens, strip the prefix:
if authorization.startswith("Bearer "):
token = authorization[len("Bearer "):]
else:
token = authorization
# validate token...
return token
@app.get("/protected")
async def protected(api_key: str = Security(get_api_key)):
return {"token": api_key}
Notes:
- Use Security(...) so FastAPI documents the scheme and shows the Authorize button in /docs. [1]
- APIKeyHeader simply extracts the header value — it won’t remove “Bearer ” for you. Use auto_error=False if you need optional auth or multiple auth methods. [1][2]
Sources: FastAPI security docs and community examples. [1] FastAPI security/APIKeyHeader reference. [2] StackOverflow discussion showing APIKeyHeader with Authorization.
Use HTTPBearer or custom header to enforce RFC 7235 Authorization header format.
The current implementation uses APIKeyHeader(name="Authorization") which accepts raw API key values without enforcing a scheme prefix, violating RFC 7235. RFC 7235 requires Authorization headers to follow the format Authorization: <auth-scheme> <credentials> (e.g., Bearer <token> or ApiKey <key>). FastAPI's APIKeyHeader simply extracts the raw header value without scheme parsing or validation.
Recommended fix:
- For Bearer tokens: Use
HTTPBearer()dependency (automatically parses scheme and credentials) - For API keys: Use custom header
X-API-Keyinstead of Authorization - If Authorization header required: Manually parse with
get_authorization_scheme_param()to extract and validate the scheme
🤖 Prompt for AI Agents
In src/routers/dependencies.py around line 4, the code currently uses
APIKeyHeader(name="Authorization") which accepts raw header values and does not
enforce RFC 7235 scheme parsing; replace it with an HTTPBearer() dependency
(importing HTTPBearer and HTTPAuthorizationCredentials) to automatically parse
scheme + credentials and then validate that the scheme is "bearer"
(case-insensitive) before returning the token. If you intend to use API keys
instead, switch the client header to a custom header name like "X-API-Key" and
continue using APIKeyHeader for that header. Alternatively, if you must keep
Authorization, use get_authorization_scheme_param() to manually split and
validate the scheme and credentials and raise HTTPException(401) on
missing/invalid scheme.
Replace direct dict type with lambda factory to satisfy mypy type checker. This fixes the pre-commit.ci mypy check failure.
The mypy hook needs pydantic installed to use the pydantic.mypy plugin, which properly handles Field(default_factory=dict) in Pydantic models.
- Add pydantic>=2.6,<2.7 to mypy additional_dependencies - Ensures consistent behavior between local and CI environments - Allows Field(default_factory=dict) to work with both mypy and ruff-format
…ed JSON - Remove TODO comment and workaround for single input limitation - Add robust JSON parsing with error handling for malformed data - Support both string (JSON-encoded) and already-parsed constraints - Only extract data_type when constraint is a valid dict with the field - Resolves openml#214
8ae991c to
7decfe1
Compare
Summary
This PR migrates the API authentication from URI-encoded API keys (query parameters) to Authorization headers, following FastAPI security best practices.
Changes
Core Implementation (
src/routers/dependencies.py):APIKeyHeaderfromfastapi.securityapi_key_headerinstance withname='Authorization'andauto_error=Falsefetch_user()to useDepends(api_key_header)instead of direct query parameterTest Updates:
headers={'Authorization': key}instead of?api_key=query parameterstests/routers/openml/dataset_tag_test.pytests/routers/openml/datasets_test.pytests/routers/openml/datasets_list_datasets_test.pytests/routers/openml/study_test.pytests/routers/openml/migration/datasets_migration_test.pyBenefits
✅ Security: API keys are no longer exposed in URLs (logs, caches, browser history, referrer headers)
✅ Standards Compliant: Follows RESTful API best practices and FastAPI security patterns
✅ Better Documentation: FastAPI automatically documents this in OpenAPI/Swagger UI
✅ Backward Compatible:
auto_error=Falseensures public endpoints continue to workMigration Guide
Before:
After:
Testing
All test files have been updated to use the new authentication method. The database authentication logic remains unchanged - only the transport mechanism has changed from query parameters to headers.
Summary by Sourcery
Migrate API authentication from URI-encoded query parameters to an Authorization header using FastAPI’s security utilities, updating the core dependency and test suite accordingly
Enhancements:
Tests:
Summary by CodeRabbit
New Features
Tests
Chores
Style
fixes #203 #214
#esoc2025