-
Notifications
You must be signed in to change notification settings - Fork 36
Pypi installable Client lib for common trainer functions #256
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Found a small typing issue that I think should be addressed.
The rest are more minor gripes:
- The trusted publisher release option without passwords / secrets
- The naming of the objects when they're not the objects themselves but rather descriptors / references / records of the underlying data
- The dataclass instances can refer to both server side and client side files and it's not always clear which one you're dealing with
- The README/docs duplication
| cd client | ||
| python -m build | ||
| - name: Publish dev distribution to Test PyPI |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could also use Trusted Publisher to avoid needing a password/secret. Like v2 publish. Though notably, there were some permissions that were also needed.
Not a must have, of course.
|
|
||
| @dataclass | ||
| class MCTObj(ABC): | ||
| id: str=None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you had mypy, it would complain about the typing. But I suppose it's not supposed to be None. So not necessarily an issue.
EDIT:
Same goes for the following dataclasses.
|
|
||
|
|
||
| @dataclass | ||
| class MCTObj(ABC): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This class (and the ones that derive from it) don't refer to the objects themselves, but are a descriptor/record of them.
Perhaps MCTRecord or something like that would be clearer?
There's more details on this under MCTDataset.
|
|
||
|
|
||
| @dataclass | ||
| class MCTDataset(MCTObj): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This (and the following derivations as well) isn't really the dataset itself. Rather it's a record / descriptor of the dataset.
So perhaps a name that would make this clearer would be good?
|
|
||
|
|
||
| @dataclass | ||
| class MCTConceptDB(MCTObj): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same naming comment as above
|
|
||
|
|
||
| @dataclass | ||
| class MCTRelTask(MCTObj): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same naming comment as above
|
|
||
|
|
||
| @dataclass | ||
| class MCTProject(MCTObj): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same naming comment as above
| users = json.loads(requests.get(f'{self.server}/api/users/', headers=self.headers).text)['results'] | ||
| return [MCTUser(id=u['id'], username=u['username']) for u in users] | ||
|
|
||
| def get_models(self) -> Tuple[List[str], List[str]]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The return type is wrong here. It's correct in the doc string.
| """ | ||
| cdbs = json.loads(requests.get(f'{self.server}/api/concept-dbs/', headers=self.headers).text)['results'] | ||
| vocabs = json.loads(requests.get(f'{self.server}/api/vocabs/', headers=self.headers).text)['results'] | ||
| mct_cdbs = [MCTConceptDB(id=cdb['id'], name=cdb['name'], conceptdb_file=cdb['cdb_file']) for cdb in cdbs] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Up until now (as of reading the module), the MCTConceptDB (and similar) instances have referred to files on disk local to the client. But now it's referring to files only available on the server.
I feel like there's a risk that someone uses get_models to get the CDB info and then passes it to (e.g) create_medcat_model and that fails due to the file not existing locally while trying to upload.
Perhaps a field in the base class to differentiate between server side and client side objects? I don't know if that makes sense or not.
| @@ -0,0 +1,88 @@ | |||
|
|
|||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is just a copy of the client README. So need to maintain both.
I know we have the same issue with medcat README/docs as well. Just thinking out loud, hoping for a better solution.
| uses: pypa/gh-action-pypi-publish@v1.4.2 | ||
| with: | ||
| password: ${{ secrets.TEST_PYPI_API_TOKEN }} | ||
| repository_url: https://test.pypi.org/legacy/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably need the absolute path for dist here (in with):
packages-dir: client/dist| if: startsWith(github.ref, 'refs/tags') && ! github.event.release.prerelease | ||
| uses: pypa/gh-action-pypi-publish@v1.4.2 | ||
| with: | ||
| password: ${{ secrets.PYPI_API_TOKEN }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably need the absolute path for dist here (in with):
packages-dir: client/dist…lient Pypi installable Client lib for common trainer functions
v simple pypi installable client, that wraps most of the common Trainer API calls.