@Luen Luen commented Apr 11, 2024

Hi team,

I noticed that Scholarly doesn't output the PDF link (only pub_url), so I'm attempting to add this feature.

Description

Edited the fill() method of PublicationParser in publication_parser.py to parse the PDF link from the publication page's HTML.

Example publication with pdf link: https://scholar.google.com/citations?view_op=view_citation&hl=en&user=ynWS968AAAAJ&citation_for_view=ynWS968AAAAJ:8xutWZnSdmoC
Screenshot of Google Scholar publication with PDF link
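
For readers who want a feel for the kind of HTML parsing described above, here is a minimal standalone sketch. It is not the code from this PR: it uses only the standard library's `html.parser`, the container class name `gsc_oci_title_ggi` is an assumption about the page markup, and `PdfLinkParser`, `extract_pdf_url`, and the `SAMPLE` fragment are hypothetical names and data made up for illustration.

```python
from html.parser import HTMLParser


class PdfLinkParser(HTMLParser):
    """Collect the first link found inside the (assumed) PDF-link container."""

    # Assumption: class name of the div that wraps the "[PDF]" link.
    CONTAINER_CLASS = "gsc_oci_title_ggi"

    def __init__(self):
        super().__init__()
        self._depth = 0       # > 0 while we are inside the container div
        self.pdf_url = None   # stays None when no PDF link is present

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div":
            classes = (attrs.get("class") or "").split()
            # Track nesting so a nested div does not end the container early.
            if self._depth or self.CONTAINER_CLASS in classes:
                self._depth += 1
        elif tag == "a" and self._depth and self.pdf_url is None:
            self.pdf_url = attrs.get("href")

    def handle_endtag(self, tag):
        if tag == "div" and self._depth:
            self._depth -= 1


def extract_pdf_url(html):
    """Return the PDF link from a publication page fragment, or None."""
    parser = PdfLinkParser()
    parser.feed(html)
    return parser.pdf_url


# Hypothetical fragment: one PDF link and one ordinary title link.
SAMPLE = """
<div id="gsc_oci_title_wrapper">
  <div class="gsc_oci_title_ggi"><a href="https://example.org/paper.pdf">[PDF] example.org</a></div>
  <div id="gsc_oci_title"><a href="https://example.org/abstract">Some title</a></div>
</div>
"""

print(extract_pdf_url(SAMPLE))
```

Returning `None` when the container is missing keeps the field optional, since not every publication page carries a PDF link.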

Checklist

  • Check that the base branch is set to develop and not main.
  • Ensure that the documentation will be consistent with the code upon merging.
  • Add a line or a few lines that check the new features added.
  • Ensure that unit tests pass.
    If you don't have a premium proxy, some of the tests will be skipped.
    The tests that are run should pass without raising
    MaxTriesExceededException or other exceptions.

@TuTouPower

Good job, bro. Using "eprint_url" instead of "pdf_url" would be more suitable.

@Luen Luen marked this pull request as ready for review December 12, 2024 02:31
Collaborator

@arunkannawadi arunkannawadi left a comment

Thank you for this PR and for expanding the unit tests to test this.

@arunkannawadi arunkannawadi merged commit ae36174 into scholarly-python-package:develop Apr 28, 2025