Clearly define behaviour: leading and trailing whitespace in gazetteer entries #26

@johann-petrak

The ANNIE DefaultGazetteer accepts trailing (and leading?) spaces as part of a "word".
I think we simply trim leading and trailing whitespace from each entry.

This should be clearly defined and justified:

  • Would we ever want to match a space token before or after a token as part of an entry?

More generally, the matching between space tokens and whitespace should be more clearly defined.
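
To make the two candidate behaviours concrete, here is a minimal sketch, not the actual implementation. The function name `entry_to_tokens` and the `trim` flag are hypothetical, and the assumption that every run of whitespace becomes a single space token is mine, not something defined by this project or by ANNIE.

```python
import re
from typing import List


def entry_to_tokens(entry: str, trim: bool = True) -> List[str]:
    """Split a gazetteer entry into word tokens and space tokens.

    Hypothetical helper for illustration only. Assumes each run of
    whitespace is one space token and each run of non-whitespace is
    one word token.
    """
    if trim:
        # Candidate behaviour A: leading/trailing whitespace is never part of an entry.
        entry = entry.strip()
    # Candidate behaviour B (trim=False): edge whitespace survives as space tokens,
    # so the entry would only match where a space token is adjacent in the document.
    return re.findall(r"\s+|\S+", entry)


print(entry_to_tokens(" New York "))              # trimmed
print(entry_to_tokens(" New York ", trim=False))  # edge whitespace kept
```

With `trim=True` the entry `" New York "` yields the tokens `New`, one space token, `York`. With `trim=False` it additionally starts and ends with a space token, which is the case the question above asks about.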
