@missinglink

Branched from stopword-classification; requires #179 to be merged first.

This PR is an attempt at solving #178 with minimal changes.

However, it also seems to solve a longstanding and more widespread issue, namely input with a space between the HouseNumber and the UnitNumber:

node bin/cli.js 10 A Main Street

# before
(0.98) ➜ [ { housenumber: '10' }, { street: 'A Main Street' } ]

# after
(0.86) ➜ [ { housenumber: '10 A' }, { street: 'Main Street' } ]

I'm going to have a think about this before merging. It feels 'too easy', and I'm sure it comes with some pitfalls.

Closes #178

@missinglink requested a review from Joxit on September 9, 2024 10:02
{ region: 'MD' },
{ postcode: '21613' },
{ country: 'USA' }
]], false)

These changes are required because the parser previously returned a single result for these queries and now returns two. The expected result is still the highest-scoring one.
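
To illustrate what "the expected result is still the highest scoring" means when more than one solution comes back, here is a minimal sketch. The `highestScoring` helper, the `{ score, label }` shape and the score values are assumptions made for illustration only; they are not taken from the parser's code or test suite.

```javascript
// Hypothetical helper: return the solution with the greatest score,
// or null when the list is empty. On ties the earlier solution wins.
function highestScoring(solutions) {
  return solutions.reduce(
    (best, s) => (best === null || s.score > best.score ? s : best),
    null
  );
}

// Illustrative data only: two candidate solutions for one query.
const solutions = [
  { score: 0.5, label: 'alternative' },
  { score: 0.9, label: 'expected' }
];

console.log(highestScoring(solutions).label);
```

An assertion written this way only depends on the top-ranked solution, so it keeps passing when additional lower-scoring solutions are returned.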

@missinglink merged commit c1b0cb6 into master on Jun 30, 2025
6 checks passed
@missinglink deleted the french-bis-ter branch on June 30, 2025 10:19
@missinglink

The improved unit parsing in this PR has caused some issues. In some cases the resulting parse, when used in pelias/api, produces zero results:

https://pelias.github.io/compare/#/v1/autocomplete?text=Calle+Principal+20+B

vs.

https://pelias.github.io/compare/#/v1/autocomplete?text=Calle+Principal+20+Ba

@missinglink

We may need to revert this if we can't come up with a workaround. One that comes to mind is to strip non-numeric tokens from the housenumber field in pelias/api before generating the subject field.

It feels like one step forward in support for units and then a step back 😢
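
A rough sketch of the workaround mentioned above, stripping non-numeric tokens from the housenumber before it is used further. The function name `stripNonNumericTokens` is hypothetical and this is not code from pelias/api; where such a step would hook into the generation of the subject field is left open.

```javascript
// Hypothetical helper, not part of pelias/api: split the housenumber on
// whitespace and keep only the tokens made up entirely of digits.
function stripNonNumericTokens(housenumber) {
  return String(housenumber)
    .split(/\s+/)
    .filter((token) => /^\d+$/.test(token))
    .join(' ');
}

console.log(stripNonNumericTokens('20 B')); // prints "20"
console.log(stripNonNumericTokens('10 A')); // prints "10"
```

Note that a token-level filter like this drops a fused value such as '20B' entirely, since that token is not purely numeric, so a real fix would need to decide how to treat that case.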

Successfully merging this pull request may close these issues.

France: bis/ter housenumber suffixes
