Etymology prefixes: we should first replace *long* prefixes #2624

Closed
opened 2026-01-07 11:39:17 +00:00 by Valerio_Bozz · 1 comment

I noticed that when an Italian street is called "Viale XYZ" (example: "Viale Marebello"), the prefix "Viale" is not removed correctly.

The "le" part remains by mistake.

[Screenshot: Etymology map with the Wikidata search box starting with "LE" by mistake]

The same mistake happens with "Piazzale". The prefix "Piazzale" is never removed correctly and "le" always remains.

Exploration

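If I understand correctly, the dataset of prefixes is already OK, since it already includes "Viale" and "Piazzale":

https://source.mapcomplete.org/MapComplete/MapComplete/src/commit/8d628749e13be65f5d77cf25cfb206e2a312b6f3/assets/layers/etymology/etymology.json#L229

https://source.mapcomplete.org/MapComplete/MapComplete/src/commit/8d628749e13be65f5d77cf25cfb206e2a312b6f3/assets/layers/etymology/etymology.json#L218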

So, I think the problem is in the string replacer:

https://source.mapcomplete.org/MapComplete/MapComplete/src/branch/develop/src/UI/InputElement/Validators/WikidataValidator.ts#L156

This replaces strings in their order of appearance in the array. Note that "Via" is listed before "Viale", so "Via" is removed first, leaving a stray "le" behind.
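
To illustrate the suspected behaviour, here is a minimal sketch. It is not the actual MapComplete code: the function `stripFirstMatchingPrefix` and the sample array are hypothetical, and the sketch assumes that the replacer removes the first entry that matches, in array order.

```typescript
// Hypothetical stand-in for the replacer; NOT the real WikidataValidator code.
// Assumption: prefixes are tried in array order and the first match wins.
function stripFirstMatchingPrefix(name: string, prefixes: string[]): string {
    for (const prefix of prefixes) {
        if (name.startsWith(prefix)) {
            // Drop the matched prefix and any surrounding whitespace
            return name.substring(prefix.length).trim()
        }
    }
    return name
}

// Sample data (assumed): "Via" is listed before "Viale", "Piazza" before "Piazzale"
const prefixesInArrayOrder = ["Via", "Viale", "Piazza", "Piazzale"]

console.log(stripFirstMatchingPrefix("Viale Marebello", prefixesInArrayOrder)) // "le Marebello"
console.log(stripFirstMatchingPrefix("Piazzale Roma", prefixesInArrayOrder)) // "le Roma"
```

If the real replacer works along these lines, every prefix that starts with a shorter entry listed before it will be cut short in the same way.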

Proposal

In WikidataValidator.removePostAndPrefixes() we can sort the input arguments prefixesToRemove and postfixesToRemove by string length, longest first, before doing any replacement. A longer entry such as "Viale" is then tried before any shorter entry it starts with, such as "Via", which solves the sub-string problem.
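
A rough sketch of this idea, under stated assumptions: the helpers `longestFirst` and `removePrefix` are hypothetical and only mimic the behaviour described in this issue; they are not the real body of `removePostAndPrefixes()`. Only the parameter name `prefixesToRemove` comes from the existing code.

```typescript
// Returns a copy sorted by descending string length, so that "Viale"
// comes before "Via". The input array is left untouched.
function longestFirst(values: string[]): string[] {
    return [...values].sort((a, b) => b.length - a.length)
}

// Hypothetical simplification of the prefix removal; NOT the real implementation.
function removePrefix(name: string, prefixesToRemove: string[]): string {
    for (const prefix of longestFirst(prefixesToRemove)) {
        if (name.startsWith(prefix)) {
            // Drop the matched prefix and any surrounding whitespace
            return name.substring(prefix.length).trim()
        }
    }
    return name
}

// Sample data (assumed), in the same problematic order as described above
const prefixesToRemove = ["Via", "Viale", "Piazza", "Piazzale"]

console.log(removePrefix("Viale Marebello", prefixesToRemove)) // "Marebello"
console.log(removePrefix("Piazzale Roma", prefixesToRemove)) // "Roma"
console.log(removePrefix("Via Roma", prefixesToRemove)) // "Roma"
```

Sorting a copy keeps the caller's array in its original order. The same ordering could presumably be applied to `postfixesToRemove`, with `endsWith` in place of `startsWith`.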

P.S.

I don't think I will be able to produce a working patch in TypeScript in the coming months. Thanks in advance for any comment or patch :) I hope my troubleshooting was precise enough.

(Thank you for such a beautiful and useful project <3)

Valerio_Bozz changed title from Etymology prefixes: we should first replace *short* prefixes to Etymology prefixes: we should first replace *long* prefixes 2026-01-07 11:42:01 +00:00
pietervdvn referenced this issue from a commit 2026-01-07 22:39:59 +00:00
Owner

Hey,

Thanks for the super-detailed bug report! It made it super-easy to fix.
