|
| 1 | +## Grammar support |
| 2 | + |
| 3 | +Many languages, including all Slavic (Russian, Ukrainian, Polish, Bulgarian, etc.) and Finnic (Finnish, Estonian) languages, have [grammatical case](https://en.wikipedia.org/wiki/Grammatical_case), which OSRM Text Instructions can support too. |
| 4 | +By default, street names are inserted into instructions exactly as they appear in the OSM map, that is, in the [nominative case](https://en.wikipedia.org/wiki/Nominative_case). |
| 5 | +To be grammatically correct, street names should be inflected according to the target language's rules and the instruction context before insertion. |
| 6 | + |
| 7 | +Applying grammatical case is not a simple or obvious task because of the complexity of real-life languages. |
| 8 | +It is hard enough that, for example, all known native Russian navigation systems leave street names out of their spoken route instructions entirely. |
| 9 | + |
| 10 | +Fortunately, street names use a restricted lexicon and follow naming rules, so the task can be solved relatively easily for this particular case. |
| 11 | + |
| 12 | +### Implementation details |
| 13 | + |
| 14 | +A fairly universal and simple solution is to transform street names with a prepared set of regular expressions grouped by the required grammatical case. |
| 15 | +The required grammatical case should be specified directly in the instruction's substitution variables: |
| 16 | + |
| 17 | +- The `{way_name}` and `{rotary_name}` variables in translated instructions should be suffixed with the required grammatical case name after a colon, for example `{way_name:accusative}` |
| 18 | +- The [languages/grammar](languages/grammar/) folder should contain a language-specific JSON file with regular expressions for each supported grammatical case: |
| 19 | +```json |
| 20 | +{ |
| 21 | + "v5": { |
| 22 | + "accusative": [ |
| 23 | + ["^ (\\S+)ая-(\\S+)ая [Уу]лица ", " $1ую-$2ую улицу "], |
| 24 | + ["^ (\\S+)ая [Уу]лица ", " $1ую улицу "], |
| 25 | + ... |
| 26 | +``` |
| 27 | +- All such JSON files should be registered in the common [languages.js](languages.js) |
| 28 | +- The instruction text formatter ([index.js](index.js) in this module) should: |
| 29 | + - check the `{way_name}` and `{rotary_name}` variables for an optional grammatical case after a colon: `{way_name:accusative}` |
| 30 | + - find the appropriate block of regular expressions for the target language and the specified grammatical case |
| 31 | + - call the standard [string replace with regular expression](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/replace) for each expression in the block, passing the result of each call to the next; the first call should surround the original street name with whitespace to make matching words in names a bit simpler. |
| 32 | +- String replacement with regular expressions is available in almost every programming language, so this should not be a problem for other code that uses only the OSRM Text Instructions data. |
| 33 | +- If no regular expression matches the source name (for example, a name from a foreign country), the original name is returned unchanged. This is also the expected behavior of the standard [string replace with regular expression](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/replace). The same behavior is expected when the grammar JSON file, or the grammatical case inside it, is missing. |
| 34 | + |
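The formatter steps listed above can be sketched roughly as follows. This is a minimal illustration, not the module's actual code: `grammars`, `applyGrammar` and `formatInstruction` are hypothetical names, the rule set contains only the two accusative rules from the JSON sample, only `{way_name}` is handled, and the street name _"Садовая улица"_ is just an illustrative input.

```javascript
// Hypothetical in-memory grammar data, mirroring the JSON layout shown above.
// Only the two sample accusative rules are included here.
const grammars = {
    ru: {
        v5: {
            accusative: [
                ['^ (\\S+)ая-(\\S+)ая [Уу]лица ', ' $1ую-$2ую улицу '],
                ['^ (\\S+)ая [Уу]лица ', ' $1ую улицу ']
            ]
        }
    }
};

// Hypothetical helper: applies every rule of the requested case in order,
// feeding the result of each replacement into the next one.
function applyGrammar(grammars, language, version, name, grammarCase) {
    const rules = grammars[language] &&
        grammars[language][version] &&
        grammars[language][version][grammarCase];
    // Missing grammar data or missing case: return the name unchanged.
    if (!rules) return name;

    // Surround the name with whitespace so rules can anchor on word borders.
    let result = ' ' + name + ' ';
    for (const [pattern, replacement] of rules) {
        result = result.replace(new RegExp(pattern), replacement);
    }
    return result.trim();
}

// Hypothetical helper: substitutes {way_name} or {way_name:<case>} in a template.
function formatInstruction(template, wayName, language) {
    return template.replace(/\{way_name(?::(\w+))?\}/g, (match, grammarCase) =>
        grammarCase ? applyGrammar(grammars, language, 'v5', wayName, grammarCase) : wayName
    );
}

console.log(formatInstruction('Поверните налево на {way_name:accusative}', 'Садовая улица', 'ru'));
// prints "Поверните налево на Садовую улицу"
```

A name that matches no rule, such as `Main Street`, passes through unchanged, and so does any name when the language or the case is not present in the grammar data.
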
| 35 | +### Example |
| 36 | + |
| 37 | +The Russian street _"Большая Монетная улица"_ in St Petersburg (roughly _Big Monetary Street_), after processing with the [Russian grammar rules](languages/grammar/ru.json), appears in instructions as follows: |
| 38 | +- _"Turn left onto `{way_name}`"_ => `ru`:_"Поверните налево на `{way_name:accusative}`"_ => _"Поверните налево на Большую Монетную улицу"_ |
| 39 | +- _"Continue onto `{way_name}`"_ => `ru`:_"Продолжите движение по `{way_name:dative}`"_ => _"Продолжите движение по Большой Монетной улице"_ |
| 40 | +- _"Make a U-turn onto `{way_name}` at the end of the road"_ => `ru`:_"Развернитесь в конце `{way_name:genitive}`"_ => _"Развернитесь в конце Большой Монетной улицы"_ |
| 41 | +- _"Make a U-turn onto `{way_name}`"_ => `ru`:_"Развернитесь на `{way_name:prepositional}`"_ => _"Развернитесь на Большой Монетной улице"_ |
| 42 | + |
| 43 | +### Design goals |
| 44 | + |
| 45 | +- __Cross platform__ - uses the same data-driven approach as OSRM Text Instructions |
| 46 | +- __Test suite__ - includes a [prepared test](test/grammar_tests.js) that checks the available expressions automatically and provides an easily extendable pattern for testing language-specific names |
| 47 | +- __Customization__ - can be easily extended to other languages by adding new regular expression blocks to the [grammar support](languages/grammar/) folder and adding the necessary grammatical case labels to `{way_name}` and other variables in translated instructions |
| 48 | + |
| 49 | +### Notes |
| 50 | + |
| 51 | +- The Russian regular expressions are based on the [Garmin Russian TTS voices update](https://github.com/yuryleb/garmin-russian-tts-voices) project; see the [file with regular expressions applied to source text before it is pronounced by TTS](https://github.com/yuryleb/garmin-russian-tts-voices/blob/master/src/Pycckuu__Milena%202.10/RULESET.TXT). |
| 52 | +- There is another module with grammar support, [jquery.i18n](https://github.com/wikimedia/jquery.i18n), but unfortunately its handling of grammatical case is very limited and is designed to work with single words only. |
| 53 | +- It would also be great to get street names in the target language rather than only from the default OSM `name` tag: several multilingual countries provide multiple `name:<lang>` names for streets. But this should be addressed in the [OSRM engine](https://github.com/Project-OSRM/osrm-backend) first. |