Description
After changing the API endpoint in #8 and spending some time analyzing the naming function, I think there is a more mashable way of doing it. I would really like for it to be discussed a bit, if possible:
Proposal
Instead of going split by split in each name and checking if it is valid, and then checking again before returning it, it is possible to get all splits for each given country, join those splits from the first with the ones from the second, and randomly select one of the resulting options.
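A minimal sketch of that idea, assuming the splits for each country are already available as lists. The helper name `pick_mashup` and its parameters are hypothetical and not taken from the PR; the split lists are copied from the Mozambique and Guatemala example further down, and validity filtering is left out to keep it short.

```python
import itertools
import random


def pick_mashup(start_options, end_options, rng=random):
    """Join every start split with every end split and pick one at random.

    Hypothetical helper: no filtering (minimum length, repeated country
    names) is applied here.
    """
    candidates = [
        start + end
        for start, end in itertools.product(start_options, end_options)
    ]
    if not candidates:
        # Nothing to choose from, e.g. a country that yields no splits.
        return None
    return rng.choice(candidates)


starts = ["mo", "moza", "mozambi"]
ends = ["temala", "mala", "la"]
name = pick_mashup(starts, ends, rng=random.Random(0))
```

Passing a seeded `random.Random` instance is only there to make the choice reproducible in tests.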
A few things to point out:
- It still returns `"{country_name} 2"` if both countries are the same one.
- Order still matters, in that the first country will be the start of the new name and the second the end of the new name.
- If the name of any of the countries reappears as a whole, it gets filtered out. This is to remove a case like `Senegal` and `Portugal` returning `Senegal`.
- There were handlers in case the country had a comma or sections between parentheses (all name differences between old and new version can be seen here), but with the new API there are only two occurrences of this:
  - The only country with parentheses seems to be `Cocos (Keeling) Island`. In that case the parentheses are removed and Keeling is used as part of the name.
  - The only country with a comma is `Saint Helena, Ascension and Tristan da Cunha`, where the comma does not interfere with the naming.
- A multi-word country is no longer split and returned as before. Before, a combination like `Trinidad and Tobago` and `Jamaica` would yield `Trinidad Jamaica`. Now, it would be something along the lines of `Trinidad and Tomaica` (see the examples with a single and a multi-word country). This can still be reverted if preferred.
- Speaking of the length, I set two different minimum lengths:
  - If at least one country is a single word: `(length of the LONGEST country selected) - 1`
  - If both are multi-word countries: `(length of the SHORTEST country selected) + 1`
- The only country that I found that does not get splits seems to be `Niue`, so in that case I return it directly. This would also remove the check for it in the `checkMashupValid` function in `formatsvg.py`.
- This will probably result in the generation of similar countries with the exact same flag, as I don't see that flags are randomized. As an example, all the countries in the examples would have the same flag generated.
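The minimum-length and filtering rules above can be sketched as follows. This is not the code from the PR: the function names are hypothetical, the splits are passed in as ready-made lists, and three assumptions are made: a country is "multi word" if its name contains a space, lengths are counted in characters including spaces, and "reappears as a whole" means the candidate is exactly equal to one of the two country names.

```python
def minimum_length(first, second):
    """Minimum mashup length under the two rules listed above."""
    names = [first, second]
    if any(" " not in name for name in names):
        # At least one single-word country: longest name minus one.
        return max(len(name) for name in names) - 1
    # Both are multi-word countries: shortest name plus one.
    return min(len(name) for name in names) + 1


def valid_combinations(first, second, start_options, end_options):
    """Join starts with ends, keeping only candidates that pass the rules."""
    first, second = first.lower(), second.lower()
    limit = minimum_length(first, second)
    results = []
    for start in start_options:
        for end in end_options:
            candidate = start + end
            if len(candidate) < limit:
                continue  # too short
            if candidate in (first, second):
                continue  # an original country name came back as a whole
            results.append(candidate)
    return results


combos = valid_combinations(
    "Mozambique", "Guatemala",
    ["mo", "moza", "mozambi"],
    ["temala", "mala", "la"],
)
# combos == ["mozatemala", "mozambitemala", "mozambimala", "mozambila"]
```

Under these assumptions the sketch reproduces the combinations listed in the Mozambique and Guatemala example below.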
Code for this can be found here, with unit tests here.
It can also be manually tested by checking out the code and running
python3 names.py '{country_name}' '{another_country_name}'
Examples
Example with 1 word countries
Using Mozambique and Guatemala
Start options are: mo, moza, mozambi
End options are: temala, mala, la
Combinations are: mozatemala, mozambitemala, mozambimala, mozambila
Example with a single and a multi word country
Using Trinidad and Tobago and Jamaica
Start options are: tri, trini, trinida, trinidad a, trinidad and to, trinidad and toba
End options are: maica, ca
Combinations are: trinidad and tomaica, trinidad and tobamaica, trinidad and tobaca
#################
# Same example but swapping both countries
#################
Using Jamaica and Trinidad and Tobago
Start options are: ja, jamai
End options are: nidad and tobago, dad and tobago, d and tobago, nd tobago, bago, go
Combinations are: janidad and tobago, jamainidad and tobago, jamaidad and tobago
Example with two multi word countries
Using Antigua and Barbuda and Saint Kitts and Nevis
Start options are: a, anti, antigua, antigua a, antigua and ba, antigua and barbu
End options are: nt kitts and nevis, tts and nevis, nd nevis, vis, s
Combinations are: antint kitts and nevis, antiguant kitts and nevis, antiguatts and nevis,
antigua ant kitts and nevis, antigua atts and nevis, antigua and bant kitts and nevis,
antigua and batts and nevis, antigua and band nevis, antigua and barbunt kitts and nevis,
antigua and barbutts and nevis, antigua and barbund nevis, antigua and barbuvis
Controversies?
I believe people will be civil enough to understand that the API used is provided by the internet and the names used are not generated by the bot. But there are cases like the Islas Malvinas, or Falkland Islands, that with the new API are just Falkland Islands. With the code that the bot runs (not with this PR), that name would be taken in as Malvinas Falkland Islands, and on mashups it will show as either `Malvinas Falklands {something_else}` or `{something_else} Islands`.
I am sure there are a few other issues like this, but this is the only one I know of so far.
The solutions I have for this are either
- Disregard, because maybe I am seeing an issue where it doesn't exist
- Manually add 'Malvinas' (and other words as needed)
- Have a local database with names established by us. That way we would only need to retrieve the flags (flagpedia seems a nice resource). The ISO codes are used only for some checks in the code, and could really be omitted
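As a rough illustration of the second and third options, a local table could map the API name to the name we want to use. This is only a sketch: `NAME_OVERRIDES` and `display_name` are hypothetical names, and the single entry mirrors the name the current bot code ends up with.

```python
# Hypothetical local table: name returned by the API -> name used for mashups.
NAME_OVERRIDES = {
    "Falkland Islands": "Malvinas Falkland Islands",
}


def display_name(api_name):
    """Return our own name for a country, falling back to the API name."""
    return NAME_OVERRIDES.get(api_name, api_name)
```

Countries without an entry keep whatever name the API provides, so the table only needs to list the contested cases.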
Just to show, here is an example of how mashups would look with this addition. Without it, it would be the same but with all the Malvinas-related parts removed.
Using Malvinas Falkland Islands and French Polynesia
Start options are: ma, malvi, malvina, malvinas fa, malvinas falkla, malvinas falkland i, malvinas falkland isla
End options are: nch polynesia, lynesia, nesia, sia
Combinations are: malvinch polynesia, malvinanch polynesia, malvinas fanch polynesia, malvinas falynesia, malvinas falklanch polynesia, malvinas falklalynesia, malvinas falklanesia, malvinas falklasia, malvinas falkland inch polynesia, malvinas falkland ilynesia, malvinas falkland inesia, malvinas falkland isia, malvinas falkland islanch polynesia, malvinas falkland islalynesia, malvinas falkland islanesia, malvinas falkland islasia