7

Is there a canonical way in PHP to do what this Java question asks: "Locale: Language name to Country / Language code"?

This question is the inverse of "standard function to translate iso-639 codes to language name?".

Namely, how can I convert a string such as French to the code fr? The mechanism would only need to support source strings in English, and I would like to avoid creating my own conversion list as was given in this answer: https://stackoverflow.com/a/20520458/760706

I am thinking along the lines of Locale::getLocaleCodeForDisplayLanguage("French", "en") which doesn't exist.

1 Comment

Nothing like this exists in PHP as far as I'm aware; you'll have to implement it yourself. Commented Jul 30, 2014 at 16:00

3 Answers

4

Define this function:

function getLocaleCodeForDisplayLanguage($name){
    $languageCodes = array(
    "aa" => "Afar",
    "ab" => "Abkhazian",
    "ae" => "Avestan",
    "af" => "Afrikaans",
    "ak" => "Akan",
    "am" => "Amharic",
    "an" => "Aragonese",
    "ar" => "Arabic",
    "as" => "Assamese",
    "av" => "Avaric",
    "ay" => "Aymara",
    "az" => "Azerbaijani",
    "ba" => "Bashkir",
    "be" => "Belarusian",
    "bg" => "Bulgarian",
    "bh" => "Bihari",
    "bi" => "Bislama",
    "bm" => "Bambara",
    "bn" => "Bengali",
    "bo" => "Tibetan",
    "br" => "Breton",
    "bs" => "Bosnian",
    "ca" => "Catalan",
    "ce" => "Chechen",
    "ch" => "Chamorro",
    "co" => "Corsican",
    "cr" => "Cree",
    "cs" => "Czech",
    "cu" => "Church Slavic",
    "cv" => "Chuvash",
    "cy" => "Welsh",
    "da" => "Danish",
    "de" => "German",
    "dv" => "Divehi",
    "dz" => "Dzongkha",
    "ee" => "Ewe",
    "el" => "Greek",
    "en" => "English",
    "eo" => "Esperanto",
    "es" => "Spanish",
    "et" => "Estonian",
    "eu" => "Basque",
    "fa" => "Persian",
    "ff" => "Fulah",
    "fi" => "Finnish",
    "fj" => "Fijian",
    "fo" => "Faroese",
    "fr" => "French",
    "fy" => "Western Frisian",
    "ga" => "Irish",
    "gd" => "Scottish Gaelic",
    "gl" => "Galician",
    "gn" => "Guarani",
    "gu" => "Gujarati",
    "gv" => "Manx",
    "ha" => "Hausa",
    "he" => "Hebrew",
    "hi" => "Hindi",
    "ho" => "Hiri Motu",
    "hr" => "Croatian",
    "ht" => "Haitian",
    "hu" => "Hungarian",
    "hy" => "Armenian",
    "hz" => "Herero",
    "ia" => "Interlingua (International Auxiliary Language Association)",
    "id" => "Indonesian",
    "ie" => "Interlingue",
    "ig" => "Igbo",
    "ii" => "Sichuan Yi",
    "ik" => "Inupiaq",
    "io" => "Ido",
    "is" => "Icelandic",
    "it" => "Italian",
    "iu" => "Inuktitut",
    "ja" => "Japanese",
    "jv" => "Javanese",
    "ka" => "Georgian",
    "kg" => "Kongo",
    "ki" => "Kikuyu",
    "kj" => "Kwanyama",
    "kk" => "Kazakh",
    "kl" => "Kalaallisut",
    "km" => "Khmer",
    "kn" => "Kannada",
    "ko" => "Korean",
    "kr" => "Kanuri",
    "ks" => "Kashmiri",
    "ku" => "Kurdish",
    "kv" => "Komi",
    "kw" => "Cornish",
    "ky" => "Kirghiz",
    "la" => "Latin",
    "lb" => "Luxembourgish",
    "lg" => "Ganda",
    "li" => "Limburgish",
    "ln" => "Lingala",
    "lo" => "Lao",
    "lt" => "Lithuanian",
    "lu" => "Luba-Katanga",
    "lv" => "Latvian",
    "mg" => "Malagasy",
    "mh" => "Marshallese",
    "mi" => "Maori",
    "mk" => "Macedonian",
    "ml" => "Malayalam",
    "mn" => "Mongolian",
    "mr" => "Marathi",
    "ms" => "Malay",
    "mt" => "Maltese",
    "my" => "Burmese",
    "na" => "Nauru",
    "nb" => "Norwegian Bokmal",
    "nd" => "North Ndebele",
    "ne" => "Nepali",
    "ng" => "Ndonga",
    "nl" => "Dutch",
    "nn" => "Norwegian Nynorsk",
    "no" => "Norwegian",
    "nr" => "South Ndebele",
    "nv" => "Navajo",
    "ny" => "Chichewa",
    "oc" => "Occitan",
    "oj" => "Ojibwa",
    "om" => "Oromo",
    "or" => "Oriya",
    "os" => "Ossetian",
    "pa" => "Panjabi",
    "pi" => "Pali",
    "pl" => "Polish",
    "ps" => "Pashto",
    "pt" => "Portuguese",
    "qu" => "Quechua",
    "rm" => "Raeto-Romance",
    "rn" => "Kirundi",
    "ro" => "Romanian",
    "ru" => "Russian",
    "rw" => "Kinyarwanda",
    "sa" => "Sanskrit",
    "sc" => "Sardinian",
    "sd" => "Sindhi",
    "se" => "Northern Sami",
    "sg" => "Sango",
    "si" => "Sinhala",
    "sk" => "Slovak",
    "sl" => "Slovenian",
    "sm" => "Samoan",
    "sn" => "Shona",
    "so" => "Somali",
    "sq" => "Albanian",
    "sr" => "Serbian",
    "ss" => "Swati",
    "st" => "Southern Sotho",
    "su" => "Sundanese",
    "sv" => "Swedish",
    "sw" => "Swahili",
    "ta" => "Tamil",
    "te" => "Telugu",
    "tg" => "Tajik",
    "th" => "Thai",
    "ti" => "Tigrinya",
    "tk" => "Turkmen",
    "tl" => "Tagalog",
    "tn" => "Tswana",
    "to" => "Tonga",
    "tr" => "Turkish",
    "ts" => "Tsonga",
    "tt" => "Tatar",
    "tw" => "Twi",
    "ty" => "Tahitian",
    "ug" => "Uighur",
    "uk" => "Ukrainian",
    "ur" => "Urdu",
    "uz" => "Uzbek",
    "ve" => "Venda",
    "vi" => "Vietnamese",
    "vo" => "Volapuk",
    "wa" => "Walloon",
    "wo" => "Wolof",
    "xh" => "Xhosa",
    "yi" => "Yiddish",
    "yo" => "Yoruba",
    "za" => "Zhuang",
    "zh" => "Chinese",
    "zu" => "Zulu"
    );
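    // Returns the matching code, or false if the name is not in the list.
    // The comparison is strict and case-sensitive.
    return array_search($name, $languageCodes, true);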
}

Then you just call:

echo getLocaleCodeForDisplayLanguage("French");

and it will print fr.

I found the list of the ISO 639-1 codes on this site.
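
The lookup above is case-sensitive and returns false for unknown names. As a minimal sketch, assuming a case-insensitive match is wanted, the table can be flipped into a name-to-code map. Here `findLanguageCode` is a hypothetical helper name, and the three-entry table is an abbreviated stand-in for the full list:

```php
<?php

// Abbreviated sample of the table above; a real lookup would use the full list.
$languageCodes = array(
    'de' => 'German',
    'fr' => 'French',
    'ja' => 'Japanese',
);

/**
 * Hypothetical helper: case-insensitive name-to-code lookup.
 * Returns the code, or null when the name is not in the table.
 */
function findLanguageCode($name, array $languageCodes)
{
    // array_map() keeps the codes as keys; array_flip() then yields
    // a map of lowercased name => code.
    $byName = array_flip(array_map('strtolower', $languageCodes));
    $key = strtolower(trim($name));
    return isset($byName[$key]) ? $byName[$key] : null;
}

var_dump(findLanguageCode('french', $languageCodes));  // string(2) "fr"
var_dump(findLanguageCode('Klingon', $languageCodes)); // NULL
```

Returning null instead of false is a design choice for this sketch; it keeps "not found" distinct from any valid code when the result is checked with `=== null`.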

1 Comment

I did say I wanted a solution that avoided creating and maintaining my own list, since I know such code would end up getting neglected.

2

You can do this with the PHP intl extension (available on PHP 5.3.2+).

<?php

// Exit only on versions older than 5.3.2; 5.3.2 itself is supported.
if (version_compare(PHP_VERSION, '5.3.2', '<')) {
    exit('The php_intl extension is available on PHP 5.3.2 or later.');
}
if (!class_exists('Locale')) {
    exit('You need to install the php_intl extension.');
}

function getLocaleByDisplayName($displayName, $localeToSearch = 'en') {
    // Get all available locales.
    $allLocales = ResourceBundle::getLocales('');

    // array() rather than [] so the code also parses on PHP 5.3.
    $foundLocales = array();
    foreach ($allLocales as $locale) {
        // Language name of this locale, rendered in the search language.
        $currentName = Locale::getDisplayLanguage($locale, $localeToSearch);
        // Exact match: a prefix comparison would also accept any shorter
        // name that the search string merely begins with.
        if ($currentName === $displayName) {
            $foundLocales[] = $locale;
        }
    }
    return $foundLocales;
}

$locales = getLocaleByDisplayName('Japanese', 'en');
var_dump($locales);
/*
array(2) {
[0]=>
string(2) "ja"
[1]=>
string(5) "ja_JP"
}
*/

$locales = getLocaleByDisplayName('スワヒリ語', 'ja');
var_dump($locales);
/*
array(5) {
[0]=>
string(2) "sw"
[1]=>
string(5) "sw_CD"
[2]=>
string(5) "sw_KE"
[3]=>
string(5) "sw_TZ"
[4]=>
string(5) "sw_UG"
}
*/

Multiple locales can map to the same language name. If you need only one, pick the shortest locale identifier, which is the bare language code (for example, ja rather than ja_JP).
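
Taking the shortest locale identifier, as suggested above, could look like the following minimal sketch. `pickShortestLocale` is a hypothetical helper name, not part of intl, and the input array is copied from the sample output above:

```php
<?php

/**
 * Hypothetical helper: return the shortest locale identifier in the list,
 * or null when the list is empty. On ties, the first one wins.
 */
function pickShortestLocale(array $locales)
{
    $shortest = null;
    foreach ($locales as $locale) {
        if ($shortest === null || strlen($locale) < strlen($shortest)) {
            $shortest = $locale;
        }
    }
    return $shortest;
}

echo pickShortestLocale(array('sw', 'sw_CD', 'sw_KE', 'sw_TZ', 'sw_UG')), "\n"; // prints "sw"
```

The helper works on plain strings, so it can be applied directly to the array returned by getLocaleByDisplayName().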

1 Comment

I think we ended up hard-coding a table since we only had about 8 to deal with (and didn't want country codes too), but this is a good general solution.

1

You can use the thephpleague/iso3166 package from The PHP League, which implements ISO 3166-1 (country code) data. For example, you can use:

$data = (new League\ISO3166\ISO3166)->name('Netherlands');
$data = (new League\ISO3166\ISO3166)->alpha2('NL');
$data = (new League\ISO3166\ISO3166)->alpha3('NLD');
$data = (new League\ISO3166\ISO3166)->numeric('528');
/*
result is:
[
    'name' => 'Netherlands',
    'alpha2' => 'NL',
    'alpha3' => 'NLD',
    'numeric' => '528',
    'currency' => [
        'EUR',
    ]
]
*/
