Fetch all Languages of the World (ISO 639) as a Java Array

Recently I needed a Java array of all languages, each with a unique ID.

ISO 639 turned out to be exactly what I needed. The Library of Congress offers the list in a CSV-like, pipe-delimited format (link).
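
To give an idea of what one row of that file looks like, here is a minimal sketch that splits a single line into named fields. The field order (bibliographic alpha-3 code, terminologic alpha-3 code, alpha-2 code, English name, French name) and the helper name `parseLanguageLine` are assumptions for illustration, so check them against the actual file before relying on them.

```javascript
// Hypothetical helper, not part of the original page.
// Assumed field order:
// alpha-3 (bibliographic) | alpha-3 (terminologic) | alpha-2 | English name | French name
function parseLanguageLine(line) {
  var fields = line.split("|");
  return {
    alpha3b: fields[0],
    // Empty columns are mapped to null so missing codes are easy to spot.
    alpha3t: fields[1] || null,
    alpha2: fields[2] || null,
    english: fields[3],
    french: fields[4]
  };
}

var sample = parseLanguageLine("aar||aa|Afar|afar");
console.log(sample.alpha3b, sample.alpha2, sample.english);
```

A plain `split` like this ignores quoted fields, which is why a real CSV parser is the safer choice for the whole file.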

So I parsed the data and thought someone else might find it useful, either the resulting Java array or the code itself as an example of parsing external data with JavaScript.


To fetch the list I used Ben Alman's JSONP proxy and Ben Nadel's CSVToArray function.
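
The proxy is needed because the browser cannot read a file from another domain directly; the page asks a script on its own server to fetch it instead. As a rough sketch, the request URL can be built like this. The helper name `buildProxyUrl` is made up for illustration, and the proxy path is simply the one used on this page.

```javascript
// Hypothetical helper: builds the request URL for a same-origin proxy script
// that expects the target address in a "url" query parameter.
function buildProxyUrl(proxyPath, targetUrl) {
  // encodeURIComponent escapes ":" and "/" so the target URL survives
  // as a single query parameter value.
  return proxyPath + "?url=" + encodeURIComponent(targetUrl);
}

var proxyUrl = buildProxyUrl(
  "/ba-simple-proxy.php",
  "http://www.loc.gov/standards/iso639-2/ISO-639-2_utf-8.txt"
);
console.log(proxyUrl);
```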


    <script src="http://code.jquery.com/jquery-1.10.1.min.js"></script>
    <!--
      CSVToArray Function Source:
      Blog Entry:  Ask Ben: Parsing CSV Strings With Javascript Exec() Regular Expression Command
      Author:  Ben Nadel / Kinky Solutions
    -->
    <script type="text/javascript">
      // This will parse a delimited string into an array of
      // arrays. The default delimiter is the comma, but this
      // can be overridden in the second argument.
      function CSVToArray( strData, strDelimiter ){
        // Check to see if the delimiter is defined. If not,
        // then default to comma.
        strDelimiter = (strDelimiter || ",");
        // Create a regular expression to parse the CSV values.
        var objPattern = new RegExp(
            // Delimiters.
            "(\\" + strDelimiter + "|\\r?\\n|\\r|^)" +
            // Quoted fields.
            "(?:\"([^\"]*(?:\"\"[^\"]*)*)\"|" +
            // Standard fields.
            "([^\"\\" + strDelimiter + "\\r\\n]*))",
            // Global flag so exec() advances through the string.
            "gi"
        );
        // Create an array to hold our data. Give the array
        // a default empty first row.
        var arrData = [[]];
        // Create an array to hold our individual pattern
        // matching groups.
        var arrMatches = null;
        // Keep looping over the regular expression matches
        // until we can no longer find a match.
        while (arrMatches = objPattern.exec( strData )){
          // Get the delimiter that was found.
          var strMatchedDelimiter = arrMatches[ 1 ];
          // Check to see if the given delimiter has a length
          // (is not the start of string) and if it matches
          // field delimiter. If it does not, then we know
          // that this delimiter is a row delimiter.
          if (
            strMatchedDelimiter.length &&
            (strMatchedDelimiter != strDelimiter)
          ){
            // Since we have reached a new row of data,
            // add an empty row to our data array.
            arrData.push( [] );
          }
          // Now that we have our delimiter out of the way,
          // let's check to see which kind of value we
          // captured (quoted or unquoted).
          if (arrMatches[ 2 ]){
            // We found a quoted value. When we capture
            // this value, unescape any double quotes.
            var strMatchedValue = arrMatches[ 2 ].replace(
              new RegExp( "\"\"", "g" ),
              "\""
            );
          } else {
            // We found a non-quoted value.
            var strMatchedValue = arrMatches[ 3 ];
          }
          // Now that we have our value string, let's add
          // it to the data array.
          arrData[ arrData.length - 1 ].push( strMatchedValue );
        }
        // Return the parsed data.
        return( arrData );
      }
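      // Fetch the file through the proxy and convert it to a Java array literal.
      // Note: this global function shadows the browser's built-in fetch().
      function fetch(url) {
        jQuery.getJSON( "/ba-simple-proxy.php?url=" + encodeURIComponent(url), function(data){
          var js_arr = CSVToArray(data.contents, '|');
          var json = JSON.stringify(js_arr);
          // Turn JSON brackets into Java array braces.
          var java_arr_str = json.replace(/\[/g,"{").replace(/\]/g,"}");
          // Show the result in the textarea below.
          jQuery('#result').val(java_arr_str);
        });
      }
    </script>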
    <input id="url" type="text" size="60" value="http://www.loc.gov/standards/iso639-2/ISO-639-2_utf-8.txt"> <br />
    <input type="button" value="Fetch!" onclick="fetch(jQuery('#url').val()); return false;"> <br />
    <br />    
    <textarea id="result" cols="80" rows="20" readonly="readonly" onClick="this.select();"></textarea>
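
The bracket replacement above works on the whole JSON string, so it would also rewrite any square brackets that happen to appear inside a language name. As an alternative, here is a small sketch that builds the Java literal row by row. The helper name `toJavaArrayLiteral` and the second sample row are made up for illustration, and it assumes JSON string escaping is close enough to Java string escaping for this data.

```javascript
// Hypothetical helper: turns an array of string arrays into a Java
// array initializer such as {{"a","b"},{"c","d"}}.
function toJavaArrayLiteral(rows) {
  var javaRows = rows.map(function (row) {
    var cells = row.map(function (cell) {
      // JSON.stringify adds the quotes and escapes quotes and backslashes.
      return JSON.stringify(String(cell));
    });
    return "{" + cells.join(",") + "}";
  });
  return "{" + javaRows.join(",") + "}";
}

// The second row is invented to show that brackets inside a value survive.
var rows = [
  ["aar", "", "aa", "Afar", "afar"],
  ["xxx", "", "", "Name [old]", "nom"]
];
var literal = toJavaArrayLiteral(rows);
console.log("String[][] LANGUAGES = " + literal + ";");
```

The printed line can then be pasted into a Java source file as the initializer of a `String[][]` field.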
