Showing posts with label Windows. Show all posts
Showing posts with label Windows. Show all posts

30 Jan 2013

Supporting (X)GETTEXT on Windows - From the Ground Up

Background Info

So, you may have fiddled with GETTEXT in your php files or you may have used PoEdit to translate an app or a site. Well, that's pretty much as far as you can go with these unless you want to go further by using XGETTEXT on the command line to produce your own .pot file for distribution (or your own use, come to that).
If you read a previous article: Creating .pot Files in PoEdit for Translators, you'll remember that PoEdit can create these for you - well, it can, but not fully-featured .pot files. PoEdit has been developed to create and edit .po files, not .pot files, which I got from the horse's mouth, author Václav. Slavik. So what's the difference?

.pot vs .po Files

.pot files are meant as templates for producing other localizations and shouldn't be used to create the all important .mo files. W e can use Gnu's (X)GETTEXT.

Get XGETTEXT for Windows

XGETTEXT works off the Unix command line, so it can't be run natively on Windows, but the clever people at Gnu have provided a Windows version here - GnuWin32.
Follow the link, download and install the package.
You may install it anywhere, but don't place it in the root directory. When installed, you may have something like the image on the right.
Congratulations, you've now installed XGETTEXT. Well. Almost.
In order to get ease of use out of XGETTEXT, you'll need to make the executable commands available from everywhere, other than just the gnuwin32/bin/ directory. For this, we need to add a path to our Windows Environment Variables.

Add a Path to gnuwin32

  1. Right-click on Computer (or My Computer) in the Start Menu choose Properties in the displayed shortcut menu.
  2. On the Properties dialog, click on the Advanced Systems Settings link:
  3. Then, press the Environment Variables... button on the Advanced tab.
  4. Then on the next dialog, choose Path in the System Variables box (or use User variables, if you only want to add XGETTEXT for yourself). Then press Edit...
  5. Now add a path to the directory where the gettext executables live. For me it was: C:\Program Files (x86)\gnuwin32\bin, but your installation may be different. You must separate this path from the last existing one with a semi-colon (;):
  6. Ok, now we should be done with setting up XGETTEXT on Windows - we just need to test it, so open up the command prompt (cmd.exe) and lets type:
    C:\> xgettext --help
    You should then see something that ends like this:

XGETTEXT Commands and Options

I've been informed on many a website that the XGETTEXT documentation is excellent. I really don't want to disagree, but unless you're a unix-junkie and have experience in decoding hieroglyphics, you won't find this easy-to-follow documentation. It gave me a nosebleed! Mind you, it may say more about me than the documentation. So, how do you get it to work? Well, it's surprisingly easy if you ignore the --help menu, which by the way, also makes very little sense.
You may think that I'm being a bit unfair, and yes, I suppose I am, since once you understand the various options and the syntax, it does kind of make sense.

Examples of Command Line Statements to Build a (.pot) File

Before I just give a blind example, I'll run through the basic structure:
ElementDescription
Set Directories-D
e.g. -D c:/xampp/htdocs/timetabler/includes/*.php c:/xampp/htdocs/timetabler/mainsite/*.php
This describes a list of all the directories that hold files that contain your gettext strings. Directories are separated by a space.
Adding translator comments -c
A plain -c will add all php comments that precede the gettext strings as 'Notes for Translators'.
-c[tag]
e.g. -cTRANSLATORS
You may add a tag to this which will only extract the comments starting with this string.
You should note that successive comments between this and gettext strings will be included too. The tag option is useful if you have general comments directly preceding your specific translator comments. This prevents them from being included.
Sorting strings by file location -F
This will group gettext strings according to directories.
Search for strings -k[keywordspec]
e.g. -k__
This gets the keywords, like a prefix, for which to search to identify gettext strings, e.g. __ refers to something like echo _('hello'); so the __ means _(...).
Apply encoding --from-code=name
e.g. --from-code=UTF-8
You should use this if your files may contain non-ASCII text.
File language -L langname
e.g. -L php
This helps xgettext to know how to parse the files containing the gettext strings and comments.
Output location -p directory
e.g.1 -p c:/xampp/htdocs/timetabler/lang (absolute path)
e.g.2 -p timetabler/lang (relative reference)
This sets the directory for saving the output .po file. If this is not included, the file will be saved to the relative directory from where the xgettext command is run.
Output filename -o filename
e.g. -o mytemplate
The setting above will create a file called mytemplate.po.
Add filename and line info-n
This generates the #: filename: line lines in the .pot file, which can be displayed in PoEdit by right-clicking the string to be translated.
So armed with the codes above, you should now be able to enter something like the following at the command prompt:
C:\> xgettext -n -c -D c:/xampp/htdocs/diafol/includes/*.php c:/xampp/htdocs/diafol/config/*.php --from-code=UTF-8 -k__ -L php -p c:/xampp/htdocs/diafol/lang/orig -o mytemplate.pot
Breaking down this into parts, we're trying to:
  1. Add filename lines to the output file (-n).
  2. Add translator comments to the output file -c. No particular tag.
  3. GETTEXT strings can be found in files with .php extension in c:/xampp/htdocs/diafol/includes/ and c:/xampp/htdocs/diafol/config/ directories.
  4. Fix the encoding to UTF-8 --from-code=UTF-8
  5. Search the .php files for the _(...) format strings (-k__).
  6. Let XGETTEXT know that you're using input files using the php language (-L php).
  7. Output the new file to the c:/xampp/htdocs/diafol/lang/orig directory.
  8. Name the file mytemplate.pot (-o mytemplate.pot)

Plural Forms

One thing that I have not covered is the 'plural forms' aspect. A simple case could be with regard to something like: '1 click' or '2 clicks', where there are a number of formats depending on how many subjects that you have. This can be quite complicated as different languages will have different plural forms. For example, English is relatively simple: 'click' singular would be applied to just the digit 1, but all others 0,2,3,4,5... as 'clicks', but other languages may have a very different pattern.
So how do we go about treating plural forms properly? It seems that we have to get a bit crafty in our source files, using the ngettext keyword. I have to admit, I struggled to find decent resources on this with php, but with a bit of fiddling, I believe that I managed to find a working method.
If XGETTEXT knows that we're working with php files, it should automatically pick up the ngettext keyword, so DON'T include this as a keyword in the command line. If you do, you may find that you produce a single string entry for translation and not a plural entry.
Here's a trivial example of ngettext() in a source file:
printf(ngettext("%d file", "%d files", $num), $num);
Once you've run your XGETETXT in the command line, open up the new .pot in a text editor and you should see something like this:
#: c:/xampp/htdocs/timetabler/includes/step1.php:20
#, php-format
msgid "%d file"
msgid_plural "%d files"
msgstr[0] ""
msgstr[1] ""
As far as manipulating creating the .pot file via the command line, we're done, but let's look at a better example. The following code is pretty poor with regard to php, but it's just for illustration purposes, with the focus on ngettext() and _().
<?php
//temp display variables
$num= 1;
$username= "diafol";
$email = "diafol@example.com";

//Form label
$user_label = _("Username:");
//Form label
$email_label = _("Email:");
//Form button label
$submit_label = _("Change Email");
//You must include %d as is 
$lastvisit = ngettext("Your last visit was %d day ago", "Your last visit was %d days ago:", $num);
?>
  
<p><?php printf($lastvisit,$num);?></p>   
<form id="profile" name="profile" action="scripts/profile_handler.php" method="post">
 <label for="username"><?php echo $user_label;?></label>
 <input id="username" name="username" value="<?php echo $username;?>" disabled="disabled" />
    <label for="chemail"><?php echo $email_label;?></label>
 <input id="chemail" name="chemail" value="<?php echo $email;?>" />
    <input type="submit" name="submitprofile" id="submitprofile" value="<?php echo $submit_label;?>" />
</form>
This code gives the following form:
If we change the $num variable to:
//temp display variables
$num= 7;
You should see:
So, we can confirm that the printf()/ngettext() is working as expected in the native language. Okay, here comes the magic bit again, lets run our XGETTEXT:
xgettext -n -c -D c:/xampp/htdocs/timetabler/includes/*.php c:/xampp/htdocs/
timetabler/config/*.php --from-code=UTF-8 -k__  -L php -p c:/xampp/htdocs/timetabler/lang -o messages.pot
This should produce a file messages.pot with something like the following content:
# Copyright (C) YEAR THE PACKAGE'S COPYRIGHT HOLDER
# This file is distributed under the same license as the PACKAGE package.
# FIRST AUTHOR <EMAIL@ADDRESS>, YEAR.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: PACKAGE VERSION\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2013-01-30 00:18+0000\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: LANGUAGE <LL@li.org>\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=CHARSET\n"
"Content-Transfer-Encoding: 8bit\n"
"Plural-Forms: nplurals=INTEGER; plural=EXPRESSION;\n"

#. Form label
#: c:/xampp/htdocs/timetabler/includes/step1.php:8
msgid "Username:"
msgstr ""

#. Form label
#: c:/xampp/htdocs/timetabler/includes/step1.php:10
msgid "Email:"
msgstr ""

#. Form button label
#: c:/xampp/htdocs/timetabler/includes/step1.php:12
msgid "Change Email"
msgstr ""

#. You must include %d as is
#: c:/xampp/htdocs/timetabler/includes/step1.php:14
#, php-format
msgid "Your last visit was %d day ago"
msgid_plural "Your last visit was %d days ago:"
msgstr[0] ""
msgstr[1] ""
We can see the 'Notes for Translators' comments with the #.-prefixed lines, the filename: line comments, prefixed by #: and the plural format with the msgid_plural key.
We are now ready to edit the file in order to make it distributable. We should edit certain parts of the messages.pot file directly.

Editing Parts of the .pot file Directly

Open up the messages.pot file in a text editor and apply your info, for example, the default:
# SOME DESCRIPTIVE TITLE.
# Copyright (C) YEAR THE PACKAGE'S COPYRIGHT HOLDER
# This file is distributed under the same license as the PACKAGE package.
# FIRST AUTHOR <EMAIL@ADDRESS>, YEAR.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: PACKAGE VERSION\n"
"Report-Msgid-Bugs-To: \n"
Could be changed to this:
# Translation File for diafolCode TIMETABLER. Download the package from http://www.example.com/downloads).
# Copyright (C) 2013 ALAN DAVIES
# This file is distributed under the same license as the diafolCode TIMETABLER package.
# ALAN DAVIES <diafol@example.com>, 2013.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: diafolCode Timetabler 1.01b\n"
"Report-Msgid-Bugs-To: http://www.diafolcode-bugs.example.com\n"
Finally, we've produced our finished messages.pot file. It is now ready for distribution. Either as a downloadable .pot file, or as part of the whole package, if this is to be distributed.

20 Jan 2013

Updating to a new Version of PHP in XAMPP (Windows)

Still working on on old version of pHp, due to an old install of XAMPP? Hmm, I was too until I realised that some functions just didn't work. In my case it was the DateTime object and its failure to calculate the difference in the number of days between two dates. On further investigation, I realised that I had to have the VC9 build as opposed to the old and knackered VC6. Okay, armed with that info I sought to update pHp. Hmm (again) - didn't seem to be an easy way to do it, although I found a few references, it all looked a bit daunting. I didn't want to get caught up with pHp's binaries, so a I was looking at a fresh install of XAMPP. This actually proved to be far less of a problem than I thought, although it did take a little while. Here goes:
  1. Back up your data folders and virtual hosts file (assuming that XAMPP is installed in root):
    • C:\xampp\htdocs
    • C:\xampp\mysql\data
    • C:\xampp\apache\conf\extra\httpd-vhosts.conf
  2. It may be worth making a backup of your php.ini file too, just to check that you have the same configuration - although, don't just blindly use the old version as there may be differences.
  3. Once you've backed up all the important stuff, uninstall XAMPP. XAMPP comes with its own uninstaller, so use that. It may ask you if you need to keep htdocs and mysql data, but you can now safely answer no to those.
  4. Once uninstalled, hop over to the XAMPP site and download the version of your choice.
  5. Install XAMPP and copy over the htdocs and mysql data files to the new directories and then reinstate your virtual hosts in the httpd-vhosts.conf file. Ensure that the paths in that file are the same as the new installation. For example:
    NameVirtualHost *:80
        <VirtualHost *:80>
            DocumentRoot "C:/xampp/htdocs"
            ServerName localhost
        </VirtualHost>
    
        <VirtualHost *:80>
            DocumentRoot "C:/xampp/htdocs/grapenun"
            ServerName grapenun.local
            <Directory "C:/xampp/htdocs/grapenun">
              Order allow,deny
              Allow from all
            </Directory>
        </VirtualHost>
    
    If your new xampp installation is not at C:/xampp/, then you'll need to change all the references in the file.
  6. Fire up the XAMPP control panel - it may look different to the old panel - and start Apache and MySQL. Be patient - Apache sometimes has a little think about it before telling you it's started!

19 Jan 2013

pHp Localization with GETTEXT on Windows

What's GETTEXT?

Have you, like me built multilingual sites with multiple language files holding strings in multidimensional arrays, only to find synching the adding, editing and deleting items across files to be almost impossible? Well, the answer may be to use GETTEXT. Although this basically does the same thing - create separate files for all your localizations, it does have a number of advantages:
  • you can write pretty much plain text in your html, as opposed to providing a tricky array item, e.g. echo __('this is plain text') vs. echo $lang['plain_text'].
  • localizations are kept in .po and .mo files which can be modified and updated via editors like PoEdit, making the process relatively painless.
  • these editors scan your .php files for __('...') and update the po files for you.
  • if translation strings are not found in the target language, the default language string is used instead - although not ideal, at least you don't have to worry too much if a localization is only 99% complete - it will still display some flavour of text. For this reason, the original texts in the php files are usually written in the main language of your target audience, which may or may not be English.
  • if your localizations are disseminated amongst a number of translators or you have a lot of localizations, you can create .pot files, which you or your users can then use as a template for creating the .po files.
  • .po files can be uploaded to Pootle, a Python-based translation platform
  • It's fast - I mean really fast. Different to arrays, you don't have to load the whole localization!
So, this approach should help you keep a better handle on your localizations and their updating.

How do I Create po/mo Localizations?

There are two main ways to create .po files - either directly from the source files, or from a .pot file. From source would probably be easier in the case of a simple bilingual site, maintained entirely by one person for example. The .pot file, as mentioned, would be best for the outsourced or many languages situation.
Let's look at setting up a page with GETTEXT text in a simple page.
<?php include('includes/config.php'); //contains our gettext setup - see later?>
<!DOCTYPE HTML>
<html>
<head>
<meta charset="utf-8">
<title><?php echo _('This is the page title'); ?></title>
</head>
<body>
<h2><?php echo _('This is the main page title'); ?></h2>
<p><?php echo _('Enter your intro text here'); ?></p>
</body>
</html>
OK, from the above we can see 3 translation strings. If you've downloaded and installed PoEdit, follow these steps to create a translation from source.
  1. Create a directory structure for your localizations:
    That will create localization areas for Welsh (cy_GB) and Italian (it_IT). You could name the localizations directory anything you want, e.g. langs, locales, l10n, etc - it's not important.
  2. Next open up PoEdit and create a new catalog and add some catalog properties:
    The image above shows an imaginary language, Xoravian - you'd enter the name of the language for the localisation. The charsets will usually be set to UTF-8 and the plural forms can be gleaned from following the link near the textbox.
  3. On the second tab, save the basename as '.', then give the paths you need to check for strings in the paths textbox:
  4. To finish the catalog properties, just check the keywords:
    The usual suspects that you may want to use are:
    • __
    • gettext
    If you use Wordpress, you may wish to add _e.
  5. Once you've entered all the properties, save the .po file to the appropriate LC_MESSAGES directory (let's save it as lang.po into the Italian directory):
  6. Next, PoEdit will run an update to grab all the gettext strings in the paths you've entered. If successful a list of strings will appear. As below:
    In addition, your directory, should now contain two new files:

That's it, we're done creating the localization file. Now all that we need to do is enter the translation strings and save.

Need the create from .pot approach? I'll create that as a new post on PoEdit soon, but in short, just create a .po file as above and name it lang.pot. They have the same file formats.

Getting GETTEXT to work in Windows

OK, that last bit looked a bit complicated, but getting GETTEXT to play nicely with Windows is a bit of a nightmare and documentation is very patchy. So, I hope the following is useful and works for you.

In the first code snippet, we came across:

<?php include('includes/config.php'); //contains our gettext setup - see later?>

So here now is the all-important code in the includes/config.php file to get everything to work:

<?php
    /*  
    check url for lang parameter and limit the allowed languages,
    otherwise fallback to default text
     */ 
    if (isset($_GET['lang']) && in_array($_GET['lang'],array('it_IT','cy_GB')){
        $lang = $_GET['lang'];
        putenv("LC_ALL=$lang");
        setlocale(LC_ALL, $lang);
        $fn = 'lang'; //the name of the .po file
        bindtextdomain($fn, "./localization");
        bind_textdomain_codeset($fn, 'UTF-8');
        textdomain($fn);
    }
?>