In an effort to knock off one last major item from Qwicap's "to do" list before version 1.4 is released, I've begun trying to internationalize it. As I currently conceive the problem, this mostly involves removing from the code the error messages that Qwicap automatically adds to a web application's XHTML pages. For example, there's the message that the various numeric input retrieval methods (Qwicap.getInt, Qwicap.getDouble, etc.) add to pages when input is outside the application-defined valid range of values: "The number must be in the range 0 to 100 (inclusive). The number '900' is not in that range." Complicating matters a bit is a standard feature of Qwicap that automatically replaces phrases like "The number" in such messages with the label of the relevant form control, if the web page contains such a label.
I knew that there were internationalization features somewhere in Java (I'm using Java 1.5, by the way), but that's almost all that I knew. (I haven't made a serious effort to internationalize anything since my Macintosh programming days.) So I worked my way through my colleagues until I found one who knew more than I did on this subject, and, after some follow-on Google searches and related reading, I learned that the basic facility in Java for abstracting language-specific text from one's code is the ResourceBundle and, typically, the PropertyResourceBundle subclass, which loads language/nation-specific text from a hierarchy of "properties" files.
So I now understood that, for a start, I needed to set up a properly named and formatted "properties" file for PropertyResourceBundle to discover and load. And the documentation for the PropertyResourceBundle class promptly referred me to the Properties class for information about character encoding issues. The main issue, it turns out, is that the Properties class only supports one character set, ISO-8859-1. Therefore, if you need to represent any characters outside that set, a cumbersome Unicode escape sequence must be used for each and every one of them. I find an internationalization "feature" designed without direct support for non-Latin characters tough to take seriously, and in a language like Java, which uses Unicode natively, such a design beggars belief. Of course, we all have things in our past that we wish we could go back and do differently. Maybe this is one such thing for the Java platform.
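To make the escape-sequence burden concrete, here is a minimal sketch of how a Japanese greeting would have to be stored in a classic "properties" file and read back. The class name, the key "greeting", and the sample text are assumptions made up for illustration; only Properties.load is the real API.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.util.Properties;

class EscapedPropertiesDemo {
    // Parses "properties" text the way Properties.load(InputStream) does:
    // the bytes are read as ISO-8859-1 and Unicode escapes are decoded.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new ByteArrayInputStream(text.getBytes("ISO-8859-1")));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        // Each doubled backslash below is one literal backslash in the file,
        // so the file holds five six-character escapes for five characters.
        String file = "greeting=\\u3053\\u3093\\u306b\\u3061\\u306f\n";
        Properties props = parse(file);
        System.out.println(props.getProperty("greeting").length()); // prints 5
    }
}
```

Five characters of text cost thirty characters in the file, none of them readable by a translator without a conversion tool.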
Fortunately, Java 1.5 added to the Properties class a method for loading "properties" from XML files, and XML can be represented in any character set, since the XML declaration (for example: <?xml version="1.0" encoding="UTF-8"?>) can tell an XML parser how its characters were encoded. It appeared that the problem was solved.
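As a rough sketch of what that looks like in practice, the following loads a UTF-8 XML "properties" document with Properties.loadFromXML. The class name, the key, and the sample text are illustrative assumptions; the DOCTYPE line is the one the Properties documentation asks for.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.util.Properties;

class XmlPropertiesDemo {
    // Loads an XML "properties" document held in a string, encoded as UTF-8.
    static Properties parse(String xml) {
        Properties props = new Properties();
        try {
            props.loadFromXML(new ByteArrayInputStream(xml.getBytes("UTF-8")));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        // In a real file the Japanese text would simply be typed as-is;
        // it is escaped here only to keep this source file plain ASCII.
        String xml =
            "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n" +
            "<!DOCTYPE properties SYSTEM \"http://java.sun.com/dtd/properties.dtd\">\n" +
            "<properties>\n" +
            "  <entry key=\"greeting\">\u3053\u3093\u306b\u3061\u306f</entry>\n" +
            "</properties>\n";
        System.out.println(parse(xml).getProperty("greeting").length());
    }
}
```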
Of course, the problem wasn't really solved, because it turns out that the PropertyResourceBundle class does not support XML "properties" files. The solution to that problem seemed to be creating a subclass of ResourceBundle that does support them. And creating that subclass looked straightforward at first glance: just create appropriate implementations of the abstract methods handleGetObject and getKeys, then declare victory. Unfortunately, doing so is nearly useless, because the static ResourceBundle.getBundle methods, which implement the hierarchical search for language- and/or nation-specific resource bundles and then instantiate the ResourceBundle subclasses needed to represent the hierarchy of potentially applicable bundles, have their choice of subclasses hard-coded into them. So, they can instantiate the built-in ListResourceBundle and PropertyResourceBundle classes, and nothing else.
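The "straightforward at first glance" subclass might look something like the sketch below. It is not Qwicap's actual code; the class name XmlPropertiesBundle and the parse helper are assumptions for illustration, and the sketch ignores parent bundles when enumerating keys.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;
import java.util.Properties;
import java.util.ResourceBundle;

class XmlPropertiesBundle extends ResourceBundle {
    private final Properties props = new Properties();

    XmlPropertiesBundle(InputStream in) throws IOException {
        props.loadFromXML(in);
    }

    // Returning null tells ResourceBundle to consult the parent bundle, if any.
    protected Object handleGetObject(String key) {
        return props.getProperty(key);
    }

    // Simplification: only this bundle's own keys are listed, not the parent's.
    public Enumeration<String> getKeys() {
        List<String> keys = new ArrayList<String>();
        for (Object key : props.keySet()) {
            keys.add((String) key);
        }
        return Collections.enumeration(keys);
    }

    // Hypothetical convenience for trying the class out on an in-memory document.
    static XmlPropertiesBundle parse(String xml) {
        try {
            return new XmlPropertiesBundle(new ByteArrayInputStream(xml.getBytes("UTF-8")));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Constructed by hand like this, the bundle works, but nothing in ResourceBundle.getBundle will ever construct it, which is the problem described above.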
Having come that far, I couldn't admit defeat, so I took the time to completely re-implement the ResourceBundle.getBundle(String, Locale, ClassLoader) method in my subclass. I thought that would finally do the trick, but I was wrong again, because I'd forgotten that static methods can't be overridden; they can only be hidden. That meant the lesser implementations of ResourceBundle.getBundle (getBundle(String) and getBundle(String, Locale)) were still invoking the original implementation of getBundle(String, Locale, ClassLoader), rather than mine. That left me feeling dumb, but creating my own implementations of those lesser getBundle methods would be a piece of cake, and, with all of the original implementations hidden, I would finally have a subclass of ResourceBundle that looked and acted just like a normal ResourceBundle, but which supported XML "properties" files. So that's what I did (mistaking the light that was drawing ever nearer for the daylight at the end of the tunnel).
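The hiding trap is easy to reproduce in miniature. In this sketch the two classes and their lookup methods are invented stand-ins for ResourceBundle and my subclass; they are not the real getBundle code.

```java
class OriginalLookup {
    // Convenience form: statically bound to OriginalLookup.lookup(String, String).
    static String lookup(String baseName) {
        return lookup(baseName, "default");
    }

    static String lookup(String baseName, String locale) {
        return "original:" + baseName + "_" + locale;
    }
}

class CustomLookup extends OriginalLookup {
    // Hides, but does not override, the two-argument method above.
    static String lookup(String baseName, String locale) {
        return "custom:" + baseName + "_" + locale;
    }

    public static void main(String[] args) {
        System.out.println(CustomLookup.lookup("Messages", "fr")); // custom:Messages_fr
        System.out.println(CustomLookup.lookup("Messages"));       // original:Messages_default
    }
}
```

The one-argument call goes through the inherited convenience method, which was compiled against the original two-argument method and never sees the replacement.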
At this point, it should go without saying that that didn't work, which it didn't. My IDE almost immediately pointed out something that I hadn't noticed in the API documentation: the two lesser getBundle methods are marked final, and therefore can't even be hidden by the methods of a subclass. For some reason the primary getBundle method isn't final, but the two little convenience methods that front-end for it are final. Like so many other aspects of my day's dalliance with internationalization, that seems utterly pointless to me, but there it is.
The only good news at that stage was that hiding those two methods didn't really matter, since I was only going to use my subclass of ResourceBundle internally, and I know to use my implementation of the getBundle method when instantiating it. In fact, having already re-written all of the hard parts of the ResourceBundle class well enough for my purposes, I didn't even need to subclass ResourceBundle anymore; I could pretty much just remove the "extends ResourceBundle" phrase from my class' declaration and be done with it.
On the other hand, the main reasons for not implementing my own resource scheme from scratch at the beginning were (1) the belief that by using the familiar mechanism represented by the ResourceBundle class, other developers would have an easier time understanding my code, if the need ever arose, and (2) the hope that somewhere under all that rubbish lay a core of undiscovered, but wonderful, internationalization functionality, from which my code would benefit. There's some sense to the former concern, but the latter appears to be utterly groundless; if there is anything wonderful below, I haven't found it, and I don't see anywhere left for it to hide.
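For a sense of how little is hiding in there, the heart of the hierarchical search is just a list of candidate names, most specific first. The sketch below is a simplification, not the real getBundle algorithm (which, among other things, also falls back to the default locale); the class name and the ".xml" suffix are my own assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class BundleNames {
    // Returns candidate file names for a locale, most specific first,
    // ending with the base bundle that applies to every locale.
    static List<String> candidates(String baseName, Locale locale) {
        List<String> names = new ArrayList<String>();
        names.add(baseName + ".xml");
        String name = baseName;
        String[] parts = { locale.getLanguage(), locale.getCountry(), locale.getVariant() };
        for (String part : parts) {
            if (part.length() == 0) {
                break; // stop at the first missing component
            }
            name = name + "_" + part;
            names.add(0, name + ".xml"); // more specific names go to the front
        }
        return names;
    }

    public static void main(String[] args) {
        // prints [Messages_fr_CA.xml, Messages_fr.xml, Messages.xml]
        System.out.println(candidates("Messages", new Locale("fr", "CA")));
    }
}
```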
The niftiest feature associated with using "properties" files for internationalization that I'm aware of is that the internationalized text can contain MessageFormat patterns. However, that feature is orthogonal to the Properties and ResourceBundle classes, which make no use of MessageFormat, and therefore leave it as an exercise to the developer to make his/her MessageFormat patterns do anything.
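The exercise in question is small. Here is a sketch that applies a MessageFormat pattern to the range error quoted at the top of this post; the pattern string stands in for text that would come out of a resource file, and the class and method names are hypothetical.

```java
import java.text.MessageFormat;
import java.util.Locale;

class RangeMessageDemo {
    // A doubled apostrophe in a MessageFormat pattern produces one literal apostrophe.
    static final String PATTERN =
        "{0} must be in the range {1} to {2} (inclusive). "
        + "The number ''{3}'' is not in that range.";

    // label: the form control's label, or a generic phrase such as "The number".
    // input: the user's raw text, passed as a string so it is echoed back unchanged.
    static String rangeError(String label, int min, int max, String input) {
        MessageFormat format = new MessageFormat(PATTERN, Locale.ENGLISH);
        return format.format(new Object[] {
            label, Integer.valueOf(min), Integer.valueOf(max), input });
    }

    public static void main(String[] args) {
        System.out.println(rangeError("The number", 0, 100, "900"));
    }
}
```

Because the label is just another argument, substituting a form control's label for "The number" needs no special support from the pattern.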
By the way, the documentation for the Properties class says of its loadFromXML method and the XML files that it loads: "the system URI (http://java.sun.com/dtd/properties.dtd) is not accessed when exporting or importing properties; it merely serves as a string to uniquely identify the DTD". This turns out to be somewhere between "misleading" and "wrong"; either that DTD really is accessed, or a local copy of it is used instead, because the rules in that DTD are enforced when loadFromXML reads XML. Which makes the XML support in the Properties class useless for my purpose, because the strings that Qwicap needs to make internationalizable frequently contain elements of XHTML markup. Those XHTML elements are well-formed (in the simple sense that any start tags are matched by corresponding end tags), so they would pose no problems for an XML parser that wasn't trying to enforce the rules in a DTD, but, sadly, that is not the case here.
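A small experiment along these lines shows the behavior. The class name, the key "rangeError", and the sample strings are assumptions for illustration; the sketch only reports whether loadFromXML accepted the document.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.util.Properties;

class XhtmlEntryDemo {
    // Wraps the given text in a one-entry XML "properties" document and
    // reports whether Properties.loadFromXML accepts it.
    static boolean loads(String entryBody) {
        String xml =
            "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n" +
            "<!DOCTYPE properties SYSTEM \"http://java.sun.com/dtd/properties.dtd\">\n" +
            "<properties><entry key=\"rangeError\">" + entryBody + "</entry></properties>\n";
        try {
            new Properties().loadFromXML(new ByteArrayInputStream(xml.getBytes("UTF-8")));
            return true;
        } catch (IOException e) {
            // InvalidPropertiesFormatException is a subclass of IOException.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(loads("The number must be numeric."));
        System.out.println(loads("The number must be <b>numeric</b>."));
    }
}
```

The second document is well-formed XML, yet it is rejected because the DTD allows only text inside an entry element.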
Qwicap's XML engine made implementing my own support for XML "properties" files a trivial matter, but the end result of doing so was that, of the three Java classes that appear to be the cornerstones of Java internationalization (ResourceBundle, Properties, and MessageFormat), I had to re-implement two of them to get an internationalization capability that was useful to my application.
Maybe my requirements are unusual, but they boil down to nothing more than supporting encodings other than the obviously limited ISO-8859-1 for my application's internationalized text, and needing to include XML (specifically XHTML) elements within some of that text. Neither requirement strikes me as remarkable individually, or in combination.
I'm new to Java internationalization, so it's easy to believe I'm missing something. Please set me straight if I am. If I'm not, put me down as lightly stunned by the substantially roll-your-own character of the Java platform's internationalization "solution".