The text encoding FSO uses has always been a bit weird. To actually display special characters like the German umlauts, FSO needed a special font file which had support for these characters. These font files also weren't the same for all localizations, which led to a lot of problems if you wanted to play the game in German but only had the English font files. The TrueType font rendering feature improved that situation a bit by adding the ability to render UTF-8 encoded Unicode strings, but FSO still assumed that every string was in its own special encoding, so no one could use that ability.
That is what these test builds are about. They introduce the ability to let FSO process UTF-8 encoded files and handle them properly in the entire engine.
Test builds for all platforms:
http://swc.fs2downloads.com/builds/test/unicodeSupport/
Pull request:
https://github.com/scp-fs2open/fs2open.github.com/pull/1416
Since this feature changes the way the engine handles text pretty extensively, I chose to introduce a new "Unicode mode" for FSO. In this mode FSO expects all text data it handles to be UTF-8 encoded Unicode strings. It also completely disables support for the old VFNT bitmap fonts.
You can enable Unicode mode by using the $Unicode mode option in the mod table. It has to appear after the location where the engine expects $Window title.
Now that you have enabled Unicode mode, you will probably encounter some issues. The standard retail files are all Latin-1 encoded, and FSO can't read that encoding directly anymore (it expects UTF-8). Since FSO is a nice program, it will automagically detect if a file uses Latin-1 encoding and then convert that data to UTF-8. Because Latin-1 is not the desired data format, FSO will show a warning to let you know that you should really convert the file.
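To give a rough idea of what such a fallback can look like, here is a minimal Python sketch. It is not the engine's actual detection code, which I am not reproducing here; the function name decode_table_text is made up for illustration. The assumption is simply that strict UTF-8 decoding is tried first and anything that fails is treated as Latin-1.

```python
from typing import Tuple


def decode_table_text(raw: bytes) -> Tuple[str, bool]:
    """Return (text, was_latin1). Hypothetical helper, not FSO's real code."""
    try:
        # Strict decoding raises on any byte sequence that is not valid UTF-8.
        return raw.decode("utf-8"), False
    except UnicodeDecodeError:
        # Latin-1 maps every byte to a code point, so this fallback cannot fail.
        return raw.decode("latin-1"), True


# Example: a Latin-1 encoded umlaut is not valid UTF-8, so the fallback kicks in.
text, was_latin1 = decode_table_text("Jäger".encode("latin-1"))
if was_latin1:
    print("warning: data is not UTF-8, please convert the file")
```

Keep in mind that a heuristic like this is not perfect: some Latin-1 byte sequences happen to be valid UTF-8 as well, which is one more reason to convert the files properly instead of relying on detection.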
Converting the encoding can be done either with a command line utility like iconv or with a text editor like Notepad++, which has support for reencoding a file. The files you must convert from a retail install are string.tbl (this one requires a few other small changes), tstrings.tbl and weapons.tbl. The weapons table is only an issue because it uses some special characters in a comment, but FSO still reads those characters and doesn't understand them.
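If you would rather script the conversion, something along these lines should work. This is only a sketch under the assumption that the input really is Latin-1; the function name convert_latin1_to_utf8 is my own invention and the file is rewritten in place, so keep a backup. It skips files that already decode as UTF-8 so that running it twice does not double-encode anything.

```python
from pathlib import Path


def convert_latin1_to_utf8(path: Path) -> bool:
    """Rewrite `path` in place as UTF-8. Returns True if the file was changed."""
    raw = path.read_bytes()
    try:
        # Already valid UTF-8 (this includes pure ASCII): leave it untouched.
        raw.decode("utf-8")
        return False
    except UnicodeDecodeError:
        pass
    # Assumption: anything that is not valid UTF-8 is Latin-1 encoded.
    path.write_bytes(raw.decode("latin-1").encode("utf-8"))
    return True


# Usage (file names as listed above, adjust the paths to your install):
#   for name in ("string.tbl", "tstrings.tbl", "weapons.tbl"):
#       convert_latin1_to_utf8(Path(name))
```

With iconv the rough equivalent is `iconv -f ISO-8859-1 -t UTF-8 input.tbl > output.tbl`, where the input and output names are placeholders.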
The string table contains one entry that previously used a special character. This is the entry with the number 385. Replace the %c with the © symbol to restore the old behavior. Another issue with this table is that it contains syntax errors that were not recognized by the retail parsing code. These parser errors were corrected in FSO, but since we never break retail compatibility there is a workaround which fixes that for retail data. That workaround does not work with UTF-8 encoded files, since it assumes that the text data is Latin-1 encoded. I already fixed these issues for my tests and uploaded the corrected files in the test mod below.
While I was testing these changes I needed a test mod which should be a good starting point for your tests:
http://www.mediafire.com/file/pw5788071fljm1m/unicodeTest.7z
Theoretically, this should also allow translations into languages with radically different characters like Japanese.
Please test these changes and let me know if you find something that breaks the new code.