Difference between revisions of "Language Tips And Tricks"

From GeeklogWiki

Revision as of 14:44, 17 May 2005

General Things

When you switch to another language in Geeklog, not only the menu text changes; the character set might change as well. This means that certain international characters might no longer be readable. So if your site has German, Japanese or Eastern European content, switching the language might render all of your content unreadable.
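To illustrate why a charset switch garbles existing content, here is a minimal Python sketch (not part of Geeklog). It assumes a story was stored as ISO-8859-1 bytes and is then served under a different declared charset. The helper name decode_as and the sample word are made up for the illustration.

```python
# A German word stored as ISO-8859-1 bytes, as an older site might have saved it.
raw = "Grüße".encode("iso-8859-1")


def decode_as(data: bytes, charset: str) -> str:
    """Interpret the stored bytes the way a browser would under `charset`."""
    return data.decode(charset, errors="replace")


# The bytes never change; only their interpretation does.
for charset in ("iso-8859-1", "iso-8859-7", "koi8-r"):
    print(f"{charset:>10}: {decode_as(raw, charset)}")
```

Only the first line of output shows the original word; the Greek and Cyrillic charsets map the same bytes to different letters.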


Multilingual Sites

If you plan to run a site whose content will be in several languages, it is better to stick to language files that use UTF-8. Since they all share the same encoding, you will be able to display, for example, German umlauts next to Japanese characters. Otherwise, readers will always see unreadable characters next to readable ones, and will have to switch languages in order to read everything. Read below how to convert your favorite language file to UTF-8.
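The point about UTF-8 holding umlauts and Japanese characters side by side can be checked with a small Python sketch. The helper can_encode and the sample text are assumptions made for the illustration, not anything Geeklog provides.

```python
# Mixed German and Japanese text, as it might appear on one page.
text = "Grüße und 日本語"


def can_encode(sample: str, charset: str) -> bool:
    """Return True if every character of `sample` exists in `charset`."""
    try:
        sample.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


for charset in ("utf-8", "iso-8859-1", "shift_jis"):
    print(f"{charset:>10}: {can_encode(text, charset)}")
```

UTF-8 can represent the whole string, while a Western European charset such as ISO-8859-1 has no code points for the Japanese characters.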

Further, to support proper search and other functions, you should enable the mbstring functions in your PHP (configure option --enable-mbstring). On Linux, this might mean that you have to recompile PHP! If you do this, remember also to set "mbstring.func_overload = 7" in your php.ini. This makes PHP use the multibyte functions by default whenever a single-byte function is called.
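Why multibyte-aware functions matter can be shown with a short Python sketch, used here only as an analogy to the difference between PHP's byte-oriented and multibyte string functions. The helpers cut_bytes and cut_chars are hypothetical names for the illustration.

```python
word = "Übergröße"
utf8 = word.encode("utf-8")

char_count = len(word)   # counts characters
byte_count = len(utf8)   # counts bytes; Ü, ö and ß take two bytes each


def cut_bytes(text: str, n: int) -> str:
    """Truncate byte-wise, as a single-byte string function would."""
    return text.encode("utf-8")[:n].decode("utf-8", errors="replace")


def cut_chars(text: str, n: int) -> str:
    """Truncate character-wise, as a multibyte-aware function would."""
    return text[:n]


print(char_count, byte_count)
print(repr(cut_bytes(word, 1)), repr(cut_chars(word, 1)))
```

Cutting after one byte splits the "Ü" in half and leaves a broken character, while cutting after one character keeps it intact. Search and truncation on UTF-8 content run into the same problem when only single-byte functions are used.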

Unwanted Languages

You can remove unwanted languages by simply deleting the unwanted files from the /language/ directory. Alternatively, you can create a subfolder there and move the unwanted files into it. This way you can prevent users from choosing languages whose character sets would turn content unreadable.
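Moving the unwanted files into a subfolder can be scripted. The following is a minimal Python sketch under stated assumptions: that the files to keep end in "_utf-8.php", and that the subfolder is called "unused". Both are choices made for this example, as is the function name.

```python
import shutil
from pathlib import Path


def park_unwanted_languages(language_dir, keep_suffix="_utf-8.php",
                            parking_name="unused"):
    """Move every .php language file that does not end in `keep_suffix`
    into a subfolder and return the names of the files that were moved."""
    language_dir = Path(language_dir)
    parking = language_dir / parking_name
    parking.mkdir(exist_ok=True)

    moved = []
    # sorted() materializes the listing before any file is moved.
    for path in sorted(language_dir.glob("*.php")):
        if not path.name.lower().endswith(keep_suffix):
            shutil.move(str(path), str(parking / path.name))
            moved.append(path.name)
    return moved


# Example call (the path is a placeholder for your own installation):
# park_unwanted_languages("/path/to/geeklog/language")
```

Because the files are only moved, not deleted, a language can be restored later by moving its file back.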

How to convert Language file Character Sets

First, check the download section of Geeklog.net to see whether somebody has already converted the language file since the last release of Geeklog!

If you have Windows and MS Word, there is a very easy way to convert your language files to UTF-8 or other formats: simply open the language file with Word (for example, rename it to .doc and double-click it). You will then be asked which character set the source uses. Choose the one that lets you read the characters properly in the preview. Then save the file again in "Text Only" format. You will be asked again which character set you want to use. Choose your preferred one ("Unicode UTF-8", for example). Make sure you also correct line 32 of the file so that it names the encoding you chose:

$LANG_CHARSET = "utf-8";

Then rename the file to "language_utf-8.php", replacing "language" with the name of the language.
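If Word is not available, the same conversion can be sketched in a few lines of Python. This is an illustration under assumptions, not an official Geeklog tool: the function name, the regular expression and the file names in the usage comment are made up here, and you have to know the source encoding of your file yourself.

```python
import re
from pathlib import Path

# Matches the charset assignment, e.g. $LANG_CHARSET = "iso-8859-1"
CHARSET_LINE = re.compile(r"""(\$LANG_CHARSET\s*=\s*)(['"])[^'"]*\2""")


def convert_language_file(src, dst, source_encoding):
    """Re-encode a language file as UTF-8 and update its $LANG_CHARSET line."""
    # Work on raw bytes so that line endings are preserved as they are.
    text = Path(src).read_bytes().decode(source_encoding)
    text = CHARSET_LINE.sub(r"\g<1>\g<2>utf-8\g<2>", text)
    # Python's "utf-8" codec writes no byte order mark.
    Path(dst).write_bytes(text.encode("utf-8"))


# Example call (file names are placeholders):
# convert_language_file("german.php", "german_utf-8.php", "iso-8859-1")
```

The script writes to a new file, so the original stays available in case the wrong source encoding was chosen.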

Once you have done so, you might also want to publish the file on the geeklog.net website.


How to convert your complete Geeklog to Unicode/UTF-8 in 12 steps.

If you started your server with one language and want to move to a multilanguage system later on, it can be quite difficult if you need different character sets for the different languages. With UTF-8 you can display all languages and letters at the same time on the same page without any problems. The best solution therefore is to convert your _complete_ data into Unicode in one go. You don't want to go through each story and comment and edit letters by hand after you have switched the encoding of the page.

For this procedure to work you need (1) a text editor that can save raw text data while letting you choose the encoding (Notepad under Windows, for example), (2) direct access to your MySQL executable (Windows cmd or Linux shell) and (3) a NON-Unicode-capable editor (!) such as PHPEdit (or Linux shell editors, AFAIK). You can find out which editor cannot handle Unicode by saving a small text containing "xxöxx" or similar with Notepad as UTF-8 (see step 3) and opening it with the other editor. If the letter "ö" now appears as "Ã¶", you have the right one.
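What the non-Unicode-capable editor shows in this test can be reproduced with a short Python sketch. It assumes the editor interprets the file as Windows-1252 (cp1252), the usual Western European code page on Windows. Other single-byte code pages will show different, but equally wrong, characters.

```python
sample = "xxöxx"

# Plain UTF-8 bytes, and the same text with the marker bytes that
# Notepad puts at the start of a UTF-8 file.
utf8_plain = sample.encode("utf-8")
utf8_with_bom = sample.encode("utf-8-sig")

# A non-Unicode editor shows one character per byte.
print(utf8_plain.decode("cp1252"))     # xxÃ¶xx
print(utf8_with_bom.decode("cp1252"))  # ï»¿xxÃ¶xx
```

The two-byte UTF-8 sequence for "ö" shows up as two separate characters, and the three marker bytes at the start of the file show up as three more.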

Now let's start.

1. Make a Backup of your database using the Geeklog Backup function.

2. Download the created file from your server, zip it, burn it on a CD and send it to your lawyer. He should keep it as a backup in case you need it later to go back to step one.

3. Open the file in Notepad or another Unicode-capable editor. If your database is large, this will take some time, but it's worth waiting for. Save the file again under a different (!) filename. Before you press the Save button, look for the field called "Encoding" below the field where you entered the filename, and choose "UTF-8".

4. Now open the file in the non-Unicode-capable editor. You will see three strange letters, "ï»¿", at the beginning of the file (this is the UTF-8 byte order mark). Remove them. They cause error messages in the SQL later. Save the file again.

5. Upload the file to your server. You might want to zip it first (also before downloading) to reduce transfer time.

6. Create a new database if you can. If not, delete all tables from your old one (call your lawyer to have your backup ready in an envelope).

7. Write the data back into the database. Do this by typing

mysql --user=root --password databasename < file.txt

Replace "databasename" with the name of the newly created or empty database, and "file.txt" with the name of the file that you saved with Notepad. The file has to be in your current directory; if it is not, move it there. On Windows, it has to be in the /Mysql/bin directory, where your mysql.exe resides. If you do not want to move the files, prepend the full path to mysql.exe and/or to the filename.

If you do not have root access to the database, replace "root" with the username that you use to access your MySQL server.

8. Hit Enter. You will be prompted for the user's password. Enter it and press Return again. Now the data will be read back into the server.

9. If you created a second database, edit your Geeklog's config.php so that it accesses the new one.

10. Switch your Geeklog language to one of the _UTF-8 sets. If you want to reduce hassle for users who have been using non-UTF-8 languages, convert all the language files you have to UTF-8 with Notepad (you don't need to remove the first three letters there) and change the encoding string in each file (line 30) to $LANG_CHARSET = "utf-8";

11. Change config.php so that the default language file is also one of the UTF-8 sets (somewhere around line 239): $_CONF['language']="english_UTF-8";

12. You might want to delete all non-utf-8 languages from your language folders.
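Steps 3 and 4 above (saving the backup as UTF-8 and removing the three leading letters) can also be done without two editors. The following Python sketch is one possible way, under the assumption that you know the encoding your backup file was written in. The function names and the file names in the usage comment are made up for this example.

```python
import codecs
from pathlib import Path


def reencode_dump(src, dst, source_encoding):
    """Write a UTF-8 copy of the backup; no byte order mark is added."""
    # Reads the whole file into memory, which is fine for moderate sizes.
    text = Path(src).read_bytes().decode(source_encoding)
    Path(dst).write_bytes(text.encode("utf-8"))


def strip_utf8_bom(path):
    """Remove a leading UTF-8 byte order mark, e.g. one added by Notepad.
    Returns True if the file was changed."""
    data = Path(path).read_bytes()
    if data.startswith(codecs.BOM_UTF8):
        Path(path).write_bytes(data[len(codecs.BOM_UTF8):])
        return True
    return False


# Example calls (file names are placeholders):
# reencode_dump("backup.sql", "backup_utf8.sql", "iso-8859-1")
# strip_utf8_bom("saved_with_notepad.sql")
```

Use reencode_dump instead of steps 3 and 4, or use strip_utf8_bom alone if the file was already saved with Notepad.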

That's all. Good luck :-)
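If you prefer to run the import from step 7 from a script, the mysql command can be wrapped as in the following Python sketch. It feeds the file to the mysql client on standard input, which is what typing the command in a shell with "<" does. The function names, the database name "geeklog_utf8" and the file name are placeholders for this example, and the mysql executable is assumed to be on your PATH.

```python
import subprocess


def build_import_command(database, user="root"):
    """Build the mysql client call from step 7, without the input file."""
    return ["mysql", f"--user={user}", "--password", database]


def import_dump(dump_path, database, user="root"):
    """Feed the SQL file to the mysql client on standard input.
    The client still prompts for the password on the terminal."""
    with open(dump_path, "rb") as dump:
        subprocess.run(build_import_command(database, user),
                       stdin=dump, check=True)


# Example call (names are placeholders):
# import_dump("backup_utf8.sql", "geeklog_utf8")
```

With check=True the script raises an error if the mysql client reports a failure, so a broken import does not go unnoticed.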

Switching back to one language (1.3.9sr1 and before)

If you set your config.php so that you use only one language, you might run into trouble if you have users who have already registered and selected another language. These users will continue to use their pre-set language. Even if you set their language manually to the default or to NULL (with a tool such as phpMyAdmin), the cookies on the users' computers will still call up the old one.