UTF8 configuration

Linux related discussion about netjukebox
petchema
User
Posts: 6
Joined: Sun Mar 04, 2012 8:58 am

UTF8 configuration

Post by petchema »

Hi all,

I've read that netjukebox now fully supports UTF-8 under Linux. However, I have problems with accents (and single quotes), and I can't work out what is misconfigured:
  • Distro: Debian Squeeze
  • Locales: {en,fr}.{iso-8859-1,utf8} all enabled
  • /etc/php5/apache2/php.ini:default_charset = "utf-8"
  • mysql: database created with default charset utf8
  • netjukebox config.inc.php: modified media_dir, 7za's working directory, and that's it
What did I miss?

Regards,
Pierre.
wbartels
netjukebox developer
Posts: 881
Joined: Thu Nov 04, 2004 3:12 pm
Location: Netherlands

Re: UTF8 configuration

Post by wbartels »

What is working and what is not?

On my Ubuntu system all characters are displayed correctly: http://live.netjukebox.nl/index.php?act ... dkmydhkh0a
Only the created zip files are not UTF-8, but my current Ubuntu uses an older 7za (9.04 beta).

I haven’t made any changes to Apache, PHP or MySQL!
MySQL uses the latin1_swedish_ci collation.
In php.ini the line ;default_charset = "iso-8859-1" is commented out; netjukebox sets the charset itself with ini_set('default_charset', NJB_DEFAULT_CHARSET);
$cfg['default_charset'] = ''; in config.inc.php can be left empty: netjukebox will automatically use UTF-8 on non-Windows operating systems.
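
As a rough sketch of the behaviour described above (this is not netjukebox's actual code): an empty config value means "decide automatically", and the result is applied at runtime with ini_set(). The helper name pick_default_charset and the ISO-8859-1 fallback on Windows are assumptions made for illustration; the post only states that UTF-8 is used on non-Windows systems.

```php
<?php
// Hypothetical helper, not part of netjukebox.
// An empty configured value means "choose automatically".
function pick_default_charset(string $configured, string $osFamily): string
{
    if ($configured !== '') {
        return $configured; // an explicit setting wins
    }
    // Assumption: ISO-8859-1 on Windows; the thread only confirms UTF-8 elsewhere.
    return $osFamily === 'Windows' ? 'ISO-8859-1' : 'UTF-8';
}

$charset = pick_default_charset('', PHP_OS_FAMILY);
ini_set('default_charset', $charset); // overrides php.ini for this request
echo ini_get('default_charset'), "\n";
```

Because ini_set() overrides php.ini for the running script, the default_charset line in php.ini should not matter here.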

Re: UTF8 configuration

Post by petchema »

Hi,
All tracks are found and displayed with the correct name, but tracks with accents or single quotes in their name show a length of 0s. When I play one of them, a different track (always the same one) is played instead.

Re: UTF8 configuration

Post by wbartels »

petchema wrote:Hi,
All tracks are found and displayed with the correct name, but tracks with accents or single quotes in their name show a length of 0s. When I play one of them, a different track (always the same one) is played instead.
I have many tracks with single quotes that display the correct playtime.
These tracks can also be played with mpd.
Can you give me some example file names that are not working on your system?

Re: UTF8 configuration

Post by petchema »

wbartels wrote:Can you give me some example file names that are not working on your system?
Sure,
  • quotes:
    Image
  • accents:
    Image
  • special case: quotes in the path (album name or artist name): no track works
    Image Image
Regards,
Pierre.

Re: UTF8 configuration

Post by petchema »

Update: after some more investigation, real single quotes work perfectly; it's some quote look-alike characters imported from CDDB that don't:
Image
So I take back the problem with single quotes: I only have a problem with "non-ASCII characters".
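
A quick way to tell a real single quote from a look-alike is to test the file name for bytes outside the ASCII range. The sketch below is only an illustration: the helper name has_non_ascii and the sample file names are made up, and the typographic apostrophe U+2019 is just one example of a look-alike character.

```php
<?php
// Return true when a name contains any byte outside 7-bit ASCII.
// Without the /u modifier the pattern is matched byte by byte.
function has_non_ascii(string $name): bool
{
    return preg_match('/[^\x00-\x7F]/', $name) === 1;
}

// Hypothetical sample names, not taken from the thread.
$names = array(
    "Don't Stop.mp3",        // plain ASCII apostrophe (U+0027)
    "Don\u{2019}t Stop.mp3", // typographic apostrophe (U+2019)
    "Caf\u{E9}.mp3",         // e with acute accent (U+00E9)
);

foreach ($names as $name) {
    echo $name, ' => ', has_non_ascii($name) ? 'non-ASCII' : 'ASCII only', "\n";
}
```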

Re: UTF8 configuration

Post by wbartels »

petchema wrote:So I take back the problem with single quotes: I only have a problem with "non-ASCII characters".
With my setup non-ASCII characters work perfectly:
http://live.netjukebox.nl/index.php?act ... c7lt3eoa5a
http://live.netjukebox.nl/index.php?act ... b730a35358
http://live.netjukebox.nl/index.php?act ... baiddiogjh
I will check later whether I can reproduce your problem.

What you can try in the meantime is to add some text to update.php, like // 05-03-212, and then start config > update again.
This will force a complete update.

Re: UTF8 configuration

Post by petchema »

wbartels wrote:What you can try in the meantime is to add some text to update.php, like // 05-03-212, and then start config > update again.
This will force a complete update.
No change.

I don't know if it's related, but running update.php from CLI, I got:

Code:

...
PHP Notice:  Undefined offset: 1 in /var/www/.netjukebox/update.php on line 280
Weather Report - Sportin Life
PHP Notice:  Undefined offset: 1 in /var/www/.netjukebox/update.php on line 280
Weather Report - This Is This
Weather Report - Live and Unreleased
PHP Notice:  Undefined offset: 1 in /var/www/.netjukebox/update.php on line 280
Weber, Eberhard - Yellow Fields
PHP Notice:  Undefined offset: 1 in /var/www/.netjukebox/update.php on line 280
Weber, Eberhard - Silent Feet
...
(that is, one warning for most albums, but not all). Line 280 of update.php is (version 5.33, plus the inserted line):

Code:

...
        if ($match[1] != '' && $match[2] != '' && preg_match('#^(\d{' . strlen($match[1] . $match[2]) . '})\s+-\s+.+#', $filename[count($filename)-1])) {
...
Hope it helps.

Re: UTF8 configuration

Post by wbartels »

Now I can reproduce your problem: when I set the MySQL collation to utf8_general_ci or utf8_unicode_ci, I get the same problem.
But with the default latin1_swedish_ci collation the playtimes are calculated correctly.

With all collation settings the filenames are displayed correctly in the web browser.
I suspect that somehow with utf8_general_ci or utf8_unicode_ci collation the filename isn't valid for PHP file functions like fopen(), file_get_contents(), etc...
I will investigate this later.

For now I suggest using the latin1_swedish_ci collation.
If it works for Ubuntu it should also work for Debian :wink:
Please let me know if this did the trick.
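
The cause is not confirmed in this thread, but one way such a mismatch could arise is sketched below: PHP file functions treat a path as a plain byte string, so a name whose bytes were converted between UTF-8 and ISO-8859-1 somewhere along the way no longer matches the name on disk. The file name is a made-up example.

```php
<?php
// The same visible name, "Café.mp3", written as raw bytes in two encodings.
$utf8   = "Caf\xC3\xA9.mp3"; // é as two bytes (UTF-8)
$latin1 = "Caf\xE9.mp3";     // é as one byte (ISO-8859-1)

echo strlen($utf8), "\n";    // 9
echo strlen($latin1), "\n";  // 8

// Different bytes: on Linux these are two different file names, so a path
// built from one will not open a file stored under the other.
var_dump($utf8 === $latin1); // bool(false)
```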

Re: UTF8 configuration

Post by wbartels »

petchema wrote:

I don't know if it's related, but running update.php from CLI, I got:

Code:

...
PHP Notice:  Undefined offset: 1 in /var/www/.netjukebox/update.php on line 280
Weather Report - Sportin Life
PHP Notice:  Undefined offset: 1 in /var/www/.netjukebox/update.php on line 280
Weather Report - This Is This
Weather Report - Live and Unreleased
PHP Notice:  Undefined offset: 1 in /var/www/.netjukebox/update.php on line 280
Weber, Eberhard - Yellow Fields
PHP Notice:  Undefined offset: 1 in /var/www/.netjukebox/update.php on line 280
Weber, Eberhard - Silent Feet
...
(that is, one warning for most albums, but not all). Line 280 of update.php is (version 5.33, plus the inserted line):

Code:

...
        if ($match[1] != '' && $match[2] != '' && preg_match('#^(\d{' . strlen($match[1] . $match[2]) . '})\s+-\s+.+#', $filename[count($filename)-1])) {
...
Hope it helps.
This has nothing to do with the UTF8 problem; it is caused by the directory structure.
The media directory must have at least two directory levels (artist/album/01 - music file.mp3).
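
To find the albums that trigger the notice, a check along these lines could be run over the paths relative to media_dir. This is a sketch only: the helper name has_artist_and_album is hypothetical and it is not how update.php parses paths.

```php
<?php
// Hypothetical check: does a path relative to media_dir look like
// artist/album/file, i.e. at least two directories above the file?
function has_artist_and_album(string $relativePath): bool
{
    $parts = explode('/', trim($relativePath, '/'));
    return count($parts) >= 3; // two directory levels plus the file name
}

var_dump(has_artist_and_album('artist/album/01 - music file.mp3')); // bool(true)
var_dump(has_artist_and_album('album/01 - music file.mp3'));        // bool(false)
```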

Re: UTF8 configuration

Post by petchema »

wbartels wrote:Now I can reproduce your problem: when I set the MySQL collation to utf8_general_ci or utf8_unicode_ci, I get the same problem.
But with the default latin1_swedish_ci collation the playtimes are calculated correctly.
The best is the enemy of the good; indeed, this does the trick! :oops:

Thanks for your patience, now I should be able to enjoy my CDs from anywhere :D
Regards,
Pierre.