Issue with Transifex/PHP/Unicode characters

Hi,

For anyone who might be interested and has the knowledge and time: I posted an issue related to the Transifex translations on GitHub.

I’m essentially looking for someone who’s good with scripting, PHP arrays, and potentially Python.

I’d also like to add that there’s no rush; it’s more of a nice-to-have at the moment.

Thanks

@Fuzzybear @kazzkiq

Because @Vitalicus worked so hard on the Romanian translations, I went through and took another look. I was able to put together a script that cleans up and merges the Transifex files the way I wanted. I haven’t automated it, but I can run it when we add another language or make lots of changes to the translations.

First, the site has problems with Unicode symbols.

A very nice write-up by someone who spent a lot of time on the topic is here:

http://kunststube.net/encoding/

Makes pretty interesting reading.

This website goes on to explain how PHP 6 will have built-in support for Unicode, so if you are using PHP 5 there is a workaround:

http://kuikie.com/snippet/69-16/php/arrays/php-function-to-convert-multibyte-string-to-array-mbstrtoarray/

@Vitalicus - I’m not too sure at this point. Are the characters still correct but just bold? I’m not seeing this with any other language, and it’s UTF-8 everywhere when I look at the PHP files themselves. I also set UTF-8 as the default in the HTTP header and as the default for php5-fpm on the server itself.
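One quick way to double-check the header side is to look at the charset a response actually declares in its Content-Type. This is only a minimal sketch using the Python standard library; the function name `declared_charset` and the sample header values are made up for illustration and are not taken from the site.

```python
from email.message import Message
from typing import Optional


def declared_charset(content_type: str) -> Optional[str]:
    """Return the charset named in a Content-Type header value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() lowercases the charset parameter if one is present.
    return msg.get_content_charset()


# Hypothetical header values, for illustration only.
print(declared_charset("text/html; charset=UTF-8"))
print(declared_charset("text/html"))
```

If the second case comes back empty for a real page, the browser has to guess the encoding, which is one way characters can end up rendered differently from what the files contain.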

@ppcman - It seems they had issues adding Unicode support in PHP 6, ended up stripping it out, and then jumped straight to PHP 7. The site you linked is copyrighted 2012, so it may be an old post. It’s an interesting read either way. Kinda frustrating that this hasn’t been simplified by now, though :).

Is anyone seeing issues with Unicode characters in any of the other languages? I can’t reproduce it in my browser, but urlpng or Browserling does show the bolded characters for Romanian.

There is still the old Python script that has this:

    import requests

    for resource in resources:
        o = requests.get(resource, cookies=cookies)
        # Decode the response body as UTF-8 when reading o.text
        o.encoding = "utf-8"

But I’m not sure what it’s doing.
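As far as I understand it, in the requests library the `encoding` attribute on a response is the codec used to turn the raw bytes into text when you read `.text`. If it isn’t set, requests guesses from the headers, and that guess can fall back to ISO-8859-1 for text responses with no charset. The sketch below imitates that decoding step with plain bytes so it runs without a network call; `decode_body` and the sample string are my own illustration, not part of the old script.

```python
def decode_body(raw: bytes, encoding: str) -> str:
    """Decode raw response bytes roughly the way requests builds `.text`."""
    return raw.decode(encoding, errors="replace")


# Romanian sample text with comma-below letters, encoded the way a UTF-8 server would send it.
sample = "Știri și țări"
raw = sample.encode("utf-8")

as_utf8 = decode_body(raw, "utf-8")
as_latin1 = decode_body(raw, "iso-8859-1")

print(as_utf8 == sample)    # True: the bytes round-trip cleanly
print(as_latin1 == sample)  # False: each multi-byte letter becomes two wrong characters
```

So forcing `utf-8` in the script most likely just makes sure the downloaded translations are not mis-decoded before they are written out.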

I switched the font from “Lato” to “Arial”. It works OK, and there are no big differences between the fonts. Why download all these Lato fonts?

P.S. The Lato fonts were updated…

I changed it from Lato to Open Sans. The font looks nice, and from what I was able to test, it fixed the display issue you were having, at least in Firefox and Chrome. Internet Explorer 11 is still having problems, but I’m not sure that really matters. I didn’t test Edge since it’s not available on Browserling.com.

I also pushed a new update of the Romanian translations you made.

Please let me know if you’re still having issues. This new font still looks nice in my opinion and should make it more readable and compatible now that we’re adding more languages.

Looks really nice.