Author Topic: Can I haz full unicode support @forums plz?  (Read 8827 times)


Offline frozenfrogz (Topic starter)

  • Frequent Contributor
  • **
  • Posts: 936
  • Country: de
  • Having fun with Arduino and Raspberry Pi
Can I haz full unicode support @forums plz?
« on: September 11, 2017, 06:20:50 pm »
Hello everyone.
I did not really know where I should put this, because I could not find a forum feedback section.

I run into unsupported glyphs on a more or less regular basis. This is no big deal in a general sense, but it annoys the hell out of me. As far as I can see it, the site uses the UTF-8 charset, but the editor allows usage of Unicode characters, and these are all displayed correctly while editing. This might be a browser-side thing, but IDK. After posting, all characters not part of UTF-8 get replaced by a question mark.

I would love to see either of two options implemented:

A: Full support of Unicode characters (preferred)
B: Disallow everything not UTF-8 compliant in the editor.

Is it only me? What is your opinion on that?

Best regards,
Frederik
He’s like a trained ape. Without the training.
 

Online ataradov

  • Super Contributor
  • ***
  • Posts: 11717
  • Country: us
    • Personal site
Re: Can I haz full unicode support @forums plz?
« Reply #1 on: September 11, 2017, 07:41:22 pm »
Editor is browser-side, it will let you enter whatever your OS allows. The forum then goes and filters stuff that is not ASCII.
Alex
 

Offline frenky

  • Supporter
  • ****
  • Posts: 1003
  • Country: si
    • Frenki.net
Re: Can I haz full unicode support @forums plz?
« Reply #2 on: September 11, 2017, 07:45:31 pm »
The site's HTML has a charset=UTF-8 tag, so that is OK. The DB probably does not support Unicode...
Test: ??ž?š
Should be:

EDIT:
Interesting... in preview of the post all characters were ok...
« Last Edit: September 11, 2017, 07:48:51 pm by frenky »
 

Offline frozenfrogz (Topic starter)

  • Frequent Contributor
  • **
  • Posts: 936
  • Country: de
  • Having fun with Arduino and Raspberry Pi
Re: Can I haz full unicode support @forums plz?
« Reply #3 on: September 11, 2017, 08:19:30 pm »
EDIT:
Interesting... in preview of the post all characters were ok...

And that is the annoying part. Since SOME accents / special chars can be displayed and others cannot, it is pretty inconvenient.

acute: é á í ú ó
grave: è à ì ù ò
diaeresis ? ä ï ü ö
Umlaut: ä ü ö
cedilla: ç
scandinavian characters: å æ ø

That is as good as it gets, maybe some Slavic characters, but no Russian etc.

Edit: Noe_el is out as well xD
He’s like a trained ape. Without the training.
 

Offline cdev

  • Super Contributor
  • ***
  • !
  • Posts: 7350
  • Country: 00
Re: Can I haz full unicode support @forums plz?
« Reply #4 on: September 11, 2017, 08:28:32 pm »
I could see the accented characters..
"What the large print giveth, the small print taketh away."
 

Online ataradov

  • Super Contributor
  • ***
  • Posts: 11717
  • Country: us
    • Personal site
Re: Can I haz full unicode support @forums plz?
« Reply #5 on: September 11, 2017, 08:29:44 pm »
Test of Russian: ???? ????????.

Nope, not at all. And apparently Russian is a lot of confused smiles :)
« Last Edit: September 11, 2017, 08:31:16 pm by ataradov »
Alex
 
The following users thanked this post: jancumps

Online Zero999

  • Super Contributor
  • ***
  • Posts: 19888
  • Country: gb
  • 0999
Re: Can I haz full unicode support @forums plz?
« Reply #6 on: September 11, 2017, 08:54:39 pm »
This has been an issue for as long as the forum has existed, and people complain about it from time to time.

Yes, I do find it odd that some non-ASCII characters are supported and not others. I use the Clippings extension for Firefox to keep a list of useful non-ASCII characters that this site supports. Here they are: µ×÷±¼½¾°
 

Online ataradov

  • Super Contributor
  • ***
  • Posts: 11717
  • Country: us
    • Personal site
Re: Can I haz full unicode support @forums plz?
« Reply #7 on: September 11, 2017, 08:56:31 pm »
I think it just supports full extended-ASCII (8-bit): Here is a test of some old-school DOS-style frames: ? ? ?

Nope: frames did not go through. So there is some explicit white list.

??? <- Here are Cyrillic letters entered as Alt-{numpad 150, 151, 152}. So it does support a simple 8-bit set as input.

Nope, that did not work either. They show up correctly right after posting, but change back to garbage after a page refresh. Strange.
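
One possible reading of the Alt-code test, as a minimal Python sketch: Alt+numpad codes typed without a leading zero are looked up in the machine's OEM code page, so the same three numbers mean different characters on different systems. That the machine here uses CP866 is an assumption (the post does not say), and `alt_codes_as_text` is a hypothetical helper.

```python
# Alt-codes from the post above.
ALT_CODES = [150, 151, 152]


def alt_codes_as_text(codes, oem_codepage):
    """Interpret Alt+numpad codes as bytes in the given OEM code page."""
    return bytes(codes).decode(oem_codepage)


# Assumed Russian OEM code page: the codes are Cyrillic capitals.
print(alt_codes_as_text(ALT_CODES, "cp866"))  # ЦЧШ

# The same codes on a US OEM code page are accented Latin letters.
print(alt_codes_as_text(ALT_CODES, "cp437"))  # ûùÿ
```

Either way the browser ends up submitting Unicode characters, so the numbers typed say little about what the forum stores.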
« Last Edit: September 11, 2017, 09:00:50 pm by ataradov »
Alex
 

Online Monkeh

  • Super Contributor
  • ***
  • Posts: 8049
  • Country: gb
Re: Can I haz full unicode support @forums plz?
« Reply #8 on: September 11, 2017, 09:02:44 pm »
I think it just supports full extended-ASCII (8-bit): Here is a test of some old-school DOS-style frames: ? ? ?

Which one is that? 8859-1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16? CP-1252?


‡ ‰ … €

© ª

Looks like perhaps CP-1252. The horror.
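
That guess can be checked mechanically. A minimal sketch, assuming Python 3 and its built-in codecs: `codecs_that_encode` is a hypothetical helper, and the candidate list is only a sample of the charsets named above.

```python
# Characters from the test above that survived posting.
SURVIVORS = "‡‰…€©ª"

# A few candidate charsets, spelled the way Python names them.
CANDIDATES = ["ascii", "iso-8859-1", "iso-8859-15", "cp1252"]


def codecs_that_encode(text, candidates):
    """Return the candidate charsets that can encode every character in text."""
    ok = []
    for name in candidates:
        try:
            text.encode(name)
        except UnicodeEncodeError:
            continue  # at least one character is missing from this charset
        ok.append(name)
    return ok


print(codecs_that_encode(SURVIVORS, CANDIDATES))  # ['cp1252']
```

ISO-8859-1 lacks the euro sign, and ISO-8859-15 has it but lacks the dagger, per-mille sign and ellipsis, so of these four only CP-1252 fits. That is consistent with the guess, not proof of what the server actually does.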

I know conversions often don't go well, but.. please. This sucks.
« Last Edit: September 11, 2017, 09:08:21 pm by Monkeh »
 

Offline bitseeker

  • Super Contributor
  • ***
  • Posts: 9057
  • Country: us
  • Lots of engineer-tweakable parts inside!
Re: Can I haz full unicode support @forums plz?
« Reply #9 on: September 12, 2017, 12:15:36 am »
It could be a major overhaul to make the forum truly UTF-8. Limitations are OK if it's clear what they are. That's where the preview functionality could do a better job of delivering an accurate preview prior to posting.

At least we can do °C and µV. If only \$\Omega\$ was supported as a character, I think that'd cover the "special" characters I generally use in posts here.
TEA is the way. | TEA Time channel
 

Offline Bud

  • Super Contributor
  • ***
  • Posts: 7081
  • Country: ca
Re: Can I haz full unicode support @forums plz?
« Reply #10 on: September 12, 2017, 01:19:27 am »
Why is this an issue and why is it needed on an English speaking forum? If people will happily post in their favorite languages it will be a mess.
Facebook-free life and Rigol-free shack.
 

Online ataradov

  • Super Contributor
  • ***
  • Posts: 11717
  • Country: us
    • Personal site
Re: Can I haz full unicode support @forums plz?
« Reply #11 on: September 12, 2017, 01:22:38 am »
Why is this an issue and why is it needed on an English speaking forum? If people will happily post in their favorite languages it will be a mess.
Well, the first time I ran into this problem is when I tried to do a translation of a document for a Soviet-made part someone asked about here.

There are uses for full language support. It is not critical, but in 2017 it is important, IMO.
Alex
 
The following users thanked this post: wraper

Online Monkeh

  • Super Contributor
  • ***
  • Posts: 8049
  • Country: gb
Re: Can I haz full unicode support @forums plz?
« Reply #12 on: September 12, 2017, 01:35:56 am »
Why is this an issue and why is it needed on an English speaking forum? If people will happily post in their favorite languages it will be a mess.

???? (up, down, left, right arrows)

¹ ² ³ × ÷ ± (superscript 1, 2, 3, multiply, divide, plus/minus)

?  (Omega)

®  © ™ ((R), (C), TM)

° (degree)

? ¼ ? ½ ? ¾ ? (1/8 1/4 3/8 1/2 5/8 3/4 7/8)

Some gets through. Some doesn't. Remembering what does and doesn't work is annoying.

E: For giggles, a few more by row:
¬!"£$%^&*()_+
|¡?£¼???™±°¿
??E®?¥??ØÞƧЪ???
¦<>©‘’Nº×÷
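
Until the forum changes, a post can be checked before submitting. This is only a sketch in Python, and it assumes the forum behaves like a lossy CP-1252 store, which is the guess made earlier in the thread rather than a confirmed fact. Both function names are hypothetical.

```python
def preview_as_stored(text, charset="cp1252"):
    """Simulate a lossy save: characters the charset lacks become '?'."""
    return text.encode(charset, errors="replace").decode(charset)


def lost_characters(text, charset="cp1252"):
    """List the distinct characters the charset cannot represent."""
    lost = []
    for ch in text:
        try:
            ch.encode(charset)
        except UnicodeEncodeError:
            if ch not in lost:
                lost.append(ch)
    return lost


sample = "10 kΩ ±5 % at 25 °C, 47 µF"
print(preview_as_stored(sample))  # 10 k? ±5 % at 25 °C, 47 µF
print(lost_characters(sample))    # ['Ω']
```

The real filter evidently lets a few extra characters through, so treat the result as a rough warning rather than an exact prediction.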
« Last Edit: September 12, 2017, 01:42:48 am by Monkeh »
 

Offline Bud

  • Super Contributor
  • ***
  • Posts: 7081
  • Country: ca
Re: Can I haz full unicode support @forums plz?
« Reply #13 on: September 12, 2017, 01:40:50 am »
Honestly, from your list it is not a problem for most of those to use conventional ASCII equivalents and the other ones such as the copyright character may only be needed once in a blue moon.
Facebook-free life and Rigol-free shack.
 

Online Monkeh

  • Super Contributor
  • ***
  • Posts: 8049
  • Country: gb
Re: Can I haz full unicode support @forums plz?
« Reply #14 on: September 12, 2017, 01:42:29 am »
Honestly, from your list it is not a problem for most of those to use conventional ASCII equivalents and the other ones such as the copyright character may only be needed once in a blue moon.

And when I decide to copy and paste something from a document which makes use of these wonderful computers we've developed since the 1960s? I can't tell what's screwed up until it's posted and I can't remember the list of needless limitations.

I really like typing µ and ?. Ah, crap, no ?.

?·m. N·m.

Oh hey, we do get interpunct, that's something.
« Last Edit: September 12, 2017, 01:45:57 am by Monkeh »
 

Offline Ampera

  • Super Contributor
  • ***
  • Posts: 2578
  • Country: us
    • Ampera's Forums
Re: Can I haz full unicode support @forums plz?
« Reply #15 on: September 12, 2017, 04:00:12 am »
Well you can always do


I forget who I am sometimes, but then I remember that it's probably not worth remembering.
EEVBlog IRC Admin - Join us on irc.austnet.org #eevblog
 

Offline bluevd

  • Contributor
  • Posts: 13
Re: Can I haz full unicode support @forums plz?
« Reply #16 on: September 12, 2017, 07:13:24 am »
The problem is much more complicated than it would look at first sight.
Consider the software involved: web server (most likely Apache), script interpreters (PHP), database (MySQL) and DB driver (mysqli or whatever PHP has).
And last but not least, the forum software.
If you ever tried to get the above combo to support everything you know how much of a pain it can be. Sometimes you go through all the pain to realize the forum scripts strip special characters out.
Most likely, you'd be better served by a Pastebin service that supports unicode encoding for posting manual translations and such.
 :)
 

Offline Ian.M

  • Super Contributor
  • ***
  • Posts: 13045
Re: Can I haz full unicode support @forums plz?
« Reply #17 on: September 12, 2017, 07:24:28 am »
Actually, v2 of the SMF forum software does support Unicode *IF* it is enabled. The problem is the message database, which is in whatever legacy encoding was chosen when the forum was originally installed, before SMF had Unicode support. There is an official conversion procedure, but I believe there is a considerable risk of data loss, so it would be a lot of work for Gnif to activate it and then resolve the resulting problems.

See https://wiki.simplemachines.org/smf/UTF-8_Readme
 

Offline rs20

  • Super Contributor
  • ***
  • Posts: 2320
  • Country: au
Re: Can I haz full unicode support @forums plz?
« Reply #18 on: September 12, 2017, 07:31:52 am »
As far as I can see it, the site uses the UTF-8 charset, but the editor allows usage of Unicode characters.

UTF-8 is an encoding of Unicode, not an alternative. UTF-8 can encode any Unicode character.
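
To make the distinction concrete, here is a small Python sketch (`describe` is a hypothetical helper): the code point is the Unicode part, and the byte sequence is what UTF-8 turns it into.

```python
def describe(ch):
    """Return (code point, UTF-8 bytes as hex) for a single character."""
    encoded = ch.encode("utf-8")
    return f"U+{ord(ch):04X}", " ".join(f"{b:02x}" for b in encoded)


for ch in "AµΩ€😀":
    code_point, utf8_bytes = describe(ch)
    print(ch, code_point, utf8_bytes)

# A U+0041 41
# µ U+00B5 c2 b5
# Ω U+03A9 ce a9
# € U+20AC e2 82 ac
# 😀 U+1F600 f0 9f 98 80
```

Every code point gets a sequence of one to four bytes, so nothing in Unicode is "not part of UTF-8".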

The problem is much more complicated than it would look at first sight.
Consider the software involved: web server (most likely Apache), script interpreters (PHP), database (MySQL) and DB driver (mysqli or whatever PHP has).
And last but not least, the forum software.
If you ever tried to get the above combo to support everything you know how much of a pain it can be.

Nope, done it before, it's trivial if you're paying attention. If the mods were to set up a brand new EEVBlog forum from scratch, tomorrow, I bet they'd use UTF-8 throughout without any trouble. But what's really difficult, and the semi-legit reason why this change isn't being made, is that the migration from ASCII to UTF-8 is a huge and risky job -- you have to change all these moving parts in parallel, simultaneously, and also modify the encoding in the database and possibly re-encode everything in the database. It's not the sort of thing you can do casually over a few weekends.
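
A rough illustration of why the re-encoding step is risky, sketched in Python rather than in the forum's own PHP/MySQL stack. It assumes the legacy data is CP-1252, which is only the thread's guess, and `reencode` is a hypothetical helper, not part of any SMF tooling.

```python
def reencode(raw, source_charset):
    """Convert bytes stored in a legacy charset into UTF-8 bytes."""
    return raw.decode(source_charset).encode("utf-8")


# A row as it might sit in the legacy database (assumed CP-1252).
legacy_row = "47 µF ±10 %".encode("cp1252")

# Converting once gives clean UTF-8.
migrated = reencode(legacy_row, "cp1252")
print(migrated.decode("utf-8"))  # 47 µF ±10 %

# Converting a row that is already UTF-8 (for example by running the
# migration twice) silently produces mojibake instead of an error.
twice = reencode(migrated, "cp1252")
print(twice.decode("utf-8"))     # 47 ÂµF Â±10 %
```

Nothing fails loudly in the second case, which is why every table, the connection charset and the application have to switch together.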
 

Offline A Hellene

  • Frequent Contributor
  • **
  • Posts: 602
  • Country: gr
Re: Can I haz full unicode support @forums plz?
« Reply #19 on: September 12, 2017, 08:48:07 am »
Yet, Unicode support is still there for older messages; for example:
https://www.eevblog.com/forum/chat/an-partial-goodbye-could-be/msg42748/#msg42748

How can this be?


-George
Hi! This is George; and I am three and a half years old!
(This was one of my latest realisations, now in my early fifties!...)
 
The following users thanked this post: frozenfrogz

Online tooki

  • Super Contributor
  • ***
  • Posts: 12548
  • Country: ch
Re: Can I haz full unicode support @forums plz?
« Reply #20 on: September 12, 2017, 08:56:17 am »
Yes, a switch to Unicode is about 15 years overdue. Aside from scientific notation, there are also things like users asking for translations, or phonetic transcription, just to list some examples. It boggles my mind that anyone in this century would configure a forum as ASCII...
 
The following users thanked this post: wraper, frozenfrogz

Offline wraper

  • Supporter
  • ****
  • Posts: 17579
  • Country: lv
Re: Can I haz full unicode support @forums plz?
« Reply #21 on: September 12, 2017, 09:08:28 am »
Well you can always do



You can, and it's a big PITA. When people ask to translate something about Soviet components or similar, it's a huge waste of time. And the worst part is that the preview shows everything correctly, so you post just to find a bunch of question marks and smilies. Also, when you replace text with pictures, it's no longer possible to copy and google it, which often would be very useful. For example, to find Russian documentation, or suggest a search term for taobao.
« Last Edit: September 12, 2017, 09:15:08 am by wraper »
 
The following users thanked this post: tooki, frozenfrogz

Offline frozenfrogz (Topic starter)

  • Frequent Contributor
  • **
  • Posts: 936
  • Country: de
  • Having fun with Arduino and Raspberry Pi
Re: Can I haz full unicode support @forums plz?
« Reply #22 on: September 12, 2017, 09:38:51 am »
UTF-8 is an encoding of Unicode, not an alternative. UTF-8 can encode any Unicode character.

Sorry, I got that wrong. So my request would be for the full Unicode character set.
Since the forum seems to fully support UTF-8 encoding and can display all Unicode characters (?), the editor might be to blame. Maybe it is just a mismatch of the character encoding in the editor vs. the database.
He’s like a trained ape. Without the training.
 

Offline A Hellene

  • Frequent Contributor
  • **
  • Posts: 602
  • Country: gr
Re: Can I haz full unicode support @forums plz?
« Reply #23 on: September 12, 2017, 10:03:03 am »
[...]
Maybe it is just a mismatch of the character encoding in the editor vs. the database.

Or, perhaps, some kind of filter between the Editor (that seems to be supporting Unicode during preview) and the Database (that supports Unicode, as can be seen below)?



I do not really know...


-George
Hi! This is George; and I am three and a half years old!
(This was one of my latest realisations, now in my early fifties!...)
 

Offline capt bullshot

  • Super Contributor
  • ***
  • Posts: 3033
  • Country: de
    • Mostly useless stuff, but nice to have: wunderkis.de
Re: Can I haz full unicode support @forums plz?
« Reply #24 on: September 12, 2017, 10:03:13 am »
Bwargh. Stay away with all that charset confusion.

Anything else but 7 Bit ASCII is to be considered bad!
Safety devices hinder evolution
 

