Extended KotD #19: _UCHARPOS
Okies guys, this is one of those keywords that I've kinda been dreading having to cover.

Why??

Because the *concept* behind this one, and what it does for us, is rather complex to describe fully.  I have a feeling this is going to be a long arse post, so I'll probably end up wrapping it into spoiler tags so folks can navigate it a little easier, but grab yourself a soda and a snack before delving into this topic.

INTRO

UTF-8 vs UTF-16 vs UTF-32


CODE POINTS vs GLYPHS


And THAT gets us to the point where _UCharPos comes into existence for us!!

As I explained above, a string of Unicode data in UTF-8 may use 1, 2, 3, or 4 bytes to encode a single codepoint.  And then it might take multiple codepoints and merge them together to make a final glyph/character, before printing it to the screen.
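To put some numbers on that, here's a tiny sketch.  It's in Python rather than QB64, purely as an illustration, and the sample characters are just ones I picked.  It encodes a few codepoints to UTF-8 and counts the bytes, then builds one visible glyph out of two codepoints.

```python
# Illustration only (Python, not QB64): bytes per codepoint in UTF-8.
# The sample characters are arbitrary picks, written as escapes so the
# source file's own encoding doesn't matter.
samples = [
    "A",           # U+0041, plain ASCII
    "\u00e9",      # U+00E9, e with acute accent
    "\u20ac",      # U+20AC, euro sign
    "\U0001F600",  # U+1F600, grinning face emoji
]

# Number of UTF-8 bytes each single codepoint needs.
byte_lengths = [len(ch.encode("utf-8")) for ch in samples]

# One glyph on screen, but two codepoints underneath:
# the letter "e" followed by U+0301 COMBINING ACUTE ACCENT.
combined = "e\u0301"
combined_code_points = len(combined)            # codepoints in the string
combined_bytes = len(combined.encode("utf-8"))  # bytes once encoded

print(byte_lengths, combined_code_points, combined_bytes)
```

So the byte count, the codepoint count, and the glyph count are three different numbers, and none of them can be read off the others without walking the data.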

So a string of 20 bytes, in UTF-8 format, might be 20 characters.  Or 19 characters.  Or it might be only 2.   Heck, it might even be formatted wrong and not be any!!
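Here's a rough sketch of those four cases, again in Python just for illustration.  The helper names are made up for this example, and the "glyph" count is only an approximation: it skips combining marks and nothing else, where real glyph counting (emoji sequences and so on) needs proper grapheme segmentation.

```python
import unicodedata

def count_code_points(data: bytes) -> int:
    """Decode UTF-8 and count codepoints (raises UnicodeDecodeError if malformed)."""
    return len(data.decode("utf-8"))

def count_base_characters(data: bytes) -> int:
    """Approximate glyph count: codepoints that are not combining marks."""
    text = data.decode("utf-8")
    return sum(1 for ch in text if unicodedata.combining(ch) == 0)

def is_valid_utf8(data: bytes) -> bool:
    """True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

# Four different 20-byte strings (hypothetical sample data).
plain = b"ABCDEFGHIJKLMNOPQRST"                     # 20 ASCII letters
one_accent = ("A" * 18 + "\u00e9").encode("utf-8")  # 18 letters + one 2-byte codepoint
stacked = ("e" + "\u0301" * 4 + "o" + "\u0308" * 5).encode("utf-8")  # 2 letters buried in accents
malformed = b"\xff" * 20                            # 0xFF never appears in valid UTF-8

for name, data in [("plain", plain), ("one_accent", one_accent), ("stacked", stacked)]:
    print(name, len(data), count_code_points(data), count_base_characters(data))
print("malformed", len(malformed), is_valid_utf8(malformed))
```

Every one of those is exactly 20 bytes long, and the byte length alone tells you nothing about which case you're holding.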

UGHH!!!

So how the BLEEP would someone know how many characters a Unicode string has??  How would you underline the word "FOO" in bright red, if you can't even know how many characters are before it, or after it??
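Just to show the shape of the problem, here's a minimal sketch in Python (illustration only, and NOT a claim about how _UCharPos works inside).  It assumes the data is well-formed UTF-8, the function name is made up, and it counts codepoints rather than finished glyphs.  The trick is that every byte that does not look like 10xxxxxx starts a new codepoint, so we can build a table of start positions and look up "FOO" in it.

```python
def code_point_starts(data: bytes) -> list:
    """Byte offsets (0-based) where each codepoint begins.

    Assumes well-formed UTF-8: continuation bytes have the bit
    pattern 10xxxxxx, so any other byte begins a new codepoint.
    """
    return [i for i, b in enumerate(data) if (b & 0xC0) != 0x80]

# Hypothetical sample text: "cafe" with an accented e, then FOO, then a euro price.
text = "caf\u00e9 FOO \u20ac5".encode("utf-8")

starts = code_point_starts(text)

byte_pos = text.find(b"FOO")          # where FOO begins, counted in bytes
chars_before = starts.index(byte_pos) # where FOO begins, counted in codepoints
chars_total = len(starts)

print(byte_pos, chars_before, chars_total)
```

FOO sits at byte offset 6 but has only 5 codepoints in front of it, because the accented e takes two bytes.  That gap between byte positions and character positions is exactly the bookkeeping we don't want to do by hand.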

HOW THE FLIP DOES ANYONE DO ANYTHING MUCH AT ALL WITH UTF-8 FORMATTED CRAP???

UGGGHHHHHH!!!



Have no fear, _UCharPos is here!!

(to be continued in the next post below this one, as the forum has text limits and I don't want to write and write and write, just to have it cut me off or lose my work)


Messages In This Thread
Extended KotD #19: _UCHARPOS - by SMcNeill - 06-26-2024, 12:46 AM
RE: Extended KotD #19: _UCharPos - by SMcNeill - 06-26-2024, 12:54 AM


