Text and Strings: Not So Simple After All
Code points! Graphemes! Surrogate pairs! Combining characters! Normalization forms! And there you were thinking strings were a simple data structure...
Representing all the world’s languages digitally turns out to be a pretty hard problem. As a result, the Unicode standard is packed full of complexity. This session starts by looking at some of the world’s languages and how their writing systems work. It then digs into how Unicode rises to the challenge of representing these writing systems, explaining the relationship between code points, graphemes, encodings and normalization.
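As a small taste of what that relationship looks like in practice, here is a minimal sketch in Python. It is not taken from the session itself. The helper names `code_points` and `utf16_code_units` are made up for illustration, and only the standard library is used. Python's standard library does not segment text into graphemes, so the sketch only shows that one visible character can be several code points.

```python
import unicodedata


def code_points(text):
    """Return the code points of text in U+XXXX notation (illustrative helper)."""
    return [f"U+{ord(ch):04X}" for ch in text]


def utf16_code_units(text):
    """Return the 16-bit code units of text when encoded as UTF-16 (illustrative helper)."""
    data = text.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


# The same visible character "é", written two ways.
composed = "\u00e9"      # one code point: LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # two code points: "e" plus COMBINING ACUTE ACCENT

print(code_points(composed))     # one entry
print(code_points(decomposed))   # two entries
print(composed == decomposed)    # False: the code point sequences differ

# Normalization maps both spellings onto one canonical form.
print(unicodedata.normalize("NFC", decomposed) == composed)   # True
print(unicodedata.normalize("NFD", composed) == decomposed)   # True

# A code point outside the Basic Multilingual Plane: one code point,
# four bytes in UTF-8, and a surrogate pair (two code units) in UTF-16.
emoji = "\U0001F600"
print(len(emoji), len(emoji.encode("utf-8")))
print([hex(unit) for unit in utf16_code_units(emoji)])
```

The two spellings of "é" look identical on screen but compare unequal until they are normalized, which is the kind of confusion the session sets out to untangle.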
Next time there’s Unicode confusion at the office, be the string hero!