Text and Strings: Not So Simple After All

Code points! Graphemes! Surrogate pairs! Combining characters! Normalization forms! And there you were thinking strings were a simple data structure...

Representing all the world’s languages digitally turns out to be a pretty hard problem. As a result, the Unicode standard is packed full of complexity. This session starts by looking at some of the world’s languages and how their writing systems work. It then digs into how Unicode rises to the challenge of representing these writing systems, explaining the relationship between code points, graphemes, encodings, and normalization.
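
As a small taste of those relationships, here is a minimal sketch in Python, chosen purely for brevity and not taken from the session material. The helper name utf16_code_units and the sample strings are assumptions made for illustration. It shows one visible character stored as two different code point sequences, how normalization reconciles them, and how a single code point can need a surrogate pair in UTF-16.

```python
import unicodedata

# The same visible character "é", stored two different ways.
composed = "\u00e9"       # one code point: LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"    # two code points: "e" + COMBINING ACUTE ACCENT


def utf16_code_units(text: str) -> int:
    """Count the 16-bit code units needed to store text as UTF-16.

    Hypothetical helper for this sketch; it encodes without a byte order
    mark and divides the byte count by two.
    """
    return len(text.encode("utf-16-le")) // 2


# Python's len() counts code points, not what a reader sees as characters.
print(len(composed), len(decomposed))

# The strings look identical on screen but do not compare equal...
print(composed == decomposed)

# ...until both are brought into the same normalization form.
print(unicodedata.normalize("NFC", decomposed) == composed)
print(unicodedata.normalize("NFD", composed) == decomposed)

# A code point outside the Basic Multilingual Plane is one code point,
# but UTF-16 needs a surrogate pair (two code units) to encode it.
emoji = "\U0001F600"
print(len(emoji), utf16_code_units(emoji), len(emoji.encode("utf-8")))
```

.NET strings are sequences of UTF-16 code units, so the last case is where a string's reported length and its number of code points part ways.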

Next time there’s a Unicode confusion at the office, be the string hero!

  • .NET

Target audience



Course info

Course code: A006
Duration: 1 day
Price: Free



