See the article "CBM 16/+4/116/64 character generator and Unicode" (pdf, 110K; also available here) and the page "PETSCII to Unicode mapping".

The documents mentioned above contain a lot of useful information but also several inaccuracies:

  1. They treat the C64/16/+4 box drawing symbols as light, but these symbols are heavy. They are light only on the VIC-20.
  2. Some symbol names and Unicode numbers do not match: Unicode 0x2501 is heavy, not light as stated in the petscii*.txt files.
  3. The shift value given for the vertical and horizontal box drawing symbols is twice as large as it should be.
  4. The quadrant symbols (Unicode 0x2596-0x259e) are missing from the mapping files.
  5. It is very hard to recognize the medium shade symbol in the symbol that resembles part of a chessboard.
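Items 2 and 4 can be checked against the Unicode character database. The following is a minimal sketch that uses Python's standard `unicodedata` module; the variable name `quadrants` is only illustrative, and the script inspects the Unicode names, not the contents of the petscii*.txt files.

```python
import unicodedata

# Item 2: official names of the two horizontal box drawing lines.
for cp in (0x2500, 0x2501):
    print(f"0x{cp:04x}", unicodedata.name(chr(cp)))

# Item 4: the quadrant symbols in the range quoted above.
quadrants = {cp: unicodedata.name(chr(cp)) for cp in range(0x2596, 0x259E + 1)}
for cp, name in quadrants.items():
    print(f"0x{cp:04x}", name)
```

The first loop shows that 0x2500 is the light line and 0x2501 the heavy one.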

The main disadvantage of the Unicode mapping proposed by the second article is its use of the private use area. This is like using an IP address reserved for private networks (10.*.*.* or 192.168.*.* etc.) to access the real WWW.
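Private use code points can be detected mechanically, because Unicode assigns them the general category "Co". The sketch below assumes a small, hypothetical mapping fragment (`sample_mapping`); its values are invented for illustration and are not taken from the mapping files discussed here.

```python
import unicodedata

def is_private_use(code_point: int) -> bool:
    """Return True if the code point lies in a private use area (category 'Co')."""
    return unicodedata.category(chr(code_point)) == "Co"

# Hypothetical fragment: Commodore code -> Unicode code point.
# The last entry is a made-up private use assignment.
sample_mapping = {0x41: 0x0041, 0x5C: 0x00A3, 0xA5: 0xE0A5}

# Collect the entries that another system could not interpret reliably.
private = {hex(src): hex(dst) for src, dst in sample_mapping.items()
           if is_private_use(dst)}
print(private)
```

Any entry reported by such a check has a meaning only by private agreement between sender and receiver.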

The best solution would be to register the unique Commodore symbols as part of a possible "8-bit computers compatibility" code chart.

The two-byte area of Unicode is already almost completely full, so only three-byte codes can realistically be allocated for this new chart.
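To illustrate the byte counts, here is a small sketch. The helper `bytes_for_code_point` is a hypothetical name; it counts the bytes needed to hold the code point number itself, which is not the same as the length of an encoded form such as UTF-8 or UTF-16.

```python
def bytes_for_code_point(code_point: int) -> int:
    """Bytes needed to store the code point number itself (not an encoding)."""
    return max(1, (code_point.bit_length() + 7) // 8)

last_two_byte = 0xFFFF      # last code point of the two-byte area
first_beyond = 0x10000      # first code point after it

print(bytes_for_code_point(last_two_byte))   # 2
print(bytes_for_code_point(first_beyond))    # 3

# Encoded forms are a separate matter: this code point takes 4 bytes
# in both UTF-8 and UTF-16.
print(len(chr(first_beyond).encode("utf-8")))
print(len(chr(first_beyond).encode("utf-16-le")))
```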

Some Commodore symbols are not easy to identify. One of these is the "Check mark" or "Square root" symbol. This ambiguity is reflected even in the official Unicode code charts; see the "Dingbats" and "Mathematical Operators" charts. It is more probable that in the CBM character generator this symbol means "Check mark".
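The two candidates can be printed side by side for comparison. This is a minimal sketch using Python's standard `unicodedata` module; the list name `candidates` is illustrative, and the sketch does not decide which reading is correct.

```python
import unicodedata

# 0x2713 is in the Dingbats chart, 0x221a in Mathematical Operators.
candidates = [0x2713, 0x221A]
for cp in candidates:
    print(f"0x{cp:04x} {chr(cp)} {unicodedata.name(chr(cp))}")
```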

The basic idea of a Unicode mapping is to make exact text conversion possible between any computer systems. The mapping of control characters is significantly less important for this task.
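Exact conversion means that text converted to Unicode and back must yield the original bytes, which requires a one-to-one table. The sketch below assumes a tiny, illustrative table fragment (`CBM_TO_UNICODE`) for the upper-case/graphics character set; it is not copied from the mapping files, and the function names are hypothetical.

```python
# Illustrative fragment only: Commodore code -> Unicode code point.
CBM_TO_UNICODE = {
    0x20: 0x0020,  # space
    0x41: 0x0041,  # letter A
    0x5C: 0x00A3,  # pound sign
    0x5E: 0x2191,  # up arrow
    0x5F: 0x2190,  # left arrow
    0xA0: 0x00A0,  # code 160 -> no-break space
}

# Build the inverse table; if two codes shared one Unicode value,
# an entry would be lost here and exact conversion would be impossible.
UNICODE_TO_CBM = {u: c for c, u in CBM_TO_UNICODE.items()}
if len(UNICODE_TO_CBM) != len(CBM_TO_UNICODE):
    raise ValueError("mapping is not one-to-one")

def cbm_to_text(data: bytes) -> str:
    """Convert Commodore bytes to a Unicode string."""
    return "".join(chr(CBM_TO_UNICODE[b]) for b in data)

def text_to_cbm(text: str) -> bytes:
    """Convert a Unicode string back to Commodore bytes."""
    return bytes(UNICODE_TO_CBM[ord(ch)] for ch in text)

original = bytes([0x41, 0x20, 0x5C, 0xA0, 0x5E])
round_trip = text_to_cbm(cbm_to_text(original))
print(round_trip == original)
```

A table that sends two Commodore codes to the same Unicode character would fail the one-to-one check, which is the practical test of whether a proposed mapping supports exact conversion.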

The Commodore code 160 may be mapped to the "no-break space" symbol (0x00a0).