The IsSpace function will first check if your rune is in the Latin-1 range. If it is, it will use the space characters you listed to determine white space. If not, isExcludingLatin (http://golang.org/src/unicode/letter.go?h=isExcludingLatin#L170) is called, which looks like:
    func isExcludingLatin(rangeTab *RangeTable, r rune) bool {
        r16 := rangeTab.R16
        if off := rangeTab.LatinOffset; len(r16) > off && r <= rune(r16[len(r16)-1].Hi) {
            return is16(r16[off:], uint16(r))
        }
        r32 := rangeTab.R32
        if len(r32) > 0 && r >= rune(r32[0].Lo) {
            return is32(r32, uint32(r))
        }
        return false
    }
The *RangeTable being passed in is White_Space, which is defined here:
http://golang.org/src/unicode/tables.go?h=White_Space#L6069
    var _White_Space = &RangeTable{
        R16: []Range16{
            {0x0009, 0x000d, 1},
            {0x0020, 0x0020, 1},
            {0x0085, 0x0085, 1},
            {0x00a0, 0x00a0, 1},
            {0x1680, 0x1680, 1},
            {0x2000, 0x200a, 1},
            {0x2028, 0x2029, 1},
            {0x202f, 0x202f, 1},
            {0x205f, 0x205f, 1},
            {0x3000, 0x3000, 1},
        },
        LatinOffset: 4,
    }
To answer your main question, the IsSpace check is not limited to Latin-1.
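A quick way to see this for yourself is to feed IsSpace a few runes above U+00FF. The helper name nonLatin1Spaces and the sample runes are my own choices for illustration.

```go
package main

import (
	"fmt"
	"unicode"
)

// nonLatin1Spaces returns the runes that lie above U+00FF and are
// still reported as white space by unicode.IsSpace.
func nonLatin1Spaces(rs []rune) []rune {
	var out []rune
	for _, r := range rs {
		if r > unicode.MaxLatin1 && unicode.IsSpace(r) {
			out = append(out, r)
		}
	}
	return out
}

func main() {
	// U+200B (zero width space) and 'a' are included as negative cases.
	samples := []rune{' ', '\u00a0', '\u1680', '\u2003', '\u200b', '\u3000', 'a'}
	for _, r := range nonLatin1Spaces(samples) {
		fmt.Printf("%U is white space\n", r)
	}
}
```

Note that U+200B is not in the White_Space table, so it is filtered out even though its name suggests otherwise.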
EDIT
For clarification, if the character you are testing is not in the Latin-1 charset, then the range table lookup is used. The Range16 values in the table represent ranges of 16-bit numbers {Lo, Hi, Stride}. isExcludingLatin will call is16 with a sub-slice of the table's R16 field and determine whether the rune provided falls in any of the ranges starting at index LatinOffset (which is 4 in this case).
So, that is checking these ranges:
{0x1680, 0x1680, 1},
{0x2000, 0x200a, 1},
{0x2028, 0x2029, 1},
{0x202f, 0x202f, 1},
{0x205f, 0x205f, 1},
{0x3000, 0x3000, 1},
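The slicing step can be sketched like this. inRanges16 and afterLatinOffset are hypothetical helpers that imitate a linear scan over Range16 entries; they are not the library's is16. The exact entries and the LatinOffset value may differ between Go releases, so the sketch reads them from unicode.White_Space instead of hard-coding 4.

```go
package main

import (
	"fmt"
	"unicode"
)

// inRanges16 reports whether r falls in one of the sorted ranges,
// honouring each range's Stride.
func inRanges16(ranges []unicode.Range16, r rune) bool {
	if r < 0 || r > 0xFFFF {
		return false
	}
	v := uint16(r)
	for _, rg := range ranges {
		if v < rg.Lo {
			return false // ranges are sorted, so no later range can match
		}
		if v <= rg.Hi {
			return rg.Stride == 1 || (v-rg.Lo)%rg.Stride == 0
		}
	}
	return false
}

// afterLatinOffset checks r only against the R16 entries that start
// at index LatinOffset, as isExcludingLatin does.
func afterLatinOffset(r rune) bool {
	tab := unicode.White_Space
	if tab.LatinOffset >= len(tab.R16) {
		return false
	}
	return inRanges16(tab.R16[tab.LatinOffset:], r)
}

func main() {
	fmt.Println("LatinOffset in this Go release:", unicode.White_Space.LatinOffset)
	for _, r := range []rune{0x1680, 0x2003, 0x205f, 0x3000, 0x200b, 'a'} {
		fmt.Printf("%U: %v\n", r, afterLatinOffset(r))
	}
}
```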
These ranges cover the following Unicode code points:
http://www.fileformat.info/info/unicode/char/1680/index.htm
http://www.fileformat.info/info/unicode/char/2000/index.htm
http://www.fileformat.info/info/unicode/char/2001/index.htm
http://www.fileformat.info/info/unicode/char/2002/index.htm
http://www.fileformat.info/info/unicode/char/2003/index.htm
http://www.fileformat.info/info/unicode/char/2004/index.htm
http://www.fileformat.info/info/unicode/char/2005/index.htm
http://www.fileformat.info/info/unicode/char/2006/index.htm
http://www.fileformat.info/info/unicode/char/2007/index.htm
http://www.fileformat.info/info/unicode/char/2008/index.htm
http://www.fileformat.info/info/unicode/char/2009/index.htm
http://www.fileformat.info/info/unicode/char/200a/index.htm
http://www.fileformat.info/info/unicode/char/2028/index.htm
http://www.fileformat.info/info/unicode/char/2029/index.htm
http://www.fileformat.info/info/unicode/char/202f/index.htm
http://www.fileformat.info/info/unicode/char/205f/index.htm
http://www.fileformat.info/info/unicode/char/3000/index.htm
All of the above are considered "white space".
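If you would rather generate that list than trust my links, something like the following should work. spacesAboveLatin1 is a hypothetical helper, and it only scans up to U+FFFF on the assumption that, as in the table quoted above, White_Space has no R32 entries.

```go
package main

import (
	"fmt"
	"unicode"
)

// spacesAboveLatin1 collects every rune from U+0100 to U+FFFF that is
// a member of unicode.White_Space.
func spacesAboveLatin1() []rune {
	var out []rune
	for r := rune(unicode.MaxLatin1 + 1); r <= 0xFFFF; r++ {
		if unicode.Is(unicode.White_Space, r) {
			out = append(out, r)
		}
	}
	return out
}

func main() {
	for _, r := range spacesAboveLatin1() {
		fmt.Printf("%U\n", r)
	}
}
```

With a table matching the one quoted above, this should print the same 17 code points as the links, from U+1680 through U+3000.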