Strings in Swift (Part 3): Small Strings

A small string consists of just two `UInt64` words, and those two words hold everything about the string.
The memory layout is as follows:
The second `UInt64` stores the flag bits and the length, along with part of the string's contents.

    // Get an integer equivalent to the _StringObject.discriminatedObjectRawBits
    // computed property.
    @inlinable @inline(__always)
    internal var rawDiscriminatedObject: UInt64 {
      // Reverse the bytes on big-endian systems.
      return _storage.1.littleEndian
    }
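
Bits of the second word | Value | Meaning |
---|---|---|
b63 | 1 | Immortal (the runtime skips reference counting) |
b62 | 0/1 | Whether the string is ASCII |
b61 | 1 | Is a small string |
b60 | 0 | Provides contiguous (fast) UTF-8 code units |
b59-b56 | 0000-1111 | Number of bytes in use (the count) |
b55-b0 | | Stores UTF-8 code units |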
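
To make the table above concrete, here is a minimal sketch that decodes the flag bits from a raw second word. `SmallDiscriminator` is a hypothetical helper written for this article, not a standard library type, and it assumes the word has already been read in little-endian order (as `rawDiscriminatedObject` does).

```swift
// Hypothetical helper (not part of the standard library): decodes the
// top byte of the second word according to the bit table above.
struct SmallDiscriminator {
  let isImmortal: Bool        // b63
  let isASCII: Bool           // b62
  let isSmall: Bool           // b61
  let providesFastUTF8: Bool  // b60 == 0
  let count: Int              // b59-b56

  init(rawSecondWord raw: UInt64) {
    isImmortal = (raw >> 63) & 1 == 1
    isASCII = (raw >> 62) & 1 == 1
    isSmall = (raw >> 61) & 1 == 1
    providesFastUTF8 = (raw >> 60) & 1 == 0
    count = Int((raw >> 56) & 0xF)
  }
}
```

For example, a five-byte ASCII small string would have `0xE5` as its top byte: bits 63 to 61 set, bit 60 clear, and a count of 5 in the low nibble.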
The first `UInt64` stores nothing but the string's contents.
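
As an illustration of how the contents are spread over the two words, the sketch below reads the byte at a given index. It assumes the little-endian packing used throughout this article (byte 0 sits in the lowest-order byte of the first word, byte 8 in the lowest-order byte of the second). `smallByte(at:in:)` is a hypothetical name, not a standard library function.

```swift
// Hypothetical helper: returns the code unit at `index` from the two raw
// words, assuming little-endian packing. Indices 0-7 live in the first
// word, indices 8-14 in the low seven bytes of the second word.
func smallByte(at index: Int, in words: (UInt64, UInt64)) -> UInt8 {
  precondition(index >= 0 && index < 15, "a small string holds at most 15 bytes")
  let word = index < 8 ? words.0 : words.1
  let shift = UInt64((index % 8) * 8)
  return UInt8(truncatingIfNeeded: word >> shift)
}
```

Index 15 is never valid, because the top byte of the second word is reserved for the flags and the count.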
Initializing a small string
The basic initializer

    @inlinable @inline(__always)
    internal init(leading: UInt64, trailing: UInt64, count: Int) {
      _internalInvariant(count <= _SmallString.capacity)
      let isASCII = (leading | trailing) & 0x8080_8080_8080_8080 == 0
      let discriminator = _StringObject.Nibbles
        .small(withCount: count, isASCII: isASCII)
        .littleEndian // reversed byte order on big-endian platforms
      _internalInvariant(trailing & discriminator == 0)
      self.init(raw: (leading, trailing | discriminator))
      _internalInvariant(self.count == count)
    }
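This is the most basic initializer. It first builds a `discriminator` from the ASCII flag and the count, then ORs that `discriminator` into `trailing` to form the second `UInt64`. The ASCII check works because every ASCII byte has its high bit clear, so masking both words with `0x8080_8080_8080_8080` yields zero only when all bytes are ASCII.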
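
The same steps can be sketched with plain integers. This is only an approximation under stated assumptions: `makeSmallWords` is a hypothetical function, the capacity of 15 and the flag constants `0xE0`/`0xA0` follow the bit table earlier in this article, and a little-endian host is assumed so no byte swapping is shown.

```swift
// Hypothetical stand-in for the basic initializer, using plain UInt64 values.
// Assumes a 64-bit little-endian host and a capacity of 15 bytes.
func makeSmallWords(leading: UInt64, trailing: UInt64, count: Int) -> (UInt64, UInt64) {
  precondition(count >= 0 && count <= 15)
  // ASCII if and only if no byte has its high bit set.
  let isASCII = (leading | trailing) & 0x8080_8080_8080_8080 == 0
  // b63 (immortal) and b61 (small) are always set; b62 is set for ASCII.
  let flags: UInt64 = isASCII ? 0xE000_0000_0000_0000 : 0xA000_0000_0000_0000
  // The count goes into b59-b56.
  let discriminator = flags | (UInt64(count) << 56)
  // The top byte of `trailing` must be free for the discriminator.
  precondition(trailing & discriminator == 0)
  return (leading, trailing | discriminator)
}
```

Because the discriminator only occupies the top byte of the second word, ORing it into `trailing` leaves the seven content bytes untouched.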
Initializing from a buffer

    // Direct from UTF-8
    @inlinable @inline(__always)
    internal init?(_ input: UnsafeBufferPointer<UInt8>) {
      if input.isEmpty {
        self.init()
        return
      }
      let count = input.count
      guard count <= _SmallString.capacity else { return nil }
      // TODO(SIMD): The below can be replaced with just be a masked unaligned
      // vector load
      let ptr = input.baseAddress._unsafelyUnwrappedUnchecked
      let leading = _bytesToUInt64(ptr, Swift.min(input.count, 8))
      let trailing = count > 8 ? _bytesToUInt64(ptr + 8, count &- 8) : 0
      self.init(leading: leading, trailing: trailing, count: count)
    }
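In other words, an empty buffer produces an empty small string right away. Otherwise the initializer checks whether the length exceeds the capacity and returns `nil` if it does.
If the length fits, it packs the first 8 bytes into `leading` and any remaining bytes into `trailing`, then calls `init(leading: UInt64, trailing: UInt64, count: Int)`.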
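
The packing step can be sketched as follows. Both functions are hypothetical: `packBytes` is a simplified stand-in for the internal `_bytesToUInt64`, assumed here to place the first byte in the lowest-order position, and `smallWords(fromUTF8:)` takes an array instead of an `UnsafeBufferPointer<UInt8>` to keep the example safe and self-contained.

```swift
// Hypothetical stand-in for _bytesToUInt64: packs up to 8 bytes into a
// UInt64, with the first byte in the lowest-order position.
func packBytes(_ bytes: ArraySlice<UInt8>) -> UInt64 {
  precondition(bytes.count <= 8)
  var result: UInt64 = 0
  var shift: UInt64 = 0
  for byte in bytes {
    result |= UInt64(byte) << shift
    shift += 8
  }
  return result
}

// Hypothetical sketch of the buffer initializer: returns nil when the
// input does not fit in the assumed capacity of 15 bytes.
func smallWords(fromUTF8 bytes: [UInt8]) -> (leading: UInt64, trailing: UInt64, count: Int)? {
  guard bytes.count <= 15 else { return nil }
  let leading = packBytes(bytes.prefix(8))      // bytes 0..<8
  let trailing = packBytes(bytes.dropFirst(8))  // bytes 8..<count, possibly empty
  return (leading: leading, trailing: trailing, count: bytes.count)
}
```

Passing `Array("hello".utf8)` gives a `leading` word holding the five bytes and a `trailing` word of zero, while any input longer than 15 bytes yields `nil`, which is the signal to fall back to a large string.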
How a small string is created