Strings in Swift (Part 4): Large Strings
For an ordinary (that is, large) string, the corresponding _StringObject has two stored properties:
_countAndFlagsBits: UInt64
_object: Builtin.BridgeObject
_countAndFlagsBits
Stores the string's length along with some flag bits.
┌─────────┬───────┬──────────────────┬─────────────────┬────────┬───────┐
│ b63 │ b62 │ b61 │ b60 │ b59:48 │ b47:0 │
├─────────┼───────┼──────────────────┼─────────────────┼────────┼───────┤
│ isASCII │ isNFC │ isNativelyStored │ isTailAllocated │ TBD │ count │
└─────────┴───────┴──────────────────┴─────────────────┴────────┴───────┘
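The top 16 bits are flags and the low 48 bits hold the string's length. This length is the number of UTF-8 code units (bytes), not the number of characters a reader sees.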
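To make the difference between the UTF-8 code unit count and the visible character count concrete, here is a minimal sketch using only the public String API. The strings are short enough that Swift would normally store them in the small form, so this only illustrates the unit of measure, not the large-string layout itself.

```swift
// Character count vs. UTF-8 code unit count, using public API only.
let plain = "hello"
let accented = "caf\u{E9}"            // "é" (U+00E9) takes 2 bytes in UTF-8
let flag = "\u{1F1E8}\u{1F1F3}"       // two regional-indicator scalars, one Character

print(plain.count, plain.utf8.count)                           // 5 5
print(accented.count, accented.utf8.count)                     // 4 5
print(flag.count, flag.unicodeScalars.count, flag.utf8.count)  // 1 2 8
```

The value kept in the count field corresponds to utf8.count, not to count.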
@inlinable @inline(__always)
internal init(count: Int, flags: UInt16) {
  // Currently, we only use top 4 flags
  _internalInvariant(flags & 0xF000 == flags)
  let rawBits = UInt64(truncatingIfNeeded: flags) &<< 48
              | UInt64(truncatingIfNeeded: count)
  self.init(raw: rawBits)
  _internalInvariant(self.count == count && self.flags == flags)
}
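The packing done by this initializer can be reproduced outside the standard library. The following is a sketch under stated assumptions: CountAndFlags is a hypothetical stand-in type, not the real internal one, and its accessors simply follow the bit layout in the table above.

```swift
// Hypothetical stand-in for the internal type; names are illustrative only.
struct CountAndFlags {
  var raw: UInt64

  // Low 48 bits hold the count (b47:0).
  static let countMask: UInt64 = 0x0000_FFFF_FFFF_FFFF

  init(count: Int, flags: UInt16) {
    // Only the top 4 of the 16 flag bits are in use.
    precondition(flags & 0xF000 == flags, "unexpected flag bits")
    precondition(count >= 0 && UInt64(count) <= Self.countMask, "count out of range")
    raw = (UInt64(flags) << 48) | UInt64(count)
  }

  var count: Int { Int(raw & Self.countMask) }
  var flags: UInt16 { UInt16(raw >> 48) }

  // One accessor per flag bit, b63 down to b60.
  var isASCII: Bool { (raw >> 63) & 1 == 1 }
  var isNFC: Bool { (raw >> 62) & 1 == 1 }
  var isNativelyStored: Bool { (raw >> 61) & 1 == 1 }
  var isTailAllocated: Bool { (raw >> 60) & 1 == 1 }
}

let bits = CountAndFlags(count: 20, flags: 0xF000)
print(String(bits.raw, radix: 16))  // f000000000000014
print(bits.count, bits.isASCII)     // 20 true
```

Keeping the count in the low bits means reading it back is a single mask, with no shift needed.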
_object
Holds the location of the actual string contents. The top four bits are the discriminator, which encodes some properties of the string.
On 64-bit platforms, the discriminator is the most significant 4 bits of the bridge object.
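As a rough illustration of splitting a 64-bit word into its top-4-bit discriminator and the remaining address bits, consider the sketch below. The helper names and the sample value are made up for this example, and the mask is an assumption that follows only from "clear the top 4 bits"; it is not taken from the standard library.

```swift
// Hypothetical helpers, not standard library API.
let discriminatorShift: UInt64 = 60
let addressMask: UInt64 = 0x0FFF_FFFF_FFFF_FFFF  // assumed: clears the top 4 bits

// The most significant 4 bits of the word.
func discriminator(of rawBits: UInt64) -> UInt8 {
  UInt8(truncatingIfNeeded: rawBits >> discriminatorShift)
}

// Everything below the discriminator.
func addressBits(of rawBits: UInt64) -> UInt64 {
  rawBits & addressMask
}

let sample: UInt64 = 0x8000_0001_0000_1000  // made-up value for illustration
print(String(discriminator(of: sample), radix: 16))  // 8
print(String(addressBits(of: sample), radix: 16))    // 100001000
```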
Categories of strings
Large strings can either be "native", "shared", or "foreign".
Native strings have tail-allocated storage, which begins at an offset of nativeBias from the storage object's address. String literals, which reside in the constant section, are encoded as their start address minus nativeBias, unifying code paths for both literals ("immortal native") and native strings. Native Strings are always managed by the Swift runtime.
Shared strings do not have tail-allocated storage, but can provide access upon query to contiguous UTF-8 code units. Lazily-bridged NSStrings capable of providing access to contiguous ASCII/UTF-8 set the ObjC bit. Accessing shared string's pointer should always be behind a resilience barrier, permitting future evolution.
Foreign strings cannot provide access to contiguous UTF-8. Currently, this only encompasses lazily-bridged NSStrings that cannot be treated as "shared". Such strings may provide access to contiguous UTF-16, or may be discontiguous in storage. Accessing foreign strings should remain behind a resilience barrier for future evolution. Other foreign forms are reserved for the future.
Property | native | shared | foreign |
---|---|---|---|
tail-allocated | ✅ | ❌ | ❌ |
contiguous UTF-8 code units | ✅ | ✅ | ❌ |
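The table can also be written down as a small model. This is only a sketch: LargeStringForm is a hypothetical enum invented for this example, and the standard library does not represent the three forms this way.

```swift
// Hypothetical model of the three large-string forms; not standard library API.
enum LargeStringForm: CaseIterable {
  case native, shared, foreign

  // Only native strings keep their bytes in tail-allocated storage.
  var isTailAllocated: Bool { self == .native }

  // Native and shared strings can hand out contiguous UTF-8; foreign ones cannot.
  var providesContiguousUTF8: Bool { self != .foreign }
}

for form in LargeStringForm.allCases {
  print(form, form.isTailAllocated, form.providesContiguousUTF8)
}
// native true true
// shared false true
// foreign false false
```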
Converting to and from NSString
// Whether the object stored can be bridged directly as a NSString
@usableFromInline // @opaque
internal var hasObjCBridgeableObject: Bool {
  @_effects(releasenone) get {
    // Currently, all mortal objects can zero-cost bridge
    return !self.isImmortal
  }
}

// Fetch the stored subclass of NSString for bridging
@inline(__always)
internal var objCBridgeableObject: AnyObject {
  _internalInvariant(hasObjCBridgeableObject)
  return Builtin.reinterpretCast(largeAddressBits)
}