Go map-to-JSON Chinese garbled text: how Go's map is implemented under the hood

Reposted

mob64ca1409970a 2023-10-01 10:16:13

Tags: Go map-to-JSON Chinese garbled text, golang, hash table, hash algorithm, initialization. Category: Go, backend development

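How map works under the hood

A map is a data structure that stores an unordered collection of key-value pairs. Entries are stored by key, so the value for a given key can be found quickly.

Internal implementation overview

Under the hood, a Go map is a hash table. The hash table contains a set of buckets. Every store, delete, or lookup first has to select a bucket: the key given for the operation is passed to the map's hash function, and the result identifies the bucket. Keeping a reasonable number of buckets spreads the key-value pairs evenly, which makes lookups much faster.

Example: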

p := map[string]string{"Red":"#da23"}

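The line above declares a map whose keys and values are both strings. Let's first look at how a string key is stored inside the map.

When a string is used as a map key, a hash function computes a hash value from it. That hash value is then reduced to the range of valid bucket indexes, so it identifies a bucket that can store the entry. The same hash value selects the bucket both when storing and when looking up a key-value pair.

A deeper look at map internals

Go's map has its own underlying implementation, and at its core are two structs: hmap and bmap.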
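The key-to-bucket step can be sketched as follows. This is only an illustration: it uses FNV-1a from the standard library as a stand-in for the runtime's own hash function, and bucketIndex and its numBuckets parameter are hypothetical names that do not appear in the runtime.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// bucketIndex hashes key with FNV-1a (an assumption; the runtime uses its
// own seeded hash) and keeps only the low bits needed to address
// numBuckets buckets. numBuckets must be a power of two.
func bucketIndex(key string, numBuckets uint64) uint64 {
	h := fnv.New64a()
	h.Write([]byte(key)) // Write on a hash.Hash never returns an error
	return h.Sum64() & (numBuckets - 1)
}

func main() {
	for _, key := range []string{"Red", "Green", "Blue"} {
		fmt.Printf("%-5s -> bucket %d\n", key, bucketIndex(key, 8))
	}
}
```

The same key always lands in the same bucket, which is what lets a lookup go straight to one bucket instead of scanning every entry.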
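For orientation, here is a simplified sketch of the two structs. The field names are the ones used by the runtime source quoted later in this article; the optional extra field is left out, and the layout is a reading aid rather than a copy of the runtime declarations.

```go
package main

import (
	"fmt"
	"unsafe"
)

const bucketCnt = 8 // slots per bucket

// hmap is the map header (simplified; the optional "extra" field is omitted).
type hmap struct {
	count      int            // number of live entries, what len() reports
	flags      uint8          // state bits such as hashWriting
	B          uint8          // log2 of the number of buckets
	noverflow  uint16         // approximate number of overflow buckets
	hash0      uint32         // hash seed
	buckets    unsafe.Pointer // array of 2^B buckets
	oldbuckets unsafe.Pointer // previous bucket array, non-nil only while growing
	nevacuate  uintptr        // migration progress counter
}

// bmap is one bucket. Only tophash is declared here; in this sketch the
// keys, values and overflow pointer that belong to a bucket are not modeled.
type bmap struct {
	tophash [bucketCnt]uint8
}

func main() {
	h := hmap{B: 1}
	var b bmap
	fmt.Println(uint64(1)<<h.B, len(b.tophash)) // 2 8
}
```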

(1) What does the runtime do when a map is initialized?

Suppose you initialize a map with a capacity hint of 10 elements:

mymap := make(map[string]string,10)

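When the map is created, the runtime creates an hmap struct and initializes it: it generates a hash seed and stores it in the hash0 field, leaves count at 0, and computes the number of buckets. Source location: go/src/runtime/map.go

Source: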

func makemap(t *maptype, hint int, h *hmap) *hmap {
	mem, overflow := math.MulUintptr(uintptr(hint), t.bucket.size)
	if overflow || mem > maxAlloc {
		hint = 0
	}

	// initialize Hmap: create one if the caller did not pass one in
	if h == nil {
		h = new(hmap)
	}
	// generate the hash seed
	h.hash0 = fastrand()

	// Find the size parameter B which will hold the requested # of elements.
	// For hint < 0 overLoadFactor returns false since hint < bucketCnt.
	// compute B from the hint; B decides how many buckets are created (2^B)
	B := uint8(0) // starts at 0 and is raised while overLoadFactor reports overload
	for overLoadFactor(hint, B) {
		B++
	}
	h.B = B

	// allocate initial hash table
	// if B == 0, the buckets field is allocated lazily later (in mapassign)
	// If hint is large zeroing this memory could take a while.
	// create the buckets
	if h.B != 0 {
		var nextOverflow *bmap
		h.buckets, nextOverflow = makeBucketArray(t, h.B, nil)
		if nextOverflow != nil {
			h.extra = new(mapextra)
			h.extra.nextOverflow = nextOverflow
		}
	}

	return h
}
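To see what the loop over overLoadFactor does for a concrete hint, here is a small standalone sketch. It assumes 8 slots per bucket and a load factor of 6.5 written as 13/2, which matches the 6.5 threshold mentioned later in this article; chooseB is a hypothetical helper name, and the constants should be checked against the runtime version you are reading.

```go
package main

import "fmt"

const (
	bucketCnt     = 8  // slots per bucket (assumed)
	loadFactorNum = 13 // load factor 6.5 expressed as 13/2 (assumed)
	loadFactorDen = 2
)

// overLoadFactor reports whether count entries spread over 1<<B buckets
// would exceed the load factor.
func overLoadFactor(count int, B uint8) bool {
	return count > bucketCnt &&
		uint64(count) > loadFactorNum*((uint64(1)<<B)/loadFactorDen)
}

// chooseB repeats the loop from makemap: raise B until the hint fits.
func chooseB(hint int) uint8 {
	B := uint8(0)
	for overLoadFactor(hint, B) {
		B++
	}
	return B
}

func main() {
	for _, hint := range []int{5, 10, 100} {
		B := chooseB(hint)
		fmt.Printf("hint=%d -> B=%d (%d buckets)\n", hint, B, uint64(1)<<B)
	}
	// hint=5 -> B=0 (1 buckets)
	// hint=10 -> B=1 (2 buckets)
	// hint=100 -> B=4 (16 buckets)
}
```

So under these assumptions make(map[string]string, 10) ends up with B = 1, that is, two buckets.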

(2) What happens when data is added?

mymap := make(map[string]string,10)
mymap["name"] = "诸葛亮"

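Step 1: the key you pass in is combined with the hash seed to produce a hash value.

Step 2: the low B bits of the hash value (viewed in binary) determine which bucket (bmap) the entry belongs in.

Step 3: once the bucket is chosen, the entry can be added. The top 8 bits of the hash are stored in the bucket's tophash array; when the bucket is full, the overflow pointer leads to an overflow bucket.

Source: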
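Steps 2 and 3 come down to two bit operations on the hash. The sketch below assumes a 64-bit hash and assumes that tophash values below 5 are reserved as slot-state markers (which is why small values are shifted up); the hash value itself is made up for the example.

```go
package main

import "fmt"

const minTopHash = 5 // values below this are assumed reserved as slot markers

// bucketMask returns a mask that keeps the low B bits of a hash.
func bucketMask(B uint8) uint64 {
	return uint64(1)<<B - 1
}

// tophash returns the top 8 bits of a 64-bit hash, moved out of the
// reserved range if necessary.
func tophash(hash uint64) uint8 {
	top := uint8(hash >> 56)
	if top < minTopHash {
		top += minTopHash
	}
	return top
}

func main() {
	hash := uint64(0xAB00000000000005) // made-up hash value
	B := uint8(3)                      // 8 buckets
	fmt.Println(hash&bucketMask(B), tophash(hash)) // 5 171
}
```

The low bits pick the bucket, and the high bits give a cheap one-byte fingerprint that is compared before the full key.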

func mapassign(t *maptype, h *hmap, key unsafe.Pointer) unsafe.Pointer {
	// ... some code omitted
	// compute the hash
	hash := t.hasher(key, uintptr(h.hash0))
	h.flags ^= hashWriting
	if h.buckets == nil {
		h.buckets = newobject(t.bucket) // newarray(t.bucket, 1)
	}
  
again:
	bucket := hash & bucketMask(h.B)
	if h.growing() {
		growWork(t, h, bucket)
	}
	b := (*bmap)(add(h.buckets, bucket*uintptr(t.bucketsize)))
	// compute tophash
	top := tophash(hash)

	var inserti *uint8
	var insertk unsafe.Pointer
	var elem unsafe.Pointer

	// ... some code omitted

(3) What happens when data is read?

mymap := make(map[string]string,10)
mymap["name"] = "诸葛亮"
// read the value
value := mymap["name"]

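Step 1: the hash seed and the key are combined to produce a hash value.

Step 2: the low B bits of the hash value determine the bucket (bmap) that holds the key.

Step 3: the tophash (top 8 bits) is then used to search inside that bucket for the matching entry.

(4) Map growth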

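When is growth triggered?

Growth conditions:

total number of entries / number of buckets > 6.5: triggers doubling growth
too many overflow buckets in use (too many overflow buckets slow the map down):
B <= 15: number of overflow buckets in use >= 2^B triggers same-size growth
B > 15: number of overflow buckets in use >= 2^15 triggers same-size growth

Source: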
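The conditions above can be put into a small decision function. This is a sketch under the same assumptions as the list (8 slots per bucket, load factor 6.5 written as 13/2); growthKind is a hypothetical helper that only names the outcome and does not perform any growth.

```go
package main

import "fmt"

const (
	bucketCnt     = 8  // slots per bucket (assumed)
	loadFactorNum = 13 // load factor 6.5 expressed as 13/2 (assumed)
	loadFactorDen = 2
)

// overLoadFactor reports whether count entries in 1<<B buckets exceed the load factor.
func overLoadFactor(count int, B uint8) bool {
	return count > bucketCnt &&
		uint64(count) > loadFactorNum*((uint64(1)<<B)/loadFactorDen)
}

// tooManyOverflowBuckets applies the overflow rule, capping the threshold at 2^15.
func tooManyOverflowBuckets(noverflow uint16, B uint8) bool {
	if B > 15 {
		B = 15
	}
	return noverflow >= uint16(1)<<B
}

// growthKind names the growth that adding one more entry would trigger.
func growthKind(count int, B uint8, noverflow uint16) string {
	switch {
	case overLoadFactor(count+1, B):
		return "double"
	case tooManyOverflowBuckets(noverflow, B):
		return "same-size"
	default:
		return "none"
	}
}

func main() {
	fmt.Println(growthKind(52, 3, 0)) // double
	fmt.Println(growthKind(20, 3, 8)) // same-size
	fmt.Println(growthKind(20, 3, 2)) // none
}
```

The load-factor check comes first, which matches hashGrow: it only falls back to same-size growth when the load factor has not been hit.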

func hashGrow(t *maptype, h *hmap) {
	// If we've hit the load factor, get bigger.
	// Otherwise, there are too many overflow buckets,
	// so keep the same number of buckets and "grow" laterally.
	bigger := uint8(1)
	if !overLoadFactor(h.count+1, h.B) {
		bigger = 0
		h.flags |= sameSizeGrow
	}
	oldbuckets := h.buckets
	newbuckets, nextOverflow := makeBucketArray(t, h.B+bigger, nil)

	flags := h.flags &^ (iterator | oldIterator)
	if h.flags&iterator != 0 {
		flags |= oldIterator
	}
	// commit the grow (atomic wrt gc)
	h.B += bigger
	h.flags = flags
	h.oldbuckets = oldbuckets
	h.buckets = newbuckets
	h.nevacuate = 0
	h.noverflow = 0

	if h.extra != nil && h.extra.overflow != nil {
		// Promote current overflow buckets to the old generation.
		if h.extra.oldoverflow != nil {
			throw("oldoverflow is not nil")
		}
		h.extra.oldoverflow = h.extra.overflow
		h.extra.overflow = nil
	}
	if nextOverflow != nil {
		if h.extra == nil {
			h.extra = new(mapextra)
		}
		h.extra.nextOverflow = nextOverflow
	}

	// the actual copying of the hash table data is done incrementally
	// by growWork() and evacuate().
}

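(5) What happens during data migration after growth?

Doubling growth: when the map doubles, migration moves the data from the old buckets into the new buckets, and where each entry goes is decided by its hash value. The implementation:

Iterate over the old buckets

(6) Other notes

What is special about passing a map as a function argument?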
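As a rough illustration of "decided by its hash value": after doubling, the bucket index uses one more bit of the hash, so the entries of old bucket i can only end up in new bucket i or new bucket i + 2^B, where B is the old value. The sketch below shows just that index arithmetic; newBucketIndex is a hypothetical name, the hash values are made up, and the incremental copying done by growWork() and evacuate() is not modeled.

```go
package main

import "fmt"

// newBucketIndex returns the bucket index of hash after the table has
// doubled from 1<<oldB buckets to 1<<(oldB+1) buckets.
func newBucketIndex(hash uint64, oldB uint8) uint64 {
	return hash & (uint64(1)<<(oldB+1) - 1)
}

func main() {
	oldB := uint8(2) // 4 old buckets, 8 new ones
	for _, hash := range []uint64{1, 5, 9, 13} {
		oldIdx := hash & (uint64(1)<<oldB - 1)
		fmt.Printf("hash=%04b old=%d new=%d\n", hash, oldIdx, newBucketIndex(hash, oldB))
	}
	// hash=0001 old=1 new=1
	// hash=0101 old=1 new=5
	// hash=1001 old=1 new=1
	// hash=1101 old=1 new=5
}
```

All four made-up hashes share old bucket 1; the extra bit splits them between new buckets 1 and 5.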

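package main

import "fmt"

// changeMap modifies the map it receives and prints its address and contents.
func changeMap(m map[string]string) {
	m["name"] = "曹操"
	fmt.Printf("m=%p,m=%v\n", m, m)
}

func main() {
	m1 := map[string]string{"name": "诸葛亮"}
	fmt.Printf("m1=%p,m1=%v\n", m1, m1)
	changeMap(m1) // prints the same address as m1
}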

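Note: this example shows that a map passed to a function still refers to the same underlying hash table. Go always passes arguments by value, but a map value is effectively a pointer to an hmap, so only that pointer is copied and changes made inside the function are visible to the caller.

(7) Characteristics of map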
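A small contrast makes the point clearer: writing through the parameter is visible to the caller, while assigning a brand-new map to the parameter is not. The function names mutate and replace are made up for this sketch.

```go
package main

import "fmt"

// mutate writes through the copied map value, so the caller sees the change.
func mutate(m map[string]string) {
	m["name"] = "曹操"
}

// replace rebinds only the local parameter to a new map; the caller's
// map is left untouched.
func replace(m map[string]string) {
	m = map[string]string{}
	m["name"] = "replaced"
}

func main() {
	m := map[string]string{"name": "诸葛亮"}
	mutate(m)
	fmt.Println(m["name"]) // 曹操
	replace(m)
	fmt.Println(m["name"]) // 曹操
}
```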