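• For a random variable X, the k-th raw moment (moment about the origin) of X is $E(X^k)$
  • The k-th central moment of X is $E\big([X - E(X)]^k\big)$
  • The expectation is the 1st raw moment of X, and the variance is the 2nd central moment of X
  • Coefficient of Variation: the ratio of the standard deviation to the mean (expectation), denoted C.V
  • Skewness (third order)
  • Kurtosis (fourth order)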
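
The definitions above can be checked numerically. The following is a minimal sketch using only the standard library; the helper names `raw_moment`, `central_moment`, and `coefficient_of_variation` are illustrative and do not come from the original post. It assumes population (divide-by-n) moments, which is also what the code later in this post uses.

```python
import math

def raw_moment(data, k):
    """k-th raw moment: the average of x**k."""
    return sum(x ** k for x in data) / len(data)

def central_moment(data, k):
    """k-th central moment: the average of (x - mean)**k."""
    mean = raw_moment(data, 1)
    return sum((x - mean) ** k for x in data) / len(data)

def coefficient_of_variation(data):
    """Standard deviation divided by the mean (undefined when the mean is 0)."""
    return math.sqrt(central_moment(data, 2)) / raw_moment(data, 1)

sample = [1.0, 2.0, 3.0, 4.0, 5.0]    # hypothetical data
mean = raw_moment(sample, 1)          # expectation = 1st raw moment
variance = central_moment(sample, 2)  # variance = 2nd central moment
cv = coefficient_of_variation(sample)
print(mean, variance, cv)
```

For this sample the mean is 3 and the variance is 2, so C.V is sqrt(2)/3.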

Skewness and kurtosis

[Figure: skewness and kurtosis, histogram]

Simulating skewness and kurtosis with matplotlib

Computing the expectation and standard deviation

<span style="color:#000000"><code class="language-python">import matplotlib.pyplot as plt
import math
import numpy as np

def calc(data):
    """Return [mean, standard deviation, E(x^3)] of data."""
    n = len(data)  # number of samples (10000 in the example below)
    niu = 0.0   # mean, i.e. the expectation
    niu2 = 0.0  # mean of the squares
    niu3 = 0.0  # mean of the cubes
    for a in data:
        niu += a
        niu2 += a**2
        niu3 += a**3
    niu /= n
    niu2 /= n
    niu3 /= n
    sigma = math.sqrt(niu2 - niu*niu)  # standard deviation: sqrt(E(x^2) - E(x)^2)
    return [niu, sigma, niu3]
</code></span>
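  • niu = $\bar{X}$, the sample mean, used as the expectation
  • niu2 = $\frac{1}{n}\sum_{i=1}^{n} X_i^2$
  • niu3 = $\frac{1}{n}\sum_{i=1}^{n} X_i^3$
  • sigma is the standard deviation, $\sigma = \sqrt{E(x^2) - E(x)^2}$,
    which in Python is sigma = math.sqrt(niu2 - niu*niu)
  • The return value is [expectation, standard deviation, $E(x^3)$]
  • PS: the expectation E(X) of a discrete random variable is computed as $E(X) = \sum_{i=1}^{n} p(i)\,x(i)$  (1)
  • We use $E(x) = \bar{X}$  (2) directly as the expectation. Two points should be made explicit:
  1. In (2), the $X_i$ are pseudo-random numbers generated with numpy, and their mean is used to represent the expectation.
  2. In this case every sample in (1) carries the same weight, $p(i) = 1/n$, so the formula is really $E(x) \approx \frac{1}{n}\sum_{i=1}^{n} X_i = \bar{X}$.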
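
One way to read the relation between formulas (1) and (2) is that each of the n samples is assigned the same probability 1/n. The sketch below illustrates this under that assumption; the function name `expectation` and the sample values are made up for the example.

```python
def expectation(values, probs):
    """Formula (1): the sum of p(i) * x(i)."""
    return sum(p * x for p, x in zip(probs, values))

values = [2.0, 4.0, 4.0, 6.0]   # hypothetical sample
n = len(values)
equal_weights = [1.0 / n] * n   # every sample gets probability 1/n

weighted = expectation(values, equal_weights)  # formula (1) with equal weights
sample_mean = sum(values) / n                  # formula (2)
print(weighted, sample_mean)
```

With equal weights the two quantities coincide, which is why the sample mean can stand in for the expectation here.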

Computing skewness and kurtosis

<span style="color:#000000"><code class="language-python">def calc_stat(data):
    """Return [mean, standard deviation, skewness, kurtosis] of data."""
    [niu, sigma, niu3] = calc(data)
    n = len(data)
    niu4 = 0.0  # fourth central moment, the numerator of the kurtosis
    for a in data:
        a -= niu
        niu4 += a**4
    niu4 /= n

    # skewness: third central moment (expanded in raw moments) over sigma^3
    skew = (niu3 - 3*niu*sigma**2 - niu**3) / (sigma**3)
    # kurtosis: the denominator is the variance squared, i.e. sigma^4
    kurt = niu4 / (sigma**4)
    return [niu, sigma, skew, kurt]</code></span>
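
The numerator of the skewness expression in `calc_stat` is the third central moment written in terms of raw moments. A short derivation, writing $\mu = E(X)$ and using $E(X^2) = \sigma^2 + \mu^2$:

```latex
\begin{aligned}
E\big[(X-\mu)^3\big]
  &= E(X^3) - 3\mu\,E(X^2) + 3\mu^2 E(X) - \mu^3 \\
  &= E(X^3) - 3\mu(\sigma^2 + \mu^2) + 2\mu^3 \\
  &= E(X^3) - 3\mu\sigma^2 - \mu^3
\end{aligned}
```

Dividing by $\sigma^3$ gives the skewness. The kurtosis computed here is $E[(X-\mu)^4]/\sigma^4$, the non-excess form, which equals 3 for a normal distribution.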

Plotting the simulation with matplotlib

<span style="color:#000000"><code class="language-python">if __name__ == "__main__":
    data = list(np.random.randn(10000))  # 10000 samples from the standard normal distribution
    data2 = list(2*np.random.randn(10000))  # standard normal samples scaled by 2, so the variance is 4 times larger
    data3 = [x for x in data if x>-0.5]  # keep only the values of data greater than -0.5
    data4 = list(np.random.uniform(0,4,10000))  # uniform distribution on [0, 4)
    [niu, sigma, skew, kurt] = calc_stat(data)
    [niu_2, sigma2, skew2, kurt2] = calc_stat(data2)
    [niu_3, sigma3, skew3, kurt3] = calc_stat(data3)
    [niu_4, sigma4, skew4, kurt4] = calc_stat(data4)
    print(niu, sigma, skew, kurt)
    print(niu_2, sigma2, skew2, kurt2)
    print(niu_3, sigma3, skew3, kurt3)
    print(niu_4, sigma4, skew4, kurt4)
    # annotation text, one label per plotted histogram
    info = r'$\mu=%.2f,\ \sigma=%.2f,\ skew=%.2f,\ kurt=%.2f$' % (niu, sigma, skew, kurt)
    info2 = r'$\mu=%.2f,\ \sigma=%.2f,\ skew=%.2f,\ kurt=%.2f$' % (niu_2, sigma2, skew2, kurt2)
    info4 = r'$\mu=%.2f,\ \sigma=%.2f,\ skew=%.2f,\ kurt=%.2f$' % (niu_4, sigma4, skew4, kurt4)
    plt.text(1, 0.38, info, bbox=dict(facecolor='red', alpha=0.25))
    plt.text(1, 0.35, info2, bbox=dict(facecolor='green', alpha=0.25))
    plt.text(1, 0.32, info4, bbox=dict(facecolor='blue', alpha=0.25))  # label for the blue histogram (data4)
    # density=True normalizes each histogram to unit area
    # (older Matplotlib versions called this argument normed)
    plt.hist(data, 100, density=True, facecolor='r', alpha=0.9)
    plt.hist(data2, 100, density=True, facecolor='g', alpha=0.8)
    plt.hist(data4, 100, density=True, facecolor='b', alpha=0.7)
    plt.grid(True)
    plt.show()</code></span>
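
As a sanity check on the numbers this script prints, the results can be compared with known theoretical values: a normal distribution has skewness 0 and kurtosis 3, and a uniform distribution has skewness 0 and kurtosis 1.8. The sketch below is a standard-library-only approximation of the experiment; the helper `skew_kurt`, the seed, and the sample size of 20000 are choices made for this illustration, not part of the original code.

```python
import random

def skew_kurt(data):
    """Population skewness and non-excess kurtosis from central moments."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2

random.seed(0)  # arbitrary fixed seed so the run is repeatable
normal = [random.gauss(0.0, 1.0) for _ in range(20000)]
uniform = [random.uniform(0.0, 4.0) for _ in range(20000)]
truncated = [x for x in normal if x > -0.5]  # left tail cut off

skew_n, kurt_n = skew_kurt(normal)
skew_u, kurt_u = skew_kurt(uniform)
skew_t, kurt_t = skew_kurt(truncated)
print("normal   ", skew_n, kurt_n)
print("uniform  ", skew_u, kurt_u)
print("truncated", skew_t, kurt_t)
```

Cutting off the left tail leaves a longer right tail, so the truncated sample should show clearly positive skewness.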



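  • The figure shows the empirical distribution of the random numbers generated with numpy, drawn as histograms with matplotlib.pyplot.hist. The histograms are normalized so that the total area of the bars in each one equals 1 (the bar heights themselves are not confined to the 0~1 range).
  • The horizontal axis is the value; the vertical axis is the probability density estimated from the 10000 random numbers, that is, the fraction of samples falling in a bin divided by the bin width. Without normalization (normed=False, called density=False in current Matplotlib), the horizontal axis is still the value and the vertical axis is the number of occurrences in each bin.
  • Without normalization, the vertical axis shows counts: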

[Figure: the histograms without normalization]
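
The difference between counts and a normalized histogram can be sketched by hand. The code below is a simplified, standard-library-only illustration with equal-width bins; the `histogram` helper is hypothetical and only mirrors the idea, not matplotlib's exact binning.

```python
import random

def histogram(data, num_bins):
    """Return (counts, edges) for equal-width bins spanning min(data)..max(data)."""
    lo, hi = min(data), max(data)
    width = (hi - lo) / num_bins
    counts = [0] * num_bins
    for x in data:
        # the maximum value is placed in the last bin
        idx = min(int((x - lo) / width), num_bins - 1)
        counts[idx] += 1
    edges = [lo + i * width for i in range(num_bins + 1)]
    return counts, edges

random.seed(1)  # arbitrary fixed seed
data = [random.gauss(0.0, 1.0) for _ in range(10000)]

counts, edges = histogram(data, 100)
width = edges[1] - edges[0]
# density = count / (number of samples * bin width)
density = [c / (len(data) * width) for c in counts]

total_count = sum(counts)               # unnormalized: heights sum to the sample size
area = sum(d * width for d in density)  # normalized: bar areas sum to 1
print(total_count, area)
```

The unnormalized heights add up to the number of samples, while the normalized bars have a total area of 1.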

  • About the matplotlib.pyplot.hist function
<span style="color:#000000"><code class="language-python">n, bins, patches = plt.hist(arr, bins=10, density=False, facecolor='black', edgecolor='black', alpha=1, histtype='bar')</code></span>
hist takes many parameters, but the seven below are the most commonly used. Only the first is required; the other six are optional.

arr: the one-dimensional array whose histogram is computed

bins: the number of bars in the histogram; optional, default 10

density (normed in older Matplotlib versions): whether to normalize the histogram so that its total area is 1; default False

facecolor: fill color of the bars

edgecolor: border color of the bars

alpha: opacity, from 0 (transparent) to 1 (opaque)

histtype: type of histogram, one of 'bar', 'barstacked', 'step', 'stepfilled'

Return values:

n: the histogram values, one per bin; counts or densities depending on density

bins: the bin edges, one more than the number of bins

patches: the drawn objects (one rectangle per bin for a bar histogram), not the data contained in each bin
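
The return values can be inspected directly. The following is a small sketch, assuming a Matplotlib version that accepts the `density` argument; the `Agg` backend, the seed, and the sample size are choices made here so the example runs without opening a window.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, no display needed
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)    # arbitrary fixed seed
arr = rng.standard_normal(1000)   # hypothetical input data

n, bins, patches = plt.hist(arr, bins=10, density=True,
                            facecolor='black', edgecolor='white',
                            alpha=0.8, histtype='bar')

print(len(n))        # one histogram value per bin
print(len(bins))     # bin edges: one more than the number of bins
print(len(patches))  # one drawn rectangle per bin

# with density=True, the bar areas (height * width) sum to 1
area = float(np.sum(n * np.diff(bins)))
print(area)
```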