Notes from studying the JMP course.



JMP001: Medical Malpractice







Data variables:

Amount: Amount of the claim payment in dollars

Severity: The severity rating of damage to the patient, from 1 (emotional trauma) to 9 (death)

Age: Age of the claimant in years

Private Attorney: Whether the claimant was represented by a private attorney

Marital Status: Marital status of the claimant

Specialty: Specialty of the physician involved in the lawsuit

Insurance: Type of medical insurance carried by the patient

Gender: Patient gender


Each variable has a modeling type: Continuous, Ordinal, or Nominal.



Histogram of the distribution: (Analyze > Distribution; select Amount as Y, Columns, and click OK. For a horizontal layout, select Stack under the top red triangle.)


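The histogram shows that the distribution is right-skewed, with a long tail of large values on the right. The mean is 91044 and the median is 22750. When a distribution is right-skewed, the mean ends up much larger than the median, because the few very large values in the tail pull the average upward.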

The spread of the data is measured by the standard deviation (std): the larger the value, the more dispersed the data.
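
To make the mean/median/std relationship concrete, here is a minimal Python sketch. The amounts are made-up illustrative values, not the course data set, and the variable names are my own.

```python
import statistics

# Synthetic, made-up payment amounts (NOT the course data): mostly small
# values plus two very large ones, which gives a right-skewed shape.
amounts = [5_000, 8_000, 12_000, 20_000, 25_000, 30_000, 45_000, 400_000, 900_000]

mean = statistics.mean(amounts)
median = statistics.median(amounts)
std = statistics.stdev(amounts)  # sample standard deviation (n - 1 denominator)

print(f"all values: mean={mean:.0f} median={median:.0f} std={std:.0f}")

# Drop the two largest values to see how strongly the tail drives the
# mean and the standard deviation, while the median moves far less.
trimmed = amounts[:-2]
trimmed_mean = statistics.mean(trimmed)
trimmed_median = statistics.median(trimmed)
trimmed_std = statistics.stdev(trimmed)

print(f"trimmed:    mean={trimmed_mean:.0f} median={trimmed_median:.0f} std={trimmed_std:.0f}")
```

With the tail included, the mean sits far above the median; without it, the two are close. That is the same pattern as the 91044 vs. 22750 gap noted above.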




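In the box plot, the left edge of the box is the first quartile (Q1), the line inside the box is the median, the right edge is the third quartile (Q3), and the diamond marks the mean. The width of the box is the IQR. The whiskers reach at most 1.5 × IQR beyond the edges of the box; points beyond that are treated as outliers.

The IQR (interquartile range) is a measure of spread that covers the middle 50% of the data. It is the difference between the third quartile and the first quartile: IQR = Q3 - Q1. The IQR can be used to flag potential outliers: a data point that lies more than 1.5 × IQR below Q1 or above Q3 is usually treated as an outlier.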
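
The 1.5 × IQR rule can be sketched in a few lines of Python. This is only an illustration: the sample values are made up, the function name `iqr_outliers` is my own, and the quartile convention used here (`method="inclusive"`) is an assumption that may not match the one JMP uses, so quartiles can differ slightly.

```python
import statistics

def iqr_outliers(values):
    """Return (q1, q3, iqr, outliers) using the 1.5 * IQR rule."""
    # "inclusive" is one of several quartile conventions (assumed here).
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Anything outside the fences is flagged as a potential outlier.
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    return q1, q3, iqr, outliers

# Made-up sample with one obviously extreme value.
sample = [10, 12, 13, 15, 16, 18, 20, 22, 95]
q1, q3, iqr, outliers = iqr_outliers(sample)
print(f"Q1={q1} Q3={q3} IQR={iqr} outliers={outliers}")
```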


Filtering data: set rows to Hide/Exclude rather than deleting the data outright.
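
The idea behind excluding instead of deleting can be sketched outside JMP as well. In the Python sketch below, the `excluded` flag is a hypothetical stand-in for a row state; it is not a JMP API, and the rows and the threshold are made up.

```python
import statistics

# Hypothetical rows; "excluded" mimics an exclude flag kept next to the data.
rows = [
    {"amount": 10_000, "excluded": False},
    {"amount": 22_000, "excluded": False},
    {"amount": 35_000, "excluded": False},
    {"amount": 900_000, "excluded": False},
]

def included_amounts(rows):
    """Return the amounts of rows that are not flagged as excluded."""
    return [r["amount"] for r in rows if not r["excluded"]]

# Flag the extreme row instead of removing it from the list.
for r in rows:
    if r["amount"] > 500_000:  # illustrative threshold, an assumption
        r["excluded"] = True

print("rows kept in the table:", len(rows))
print("mean over included rows:", statistics.mean(included_amounts(rows)))
```

Because the row is only flagged, clearing the flag brings it back into the analysis, which is not possible once the data has been deleted.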




Along with bar charts, Pareto plots and pie charts can be used to display information about nominal (categorical) variables.
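
As a rough sketch of what a Pareto plot summarizes, the Python below counts categories, sorts them by descending frequency, and adds a cumulative percentage. The category labels are placeholders, not values from the course data, and `pareto_table` is a name I made up.

```python
from collections import Counter

def pareto_table(categories):
    """Return (category, count, cumulative_percent) rows, largest count first."""
    counts = Counter(categories).most_common()  # sorted by descending count
    total = sum(count for _name, count in counts)
    table = []
    running = 0
    for name, count in counts:
        running += count
        table.append((name, count, 100.0 * running / total))
    return table

# Placeholder category labels (made up).
labels = ["A", "B", "A", "C", "A", "B"]
table = pareto_table(labels)
for name, count, cum_pct in table:
    print(f"{name}: count={count} cumulative={cum_pct:.1f}%")
```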



a. Dynamic plot-linking

b. Data Filter (Rows > Data Filter; select Gender and click Add)

c. Side-by-side (comparative) box plots (Analyze > Fit Y by X; use Private Attorney as X, Factor and Amount as Y, Response. Then select Quantiles under the top red triangle.)

d. Graph Builder (Graph > Graph Builder; drag and drop Amount in Y, Gender in X, and Private Attorney in Group X. Click the box plot icon at the top. Or right-click in the graph and select Points > Change to > Box Plot.)
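
The numbers behind side-by-side box plots are just quantiles computed per group. Here is a minimal Python sketch of that idea. The records are made up, the group labels "yes"/"no" are assumed values for Private Attorney, and the quartile convention (`method="inclusive"`) may differ from the one JMP uses.

```python
import statistics
from collections import defaultdict

def quantiles_by_group(records):
    """Group (label, value) pairs and return a five-number summary per label."""
    groups = defaultdict(list)
    for label, value in records:
        groups[label].append(value)
    summary = {}
    for label, values in groups.items():
        q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
        summary[label] = {
            "min": min(values), "q1": q1, "median": median,
            "q3": q3, "max": max(values),
        }
    return summary

# Made-up (private_attorney, amount) pairs, not the course data.
records = [
    ("yes", 20_000), ("yes", 40_000), ("yes", 60_000), ("yes", 80_000), ("yes", 100_000),
    ("no", 5_000), ("no", 10_000), ("no", 15_000), ("no", 20_000), ("no", 25_000),
]
summary = quantiles_by_group(records)
for label, stats in summary.items():
    print(label, stats)
```

Grouping by a pair such as (gender, private_attorney) instead of a single label would give the per-panel summaries that a Group X layout displays.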

