Example 1: GMM

Given a set of observed data consisting of two clusters generated by a two-component GMM, we need to find a model that fits them.



For a parametric model, we can inspect the observed data, or rely on experience, and assume the model is a two-component GMM.


However, suppose we add more training data containing an additional cluster, for three clusters in total. A parametric two-component GMM

can only re-estimate the parameters of its two components (for example, the component weights); it cannot add a third component. Clearly, it cannot fit the data well.
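
The limitation above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: it assumes 1-D data, three synthetic clusters centered at -10, 0 and 10, and a plain EM fit. The names `fit_two_component_gmm`, `normal_pdf`, `centers` and `gaps` are made up for this example.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def fit_two_component_gmm(data, n_iter=200):
    """Fit a 1-D GMM with exactly two components by plain EM."""
    # Illustrative initialisation (an assumption): means at the data extremes.
    means = [min(data), max(data)]
    variances = [1.0, 1.0]
    weights = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in data:
            p = [weights[k] * normal_pdf(x, means[k], variances[k]) for k in range(2)]
            total = sum(p) or 1e-300  # guard against numerical underflow
            resp.append([pk / total for pk in p])
        # M-step: re-estimate weights, means and variances.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / len(data)
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            variances[k] = max(var, 1e-6)  # guard against variance collapse
    return weights, means, variances


rng = random.Random(0)
centers = [-10.0, 0.0, 10.0]  # three well-separated synthetic clusters
data = [rng.gauss(c, 1.0) for c in centers for _ in range(200)]

weights, means, variances = fit_two_component_gmm(data)
# Distance from each true cluster center to the nearest fitted mean.
gaps = [min(abs(c - m) for m in means) for c in centers]
print("fitted means:", means)
print("gap per true center:", gaps)
```

With only two means available, at least one of the three cluster centers must end up far from every fitted mean, however EM adjusts the weights, means and variances.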


A nonparametric model can change its structure, or its number of parameters, depending on the size of the data.
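
One way to see a model whose size follows the data is a DP-means-style clustering rule, a hard-assignment relative of the Dirichlet process mixture. This technique is swapped in here for illustration and is not necessarily what these notes have in mind. The sketch assumes 1-D data, and the function name `dp_means` and the `penalty` threshold are choices made for this example.

```python
import random


def dp_means(data, penalty, n_iter=20):
    """DP-means-style clustering of 1-D data.

    A point whose squared distance to every center exceeds `penalty`
    opens a new cluster, so the number of clusters is not fixed up front.
    """
    centers = [sum(data) / len(data)]  # start with one global cluster
    for _ in range(n_iter):
        assignments = []
        for x in data:
            dists = [(x - c) ** 2 for c in centers]
            k = min(range(len(centers)), key=dists.__getitem__)
            if dists[k] > penalty:
                centers.append(x)  # open a new cluster at this point
                k = len(centers) - 1
            assignments.append(k)
        # Recompute each center as the mean of its points; drop empty ones.
        new_centers = []
        for k in range(len(centers)):
            members = [x for x, a in zip(data, assignments) if a == k]
            if members:
                new_centers.append(sum(members) / len(members))
        centers = new_centers
    return sorted(centers)


rng = random.Random(0)
two = [rng.gauss(c, 0.5) for c in (-10.0, 10.0) for _ in range(200)]
three = two + [rng.gauss(0.0, 0.5) for _ in range(200)]  # add a third cluster

centers_two = dp_means(two, penalty=25.0)
centers_three = dp_means(three, penalty=25.0)
print(len(centers_two), "centers:", centers_two)
print(len(centers_three), "centers:", centers_three)
```

The same code and the same `penalty` are used for both datasets; only the data changes, and the number of centers changes with it.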


Example 2: topic modelling 


Similarly, suppose we have a training set of 6 words w1-w6, and each word belongs to a distinct topic (z1-z6 are all different). A theta with 6 components can fit

these words well. If we add a word w7 with a new topic z7 (z1-z7 are all different), a theta with 6 components cannot fit the data well, and LDA may generate

a topic-word distribution like this example:


topic 1: w1

topic 2: w2

topic 3: w3

topic 4: w4, w7

topic 5: w5

topic 6: w6

As a result, topic 4 is not meaningful: its topic words are not coherent!

With a nonparametric method, theta can be expanded to 7 components, depending on the prior process.


For nonparametric models, we usually place a prior process on the distribution that we estimate, such as a GP (Gaussian process), DP (Dirichlet process), BP (beta process), or IBP (Indian buffet process).
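
As one concrete illustration of such a prior process, the sketch below samples from a Chinese restaurant process (CRP), which describes the partition induced by a Dirichlet process. It is only a sketch under assumptions: the function name `crp_seating` and the concentration value `alpha=1.0` are chosen for this example, and "tables" stand in for clusters or topics.

```python
import random


def crp_seating(n_customers, alpha, rng):
    """Seat customers one by one under a Chinese restaurant process.

    Each customer joins an existing table with probability proportional
    to its size, or opens a new table with probability proportional to alpha.
    """
    tables = []   # tables[k] = number of customers at table k
    history = []  # history[i] = number of tables after customer i is seated
    for i in range(n_customers):
        # Total unnormalised mass: i customers already seated, plus alpha.
        u = rng.uniform(0, i + alpha)
        for k, size in enumerate(tables):
            if u < size:
                tables[k] += 1
                break
            u -= size
        else:
            tables.append(1)  # the remaining mass alpha opens a new table
        history.append(len(tables))
    return tables, history


rng = random.Random(0)
tables, history = crp_seating(1000, alpha=1.0, rng=rng)
for n in (10, 100, 1000):
    print(n, "customers ->", history[n - 1], "tables")
```

The number of tables is never fixed in advance: it can only stay the same or grow as more data arrives, and in expectation it grows roughly like alpha * log(1 + n / alpha).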


Note: nonparametric doesn't mean the model has no parameters! It means the number of parameters is not fixed in advance and can change with the data.