Study notes: Predicting Clicks: Estimating the Click-Through Rate for New Ads

Reposted

mob604756f44f2a 2021-07-22 15:29:00


Billing models

cost-per-click (CPC): the search engine is paid every time a user clicks the ad.

cost-per-impression (CPI): advertisers are charged according to the number of times their ad is shown.

cost-per-action (CPA): advertisers are charged only when the ad display leads to some desired action by the user, such as purchasing a product or signing up for a newsletter.

Model

The paper uses a logistic regression (LR) model:

\[ CTR = \frac{1}{1+e^{-Z}} \]

\[ Z = \sum_i \omega_i f_i(ad) \]

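Here $f_i(ad)$ is the value of the i-th feature of the ad, and $\omega_i$ is the learned weight for that feature.

Features can be anything computable from the ad, for example the number of words in the title, or whether a particular word is present.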
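As a minimal sketch of the two formulas above, the prediction step can be written in a few lines of Python. The feature values and weights below are made up for illustration; they are not learned values from the paper.

```python
import math

def predict_ctr(features, weights):
    """Return the logistic-regression CTR estimate for one ad."""
    # Z is the weighted sum of the feature values.
    z = sum(w * f for w, f in zip(weights, features))
    # The logistic function maps Z to a probability in (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical features: number of words in the title, and a 0/1
# indicator for whether the title contains a particular word.
features = [4.0, 1.0]
weights = [-0.4, -0.5]  # illustrative weights only
ctr = predict_ctr(features, weights)
```

With all weights at zero the model predicts a CTR of 0.5, so in practice the learned weights have to pull Z strongly negative to reach realistic click rates.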

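Feature engineering

Add a bias feature that is always 1.

For each feature $f_i$, add the derived features $f_i^2$ and $\log(f_i + 1)$. (The +1 avoids $\log(0)$, for example for count features.)

Standardization: scale each feature to mean 0 and standard deviation 1. (The mean and standard deviation are computed on the training set, then used to standardize both the training set and the test set.)

Outlier truncation: any value more than 5 standard deviations from the mean is truncated to 5 standard deviations.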
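The four preprocessing steps above can be sketched as follows, using only the Python standard library. The function names are hypothetical. The notes do not say whether the population or sample standard deviation is used, nor how the constant bias column is handled, so the choices below (population standard deviation, constant columns left unscaled) are assumptions.

```python
import math
import statistics

def expand(row):
    """Add the bias feature plus f^2 and log(f + 1) for every raw feature."""
    out = [1.0]  # bias feature, always 1
    for f in row:
        out.extend([f, f * f, math.log(f + 1.0)])
    return out

def fit_scaler(train_rows):
    """Per-column mean and standard deviation, from the training set only."""
    cols = list(zip(*train_rows))
    means = [statistics.fmean(c) for c in cols]
    stds = [statistics.pstdev(c) for c in cols]  # assumption: population std
    return means, stds

def transform(row, means, stds, clip=5.0):
    """Standardize one expanded row, truncating beyond `clip` std devs."""
    out = []
    for x, m, s in zip(row, means, stds):
        if s == 0.0:
            # Constant column such as the bias: left unscaled (assumption).
            out.append(x)
            continue
        z = (x - m) / s
        out.append(max(-clip, min(clip, z)))
    return out

# Hypothetical single count feature; the test row is an extreme outlier.
train_raw = [[0.0], [1.0], [2.0], [3.0]]
test_raw = [[100.0]]

train = [expand(r) for r in train_raw]
means, stds = fit_scaler(train)
train_std = [transform(r, means, stds) for r in train]
test_std = [transform(expand(r), means, stds) for r in test_raw]
```

The test row is scaled with the training-set statistics, so its outlier value lands far from zero and is truncated to 5 in every derived column.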

term CTR

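The first feature is the average CTR of other ads that share the same bid term (excluding ads from the current advertiser). To handle terms that never appear in the training set, this per-term CTR is smoothed toward the mean CTR of all ads in the training set:

\[ f_0(ad) = \frac{\alpha \, \overline{CTR} + N(ad_{term}) \, CTR(ad_{term})}{\alpha + N(ad_{term})} \]

where $N(ad_{term})$ is the number of ads with that term, $CTR(ad_{term})$ is the average CTR of those ads, $\overline{CTR}$ is the mean CTR over all training ads, and $\alpha$ sets the strength of the smoothing.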
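A minimal sketch of this smoothing, assuming it takes the usual additive form (alpha * mean CTR + N * term CTR) / (alpha + N). The function name and the numbers are illustrative, and the default `alpha=1.0` is a placeholder rather than a value taken from these notes.

```python
def smoothed_term_ctr(n_ads, term_ctr, mean_ctr, alpha=1.0):
    """Smooth a per-term CTR toward the overall training-set mean CTR.

    n_ads    -- number of other ads that share the bid term
    term_ctr -- average CTR of those ads (ignored when n_ads == 0)
    mean_ctr -- mean CTR over all ads in the training set
    alpha    -- smoothing strength (assumed default, for illustration)
    """
    return (alpha * mean_ctr + n_ads * term_ctr) / (alpha + n_ads)

# A term never seen in training falls back to the overall mean.
unseen = smoothed_term_ctr(n_ads=0, term_ctr=0.0, mean_ctr=0.02)

# A term backed by 9 ads moves most of the way to its own average.
seen = smoothed_term_ctr(n_ads=9, term_ctr=0.10, mean_ctr=0.02)
```

The more ads share a term, the less the overall mean matters; a larger `alpha` makes the estimate stay closer to the mean for longer.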

related term CTR

measure of performance

Measure: the average KL divergence between the predicted CTR and the true CTR on the test set.

\[ KL = - \sum_k p_k \log \left( \frac{p'_k}{p_k} \right) \]

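Baseline model: the average CTR on the training set.

Models are evaluated by their percentage improvement in KL divergence over this baseline; all reported improvements are statistically significant (p < 0.01).

One additional metric is also reported: mean squared error (MSE).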
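The evaluation can be sketched as below. The notes do not spell out what k ranges over or which distribution is p, so this sketch assumes each ad's CTR defines a two-outcome distribution (click, no click) and that p is the true CTR and p' the predicted one. All function names and numbers are illustrative.

```python
import math

def bernoulli_kl(p_true, p_pred, eps=1e-12):
    """KL(true || predicted) over the two outcomes click / no click."""
    # Keep the prediction away from exactly 0 or 1 to avoid log(0).
    p_pred = min(max(p_pred, eps), 1.0 - eps)
    kl = 0.0
    for p, q in ((p_true, p_pred), (1.0 - p_true, 1.0 - p_pred)):
        if p > 0.0:  # a term with p == 0 contributes nothing
            kl -= p * math.log(q / p)
    return kl

def mean_kl(true_ctrs, pred_ctrs):
    """Average per-ad KL divergence over a test set."""
    pairs = list(zip(true_ctrs, pred_ctrs))
    return sum(bernoulli_kl(t, p) for t, p in pairs) / len(pairs)

def mse(true_ctrs, pred_ctrs):
    """Mean squared error between true and predicted CTR."""
    pairs = list(zip(true_ctrs, pred_ctrs))
    return sum((t - p) ** 2 for t, p in pairs) / len(pairs)

def improvement_pct(baseline_score, model_score):
    """Percentage reduction of an error score relative to the baseline."""
    return 100.0 * (baseline_score - model_score) / baseline_score

# Made-up test set: the baseline predicts one average CTR for every ad.
true_ctrs = [0.02, 0.10]
baseline_preds = [0.06, 0.06]
model_preds = [0.03, 0.09]

kl_gain = improvement_pct(mean_kl(true_ctrs, baseline_preds),
                          mean_kl(true_ctrs, model_preds))
```

Because KL divergence and MSE are both error measures, a positive improvement means the model's score is lower than the baseline's.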

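Estimating ad quality

Even when restricted to a single term (such as surgery), the variance in CTR can still be large; for example, the maximum CTR for surgery is 5 times the average.

The paper hypothesizes five broad categories of factors that influence whether a user clicks an ad:

Appearance: whether the ad is aesthetically pleasing to the user.

Attention capture: whether the ad draws the user's attention.

Reputation: whether the advertiser is a well-known brand.

Landing page quality: the landing page is only seen after the click, but the paper hypothesizes that many clicks go to advertisers the user already knows (such as Amazon or eBay). Landing page quality can therefore act as an implicit factor, and may lead to repeat visits when the user is looking for new products.

Relevance: how relevant the ad is to the search query.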

Measuring order specificity

An example order from an advertiser:

Title: Buy shoes now,

Text: Shop at our discount shoe warehouse!

Url: shoes.com

Terms: {buy shoes, shoes, cheap shoes}.

Sometimes an order targets a much broader range:

Title: Buy [term] now,

Text: Shop at our discount warehouse!

Url: store.com

Terms: {shoes, TVs, grass, paint}.

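This motivates measuring how the breadth of the terms affects CTR. The paper does it as follows:

Classify the terms into 74 categories using a naive Bayes trigram classifier.

For each order, compute the entropy of the distribution of its terms over those categories, and use it as a feature in the prediction model.

Also add the number of terms in the order as a feature.

These two features are together called order specificity; adding them improved model performance by 5.5%.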
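The two order specificity features can be sketched as below for the two example orders. The `TERM_CATEGORY` lookup table is a hypothetical stand-in for the naive Bayes trigram classifier, and its category names are invented for illustration.

```python
import math
from collections import Counter

# Hypothetical stand-in for the term classifier: term -> category.
TERM_CATEGORY = {
    "buy shoes": "apparel",
    "shoes": "apparel",
    "cheap shoes": "apparel",
    "TVs": "electronics",
    "grass": "garden",
    "paint": "home improvement",
}

def category_entropy(categories):
    """Shannon entropy (natural log) of the category distribution."""
    counts = Counter(categories)
    total = sum(counts.values())
    return sum(-(c / total) * math.log(c / total) for c in counts.values())

def order_specificity(terms):
    """Return (category entropy, number of terms) for one order."""
    categories = [TERM_CATEGORY[t] for t in terms]
    return category_entropy(categories), len(terms)

narrow = order_specificity(["buy shoes", "shoes", "cheap shoes"])
broad = order_specificity(["shoes", "TVs", "grass", "paint"])
```

The narrowly targeted shoe order has all of its terms in one category, so its entropy is zero, while the broad order spreads evenly over four categories and reaches the maximum entropy for four terms, log 4.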
