# Covariate Shift Correction

## July 03, 2013

When bidding for real in an online production environment, the parameters of the bidding model usually need to be adjusted.


Covariate shift means that the training and test data are too different: the distribution we assumed when sampling the training data differs so much from the real distribution that the training ends up wasted.

• Step 1: estimate the ratio between the true distribution p and the assumed (sampling) distribution q
• Step 2: reweight the training set

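Step 1 requires a way to measure the difference between the two distributions. An intuitive approach: train a logistic regression (LR) model on the combined "training + to-be-predicted" data, with the label being whether an instance belongs to the training set. If the model separates the two well, the difference is large; if it cannot, the difference is small. In theory, a classifier produced by any learning method would work here (see the conclusion of the paper Discriminative Learning Under Covariate Shift).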
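The classifier test described above can be sketched in a few lines. This is a minimal illustration, not a production implementation: it assumes a single numeric feature, a hand-written logistic regression trained by plain gradient descent, and synthetic Gaussian data. The names `fit_domain_classifier` and `domain_accuracy`, the learning rate, and the epoch count are all hypothetical choices made for this example.

```python
import math
import random


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_domain_classifier(train_x, test_x, lr=0.5, epochs=300):
    """Fit a 1-D logistic regression: label 1 = training set, 0 = data to predict."""
    data = [(x, 1.0) for x in train_x] + [(x, 0.0) for x in test_x]
    w, b = 0.0, 0.0
    n = len(data)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in data:
            err = sigmoid(w * x + b) - y  # gradient of the log loss w.r.t. the logit
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


def domain_accuracy(w, b, train_x, test_x):
    # Fraction of instances whose origin (training vs. to-be-predicted) is guessed correctly.
    correct = sum(sigmoid(w * x + b) >= 0.5 for x in train_x)
    correct += sum(sigmoid(w * x + b) < 0.5 for x in test_x)
    return correct / (len(train_x) + len(test_x))


rng = random.Random(0)
train_x = [rng.gauss(0.0, 1.0) for _ in range(500)]
shifted = [rng.gauss(2.0, 1.0) for _ in range(500)]  # toy data with a shifted mean
same = [rng.gauss(0.0, 1.0) for _ in range(500)]     # toy data from the same distribution

acc_shifted = domain_accuracy(*fit_domain_classifier(train_x, shifted), train_x, shifted)
acc_same = domain_accuracy(*fit_domain_classifier(train_x, same), train_x, same)
print(acc_shifted, acc_same)
```

An accuracy close to 0.5 suggests the classifier cannot tell the two sets apart, so the shift is small; an accuracy well above 0.5 suggests a large shift. In a real setting the accuracy would normally be measured on held-out data rather than on the data used for fitting.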

• Step 2: once this ratio has been learned, it can be used to reweight.
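Why reweighting by the ratio works can be written as a short derivation. The notation here is an assumption of this sketch: p is the correct distribution, q is the sampling distribution, ℓ is any per-instance loss, and the conditional p(y | x) is taken to be the same under both distributions (the usual covariate shift assumption). It also requires q(x) > 0 wherever p(x) > 0.

```latex
\mathbb{E}_{(x,y) \sim p}\big[\ell(x, y)\big]
  = \iint \ell(x, y)\, p(y \mid x)\, p(x)\, dy\, dx
  = \iint \ell(x, y)\, p(y \mid x)\, \frac{p(x)}{q(x)}\, q(x)\, dy\, dx
  = \mathbb{E}_{(x,y) \sim q}\!\left[\frac{p(x)}{q(x)}\, \ell(x, y)\right]
  \approx \frac{1}{n} \sum_{i=1}^{n} \frac{p(x_i)}{q(x_i)}\, \ell(x_i, y_i),
  \qquad (x_i, y_i) \sim q .
```

So an average over training instances, each weighted by p(x_i)/q(x_i), estimates the loss that would be seen under the correct distribution.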

Reweight each instance by the ratio of probabilities that it would have been drawn from the correct distribution p, that is, reweight by p(x_i)/q(x_i). This is the ratio of how frequently the instance would have occurred in the correct set versus how frequently it occurred under the sampling distribution q.
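A toy example of this reweighting is sketched below. It assumes both densities are known in closed form (two Gaussians chosen purely for illustration), which is rarely true in practice; there, the ratio would come from an estimate such as the classifier in Step 1, where p(x)/q(x) is proportional to P(test | x) / P(train | x). The helper names `importance_weights` and `weighted_mean` are hypothetical.

```python
import math
import random


def gaussian_pdf(x, mean, std):
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def importance_weights(xs, p_pdf, q_pdf):
    """Weight each training instance by p(x_i) / q(x_i)."""
    return [p_pdf(x) / q_pdf(x) for x in xs]


def weighted_mean(values, weights):
    # Self-normalized importance-weighted average.
    total = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total


rng = random.Random(0)
# Toy assumption: training data sampled from q = N(0, 1); correct distribution p = N(1, 1).
train_x = [rng.gauss(0.0, 1.0) for _ in range(20000)]
weights = importance_weights(
    train_x,
    p_pdf=lambda x: gaussian_pdf(x, 1.0, 1.0),
    q_pdf=lambda x: gaussian_pdf(x, 0.0, 1.0),
)

plain = sum(train_x) / len(train_x)         # estimates the mean under q
reweighted = weighted_mean(train_x, weights)  # estimates the mean under p
print(plain, reweighted)
```

The unweighted average reflects the sampling distribution, while the reweighted average should land near the mean of the correct distribution. The same weights could be passed as per-instance sample weights when fitting a model, which is the reweighting of the training set that Step 2 refers to.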
