Redian News
Asking the experts: “How do you deal with missing data?”
Board: DataSciences - Data Science
a*y
1
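I'm not sure how to answer this question, so I'd appreciate guidance from the experts.
Should the missing values be filled in with the median or the mean?
Many thanks!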
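For reference, the median/mean fill being asked about can be sketched as follows. This is only a minimal illustration using the standard library; the helper name `impute_constant` and the sample `ages` list are made up for the example, and missing entries are assumed to be encoded as `None` or NaN.

```python
import math
import statistics


def impute_constant(values, strategy="median"):
    """Replace None/NaN entries with the mean or median of the observed entries."""

    def is_missing(v):
        return v is None or (isinstance(v, float) and math.isnan(v))

    observed = [v for v in values if not is_missing(v)]
    if not observed:
        raise ValueError("no observed values to impute from")
    if strategy == "median":
        fill = statistics.median(observed)
    elif strategy == "mean":
        fill = statistics.fmean(observed)
    else:
        raise ValueError("strategy must be 'median' or 'mean'")
    return [fill if is_missing(v) else v for v in values]


# Hypothetical data: two missing entries and one outlier (120).
ages = [23, None, 31, 40, float("nan"), 120]
filled_median = impute_constant(ages, "median")  # fills with 35.5
filled_mean = impute_constant(ages, "mean")      # fills with 53.5
```

The outlier pulls the mean fill well above the median fill, which is one reason the median is often preferred for skewed columns.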
s*h
2
My answers from my onsite interview:
1. Do as you suggested (fill in with the median or mean).
2. Assume some distribution underlying the missing values, then fill them in
accordingly.
3. If we can justify that the data are missing at random (MAR), or missing
completely at random (MCAR), though that is usually too optimistic, try several
imputation methods such as multiple imputation. The basic idea is to assume
some dependency on the other predictors and use them to predict the missing
values.
For more information, see Rubin's paper on multiple imputation.
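The idea in point 3 can be sketched roughly as below. This is a simplified illustration, not a full implementation: it assumes a single fully observed predictor `x`, a linear relationship with Gaussian noise, and missing `y` values encoded as `None`. The names `fit_line` and `multiply_impute_mean` and the synthetic data are made up for the example. The pooling step follows Rubin's rules (total variance = within + (1 + 1/m) * between). A proper multiple imputation would also redraw the regression parameters for each imputation, which this sketch skips.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b, residual_sd)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    a = my - b * mx
    resid = [y - (a + b * x) for x, y in zip(xs, ys)]
    sd = (sum(r * r for r in resid) / (len(xs) - 2)) ** 0.5
    return a, b, sd


def multiply_impute_mean(xs, ys, m=20, seed=0):
    """Estimate mean(y) from m stochastic regression imputations.

    Returns (pooled_mean, total_variance) combined with Rubin's rules.
    """
    rng = random.Random(seed)
    complete = [(x, y) for x, y in zip(xs, ys) if y is not None]
    a, b, sd = fit_line([x for x, _ in complete], [y for _, y in complete])
    n = len(ys)
    estimates, variances = [], []
    for _ in range(m):
        # Fill each missing y with its prediction plus random noise.
        completed = [
            y if y is not None else a + b * x + rng.gauss(0, sd)
            for x, y in zip(xs, ys)
        ]
        estimates.append(statistics.fmean(completed))
        variances.append(statistics.variance(completed) / n)
    pooled = statistics.fmean(estimates)
    within = statistics.fmean(variances)
    between = statistics.variance(estimates)
    return pooled, within + (1 + 1 / m) * between


# Synthetic data: y is more likely to be missing when x is large, so the
# missingness depends only on the observed predictor (MAR).
rng = random.Random(7)
xs = [rng.uniform(0, 10) for _ in range(500)]
full_ys = [2 + 3 * x + rng.gauss(0, 1) for x in xs]
ys = [None if (x > 5 and rng.random() < 0.8) else y for x, y in zip(xs, full_ys)]

full_mean = statistics.fmean(full_ys)                       # unknown in practice
cc_mean = statistics.fmean(y for y in ys if y is not None)  # complete cases only
pooled, total_var = multiply_impute_mean(xs, ys)
```

On this synthetic data the complete-case mean is pulled toward small `x`, while the pooled estimate should land much closer to the full-data mean.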
s*a
3
You can't fill in with the average unless the values are missing at random. But usually you don't know whether they are random or not.
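A small simulation can illustrate this point. It is only a sketch on synthetic data: the `incomes` values, the 40% and 80% missing rates, and the threshold of 55 are all made up. In case A values go missing regardless of their size; in case B large values are more likely to be missing.

```python
import random
import statistics

rng = random.Random(42)
incomes = [rng.gauss(50.0, 10.0) for _ in range(5000)]  # synthetic, fully known
true_mean = statistics.fmean(incomes)


def mean_fill(values):
    """Replace None with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = statistics.fmean(observed)
    return [fill if v is None else v for v in values]


# Case A: 40% of the values are missing, regardless of the value.
mcar = [None if rng.random() < 0.4 else v for v in incomes]
# Case B: values above 55 are missing 80% of the time (depends on the value).
not_random = [None if (v > 55 and rng.random() < 0.8) else v for v in incomes]

bias_mcar = statistics.fmean(mean_fill(mcar)) - true_mean
bias_not_random = statistics.fmean(mean_fill(not_random)) - true_mean
# Mean filling also shrinks the spread, even when missingness is random.
sd_ratio_mcar = statistics.stdev(mean_fill(mcar)) / statistics.stdev(incomes)
```

In case A the bias of the mean should be close to zero, while in case B it is clearly negative. Even in case A the standard deviation after filling is noticeably smaller than the true one, so the average fill is not harmless even when the randomness assumption holds.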
a*g
4
The EM algorithm is an option, I think.
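As a rough sketch of what EM looks like here, assume the simplest textbook setting: `(x, y)` is bivariate normal, `x` is always observed, and some `y` values are missing (encoded as `None`). The function name `em_bivariate_normal` and the synthetic data are made up for the example. The E-step replaces the missing `y` and `y^2` terms with their conditional expectations given `x`; the M-step recomputes the moments from those expected sums.

```python
import random
import statistics


def em_bivariate_normal(xs, ys, tol=1e-12, max_iter=1000):
    """EM estimates of (mean of y, var of y, cov of x and y) with missing y."""
    n = len(xs)
    # x is fully observed, so its moments are fixed at their sample values.
    mu_x = statistics.fmean(xs)
    s_xx = sum((x - mu_x) ** 2 for x in xs) / n
    observed = [y for y in ys if y is not None]
    mu_y = statistics.fmean(observed)
    s_yy = statistics.pvariance(observed)
    s_xy = 0.0  # start from "no dependence"
    for _ in range(max_iter):
        beta = s_xy / s_xx
        cond_var = s_yy - s_xy ** 2 / s_xx
        sum_y = sum_yy = sum_xy = 0.0
        for x, y in zip(xs, ys):
            if y is None:
                # E-step: conditional expectations of y and y^2 given x.
                ey = mu_y + beta * (x - mu_x)
                eyy = ey * ey + cond_var
            else:
                ey, eyy = y, y * y
            sum_y += ey
            sum_yy += eyy
            sum_xy += x * ey
        # M-step: moments from the expected sufficient statistics.
        new_mu_y = sum_y / n
        new_s_yy = sum_yy / n - new_mu_y ** 2
        new_s_xy = sum_xy / n - mu_x * new_mu_y
        change = max(
            abs(new_mu_y - mu_y), abs(new_s_yy - s_yy), abs(new_s_xy - s_xy)
        )
        mu_y, s_yy, s_xy = new_mu_y, new_s_yy, new_s_xy
        if change < tol:
            break
    return mu_y, s_yy, s_xy


# Synthetic data: y is more likely to be missing when x is large.
rng = random.Random(1)
xs = [rng.gauss(0, 1) for _ in range(400)]
full_ys = [1 + 2 * x + rng.gauss(0, 1) for x in xs]
ys = [None if (x > 0.3 and rng.random() < 0.7) else y for x, y in zip(xs, full_ys)]

mu_y_em, s_yy_em, s_xy_em = em_bivariate_normal(xs, ys)
```

In this particular setting the EM fixed point matches the regression-style estimate (the complete-case mean of `y` adjusted by the complete-case slope times the shift in the mean of `x`), so EM is not strictly needed. It becomes useful when the missingness pattern is less regular.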