Redian新闻
Re: The inspection says the engine is dead; wondering whether to get a second opinion from another shop. (repost) # Joke - 肚皮舞运动
I*x
1
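When I was job hunting last year, I found very few summaries on this board of
interview questions for data scientist and machine learning engineer positions,
so I applied to as many companies as I could for related roles to see what the
industry asks in this area. I was lucky enough to interview at quite a few
places, and I have now compiled the questions (all from phone screens and
onsites at Amazon, Microsoft, Yelp, Pinterest, Square, Google, Glassdoor, and
Groupon). I hope this helps those looking for related jobs.
The questions are written tersely; please bear with me.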
====================
1. Given a coin, you don’t know whether it is fair or unfair. Throw it 6 times
and get 1 tail and 5 heads. Determine whether it is fair or not. What is your
confidence value?
2. Given Amazon data, how to predict which users are going to be top
shoppers in this holiday season.
3. Which regression methods are you familiar with? How do you evaluate a
regression result?
4. Write down the formula for logistic regression. How to determine the
coefficients given the data?
5. How do you evaluate regression?
For example, in this particular case:
item  click-through rate  predicted rate
1     0.04                0.06
2     0.68                0.78
3     0.27                0.19
4     0.52                0.57

6. What’s the formula for SVM? What is decision boundary?
7. A field has an unknown number of rabbits. Catch 100 rabbits and put a label
on each of them. A few days later, catch 300 rabbits and find 60 with
labels. Estimate how many rabbits there are.
8. Given 10 coins with 1 unfair coin and 9 fair coins. The unfair coin has a
⅔ probability of landing heads. Now randomly select 1 coin and throw it 3
times. You observe head, head, tail. What is the probability that the selected
coin is the unfair one?
9. What’s the formula for Naive Bayesian classifier? What’s the assumption
in the formula? What kind of data is Naive Bayesian good at? What is not?
10. What is the real distribution of click-through rate of items? If you
want to build a predictor/classifier for this data, how do you do it? How do
you divide the data?
11. You have a stream of data coming in, in the following format:
item_id  views  clicks  time
1        100    10      2013-11-28
1        1000   350     2013-11-29
1        200    14      2013-11-30
2        127    13      2013-12-1

Records with the same id are consecutive.
Click-through rate = clicks / views.
Each day, I want to output the item id if its cumulative click-through rate is
larger than a given threshold.
For example, on day 1, item 1’s rate is 10/100 = 0.10; on day 2, it is
(10+350)/(100+1000) ≈ 0.33; on day 3, it is (10+350+14)/(100+1000+200) ≈ 0.29.
If my threshold is 0.3, then on day 1 I don’t output, on day 2 I output, and
on day 3 I don’t output.
11. Given a dictionary and a string, write a function that returns true if
every word is in the dictionary, and false otherwise.
12. Generate all the permutations of a string.
For example, abc, acb, cba, …
13. We want to add a new feature to our product. How to determine if people
like it?
A/B testing. How to do A/B testing? How many ways? pros and cons?
14. 44.3% vs. 47.2%: is the difference significant?
15. Design a function to calculate people’s interest to a place against the
distance to the place.
16. How to encourage people to write more reviews on Yelp? How to determine
who is likely to write reviews? How to increase the registration rate of
Yelp? What features to add for a better Yelp app? We are expanding to other
countries. Which country should we enter first?
17. What’s the difference between classification and regression?
18. Can you explain how decision tree works? How to build a decision tree
from data?
19. What is regularization in regression? Why do regularization? How to do
regularization?
20. What is gradient descent? stochastic gradient descent?
21. We have a database of . When user
inputs a product name, how to return results fast?
22. If user gives a budget value, how to find the most expensive product
under budget? Assume the data fits in memory. What data structure, or
algorithm you use to find the product quickly? Write the program for it.
23. Given yelp data, how to find top 10 restaurants in America?
24. Given a large file with an unknown number of lines that doesn’t fit into
memory, we want to sample K lines from the file uniformly.
Write a program for it.
25. How to determine if one advertisement is performing better than the
other?
26. How to evaluate classification result? What if the results are in
probability mode?
If I want to build a classifier, but the data is very unbalanced. I have a
few positive samples but a lot of negative samples. What should I do?
27. Given a lot of data, I want to randomly sample 1% of it. How to do it
efficiently?
28. When a new user signs up for Pinterest, we want to learn their interests.
We decide to show the user a few pins, 2 pins at a time, and let the user
choose which pin s/he likes. After the user clicks on one of the 2, we select
another 2 pins.
Question: how to design the system and select the pins so that we can
achieve our goal?
29. Write a function to compute sqrt(x). Write a function to compute
pow(x, n) (square root and power).
30. Given a matrix
a b c d
e f g h
i j k l
Print it in this order:
a f k
b g l
c h
d
e j
i
31. Given a matrix and an array of words, find if the words are in the
matrix. You can search the
matrix in all directions: from left to right, right to left, up to down,
down to up, or diagonally.
For example
w o r x b
h e l o v
i n d e m
then the word “world” is in the matrix.
32. Given a coordinate grid and two points A and B, how many ways are there to
go from A to B? You can only move up or right.
For example, from (1, 1) to (5, 7), one possible way is
(1, 1) -> (2, 1) -> … -> (5, 1) -> (5, 2) -> … -> (5, 7).
33. In a city where there are only vertical and horizontal streets, there
are people at the intersections. These people want to meet. Please find an
intersection that minimizes the total cost for all the people to move.
34. Design a job search ranking algorithm on glassdoor
35. How to identify review spam?
36. Glassdoor has this kind of data about a job: (position, company,
location, salary). For example, (Software Engineer, Microsoft, Seattle, $125K).
For some records, all four entries are available, but for others the
salary is missing. Design a way to estimate the salary for those records.
37. At what time of day should emails be sent to users to get the maximum
click-through rate?
38. Youtube has video play log like this:
Video ID, time
vid1 t1
vid2 t2
... ...
The log is super large.
Find out the top 10 played videos on youtube in a given week.
39. Write a program to copy a graph
40. A bank has this access log:
IP address, time
ip1 t1
ip2 t2
... ...
If one IP accesses the site K times within m seconds, it may be an attack.
Given the log, identify all IPs that may be attacking.
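
Question 24 above asks for a program, so here is a minimal sketch of one common answer, reservoir sampling. It is only one possible approach, not necessarily what the interviewers expected. The function name `reservoir_sample` and the optional `rng` parameter are hypothetical names chosen for this example.

```python
import random
from typing import Iterable, List, Optional


def reservoir_sample(lines: Iterable[str], k: int,
                     rng: Optional[random.Random] = None) -> List[str]:
    """Pick k items uniformly at random from a stream of unknown length.

    Only k items are held in memory at any time, so the input can be a
    file object that is far larger than RAM.
    """
    rng = rng or random.Random()
    reservoir: List[str] = []
    for i, line in enumerate(lines):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(line)
        else:
            # Item i (0-based) replaces a random slot with probability k/(i+1).
            j = rng.randint(0, i)  # randint is inclusive on both ends
            if j < k:
                reservoir[j] = line
    return reservoir
```

For a real file, the call would look like `reservoir_sample(open(path), k)`, since a file object yields its lines one at a time.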
avatar
t*n
2
Is there any way to get it back?
avatar
m*s
3
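【 The following text is reposted from the Auto_Repair_DIY club 】
From: geniushanb (公孙轩辕), Board: Auto_Repair_DIY
Subject: Re: The inspection says the engine is dead; wondering whether to get a second opinion from another shop.
Posted at: BBS 未名空间站 (Sun Jan 12 21:39:42 2014, US Eastern)
Let me tell you, anyone who can taste-test engine oil has solid skills. I could never learn it.
PS: There is a lot to tasting engine oil. What a senior technician can tell from
one lick of oil, the rest of us can only find out by putting it under a microscope.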
avatar
t*t
4
Thanks a lot!!
avatar
s*k
5
Just move on… registering a new one only takes a few seconds
avatar
C*n
6
OP, what's your background? CS PhD?
avatar
I*x
7
Forgot to mention it. I'm a CS PhD specialized in data mining. Some intern
experiences and some publications in top conferences.
avatar
j*t
8
mark~
avatar
C*r
9
So many "what is the formula of blah blah blah"...
Marking this; I won't be so easily surprised from now on. OP, were you allowed to derive those formulas?
avatar
I*x
10
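Questions like "what is the formula" all allow derivation, and I personally think
being able to derive a formula shows more skill than reciting it from memory.
None of these questions are hard; just go through your course slides before the interview :)

【Quoting C**********r】
: So many "what is the formula of blah blah blah"...
: Marking this; I won't be so easily surprised from now on. OP, were you allowed to derive those formulas?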
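
As an illustration of what deriving rather than reciting might look like, here is a sketch for question 4 (logistic regression). The symbols $w$, $b$, $x_i$, $y_i$, $p_i$ and the learning rate $\eta$ are notation chosen for this example and do not come from the original post; maximum likelihood with gradient ascent is one standard way to fit the coefficients, not the only one.

```latex
\begin{align*}
% Model: probability of the positive class
p_i &= P(y_i = 1 \mid x_i) = \sigma(w^\top x_i + b)
     = \frac{1}{1 + e^{-(w^\top x_i + b)}} \\
% Log-likelihood of n labelled examples, y_i \in \{0, 1\}
\ell(w, b) &= \sum_{i=1}^{n} \bigl[\, y_i \log p_i + (1 - y_i) \log (1 - p_i) \,\bigr] \\
% Gradients, using \sigma'(z) = \sigma(z)\,(1 - \sigma(z))
\frac{\partial \ell}{\partial w} &= \sum_{i=1}^{n} (y_i - p_i)\, x_i,
\qquad
\frac{\partial \ell}{\partial b} = \sum_{i=1}^{n} (y_i - p_i) \\
% Gradient ascent update with learning rate \eta
w &\leftarrow w + \eta \sum_{i=1}^{n} (y_i - p_i)\, x_i,
\qquad
b \leftarrow b + \eta \sum_{i=1}^{n} (y_i - p_i)
\end{align*}
```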

avatar
p*0
11
Great post, mark!
OP, have you interviewed for data scientist positions at Uber or Airbnb?
avatar
C*r
12

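OK. Thank you very much for sharing, OP!

【Quoting I*******x】
: Questions like "what is the formula" all allow derivation, and I personally think
: being able to derive a formula shows more skill than reciting it from memory.
: None of these questions are hard; just go through your course slides before the interview :)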

avatar
e*5
13
mark! Thanks, expert!
avatar
d*0
14
mark, truly a great expert
avatar
t*e
15
mark
avatar
f*y
16
Mark!!
All hail the expert!!
avatar
f*a
17
Awesome


avatar
R*E
19
Thanks for sharing; bookmarked


avatar
m*1
20
Could you give some hints for the answers?
I have no idea how to approach many of the questions
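
In the spirit of the hints requested here, the sketch below works through two of the probability questions: question 7 (rabbits) with the Lincoln-Petersen capture-recapture estimate, and question 8 (10 coins) with Bayes' rule. Both are standard textbook treatments and may not be the exact answers the interviewers wanted. The function names are hypothetical, and the ⅔ heads probability is read from the HTML entity `&#8532;` in the original post.

```python
from fractions import Fraction


def lincoln_petersen(marked_first: int, caught_second: int,
                     marked_in_second: int) -> Fraction:
    """Estimate population size N from marked / N = marked_in_second / caught_second."""
    return Fraction(marked_first * caught_second, marked_in_second)


def posterior_unfair(heads: int, tails: int, n_coins: int = 10,
                     p_unfair_head: Fraction = Fraction(2, 3)) -> Fraction:
    """P(coin is the unfair one | observed heads and tails), by Bayes' rule.

    Assumes exactly one unfair coin among n_coins and a uniform random pick.
    """
    prior_unfair = Fraction(1, n_coins)
    prior_fair = 1 - prior_unfair
    # Likelihood of the observed sequence under each hypothesis.
    like_unfair = p_unfair_head ** heads * (1 - p_unfair_head) ** tails
    like_fair = Fraction(1, 2) ** (heads + tails)
    evidence = prior_unfair * like_unfair + prior_fair * like_fair
    return prior_unfair * like_unfair / evidence


print(lincoln_petersen(100, 300, 60))   # question 7
print(posterior_unfair(heads=2, tails=1))  # question 8
```

With the numbers from the post, the rabbit estimate is 100 × 300 / 60 = 500, and the posterior for the unfair coin is 32/275, roughly 0.116.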
avatar
r*o
21
mark thanks
avatar
w*k
22
Mark
avatar
w*u
23
Thanks for sharing!


avatar
m*a
24
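Huge props to OP! mark
It feels like the questions lean more toward statistics. Could OP briefly say which tools/languages one needs to know on the programming side?
Thanks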


avatar
u*g
25
Mark


avatar
f*y
26
mark!
avatar
S*e
27
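Marking this; thanks to OP for sharing
avatar
t*5
28
OP is a good person. A long time ago I asked similar questions on another forum
and never got an answer... I would also like to ask how programming is tested:
do you need to grind LeetCode algorithm questions like regular software
engineers? And which language is used more for machine learning?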
avatar
f*2
29
mark
avatar
s*y
30
mark
avatar
w*e
31
Mark, many thanks to OP


avatar
a*o
32
Thanks a lot
avatar
N*n
33
marked
avatar
I*x
34
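Thanks, everyone. I'm no expert; doing a few more interviews is nothing special.
I'm just glad if this helps everyone improve. Let me reply to the questions raised above all at once.
1. If any of the questions have problems, feel free to discuss them in follow-up
posts. There are too many questions to analyze and give hints for each one.
2. Machine learning is also programming and writing algorithms, so the languages
used should be similar to those for other positions. That said, Python and Java
do have quite a few ready-made ML packages, though some experts stick with C++.
There is no fixed rule; it comes down to personal preference.
3. On how to prepare the fundamentals: for those not in this field, take courses
if you are still in school, or get involved in related projects if you are at a
company. For those who are in this field, these interview questions really aren't hard.
4. Do you need to practice coding problems? Answer: yes. Do LeetCode and the like
as you normally would. Real machine learning positions demand no less programming
ability than software engineer positions, and they add machine learning questions
on top, so the overall bar should be higher. However, different companies and
teams define data scientist differently; some don't test programming and only ask
about SQL, but I never applied for those positions, so I can't offer advice.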
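
As an example of the coding-style questions mentioned in point 4, here is a minimal sketch for question 30 (printing a matrix along its diagonals). It is one straightforward reading of the expected output in the original post; the function name `diagonals` is hypothetical.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def diagonals(matrix: Sequence[Sequence[T]]) -> List[List[T]]:
    """Return the top-left to bottom-right diagonals of a rectangular matrix.

    Diagonals starting on the top row come first (left to right), followed
    by those starting in the first column (top to bottom, skipping row 0).
    """
    if not matrix or not matrix[0]:
        return []
    rows, cols = len(matrix), len(matrix[0])
    starts = [(0, c) for c in range(cols)] + [(r, 0) for r in range(1, rows)]
    result: List[List[T]] = []
    for r, c in starts:
        diagonal: List[T] = []
        # Walk down and to the right until we fall off the matrix.
        while r < rows and c < cols:
            diagonal.append(matrix[r][c])
            r += 1
            c += 1
        result.append(diagonal)
    return result


grid = [list("abcd"), list("efgh"), list("ijkl")]
for diagonal in diagonals(grid):
    print(" ".join(diagonal))
```

Run on the 3×4 example from the question, this prints the six lines in the order given there, starting with `a f k` and ending with `i`.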
avatar
z*m
35
Thanks to OP for sharing
avatar
o*0
36
Awesome
avatar
M*l
37
Mark, great!
avatar
j*a
38
Nice!
avatar
b*y
39
Thanks for sharing!
avatar
M*c
40
Thanks for sharing. Any chance you could share your approaches?
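
As one example of an approach, here is a minimal sketch for the first question numbered 11 (the click-through-rate stream). It assumes, as the question states, that records with the same id arrive consecutively, so only running totals for the current item need to be kept. The name `items_over_threshold` and the tuple layout are hypothetical choices for this example.

```python
from typing import Iterable, Iterator, Optional, Tuple

Record = Tuple[int, int, int, str]  # (item_id, views, clicks, date)


def items_over_threshold(records: Iterable[Record],
                         threshold: float) -> Iterator[Tuple[int, str]]:
    """Yield (item_id, date) whenever the cumulative CTR exceeds threshold."""
    current_id: Optional[int] = None
    total_views = 0
    total_clicks = 0
    for item_id, views, clicks, date in records:
        if item_id != current_id:
            # A new item starts: reset the running totals (O(1) memory).
            current_id = item_id
            total_views = 0
            total_clicks = 0
        total_views += views
        total_clicks += clicks
        if total_views > 0 and total_clicks / total_views > threshold:
            yield item_id, date


stream = [
    (1, 100, 10, "2013-11-28"),
    (1, 1000, 350, "2013-11-29"),
    (1, 200, 14, "2013-11-30"),
    (2, 127, 13, "2013-12-1"),
]
print(list(items_over_threshold(stream, 0.3)))
```

On the sample data with a threshold of 0.3, only item 1 on 2013-11-29 is reported, matching the walk-through in the question.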


avatar
d*k
41
Thanks for sharing!!


avatar
s*0
42
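OP is a good person; may you have a lifetime of peace!
avatar
c*r
43
mark
avatar
x*5
44
Mark~ OP is truly an expert~
avatar
g*s
45
Thanks a lot, Mark!
avatar
g*t
46
Thanks for sharing; bookmarked!
avatar
s*r
47
Bookmarked. This is excellent first-hand interview experience and will certainly be useful.
avatar
d*y
48
mark
avatar
W*y
49
Great, thanks for sharing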
avatar
f*e
50
mark
avatar
b*f
51
Mark
avatar
f*k
52
mark
avatar
r*g
53
Good questions; I'll work through a few each day


avatar
z*e
54
Bookmarked. Thanks, OP.
avatar
b*y
55
28. When a new user signs up Pinterest, we want to know its interests. We
decide to show the user a few pins, 2 pins at a time. Let the user choose
which pin s/he likes. After the user clicks on one of the 2, we select
another 2 pins.
Question: how to design the system and select the pins so that we can
achieve our goal?
This question is interesting; I wonder how to approach it?
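
One possible way to frame this question is as an explore/exploit problem over interest categories. The sketch below is a heuristic only, not Pinterest's method: it assumes every pin belongs to exactly one category, shows the two categories with the highest UCB1-style score, and records which one the user clicked. The class name, method names, and category list are all hypothetical, and a more rigorous treatment would use a dueling-bandit formulation.

```python
import math
from typing import Dict, List, Tuple


class PairwiseInterestProbe:
    """Pick pairs of categories to show and learn which ones a user prefers."""

    def __init__(self, categories: List[str]) -> None:
        if len(categories) < 2:
            raise ValueError("need at least two categories")
        self.shows: Dict[str, int] = {c: 0 for c in categories}
        self.wins: Dict[str, int] = {c: 0 for c in categories}
        self.rounds = 0

    def _score(self, category: str) -> float:
        shows = self.shows[category]
        if shows == 0:
            return math.inf  # never shown: explore it first
        win_rate = self.wins[category] / shows
        bonus = math.sqrt(2.0 * math.log(self.rounds + 1) / shows)
        return win_rate + bonus

    def next_pair(self) -> Tuple[str, str]:
        # Highest score first; ties are broken alphabetically for determinism.
        ranked = sorted(self.shows, key=lambda c: (-self._score(c), c))
        return ranked[0], ranked[1]

    def record_choice(self, shown: Tuple[str, str], chosen: str) -> None:
        if chosen not in shown:
            raise ValueError("chosen category must be one of the shown pair")
        self.rounds += 1
        for category in shown:
            self.shows[category] += 1
        self.wins[chosen] += 1

    def top_interests(self, n: int = 3) -> List[str]:
        def win_rate(c: str) -> float:
            return self.wins[c] / self.shows[c] if self.shows[c] else 0.0
        return sorted(self.shows, key=lambda c: (-win_rate(c), c))[:n]
```

In use, the caller would loop over `next_pair()`, show one pin from each returned category, and pass the clicked category to `record_choice()`; `top_interests()` then gives the current estimate.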
avatar
x*0
56
mark
avatar
x*0
57
OP, may I ask about
Given yelp data, how to find top 10 restaurants in America?
How would you answer this question? Any ideas on the approach?
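
One possible line of attack: a raw average rating favours restaurants with only a handful of reviews, so a common trick is to shrink each average toward a global prior (a "Bayesian average") before ranking. The sketch below assumes the Yelp data has already been reduced to (name, mean rating, review count) tuples; `prior_mean`, `prior_weight`, and the function name are hypothetical values and names picked for illustration, and the real question would also involve defining what "top" means.

```python
import heapq
from typing import Iterable, List, Tuple


def top_restaurants(stats: Iterable[Tuple[str, float, int]], n: int = 10,
                    prior_mean: float = 3.5,
                    prior_weight: int = 50) -> List[Tuple[str, float]]:
    """Rank restaurants by a Bayesian average of their ratings.

    score = (prior_weight * prior_mean + count * mean) / (prior_weight + count)
    so a restaurant needs many reviews before its own mean dominates.
    """
    scored = (
        (name, (prior_weight * prior_mean + count * mean) / (prior_weight + count))
        for name, mean, count in stats
    )
    # nlargest keeps only n items in memory while scanning the stream.
    return heapq.nlargest(n, scored, key=lambda pair: pair[1])


sample = [("A", 5.0, 2), ("B", 4.6, 900), ("C", 4.8, 10)]
print(top_restaurants(sample, n=2))
```

In the sample, restaurant A has a perfect 5.0 from only two reviews and ranks below B and C once the prior is applied.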


avatar
w*d
58
mark
avatar
k*y
59
Mark


avatar
h*a
60
Hi,
Thanks for your post. I am a PhD student at HKU, and I am looking for a data
mining related job in the US now. I think I have a lot of things to learn from
you. Can I have your WeChat or QQ?
Thanks a lot.
Best regards,
Min Yang

【Quoting I*******x】
: Forgot to mention it. I'm a CS PhD specialized in data mining. Some intern
: experiences and some publications in top conferences.

avatar
E*g
117
Great post!
Mark!


avatar
c*l
118
These are all statistics plus regression questions. CS has hijacked them and relabeled them as machine learning.


avatar
a*u
119
Great post; leaving my name here
avatar
T*n
120
In four words: awesome!
avatar
M*6
121
Couldn't be better! Thanks, OP!!
avatar
k*i
122
A true expert.
avatar
x*q
129
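Finally found something to rely on,
thanks, OP
You are a god!
avatar
d*e
130
mark, thanks for sharing
avatar
A*y
131
Mark, many thanks to OP for sharing
avatar
p*x
132
Mark
[Quoting ISphoenix (beta3):]

:When I was job hunting last year, I found very few summaries on this board of interview questions for data scientist and machine learning engineer
:...........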
avatar
a*i
133
thanks
avatar
D*e
134
markmark
avatar
p*e
135
thanks
avatar
r*9
143
I'll sit back and watch the experts spar


avatar
i*t
144
Nobody is discussing the answers; questions alone aren't much use
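
To get some answer discussion going, here is a minimal sketch for question 40 (the bank access log) using a per-IP sliding window. It assumes the log is sorted by time, that timestamps are numbers in seconds, and that "within m seconds" means the first and last of the K accesses are at most m seconds apart; these are interpretations, and the name `suspicious_ips` is hypothetical.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Iterable, Set, Tuple


def suspicious_ips(log: Iterable[Tuple[str, float]], k: int, m: float) -> Set[str]:
    """Return every IP with at least k accesses inside some m-second window."""
    recent: Dict[str, Deque[float]] = defaultdict(deque)
    flagged: Set[str] = set()
    for ip, timestamp in log:
        window = recent[ip]
        window.append(timestamp)
        # Drop accesses that are more than m seconds older than the newest one.
        while timestamp - window[0] > m:
            window.popleft()
        if len(window) >= k:
            flagged.add(ip)
    return flagged


access_log = [("a", 0), ("b", 1), ("a", 2), ("a", 4), ("b", 50), ("a", 100)]
print(suspicious_ips(access_log, k=3, m=5))
```

Each window holds at most the accesses from the last m seconds for that IP, so memory stays bounded by the traffic in one window rather than by the size of the whole log.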
avatar
w*t
145
Mark, big thumbs up to OP
avatar
b*i
149
mark! Thanks