#1 QPS:
Probably 10k~100k concurrent accesses (my guess).
That means we need to cache the data heavily; we may want a write-through
cache in place, well sharded.
Resource-wise, I'm not worried at all. For the cache, Redis can be super fast
(see https://redis.io/topics/benchmarks). For persistence, NoSQL gets us
almost unlimited scalability.
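The write-through idea can be sketched in a few lines of Python. A plain dict stands in for Redis and another for the NoSQL store; all names here are illustrative, not a real Redis client:

```python
class LikeCountCache:
    """Write-through cache sketch for like counts. `cache` stands in for
    Redis (e.g. a counter key) and `store` for the persistent NoSQL table."""

    def __init__(self):
        self.cache = {}   # fast path, may be evicted at any time
        self.store = {}   # durable source of truth

    def like(self, post_id):
        # Write-through: update the durable store first, then the cache,
        # so a cache eviction never loses a confirmed like.
        self.store[post_id] = self.store.get(post_id, 0) + 1
        self.cache[post_id] = self.store[post_id]

    def count_likes(self, post_id):
        # Serve reads from cache; fall back to the store on a miss.
        if post_id not in self.cache:
            self.cache[post_id] = self.store.get(post_id, 0)
        return self.cache[post_id]
```

Sharding would then just be running many of these keyed by a hash of post id.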
Our system should provide the following APIs:
- Like(user, post)
- Unlike(user, post)
- Liked(user, post)
- CountLikes(post)
Extra:
- RecentLikes(user)
- FriendsLiked(user, post)
If we only consider the first 4 APIs, it is obvious: post id should be the
hash/partition key and user id the range/sort key.
The extras can tolerate longer latencies, served by an offline process or a
global secondary index.
Alternatively, we can store recent likes in the user's metadata, or publish
them as implicit timeline entries.
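To see why that key choice works, here is an in-memory Python model of a wide-column table with post id as partition key and user id as sort key. With this layout, all four core APIs touch a single partition; the class and method names are my own sketch, not any real driver API:

```python
from collections import defaultdict
import bisect


class LikesTable:
    """Model of a wide-column table: partition key = post_id,
    sort key = user_id (kept as a sorted list per partition)."""

    def __init__(self):
        self.partitions = defaultdict(list)  # post_id -> sorted user_ids

    def like(self, user_id, post_id):
        users = self.partitions[post_id]
        i = bisect.bisect_left(users, user_id)
        if i == len(users) or users[i] != user_id:
            users.insert(i, user_id)  # idempotent: double-like is a no-op

    def unlike(self, user_id, post_id):
        users = self.partitions[post_id]
        i = bisect.bisect_left(users, user_id)
        if i < len(users) and users[i] == user_id:
            users.pop(i)

    def liked(self, user_id, post_id):
        users = self.partitions[post_id]
        i = bisect.bisect_left(users, user_id)
        return i < len(users) and users[i] == user_id

    def count_likes(self, post_id):
        return len(self.partitions[post_id])
```

RecentLikes(user) is the one query this layout cannot answer cheaply, which is exactly why it needs a global secondary index or an offline job.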
#2 access pattern:
Read-dominant; reads can be 10x-50x the writes.
#3 consistency:
Eventual consistency is OK, but writer clients should read back the value
they wrote. We may need sticky sessions (which can be bad for load
balancing, but luckily we don't need to worry too much about a server going
down).
Another approach we can consider/combine is a local (client-side) cache.
This can be hard in the browser (a user could use several browsers on the
same computer!), but in a mobile app it is easy. As far as I know, FB uses
this local-cache trick on several products to offload their services.
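The client-side cache trick for read-your-writes can be sketched as an overlay: the client remembers its own pending likes/unlikes and lets them win over possibly stale server reads. `StaleServer` below is a hypothetical stand-in for a remote service with replication lag:

```python
class StaleServer:
    """Hypothetical remote Like service with replication lag:
    writes are accepted but not yet visible to reads."""

    def like(self, user_id, post_id): pass
    def unlike(self, user_id, post_id): pass
    def liked(self, user_id, post_id): return False


class LikeClient:
    """Client-side overlay cache giving read-your-writes on top of an
    eventually consistent backend."""

    def __init__(self, server, user_id):
        self.server = server
        self.user_id = user_id
        self.local = {}  # post_id -> True (liked) / False (unliked)

    def like(self, post_id):
        self.local[post_id] = True
        self.server.like(self.user_id, post_id)

    def unlike(self, post_id):
        self.local[post_id] = False
        self.server.unlike(self.user_id, post_id)

    def liked(self, post_id):
        # The user's own writes win over a possibly stale server read.
        if post_id in self.local:
            return self.local[post_id]
        return self.server.liked(self.user_id, post_id)
```

The overlay only covers this user's own actions, which is all read-your-writes requires; everyone else's likes can stay eventually consistent.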
#4 availability:
Very important. If people can't like posts, some attention seeker may DIE
from the unavailability.
Combining #3 and #4, we decide it should be an AP system: during a network
partition, we keep accepting Like data on each side, and after recovery we
merge as best we can.
#5 how to shard?
I always prefer quorum in loosely consistent systems. it has better fault
tolerance than master-slave or leader/leaders systems. Availability is
important in our use case. It doesn't mean we need to implement it, but when
choosing our tools, we should use cassandra / dynamodb other than mongodb /
mysql
#6 how to balance load in the app layer?
A load balancer...
#7 improve latency?
Latency is important too, but we still need to confirm the write
(immediately noticing that a like didn't take effect beats finding out an
hour later, or never).
As mentioned above, use a write-through cache. Also tune the W/R quorum
values so we get better write performance with acceptable fault tolerance.
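The quorum tuning mentioned above boils down to one inequality. With N replicas, a write acknowledged by W nodes and a read from R nodes always overlap when W + R > N, so a read is guaranteed to see the latest confirmed write:

```python
def quorum_overlaps(n, w, r):
    """Standard quorum overlap condition: any write set of size w and read
    set of size r out of n replicas must share at least one node."""
    return w + r > n

# Write-heavy tuning for a Like system: cheap writes (W=1) paid for by
# expensive reads (R=N), or the balanced W=2, R=2 with N=3.
```

For likes, where occasionally showing a slightly stale count is fine, one could even drop the overlap guarantee on the count path and keep it only for Liked(user, post).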
I'm just a dishwasher at a restaurant, so my design skills are limited...
That's all I can think of for now; gurus, please go easy on me...