DS in my understanding - MITBBS (未名空间) historical archive


DS in my understanding # DataSciences - Data Science

w*2 2014-08-16 07:08

Post #1

Not sure if this is right (I did something similar for some biomedical projects):
You are facing a chunk of data with no prior knowledge from the
literature or any other source. Then ask yourself:
1. Is there any interesting question in this data set?
2. If so, how many groups/types can be formed?
3. If so, is there any difference among these groups?
4. If so, what is the difference?
5. If so, what are the differentiators?
6. If so, identify all of the major differentiators.
7. If so, build a model or index that combines all of the differentiators and
fit it to the training data set. Do not overfit. Validate with another
independent data set. Apply the model to new data sets for prediction.
8. Iterate to optimize the model.
9. Interpret the results for audiences with different backgrounds/professions.
Is this logic/workflow right for DS? Thanks.
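
To make steps 2 to 7 concrete, here is a rough sketch in plain Python. Everything in it is an assumption for illustration, not something from the post: the toy data, the function names (make_data, two_means, effect_sizes, run_workflow), the choice of a tiny 2-means clustering for grouping, the effect-size threshold of 1.0, and the nearest-centroid "model". It only shows the shape of the workflow: group, find differentiators, build a model on them, then check it on independent data.

```python
import random
import statistics


def make_data(n, rng):
    """Toy data (an assumption): 3 features; features 0 and 2 differ between two hidden groups."""
    rows, hidden = [], []
    for i in range(n):
        g = i % 2
        rows.append([rng.gauss(4.0 * g, 1.0), rng.gauss(0.0, 1.0), rng.gauss(-3.0 * g, 1.0)])
        hidden.append(g)
    return rows, hidden


def dist2(a, b, feats):
    """Squared Euclidean distance restricted to the given feature indices."""
    return sum((a[j] - b[j]) ** 2 for j in feats)


def nearest(row, centers, feats):
    """Index of the closest center."""
    return min(range(len(centers)), key=lambda k: dist2(row, centers[k], feats))


def two_means(rows, iters=50):
    """Step 2: form two groups with a tiny k-means (k = 2)."""
    feats = range(len(rows[0]))
    first = rows[0]
    # Start from the first row and the row farthest from it.
    centers = [first, max(rows, key=lambda r: dist2(r, first, feats))]
    for _ in range(iters):
        assign = [nearest(r, centers, feats) for r in rows]
        updated = []
        for k in (0, 1):
            members = [r for r, a in zip(rows, assign) if a == k]
            updated.append([statistics.fmean(c) for c in zip(*members)] if members else centers[k])
        if updated == centers:
            break
        centers = updated
    return [nearest(r, centers, feats) for r in rows], centers


def effect_sizes(rows, assign):
    """Steps 3-5: per-feature gap between group means, in pooled standard deviations."""
    sizes = []
    for j in range(len(rows[0])):
        a = [r[j] for r, g in zip(rows, assign) if g == 0]
        b = [r[j] for r, g in zip(rows, assign) if g == 1]
        pooled = ((statistics.variance(a) + statistics.variance(b)) / 2) ** 0.5
        sizes.append(abs(statistics.fmean(a) - statistics.fmean(b)) / pooled)
    return sizes


def run_workflow(seed=0, threshold=1.0):
    rng = random.Random(seed)
    train, _ = make_data(200, rng)         # hidden labels are not used for training
    holdout, hidden = make_data(200, rng)  # independent data set for validation
    assign, centers = two_means(train)
    sizes = effect_sizes(train, assign)
    # Step 6: keep features whose group gap exceeds the (arbitrary) threshold.
    keep = [j for j, s in enumerate(sizes) if s > threshold]
    # Step 7: nearest-centroid "model" that uses only the differentiators.
    predicted = [nearest(r, centers, keep) for r in holdout]
    match = sum(p == h for p, h in zip(predicted, hidden)) / len(hidden)
    # Cluster numbering is arbitrary, so score agreement up to a label swap.
    return {
        "effect_sizes": sizes,
        "differentiators": keep,
        "holdout_agreement": max(match, 1 - match),
    }
```

Calling print(run_workflow()) shows the per-feature effect sizes, which features were kept as differentiators, and how well the model agrees with the hidden groups on the held-out data. Steps 8 and 9 (iterating and interpreting) are left out because they depend on the actual project.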

w*2 2014-08-16 07:08

Post #2

No response?
So is DS mostly about data collection and cleaning,
and not yet about the later steps?