Data Cleaning and Data Quality Control: The Third Challenge of the Big Data Era
Board: DataSciences (Data Science)
g*n
1
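[The following text is reposted from the PhotoGear board]
Posted by: gongren (gongren), Board: PhotoGear
Subject: Anyone with spare time is welcome to practice PS on a dog photo I took
Posted from: BBS 未名空间站 (Wed Jul 27 09:14:34 2011, US Eastern)
The original image and the version after PS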
l*h
2
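Profile: Male, born in Shanghai in 1960. He was admitted to Northeast University of Technology (东北工业大学) in 1979 and, after graduating in 1985, was assigned to Shanghai Baosteel Engineering Company, where he worked as an engineer and technical translator. In 1987 he went to West Germany for training and then returned to China to work. In 1989 he went to Germany for graduate study in the Department of Materials Engineering at the Technical University of Berlin. In January 1993 he entered Fuller Theological Seminary in the United States to study theology. He graduated in 1995 with a master's degree in missiology and, because of his outstanding grades, was listed that year in a who's who of American university graduate schools. In September 1995 he entered Taifu Seminary (台福神学院) for the Master of Divinity program. He graduated in June 1996 with a Master of Divinity and serves as a full-time minister at the Shenzhou Fellowship (神州团契) in West Los Angeles.
I. Leaving Home for the First Time
(1)
There is a big house on Avenue Joffre (霞飞路) in Shanghai, and a beautiful story is hidden inside it. Zhang Lujia's (张路加) childhood memories begin here. The house was part of his maternal grandfather's estate. In the 1930s and 1940s his grandfather was a prominent figure in Shanghai: he started as a clerk at a foreign trading firm, became a manager within a few years, and bought a great deal of real estate in the French Concession. Lujia's mother was a student at the University of Shanghai (沪江大学) and was driven to and from school in a private car. This daughter of a wealthy family later married a preacher who had once been a vagrant, and that preacher was Lujia's father.
Lujia's father was an orphan. He was born in Hangzhou, lost his mother at the age of three, and was raised by his older sister. When his sister married and moved to Wenzhou, he followed her there and became an apprentice in a cobbler's shop. In 1943, when he was fourteen, his sister also died, leaving him with no one to rely on, ...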
l*o
3
Or data cleansing, data quality control etc.
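At the end of last year Gartner published a report, the Magic Quadrant for Data Quality Tools, which summarizes the relevant vendors. I don't know whether their choice of vendors is reliable, but their summary of data quality control is quite on point. Now that data is being collected in such large volumes, an emphasis on data cleaning and data quality control is especially necessary. Remember: "Garbage in, garbage out".
The report was originally available from http://www.gartner.com/technology/reprints.do?id=1-1LCD5XL&ct=131007&st=sb,
but not any more. I'll excerpt a bit of it here and also attach their current paid link for everyone's reference, which gives them a little advertising as well.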
l*s
4
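Nice! Looking closely, a lot of time and effort clearly went into this.

[Quoting g*****n's post:]
: [The following text is reposted from the PhotoGear board]
: Posted by: gongren (gongren), Board: PhotoGear
: Subject: Anyone with spare time is welcome to practice PS on a dog photo I took
: Posted from: BBS 未名空间站 (Wed Jul 27 09:14:34 2011, US Eastern)
: The original image and the version after PS
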
P*M
5
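A very moving testimony.
Clearly, people need God.

[Quoting l**h's post:]
: Profile: Male, born in Shanghai in 1960. He was admitted to Northeast University of Technology (东北工业大学) in 1979 and, after graduating in 1985,
: was assigned to Shanghai Baosteel Engineering Company, where he worked as an engineer and technical translator. In 1987 he went to West Germany for training and then returned to China
: to work. In 1989 he went to Germany for graduate study in the Department of Materials Engineering at the Technical University of Berlin. In January 1993 he entered
: Fuller Theological Seminary in the United States to study theology. He graduated in 1995 with a master's degree in missiology and, because of his outstanding grades, was listed that
: year in a who's who of American university graduate schools. In September 1995 he entered Taifu Seminary (台福神学院) for the Master of Divinity program.
: He graduated in June 1996 with a Master of Divinity and serves as a full-time minister at the Shenzhou Fellowship (神州团契) in West Los Angeles.
: I. Leaving Home for the First Time
: (1)
: There is a big house on Avenue Joffre (霞飞路) in Shanghai, and a beautiful story is hidden inside it. Zhang Lujia's (张路加) childhood memories begin
: here. The house was part of his maternal grandfather's estate. In the 1930s and 1940s his grandfather was a prominent
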
l*o
6
Magic Quadrant for Data Quality Tools
gartner.com, October 7
Data quality assurance is a discipline focused on ensuring that data is fit
for use in business processes ranging from core operations to analytics and
decision-making, regulatory compliance, and engagement and interaction with
external entities.
As a discipline, it comprises much more than technology — it also includes
roles and organizational structures, processes for monitoring, measuring,
reporting and remediating data quality issues, and links to broader
information governance activities via data-quality-specific policies.
Given the scale and complexity of the data landscape across organizations of
all sizes and in all industries, tools to help automate key elements of the
discipline continue to attract more interest and to grow in value. As such,
the data quality tools market continues to show substantial growth, while
exhibiting innovation and change.
The data quality tools market includes vendors that offer stand-alone
software products to address the core functional requirements of the
discipline, which are:
Data profiling and data quality measurement: The analysis of data to capture
statistics (metadata) that provide insight into the quality of data and
help to identify data quality issues.
Parsing and standardization: The decomposition of text fields into component
parts and the formatting of values into consistent layouts based on
industry standards, local standards (for example, postal authority standards
for address data), user-defined business rules, and knowledge bases of
values and patterns.
Generalized "cleansing": The modification of data values to meet domain
restrictions, integrity constraints or other business rules that define when
the quality of data is sufficient for an organization.
Matching: Identifying, linking or merging related entries within or across
sets of data.
Monitoring: Deploying controls to ensure that data continues to conform to
business rules that define data quality for the organization.
Enrichment: Enhancing the value of internally-held data by appending related
attributes from external sources (for example, consumer demographic
attributes and geographic descriptors).
In addition, data quality tools provide a range of related functional
abilities that are not unique to this market but that are required to
execute many of the core functions of data quality, or for specific data
quality applications:
Connectivity/adapters: The ability to interact with a range of different
data structure types.
Subject-area-specific support: Standardization capabilities for specific
data subject areas.
International support: The ability to offer relevant data quality operations
on a global basis (such as handling data in multiple languages and writing
systems).
Metadata management: The ability to capture, reconcile and interoperate
metadata related to the data quality process.
Configuration environment: Capabilities for creating, managing and deploying
data quality rules.
Operations and administration: Facilities for supporting, managing and
controlling data quality processes.
Workflow/data quality process support: Processes and user interfaces for
various data quality roles, such as data stewards.
Service enablement: Service-oriented characteristics and support for
service-oriented architecture (SOA) deployments.
The tools provided by vendors in this market are generally consumed by
end-user organizations for internal deployment in their IT infrastructure, to
directly support transactional processes that require data quality
operations and to enable staff in data-quality-oriented roles (such as data
stewards) to engage in data quality improvement work. Off-premises solutions
in the form of hosted data quality offerings, SaaS delivery models and
cloud services continue to evolve and grow in popularity.
For vendors to be included in the Magic Quadrant, they must meet the
following criteria:
They must offer stand-alone packaged software tools or cloud-based services
(not only embedded in, or dependent on, other products

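[Quoting l******o's post:]
: Or data cleansing, data quality control etc.
: At the end of last year Gartner published a report, the Magic Quadrant for Data Quality Tools, which summarizes the relevant
: vendors. I don't know whether their choice of vendors is reliable, but their summary of data quality
: control is quite on point. Now that data is being collected in such large volumes, an emphasis on data cleaning and data quality control is especially
: necessary. Remember: "Garbage in, garbage out".
: The report was originally available from http://www.gartner.com/technology/reprints.do?id=1-1LCD5XL&ct=131007&st=sb,
: but not any more. I'll excerpt a bit of it here and also attach their current paid link for everyone's reference, which gives
: them a little advertising as well.
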
g*n
7
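When something goes wrong at the time of shooting, rescuing the photo afterwards takes some thought.

[Quoting l*******s's post:]
: Nice! Looking closely, a lot of time and effort clearly went into this.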
a*e
8


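[Quoting l**h's post:]
: Profile: Male, born in Shanghai in 1960. He was admitted to Northeast University of Technology (东北工业大学) in 1979 and, after graduating in 1985,
: was assigned to Shanghai Baosteel Engineering Company, where he worked as an engineer and technical translator. In 1987 he went to West Germany for training and then returned to China
: to work. In 1989 he went to Germany for graduate study in the Department of Materials Engineering at the Technical University of Berlin. In January 1993 he entered
: Fuller Theological Seminary in the United States to study theology. He graduated in 1995 with a master's degree in missiology and, because of his outstanding grades, was listed that
: year in a who's who of American university graduate schools. In September 1995 he entered Taifu Seminary (台福神学院) for the Master of Divinity program.
: He graduated in June 1996 with a Master of Divinity and serves as a full-time minister at the Shenzhou Fellowship (神州团契) in West Los Angeles.
: I. Leaving Home for the First Time
: (1)
: There is a big house on Avenue Joffre (霞飞路) in Shanghai, and a beautiful story is hidden inside it. Zhang Lujia's (张路加) childhood memories begin
: here. The house was part of his maternal grandfather's estate. In the 1930s and 1940s his grandfather was a prominent
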
l*o
9
Paid link: http://gtnr.it/1tdIeVw

[Quoting l******o's post:]
: Magic Quadrant for Data Quality Tools
: gartner.com, October 7
: Data quality assurance is a discipline focused on ensuring that data is fit
: for use in business processes ranging from core operations to analytics and
: decision-making, regulatory compliance, and engagement and interaction with
: external entities.
: As a discipline, it comprises much more than technology — it also includes
: roles and organizational structures, processes for monitoring, measuring,
: reporting and remediating data quality issues, and links to broader
: information governance activities via data-quality-specific policies.

x*i
10
I hope you can briefly describe the process and the techniques.
y*l
12
A tutorial, please.
l*s
14
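The OP doesn't have time to explain, so let me fill in, for what it's worth.
It takes sharpening of local areas, purple-fringe removal in local areas (selecting by sampled color), and shadow adjustment on the tip of the nose (the exposure and highlight/shadow options in PS).

[Quoting x****i's post:]
: I hope you can briefly describe the process and the techniques.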
m*z
16

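What is there to PS here.. just tweak it in Digital Photo Professional.

[Quoting g*****n's post:]
: When something goes wrong at the time of shooting, rescuing the photo afterwards takes some thought.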
g*s
17
The first step in a data warehouse is ETL (extract, transform, load), and that step is precisely about cleaning the data.
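As a rough illustration of where cleaning sits in an ETL flow, here is a minimal sketch using only the Python standard library. The CSV content, the `sales` table, and the rejection rule (drop rows whose amount is missing or non-numeric) are assumptions made up for this example, not anything described in the thread.

```python
import csv
import io
import sqlite3

# Hypothetical raw extract; the column names and values are made up.
RAW_CSV = """id,city,amount
1, New York ,10.5
2,new york,
3,Boston,abc
4,Boston,7
"""

def extract(text):
    """Extract: read rows from the source (an in-memory CSV here)."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: trim and normalize city names, drop rows with a bad amount."""
    clean = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # missing or non-numeric amount: reject the row
        clean.append((int(row["id"]), row["city"].strip().title(), amount))
    return clean

def load(rows, conn):
    """Load: write the cleaned rows into the target table."""
    conn.execute("CREATE TABLE sales (id INTEGER, city TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)
totals = dict(conn.execute("SELECT city, SUM(amount) FROM sales GROUP BY city"))
```

Only rows 1 and 4 survive the transform step, so the totals are computed over clean data. A real pipeline would normally log or quarantine rejected rows instead of dropping them silently.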
g*n
18
Look again, more closely.

[Quoting m*z's post:]
:
: What is there to PS here.. just tweak it in Digital Photo Professional.

k*e
19
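What the OP edited are all very subtle spots: sharpening the fur and the eyes, local color adjustments...

[Quoting m*z's post:]
:
: What is there to PS here.. just tweak it in Digital Photo Professional.
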
g*n
20
screen capture

[Quoting g*****n's post:]
: Look again, more closely.
g*n
21
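A brief explanation, from the bottom layer up:
background: the original image
copy of background + sharpen: (at this point I wasn't sure whether the image could be rescued, so I sharpened it first to see)
purple-fringe removal: mainly around the dog's whiskers; because that area is black and white, I removed the fringe with HSL
color mode layer: a layer in color mode, painted loosely with the same color as the lake water
selective color1 (adjusted the colors a little)
levels (adjusted the overall brightness)
curves2 (the dog's nose, brightened)
selective color3 (local adjustment of the dog's tongue)
selective color 4 (the dog's eyes)
hue/saturation2: the lake water
curves3: the dog's eyes and the area around them
burn eyes: darkened the black parts of the dog's eyes
The layer I deleted at the end was a high pass layer (soft light mode) made after merging the layers; later I felt it was over-sharpened, so I deleted it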
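For readers curious what two of the steps above do numerically, here is a small sketch of a levels adjustment and of high-pass sharpening blended in soft light mode, on a single row of grayscale pixels in the range 0 to 1. It is an approximation under stated assumptions: the thread gives no formulas, so the levels mapping, the box blur standing in for the high pass filter, and the soft-light formula (taken from the W3C compositing specification, which may differ from what PS computes internally) are all choices made for this illustration.

```python
import math

def levels(v, black=0.0, white=1.0, gamma=1.0):
    """Levels: remap [black, white] to [0, 1], then apply a gamma curve."""
    t = min(max((v - black) / (white - black), 0.0), 1.0)
    return t ** (1.0 / gamma)

def box_blur(row, radius=1):
    """Blur a 1-D row of pixels by averaging each pixel with its neighbours."""
    out = []
    for i in range(len(row)):
        window = row[max(0, i - radius): i + radius + 1]
        out.append(sum(window) / len(window))
    return out

def high_pass(row, radius=1):
    """High pass: keep only fine detail, centred on mid-grey (0.5)."""
    blurred = box_blur(row, radius)
    return [min(max(p - b + 0.5, 0.0), 1.0) for p, b in zip(row, blurred)]

def soft_light(base, blend):
    """Soft-light blend of one pixel (W3C compositing formula, an assumption)."""
    if blend <= 0.5:
        return base - (1 - 2 * blend) * base * (1 - base)
    d = ((16 * base - 12) * base + 4) * base if base <= 0.25 else math.sqrt(base)
    return base + (2 * blend - 1) * (d - base)

def sharpen(row, radius=1):
    """Blend the high-pass layer over the original in soft-light mode."""
    return [soft_light(p, h) for p, h in zip(row, high_pass(row, radius))]

# A soft edge between a dark and a bright region.
edge = [0.2, 0.2, 0.2, 0.8, 0.8, 0.8]
sharpened = sharpen(edge)
```

On this edge the dark pixel next to the boundary gets darker and the bright one gets brighter, while flat areas stay essentially unchanged. That local contrast boost is also why the effect can look over-sharpened when applied to a whole merged image.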

[Quoting g*****n's post:]
: screen capture
m*z
22
Haha, I just spotted it~ The dog's nose in the OP's version is brighter and can be seen clearly now~
e*t
23
Bump! Brother gongren, please post more tutorials.