Big data defined

我是彩笔
Published 2015/04/28 10:19

Big data is a popular term used to describe the exponential growth and availability of data, both structured and unstructured. And big data may be as important to business – and society – as the Internet has become. Why? More data may lead to more accurate analyses.

More accurate analyses may lead to more confident decision making. And better decisions can mean greater operational efficiencies, cost reductions and reduced risk.

Big data defined

As far back as 2001, industry analyst Doug Laney (currently with Gartner) articulated the now-mainstream definition of big data as the three Vs: volume, velocity and variety.

  • Volume. Many factors contribute to the increase in data volume: transaction-based data stored through the years, unstructured data streaming in from social media, and increasing amounts of sensor and machine-to-machine data being collected. In the past, excessive data volume was a storage issue. But with decreasing storage costs, other issues emerge, including how to determine relevance within large data volumes and how to use analytics to create value from relevant data.

  • Velocity. Data is streaming in at unprecedented speed and must be dealt with in a timely manner. RFID tags, sensors and smart metering are driving the need to deal with torrents of data in near-real time. Reacting quickly enough to deal with data velocity is a challenge for most organizations.

  • Variety. Data today comes in all types of formats: structured, numeric data in traditional databases; information created from line-of-business applications; and unstructured text documents, email, video, audio, stock ticker data and financial transactions. Managing, merging and governing different varieties of data is something many organizations still grapple with.

At SAS, we consider two additional dimensions when thinking about big data:

  • Variability. In addition to the increasing velocities and varieties of data, data flows can be highly inconsistent, with periodic peaks (for example, when something is trending in social media). Daily, seasonal and event-triggered peak data loads can be challenging to manage, even more so when unstructured data is involved.

  • Complexity. Today's data comes from multiple sources, and it is still an undertaking to link, match, cleanse and transform data across systems. However, it is necessary to connect and correlate relationships, hierarchies and multiple data linkages, or your data can quickly spiral out of control.


Reposted from: http://www.sas.com/en_us/insights/big-data/what-is-big-data.html
