Big data defined

我是彩笔
Published 2015/04/28 10:19

Big data is a popular term used to describe the exponential growth and availability of data, both structured and unstructured. And big data may be as important to business – and society – as the Internet has become. Why? More data may lead to more accurate analyses.

More accurate analyses may lead to more confident decision making. And better decisions can mean greater operational efficiencies, cost reductions and reduced risk.

Big data defined

As far back as 2001, industry analyst Doug Laney (currently with Gartner) articulated the now-mainstream definition of big data as the three Vs: volume, velocity and variety.

  • Volume. Many factors contribute to the increase in data volume. Transaction-based data stored through the years. Unstructured data streaming in from social media. Increasing amounts of sensor and machine-to-machine data being collected. In the past, excessive data volume was a storage issue. But with decreasing storage costs, other issues emerge, including how to determine relevance within large data volumes and how to use analytics to create value from relevant data.

  • Velocity. Data is streaming in at unprecedented speed and must be dealt with in a timely manner. RFID tags, sensors and smart metering are driving the need to deal with torrents of data in near-real time. Reacting quickly enough to deal with data velocity is a challenge for most organizations.

  • Variety. Data today comes in all types of formats. Structured, numeric data in traditional databases. Information created from line-of-business applications. Unstructured text documents, email, video, audio, stock ticker data and financial transactions. Managing, merging and governing different varieties of data is something many organizations still grapple with.
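
To make the velocity point a little more concrete, the sketch below shows one possible way to handle readings as they arrive, keeping only a short sliding window in memory instead of storing everything first and analyzing later. The `SlidingWindowAverage` class, the 60-second window and the sample readings are illustrative assumptions, not something taken from the original article.

```python
from collections import deque


class SlidingWindowAverage:
    """Running average over the last `window_seconds` of readings (hypothetical helper)."""

    def __init__(self, window_seconds: float) -> None:
        self.window_seconds = window_seconds
        self._readings = deque()  # (timestamp, value) pairs, oldest first
        self._total = 0.0

    def add(self, timestamp: float, value: float) -> float:
        """Record a reading and return the average over the current window."""
        self._readings.append((timestamp, value))
        self._total += value
        # Drop readings that have fallen out of the window.
        cutoff = timestamp - self.window_seconds
        while self._readings and self._readings[0][0] <= cutoff:
            _, old_value = self._readings.popleft()
            self._total -= old_value
        return self._total / len(self._readings)


# Assumed sample data: (seconds since start, smart-meter reading).
window = SlidingWindowAverage(window_seconds=60)
for ts, value in [(0, 10.0), (30, 20.0), (70, 30.0)]:
    print(ts, window.add(ts, value))
```

Each reading is processed once and old readings are discarded, so memory use depends on the window size rather than on the total amount of data that has streamed in.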

At SAS, we consider two additional dimensions when thinking about big data:

  • Variability. In addition to the increasing velocities and varieties of data, data flows can be highly inconsistent with periodic peaks. Is something trending in social media? Daily, seasonal and event-triggered peak data loads can be challenging to manage. Even more so with unstructured data involved.

  • Complexity. Today's data comes from multiple sources. And it is still an undertaking to link, match, cleanse and transform data across systems. However, it is necessary to connect and correlate relationships, hierarchies and multiple data linkages or your data can quickly spiral out of control.
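
As a rough illustration of what "link, match, cleanse and transform" can involve, the following sketch joins records from two hypothetical systems on a cleaned-up email address. The field names (`email`, `name`, `balance`), the two source systems and the helper functions are assumptions made for this example only.

```python
def normalize_email(raw: str) -> str:
    """Cleanse step: trim whitespace and lowercase so equal addresses compare equal."""
    return raw.strip().lower()


def link_records(crm_rows, billing_rows):
    """Match CRM rows to billing rows by normalized email (hypothetical schema)."""
    billing_by_email = {normalize_email(r["email"]): r for r in billing_rows}
    linked = []
    for row in crm_rows:
        key = normalize_email(row["email"])
        match = billing_by_email.get(key)
        linked.append({
            "email": key,
            "name": row["name"].strip(),
            # None marks a CRM record with no billing counterpart.
            "balance": match["balance"] if match else None,
        })
    return linked


crm = [{"email": " Ana@Example.com ", "name": "Ana "},
       {"email": "bo@example.com", "name": "Bo"}]
billing = [{"email": "ana@example.com", "balance": 42.5}]
print(link_records(crm, billing))
```

Without the normalization step the first record would fail to match, which is a small-scale version of the cross-system inconsistencies described above.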


This article is reprinted from: http://www.sas.com/en_us/insights/big-data/what-is-big-data.html
