
Presto Installation

Yulong_
Published on 2017/08/14 09:31

1 Cluster Deployment

1.1 Cluster Environment

1.1.1 System Requirements

Mac OS X or Linux (CentOS 7.2 was used for testing)

Java 8 Update 92 or higher (8u92+), 64-bit (1.8.0_121, 64-bit, was used for testing)
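A quick sanity check on each node before installing:

# Confirm a 64-bit Java 8 (8u92 or newer) is on the PATH
java -version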

1.1.2 Component Versions

Presto version: 0.172 (download link)

Hadoop version: Apache Hadoop 2.6.4

Hive version: Apache Hive 1.2.1

MongoDB version: mongodb-linux-x86_64-rhel70-3.4.2

 

1.2 Cluster Configuration

1.2.1 Software Deployment

Download the installation package to /opt/beh/core, extract it, and create a symbolic link:

cd /opt/beh/core
tar zxf presto-server-0.172.tar.gz
ln -s presto-server-0.172 presto
cd presto
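If the tarball still has to be fetched onto the machine, server releases of this line are usually also published on Maven Central; a download sketch, with the URL given as an assumption to verify against the download link above:

cd /opt/beh/core
# Assumed Maven Central location of the 0.172 server tarball
wget https://repo1.maven.org/maven2/com/facebook/presto/presto-server/0.172/presto-server-0.172.tar.gz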

 

1.2.2 Cluster Configuration

Create the configuration directory and the related configuration files:

cd /opt/beh/core/presto
mkdir data
mkdir etc
cd etc
touch config.properties
touch jvm.config
touch node.properties
touch log.properties

Notes:

Config Properties: configuration for the Presto server

JVM Config: command line options for the Java Virtual Machine

Node Properties: environmental configuration specific to each node

Catalog Properties: configuration for Connectors (data sources)

The data directory created above corresponds to the node.data-dir parameter in Node Properties.

 

  • config.properties

coordinator=true
discovery-server.enabled=true
discovery.uri=http://master:8080
node-scheduler.include-coordinator=true
http-server.http.port=8080
query.max-memory=60GB
query.max-memory-per-node=20GB

Notes:

These properties require some explanation:

coordinator: Allow this Presto instance to function as a coordinator (accept queries from clients and manage query execution).

node-scheduler.include-coordinator: Allow scheduling work on the coordinator. For larger clusters, processing work on the coordinator can impact query performance because the machine’s resources are not available for the critical task of scheduling, managing and monitoring query execution.

http-server.http.port: Specifies the port for the HTTP server. Presto uses HTTP for all communication, internal and external.

query.max-memory: The maximum amount of distributed memory that a query may use.

query.max-memory-per-node: The maximum amount of memory that a query may use on any one machine.

discovery-server.enabled: Presto uses the Discovery service to find all the nodes in the cluster. Every Presto instance will register itself with the Discovery service on startup. In order to simplify deployment and avoid running an additional service, the Presto coordinator can run an embedded version of the Discovery service. It shares the HTTP server with Presto and thus uses the same port.

discovery.uri: The URI to the Discovery server. Because we have enabled the embedded version of Discovery in the Presto coordinator, this should be the URI of the Presto coordinator. Replace master:8080 to match the host and port of the Presto coordinator. This URI must not end in a slash.
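The config.properties shown above is a coordinator configuration (and, with node-scheduler.include-coordinator=true, the coordinator also performs work itself). On a dedicated worker node the file looks different; a minimal sketch, written as a shell heredoc and assuming the same port and memory limits as above:

cat > /opt/beh/core/presto/etc/config.properties <<'EOF'
# Worker: not a coordinator; registers with the coordinator's discovery URI
coordinator=false
http-server.http.port=8080
query.max-memory=60GB
query.max-memory-per-node=20GB
discovery.uri=http://master:8080
EOF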

 

  •  jvm.config

-server
-Xmx40G
-XX:+UseG1GC
-XX:G1HeapRegionSize=32M
-XX:+UseGCOverheadLimit
-XX:+ExplicitGCInvokesConcurrent
-XX:+HeapDumpOnOutOfMemoryError
-XX:+ExitOnOutOfMemoryError

Notes:

The -Xmx heap size (40 GB here) should be comfortably larger than query.max-memory-per-node (20 GB above), since Presto needs additional heap beyond per-query memory.

 

  • node.properties

node.environment=production
node.id=ffffffff-ffff-ffff-ffff-fffffffffff1
node.data-dir=/opt/beh/core/presto/data

Notes:

The above properties are described below:

node.environment: The name of the environment. All Presto nodes in a cluster must have the same environment name.

node.id: The unique identifier for this installation of Presto. This must be unique for every node. This identifier should remain consistent across reboots or upgrades of Presto. If running multiple installations of Presto on a single machine (i.e. multiple nodes on the same machine), each installation must have a unique identifier.

node.data-dir: The location (filesystem path) of the data directory. Presto stores logs and other data here. The directory also contains two symlinks, etc and plugin, pointing back into the installation; var/run holds the server PID file and var/log holds the logs.
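Because node.id must be unique on every machine, one convenient sketch (assuming the uuidgen utility from util-linux is installed, as it normally is on CentOS 7) is to generate it instead of editing it by hand:

cd /opt/beh/core/presto/etc
# Replace the placeholder node.id with a freshly generated UUID (run once per node)
sed -i "s/^node.id=.*/node.id=$(uuidgen)/" node.properties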

 

  • log.properties

com.facebook.presto=INFO
com.facebook.presto.server=INFO
com.facebook.presto.hive=INFO

Notes:

The default minimum level is INFO (thus the above example does not actually change anything). There are four levels: DEBUG, INFO, WARN and ERROR.

 

1.2.3 Connector Configuration

Create the connector configuration directory and add the connector configuration files:

cd /opt/beh/core/presto/etc
mkdir catalog
cd catalog
touch hive.properties
touch jmx.properties
touch mongodb.properties
touch mysql.properties

Notes:

Each file in etc/catalog defines one catalog, named after the file: hive.properties creates a catalog called hive, mysql.properties a catalog called mysql, and so on.

 

  • hive.properties

connector.name=hive-hadoop2
hive.metastore.uri=thrift://localhost:9083
hive.config.resources=/opt/beh/core/hadoop/etc/hadoop/core-site.xml,/opt/beh/core/hadoop/etc/hadoop/hdfs-site.xml
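hive.metastore.uri above assumes a Hive metastore Thrift service is already listening on port 9083 of the coordinator host. A quick sketch to check, and one common way to start it if it is not running (adjust to however your Hive deployment manages services):

# Is a metastore Thrift service already listening on 9083?
ss -lnt | grep 9083

# If not, start the Hive metastore service in the background
nohup hive --service metastore > /tmp/metastore.log 2>&1 &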

 

  • jmx.properties

connector.name=jmx
jmx.dump-tables=java.lang:type=Runtime,com.facebook.presto.execution.scheduler:name=NodeScheduler
jmx.dump-period=10s
jmx.max-entries=86400

 

  • mongodb.properties

connector.name=mongodb
mongodb.seeds=hadoop001:37025,hadoop002:37025,hadoop003:37025

 

  • mysql.properties

connector.name=mysql
connection-url=jdbc:mysql://mysqlhost:3306
connection-user=mysqluser
connection-password=mysqlpassword
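Every node in the cluster needs the same etc tree (catalog files included), while node.id must stay unique per node. A minimal distribution sketch, assuming hadoop001-hadoop003 are the worker hosts (borrowed from the MongoDB seed list above, purely for illustration) and passwordless SSH is available:

for host in hadoop001 hadoop002 hadoop003; do
  # Copy the full configuration tree to each worker
  scp -r /opt/beh/core/presto/etc ${host}:/opt/beh/core/presto/
  # Give every worker its own unique node.id
  ssh ${host} "sed -i \"s/^node.id=.*/node.id=\$(uuidgen)/\" /opt/beh/core/presto/etc/node.properties"
done

Workers should additionally receive the worker-style config.properties sketched in section 1.2.2 rather than the coordinator's.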

 

1.2.4 Starting the Service

  • Environment variables

export PRESTO_HOME=/opt/beh/core/presto
export PATH=$PATH:$PRESTO_HOME/bin

Run these on the command line, or add them to /opt/beh/conf/beh_env to make them permanent.

 

  • Startup command

cd /opt/beh/core/presto
./bin/launcher start

 

Notes:

The installation directory contains the launcher script in bin/launcher. Presto can be started as a daemon by running the following:

bin/launcher start

Alternatively, it can be run in the foreground, with the logs and other output being written to stdout/stderr (both streams should be captured if using a supervision system like daemontools):

bin/launcher run

Run the launcher with --help to see the supported commands and command line options. In particular, the --verbose option is very useful for debugging the installation.
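Besides start and run, the launcher provides a few other useful subcommands:

./bin/launcher status    # print whether the server is running
./bin/launcher restart   # stop (if running) and start again
./bin/launcher stop      # gracefully stop the daemon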

 

Logs:

After launching, you can find the log files in var/log:

launcher.log: This log is created by the launcher and is connected to the stdout and stderr streams of the server. It will contain a few log messages that occur while the server logging is being initialized and any errors or diagnostics produced by the JVM.

server.log: This is the main log file used by Presto. It will typically contain the relevant information if the server fails during initialization. It is automatically rotated and compressed.

http-request.log: This is the HTTP request log which contains every HTTP request received by the server. It is automatically rotated and compressed.
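With node.data-dir set to /opt/beh/core/presto/data as above, the logs end up under data/var/log; for example, to follow the main server log during startup:

tail -f /opt/beh/core/presto/data/var/log/server.log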

 

2 Command Line Interface

Download the command line interface program (download link) and copy it to /opt/beh/core/presto/bin:

cd /opt/beh/core/presto/bin
chmod +x presto-cli-0.172-executable.jar
ln -s presto-cli-0.172-executable.jar presto
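If the download link is not at hand, the CLI jar is usually also published on Maven Central; again the URL is an assumption to verify:

cd /opt/beh/core/presto/bin
# Assumed Maven Central location of the 0.172 CLI jar
wget https://repo1.maven.org/maven2/com/facebook/presto/presto-cli/0.172/presto-cli-0.172-executable.jar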

 

Test the connection:

./presto --server localhost:8080 --catalog hive --schema default
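The CLI can also run a single statement non-interactively with --execute, which is convenient for smoke tests and scripting:

./presto --server localhost:8080 --catalog hive --schema default --execute "SHOW TABLES;"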

 

 

An example interactive session (captured on a cluster whose HTTP port was set to 8580 rather than 8080):

[hadoop@sparktest bin]$ ./presto --server localhost:8580 --catalog hive --schema default
presto:default> HELP

Supported commands:
QUIT
EXPLAIN [ ( option [, ...] ) ] <query>
    options: FORMAT { TEXT | GRAPHVIZ }
             TYPE { LOGICAL | DISTRIBUTED }
DESCRIBE <table>
SHOW COLUMNS FROM <table>
SHOW FUNCTIONS
SHOW CATALOGS [LIKE <pattern>]
SHOW SCHEMAS [FROM <catalog>] [LIKE <pattern>]
SHOW TABLES [FROM <schema>] [LIKE <pattern>]
SHOW PARTITIONS FROM <table> [WHERE ...] [ORDER BY ...] [LIMIT n]
USE [<catalog>.]<schema>

 

presto:default> SHOW CATALOGS;
 Catalog
---------
 hive
 jmx
 mongodb
 mysql
 system
(5 rows)

Query 20170418_121353_00035_yr3tu, FINISHED, 1 node
Splits: 1 total, 1 done (100.00%)
0:00 [0 rows, 0B] [0 rows/s, 0B/s]

presto:default> SHOW SCHEMAS FROM HIVE;
       Schema
--------------------
 default
 information_schema
 tmp
 tpc100g
(4 rows)

Query 20170418_121409_00036_yr3tu, FINISHED, 2 nodes
Splits: 18 total, 18 done (100.00%)
0:00 [4 rows, 55B] [43 rows/s, 601B/s]

presto:default> USE hive.tmp;
presto:tmp> show tables;
    Table
-------------
 date_dim
 item
 store_sales
(3 rows)

Query 20170418_121459_00040_yr3tu, FINISHED, 2 nodes
Splits: 18 total, 18 done (100.00%)
0:00 [3 rows, 62B] [40 rows/s, 830B/s]

presto:tmp> select count(*) from item;
 _col0
--------
 204000
(1 row)

Query 20170418_121540_00041_yr3tu, FINISHED, 3 nodes
Splits: 20 total, 20 done (100.00%)
0:02 [204K rows, 11.8MB] [81.8K rows/s, 4.74MB/s]

presto:tmp> quit
