Presto Installation

Yulong_
Published 2017/08/14 09:31

1 Cluster Deployment

1.1 Cluster Environment

1.1.1 System Requirements

Mac OS X or Linux (tested on CentOS 7.2)

Java 8 Update 92 or higher (8u92+), 64-bit (tested with 1.8.0_121, 64-bit)
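The Java requirement above can be checked before installing. The following is a minimal sketch, assuming the legacy Java 8 version string format `1.8.0_<update>`; the function names `java_version_ok` and `detect_java_version` are hypothetical helpers, and any other version scheme is simply rejected by this sketch.

```shell
# Return success only for Java 8 version strings with update >= 92,
# e.g. "1.8.0_121". Other version schemes are rejected by this sketch.
java_version_ok() {
    case "$1" in
        1.8.0_[0-9]*) ;;
        *) return 1 ;;
    esac
    update=${1#1.8.0_}          # strip the "1.8.0_" prefix
    update=${update%%[!0-9]*}   # keep only the leading digits
    [ "$update" -ge 92 ]
}

# Extract the quoted version from `java -version`, which prints to stderr,
# for example: java version "1.8.0_121"
detect_java_version() {
    java -version 2>&1 | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1
}
```

A possible use is `java_version_ok "$(detect_java_version)" && echo "Java OK"`. The 64-bit requirement is not checked here.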

1.1.2 Component Versions

Presto version: 0.172 (download link)

Hadoop version: Apache Hadoop 2.6.4

Hive version: Apache Hive 1.2.1

MongoDB version: mongodb-linux-x86_64-rhel70-3.4.2

 

1.2 Cluster Configuration

1.2.1 Software Deployment

Download the installation package to /opt/beh/core, extract it, and create a symbolic link.

cd /opt/beh/core

tar zxf presto-server-0.172.tar.gz

ln -s presto-server-0.172 presto

cd presto

 

1.2.2 Cluster Configuration

Create the configuration directory and the related configuration files.

cd /opt/beh/core/presto

mkdir data

mkdir etc

cd etc

touch config.properties

touch jvm.config

touch node.properties

touch log.properties

Notes:

Config Properties: configuration for the Presto server

JVM Config: command line options for the Java Virtual Machine

Node Properties: environmental configuration specific to each node

Catalog Properties: configuration for Connectors (data sources)

The data directory created above corresponds to the node.data-dir parameter in Node Properties.

 

  • config.properties

coordinator=true

discovery-server.enabled=true

discovery.uri=http://master:8080

node-scheduler.include-coordinator=true

http-server.http.port=8080

query.max-memory=60GB

query.max-memory-per-node=20GB

Notes:

These properties require some explanation:

coordinator: Allow this Presto instance to function as a coordinator (accept queries from clients and manage query execution).

node-scheduler.include-coordinator: Allow scheduling work on the coordinator. For larger clusters, processing work on the coordinator can impact query performance because the machine’s resources are not available for the critical task of scheduling, managing and monitoring query execution.

http-server.http.port: Specifies the port for the HTTP server. Presto uses HTTP for all communication, internal and external.

query.max-memory: The maximum amount of distributed memory that a query may use.

query.max-memory-per-node: The maximum amount of memory that a query may use on any one machine.

discovery-server.enabled: Presto uses the Discovery service to find all the nodes in the cluster. Every Presto instance will register itself with the Discovery service on startup. In order to simplify deployment and avoid running an additional service, the Presto coordinator can run an embedded version of the Discovery service. It shares the HTTP server with Presto and thus uses the same port.

discovery.uri: The URI to the Discovery server. Because we have enabled the embedded version of Discovery in the Presto coordinator, this should be the URI of the Presto coordinator. Replace master:8080 to match the host and port of the Presto coordinator. This URI must not end in a slash.
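The file shown above configures the coordinator. On a worker node the same file is usually shorter: `coordinator=false` and no embedded Discovery service. A minimal sketch, assuming workers reuse the port and memory limits from the coordinator example and reach the coordinator at `master:8080`; the helper name `write_worker_config` is hypothetical.

```shell
# Write a worker-side config.properties into the given etc directory.
# The values mirror the coordinator example; adjust them to the cluster.
write_worker_config() {
    etc_dir=$1
    mkdir -p "$etc_dir" || return 1
    cat > "$etc_dir/config.properties" <<'EOF'
coordinator=false
http-server.http.port=8080
query.max-memory=60GB
query.max-memory-per-node=20GB
discovery.uri=http://master:8080
EOF
}
```

On each worker this could be called as `write_worker_config /opt/beh/core/presto/etc`.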

 

  •  jvm.config

-server

-Xmx40G

-XX:+UseG1GC

-XX:G1HeapRegionSize=32M

-XX:+UseGCOverheadLimit

-XX:+ExplicitGCInvokesConcurrent

-XX:+HeapDumpOnOutOfMemoryError

-XX:+ExitOnOutOfMemoryError

 

  • node.properties

node.environment=production

node.id=ffffffff-ffff-ffff-ffff-fffffffffff1

node.data-dir=/opt/beh/core/presto/data

Notes:

The above properties are described below:

node.environment: The name of the environment. All Presto nodes in a cluster must have the same environment name.

node.id: The unique identifier for this installation of Presto. This must be unique for every node. This identifier should remain consistent across reboots or upgrades of Presto. If running multiple installations of Presto on a single machine (i.e. multiple nodes on the same machine), each installation must have a unique identifier.

node.data-dir: The location (filesystem path) of the data directory. Presto will store logs and other data here. The directory also holds two symbolic links, to the "etc" and "plugin" directories; var/run stores the server PID file and var/log stores the logs.
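Because node.id must be unique per node yet stay the same across restarts and upgrades, one option is to generate it once and never overwrite it. A minimal sketch under that assumption; the helper name `write_node_properties` is hypothetical, and the UUID comes from `uuidgen` or, on Linux, from `/proc/sys/kernel/random/uuid`.

```shell
# Create node.properties once; later runs leave the existing node.id untouched.
write_node_properties() {
    etc_dir=$1
    data_dir=$2
    file="$etc_dir/node.properties"
    if [ -f "$file" ]; then
        return 0    # keep node.id stable across restarts and upgrades
    fi
    if command -v uuidgen >/dev/null 2>&1; then
        node_id=$(uuidgen)
    elif [ -r /proc/sys/kernel/random/uuid ]; then
        node_id=$(cat /proc/sys/kernel/random/uuid)
    else
        echo "no UUID source found; set node.id by hand" >&2
        return 1
    fi
    mkdir -p "$etc_dir" || return 1
    printf 'node.environment=production\nnode.id=%s\nnode.data-dir=%s\n' \
        "$node_id" "$data_dir" > "$file"
}
```

For the layout used in this article the call would be `write_node_properties /opt/beh/core/presto/etc /opt/beh/core/presto/data`, run separately on every node.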

 

  • log.properties

com.facebook.presto=INFO

com.facebook.presto.server=INFO

com.facebook.presto.hive=INFO

Notes:

The default minimum level is INFO (thus the above example does not actually change anything). There are four levels: DEBUG, INFO, WARN and ERROR.

 

1.2.3 Connector Configuration

Create the connector configuration directory and configure the connectors.

cd /opt/beh/core/presto/etc

mkdir catalog

cd catalog

touch hive.properties

touch jmx.properties

touch mongodb.properties

touch mysql.properties

 

  • hive.properties

connector.name=hive-hadoop2

hive.metastore.uri=thrift://localhost:9083

hive.config.resources=/opt/beh/core/hadoop/etc/hadoop/core-site.xml,/opt/beh/core/hadoop/etc/hadoop/hdfs-site.xml

 

  • jmx.properties

connector.name=jmx

jmx.dump-tables=java.lang:type=Runtime,com.facebook.presto.execution.scheduler:name=NodeScheduler

jmx.dump-period=10s

jmx.max-entries=86400

 

  • mongodb.properties

connector.name=mongodb

mongodb.seeds=hadoop001:37025,hadoop002:37025,hadoop003:37025

 

  • mysql.properties

connector.name=mysql

connection-url=jdbc:mysql://mysqlhost:3306

connection-user=mysqluser

connection-password=mysqlpassword

 

1.2.4 Starting the Service

  • Environment variables

export PRESTO_HOME=/opt/beh/core/presto

export PATH=$PATH:$PRESTO_HOME/bin

Run these on the command line, or add them to /opt/beh/conf/beh_env.

 

  • Start command

cd /opt/beh/core/presto

./bin/launcher start

 

Notes:

The installation directory contains the launcher script in bin/launcher. Presto can be started as a daemon by running the following:

bin/launcher start

Alternatively, it can be run in the foreground, with the logs and other output being written to stdout/stderr (both streams should be captured if using a supervision system like daemontools):

bin/launcher run

Run the launcher with --help to see the supported commands and command line options. In particular, the --verbose option is very useful for debugging the installation.

 

Logs:

After launching, you can find the log files in var/log:

launcher.log: This log is created by the launcher and is connected to the stdout and stderr streams of the server. It will contain a few log messages that occur while the server logging is being initialized and any errors or diagnostics produced by the JVM.

server.log: This is the main log file used by Presto. It will typically contain the relevant information if the server fails during initialization. It is automatically rotated and compressed.

http-request.log: This is the HTTP request log which contains every HTTP request received by the server. It is automatically rotated and compressed.
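Since `bin/launcher start` returns before the server is ready, a script may want to wait until server.log reports a completed startup. A minimal sketch: the helper name `wait_for_log_marker` is hypothetical, and the marker text "SERVER STARTED" is an assumption based on the line Presto usually logs at the end of initialization; check it against your own server.log.

```shell
# Poll a log file once per second until it contains the marker text,
# giving up after the given number of attempts (default 30).
wait_for_log_marker() {
    log_file=$1
    marker=$2
    attempts=${3:-30}
    while [ "$attempts" -gt 0 ]; do
        if [ -f "$log_file" ] && grep -qF -- "$marker" "$log_file"; then
            return 0
        fi
        attempts=$((attempts - 1))
        sleep 1
    done
    return 1
}

# Example call for the layout in this article (var/log lives under node.data-dir):
# wait_for_log_marker /opt/beh/core/presto/data/var/log/server.log "SERVER STARTED" 60
```

The function only reads the log, so it works the same whether the server was started with `start` or `run`.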

 

2 Command Line Interface

Download the command line interface program (download link) and copy it to /opt/beh/core/presto/bin.

cd /opt/beh/core/presto/bin

chmod +x presto-cli-0.172-executable.jar

ln -s presto-cli-0.172-executable.jar presto

 

Test the connection:

./presto --server localhost:8080 --catalog hive --schema default
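For scripted checks, the CLI can also run a single statement and exit instead of opening the interactive prompt, using its `--execute` option. A minimal sketch; the function name `presto_query` and the variables `PRESTO_CLI` and `PRESTO_SERVER` are hypothetical names introduced here, with defaults taken from the paths and port used in this article.

```shell
# Run one statement through the Presto CLI and print the result to stdout.
# PRESTO_CLI and PRESTO_SERVER are optional overrides for this sketch.
presto_query() {
    "${PRESTO_CLI:-/opt/beh/core/presto/bin/presto}" \
        --server "${PRESTO_SERVER:-localhost:8080}" \
        --catalog hive --schema default \
        --execute "$1"
}
```

For example, `presto_query "SHOW CATALOGS"` should list the catalogs configured under etc/catalog once the server is running.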

 

 

[hadoop@sparktest bin]$ ./presto --server localhost:8580 --catalog hive --schema default

presto:default> HELP

 

Supported commands:

QUIT

EXPLAIN [ ( option [, ...] ) ] <query>

    options: FORMAT { TEXT | GRAPHVIZ }

             TYPE { LOGICAL | DISTRIBUTED }

DESCRIBE <table>

SHOW COLUMNS FROM <table>

SHOW FUNCTIONS

SHOW CATALOGS [LIKE <pattern>]

SHOW SCHEMAS [FROM <catalog>] [LIKE <pattern>]

SHOW TABLES [FROM <schema>] [LIKE <pattern>]

SHOW PARTITIONS FROM <table> [WHERE ...] [ORDER BY ...] [LIMIT n]

USE [<catalog>.]<schema>

 

presto:default> SHOW CATALOGS;

 Catalog

---------

 hive   

 jmx    

 mongodb

 mysql  

 system 

(5 rows)

 

Query 20170418_121353_00035_yr3tu, FINISHED, 1 node

Splits: 1 total, 1 done (100.00%)

0:00 [0 rows, 0B] [0 rows/s, 0B/s]

 

presto:default> SHOW SCHEMAS FROM HIVE;

       Schema      

--------------------

 default           

 information_schema

 tmp               

 tpc100g           

(4 rows)

 

Query 20170418_121409_00036_yr3tu, FINISHED, 2 nodes

Splits: 18 total, 18 done (100.00%)

0:00 [4 rows, 55B] [43 rows/s, 601B/s]

 

presto:default> USE hive.tmp;

presto:tmp> show tables;

    Table   

-------------

 date_dim   

 item       

 store_sales

(3 rows)

 

Query 20170418_121459_00040_yr3tu, FINISHED, 2 nodes

Splits: 18 total, 18 done (100.00%)

0:00 [3 rows, 62B] [40 rows/s, 830B/s]

 

presto:tmp> select count(*) from item;

 _col0 

--------

 204000

(1 row)

 

Query 20170418_121540_00041_yr3tu, FINISHED, 3 nodes

Splits: 20 total, 20 done (100.00%)

0:02 [204K rows, 11.8MB] [81.8K rows/s, 4.74MB/s]  

 

presto:tmp> quit
