Presto Installation


1 Cluster Deployment

1.1 Cluster Environment

1.1.1 System Requirements

Mac OS X or Linux (CentOS 7.2 was used for testing)

Java 8 Update 92 or higher (8u92+), 64-bit (1.8.0_121, 64-bit, was used for testing)

1.1.2 Component Versions

Presto version: 0.172 (download link)

Hadoop version: Apache Hadoop 2.6.4

Hive version: Apache Hive 1.2.1

MongoDB version: mongodb-linux-x86_64-rhel70-3.4.2

 

1.2 Cluster Configuration

1.2.1 Software Deployment

Download the installation package to /opt/beh/core, extract it, and create a symlink:

cd /opt/beh/core
tar zxf presto-server-0.172.tar.gz
ln -s presto-server-0.172 presto
cd presto

 

1.2.2 Server Configuration

Create the configuration directory and the relevant configuration files:

cd /opt/beh/core/presto
mkdir data
mkdir etc
cd etc
touch config.properties
touch jvm.config
touch node.properties
touch log.properties

Notes:

Config Properties: configuration for the Presto server

JVM Config: command line options for the Java Virtual Machine

Node Properties: environmental configuration specific to each node

Catalog Properties: configuration for Connectors (data sources)

The data directory created above corresponds to the node.data-dir parameter in Node Properties.

 

  • config.properties

coordinator=true
discovery-server.enabled=true
discovery.uri=http://master:8080
node-scheduler.include-coordinator=true
http-server.http.port=8080
query.max-memory=60GB
query.max-memory-per-node=20GB

Notes:

These properties require some explanation:

coordinator: Allow this Presto instance to function as a coordinator (accept queries from clients and manage query execution).

node-scheduler.include-coordinator: Allow scheduling work on the coordinator. For larger clusters, processing work on the coordinator can impact query performance because the machine’s resources are not available for the critical task of scheduling, managing and monitoring query execution.

http-server.http.port: Specifies the port for the HTTP server. Presto uses HTTP for all communication, internal and external.

query.max-memory: The maximum amount of distributed memory that a query may use.

query.max-memory-per-node: The maximum amount of memory that a query may use on any one machine.

discovery-server.enabled: Presto uses the Discovery service to find all the nodes in the cluster. Every Presto instance will register itself with the Discovery service on startup. In order to simplify deployment and avoid running an additional service, the Presto coordinator can run an embedded version of the Discovery service. It shares the HTTP server with Presto and thus uses the same port.

discovery.uri: The URI to the Discovery server. Because we have enabled the embedded version of Discovery in the Presto coordinator, this should be the URI of the Presto coordinator. Replace master:8080 to match the host and port of the Presto coordinator. This URI must not end in a slash.
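If more machines are added later, the dedicated worker nodes use a slightly different config.properties than the coordinator above. A minimal sketch, reusing the memory settings and pointing discovery.uri at the coordinator (master:8080) as configured here:

coordinator=false
http-server.http.port=8080
query.max-memory=60GB
query.max-memory-per-node=20GB
discovery.uri=http://master:8080

discovery-server.enabled and node-scheduler.include-coordinator are set only on the coordinator; with node-scheduler.include-coordinator=true, this single machine can already run queries on its own.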

 

  • jvm.config

-server
-Xmx40G
-XX:+UseG1GC
-XX:G1HeapRegionSize=32M
-XX:+UseGCOverheadLimit
-XX:+ExplicitGCInvokesConcurrent
-XX:+HeapDumpOnOutOfMemoryError
-XX:+ExitOnOutOfMemoryError

Notes:

These options are passed as-is to the JVM command line when the server is launched. -Xmx40G sets the maximum heap size and should be sized to the memory available on each node; it must be larger than query.max-memory-per-node above. Because an OutOfMemoryError typically leaves the JVM in an inconsistent state, a heap dump is written for debugging and the process is forcibly terminated when one occurs.

 

  • node.properties

node.environment=production
node.id=ffffffff-ffff-ffff-ffff-fffffffffff1
node.data-dir=/opt/beh/core/presto/data

Notes:

The above properties are described below:

node.environment: The name of the environment. All Presto nodes in a cluster must have the same environment name.

node.id: The unique identifier for this installation of Presto. This must be unique for every node. This identifier should remain consistent across reboots or upgrades of Presto. If running multiple installations of Presto on a single machine (i.e. multiple nodes on the same machine), each installation must have a unique identifier.

node.data-dir: The location (filesystem path) of the data directory. Presto stores logs and other data here; after startup it also contains symlinks to the etc and plugin directories, var/run holds the server PID file, and var/log holds the log files.
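Since node.id must be unique per node and stable across restarts, one convenient way to produce it on each host is uuidgen (part of util-linux on CentOS); this is just a sketch, and any stable unique string works equally well:

# Print a fresh UUID to use as this host's node.id
# (replace the node.id line in etc/node.properties with the output)
uuidgen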

 

  • log.properties

com.facebook.presto=INFO
com.facebook.presto.server=INFO
com.facebook.presto.hive=INFO

Notes:

The default minimum level is INFO (thus the above example does not actually change anything). There are four levels: DEBUG, INFO, WARN and ERROR.
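For example, to get more detail from the Hive connector while troubleshooting, its logger can be raised to DEBUG; this is verbose, so revert it to INFO afterwards:

com.facebook.presto.hive=DEBUG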

 

1.2.3 Connector Configuration

Create the connector (catalog) configuration directory and the relevant connector configuration files:

cd /opt/beh/core/presto/etc
mkdir catalog
cd catalog
touch hive.properties
touch jmx.properties
touch mongodb.properties
touch mysql.properties

Notes:

Each .properties file in etc/catalog registers one catalog, named after the file (for example, hive.properties creates a catalog called hive). The connector.name property in each file selects the connector that backs that catalog.

 

  • hive.properties

connector.name=hive-hadoop2
hive.metastore.uri=thrift://localhost:9083
hive.config.resources=/opt/beh/core/hadoop/etc/hadoop/core-site.xml,/opt/beh/core/hadoop/etc/hadoop/hdfs-site.xml

 

  • jmx.properties

connector.name=jmx
jmx.dump-tables=java.lang:type=Runtime,com.facebook.presto.execution.scheduler:name=NodeScheduler
jmx.dump-period=10s
jmx.max-entries=86400

 

  • mongodb.properties

connector.name=mongodb
mongodb.seeds=hadoop001:37025,hadoop002:37025,hadoop003:37025

 

  • mysql.properties

connector.name=mysql
connection-url=jdbc:mysql://mysqlhost:3306
connection-user=mysqluser
connection-password=mysqlpassword

 

1.2.4 Service Startup

  • Environment variables

export PRESTO_HOME=/opt/beh/core/presto
export PATH=$PATH:$PRESTO_HOME/bin

Run these on the command line, or add them to /opt/beh/conf/beh_env to make them permanent.

 

  • Startup command

cd /opt/beh/core/presto
./bin/launcher start

 

Notes:

The installation directory contains the launcher script in bin/launcher. Presto can be started as a daemon by running the following:

bin/launcher start

Alternatively, it can be run in the foreground, with the logs and other output being written to stdout/stderr (both streams should be captured if using a supervision system like daemontools):

bin/launcher run

Run the launcher with --help to see the supported commands and command line options. In particular, the --verbose option is very useful for debugging the installation.
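Besides start and run, the launcher offers the usual service-management subcommands; the ones most useful during setup are sketched below:

./bin/launcher status    # report whether the server is running
./bin/launcher restart   # stop the server if it is running, then start it again
./bin/launcher stop      # shut the server down gracefully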

 

Logs:

After launching, you can find the log files in var/log:

launcher.log: This log is created by the launcher and is connected to the stdout and stderr streams of the server. It will contain a few log messages that occur while the server logging is being initialized and any errors or diagnostics produced by the JVM.

server.log: This is the main log file used by Presto. It will typically contain the relevant information if the server fails during initialization. It is automatically rotated and compressed.

http-request.log: This is the HTTP request log which contains every HTTP request received by the server. It is automatically rotated and compressed.
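With node.data-dir set as above, these files live under /opt/beh/core/presto/data/var/log. A quick way to confirm a clean start is to tail server.log; the exact startup banner is an assumption and may vary between versions:

tail -n 50 /opt/beh/core/presto/data/var/log/server.log
# a successful startup normally ends with a "SERVER STARTED" message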

 

2 Command-Line Interface

Download the command-line interface program and copy it to /opt/beh/core/presto/bin (download link):

cd /opt/beh/core/presto/bin
chmod +x presto-cli-0.172-executable.jar
ln -s presto-cli-0.172-executable.jar presto
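A quick sanity check that the jar is executable and the presto symlink resolves, assuming this CLI build supports the --version flag:

./presto --version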

 

Test the connection:

./presto --server localhost:8080 --catalog hive --schema default

 

 

[hadoop@sparktest bin]$ ./presto --server localhost:8580 --catalog hive --schema default
presto:default> HELP

Supported commands:
QUIT
EXPLAIN [ ( option [, ...] ) ] <query>
    options: FORMAT { TEXT | GRAPHVIZ }
             TYPE { LOGICAL | DISTRIBUTED }
DESCRIBE <table>
SHOW COLUMNS FROM <table>
SHOW FUNCTIONS
SHOW CATALOGS [LIKE <pattern>]
SHOW SCHEMAS [FROM <catalog>] [LIKE <pattern>]
SHOW TABLES [FROM <schema>] [LIKE <pattern>]
SHOW PARTITIONS FROM <table> [WHERE ...] [ORDER BY ...] [LIMIT n]
USE [<catalog>.]<schema>

presto:default> SHOW CATALOGS;
 Catalog
---------
 hive
 jmx
 mongodb
 mysql
 system
(5 rows)

Query 20170418_121353_00035_yr3tu, FINISHED, 1 node
Splits: 1 total, 1 done (100.00%)
0:00 [0 rows, 0B] [0 rows/s, 0B/s]

presto:default> SHOW SCHEMAS FROM HIVE;
       Schema
--------------------
 default
 information_schema
 tmp
 tpc100g
(4 rows)

Query 20170418_121409_00036_yr3tu, FINISHED, 2 nodes
Splits: 18 total, 18 done (100.00%)
0:00 [4 rows, 55B] [43 rows/s, 601B/s]

presto:default> USE hive.tmp;
presto:tmp> show tables;
    Table
-------------
 date_dim
 item
 store_sales
(3 rows)

Query 20170418_121459_00040_yr3tu, FINISHED, 2 nodes
Splits: 18 total, 18 done (100.00%)
0:00 [3 rows, 62B] [40 rows/s, 830B/s]

presto:tmp> select count(*) from item;
 _col0
--------
 204000
(1 row)

Query 20170418_121540_00041_yr3tu, FINISHED, 3 nodes
Splits: 20 total, 20 done (100.00%)
0:02 [204K rows, 11.8MB] [81.8K rows/s, 4.74MB/s]

presto:tmp> quit
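For scripted checks, the CLI can also run a single statement non-interactively with --execute and then exit; a sketch against the same hive catalog (adjust --server to your coordinator's host and port):

./presto --server localhost:8080 --catalog hive --schema tmp --execute "select count(*) from item"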
