文档章节

What can you do with PostgreSQL and JSON?

一号男嘉宾
 一号男嘉宾
发布于 2015/04/24 20:29
字数 813
阅读 62
收藏 0


PostgreSQL 9.2 added a native JSON data type, but didn’t add much else. You’ve got three options if you actually want to do something with it:

  1. Wait for PostgreSQL 9.3 (or use the beta)
  2. Use the plv8 extension. Valid option, but more DIY (you’ll have to define your own functions)
  3. Use the json_enhancements extension, which backports the new JSON functionality in 9.3 to 9.2

I wanted to use this stuff now, and I opted to go with option 3. I wrote a blog post which should help you get going if you want to go this route: adding json_enhancements to PostgreSQL 9.2.

So let’s assume you’re on either 9.3, or 9.2 with json_enhancements. What can you do? Lots! All the new JSON operators and functions are in the 9.3 documentation, so I’m going to run through some of the more fun things you can do along with a real-world use case.

Get started

Create a database to play about in:


createdb json_test
psql json_test




With some sample data:


CREATE TABLE books ( id integer, data json ); INSERT INTO books VALUES (1, '{ "name": "Book the First", "author": { "first_name": "Bob", "last_name": "White" } }'); INSERT INTO books VALUES (2, '{ "name": "Book the Second", "author": { "first_name": "Charles", "last_name": "Xavier" } }'); INSERT INTO books VALUES (3, '{ "name": "Book the Third", "author": { "first_name": "Jim", "last_name": "Brown" } }');



Selecting

You can use the JSON operators to pull values out of JSON columns:


SELECT id, data->>'name' AS name FROM books; id | name ----+-----------------  1 | Book the First 2 | Book the Second 3 | Book the Third



The -> operator returns the original JSON type (which might be an object), whereas ->> returns text. You can use the -> to return a nested object and thus chain the operators:

SELECT id, data->'author'->>'first_name' as author_first_name FROM books; id | author_first_name ----+-------------------  1 | Bob 2 | Charles 3 | Jim



How cool is that?

Filtering

Of course, you can also select rows based on a value inside your JSON:


SELECT * FROM books WHERE data->>'name' = 'Book the First'; id | data ----+---------------------------------------------------------------------------------------  1 | '{ "name": "Book the First", "author": { "first_name": "Bob", "last_name": "White" } }'



You can also find rows based on the value of a nested JSON object:


SELECT * FROM books WHERE data->'author'->>'first_name' = 'Charles'; id | data ----+---------------------------------------------------------------------------------------------  2 | '{ "name": "Book the Second", "author": { "first_name": "Charles", "last_name": "Xavier" } }'



Indexing

You can add indexes on any of these using PostgreSQL’s expression indexes, which means you can even add unique constraints based on your nested JSON data:


CREATE UNIQUE INDEX books_author_first_name ON books ((data->'author'->>'first_name')); INSERT INTO books VALUES (4, '{ "name": "Book the Fourth", "author": { "first_name": "Charles", "last_name": "Davis" } }'); ERROR: duplicate key value violates unique constraint "books_author_first_name" DETAIL: Key (((data -> 'author'::text) ->> 'first_name'::text))=(Charles) already exists.



Expression indexes are somewhat expensive to create, but once in place will make querying on any JSON property very fast.

A real world example

OK, let’s give this a go with a real life use case. Let’s say we’re tracking analytics, so we have an events table:


CREATE TABLE events ( name varchar(200), visitor_id varchar(200), properties json, browser json );



We’re going to store events in this table, like pageviews. Each event has properties, which could be anything (e.g. current page) and also sends information about the browser (like OS, screen resolution, etc). Both of these are completely free form and could change over time (as we think of extra stuff to track).

Let’s insert a couple of events:


INSERT INTO events VALUES ( 'pageview', '1', '{ "page": "/" }', '{ "name": "Chrome", "os": "Mac", "resolution": { "x": 1440, "y": 900 } }' ); INSERT INTO events VALUES ( 'pageview', '2', '{ "page": "/" }', '{ "name": "Firefox", "os": "Windows", "resolution": { "x": 1920, "y": 1200 } }' ); INSERT INTO events VALUES ( 'pageview', '1', '{ "page": "/account" }', '{ "name": "Chrome", "os": "Mac", "resolution": { "x": 1440, "y": 900 } }' ); INSERT INTO events VALUES ( 'purchase', '5', '{ "amount": 10 }', '{ "name": "Firefox", "os": "Windows", "resolution": { "x": 1024, "y": 768 } }' ); INSERT INTO events VALUES ( 'purchase', '15', '{ "amount": 200 }', '{ "name": "Firefox", "os": "Windows", "resolution": { "x": 1280, "y": 800 } }' ); INSERT INTO events VALUES ( 'purchase', '15', '{ "amount": 500 }', '{ "name": "Firefox", "os": "Windows", "resolution": { "x": 1280, "y": 800 } }' );



Hm, this is starting to remind me of MongoDB!

Collect some stats

Using the JSON operators, combined with traditional PostgreSQL aggregate functions, we can pull out whatever we want. You have the full might of an RDBMS at your disposal.

Browser usage?

SELECT browser->>'name' AS browser, count(browser) FROM events GROUP BY browser->>'name'; browser | count ---------+-------  Firefox | 3 Chrome | 2 




Total revenue per visitor?

SELECT visitor_id, SUM(CAST(properties->>'amount' AS integer)) AS total FROM events WHERE CAST(properties->>'amount' AS integer) > 0 GROUP BY visitor_id; visitor_id | total ------------+-------  5 | 10 15 | 700 




Average screen resolution?

SELECT AVG(CAST(browser->'resolution'->>'x' AS integer)) AS width, AVG(CAST(browser->'resolution'->>'y' AS integer)) AS height FROM events; width | height -----------------------+----------------------  1397.3333333333333333 | 894.6666666666666667 




You’ve probably got the idea, so I’ll leave it here.



© 著作权归作者所有

共有 人打赏支持
一号男嘉宾
粉丝 132
博文 28
码字总数 40512
作品 0
成都
私信 提问
Slick-pg v0.8.1 发布,PostgreSQL 的 Slick 扩展

Slick-pg 发布,主要的更新有: 增加 jsonb 支持(ps:为保持向前兼容,默认绑定的 pg json 类型还是 'json') 增加了更多 json 操作符和函数的支持 --------------------------------------...

Tu_Minglei
2015/02/03
752
0
Slick-pg v0.6.5.2 发布,Slick 的 PG 扩展

Slick-pg v0.6.5.2 发布了,相比 0.6.3 主要更新有: - 增加了 pg LTree 的支持 - search 增强:更多操作符/方法;允许指定语言 - date2/threeten 插件:允许选择性的绑定 Duration/Period ...

Tu_Minglei
2014/10/08
327
0
slick-pg v0.1.5 发布

这个版本的更新主要是: 增加了对 json 的支持 (PostgreSQL 9.3 正式版已经发布了,所以我适时加入了对 pg json 的支持。功能其实前两个星期就已经开发测试好了,但公司跟联邦政府的那个项目...

Tu_Minglei
2013/09/29
315
0
Slick-pg v0.5.0 发布, PostgreSQL 的 Slick 扩展

针对 Slick 2 的第一个 Slick-pg 正式版 v0.5.0 发布了,它是在 Slick-pg v0.2.2 的代码基础上开发的,相关的更新有: - Slick 依赖升级到 v2.0.0 - 基本的 pg composite type 支持 - array ...

Tu_Minglei
2014/02/07
292
0
HTAP数据库 PostgreSQL 场景与性能测试之 46 - (OLTP) 大json字段的高并发更新

标签 PostgreSQL , HTAP , OLTP , OLAP , 场景与性能测试 背景 PostgreSQL是一个历史悠久的数据库,历史可以追溯到1973年,最早由2014计算机图灵奖得主,关系数据库的鼻祖MichaelStonebrake...

德哥
07/28
0
0

没有更多内容

加载失败,请刷新页面

加载更多

走上真正的教育之路——《中国最美的语文》读后感3100字

走上真正的教育之路——《中国最美的语文》读后感3100字: 文:张平。《中国最美的语文》,乍一听,你可能会觉得这个作者或者编辑有些大言不惭,竟然起这么大而狂的名字,可能名不副实。我没...

原创小博客
26分钟前
2
0
tomcat线程模型

Connector结构 BIO模式 NIO模式

grace_233
46分钟前
2
0
Javascript

变量定义 以$,_,字母开头 大小写敏感 var 关键字声明变量 声明未赋值undefined 数据类型 字符串,数字,布尔,数组,NULL,undefined 变量均为对象 函数 无默认值 var声明的是局部变量 函数外声明...

关元
48分钟前
1
0
文件系统、服务、防火墙、SELINUX——安全四大金刚

一提到安全,大家都会想到防火墙,和文件系统权限。而实际工作环境中,我们在Linux的安全配置,会涉及到四个级别。我们思考一个场景,你要在百度盘中存放一个文件,这个动作需要考虑下面四个...

Linux就该这么学
49分钟前
3
0
从源码角度理解Java设计模式——门面模式

一、门面模式介绍 门面模式定义:也叫外观模式,定义了一个访问子系统的接口,除了这个接口以外,不允许其他访问子系统的行为发生。 适用场景:子系统很复杂时,增加一个接口供外部访问。 优...

我叫刘半仙
59分钟前
3
0

没有更多内容

加载失败,请刷新页面

加载更多

返回顶部
顶部