文档章节

[ZT]Our Go is fine but our SQL is great

taodongl
 taodongl
发布于 2017/09/08 15:42
字数 2180
阅读 9
收藏 0
点赞 0
评论 0

From: medium

​​A few months ago fat wrote about going to hell and back re-writing Bumpers in React. To round out our thought leading, I’m going to talk about the Bumpers backend. We haven’t done anything quite as dramatic as a re-write but we’ve been learning how to do things The Go Way and it’s affected our thinking to the point that it feels worth writing down. So here goes.

Like Jacob I also hate frameworks, I hate not using a framework, and I hate rolling my own. On the flip side I love writing SQL (which is apparently not a normal thing to love and a preview of the rest of this post).

Preface

The bumpers backend is a Go HTTP server. It runs on Heroku and it’s mostly an API server, but it has a few routes that serve up our sweet frontend. The Bumpers API exposes some of the data in our Postgres database to Bumpers clients. Really shocking for an API, but this is the world we live in now.

We’ve tried a couple different approaches to talking to that database — we’ve settled on using a couple Repository interfaces and hiding all our DB access behind them.

type EpisodeRepo interface {
  Create(*api.Episode) (*api.Episode, error)
  GetByID(string) (*api.Episode, error)
  GetByURLToken(string) (*api.Episode, error)
  GetTrending(int, api.Cursor) ([]api.Episode, error)
}

​​Using an interface like this isn’t new or interesting — it’s called a DAO or the Repository Pattern and you can find it in on Martin Fowler’s blog or in the Gang of Four book. You can find a lot more writing on the general idea if you’re into that.

​​​​What is actually unusual about these Repos in the year 2017 is that we don’t have an ORM or any other database library generating SQL for us. We’re writing all our SQL by hand.

Why Would You Even Do That

​​When I say ORM I really mean a library that really tries to own your interaction with the database. The most comprehensive ORMs are things like ActiveRecord and Hibernate but under that loose of a definition you could extend the label all the way down to libraries like Sequel and Korma, even though they’re not trying to own your whole life or even dealing with “objects”. I’m actively not talking about basic things like drivers or connection pools or supporting query variables — please don’t write your own SQL escaping in 2017.

​​People tend to grab one of these libraries for some of the following reasons:​​

  • ​​It’s part of a framework that’s being used. Godspeed.​​
  • ​​The application needs to be database agnostic. This isn’t a real problem.
  • ​​You like $COOL_PROGRAMMING_LANGUAGE more than you like SQL. It’s easier to write, look at, and understand code in this ORM than it is to look at and understand a SQL query.
  • ​​The library is a helpful abstraction. It lets you compose and re-use queries in a way that writing SQL yourself wouldn’t. It makes it easy to deal with data in your language and not SQL Result Rows. It probably helps you develop a little bit faster.

​​We’re not using a framework that comes with an opinion about The Database, so we’re left with two real criteria to evaluate our libraries on.

​​I’m personally extremely into SQL and Jacob and Ian don’t hate it (yet?), so hiding SQL behind Go isn’t a reason by itself. There’s also evidence that applications that try to deny the existence of a Database behind the curtain end up having some pretty gnarly edge cases. For example, if you’re really using ActiveRecord, you end up having to know ActiveRecord so well that you know what SQL it generates and then decide if it’s ok to ship that SQL. I think that completely invalidates the “it’s easier to understand” line of thinking.

​​Ok That Is A Good Reason​

​​So we’re down to deciding whether or not a database library provides a “helpful abstraction”. The value of a good abstraction is HUGE even if you still have to acknowledge what’s going on with your database underneath. I think it’s totally reasonable to have to understand all of the SQL an ActiveRecord query generates and also think that Scopes are a good and useful tool.​​

​​For Bumpers, we looked at what’s currently out there for Go. The state-of-the-art libraries that claim some sort of ORM-style heritage seem to provide struct mapping and some help with simple SELECT, INSERT and UPDATE queries and not much more. Some of them provide ActiveRecord-inspired DSLs for composing query strings (there were a lot of strings) on top of that. When we talked to folks using one or another of those libraries in production we tended to get back an ascii-shrug-man and not much more. Bummer.​​

We did find, and IMMEDIATELY start using sqlx. sqlx does the struct-mapping that the other libraries do, and provides Get and Select functions that handle the boilerplate of doing a db.Query Row or db.Query scanning the results into a struct or slice of structs. With sqlx our query methods all look something like this:

type episodeRepo struct {
  getByID *sqlx.Stmt // this gets prepared when we new up a struct
}

func (e *episodeRepo) GetByID(id string) (*api.Episode, error) {
  var episode api.Episode
  if err := e.getByID.Get(&episode, id); err != nil {
    return nil, convertError(err)
  }  
  return &episode, nil
}

Together that’s about 90% of what we felt like we were getting out of any of the other libraries, and comes with (almost) no overhead of figuring out what’s going on under the covers.

So, we decided to just keep doing the raw-SQL statements thing.

The Other 10%

So we’re now left with the problem that raw SQL is kinda verbose and hard to re-use. We don’t really have a great answer, but we have started organizing our Repo files in a way that helps.

Each one of our repos lives in a file by itself. Those files have the repo interface up top, followed by a bunch of consts defining the SQL queries we’re gonna need, and then a struct that implements the interface.

The file that the CategoryRepo from above lives in looks something like this:

type EpisodeRepo interface {
  Create(episode *api.Episode) (*api.Episode, error)
  GetByID(id string) (*api.Episode, error)
  GetByURLToken(token string) (*api.Episode, error)
  GetTrending(limit int) ([]api.Episode, error)
}

const (
  episodeInsert = `
  INSERT INTO episodes (image_filename, ...)
  VALUES (:image_filename, ...)
  RETURNING episode_id
  `
  episodeSelectBase = `
  SELECT
    episodes.episode_id,
    episodes.created_at,
    episodes.updated_at,
    episodes.image_filename
    -- and so many more
  FROM episodes
  `
  episodeSelectByID       = episodeSelectBase + `WHERE episode_id = $1`
  episodeSelectByURLToken = episodeSelectBase + `WHERE url_token = $1`
  episodeSelectTrending   = categorySelectBase + `ORDER BY trending_score LIMTI $1`
  // and more ...
)

type psqlEpisodeRepo struct {
  create           *sqlx.NamedStmt
  selectByID       *sqlx.Stmt
  selectByURLToken *sqlx.Stmt
  selectTrending   *sqlx.Stmt
}

func (p *psqlEpisodeRepo) Create(input *api.Episode) (*api.Episode, error) {
  // implementation goes here
}

func (p *psqlEpisodeRepo) GetByID(id string) (*api.Episode, error) {
  // implementation goes here
}

func (p *psqlEpisodeRepo) GetByURLToken(token string) (*api.Episode, error) {
  // implementation goes here
}

func (p *psqlEpisodeRepo) GetTrending(limit int) ([]api.Episode, error) {
  // implementation goes here
}

We tend to write a bunch of very similar get methods that all have to do the same post-query transformations on the data we fetch from Postgres, so we usually end up writing a getOne and a getSlice func to help with that. Those funcs do all the fetching of related entities (by calling other repos) and setting computed fields.

They tend to look a little bit like this:

func (p *psqlEpisodeRepo) GetByID(id string) (*api.Episode, error) {
  return p.getOne(p.selectByID, id)
}

func (p *psqlEpisodeRepo) GetByURLToken(token string) (*api.Episode, error) { 
  return p.getOne(p.selectByURLToken, token)
}

func (p *psqlEpisodeRepo) getOne(stmt *sqlx.Stmt, args ...interface{}) (*api.Episode, error) {
  var episode api.Episode
  if err := e.getByID.Get(&episode, id); err != nil {
    return nil, convertError(err)
  } 
    
  // do whatever computation you've gotta do
  
  return &episode, nil  
}

Which is, honestly, pretty great. It gives us one place to look for N+1 queries, missing fields, and all that fun stuff that shows up when we’re looking at bugs.

Ok Where Are We?

The Bad Parts

While these funcs are looking great, the SQL up top is a little bit of a sadder story. The more we’re adding, the noisier the file gets. We’ve been composing consts (and only ever consts, no runtime SQL changes here) to reduce the noise.

Episodes are pretty chill, but we have quite a few different ways of fetching Users and things in UserRepo can get pretty crazy once you start adding in bulk selects.

const (
  userSelectBase = `SELECT select * FROM users`
  
  userSelectByID          = userSelectBase + ` WHERE user_id = $1`
  userSelectByURLToken    = userSelectBase + ` WHERE url_token = $1`
  userSelectByVanityURL   = userSelectBase + ` WHERE url_vanity = $1`
  userSelectByUsername    = userSelectBase + ` WHERE username = $1`
  userSelectAllByID       = userSelectBase + ` WHERE user_id IN (?)`
  userSelectAllByURLToken = userSelectBase + ` WHERE url_token IN (?)`
)

There’s a lot of noise there. Preparing all our statements in new-struct funcs means that finding the query for a statement generally involves doing a go-to-definition on a statement, finding it in the new-repo func, and then doing a go-to-definition on a constant just to get back to our composed consts. Naming conventions for the consts and statements help a lot, but it’s still a little gnarly. We’ve considered moving the queries into their own file whatever_queries.go but haven’t tried that yet.

Unsurprisingly, relationships between structs/tables are hard. We’re currently enforcing that a EpisodeRepo is the only repo that is allowed to return an Episode and a UserRepo is the only repo allowed to return a User. We end up with Repo implementations that that have other Repos as fields, and where the referenced repo knows about a join table.

Whew that’s gnarly prose. Here’s an example: an Episode has multiple Authors and a UserRepo has to know about our episode_authors join table:

func (e *episodeRepo) getOne(stmt *sqlx.Stmt, args ...interface{}) (api.Episode, error) {
  episode := api.Episode{}
  if err := query.Get(&episode, args...); err != nil {
    return nil, convertError(err)
  }

  authors, err := e.userRepo.GetEpisodeAuthors(e.EpisodeID)
  if err != nil {
    return nil, err /* already converted */
  }
  episode.Authors = authors

  return episode, nil
}

It works well enough that we avoid having any real fun N+1 query situations. It does mean that every time we want to grab an Episode we’re fetching every author for it. This is fine for now, and we can get away with it at our current data volume, but that may not always be the case.

Finally, since we’re writing our own SQL queries for all those Get methods, we write tests for all of these funcs that would come for free with an ORM. That definitely feels like it should count towards our lines of boilerplate-per-Repo.

The Good Parts

While there’s still an amount of boilerplate in this code, the end result is still readable. It hasn’t been a crazy thing for us to drop in and change our Repos pretty significantly. This has held true when our repos have two dozen methods, which is crazy. The most painful part has been the up-front creation of new Repos. Trading a few minutes of up-front tedium for readability and maintainability seems like a very Go tradeoff, and we’re generally happy to optimize for operator happiness over developer happiness. So, we’re fine with that.

Not having to guess what an ORM is doing is another operator happiness thing — since we’re writing queries by hand, we have to think up front about what indexes a query is going to use, whether or not a join is faster than two queries, or if a SELECT FOR UPDATE is locking rows that it doesn’t need to. Running EXPLAIN ANALYZE on a query and also having code review cover SQL is amazing, and all that explicit thinking about SQL really sets us up to think about things like “transactions” and “data integrity”. Ultimately, we’re writing better SQL than we would be otherwise because we have to think about it.

And last but not least, it means that we’ve got full access to the dope features in newer versions of Postgres. Sometimes this is maybe a little too much Shiny New Toy, but jsonb column types and INSERT… ON CONFLICT DO NOTHING have already been awesome. Given that we’re doing code-review for SQL, it’s not crazy to use those things and there’s no weird impedance mismatch between an ORM’s abstraction and those features.

THE END

So that’s where we ended up. We’re happy enough with our current situation, but still keeping our ears open for the next cool thing to come along. I occasionally space out on the train and think about what something like a Go take on HugSQL would look like, so if you’re working on that DEFINITELY let me know. If you’ve got questions or think we’re absolutely insane or had a similar experience, leave a comment or a note and I’ll do my best to reply.

© 著作权归作者所有

共有 人打赏支持
taodongl
粉丝 1
博文 2
码字总数 0
作品 0
南京
[LeetCode] Sentence Similarity II 句子相似度之二

Given two sentences words1, words2 (each represented as an array of strings), and a list of similar word pairs pairs, determine if two sentences are similar. For example, words1......

机器的心脏 ⋅ 2017/12/05 ⋅ 0

整合Spring Hibernate时 this.getSession() 报空指针异常

小弟弟是新人,今天早上练习Spring-Hibernate整合 写DAO 时 this.getSession():报空指针异常 搞了4个小时了还是没搞定,各位大侠帮帮忙。这个是项目图 这个是User.hbm.xml配置:...

关二爷 ⋅ 2012/09/13 ⋅ 7

Oracle Union Union All 使用方法

Union 指令的目的是将两个 SQL 语句的结果合并起来,可以查看你需要查询结果. 例如: SELECT Date FROM StoreInformation UNION SELECT Date FROM InternetSales 注意: union用法中,两个selec...

widely ⋅ 2013/10/14 ⋅ 0

禅道项目管理软件的数据库结构

禅道项目管理软件的数据库目前共有36张表,简单来给大家介绍下表的作用和之间的关系。 更多关于禅道:http://www.zentao.net/ 一、产品相关的表 zt_product,则记录了产品相关的信息。 zt_p...

炒蚕豆吃蹦豆 ⋅ 2012/08/16 ⋅ 2

IIS 下 URLRewrite 中文乱码 问题解决

使用了微软的 URLRewrite 反复实验,反复更改编码都是乱码 最终使用了ISAPI_Rewrite 代替 URLRewrite 圆满解决了问题 下面是ISAPI_Rewrite 规则: /zt/lol/shipin/英雄/ 规则: RewriteRule ...

foxidea ⋅ 2012/11/13 ⋅ 0

MySQL 事务和加锁

1、MySQL 开启事务后: UPDATE | DELETE 操作后,其他线程再去 更新时 需要等待事务完成后才行,其他操作无影响 SELECT INSERT 操作后,其他线程无影响,提交事务后 其他线程才能查询新记录 ...

skq ⋅ 2016/11/19 ⋅ 0

禅道自动化测试框架--ZenTaoATF

ZenTaoATF是由禅道开发团队开发的自动化测试框架,它只有一个脚本文件,可以支持各种脚本语言编写的自动化测试脚本,语法简单,使用灵活,后续会和禅道项目管理软件进行绑定和集成。 它的基本...

开源春哥 ⋅ 2013/03/19 ⋅ 0

Python爬虫:如何爬取分页数据?

上一篇文章《Python爬虫:爬取人人都是产品经理的数据》中说了爬取单页数据的方法,这篇文章详细解释如何爬取多页数据。 爬取对象: 有融网理财项目列表页【履约中】状态下的前10页数据,地址...

bestdwd ⋅ 2017/05/24 ⋅ 0

ZeroTurnaround ZIP类库

在Java平台上有很多官方的和非官方、第三方的压缩工具包,它们各有各的长处,比如Oracle官方的java.util.zip 类库,Apache网站上的Apache Commons Compress 类库,或者Chilkat Java Zip 类库...

程海峰 ⋅ 2013/06/04 ⋅ 0

禅道项目管理软件ZenTaoPHP框架系统要求

禅道项目管理软件ZT框架使用了php5所特有的语法、类、函数,因此php4是无法运行ZT框架的。使用ZT,你需要: 1. web server最好是apache。因为它和php集成使用是最稳定,功能也是最齐全的一套...

炒蚕豆吃蹦豆 ⋅ 2012/08/28 ⋅ 0

没有更多内容

加载失败,请刷新页面

加载更多

下一页

MySQL主从复制原理、半同步操作步骤及原理

1.1 企业Linux运维场景数据同步方案 1.1.1 文件级别的异机同步方案 1、scp/sftp/nc 命令可以实现远程数据同步。 2、搭建ftp/http/svn/nfs 服务器,然后在客户端上也可以把数据同步到服务器。...

xiaomin0322 ⋅ 19分钟前 ⋅ 0

Oracle10g 数据及文件迁移过程[原]

QL*Plus: Release 10.2.0.1.0 - Production on 星期三 5月 11 10:22:35 2011 Copyright (c) 1982, 2005, Oracle. All rights reserved. 连接到: Oracle Database 10g Enterprise Edition Re......

harrypotter ⋅ 24分钟前 ⋅ 0

nginx安装

1:安装工具包 wget、vim和gcc yum install -y wget yum install -y vim-enhanced yum install -y make cmake gcc gcc-c++ 2:下载nginx安装包 wget http://nginx.org/download/nginx-1......

壹丶贰 ⋅ 27分钟前 ⋅ 0

ideaVim安装及配置

1.安装插件 File-Settings-Plugins,Browse Repositories,输入ideavim,安装。 重启后,在Tools-Vim Emulator启用。 2.快捷键设置 ideaViim键与idea快捷键有冲突,可以在Settings-Other Se...

Funcy1122 ⋅ 32分钟前 ⋅ 0

MySQL中B+Tree索引原理

B+树索引是B+树在数据库中的一种实现,是最常见也是数据库中使用最为频繁的一种索引。B+树中的B代表平衡(balance),而不是二叉(binary),因为B+树是从最早的平衡二叉树演化而来的。在讲B...

浮躁的码农 ⋅ 46分钟前 ⋅ 0

两道面试题,带你解析Java类加载机制

在许多Java面试中,我们经常会看到关于Java类加载机制的考察,例如下面这道题: class Grandpa{ static { System.out.println("爷爷在静态代码块"); }} cl...

1527 ⋅ 50分钟前 ⋅ 0

SpringCloud(Data Flow)

dataflow-server

赵-猛 ⋅ 今天 ⋅ 0

深入理解Java虚拟机

这本书我读到第8章,之后就是在读不下去了。 读到后面是一种痛苦的体验,太多的东西是不全面的,大量的专有名词是没有解释的,读到最后很多东西仅仅是一个侧面,所以我觉得,这本书不适合初学...

颖伙虫 ⋅ 今天 ⋅ 0

NanoPi NEO core/ Ubuntu16.04单网卡配置3个IP地址(2个静态,1个动态)

配置 root@NanoPi-NEO-Core:/etc/network# cat interfacesauto loiface lo inet loopbackallow-hotplug eth0iface eth0 inet static address 172.31.188.249 netmask 255.......

SamXIAO ⋅ 今天 ⋅ 0

三步为你的App集成LivePhoto功能

摘要:LivePhoto是iOS9新推出的一种拍照方式,类似于拍摄Gif图或录制视频片段生成图片。如果没有画面感,可以联想《哈利波特》霍格沃茨城堡的壁画,哈哈,很炫酷有木有,但坑爹的是只有iphone6S以...

壹峰 ⋅ 今天 ⋅ 0

没有更多内容

加载失败,请刷新页面

加载更多

下一页

返回顶部
顶部