文档章节

用SQL做数据分析_1

Master_Li
 Master_Li
发布于 2017/08/05 18:47
字数 1282
阅读 100
收藏 0

题目:给定经纬度区间(经度(10,20),纬度(100,200)),50%以上订单起点在该区间内,定义司机为市区司机,相反,若50%以上订单起点在该区间外则定义司机为郊县司机。在北京市随机取1000个服务分在【90,100】的快车司机,请写出SQL代码查询市区司机和郊县司机的数量和市区与郊县司机占比。

分析:

首先在北京市随机取1000个服务分在【90,100】的快车司机,将结果作为司机表d

select 
	Top 1000 a.*, 
	c.service_score
	from a,c
	where (a.driver_id = c.driver_id) 
		and (c.service_score between 90 and 100) 
		and (a.driver_car_lev in ('1','2','3'))
		and (a.city_id = 1)

将司机表和订单表联合得到司机订单表,一个司机对应多个订单,每个订单对应一个起点经纬度,然后以driver_id分组,同时计算每个订单是属于市区订单还是郊区订单,将结果作为表e

select 
		b.*,
		COUNT(*) as order1,
		SUM(case when (current_lng between 10 and 20) and (current_lan between 100 and 200)  then 1 else 0 end) as order2,
		SUM(case when (current_lng not between 10 and 20) or (current_lan not between 100 and 200)  then 1 else 0 end) as order3
		from (select 
				Top 1000 a.*, 
				c.service_score
				from a,c
				where (a.driver_id = c.driver_id) 
					and (c.service_score between 90 and 100) 
					and (a.driver_car_lev in ('1','2','3'))
					and (a.city_id = 1))as d,b
		where b.driver_id = d.driver_id
		group by b.driver_id,b.order_id,b.current_lng,b.current_lan

对于表e,以司机为单位,计算每个司机的总订单数,市区订单数,市区订单比例,郊区订单数,郊区订单比例。将每个司机的处理结果作为表f

select 
	driver_id,
	COUNT(order1) as tlnum,
	SUM(order2) as sqnum,
	SUM(order2)*1.0/COUNT(order1) as sqrate,
	SUM(order3) as jqnum,
	SUM(order3)*1.0/COUNT(order1) as jqrate
	from 
	(select 
		b.*,
		COUNT(*) as order1,
		SUM(case when (current_lng between 10 and 20) and (current_lan between 100 and 200)  then 1 else 0 end) as order2,
		SUM(case when (current_lng not between 10 and 20) or (current_lan not between 100 and 200)  then 1 else 0 end) as order3
		from (select 
				Top 1000 a.*, 
				c.service_score
				from a,c
				where (a.driver_id = c.driver_id) 
					and (c.service_score between 90 and 100) 
					and (a.driver_car_lev in ('1','2','3'))
					and (a.city_id = 1))as d,b
		where b.driver_id = d.driver_id
		group by b.driver_id,b.order_id,b.current_lng,b.current_lan) as e
	group by driver_id 	

最后,只要对表f做一个统计和判断,即可得到市区司机和郊县司机的数量和市区与郊县司机占比

select 
	COUNT(*) as tldriver,
	SUM(case when (sqrate > 0.5) then 1 else 0 end) as sqdriver,
	SUM(case when (sqrate > 0.5) then 1 else 0 end)*1.0/COUNT(*) as sqdeiverrate,
	SUM(case when (jqrate > 0.5) then 1 else 0 end) as jqdriver,
	SUM(case when (jqrate > 0.5) then 1 else 0 end)*1.0/COUNT(*) as jqdeiverrate
	from
	(select 
		driver_id,
		COUNT(order1) as tlnum,
		SUM(order2) as sqnum,
		SUM(order2)*1.0/COUNT(order1) as sqrate,
		SUM(order3) as jqnum,
		SUM(order3)*1.0/COUNT(order1) as jqrate
		from 
			(select 
				b.*,
				COUNT(*) as order1,
				SUM(case when (current_lng between 10 and 20) and (current_lan between 100 and 200)  then 1 else 0 end) as order2,
				SUM(case when (current_lng not between 10 and 20) or (current_lan not between 100 and 200)  then 1 else 0 end) as order3
				from (select 
						Top 1000 a.*, 
						c.service_score
						from a,c
						where (a.driver_id = c.driver_id) 
							and (c.service_score between 90 and 100) 
							and (a.driver_car_lev in ('1','2','3'))
							and (a.city_id = 1))as d,b
				where b.driver_id = d.driver_id
				group by b.driver_id,b.order_id,b.current_lng,b.current_lan) as e
		group by driver_id) as f 	

答案:

为了测试SQL的正确性,以SQL Server数据库建表,使用C#程序批量填入数据

①使用对象资源管理器创建表a,b,c

②使用C#批量插入数据

using System;
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace DD
{
    class Program
    {
     
        static void Main(string[] args)
        {
            String CONNECTION_STRING = "server = .;database=DD;Integrated Security = true;";

            SqlConnection conn = new SqlConnection(CONNECTION_STRING);
            conn.Open();
            
            SqlCommand cmd = new SqlCommand();
            cmd.Connection = conn;
            cmd.CommandType = CommandType.Text;


            for (int i = 1; i <=1000; i++)
            {
                string sql =
                       "insert into a(driver_id, city_id, driver_car_lev) values (" + Convert.ToString(i) + "," + Convert.ToString(1) + "," + Convert.ToString(1) + ")";
                cmd.CommandText = sql;
                cmd.ExecuteNonQuery();
            }


            for(int i =1; i<=600; i++){
                for (int j = 1; j <= 3; j++)
                {
                    string sql =
                       "insert into b(driver_id, order_id, current_lng, current_lan) values (" + Convert.ToString(i) + "," + Convert.ToString(j) + "," + Convert.ToString(new Random().Next(10) + 10) + "," + Convert.ToString(new Random().Next(100) + 100) + ")";
                    cmd.CommandText = sql;
                    cmd.ExecuteNonQuery();
                }
                for (int j = 4; j <= 4; j++)
                {
                    string sql =
                       "insert into b(driver_id, order_id, current_lng, current_lan) values (" + Convert.ToString(i) + "," + Convert.ToString(j) + "," + Convert.ToString(new Random().Next(10) + 30) + "," + Convert.ToString(new Random().Next(100) + 300) + ")";
                    cmd.CommandText = sql;
                    cmd.ExecuteNonQuery();
                }
            }
            for (int i = 601; i <= 1000; i++)
            {
                for (int j = 1; j <= 3; j++)
                {
                    string sql =
                      "insert into b(driver_id, order_id, current_lng, current_lan) values (" + Convert.ToString(i) + "," + Convert.ToString(j) + "," + Convert.ToString(new Random().Next(10) + 30) + "," + Convert.ToString(new Random().Next(100) + 300) + ")";
                    cmd.CommandText = sql;
                    cmd.ExecuteNonQuery();
                }
                for (int j = 4; j <= 4; j++)
                {
                    string sql =
                       "insert into b(driver_id, order_id, current_lng, current_lan) values (" + Convert.ToString(i) + "," + Convert.ToString(j) + "," + Convert.ToString(new Random().Next(10) + 10) + "," + Convert.ToString(new Random().Next(100) + 100) + ")";
                    cmd.CommandText = sql;
                    cmd.ExecuteNonQuery();
                }
            }

            for (int i = 1; i <= 1000; i++)
            {
                string sql =
                       "insert into c(driver_id, service_score) values (" + Convert.ToString(i) + "," + Convert.ToString(new Random().Next(10) + 90) + ")";
                cmd.CommandText = sql;
                cmd.ExecuteNonQuery();
            }


            //SqlDataReader reader = cmd.ExecuteReader();
            //System.Console.Write("column1\tcolumn2\tcolumn3\n");
            //while (reader.Read())
            //{
            //    System.Console.Write(reader["driver_id"]);
            //    System.Console.Write(reader["city_id"]);
            //    System.Console.Write(reader["driver_car_lev"]);
            //}
            conn.Close();
            System.Console.Write("Insert complete!");
            Console.ReadKey(true);//暂停  

        }
    }
}

③测试SQL脚本

select 
	COUNT(*) as tldriver,
	SUM(case when (sqrate > 0.5) then 1 else 0 end) as sqdriver,
	SUM(case when (sqrate > 0.5) then 1 else 0 end)*1.0/COUNT(*) as sqdeiverrate,
	SUM(case when (jqrate > 0.5) then 1 else 0 end) as jqdriver,
	SUM(case when (jqrate > 0.5) then 1 else 0 end)*1.0/COUNT(*) as jqdeiverrate
	from
	(select 
		driver_id,
		COUNT(order1) as tlnum,
		SUM(order2) as sqnum,
		SUM(order2)*1.0/COUNT(order1) as sqrate,
		SUM(order3) as jqnum,
		SUM(order3)*1.0/COUNT(order1) as jqrate
		from 
			(select 
				b.*,
				COUNT(*) as order1,
				SUM(case when (current_lng between 10 and 20) and (current_lan between 100 and 200)  then 1 else 0 end) as order2,
				SUM(case when (current_lng not between 10 and 20) or (current_lan not between 100 and 200)  then 1 else 0 end) as order3
				from (select 
						Top 1000 a.*, 
						c.service_score
						from a,c
						where (a.driver_id = c.driver_id) 
							and (c.service_score between 90 and 100) 
							and (a.driver_car_lev in ('1','2','3'))
							and (a.city_id = 1))as d,b
				where b.driver_id = d.driver_id
				group by b.driver_id,b.order_id,b.current_lng,b.current_lan) as e
		group by driver_id) as f 	

结果:表d

表e

表f

最终结果

 

Over!!!

© 著作权归作者所有

上一篇: 字符串匹配
下一篇: OpenCV+MLP+Java
Master_Li
粉丝 1
博文 74
码字总数 57794
作品 0
南京
程序员
私信 提问
加载中

评论(2)

Master_Li
Master_Li 博主

引用来自“lijiachen56”的评论

您好大神,想请教一下表f
SUM(order2)*1.0/COUNT(order1) as sqrate
为什么要*1.0呢?
因为不确定sum()和count()返回的类型,所以为了保证结果是浮点类型,*1.0
l
lijiachen56
您好大神,想请教一下表f
SUM(order2)*1.0/COUNT(order1) as sqrate
为什么要*1.0呢?
如何从Python负零基础到精通数据分析

一、为什么学习数据分析 1、运营的尴尬:运营人需要一个硬技能 每个初入行的新人都会察觉到,运营是一个似乎并没有自己的核心竞争力和安全感的工作。因为每天的工作好像都被各种琐事所围绕,...

wx5e96c1530227f
05/20
0
0
阿里云分布式关系数据库DRDS - 实时SQL执行日志分析(一)

概述 数据库是企业业务的数据核心,对数据库的操作行为尤其是所有SQL执行的行为进行记录并审计的日志,就显得尤为重要。但围绕数据库的安全问题却日益严重,今年5月份,某快递公司被暴出上亿...

成喆
2018/11/05
0
0
加米谷:数据分析师是做什么的,需要掌握哪些技能?

数据分析师是做什么的: 1、支持各种常规或临时数据分析需求; 2、提供各类业务相关的分析及建议; 3、通过建模深入挖掘用户或产品方面的有价值的信息; 4、和各部门沟通协调需求并提出各种新...

加米谷大数据
2018/07/13
1
0
大数据分析师是做什么的?需要掌握哪些技能?

数据分析师是做什么的: 1、支持各种常规或临时数据分析需求; 2、提供各类业务相关的分析及建议; 3、通过建模深入挖掘用户或产品方面的有价值的信息; 4、和各部门沟通协调需求并提出各种新...

加米谷
2018/07/13
0
0
数据分析面试必备——SQL你准备好了吗? - 知乎

前些天在网上冲浪的时候看到一个案例咨询,问说世界500强的数据分析要不要去,评论区一片爆炸:“楼主能分享一下文科生怎么转行做数据分析吗??”、“SQL、python这些学起来好痛苦!”我看着...

和数据一起玩
2019/10/21
0
0

没有更多内容

加载失败,请刷新页面

加载更多

小师妹学JavaIO之:文件写入那些事

简介 小师妹又对F师兄提了一大堆奇奇怪怪的需求,要格式化输出,要特定的编码输出,要自己定位输出,什么?还要阅后即焚?大家看F师兄怎么一一接招吧。 字符输出和字节输出 小师妹:F师兄,上...

flydean
22分钟前
12
1
直接显示StackOverflow的答题日期, 增加评论区回复的时间显示 ,修改时间显示到小时分。

// ==UserScript==// @name 直接显示StackOverflow的答题日期, 增加评论区回复的时间显示 ,修改时间显示到小时分。// @namespace http://tampermonkey.net/// @version ...

FalconChen
今天
36
0
Shader笔记_005 纹理

纹理最初的目的就是使用一张图片来控制模型的外观,通过纹理映射技术 我们可以把一张图粘贴在物体表面,逐纹素的控制模型的颜色。 通常美术建模的时候也会在软件里利用纹理展开技术把纹理展开成...

STONE-CITY
今天
12
0
iOS MVVM 与RAC结合使用

MVVM配合 RAC 更能发挥的淋漓尽致。 我们把 MVVM 第一篇的例子 KVO 的事件 替换成 配合RAC 框架使用, OC的话直接导入 : pod 'ReactiveObjC' Swift 直接用 RXSwift就可以。 把 ViewModel里加...

T型人才追梦者
今天
22
1
OSChina 周一乱弹 —— 影响心情的三座大山

Osc乱弹歌单(2020)请戳(这里) 【今日歌曲】 @薛定谔的兄弟 :分享洛神有语创建的歌单「我喜欢的音乐」: 《浮生(inst.)》- 忘乡 / 墨凡悦 手机党少年们想听歌,请使劲儿戳(这里) @凝小紫...

小小编辑
今天
108
2

没有更多内容

加载失败,请刷新页面

加载更多

返回顶部
顶部