文档章节

用SQL做数据分析_1

Master_Li
 Master_Li
发布于 2017/08/05 18:47
字数 1282
阅读 75
收藏 0

题目:给定经纬度区间(经度(10,20),纬度(100,200)),50%以上订单起点在该区间内,定义司机为市区司机,相反,若50%以上订单起点在该区间外则定义司机为郊县司机。在北京市随机取1000个服务分在【90,100】的快车司机,请写出SQL代码查询市区司机和郊县司机的数量和市区与郊县司机占比。

分析:

首先在北京市随机取1000个服务分在【90,100】的快车司机,将结果作为司机表d

select 
	Top 1000 a.*, 
	c.service_score
	from a,c
	where (a.driver_id = c.driver_id) 
		and (c.service_score between 90 and 100) 
		and (a.driver_car_lev in ('1','2','3'))
		and (a.city_id = 1)

将司机表和订单表联合得到司机订单表,一个司机对应多个订单,每个订单对应一个起点经纬度,然后以driver_id分组,同时计算每个订单是属于市区订单还是郊区订单,将结果作为表e

select 
		b.*,
		COUNT(*) as order1,
		SUM(case when (current_lng between 10 and 20) and (current_lan between 100 and 200)  then 1 else 0 end) as order2,
		SUM(case when (current_lng not between 10 and 20) or (current_lan not between 100 and 200)  then 1 else 0 end) as order3
		from (select 
				Top 1000 a.*, 
				c.service_score
				from a,c
				where (a.driver_id = c.driver_id) 
					and (c.service_score between 90 and 100) 
					and (a.driver_car_lev in ('1','2','3'))
					and (a.city_id = 1))as d,b
		where b.driver_id = d.driver_id
		group by b.driver_id,b.order_id,b.current_lng,b.current_lan

对于表e,以司机为单位,计算每个司机的总订单数,市区订单数,市区订单比例,郊区订单数,郊区订单比例。将每个司机的处理结果作为表f

select 
	driver_id,
	COUNT(order1) as tlnum,
	SUM(order2) as sqnum,
	SUM(order2)*1.0/COUNT(order1) as sqrate,
	SUM(order3) as jqnum,
	SUM(order3)*1.0/COUNT(order1) as jqrate
	from 
	(select 
		b.*,
		COUNT(*) as order1,
		SUM(case when (current_lng between 10 and 20) and (current_lan between 100 and 200)  then 1 else 0 end) as order2,
		SUM(case when (current_lng not between 10 and 20) or (current_lan not between 100 and 200)  then 1 else 0 end) as order3
		from (select 
				Top 1000 a.*, 
				c.service_score
				from a,c
				where (a.driver_id = c.driver_id) 
					and (c.service_score between 90 and 100) 
					and (a.driver_car_lev in ('1','2','3'))
					and (a.city_id = 1))as d,b
		where b.driver_id = d.driver_id
		group by b.driver_id,b.order_id,b.current_lng,b.current_lan) as e
	group by driver_id 	

最后,只要对表f做一个统计和判断,即可得到市区司机和郊县司机的数量和市区与郊县司机占比

select 
	COUNT(*) as tldriver,
	SUM(case when (sqrate > 0.5) then 1 else 0 end) as sqdriver,
	SUM(case when (sqrate > 0.5) then 1 else 0 end)*1.0/COUNT(*) as sqdeiverrate,
	SUM(case when (jqrate > 0.5) then 1 else 0 end) as jqdriver,
	SUM(case when (jqrate > 0.5) then 1 else 0 end)*1.0/COUNT(*) as jqdeiverrate
	from
	(select 
		driver_id,
		COUNT(order1) as tlnum,
		SUM(order2) as sqnum,
		SUM(order2)*1.0/COUNT(order1) as sqrate,
		SUM(order3) as jqnum,
		SUM(order3)*1.0/COUNT(order1) as jqrate
		from 
			(select 
				b.*,
				COUNT(*) as order1,
				SUM(case when (current_lng between 10 and 20) and (current_lan between 100 and 200)  then 1 else 0 end) as order2,
				SUM(case when (current_lng not between 10 and 20) or (current_lan not between 100 and 200)  then 1 else 0 end) as order3
				from (select 
						Top 1000 a.*, 
						c.service_score
						from a,c
						where (a.driver_id = c.driver_id) 
							and (c.service_score between 90 and 100) 
							and (a.driver_car_lev in ('1','2','3'))
							and (a.city_id = 1))as d,b
				where b.driver_id = d.driver_id
				group by b.driver_id,b.order_id,b.current_lng,b.current_lan) as e
		group by driver_id) as f 	

答案:

为了测试SQL的正确性,以SQL Server数据库建表,使用C#程序批量填入数据

①使用对象资源管理器创建表a,b,c

②使用C#批量插入数据

using System;
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace DD
{
    class Program
    {
     
        static void Main(string[] args)
        {
            String CONNECTION_STRING = "server = .;database=DD;Integrated Security = true;";

            SqlConnection conn = new SqlConnection(CONNECTION_STRING);
            conn.Open();
            
            SqlCommand cmd = new SqlCommand();
            cmd.Connection = conn;
            cmd.CommandType = CommandType.Text;


            for (int i = 1; i <=1000; i++)
            {
                string sql =
                       "insert into a(driver_id, city_id, driver_car_lev) values (" + Convert.ToString(i) + "," + Convert.ToString(1) + "," + Convert.ToString(1) + ")";
                cmd.CommandText = sql;
                cmd.ExecuteNonQuery();
            }


            for(int i =1; i<=600; i++){
                for (int j = 1; j <= 3; j++)
                {
                    string sql =
                       "insert into b(driver_id, order_id, current_lng, current_lan) values (" + Convert.ToString(i) + "," + Convert.ToString(j) + "," + Convert.ToString(new Random().Next(10) + 10) + "," + Convert.ToString(new Random().Next(100) + 100) + ")";
                    cmd.CommandText = sql;
                    cmd.ExecuteNonQuery();
                }
                for (int j = 4; j <= 4; j++)
                {
                    string sql =
                       "insert into b(driver_id, order_id, current_lng, current_lan) values (" + Convert.ToString(i) + "," + Convert.ToString(j) + "," + Convert.ToString(new Random().Next(10) + 30) + "," + Convert.ToString(new Random().Next(100) + 300) + ")";
                    cmd.CommandText = sql;
                    cmd.ExecuteNonQuery();
                }
            }
            for (int i = 601; i <= 1000; i++)
            {
                for (int j = 1; j <= 3; j++)
                {
                    string sql =
                      "insert into b(driver_id, order_id, current_lng, current_lan) values (" + Convert.ToString(i) + "," + Convert.ToString(j) + "," + Convert.ToString(new Random().Next(10) + 30) + "," + Convert.ToString(new Random().Next(100) + 300) + ")";
                    cmd.CommandText = sql;
                    cmd.ExecuteNonQuery();
                }
                for (int j = 4; j <= 4; j++)
                {
                    string sql =
                       "insert into b(driver_id, order_id, current_lng, current_lan) values (" + Convert.ToString(i) + "," + Convert.ToString(j) + "," + Convert.ToString(new Random().Next(10) + 10) + "," + Convert.ToString(new Random().Next(100) + 100) + ")";
                    cmd.CommandText = sql;
                    cmd.ExecuteNonQuery();
                }
            }

            for (int i = 1; i <= 1000; i++)
            {
                string sql =
                       "insert into c(driver_id, service_score) values (" + Convert.ToString(i) + "," + Convert.ToString(new Random().Next(10) + 90) + ")";
                cmd.CommandText = sql;
                cmd.ExecuteNonQuery();
            }


            //SqlDataReader reader = cmd.ExecuteReader();
            //System.Console.Write("column1\tcolumn2\tcolumn3\n");
            //while (reader.Read())
            //{
            //    System.Console.Write(reader["driver_id"]);
            //    System.Console.Write(reader["city_id"]);
            //    System.Console.Write(reader["driver_car_lev"]);
            //}
            conn.Close();
            System.Console.Write("Insert complete!");
            Console.ReadKey(true);//暂停  

        }
    }
}

③测试SQL脚本

select 
	COUNT(*) as tldriver,
	SUM(case when (sqrate > 0.5) then 1 else 0 end) as sqdriver,
	SUM(case when (sqrate > 0.5) then 1 else 0 end)*1.0/COUNT(*) as sqdeiverrate,
	SUM(case when (jqrate > 0.5) then 1 else 0 end) as jqdriver,
	SUM(case when (jqrate > 0.5) then 1 else 0 end)*1.0/COUNT(*) as jqdeiverrate
	from
	(select 
		driver_id,
		COUNT(order1) as tlnum,
		SUM(order2) as sqnum,
		SUM(order2)*1.0/COUNT(order1) as sqrate,
		SUM(order3) as jqnum,
		SUM(order3)*1.0/COUNT(order1) as jqrate
		from 
			(select 
				b.*,
				COUNT(*) as order1,
				SUM(case when (current_lng between 10 and 20) and (current_lan between 100 and 200)  then 1 else 0 end) as order2,
				SUM(case when (current_lng not between 10 and 20) or (current_lan not between 100 and 200)  then 1 else 0 end) as order3
				from (select 
						Top 1000 a.*, 
						c.service_score
						from a,c
						where (a.driver_id = c.driver_id) 
							and (c.service_score between 90 and 100) 
							and (a.driver_car_lev in ('1','2','3'))
							and (a.city_id = 1))as d,b
				where b.driver_id = d.driver_id
				group by b.driver_id,b.order_id,b.current_lng,b.current_lan) as e
		group by driver_id) as f 	

结果:表d

表e

表f

最终结果

 

Over!!!

© 著作权归作者所有

共有 人打赏支持
上一篇: 字符串匹配
下一篇: OpenCV+MLP+Java
Master_Li
粉丝 1
博文 74
码字总数 57794
作品 0
南京
程序员
私信 提问
加载中

评论(2)

Master_Li
Master_Li

引用来自“lijiachen56”的评论

您好大神,想请教一下表f
SUM(order2)*1.0/COUNT(order1) as sqrate
为什么要*1.0呢?
因为不确定sum()和count()返回的类型,所以为了保证结果是浮点类型,*1.0
l
lijiachen56
您好大神,想请教一下表f
SUM(order2)*1.0/COUNT(order1) as sqrate
为什么要*1.0呢?
阿里云分布式关系数据库DRDS - 实时SQL执行日志分析(一)

概述 数据库是企业业务的数据核心,对数据库的操作行为尤其是所有SQL执行的行为进行记录并审计的日志,就显得尤为重要。但围绕数据库的安全问题却日益严重,今年5月份,某快递公司被暴出上亿...

成喆
11/05
0
0
加米谷:数据分析师是做什么的,需要掌握哪些技能?

数据分析师是做什么的: 1、支持各种常规或临时数据分析需求; 2、提供各类业务相关的分析及建议; 3、通过建模深入挖掘用户或产品方面的有价值的信息; 4、和各部门沟通协调需求并提出各种新...

加米谷大数据
07/13
0
0
大数据分析师是做什么的?需要掌握哪些技能?

数据分析师是做什么的: 1、支持各种常规或临时数据分析需求; 2、提供各类业务相关的分析及建议; 3、通过建模深入挖掘用户或产品方面的有价值的信息; 4、和各部门沟通协调需求并提出各种新...

加米谷
07/13
0
0
mysql多表查询实例结果及分析-2017.12.14

今天帮运营导数据时发现sql执行速度非常慢,嗯,按照之前的编写习惯,后来跟架构师一起分析讨论,改进了下,现记录分析如下 一、测试表、sql情况 set @bgDate = '20171213';set @edDate = '...

Ye_Kwii
2017/12/14
0
0
实战演练:洞若观火--治堵之道在清源

1 观察:把问题说清楚 在我印象中,DBA 都是很明事理的人,因为这个职业经常要干的一件事情就是:“把问题说清楚,把规矩讲明白”。这是一项软技能,不管是问题分析过程中与同行沟通,还是最...

技术小能手
07/09
0
0

没有更多内容

加载失败,请刷新页面

加载更多

Spring源码学习笔记-1-Resource

打算补下基础,学习下Spring源码,参考书籍是《Spring源码深度解析》,使用版本是Spring 3.2.x,本来想试图用脑图记录的,发现代码部分不好贴,还是作罢,这里只大略记录下想法,不写太细了 ...

zypy333
今天
10
0
RestClientUtil和ConfigRestClientUtil区别说明

RestClientUtil directly executes the DSL defined in the code. ConfigRestClientUtil gets the DSL defined in the configuration file by the DSL name and executes it. RestClientUtil......

bboss
今天
17
0

中国龙-扬科
昨天
2
0
Linux系统设置全局的默认网络代理

更改全局配置文件/etc/profile all_proxy="all_proxy=socks://rahowviahva.ml:80/"ftp_proxy="ftp_proxy=http://rahowviahva.ml:80/"http_proxy="http_proxy=http://rahowviahva.ml:80/"......

临江仙卜算子
昨天
10
0
java框架学习日志-6(bean作用域和自动装配)

本章补充bean的作用域和自动装配 bean作用域 之前提到可以用scope来设置单例模式 <bean id="type" class="cn.dota2.tpye.Type" scope="singleton"></bean> 除此之外还有几种用法 singleton:......

白话
昨天
10
0

没有更多内容

加载失败,请刷新页面

加载更多

返回顶部
顶部