用SQL做数据分析_1
用SQL做数据分析_1
Master_Li 发表于7个月前
用SQL做数据分析_1
  • 发表于 7个月前
  • 阅读 51
  • 收藏 0
  • 点赞 0
  • 评论 0

新睿云服务器60天免费使用,快来体验!>>>   

摘要: 2017滴滴的一道笔试题

题目:给定经纬度区间(经度(10,20),纬度(100,200)),50%以上订单起点在该区间内,定义司机为市区司机,相反,若50%以上订单起点在该区间外则定义司机为郊县司机。在北京市随机取1000个服务分在【90,100】的快车司机,请写出SQL代码查询市区司机和郊县司机的数量和市区与郊县司机占比。

分析:

首先在北京市随机取1000个服务分在【90,100】的快车司机,将结果作为司机表d

select 
	Top 1000 a.*, 
	c.service_score
	from a,c
	where (a.driver_id = c.driver_id) 
		and (c.service_score between 90 and 100) 
		and (a.driver_car_lev in ('1','2','3'))
		and (a.city_id = 1)

将司机表和订单表联合得到司机订单表,一个司机对应多个订单,每个订单对应一个起点经纬度,然后以driver_id分组,同时计算每个订单是属于市区订单还是郊区订单,将结果作为表e

select 
		b.*,
		COUNT(*) as order1,
		SUM(case when (current_lng between 10 and 20) and (current_lan between 100 and 200)  then 1 else 0 end) as order2,
		SUM(case when (current_lng not between 10 and 20) or (current_lan not between 100 and 200)  then 1 else 0 end) as order3
		from (select 
				Top 1000 a.*, 
				c.service_score
				from a,c
				where (a.driver_id = c.driver_id) 
					and (c.service_score between 90 and 100) 
					and (a.driver_car_lev in ('1','2','3'))
					and (a.city_id = 1))as d,b
		where b.driver_id = d.driver_id
		group by b.driver_id,b.order_id,b.current_lng,b.current_lan

对于表e,以司机为单位,计算每个司机的总订单数,市区订单数,市区订单比例,郊区订单数,郊区订单比例。将每个司机的处理结果作为表f

select 
	driver_id,
	COUNT(order1) as tlnum,
	SUM(order2) as sqnum,
	SUM(order2)*1.0/COUNT(order1) as sqrate,
	SUM(order3) as jqnum,
	SUM(order3)*1.0/COUNT(order1) as jqrate
	from 
	(select 
		b.*,
		COUNT(*) as order1,
		SUM(case when (current_lng between 10 and 20) and (current_lan between 100 and 200)  then 1 else 0 end) as order2,
		SUM(case when (current_lng not between 10 and 20) or (current_lan not between 100 and 200)  then 1 else 0 end) as order3
		from (select 
				Top 1000 a.*, 
				c.service_score
				from a,c
				where (a.driver_id = c.driver_id) 
					and (c.service_score between 90 and 100) 
					and (a.driver_car_lev in ('1','2','3'))
					and (a.city_id = 1))as d,b
		where b.driver_id = d.driver_id
		group by b.driver_id,b.order_id,b.current_lng,b.current_lan) as e
	group by driver_id 	

最后,只要对表f做一个统计和判断,即可得到市区司机和郊县司机的数量和市区与郊县司机占比

select 
	COUNT(*) as tldriver,
	SUM(case when (sqrate > 0.5) then 1 else 0 end) as sqdriver,
	SUM(case when (sqrate > 0.5) then 1 else 0 end)*1.0/COUNT(*) as sqdeiverrate,
	SUM(case when (jqrate > 0.5) then 1 else 0 end) as jqdriver,
	SUM(case when (jqrate > 0.5) then 1 else 0 end)*1.0/COUNT(*) as jqdeiverrate
	from
	(select 
		driver_id,
		COUNT(order1) as tlnum,
		SUM(order2) as sqnum,
		SUM(order2)*1.0/COUNT(order1) as sqrate,
		SUM(order3) as jqnum,
		SUM(order3)*1.0/COUNT(order1) as jqrate
		from 
			(select 
				b.*,
				COUNT(*) as order1,
				SUM(case when (current_lng between 10 and 20) and (current_lan between 100 and 200)  then 1 else 0 end) as order2,
				SUM(case when (current_lng not between 10 and 20) or (current_lan not between 100 and 200)  then 1 else 0 end) as order3
				from (select 
						Top 1000 a.*, 
						c.service_score
						from a,c
						where (a.driver_id = c.driver_id) 
							and (c.service_score between 90 and 100) 
							and (a.driver_car_lev in ('1','2','3'))
							and (a.city_id = 1))as d,b
				where b.driver_id = d.driver_id
				group by b.driver_id,b.order_id,b.current_lng,b.current_lan) as e
		group by driver_id) as f 	

答案:

为了测试SQL的正确性,以SQL Server数据库建表,使用C#程序批量填入数据

①使用对象资源管理器创建表a,b,c

②使用C#批量插入数据

using System;
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace DD
{
    class Program
    {
     
        static void Main(string[] args)
        {
            String CONNECTION_STRING = "server = .;database=DD;Integrated Security = true;";

            SqlConnection conn = new SqlConnection(CONNECTION_STRING);
            conn.Open();
            
            SqlCommand cmd = new SqlCommand();
            cmd.Connection = conn;
            cmd.CommandType = CommandType.Text;


            for (int i = 1; i <=1000; i++)
            {
                string sql =
                       "insert into a(driver_id, city_id, driver_car_lev) values (" + Convert.ToString(i) + "," + Convert.ToString(1) + "," + Convert.ToString(1) + ")";
                cmd.CommandText = sql;
                cmd.ExecuteNonQuery();
            }


            for(int i =1; i<=600; i++){
                for (int j = 1; j <= 3; j++)
                {
                    string sql =
                       "insert into b(driver_id, order_id, current_lng, current_lan) values (" + Convert.ToString(i) + "," + Convert.ToString(j) + "," + Convert.ToString(new Random().Next(10) + 10) + "," + Convert.ToString(new Random().Next(100) + 100) + ")";
                    cmd.CommandText = sql;
                    cmd.ExecuteNonQuery();
                }
                for (int j = 4; j <= 4; j++)
                {
                    string sql =
                       "insert into b(driver_id, order_id, current_lng, current_lan) values (" + Convert.ToString(i) + "," + Convert.ToString(j) + "," + Convert.ToString(new Random().Next(10) + 30) + "," + Convert.ToString(new Random().Next(100) + 300) + ")";
                    cmd.CommandText = sql;
                    cmd.ExecuteNonQuery();
                }
            }
            for (int i = 601; i <= 1000; i++)
            {
                for (int j = 1; j <= 3; j++)
                {
                    string sql =
                      "insert into b(driver_id, order_id, current_lng, current_lan) values (" + Convert.ToString(i) + "," + Convert.ToString(j) + "," + Convert.ToString(new Random().Next(10) + 30) + "," + Convert.ToString(new Random().Next(100) + 300) + ")";
                    cmd.CommandText = sql;
                    cmd.ExecuteNonQuery();
                }
                for (int j = 4; j <= 4; j++)
                {
                    string sql =
                       "insert into b(driver_id, order_id, current_lng, current_lan) values (" + Convert.ToString(i) + "," + Convert.ToString(j) + "," + Convert.ToString(new Random().Next(10) + 10) + "," + Convert.ToString(new Random().Next(100) + 100) + ")";
                    cmd.CommandText = sql;
                    cmd.ExecuteNonQuery();
                }
            }

            for (int i = 1; i <= 1000; i++)
            {
                string sql =
                       "insert into c(driver_id, service_score) values (" + Convert.ToString(i) + "," + Convert.ToString(new Random().Next(10) + 90) + ")";
                cmd.CommandText = sql;
                cmd.ExecuteNonQuery();
            }


            //SqlDataReader reader = cmd.ExecuteReader();
            //System.Console.Write("column1\tcolumn2\tcolumn3\n");
            //while (reader.Read())
            //{
            //    System.Console.Write(reader["driver_id"]);
            //    System.Console.Write(reader["city_id"]);
            //    System.Console.Write(reader["driver_car_lev"]);
            //}
            conn.Close();
            System.Console.Write("Insert complete!");
            Console.ReadKey(true);//暂停  

        }
    }
}

③测试SQL脚本

select 
	COUNT(*) as tldriver,
	SUM(case when (sqrate > 0.5) then 1 else 0 end) as sqdriver,
	SUM(case when (sqrate > 0.5) then 1 else 0 end)*1.0/COUNT(*) as sqdeiverrate,
	SUM(case when (jqrate > 0.5) then 1 else 0 end) as jqdriver,
	SUM(case when (jqrate > 0.5) then 1 else 0 end)*1.0/COUNT(*) as jqdeiverrate
	from
	(select 
		driver_id,
		COUNT(order1) as tlnum,
		SUM(order2) as sqnum,
		SUM(order2)*1.0/COUNT(order1) as sqrate,
		SUM(order3) as jqnum,
		SUM(order3)*1.0/COUNT(order1) as jqrate
		from 
			(select 
				b.*,
				COUNT(*) as order1,
				SUM(case when (current_lng between 10 and 20) and (current_lan between 100 and 200)  then 1 else 0 end) as order2,
				SUM(case when (current_lng not between 10 and 20) or (current_lan not between 100 and 200)  then 1 else 0 end) as order3
				from (select 
						Top 1000 a.*, 
						c.service_score
						from a,c
						where (a.driver_id = c.driver_id) 
							and (c.service_score between 90 and 100) 
							and (a.driver_car_lev in ('1','2','3'))
							and (a.city_id = 1))as d,b
				where b.driver_id = d.driver_id
				group by b.driver_id,b.order_id,b.current_lng,b.current_lan) as e
		group by driver_id) as f 	

结果:表d

表e

表f

最终结果

 

Over!!!

  • 打赏
  • 点赞
  • 收藏
  • 分享
共有 人打赏支持
粉丝 0
博文 38
码字总数 36441
×
Master_Li
如果觉得我的文章对您有用,请随意打赏。您的支持将鼓励我继续创作!
* 金额(元)
¥1 ¥5 ¥10 ¥20 其他金额
打赏人
留言
* 支付类型
微信扫码支付
打赏金额:
已支付成功
打赏金额: