superset SQL生成的研究
superset类似系统,一期实现不要实现这么复杂的SQL拼接:
SELECT state AS state, job_date AS __timestamp, sum(job_num) AS sum_job_num FROM (select count(1) as job_num, state, job_date from job_info group by state, job_date) AS expr_qry INNER JOIN (SELECT state AS state__, sum(job_num) AS mme_inner__ FROM (select count(1) as job_num, state, job_date from job_info group by state, job_date) AS expr_qry WHERE job_date >= STR_TO_DATE('2017-12-05 00:00:00', '%%Y-%%m-%%d %%H:%%i:%%s') AND job_date <= STR_TO_DATE('2018-03-15 16:26:53', '%%Y-%%m-%%d %%H:%%i:%%s') GROUP BY state ORDER BY mme_inner__ DESC LIMIT 50) AS anon_1 ON state = state__ WHERE job_date >= STR_TO_DATE('2017-12-05 00:00:00', '%%Y-%%m-%%d %%H:%%i:%%s') AND job_date <= STR_TO_DATE('2018-03-15 16:26:53', '%%Y-%%m-%%d %%H:%%i:%%s') GROUP BY state, job_date ORDER BY sum_job_num DESC LIMIT 50000
上边是superset生成的SQL,乍看起来很复杂,经过分析,其实就是通过inner join将两个表的数据连起来。
例如:
表A 表B
id val1 id val2
1 A 1 D
2 B 2 E
3 C 3 F
通过inner join最后会变为横表
id val1 val2
1 A D
2 B E
3 C F
而通过UNION ALL则会变为纵表
id val
1 A
2 B
3 C
1 D
2 E
3 F
http://www.w3school.com.cn/sql/sql_union.asp
http://www.w3school.com.cn/sql/sql_join_inner.asp
INNER JOIN 关键字在表中存在至少一个匹配时返回行。如果 “Persons” 中的行在 “Orders” 中没有匹配,就不会列出这些行。
LEFT JOIN 关键字会从左表 (Persons) 那里返回所有的行,即使在右表 (Orders) 中没有匹配的行。
RIGHT JOIN 关键字会从右表 (Orders) 那里返回所有的行,即使在左表 (Persons) 中没有匹配的行。
superset的源码
https://github.com/apache/incubator-superset/blob/4250e239a2515be91464c1c0502db5b31ac9ee91/superset/connectors/sqla/models.py