Static analysis for optimizing big data queries
Query languages for big data analysis provide user extensibility through a mechanism of user-defined operators (UDOs). These operators allow programmers to write proprietary functionalities on top of a relational query skeleton. However, achieving effective query optimization for such languages is ext...
Main Author: | Garbervetsky, D. |
---|---|
Other Authors: | Pavlinovic, Z.; Barnett, M.; Musuvathi, M.; Mytkowicz, T.; Zoppi, E.; Zisman, A.; Bodden, E.; Schafer, W.; van Deursen, A.; Special Interest Group on Software Engineering (ACM SIGSOFT) |
Format: | Conference proceedings, Book chapter |
Language: | English |
Published: | Association for Computing Machinery, 2017 |
Online Access: | Scopus record, DOI, Handle, Digital Library record |
Contributed by: | Reference record: Request the resource here |
LEADER | 06758caa a22008297a 4500 | ||
---|---|---|---|
001 | PAPER-17615 | ||
003 | AR-BaUEN | ||
005 | 20230518204856.0 | ||
008 | 190410s2017 xx ||||fo|||| 10| 0 eng|d | ||
024 | 7 | |2 scopus |a 2-s2.0-85030760949 | |
040 | |a Scopus |b spa |c AR-BaUEN |d AR-BaUEN | ||
100 | 1 | |a Garbervetsky, D. | |
245 | 1 | 0 | |a Static analysis for optimizing big data queries |
260 | |b Association for Computing Machinery |c 2017 | ||
506 | |2 openaire |e Editorial policy | ||
504 | |a U-SQL, the New Big Data Language for Azure Data Lake, , https://azure.microsoft.com/en-us/blog/u-sql-the-new-big-data-language-for-azure-data-lake/, Accessed: 2017-05-09 | ||
504 | |a Aho, A.V., Sethi, R., Ullman, J.D., (1986) Compilers: Principles, Techniques, and Tools, , Addison-Wesley | ||
504 | |a Barnett, M., Fähndrich, M., Logozzo, F., Garbervetsky, D., Annotations for (more) precise points-to analysis (2007) IWACO, pp. 11-18 | ||
504 | |a Blanchet, B., Escape analysis: Correctness proof, implementation and experimental results (1998) POPL, pp. 25-37. , ACM | ||
504 | |a Chaiken, R., Jenkins, B., Larson, P.-A., Ramsey, B., Shakib, D., Weaver, S., Zhou, J., Scope: Easy and efficient parallel processing of massive data sets (2008) Proceedings of the VLDB Endowment, 1 (2), pp. 1265-1276 | ||
504 | |a Cousot, P., Cousot, R., Abstract interpretation frameworks (1992) Journal of Logic and Computation, 2 (4), pp. 511-547 | ||
504 | |a Guo, Z., Fan, X., Chen, R., Zhang, J., Zhou, H., McDirmid, S., Liu, C., Zhou, L., Spotting code optimizations in data-parallel pipelines through periscope (2012) OSDI, pp. 121-133 | ||
504 | |a Livshits, B., Sridharan, M., Smaragdakis, Y., Lhoták, O., Amaral, J.N., Chang, B.E., Guyer, S.Z., Vardoulakis, D., In defense of soundiness: A manifesto (2015) Commun. ACM, 58 (2), pp. 44-46 | ||
504 | |a Logozzo, F., Clousot: Static contract checking with abstract interpretation Formal Verification of Object-Oriented Software, p. 5 | ||
504 | |a Muchnick, S.S., (1997) Advanced Compiler Design and Implementation, , Morgan Kaufmann | ||
504 | |a Salcianu, A., Rinard, M., Purity and side effect analysis for Java programs (2005) VMCAI, pp. 199-215. , Springer | ||
504 | |a Steensgaard, B., Points-to analysis in almost linear time (1996) POPL, pp. 32-41. , ACM | ||
504 | |a Xia, S., Fähndrich, M., Logozzo, F., Inferring dataflow properties of user defined table processors (2009) SAS, pp. 19-35 | ||
504 | |a Zaharia, M., Chowdhury, M., Franklin, M.J., Shenker, S., Stoica, I., Spark: Cluster computing with working sets (2010) HotCloud, 10 (10), p. 95. , 10 | ||
504 | |a Zhou, J., Bruno, N., Wu, M., Larson, P., Chaiken, R., Shakib, D., SCOPE: Parallel databases meet mapreduce (2012) VLDB J., 21 (5), pp. 611-636 | ||
520 | 3 | |a Query languages for big data analysis provide user extensibility through a mechanism of user-defined operators (UDOs). These operators allow programmers to write proprietary functionalities on top of a relational query skeleton. However, achieving effective query optimization for such languages is extremely challenging since the optimizer needs to understand data dependencies induced by UDOs. SCOPE, the query language from Microsoft, allows for hand-coded declarations of UDO data dependencies. Unfortunately, most programmers avoid using this facility since writing and maintaining the declarations is tedious and error-prone. In this work, we designed and implemented two sound and robust static analyses for computing UDO data dependencies. The analyses can detect which columns of an input table are never used or pass through a UDO unchanged. This information can be used to significantly improve execution of SCOPE scripts. We evaluate our analyses on thousands of real-world queries and show we can catch many unused and pass-through columns automatically without relying on any manually provided declarations. © 2017 Association for Computing Machinery. |l eng | |
536 | |a Funding details: Agencia Nacional de Promoción Científica y Tecnológica, 2015-1718, UBACYT 20020130100384BA, PICT 2013-2341, 2014-1656 | ||
536 | |a Funding details: Consejo Nacional de Investigaciones Científicas y Técnicas, 112 201501 00931CO, PIP 112 201301 00688CO | ||
536 | |a Funding details: This work was partially supported by the projects ANPCYT PICT 2013-2341, 2014-1656 and 2015-1718, UBACYT 20020130100384BA, CONICET PIP 112 201301 00688CO, 112 201501 00931CO. | ||
593 | |a Universidad de Buenos Aires, FCEyN, DC ICC, CONICET, Argentina | ||
593 | |a New York University, United States | ||
593 | |a Microsoft Research, United States | ||
690 | 1 | 0 | |a BIG DATA |
690 | 1 | 0 | |a QUERY OPTIMIZATION |
690 | 1 | 0 | |a STATIC ANALYSIS |
690 | 1 | 0 | |a UDOS |
690 | 1 | 0 | |a QUERY LANGUAGES |
690 | 1 | 0 | |a SOFTWARE ENGINEERING |
690 | 1 | 0 | |a DATA DEPENDENCIES |
690 | 1 | 0 | |a DATA QUERY |
690 | 1 | 0 | |a ERROR PRONES |
690 | 1 | 0 | |a OPTIMIZERS |
690 | 1 | 0 | |a REAL-WORLD |
690 | 1 | 0 | |a RELATIONAL QUERIES |
700 | 1 | |a Pavlinovic, Z. | |
700 | 1 | |a Barnett, M. | |
700 | 1 | |a Musuvathi, M. | |
700 | 1 | |a Mytkowicz, T. | |
700 | 1 | |a Zoppi, E. | |
700 | 1 | |a Zisman, A. | |
700 | 1 | |a Bodden, E. | |
700 | 1 | |a Schafer, W. | |
700 | 1 | |a van Deursen, A. | |
700 | 1 | |a Special Interest Group on Software Engineering (ACM SIGSOFT) | |
711 | 2 | |d 4 September 2017 through 8 September 2017 |g Conference code: 130154 | |
773 | 0 | |d Association for Computing Machinery, 2017 |g v. Part F130154 |h pp. 932-937 |p Proc ACM SIGSOFT Symp Found Software Eng |n Proceedings of the ACM SIGSOFT Symposium on the Foundations of Software Engineering |z 9781450351058 |t 11th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering, ESEC/FSE 2017 | |
856 | 4 | 1 | |u https://www.scopus.com/inward/record.uri?eid=2-s2.0-85030760949&doi=10.1145%2f3106237.3117774&partnerID=40&md5=c36f3ac5265500bad86dac434f0a2feb |y Scopus record |
856 | 4 | 0 | |u https://doi.org/10.1145/3106237.3117774 |y DOI |
856 | 4 | 0 | |u https://hdl.handle.net/20.500.12110/paper_97814503_vPartF130154_n_p932_Garbervetsky |y Handle |
856 | 4 | 0 | |u https://bibliotecadigital.exactas.uba.ar/collection/paper/document/paper_97814503_vPartF130154_n_p932_Garbervetsky |y Digital Library record |
961 | |a paper_97814503_vPartF130154_n_p932_Garbervetsky |b paper |c PE | ||
962 | |a info:eu-repo/semantics/conferenceObject |a info:ar-repo/semantics/documento de conferencia |b info:eu-repo/semantics/publishedVersion | ||
999 | |c 78568 |