Static analysis for optimizing big data queries

Query languages for big data analysis provide user extensibility through a mechanism of user-defined operators (UDOs). These operators allow programmers to write proprietary functionalities on top of a relational query skeleton. However, achieving effective query optimization for such languages is ext...

Bibliographic Details
Main author: Garbervetsky, D.
Other authors: Pavlinovic, Z., Barnett, M., Musuvathi, M., Mytkowicz, T., Zoppi, E., Zisman A., Bodden E., Schafer W., van Deursen A., Special Interest Group on Software Engineering (ACM SIGSOFT)
Format: Conference proceedings, Book chapter
Language: English
Published: Association for Computing Machinery 2017
Online access: Scopus record
DOI
Handle
Digital Library record
Contributed by: Reference record: Request the resource here
LEADER 06758caa a22008297a 4500
001 PAPER-17615
003 AR-BaUEN
005 20230518204856.0
008 190410s2017 xx ||||fo|||| 10| 0 eng|d
024 7 |2 scopus  |a 2-s2.0-85030760949 
040 |a Scopus  |b spa  |c AR-BaUEN  |d AR-BaUEN 
100 1 |a Garbervetsky, D. 
245 1 0 |a Static analysis for optimizing big data queries 
260 |b Association for Computing Machinery  |c 2017 
506 |2 openaire  |e Política editorial 
504 |a U-SQL, the New Big Data Language for Azure Data Lake, https://azure.microsoft.com/en-us/blog/u-sql-the-new-big-data-language-for-azure-data-lake/, Accessed: 2017-05-09 
504 |a Aho, A.V., Sethi, R., Ullman, J.D., (1986) Compilers: Principles, Techniques, and Tools, Addison-Wesley 
504 |a Barnett, M., Fähndrich, M., Logozzo, F., Garbervetsky, D., Annotations for (more) precise points-to analysis (2007) IWACO, pp. 11-18 
504 |a Blanchet, B., Escape analysis: Correctness proof, implementation and experimental results (1998) POPL, pp. 25-37. , ACM 
504 |a Chaiken, R., Jenkins, B., Larson, P.-A., Ramsey, B., Shakib, D., Weaver, S., Zhou, J., SCOPE: Easy and efficient parallel processing of massive data sets (2008) Proceedings of the VLDB Endowment, 1 (2), pp. 1265-1276 
504 |a Cousot, P., Cousot, R., Abstract interpretation frameworks (1992) Journal of Logic and Computation, 2 (4), pp. 511-547 
504 |a Guo, Z., Fan, X., Chen, R., Zhang, J., Zhou, H., McDirmid, S., Liu, C., Zhou, L., Spotting code optimizations in data-parallel pipelines through periscope (2012) OSDI, pp. 121-133 
504 |a Livshits, B., Sridharan, M., Smaragdakis, Y., Lhoták, O., Amaral, J.N., Chang, B.E., Guyer, S.Z., Vardoulakis, D., In defense of soundiness: A manifesto (2015) Commun. ACM, 58 (2), pp. 44-46 
504 |a Logozzo, F., Clousot: Static contract checking with abstract interpretation, Formal Verification of Object-Oriented Software, p. 5 
504 |a Muchnick, S.S., (1997) Advanced Compiler Design and Implementation, Morgan Kaufmann 
504 |a Salcianu, A., Rinard, M., Purity and side effect analysis for Java programs (2005) VMCAI, pp. 199-215. , Springer 
504 |a Steensgaard, B., Points-to analysis in almost linear time (1996) POPL, pp. 32-41. , ACM 
504 |a Xia, S., Fähndrich, M., Logozzo, F., Inferring dataflow properties of user defined table processors (2009) SAS, pp. 19-35 
504 |a Zaharia, M., Chowdhury, M., Franklin, M.J., Shenker, S., Stoica, I., Spark: Cluster computing with working sets (2010) HotCloud, 10 (10), p. 95 
504 |a Zhou, J., Bruno, N., Wu, M., Larson, P., Chaiken, R., Shakib, D., SCOPE: Parallel databases meet MapReduce (2012) VLDB J., 21 (5), pp. 611-636 
520 3 |a Query languages for big data analysis provide user extensibility through a mechanism of user-defined operators (UDOs). These operators allow programmers to write proprietary functionalities on top of a relational query skeleton. However, achieving effective query optimization for such languages is extremely challenging since the optimizer needs to understand data dependencies induced by UDOs. SCOPE, the query language from Microsoft, allows for hand-coded declarations of UDO data dependencies. Unfortunately, most programmers avoid using this facility since writing and maintaining the declarations is tedious and error-prone. In this work, we designed and implemented two sound and robust static analyses for computing UDO data dependencies. The analyses can detect which columns of an input table are never used or pass through a UDO unchanged. This information can be used to significantly improve execution of SCOPE scripts. We evaluate our analyses on thousands of real-world queries and show we can catch many unused and pass-through columns automatically without relying on any manually provided declarations. © 2017 Association for Computing Machinery.  |l eng 
536 |a Funding details: Agencia Nacional de Promoción Científica y Tecnológica, 2015-1718, UBACYT 20020130100384BA, PICT 2013-2341, 2014-1656 
536 |a Funding details: Consejo Nacional de Investigaciones Científicas y Técnicas, 112 201501 00931CO, PIP 112 201301 00688CO 
536 |a Funding details: This work was partially supported by the projects ANPCYT PICT 2013-2341, 2014-1656 and 2015-1718, UBACYT 20020130100384BA, CONICET PIP 112 201301 00688CO, 112 201501 00931CO. 
593 |a Universidad de Buenos Aires, FCEyN, DC ICC, CONICET, Argentina 
593 |a New York University, United States 
593 |a Microsoft Research, United States 
690 1 0 |a BIG DATA 
690 1 0 |a QUERY OPTIMIZATION 
690 1 0 |a STATIC ANALYSIS 
690 1 0 |a UDOS 
690 1 0 |a QUERY LANGUAGES 
690 1 0 |a SOFTWARE ENGINEERING 
690 1 0 |a DATA DEPENDENCIES 
690 1 0 |a DATA QUERY 
690 1 0 |a ERROR PRONES 
690 1 0 |a OPTIMIZERS 
690 1 0 |a REAL-WORLD 
690 1 0 |a RELATIONAL QUERIES 
700 1 |a Pavlinovic, Z. 
700 1 |a Barnett, M. 
700 1 |a Musuvathi, M. 
700 1 |a Mytkowicz, T. 
700 1 |a Zoppi, E. 
700 1 |a Zisman A. 
700 1 |a Bodden E. 
700 1 |a Schafer W. 
700 1 |a van Deursen A. 
700 1 |a Special Interest Group on Software Engineering (ACM SIGSOFT) 
711 2 |d 4 September 2017 through 8 September 2017  |g Conference code: 130154 
773 0 |d Association for Computing Machinery, 2017  |g v. Part F130154  |h pp. 932-937  |p Proc ACM SIGSOFT Symp Found Software Eng  |n Proceedings of the ACM SIGSOFT Symposium on the Foundations of Software Engineering  |z 9781450351058  |t 11th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering, ESEC/FSE 2017 
856 4 1 |u https://www.scopus.com/inward/record.uri?eid=2-s2.0-85030760949&doi=10.1145%2f3106237.3117774&partnerID=40&md5=c36f3ac5265500bad86dac434f0a2feb  |y Scopus record 
856 4 0 |u https://doi.org/10.1145/3106237.3117774  |y DOI 
856 4 0 |u https://hdl.handle.net/20.500.12110/paper_97814503_vPartF130154_n_p932_Garbervetsky  |y Handle 
856 4 0 |u https://bibliotecadigital.exactas.uba.ar/collection/paper/document/paper_97814503_vPartF130154_n_p932_Garbervetsky  |y Digital Library record 
961 |a paper_97814503_vPartF130154_n_p932_Garbervetsky  |b paper  |c PE 
962 |a info:eu-repo/semantics/conferenceObject  |a info:ar-repo/semantics/documento de conferencia  |b info:eu-repo/semantics/publishedVersion 
999 |c 78568