Toward full elasticity in distributed static analysis: The case of callgraph analysis

In this paper we present the design and implementation of a distributed, whole-program static analysis framework that is designed to scale with the size of the input. Our approach is based on the actor programming model and is deployed in the cloud. Our reliance on a cloud cluster provides a degree...

Descripción completa

Guardado en:
Detalles Bibliográficos
Autor principal: Garbervetsky, D.
Otros Autores: Zoppi, E., Livshits, B., Zisman A., Bodden E., Schafer W., van Deursen A., Special Interest Group on Software Engineering (ACM SIGSOFT)
Formato: Acta de conferencia Capítulo de libro
Lenguaje:Inglés
Publicado: Association for Computing Machinery 2017
Acceso en línea:Registro en Scopus
DOI
Handle
Registro en la Biblioteca Digital
Aporte de:Registro referencial: Solicitar el recurso aquí
LEADER 12785caa a22011057a 4500
001 PAPER-17614
003 AR-BaUEN
005 20230518204856.0
008 190410s2017 xx ||||fo|||| 10| 0 eng|d
024 7 |2 scopus  |a 2-s2.0-85030789792 
040 |a Scopus  |b spa  |c AR-BaUEN  |d AR-BaUEN 
100 1 |a Garbervetsky, D. 
245 1 0 |a Toward full elasticity in distributed static analysis: The case of callgraph analysis 
260 |b Association for Computing Machinery  |c 2017 
506 |2 openaire  |e Política editorial 
504 |a (2015) Representational State Transfer, , https://en.wikipedia.org/wiki/Representational_state_transfer 
504 |a Agha, G., (1986) Actors: A Model of Concurrent Computation in Distributed Systems, , MIT Press 
504 |a Albarghouthi, A., Kumar, R., Nori, A.V., Rajamani, S.K., Parallelizing topdown interprocedural analyses (2012) Proceedings of the Conference on Programming Language Design and Implementation 
504 |a Andersen, L.O., (1994) Program Analysis and Specialization for the C Programming Language, , PhD thesis, University of Cophenhagen 
504 |a Bacon, D.F., Sweeney, P.F., Fast static analysis of C++ virtual function calls (1996) Proceedings of the Conference on Object-oriented Programming, Systems, Languages, and Applications 
504 |a Bernstein, P.A., Bykov, S., Geller, A., Kliot, G., Thelin, J., Orleans: Distributed virtual actors for programmability and scalability (2014) Technical Report MSR-TR-2014-41, , Microsoft Research 
504 |a Bornholt, J., Torlak, E., Scaling program synthesis by exploiting existing code (2015) Machine Learning for Programming Languages 
504 |a Calcagno, C., Distefano, D., Dubreil, J., Gabi, D., Hooimeijer, P., Luca, M., Oâǎz-Hearn, P., Rodriguez, D., Moving fast with software veriication (2015) NASA Formal Methods Symposium, pp. 3-11. , Springer 
504 |a Cousot, P., Asynchronous iterative methods for solving a ixed point system of monotone equations in a complete lattice (1977) Technical Report, Laboratoire IMAG, , Université scientiique et médicale de Grenoble 
504 |a Dyer, R., Nguyen, H.A., Rajan, H., Nguyen, T.N., Boa: A language and infrastructure for analyzing ultra-large-scale software repositories (2013) Proceedings of the International Conference on Software Engineering, , IEEE Press 
504 |a Dyer, R., Rajan, H., Nguyen, H.A., Nguyen, T.N., Nguyen. Mining billions of ast nodes to study actual and potential usage of Java language features (2014) Proceedings of the International Conference on Software Engineering 
504 |a Dyer, R., Rajan, H., Nguyen, T.N., Declarative visitors to ease ine-grained source code mining with full history on billions of ast nodes (2013) ACM SIGPLAN Notices, , ACM 
504 |a Grove, D., Chambers, C., A framework for call graph construction algorithms (2001) ACM Transactions on Programming Languages and Systems 
504 |a Grove, D., DeFouw, G., Dean, J., Chambers, C., Call graph construction in object-oriented languages (1997) ACM SIGPLAN Notices 
504 |a Hardekopf, B., Lin, C., The ant and the grasshopper: Fast and accurate pointer analysis for millions of lines of code (2007) Proceedings of the Conference on Programming Language Design and Implementation 
504 |a Hardekopf, B., Lin, C., Flow-sensitive pointer analysis for millions of lines of code (2011) Proceedings of the International Symposium on Code Generation and Optimization 
504 |a Lam, M.S., Whaley, J., Livshits, B., Martin, M.C., Avots, D., Carbin, M., Unkel, C., Context-sensitive program analysis as database queries (2005) Proceedings of the Symposium on Principles of Database Systems, , June 
504 |a Lassez, J.-L., Nguyen, V., Sonenberg, E., Fixed point theorems and semantics: A folk tale (1982) Information Processing Letters 
504 |a Lee, Y.Y., Harwell, S., Khurshid, S., Marinov, D., Temporal code completion and navigation (2013) Proceedings of the International Conference on Software Engineering 
504 |a Lhoták, O., Hendren, L., Context-sensitive points-to analysis: Is it worth it? (2006) Proceedings of the International Conference on Compiler Construction 
504 |a Livshits, B., Sridharan, M., Smaragdakis, Y., Lhoták, O., Amaral, J.N., Chang, B.-Y.E., Guyer, S.Z., Vardoulakis, D., In defense of soundiness: A manifesto (2015) Communications of the ACM 
504 |a Lopes, N.P., Rybalchenko, A., Distributed and predictable software model checking (2011) Proceedings of the International Conference on Veriication, Model Checking, and Abstract Interpretation 
504 |a Madsen, M., Livshits, B., Fanning, M., Practical static analysis of JavaScript applications in the presence of frameworks and libraries (2013) Proceedings of the International Symposium on the Foundations of Software Engineering 
504 |a Malewicz, G., Austern, M.H., Bik, A.J., Dehnert, J.C., Horn, I., Leiser, N., Czajkowski, G., Pregel: A system for large-scale graph processing (2010) Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 
504 |a McPeak, S., Gros, C.-H., Ramanathan, M.K., Ramanathan. Scalable and incremental software bug detection (2013) Proceedings of the Symposium on the Foundations of Software Engineering 
504 |a Mendez-Lojo, M., Mathew, A., Pingali, K., Parallel inclusion-based points-to analysis (2010) Conference on Object Oriented Programming Systems Languages and Applications 
504 |a (2015) NET Compiler Platform ("Roslyn"), , https://roslyn.codeplex.com/, Microsoft Corporation 
504 |a (2015) Mozilla. LXR, , https://en.wikipedia.org/wiki/LXR_Cross_Referencer 
504 |a Murray, K., Bigham, J.P., Beyond autocomplete: Automatic function deinition (2011) Proceedings of the Visual Languages and Human-Centric Computing Symposium 
504 |a Nguyen, A.T., Nguyen, T.T., Nguyen, H.A., Tamrawi, A., Nguyen, H.V., Al-Kofahi, J., Nguyen, T.N., Graph-based pattern-oriented, context-sensitive source code completion Proceedings of the International Conference on Software Engineering 
504 |a Omar, C., Yoon, Y., LaToza, T.D., Myers, B.A., Active code completion (2012) Proceedings of the International Conference on Software Engineering 
504 |a Rodriguez, J., Lhoták, O., (2011) Actor-Based Parallel Datalow Analysis 
504 |a Sadowski, C., Van Gogh, J., Jaspan, C., Soederberg, E., Winter, C., Tricorder: Building a program analysis ecosystem (2015) International Conference on Software Engineering (ICSE) 
504 |a Sridharan, M., Gopan, D., Shan, L., Bodík, R., Demand-driven points-to analysis for Java (2005) Proceedings of the 20th Annual ACM SIGPLAN Conference on Objectoriented Programming, Systems, Languages, and Applications, OOPSLA '05, pp. 59-76. , New York, NY, USA ACM 
504 |a Steensgaard, B., Points-to analysis in almost linear time (1996) Proceedings of the Symposium on Principles of Programming Languages, , ACM 
504 |a Sundaresan, V., Hendren, L., Razaimahefa, C., Vallée-Rai, R., Lam, P., Gagnon, E., Godin, C., Practical virtual method call resolution for Java (2000) Proceedings of the Conference on Object-oriented Programming, Systems, Languages, and Applications 
504 |a Tip, F., Palsberg, J., Scalable propagation-based call graph construction algorithms (2000) Proceedings of the Conference on Object-oriented Programming, Systems, Languages, and Applications 
504 |a Voung, J.W., Jhala, R., Lerner, S., Relay: Static race detection on millions of lines of code (2007) Proceedings of the Symposium on the Foundations of Software Engineering 
504 |a Wang, K., Hussain, A., Zuo, Z., Xu, G., Sani, A.A., Graspan: A single-machine disk-based graph system for interprocedural static analyses of large-scale systems code (2017) Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems 
504 |a Whaley, J., Lam, M.S., Cloning-based context-sensitive pointer alias analysis using binary decision diagrams (2004) ACM SIGPLAN Notices, 39 
504 |a Xie, Y., Aiken, A., Saturn: A scalable framework for error detection using boolean satisiability (2007) ACM Trans. Program. Lang. Syst, 29 (3). , May 
504 |a Yu, H., Xue, J., Huo, W., Feng, X., Zhang, Z., Level by level: Making lowand context-sensitive pointer analysis scalable for millions of lines of code (2010) Proceedings of the International Symposium on Code Generation and OptimizationA4 - Special Interest Group on Software Engineering (ACM SIGSOFT) 
520 3 |a In this paper we present the design and implementation of a distributed, whole-program static analysis framework that is designed to scale with the size of the input. Our approach is based on the actor programming model and is deployed in the cloud. Our reliance on a cloud cluster provides a degree of elasticity for CPU, memory, and storage resources. To demonstrate the potential of our technique, we show how a typical call graph analysis can be implemented in a distributed setting. The vision that motivates this work is that every large-scale software repository such as GitHub, BitBucket or Visual Studio Online will be able to perform static analysis on a large scale. We experimentally validate our implementation of the distributed call graph analysis using a combination of both synthetic and real benchmarks. To show scalability, we demonstrate how the analysis presented in this paper is able to handle inputs that are almost 10 million lines of code (LOC) in size, without running out of memory. Our results show that the analysis scales well in terms of memory pressure independently of the input size, as we add more virtual machines (VMs). As the number of worker VMs increases, we observe that the analysis time generally improves as well. Lastly, we demonstrate that querying the results can be performed with a median latency of 15 ms. © 2017 Association for Computing Machinery.  |l eng 
536 |a Detalles de la financiación: Agencia Nacional de Promoción Científica y Tecnológica, 2015-1718, UBACYT 20020130100384BA, PICT 2013-2341, 2014-1656 
536 |a Detalles de la financiación: Consejo Nacional de Investigaciones Científicas y Técnicas, 112 201501 00931CO, PIP 112 201301 00688CO 
536 |a Detalles de la financiación: This work was partially supported by the projects ANPCYT PICT 2013-2341, 2014-1656 and 2015-1718, UBACYT 20020130100384BA, CONICET PIP 112 201301 00688CO, 112 201501 00931CO. 
593 |a Universidad de Buenos Aires, FCEyN, DC ICC, CONICET, Argentina 
593 |a Imperial College London, United Kingdom 
690 1 0 |a DEVELOPMENT ENVIRONMENTS AND TOOLS 
690 1 0 |a DISTRIBUTED AND CONCURRENT SYSTEMS 
690 1 0 |a PARALLEL 
690 1 0 |a PERFORMANCE AND SCALABILITY 
690 1 0 |a PROGRAM ANALYSIS 
690 1 0 |a PROGRAM COMPREHENSION AND VISUALIZATION 
690 1 0 |a ELASTICITY 
690 1 0 |a SCALABILITY 
690 1 0 |a SOFTWARE ENGINEERING 
690 1 0 |a CONCURRENT SYSTEMS 
690 1 0 |a DEVELOPMENT ENVIRONMENT 
690 1 0 |a PARALLEL 
690 1 0 |a PERFORMANCE AND SCALABILITIES 
690 1 0 |a PROGRAM ANALYSIS 
690 1 0 |a PROGRAM COMPREHENSION 
690 1 0 |a STATIC ANALYSIS 
700 1 |a Zoppi, E. 
700 1 |a Livshits, B. 
700 1 |a Zisman A. 
700 1 |a Bodden E. 
700 1 |a Schafer W. 
700 1 |a van Deursen A. 
700 1 |a Special Interest Group on Software Engineering (ACM SIGSOFT) 
711 2 |d 4 September 2017 through 8 September 2017  |g Código de la conferencia: 130154 
773 0 |d Association for Computing Machinery, 2017  |g v. Part F130154  |h pp. 442-453  |p Proc ACM SIGSOFT Symp Found Software Eng  |n Proceedings of the ACM SIGSOFT Symposium on the Foundations of Software Engineering  |z 9781450351058  |t 11th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering, ESEC/FSE 2017 
856 4 1 |u https://www.scopus.com/inward/record.uri?eid=2-s2.0-85030789792&doi=10.1145%2f3106237.3106261&partnerID=40&md5=53024cde479bc0578d57ffea8ca70863  |y Registro en Scopus 
856 4 0 |u https://doi.org/10.1145/3106237.3106261  |y DOI 
856 4 0 |u https://hdl.handle.net/20.500.12110/paper_97814503_vPartF130154_n_p442_Garbervetsky  |y Handle 
856 4 0 |u https://bibliotecadigital.exactas.uba.ar/collection/paper/document/paper_97814503_vPartF130154_n_p442_Garbervetsky  |y Registro en la Biblioteca Digital 
961 |a paper_97814503_vPartF130154_n_p442_Garbervetsky  |b paper  |c PE 
962 |a info:eu-repo/semantics/conferenceObject  |a info:ar-repo/semantics/documento de conferencia  |b info:eu-repo/semantics/publishedVersion 
999 |c 78567