inspect4py: a knowledge extraction framework for Python code repositories
Abstract
This work presents inspect4py, a static code analysis framework designed to automatically extract the main features, metadata and documentation of Python code repositories. Given an input folder with code, inspect4py uses abstract syntax trees and state-of-the-art tools to find all functions, classes, tests, documentation, call graphs, module dependencies and control flows within all code files in that repository. Using these findings, inspect4py infers different ways of invoking a software component. We have evaluated our framework on 95 annotated repositories, obtaining promising results for software type classification (over 95% F1-score). With inspect4py, we aim to ease the understandability and adoption of software repositories by other researchers and developers.
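As a rough illustration of the abstract-syntax-tree approach mentioned in the abstract, the sketch below lists the top-level functions, classes (with docstrings and methods) and imports of a single source string. It uses only Python's standard `ast` module and is not inspect4py's own code or API; the names `SAMPLE` and `summarize` and the shape of the returned dictionary are assumptions made for this example.

```python
import ast

# Hypothetical sample module, used only for this illustration.
SAMPLE = '''
import os

class Greeter:
    """Say hello."""
    def greet(self, name):
        return "Hello " + name

def main():
    print(Greeter().greet(os.getcwd()))
'''

FUNCTION_NODES = (ast.FunctionDef, ast.AsyncFunctionDef)


def summarize(source):
    """Return top-level functions, classes and imports found in `source`."""
    tree = ast.parse(source)
    summary = {"functions": [], "classes": {}, "imports": []}
    for node in tree.body:  # top-level statements only
        if isinstance(node, FUNCTION_NODES):
            summary["functions"].append(node.name)
        elif isinstance(node, ast.ClassDef):
            methods = [n.name for n in node.body if isinstance(n, FUNCTION_NODES)]
            summary["classes"][node.name] = {
                "doc": ast.get_docstring(node),
                "methods": methods,
            }
        elif isinstance(node, ast.Import):
            summary["imports"].extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            # node.module is None for purely relative imports such as "from . import x"
            summary["imports"].append(node.module or ".")
    return summary


result = summarize(SAMPLE)
```

A repository-level tool would presumably repeat this kind of traversal over every code file and combine it with further analyses such as call graphs and control flows, which this sketch does not attempt.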
Citation
Filgueira, R & Garijo, D 2022, inspect4py: a knowledge extraction framework for Python code repositories. in D Lo, S McIntosh & N Novielli (eds), Proceedings of the 19th International Conference on Mining Software Repositories (MSR 2022). ACM, New York, NY, pp. 232-236, 2022 Mining Software Repositories Conference (MSR 2022), Pittsburgh, Pennsylvania, United States, 23/05/22. https://doi.org/10.1145/3524842.3528497
Publication
Proceedings of the 19th International Conference on Mining Software Repositories (MSR 2022)
Type
Conference item
Rights
Copyright © 2022 Association for Computing Machinery. This work has been made available online in accordance with publisher policies or with permission. Permission for further reuse of this content should be sought from the publisher or the rights holder. This is the author-created accepted manuscript following peer review and may differ slightly from the final published version. The final published version of this work is available at https://dl.acm.org/.
Items in the St Andrews Research Repository are protected by copyright, with all rights reserved, unless otherwise indicated.