Paper: Source Code Implied Language Structure Abstraction through Backward Taint Analysis

Authors: Zihao Wang (1), Pei Wang (2), Qinkun Bao (2) and Dinghao Wu (1)

Affiliations: (1) Pennsylvania State University, University Park, U.S.A.; (2) Individual Researcher, U.S.A.

Keyword(s): Program Analysis, Context-Free Grammar, Static Analysis, Fuzzing, Data-Flow Analysis, Taint Analysis.

Abstract: This paper presents a novel approach for inferring the language implied by a program’s source code, without requiring the use of explicit grammars or input/output corpora. Our technique is based on backward taint analysis, which tracks the flow of data in a program from certain sink functions back to the source functions. By analyzing the data flow of programs that generate structured output, such as compilers and formatters, we can infer the syntax and structure of the language being expressed in the code. Our approach is particularly effective for domain-specific languages, where the language implied by the code is often unique to a particular problem domain and may not be expressible by a standard context-free grammar. To test the effectiveness of our technique, we applied it to libxml2. Our experiments show that our approach can accurately infer the implied language of some complex programs. Using our inferred language models, we can generate high-quality corpora for testing and validation. Our approach offers a new way to understand and reason about the language implied by source code, and has potential applications in software testing, reverse engineering, and program comprehension.
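
The idea of walking backward from a sink to the sources that feed it can be illustrated with a minimal sketch. This is not the authors' implementation: the toy intermediate representation `DEFS`, the `SOURCES` set, and the helpers `backward_taint` and `implied_rules` are all hypothetical names introduced here, and real programs such as libxml2 would need a proper data-flow framework. The sketch only shows how the definitions reachable backward from one sink argument can be read as grammar-like rules.

```python
from collections import deque

# Hypothetical toy IR (not from the paper): each variable maps to the
# operands concatenated to build it. Quoted operands are string literals.
DEFS = {
    "out": ["open_tag", "body", "close_tag"],
    "open_tag": ['"<"', "name", '">"'],
    "close_tag": ['"</"', "name", '">"'],
    "body": ["text"],
    "log": ['"debug: "', "text"],  # defined, but never flows into the sink
}
SOURCES = {"name", "text"}  # assumed input-providing variables


def is_literal(operand):
    """Treat any operand that starts with a double quote as a literal."""
    return operand.startswith('"')


def backward_taint(sink_arg, defs, sources):
    """Walk definitions backward from a sink argument.

    Returns (tainted, reached): every variable on the backward slice,
    and the subset of those variables that are sources.
    """
    tainted, reached = set(), set()
    work = deque([sink_arg])
    while work:
        var = work.popleft()
        if var in tainted:
            continue
        tainted.add(var)
        if var in sources:
            reached.add(var)
            continue  # sources have no definitions to follow
        work.extend(op for op in defs.get(var, []) if not is_literal(op))
    return tainted, reached


def implied_rules(tainted, defs):
    """Keep only definitions on the backward slice, as grammar-like rules."""
    return {var: defs[var] for var in sorted(tainted) if var in defs}


# Assume "out" is the argument passed to the sink (for example a write call).
tainted, reached = backward_taint("out", DEFS, SOURCES)
rules = implied_rules(tainted, DEFS)
for head, body in rules.items():
    print(head, "::=", " ".join(body))
```

In this toy setting the variable `log` is excluded because it never flows into the chosen sink, which is the property that keeps the recovered rules focused on the structured output.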

CC BY-NC-ND 4.0

Paper citation in several formats:
Wang, Z.; Wang, P.; Bao, Q. and Wu, D. (2023). Source Code Implied Language Structure Abstraction through Backward Taint Analysis. In Proceedings of the 18th International Conference on Software Technologies - ICSOFT; ISBN 978-989-758-665-1; ISSN 2184-2833, SciTePress, pages 564-571. DOI: 10.5220/0012129000003538

@conference{icsoft23,
author={Zihao Wang and Pei Wang and Qinkun Bao and Dinghao Wu},
title={Source Code Implied Language Structure Abstraction through Backward Taint Analysis},
booktitle={Proceedings of the 18th International Conference on Software Technologies - ICSOFT},
year={2023},
pages={564-571},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0012129000003538},
isbn={978-989-758-665-1},
issn={2184-2833},
}

TY - CONF
JO - Proceedings of the 18th International Conference on Software Technologies - ICSOFT
TI - Source Code Implied Language Structure Abstraction through Backward Taint Analysis
SN - 978-989-758-665-1
IS - 2184-2833
AU - Wang, Z.
AU - Wang, P.
AU - Bao, Q.
AU - Wu, D.
PY - 2023
SP - 564
EP - 571
DO - 10.5220/0012129000003538
PB - SciTePress