In the abstract interpretation framework, completeness represents an optimal simulation by the abstract operators over the behavior of the concrete operators. This corresponds to an ideal (often rare) feature where there is no loss of information accumulated in abstract computations with respect to the properties encoded by the underlying abstract domains. In this thesis, we deal with the opposite notion of completeness in abstract interpretation, that is, incompleteness, applied to two different contexts: static program analysis and formal languages over the Chomsky's hierarchy. In static program analysis, completeness is a very rare condition to be satisfied in practice and only the straightforward abstractions are complete for all programs, thus, we usually deal with incompleteness. For this reason, we introduce the notion of partial completeness. Partial completeness is a weaker notion of completeness which requires the imprecision of the analysis to be limited. A partially complete abstract interpretation allows some false alarms to be reported, but their number is bounded by a constant. We collect in partial completeness classes all the programs whose abstract interpretations share the same upper bound of imprecision. We then focus on the investigation of the computational limits of the class of partially complete programs with respect to a given abstract domain. Moreover, we show that the class of all partially complete programs is nonrecursively enumerable, and its complement is productive whenever we allow an unlimited imprecision in the abstract domain. Finally, we formalize the local partial completeness class within which we require partial completeness only on some specific inputs. We prove that this last class of programs is a recursively enumerable set under a structural hypothesis on the underlying abstract domain, by showing an algorithm capable of proving the local partial completeness of a program with respect to a given abstract domain and an upper bound of imprecision. In formal language theory, we want to study a possible reformulation, by abstract interpretation, of classes of languages in the Chomsky's hierarchy, and, by exploiting the incompleteness of languages abstractions, we want to define separation results between classes of languages. To this end, we do a first step into this direction by studying the relation between indexed languages (recognized by indexed grammars) and contextfree languages. Indexed grammars are a generalization of contextfree grammars which recognize a proper subset of contextsensitive languages, the so called indexed languages. %The class of languages recognized by indexed grammars is called indexed languages and they correspond to the languages recognized by nested stack automata. For example, indexed grammars can recognize the language ${a^nb^nc^n mid ngeq 1 }$ which is not contextfree, but they cannot recognize ${ (ab^n)^n mid ngeq 1}$ which is contextsensitive. Indexed grammars identify a set of languages that are more expressive than contextfree languages, while having decidability results that lie in between the ones of contextfree and contextsensitive languages. We provide a fixpoint characterization of the languages recognized by an indexed grammar and we study possible ways to abstract, in the abstract interpretation sense, these languages and their grammars into contextfree and regular languages. We formalize the separation class between indexed and contextfree languages, i.e., all the languages that cannot be generated by a contextfree grammar, as an instance of incompleteness of stack elimination abstraction over indexed grammars.
Partial (In)Completeness in Abstract Interpretation
Marco Campion^{}
2021
Abstract
In the abstract interpretation framework, completeness represents an optimal simulation by the abstract operators over the behavior of the concrete operators. This corresponds to an ideal (often rare) feature where there is no loss of information accumulated in abstract computations with respect to the properties encoded by the underlying abstract domains. In this thesis, we deal with the opposite notion of completeness in abstract interpretation, that is, incompleteness, applied to two different contexts: static program analysis and formal languages over the Chomsky's hierarchy. In static program analysis, completeness is a very rare condition to be satisfied in practice and only the straightforward abstractions are complete for all programs, thus, we usually deal with incompleteness. For this reason, we introduce the notion of partial completeness. Partial completeness is a weaker notion of completeness which requires the imprecision of the analysis to be limited. A partially complete abstract interpretation allows some false alarms to be reported, but their number is bounded by a constant. We collect in partial completeness classes all the programs whose abstract interpretations share the same upper bound of imprecision. We then focus on the investigation of the computational limits of the class of partially complete programs with respect to a given abstract domain. Moreover, we show that the class of all partially complete programs is nonrecursively enumerable, and its complement is productive whenever we allow an unlimited imprecision in the abstract domain. Finally, we formalize the local partial completeness class within which we require partial completeness only on some specific inputs. We prove that this last class of programs is a recursively enumerable set under a structural hypothesis on the underlying abstract domain, by showing an algorithm capable of proving the local partial completeness of a program with respect to a given abstract domain and an upper bound of imprecision. In formal language theory, we want to study a possible reformulation, by abstract interpretation, of classes of languages in the Chomsky's hierarchy, and, by exploiting the incompleteness of languages abstractions, we want to define separation results between classes of languages. To this end, we do a first step into this direction by studying the relation between indexed languages (recognized by indexed grammars) and contextfree languages. Indexed grammars are a generalization of contextfree grammars which recognize a proper subset of contextsensitive languages, the so called indexed languages. %The class of languages recognized by indexed grammars is called indexed languages and they correspond to the languages recognized by nested stack automata. For example, indexed grammars can recognize the language ${a^nb^nc^n mid ngeq 1 }$ which is not contextfree, but they cannot recognize ${ (ab^n)^n mid ngeq 1}$ which is contextsensitive. Indexed grammars identify a set of languages that are more expressive than contextfree languages, while having decidability results that lie in between the ones of contextfree and contextsensitive languages. We provide a fixpoint characterization of the languages recognized by an indexed grammar and we study possible ways to abstract, in the abstract interpretation sense, these languages and their grammars into contextfree and regular languages. We formalize the separation class between indexed and contextfree languages, i.e., all the languages that cannot be generated by a contextfree grammar, as an instance of incompleteness of stack elimination abstraction over indexed grammars.File  Dimensione  Formato  

main.pdf
accesso aperto
Tipologia:
Tesi di dottorato
Licenza:
Creative commons
Dimensione
1.16 MB
Formato
Adobe PDF

1.16 MB  Adobe PDF  Visualizza/Apri 
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.