# Bachelor- und Masterarbeiten

## Zu vergebende Arbeiten

Beispiele für zu vergebende Arbeiten:

## Expression matching up to weak head normal form in the Lean proof assistant (Limperg - Bachelor/Master)

When writing tactics for the Lean theorem prover, we often match on Lean expressions.
or example, a tactic that splits hypotheses `h : A ∧ B`

into `h₁ : A`

and `h₂ : B`

goes through each hypothesis in the context, looking for a type that matches the pattern `?A ∧ ?B`

.
The quote4 library allows us to conveniently perform such matching.

However, quote4 does not currently support matching up to computation.
For example, if we define a type synonym `T := A ∧ B`

, then we might expect the conjunction-splitting tactic to recognise and split the hypothesis `h : T`

.
The necessary logic is quite tedious and must currently be written by hand.
The aim of this project is to extend QQ with a new matching construct that performs matching up to computation, automatically and efficiently.
More details are available here.

To do so, you will first have to familiarise yourself with the metaprogramming APIs of Lean. A solid command of functional programming, including monads, is required. Experience using theorem provers based on dependent type theory (e.g. Agda, Coq, Lean) would be very helpful. The project is suitable for a master or strong bachelor student.

## An independent checker for the Lean proof assistant (Maio, Limperg - Bachelor/Master)

The Lean theorem prover is built on a fairly minimal core type theory. By design, Lean itself is not particularly trustworthy (e.g. it allows users to circumvent its type system in ways that are hard to detect), but it can export finished proofs, which are terms of the core type theory. These can then be checked by small, trustworthy third-party checkers (1, 2, 3).

The aim of this project is to build one such checker (though we may limit the scope to a subset of the core type theory). To do so, you'll have to familiarise yourself with the rules of the core type theory and then define its abstract syntax and write a parser and typechecker.

Familiarity with basic compilation techniques (parsing, abstract syntax, typechecking) is required. Experience with dependent type theories would be very helpful. The project is suitable for a master or strong bachelor student.

## Implementation of type theories in the Lean proof assistant (Maio, Limperg - Bachelor/Master)

The Lean theorem prover has a very flexible parsing mechanism which allows us to embed other languages into it. In this project, we'd like to use this mechanism to embed intensional or extensional Martin Löf type theory. This means you'll have to define the abstract syntax and type system of this type theory in Lean and write a parser type checker for it. (We aim to use the embedded type theory to simplify proofs about its categorical models.)

A solid command of functional programming is required, as is familiarity with basic compilation techniques (parsing, abstract syntax, typechecking). Experience using theorem provers based on dependent type theory (e.g. Agda, Coq, Lean) would be very helpful. The project is suitable for a master or strong bachelor student.

## Clause learning proofs of lifted formulas (Johannsen - Master)

Lifting is a technique to produce hard instances of CNF formulas from easier ones. In a recent paper [1], a lower bound for proofs of certain lifted formulas in the regular resolution proof system was shown. For these formulas, upper bounds in a stronger proof systems related to clause learning [2] should be derived.

[1] Vinyals, Elffers, Johannsen, Nordström, "Simplified and Improved Separations between Regular and General Resolution by Lifting", SAT 2020

[2] Buss, Hoffmann, Johannsen, "Resolution Trees with Lemmas: Resolution Refinements that Characterize DLL-Algorithms with Clause Learning", LMCS 4, 2008.

## Generalized rewriting for the Lean proof assistant (Limperg - Master)

The Lean theorem prover has various tactics for rewriting with equations.
For example, given the equation `h : n = m`

, we can call `rw [h]`

to replace every `n`

in our current goal with `m`

.
However, these tactics only work for equality, whereas in practice we sometimes want to rewrite with other relations as well.
For example, if two groups `G, H`

are isomorphic, we should be able to replace `G`

with `H`

in any statement about groups.

The Coq theorem prover has a tactic which implements such 'generalised' rewriting. The aim of this project is to port this tactic to Lean. An early proof of concept implementation for Lean is available.

To tackle this project, you will first have to familiarise yourself with the metaprogramming APIs of Lean. A solid command of functional programming, including monads, is required. Experience using theorem provers based on dependent type theory (e.g. Agda, Coq, Lean) would be very helpful.

## Structuring Lean proofs (Limperg - Bachelor)

In the Lean theorem prover, we usually write proofs in an imperative tactic language, so a proof is a series of commands. A problem with this style is that the proofs are very hard to read and sometimes also hard to maintain.

A master student of ours therefore wrote a tool which transforms an imperative proof into a 'structured' proof, where the steps are annotated with additional information about the expected proof state. However, this tool is currently just a proof of concept and not suitable for serious use. The aim of this project is to change that, polishing the tool and integrating it into the Lean extension for the VSCode editor.

To tackle this project, you may have to familiarise yourself with some of the metaprogramming APIs of Lean, for which a solid command of functional programming, including monads, is required. Experience with theorem provers based on dependent type theory (e.g. Agda, Coq, Lean) and with editor extensions for programming languages (particularly VSCode, which is implemented in JavaScript, and the language server protocol) would be helpful.

## Categorical semantics in Haskell (Kondylidou - Master)

Category theory provides a foundational framework for understanding and formalizing mathematical structures and relationships. In the field of functional programming, Haskell stands as a powerful language known for its strong type system and expressive capabilities. Yet, when it comes to modeling basic aspects of category theory in Haskell, the absence of a canonical approach presents a challenge and an open project. To address this, this thesis focuses on establishing a canonical methodology for modeling category theory constructs in Haskell, by introducing a higher-level abstraction for categories and their duals.

Example:

A reasonable model for a Category can be defined using a single-parameter type class.
This type class includes associated types for categorical objects and morphisms' domain
and codomain, allowing for the definition of morphisms based on these associated types.
In practical terms, this model functions well, and the default type instances specified
in the type class definition simplify the process of defining typical instances.However,
the main hurdle arises when attempting to abstract over this custom Category type class.
For instance, defining the Opposite category is not as straightforward as merely creating
an instance of Category with the arrows flipped. Instead, the aim is to express that the
Opposite category refines the base category by interchanging domain and codomain, which
is challenging within Haskell's existing framework.

## Graphical learning interface for specific topics in "Formale Sprachen und Komplexität" (Maio, Toth, Blanchette - Bachelor)

The BSc-level course "Formale Sprachen und Komplexität" and its companion "Theoretische Informatik für Medieninformatiker" present a wide variety of formalisms, including various types of automata and grammars. Some online tools exist to simulate automata or perform operations on grammars, such as the CYK algorithm tool, but a wider collection of tools would be greatly appreciated by students.

For this project, you would choose a topic from the course, design a graphical learning interface for it, and implement it.

## A compiler for a language of your choice (Blanchette - Bachelor)

Compilers are fundamental tools in computer science. Nowadays, a lot of useful theory and tools are available to design and implement them.

For this project, you would choose a subset of a source programming language (the language you compile), a target language (the language you compile to, e.g., assembly language), and an implementation language (the language you use to write your compiler). Then you would develop a compiler based on these parameters, starting with syntactic and semantic analysis and ending with code generation.

## An interpreter for a language of your choice (Blanchette - Bachelor)

Interpreters are fundamental tools in computer science. Nowadays, a lot of useful theory and tools are available to design and implement them.

For this project, you would choose a subset of a source programming language (the language you interpret) and an implementation language (the language you use to write your interpreter). Then you would develop an interpreter based on these parameters, starting with syntactic and semantic analysis and ending with execution.

## Formalizations in the Lean/Isabelle proof assistant (e.g., lambda-calculus, type systems, diff arrays, algorithms and data structures) (Maio, Toth - Bachelor/Master)

This type of thesis aims to define and then prove some interesting properties about a system, structure, or implementation, using either Isabelle [1] or Lean [2]. There exists a wide variety of possible formalization projects; you may either come to us with an idea of your own, or we look for one in a field that interests you.

For a project like this, prior experience with either Lean or Isabelle (or
another proof assistant like Coq [3] or Agda [4]) is very helpful but not strictly
required if you are willing to learn to use them in the process. For Master's level,
due to the probable scope of the project, not having prior experience might prove
to be challenging, though.

Resources: Lean 4 learning material,
Isabelle Tutorial,
Concrete Semantics with Isabelle/HOL,
Functional Data Structures and Algorithms.

[1] https://isabelle.in.tum.de/

[2] https://lean-lang.org/

[3] https://coq.inria.fr/

[4] https://agda.readthedocs.io/en/v2.6.0.1/index.html

Two interesting projects:

**Diff-Arrays:**

Functional lists are a convenient and easy-to-use data structure in functional programming. However, compared to imperative arrays, lookup and update operations on lists have a significantly worse runtime of O(n) instead of O(1). For this reason, we created an automatic refinement of linearly used lists to arrays and also non-linearly used lists to diff arrays. Diff arrays are persistent and store their updates in a tree-like structure next to a plain array. With them, we can have lookup and update operations in O(1) for its most recent version.
Parallel to the refinement, we will automatically create equivalence proofs using the Isabelle/HOL theorem prover and its Imperative/HOL and separation logic facilities to ensure that the two versions of the program are doing the same.

Repository

In the future work chapter of this thesis, you can see the open tasks in this endeavor. All four tasks described there would give a good entry point to a Master's or Bachelor's thesis.

**Functional Real-Time Deque:**

Based on the work of Chuang and Goldberg, we implemented and formally verified a double-ended queue (deque) in a purely functional language such that each enqueue and dequeue operation on either end takes O(1) time in the worst case. This is what real-time means. Operations on previous versions of a deque are in constant time since purely functional data structures are persistent by default.

Paper: https://drops.dagstuhl.de/opus/volltexte/2023/18404/pdf/LIPIcs-ITP-2023-29.pdf

Entry in the AfP: https://www.isa-afp.org/entries/Real_Time_Deque.html

However, we do not support the catenation of deques. This paper by Kaplan and Tarjan describes a fully functional implementation of a real-time deque with catenation: https://dl.acm.org/doi/abs/10.1145/324133.324139

We found an actual implementation of it in OCaml: https://github.com/art-w/deque

A good entry point to your Master's or Bachelor's thesis could be re-implementing this data structure in a functional programming language of your choice and comparing it to the Deque implementation without catenation based on Chuang and Goldberg.

## Uniform presentation and comparison of tractable cases of SAT (Johannsen - Bachelor/Master)

A tractable case of SAT is a class C of CNF formulas s.t. it is easy to decide whether a formula F is in C, and for formulas in C satisfiability can be tested in polynomial time. Well-known tractable cases are Horn formulas and 2-CNFs. Several recent papers describe new tractable cases. These cases and the algorithms for solving them should be described in a uniform standard format, and compared to each other and other known tractable cases.

## Eager normalisation in Aesop (Limperg - Master)

Aesop is a proof search tool developed by Jannis Limperg for the Lean 4 theorem prover. It runs various Lean tactics in an attempt to automatically find proofs of 'trivial' propositions. Currently, these tactics perform type-level computations, which are a core feature of Lean's type system, lazily, i.e. only when such computations are necessary. This has generally been found to be the most efficient way to implement Lean and similar theorem provers.

However, in the context of Aesop, we often operate on the same or very similar expressions multiple times, so it may be more efficient to perform all type-level computations at once (i.e., to normalise the expressions). Additionally, normalisation would allow us to use a variety of techniques from automated theorem proving, e.g. very efficient indexing methods. Hence the goal of this project is to implement eager normalisation (and possibly some follow-up optimisations) and to evaluate whether this indeed increases Aesop's performance.

## Abusing Lean's elaborator for EDSL type inference (Limperg - Bachelor/Master)

Many programming languages feature some form of type inference, which lets the compiler fill in certain type annotations omitted by programmers. For embedded domain-specific languages (EDSLs), a type inference algorithm must usually be developed and implemented from scratch. This is cumbersome, so EDSLs often end up having poor type inference.

The Lean 4 theorem prover offers a possible alternative. It gives programmers full access to its own elaborator, which infers implicit arguments and type-class instances. This means we can use Lean's elaborator as a type inference procedure for any DSL embedded in Lean, provided that the EDSL's type inference problems can be encoded as elaboration problems.

The project will explore this possibility by implementing a toy functional EDSL with type inference. The type inference problems of this language should map almost trivially to elaboration problems. If time permits, we could also look into adding type classes to the EDSL.

The project requires solid functional programming skills (including monads) and the willingness to learn Lean's compiler APIs. Experience writing compilers or interpreters would be helpful. The project is suitable for a master or strong bachelor student.

Eigene Ideen, die zu unserem Forschungsgebiet passen, sind ebenfalls willkommen. Kooperationsarbeiten mit Firmen akzeptieren wir aber nicht.

Bitte wenden Sie sich an Prof. Blanchette, falls Sie Ihre Bachelor- oder Masterarbeit im Bereich der Theoretischen Informatik oder des Theorembeweisens schreiben möchten.

## Vorlagen

Im Downloadbereich (s. unten) gibt es eine LaTeX-Vorlage (kompiliert mit pdfLaTeX) für Bachelor- sowie Masterarbeiten auf Englisch oder Deutsch (die zip-Datei enthält die eigentliche Vorlage [tex-Datei], eine Language-Bibliographie-Vorlage [bst-Datei], eine Beispielbibliographie im BibTex-Format [bib-Datei] sowie den verwendeten Font [ttf-Dateien]). Das template ist auch auf Overleaf unter https://www.overleaf.com/latex/templates/tcs-thesis-template/vqjrgfwnrqbh verfügbar.

Ebenso können dort die mit dieser Vorlage generierten pdf-Dateien für Bachelor- bzw. Masterarbeit auf Englisch oder Deutsch heruntergeladen werden, die für Studierende, die nicht mit LaTeX arbeiten, als Orientierungsgrundlage für ihre Arbeiten dienen können.

## Bewertungsbogen

Im Downloadbereich (s. unten) gibt es auch den Bewertungsbogen für Abschlussarbeiten. Er gibt einen Überblick über die Kriterien, nach denen Abschlussarbeiten beurteilt werden.

### Downloads

Artikelaktionen