Semantic Structural Representations

It is my opinion that one of the most important open challenges in machine programming (MP) is the proper extraction or creation of a software program’s semantics (i.e., its purpose). I believe this problem, when solved properly, has the potential to reshape many aspects of programming. However, I believe the classical techniques for extracting meaning from software, such as abstract syntax trees (ASTs), are, in general, inappropriate for semantic lifting.

I don’t believe we currently have a solution to this problem, but of all the recent papers I’ve seen, I believe the approach in the 2019 OOPSLA Aroma paper is the closest to getting it right: “Aroma: code recommendation via structural code search.”
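To give a flavor of what “structural” (as opposed to purely syntactic) comparison can look like, here is a toy sketch of my own, not Aroma’s actual featurization, which is considerably more sophisticated. It reduces Python code to a bag of parent-child node-type features and scores snippets by overlap, so that two loops with different variable names look alike while a loop and a comprehension do not:

```python
import ast


def structural_features(source: str) -> set:
    """Extract a bag of parent->child node-type pairs from the AST.

    A deliberately simple stand-in for richer structural featurizations;
    it ignores identifiers entirely and keeps only shape.
    """
    tree = ast.parse(source)
    feats = set()
    for parent in ast.walk(tree):
        for child in ast.iter_child_nodes(parent):
            feats.add(f"{type(parent).__name__}>{type(child).__name__}")
    return feats


def similarity(a: str, b: str) -> float:
    """Jaccard similarity over structural features."""
    fa, fb = structural_features(a), structural_features(b)
    return len(fa & fb) / len(fa | fb) if (fa | fb) else 1.0


# Two loops that differ only in naming share all structural features,
# while a semantically equivalent comprehension shares far fewer.
loop_a = "total = 0\nfor x in xs:\n    total += x"
loop_b = "s = 0\nfor item in items:\n    s += item"
comp = "total = sum(x for x in xs)"
```

Note the limitation this sketch makes visible: `loop_a` and `comp` have the same purpose but score as dissimilar, which is exactly why structure alone is not yet semantics.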

Abstract Machine Learned Data Structures

(part of our "Intentional Programming" initiative)

I believe the way we program today’s data structures is fundamentally flawed. I do not believe data structures should be exposed to programmers, for at least three reasons:

  1. Human-decided data structure selection is often based largely on speculation.
  2. Human-decided data structure selection is often static, limiting how maintainable the software remains as it evolves.
  3. Human-decided data structure selection slows down programmer productivity.

For these reasons, and others, I believe we need to re-architect the way in which we program using data structures. In general, I believe data structure programming should mostly be done against an application programming interface (API). There are many reasons for this. Here are some:

<details forthcoming>
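As one hypothetical sketch of what programming against such an API could look like (the class name, threshold, and migration policy below are all my own illustrative inventions, not a proposed design), consider a membership container whose backing representation is an internal decision it may revisit at runtime based on the workload it actually observes:

```python
class AdaptiveSet:
    """Hypothetical sketch of API-only data structure programming.

    The caller never names a concrete structure; the backing
    representation (plain list vs. hash set) is an internal choice
    the container can change based on observed usage.
    """

    _MIGRATE_AFTER = 32  # illustrative threshold, not a tuned value

    def __init__(self, items=()):
        # A plain list is compact and cheap for small, rarely-queried data.
        self._impl = list(items)
        self._lookups = 0

    def add(self, item):
        if isinstance(self._impl, set):
            self._impl.add(item)
        elif item not in self._impl:
            self._impl.append(item)

    def __contains__(self, item):
        self._lookups += 1
        if isinstance(self._impl, list) and self._lookups > self._MIGRATE_AFTER:
            # A lookup-heavy workload was observed: switch to O(1) membership.
            self._impl = set(self._impl)
        return item in self._impl

    def __len__(self):
        return len(self._impl)


# The caller's code is identical before and after the internal migration.
s = AdaptiveSet(range(100))
for i in range(200):
    _ = i in s  # repeated membership queries trigger the switch internally
```

The point is not this particular policy; it is that because the caller only ever touches the API, the selection decision can be deferred, measured, and even machine-learned without changing a line of client code.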

Automated Testing

One of the areas of software development that tends to be critical in ensuring correct, performant, and secure software is testing. While I believe software testing is of the utmost importance for real-world software engineering, it can be time-consuming, error-prone, and (even for super coders) boring.

For these reasons, and others, I believe we should try to automate software testing as much as possible. I believe the core goals of such testing should be:

  1. Mathematically sound tests (i.e., tests whose pass or fail verdicts accurately reflect the program’s behavior)
  2. Mathematically complete tests (i.e., tests that provide exhaustive coverage)

Automatically generating mathematically sound tests is something the research community has been making progress on for at least a decade (perhaps longer). However, to the best of my knowledge, providing exhaustive testing coverage remains an open challenge in many domains.
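The soundness/completeness distinction can be made concrete with a minimal hand-rolled random property-testing harness (stdlib only; the property, functions, and bounds below are illustrative examples of mine, not a reference to any particular tool). A returned counterexample is a sound verdict, a real, replayable failure; exhausting the trial budget, by contrast, proves nothing, which is precisely the completeness gap:

```python
import random


def find_counterexample(clamp_fn, trials=1000, seed=0):
    """Randomly probe the property: clamp(x, lo, hi) lies in [lo, hi]
    and equals x whenever x is already in range.

    Returns a concrete failing input (sound evidence of a bug) or
    None, which is *not* a proof of correctness.
    """
    rng = random.Random(seed)
    for _ in range(trials):
        lo = rng.randint(-100, 100)
        hi = rng.randint(lo, lo + 200)
        x = rng.randint(-300, 300)
        out = clamp_fn(x, lo, hi)
        if not (lo <= out <= hi) or (lo <= x <= hi and out != x):
            return (x, lo, hi, out)  # replayable via the fixed seed
    return None


def good_clamp(x, lo, hi):
    return max(lo, min(x, hi))


def bad_clamp(x, lo, hi):
    return hi if x > lo else lo  # deliberately buggy for in-range x
```

Here `find_counterexample(bad_clamp)` quickly surfaces a failing input, while `find_counterexample(good_clamp)` merely runs out of trials; scaling that weak passing verdict into genuine exhaustive coverage is the part that remains open.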

Synthetic Data Generation

Details Forthcoming…