tuc
core

Shedding Light in Task Decomposition in Program Synthesis

The Driving Force of the Synthesizer Model

Janis Zenkner

Clausthal University of Technology

janis.zenkner@tu-clausthal.de

Tobias Sesterhenn

Clausthal University of Technology

tobias.sesternhenn@tu-clausthal.de

Christian Bartelt

Clausthal University of Technology

christian.bartelt@tu-clausthal.de

Abstract

Task decomposition is a fundamental mechanism in program synthesis, enabling complex problems to be broken down into manageable subtasks. ExeDec, a state-of-the-art program synthesis framework, employs this approach by combining a Subgoal Model for decomposition and a Synthesizer Model for program generation to facilitate compositional generalization. In this work, we develop REGISM, an adaptation of ExeDec that removes decomposition guidance and relies solely on iterative execution-driven synthesis. By comparing these two exemplary approaches-ExeDec, which leverages task decomposition, and REGISM, which does not-we investigate the interplay between task decomposition and program generation. Our findings indicate that ExeDec exhibits significant advantages in length generalization and concept composition tasks, likely due to its explicit decomposition strategies. At the same time, REGISM frequently matches or surpasses ExeDec's performance across various scenarios, with its solutions often aligning more closely with ground truth decompositions. These observations highlight the importance of repeated execution-guided synthesis in driving task-solving performance, even within frameworks that incorporate explicit decomposition strategies. Our analysis suggests that task decomposition approaches like ExeDec hold significant potential for advancing program synthesis, though further work is needed to clarify when and why these strategies are most effective.

Core Technologies

  • 💻 Execution-guided Program Synthesis
  • 🔁 Repeated iterative generation of single-step programs
  • 🤖 Transformer-Based Synthesizer Model
  • 🎯 Keywords

    Program Synthesis, Task Decomposition, Compositional Generalization, Execution-Guidance

Programmnig by example

Programming by example (PBE) is a paradigm where a system synthesizes a program based on a small set of input-output examples provided by the user. Instead of writing code explicitly, the user shows what the program should do, and the system infers a program in a predefined domain-specific language (DSL) that is consistent with the examples. PBE is widely used in scenarios like string transformation, data cleaning, and end-user programming, where writing code manually is difficult or time-consuming.

Illustrative PBE task

ARC Example

Methodology

ExeDec

Explicit task decomposition framework for program synthesis

  • Subgoal Model

    Decompose a PBE task into a subtask (yielding a new PBE task) by predicting the output of the next subtask.

  • Synthesizer Model

    Solve predicted subtask using execution-guided program synthesis.

Workflow:

Repeat until the task is solved or a maximum step limit is reached:

  1. Subgoal Model predicts the next single-step subtask.
  2. Synthesizer Model generates a program for the predicted subtask.
  3. The program is executed, and task specifications are updated.

Repeated execution-guided invocation of the Synthesizer Model (REGISM)

Removing the task decomposition step from ExeDec

  • No Subgoal Model

    Guidance via task decomposition is removed.

  • Synthesizer Model

    Solve remaining task using execution-guided program synthesis.

Workflow:

Repeat until the task is solved or a maximum step limit is reached:

  1. Synthesizer Model generates a program based on the current task specification.
  2. The program is executed, and task specifications are updated accordingly.

Note: Subtasks are only revealed in hindsight through execution.

ExeDec Framework

tuc

Evaluation Framework

Comprehensive evaluation across two program synthesis domains:

  • • Robustfill (String manipulation)
  • • Deepcoder (List manipulation)

Qualitative Results

Our comprehensive evaluation demonstrates that while ExeDec excels at learning expedient decomposition strategies in the RobustFill domain, it struggles to do so in the DeepCoder domain. The accompanying plots illustrate this discrepancy by showing the density of solved tasks as a function of both semantic and syntactic overlap with the ground truth decomposition-that is, whether the same subtasks are predicted (semantic overlap) and whether those subtasks are solved using the same program tokens as in the ground truth (syntactic overlap). A high concentration of solved tasks in regions of high overlap indicates effective learning of meaningful decompositions. While such a pattern emerges clearly in RobustFill, the spread in DeepCoder suggests difficulty in aligning with the correct decomposition strategy.

Density of solved tasks on Robustfill Density of solved tasks on Deepcoder

This trend can also be seen when looking at the number of subtasks ExeDec predicts: In the Robustfill domain, it is very close the the number of ground truth subtasks whereas it significantly deviates from the ground truth in the Deepcoder domain. This suggests, that in the Deepcoder domain ExeDec benefits from the repeated envoking of the Synthesizer Model.

Number of Decompositions ExeDec Number of Decompositions REGISM

When looking at the quality of the solutions generated by REGISM, one can see that they are closer to the ground truth solutions.

Density REGISM

Quantitaive Results

While ExeDec demonstrates clear advantages in Length Generalization and Composing Different Concepts, REGISM matches or even surpasses ExeDec across all other evaluation categories. This comparison underscores the effectiveness of iterative synthesis calls-a core mechanism leveraged by ExeDec-as a powerful strategy for tackling complex synthesis tasks.

Accuracy on Deepcoder