Abstract
Task decomposition is a fundamental mechanism in program synthesis, enabling complex problems to be broken down into manageable subtasks. ExeDec, a state-of-the-art program synthesis framework, employs this approach by combining a Subgoal Model for decomposition and a Synthesizer Model for program generation to facilitate compositional generalization. In this work, we develop REGISM, an adaptation of ExeDec that removes decomposition guidance and relies solely on iterative execution-driven synthesis. By comparing these two exemplary approaches-ExeDec, which leverages task decomposition, and REGISM, which does not-we investigate the interplay between task decomposition and program generation. Our findings indicate that ExeDec exhibits significant advantages in length generalization and concept composition tasks, likely due to its explicit decomposition strategies. At the same time, REGISM frequently matches or surpasses ExeDec's performance across various scenarios, with its solutions often aligning more closely with ground truth decompositions. These observations highlight the importance of repeated execution-guided synthesis in driving task-solving performance, even within frameworks that incorporate explicit decomposition strategies. Our analysis suggests that task decomposition approaches like ExeDec hold significant potential for advancing program synthesis, though further work is needed to clarify when and why these strategies are most effective.
Core Technologies
- 💻 Execution-guided Program Synthesis
- 🔁 Repeated iterative generation of single-step programs
- 🤖 Transformer-Based Synthesizer Model
-
🎯 Keywords
Program Synthesis, Task Decomposition, Compositional Generalization, Execution-Guidance
Programmnig by example
Programming by example (PBE) is a paradigm where a system synthesizes a program based on a small set of input-output examples provided by the user. Instead of writing code explicitly, the user shows what the program should do, and the system infers a program in a predefined domain-specific language (DSL) that is consistent with the examples. PBE is widely used in scenarios like string transformation, data cleaning, and end-user programming, where writing code manually is difficult or time-consuming.Methodology
ExeDec
Explicit task decomposition framework for program synthesis
-
Subgoal Model
Decompose a PBE task into a subtask (yielding a new PBE task) by predicting the output of the next subtask.
-
Synthesizer Model
Solve predicted subtask using execution-guided program synthesis.
Repeat until the task is solved or a maximum step limit is reached:
- Subgoal Model predicts the next single-step subtask.
- Synthesizer Model generates a program for the predicted subtask.
- The program is executed, and task specifications are updated.
Repeated execution-guided invocation of the Synthesizer Model (REGISM)
Removing the task decomposition step from ExeDec
-
No Subgoal Model
Guidance via task decomposition is removed.
-
Synthesizer Model
Solve remaining task using execution-guided program synthesis.
Repeat until the task is solved or a maximum step limit is reached:
- Synthesizer Model generates a program based on the current task specification.
- The program is executed, and task specifications are updated accordingly.
Note: Subtasks are only revealed in hindsight through execution.
ExeDec Framework
Evaluation Framework
Comprehensive evaluation across two program synthesis domains:
- • Robustfill (String manipulation)
- • Deepcoder (List manipulation)
Qualitative Results
Our comprehensive evaluation demonstrates that while ExeDec excels at learning expedient decomposition strategies in the RobustFill domain, it struggles to do so in the DeepCoder domain. The accompanying plots illustrate this discrepancy by showing the density of solved tasks as a function of both semantic and syntactic overlap with the ground truth decomposition-that is, whether the same subtasks are predicted (semantic overlap) and whether those subtasks are solved using the same program tokens as in the ground truth (syntactic overlap). A high concentration of solved tasks in regions of high overlap indicates effective learning of meaningful decompositions. While such a pattern emerges clearly in RobustFill, the spread in DeepCoder suggests difficulty in aligning with the correct decomposition strategy.
This trend can also be seen when looking at the number of subtasks ExeDec predicts: In the Robustfill domain, it is very close the the number of ground truth subtasks whereas it significantly deviates from the ground truth in the Deepcoder domain. This suggests, that in the Deepcoder domain ExeDec benefits from the repeated envoking of the Synthesizer Model.
When looking at the quality of the solutions generated by REGISM, one can see that they are closer to the ground truth solutions.
Quantitaive Results
While ExeDec demonstrates clear advantages in Length Generalization and Composing Different Concepts, REGISM matches or even surpasses ExeDec across all other evaluation categories. This comparison underscores the effectiveness of iterative synthesis calls-a core mechanism leveraged by ExeDec-as a powerful strategy for tackling complex synthesis tasks.