Essential Prompts for Reasoning Chain Verification and Natural Program Generation

8 Sept 2024

Authors:

(1) Zhan Ling, UC San Diego (equal contribution);

(2) Yunhao Fang, UC San Diego (equal contribution);

(3) Xuanlin Li, UC San Diego;

(4) Zhiao Huang, UC San Diego;

(5) Mingu Lee, Qualcomm AI Research;

(6) Roland Memisevic, Qualcomm AI Research;

(7) Hao Su, UC San Diego.

Abstract and Introduction

Related work

Motivation and Problem Formulation

Deductively Verifiable Chain-of-Thought Reasoning

Experiments

Limitations

Conclusion, Acknowledgements and References

A Deductive Verification with Vicuna Models

B More Discussion on Improvements of Deductive Verification Accuracy Versus Improvements on Final Answer Correctness

C More Details on Answer Extraction

D Prompts

E More Deductive Verification Examples

D Prompts

D.1 Prompt for Direct Reasoning Chain Verification Without Natural Program Format

For the results in Tab. 2 of the main paper, we use “Do you think the above reasoning process is correct? Let’s think step by step.” as the zero-shot prompt to verify an entire reasoning chain at once. We also design a two-shot prompt for reasoning chain verification, shown in Tab. 12, which covers one correct reasoning chain and one incorrect reasoning chain.
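As a rough illustration of the zero-shot setup, the sketch below appends the trigger sentence quoted above to a question and its full reasoning chain. Only the trigger sentence comes from this section; the function name `build_zero_shot_verification_prompt` and the "Question:" / "Reasoning:" layout are assumptions made for illustration and may differ from the layout actually used in the paper.

```python
# Trigger sentence quoted in D.1. Everything else here is a hypothetical layout.
ZERO_SHOT_TRIGGER = (
    "Do you think the above reasoning process is correct? "
    "Let's think step by step."
)


def build_zero_shot_verification_prompt(question: str, reasoning_chain: str) -> str:
    """Place the whole reasoning chain before the trigger so it is verified at once."""
    return (
        f"Question: {question.strip()}\n\n"
        f"Reasoning: {reasoning_chain.strip()}\n\n"
        f"{ZERO_SHOT_TRIGGER}"
    )


if __name__ == "__main__":
    prompt = build_zero_shot_verification_prompt(
        "What is 2 + 3?",
        "2 + 3 = 5. The answer is 5.",
    )
    print(prompt)
```

The resulting string would be sent to the language model as a single query, in contrast to the step-by-step verification of D.3.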

D.2 Prompts for Reasoning Chain Generation in the Natural Program Format

To instruct models to generate reasoning chains in the Natural Program format that facilitates step-by-step deductive verification, we have designed four distinct prompts to address different types of problems. These include:

  1. Math word problems, as illustrated in Tab. 13, covering GSM8K, MATH, and AddSub datasets.

  2. Math word problems with multiple-choice options, illustrated in Tab. 14, covering the AQuA dataset.

  3. Date-related problems, illustrated in Tab. 15, covering the Date dataset.

  4. Last Letters problems, illustrated in Tab. 16, covering the Last Letters dataset.
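The dataset-to-prompt assignment listed above can be summarized as a small lookup. This is a minimal sketch: the table numbers and dataset names are taken from the list, while the names `PROMPT_TABLE_BY_DATASET` and `prompt_table_for` are hypothetical and the prompt texts themselves (given in the referenced tables) are not reproduced here.

```python
# Dataset name -> table holding its Natural Program generation prompt (per D.2).
PROMPT_TABLE_BY_DATASET = {
    "GSM8K": "Tab. 13",         # math word problems
    "MATH": "Tab. 13",
    "AddSub": "Tab. 13",
    "AQuA": "Tab. 14",          # math word problems with multiple-choice options
    "Date": "Tab. 15",          # date-related problems
    "Last Letters": "Tab. 16",  # Last Letters problems
}


def prompt_table_for(dataset: str) -> str:
    """Return the table reference for a dataset, or raise if it is not listed."""
    try:
        return PROMPT_TABLE_BY_DATASET[dataset]
    except KeyError:
        raise ValueError(f"No Natural Program prompt listed for {dataset!r}") from None


if __name__ == "__main__":
    for name in PROMPT_TABLE_BY_DATASET:
        print(f"{name}: {prompt_table_for(name)}")
```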

D.3 Prompt for Deductive Verification Following Natural Program Format and Step-by-Step Decomposition

We have designed a general one-shot prompt for the deductive verification of a single reasoning step on different datasets, as shown in Tab. 17. This prompt instructs language models to judge the deductive validity of each reasoning step, as illustrated in Sec. 4.2 and the top-right box of Fig. 1 of the main paper.
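The step-by-step decomposition can be sketched as a loop that checks each reasoning step in isolation. In the sketch below, `verify_step` is a hypothetical stand-in for a language-model call using the one-shot prompt of Tab. 17, and the rule that a chain is accepted only when every step passes is an assumption made for illustration, not a detail stated in this section.

```python
from typing import Callable, List


def verify_steps(steps: List[str], verify_step: Callable[[str], bool]) -> List[bool]:
    """Run the single-step verifier on each reasoning step independently."""
    return [verify_step(step) for step in steps]


def chain_is_valid(step_results: List[bool]) -> bool:
    """Assumed aggregation rule: the chain passes only if every step passes."""
    return all(step_results)


if __name__ == "__main__":
    # Toy verifier for demonstration only; a real one would query a model.
    def toy_verifier(step: str) -> bool:
        return "2 + 3 = 6" not in step

    chain = ["There are 2 apples and 3 pears.", "2 + 3 = 6 fruits in total."]
    results = verify_steps(chain, toy_verifier)
    print(results, chain_is_valid(results))
```

Keeping the verifier as a plain callable makes it easy to swap in different models or prompts without touching the aggregation logic.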

This paper is available on arXiv under the CC BY 4.0 DEED license.