
REFERENCES
Besta, M., Blach, N., Kubicek, A., Gerstenberger, R., Gianinazzi, L., Gajda, J., Lehmann, T., Podstawski, M., Niewiadomski, H., Nyczyk, P., and Hoefler, T. (2023). Graph of Thoughts: Solving Elaborate Problems with Large Language Models. arXiv:2308.09687 [cs].
Cai, Z. G., Haslett, D. A., Duan, X., Wang, S., and Pickering, M. J. (2023). Does ChatGPT resemble humans in language use? arXiv:2303.08014 [cs].
Chen, B., Zhang, Z., Langrené, N., and Zhu, S. (2023). Unleashing the potential of prompt engineering in Large Language Models: a comprehensive review. arXiv.org.
Dong, Y., Jiang, X., Jin, Z., and Li, G. (2024). Self-Collaboration Code Generation via ChatGPT. ACM Trans. Softw. Eng. Methodol., 33(7):189:1–189:38.
Feuerriegel, S., Hartmann, J., Janiesch, C., and Zschech, P. (2023). Generative AI. Business & Information Systems Engineering. arXiv:2309.07930 [cs].
Gandhi, K., Lee, D., Grand, G., Liu, M., Cheng, W., Sharma, A., and Goodman, N. D. (2024). Stream of Search (SoS): Learning to Search in Language. arXiv:2404.03683.
Hendel, R., Geva, M., and Globerson, A. (2023). In-Context Learning Creates Task Vectors. arXiv:2310.15916 [cs].
Hong, S., Zhuge, M., Chen, J., Zheng, X., Cheng, Y., Zhang, C., Wang, J., Wang, Z., Yau, S. K. S., Lin, Z., Zhou, L., Ran, C., Xiao, L., Wu, C., and Schmidhuber, J. (2023). MetaGPT: Meta Programming for A Multi-Agent Collaborative Framework. arXiv:2308.00352 [cs].
Josifoski, M., Sakota, M., Peyrard, M., and West, R. (2023). Exploiting Asymmetry for Synthetic Training Data Generation: SynthIE and the Case of Information Extraction. arXiv:2303.04132.
Kojima, T., Gu, S. S., Reid, M., Matsuo, Y., and Iwasawa, Y. (2023). Large Language Models are Zero-Shot Reasoners. arXiv:2205.11916 [cs].
Li, C., Wang, J., Zhang, Y., Zhu, K., Hou, W., Lian, J., Luo, F., Yang, Q., and Xie, X. (2023). Large Language Models Understand and Can be Enhanced by Emotional Stimuli. arXiv:2307.11760 [cs].
Madaan, A., Tandon, N., Gupta, P., Hallinan, S., Gao, L., Wiegreffe, S., Alon, U., Dziri, N., Prabhumoye, S., Yang, Y., Welleck, S., Majumder, B. P., Gupta, S., Yazdanbakhsh, A., and Clark, P. (2023). Self-Refine: Iterative Refinement with Self-Feedback. arXiv.org.
Ning, X., Lin, Z., Zhou, Z., Wang, Z., Yang, H., and Wang, Y. (2023). Skeleton-of-Thought: Large Language Models Can Do Parallel Decoding. arXiv:2307.15337 [cs].
Nori, H., Lee, Y. T., Zhang, S., Carignan, D., Edgar, R., Fusi, N., King, N., Larson, J., Li, Y., Liu, W., Luo, R., McKinney, S. M., Ness, R. O., Poon, H., Qin, T., Usuyama, N., White, C., and Horvitz, E. (2023). Can Generalist Foundation Models Outcompete Special-Purpose Tuning? Case Study in Medicine. arXiv:2311.16452 [cs].
OpenAI (2024). Learning to Reason with LLMs.
Park, J., O’Brien, J. C., Cai, C. J., Morris, M., Liang, P., and Bernstein, M. S. (2023). Generative Agents: Interactive Simulacra of Human Behavior. arXiv.org.
Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., and Sutskever, I. (2019). Language Models are Unsupervised Multitask Learners.
Saha, S., Levy, O., Celikyilmaz, A., Bansal, M., Weston, J., and Li, X. (2023). Branch-Solve-Merge Improves Large Language Model Evaluation and Generation. arXiv:2310.15123 [cs].
Savage, T., Nayak, A., Gallo, R., Rangan, E., and Chen, J. H. (2023). Diagnostic Reasoning Prompts Reveal the Potential for Large Language Model Interpretability in Medicine. arXiv:2308.06834 [cs].
Su, X., Luo, M., Pan, K. W., Chou, T. P., Lal, V., and Howard, P. (2024). SK-VQA: Synthetic Knowledge Generation at Scale for Training Context-Augmented Multimodal LLMs. arXiv:2406.19593.
Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., and Zhou, D. (2022). Self-Consistency Improves Chain of Thought Reasoning in Language Models. arXiv.org.
Wang, Z., Mao, S., Wu, W., Ge, T., Wei, F., and Ji, H. (2024). Unleashing the Emergent Cognitive Synergy in Large Language Models: A Task-Solving Agent through Multi-Persona Self-Collaboration. arXiv:2307.05300.
Wei, J., Wang, X., Schuurmans, D., Bosma, M., Ichter, B., Xia, F., Chi, E., Le, Q., and Zhou, D. (2023). Chain-of-Thought Prompting Elicits Reasoning in Large Language Models. arXiv:2201.11903.
Wu, Q., Bansal, G., Zhang, J., Wu, Y., Li, B., Zhu, E., Jiang, L., Zhang, X., Zhang, S., Liu, J., Awadallah, A. H., White, R. W., Burger, D., and Wang, C. (2023). AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation. arXiv:2308.08155 [cs].
Xi, Z., Chen, W., Guo, X., He, W., Ding, Y., Hong, B., Zhang, M., Wang, J., Jin, S., Zhou, E., Zheng, R., Fan, X., Wang, X., Xiong, L., Zhou, Y., Wang, W., Jiang, C., Zou, Y., Liu, X., Yin, Z., Dou, S., Weng, R., Cheng, W., Zhang, Q., Qin, W., Zheng, Y., Qiu, X., Huang, X., and Gui, T. (2023). The Rise and Potential of Large Language Model Based Agents: A Survey. arXiv:2309.07864 [cs].
Yao, S., Yu, D., Zhao, J., Shafran, I., Griffiths, T., Cao, Y., and Narasimhan, K. (2023). Tree of Thoughts: Deliberate Problem Solving with Large Language Models. arXiv.org.
Yogatama, D., d’Autume, C. d. M., Connor, J., Kocisky, T., Chrzanowski, M., Kong, L., Lazaridou, A., Ling, W., Yu, L., Dyer, C., and Blunsom, P. (2019). Learning and Evaluating General Linguistic Intelligence. arXiv:1901.11373 [cs, stat].
Yu, Z., He, L., Wu, Z., Dai, X., and Chen, J. (2023). Towards Better Chain-of-Thought Prompting Strategies: A Survey. arXiv.org.
Zou, A., Wang, Z., Kolter, J. Z., and Fredrikson, M. (2023). Universal and Transferable Adversarial Attacks on Aligned Language Models. arXiv:2307.15043 [cs].