Yes, probably better to get the LLM to write the script.
For example, I was trying out two podcast apps and wanted a diff of the feeds I had subscribed to in each. I initially asked the LLM to compare the two OPML files directly, but it got the results wrong. I could have spent the next 30 minutes prompt engineering and manually verifying results, but instead I asked it to write a script to compare the two OPML files, which turned out fine. It's fairly easy to inspect a script and be confident it's _probably_ accurate, compared to the tedious process of checking a complex output.
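To give a sense of how small such a script can be, here is a minimal sketch (not the exact one the LLM produced). It assumes each subscription is an `<outline>` element carrying an `xmlUrl` attribute, which is how podcast apps typically export OPML, and that trimming whitespace and a trailing slash is enough to match the same feed across apps. The names `feed_urls` and `diff_feeds` are made up for illustration.

```python
import sys
import xml.etree.ElementTree as ET


def feed_urls(root: ET.Element) -> dict[str, str]:
    """Map each feed URL in an OPML tree to a human-readable title."""
    feeds = {}
    # Apps nest <outline> elements differently (folders, groups),
    # so walk the whole tree instead of assuming a fixed depth.
    for outline in root.iter("outline"):
        url = outline.get("xmlUrl")
        if url:
            # Assumption: light normalization is enough to match feeds.
            key = url.strip().rstrip("/")
            feeds[key] = outline.get("title") or outline.get("text") or key
    return feeds


def diff_feeds(root_a: ET.Element, root_b: ET.Element):
    """Return (titles only in A, titles only in B), each sorted."""
    a, b = feed_urls(root_a), feed_urls(root_b)
    only_a = sorted(a[u] for u in a.keys() - b.keys())
    only_b = sorted(b[u] for u in b.keys() - a.keys())
    return only_a, only_b


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python opml_diff.py app_a.opml app_b.opml
    root_a = ET.parse(sys.argv[1]).getroot()
    root_b = ET.parse(sys.argv[2]).getroot()
    only_a, only_b = diff_feeds(root_a, root_b)
    print("Only in first file:", *only_a, sep="\n  ")
    print("Only in second file:", *only_b, sep="\n  ")
```

The whole comparison boils down to one loop and two set differences, so it takes a minute to read and sanity-check, and it gives the same answer every time you run it.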