It is known that the alignment tax affects smaller models, but in larger models (>~100B parameters) the "tax" starts to become negative, at least when training with RLHF:
https://arxiv.org/pdf/2204.05862.pdf
The largest Llama2 model has 70B parameters, and they ran the experiment with the 7B Llama2.
I would expect any pre-prompt or fine-tuning that is relevant to the subsequent prompts to improve the results (relative to having no pre-prompting or tuning).
Likewise, any pre-prompt or fine-tuning that is NOT relevant to the subsequent prompts adds unhelpful, highly non-contextual complexity to the model's job, so it is likely to reduce the quality of the response.
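To make that concrete, here is a minimal sketch (my own, not from the paper or the Llama2 experiment above) of how one could compare the same question under a relevant system prompt, an irrelevant one, and none at all. The model name, question, and system prompts are illustrative placeholders, and it assumes the Hugging Face transformers library:

```python
# Hypothetical experiment sketch: compare answers from the 7B Llama2 chat model
# under a task-relevant system prompt, an irrelevant one, and no system prompt.
# Model name and prompts are placeholders; judging quality is left to the reader.
from transformers import AutoTokenizer, pipeline

MODEL = "meta-llama/Llama-2-7b-chat-hf"
tokenizer = AutoTokenizer.from_pretrained(MODEL)
generator = pipeline("text-generation", model=MODEL, tokenizer=tokenizer)

question = "Explain in two sentences why the sky is blue."

system_prompts = {
    "none": None,
    "relevant": "You are a concise physics tutor. Answer accurately.",
    "irrelevant": "You are a medieval court poet. Never mention modern science.",
}

for label, system in system_prompts.items():
    messages = [{"role": "system", "content": system}] if system else []
    messages.append({"role": "user", "content": question})
    # Render the chat into the model's expected prompt format.
    prompt = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    out = generator(prompt, max_new_tokens=128, return_full_text=False)
    print(f"--- system prompt: {label} ---")
    print(out[0]["generated_text"].strip())
```

If the claim holds, the "relevant" condition should match or beat the "none" baseline on the question, while the "irrelevant" condition should visibly degrade it.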
I did not say they asked the AI to lie. I said they asked it not to give a correct and truthful response.
Maybe this explains much of the deterioration of ChatGPT that has been reported.