
Feb. 24 makes it four years since Russia invaded Ukraine. I thought I'd test 10 LLMs and see how they handled Ukraine-related topics and translations. Ukrainian is spoken by 30-40 million people, roughly the eighth-most-spoken language in Europe, so it makes an interesting test. I had a native Ukrainian-speaking friend review most of the outputs and weigh in on their accuracy.
A couple of conclusions after running these tests:
- The larger models definitely did better here--the smaller models were cheaper, but the tradeoff in correctness was too steep
- That said, the small models handled common phrases fine. The challenge, of course, is knowing in advance whether the answer will be correct.
You can see all of the prompts I tried, along with their outputs, in this publicly viewable Prompt Group.
Customer Support Template
The prompt:
Translate this customer support email template into Ukrainian, preserving the curly braced template variables:
Hi {firstName}, Thank you for your email, we'll follow up with you as soon as we've fixed the issue.
Thanks, {supportFirstName}
The first test was a basic customer support email template. The larger models did well, while the smaller models struggled a bit. This highlighted how important review is for text you'll use in a business context: either have a native speaker check the translation, or note alongside it that the text was generated mechanically and might contain errors.
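One cheap safeguard, separate from the human review above, is a programmatic check that every curly-braced variable survived translation (a minimal sketch; the function and sample strings are mine, not from the tests):

```python
import re

# Matches curly-braced template variables like {firstName}
PLACEHOLDER = re.compile(r"\{(\w+)\}")

def placeholders_preserved(source: str, translated: str) -> bool:
    """Return True if every {variable} in source also appears in translated."""
    return set(PLACEHOLDER.findall(source)) == set(PLACEHOLDER.findall(translated))

source = "Hi {firstName}, thank you for your email. Thanks, {supportFirstName}"
ok = "Вітаємо, {firstName}! Дякуємо за ваш лист. З повагою, {supportFirstName}"
bad = "Вітаємо! Дякуємо за ваш лист."  # model dropped the variables entirely

print(placeholders_preserved(source, ok))   # True
print(placeholders_preserved(source, bad))  # False
```

A check like this won't catch a bad translation, but it will catch the most common mechanical failure: a model translating or dropping the variables themselves.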
Famous Sentences
The prompt:
I'm looking for a good phrase in Ukrainian, that represents a well-known sequence or sentence, from a famous document. Something like this in English would be like "Four score and seven years ago". What would be an equivalent in Ukrainian, and what's the context around it?
This is a risky prompt! The likelihood of hallucination is high here: we're asking for some creativity from the LLM, in Ukrainian no less. Some of the models get things fairly wrong.
Several models suggest the first line of the Act of Declaration of Independence of Ukraine, which is a great idea. You can read it yourself on Wikipedia. But Olmo 3.1 32B Think, Kimi K2.5, and Qwen 3.5 397B A17B just straight up misquote the first line: sometimes completely changing it, other times getting the first few words right before going off the rails.
The larger commercial models do pretty well here, coming up with different ideas; and when they do suggest the first line of the Declaration of Independence, they get it right.
Also, I have to say that "the mortal danger hanging over Ukraine" is not messing around.
Translation Versus Meaning
The prompt:
Translate this into English:
Душу й тіло ми положим за нашу свободу.
Instead of "translate this into English" I also tried starting the prompt with:
What does this phrase mean:
(It's a line from the Ukrainian national anthem)
These two prompt approaches generated different results! "What does this mean" produced more context and, I think, a slightly better translation; the model had to reason its way through the phrase.
Note the thumbs-up and thumbs-down votes: I gave thumbs-down to translations that were plural instead of singular.
Just the Text, Please
Prompt:
Without preamble, commentary, or quotation marks, translate this into English:
I used this with the same line as above, from the Ukrainian national anthem. This got every LLM to just output the translated text.
One of the challenges with LLM responses is their tendency to include extra information you didn't ask for. They can't help themselves, as demonstrated with the Super Bowl questions.
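If you're feeding translations into a pipeline, a rough heuristic can flag responses that ignored the "no preamble" instruction (a sketch of my own; the function name and prefix list are assumptions, not from the tests above):

```python
def looks_bare(response: str) -> bool:
    """Heuristic: flag responses that still carry quoting or preamble."""
    text = response.strip()
    if text.startswith(('"', "'", "«")):
        return False  # model wrapped the translation in quotation marks
    if "\n" in text:
        return False  # multi-line answers usually mean added commentary
    if text.lower().startswith(("sure", "here is", "here's", "translation:")):
        return False  # classic preamble phrases
    return True

print(looks_bare("We'll lay down our soul and body for our freedom."))  # True
print(looks_bare('Here is the translation: "We\'ll lay down..."'))      # False
```

It's far from airtight, but paired with a retry it keeps the occasional chatty response out of downstream text.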
Common Phrases Work Fine
The prompt:
What does this phrase mean: Слава Україні
"Glory to Ukraine": all the models, even the super tiny Ministral 3B, got this right! If the phrase is common enough, it can work.
"Where is this from" vs. "What's the context for"
The prompt:
Where is this quotation from? "I don’t need a ride, I need ammo."
I also tried preceding the quotation with a prompt of:
What's the context for this quotation?
(It's a quote attributed to President Zelenskyy in the opening days of the invasion)
Opus 4.6, Sonnet 4.6, GPT-5.2, Kimi K2.5, Gemini 3.1 Pro Preview, and GLM 5 all did a nice job, with the "What's the context" prompt wording generating more thorough responses.
Olmo 3.1 32B Think, Ministral 3 3B, Qwen 3.5 397B A17B, and Gemini 2.5 Flash Lite, however, hallucinated the heck out of their responses. Interestingly, the "What's the context" version of the prompt generated different hallucinations than the first version.
So it isn't just that you're getting something wacky; you can get a different flavor of wacky every time.
Slang Worked Too
Prompt:
In what context would people use this phrase: "О, шахед, летить"
The large models did a surprisingly good job with slang too. That phrase really does mean "Oh, a Shahed is flying," a Shahed being the model of drone being launched into Ukraine.
See For Yourself
Thanks for reading, and again, you can see everything for yourself in this publicly viewable Prompt Group.
