Gemini 1.5 Flash, our new Catalan Hero
My project just wouldn’t exist without Google Gemini
I wrote in May about how AI models support the Catalan language to wildly varying degrees. I get it, it’s a small language, but it is still a letdown.
For almost a year, my side project has involved summarizing hundreds of thousands of files in Catalan. When I started, only OpenAI had decent support for the Catalan language, and I estimated it would cost hundreds of dollars a month, maybe even thousands. This was problematic for a hobbyist project: I’d rather not break the bank.
Towards the end of 2023 Google released a preview of Gemini 1.0 Pro with a very sweet deal: you could use it for free up to once a second, as long as they could use the data for their own training. It was perfect for this! Except, when I tested it, it could read and understand Catalan, but writing was a disaster.
So I came up with a scheme: I would ask Gemini 1.0 Pro to condense the long document into a short summary in Spanish (which is well supported), and then ask OpenAI to translate that short Spanish text into Catalan. As the text was much shorter, the OpenAI cost was also very low, under $50/month. It later became much cheaper still with the release of gpt-4o-mini.
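A minimal sketch of that two-step scheme, not the project's actual code. The names `summarize_to_catalan`, `summarizer`, and `translator` are hypothetical, the prompts are placeholders, and the two models are passed in as plain functions so that no particular SDK call is assumed.

```python
from typing import Callable

# Hypothetical type: any function that takes a prompt and returns the model's reply.
LLM = Callable[[str], str]


def summarize_to_catalan(document: str, summarizer: LLM, translator: LLM) -> str:
    """Long Catalan document -> short Spanish summary -> short Catalan summary."""
    # Step 1: the model with the generous free tier reads the long text and
    # writes in Spanish. Placeholder prompt, an assumption.
    spanish_summary = summarizer(
        "Summarize the following text in Spanish, in one short paragraph:\n\n"
        + document
    )
    # Step 2: the paid model only sees the short summary, so few tokens are billed.
    return translator(
        "Translate the following Spanish text into Catalan:\n\n" + spanish_summary
    )
```

In real use, `summarizer` would wrap a Gemini call and `translator` an OpenAI call. Keeping them as injected functions also makes the pipeline easy to test with stand-ins.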
The setup worked well, but had 3 issues:
- “Overtranslation”. Summarizing into Spanish and translating back to Catalan was problematic. If the original text had a bit in Spanish (for example, the title of a Spanish movie, which you’d typically not translate), a direct summarization into Catalan would have been able to maintain it as-is. However, when you create the summary in Spanish, the resulting text has no hint that the Spanish title is any different than the rest of the text. The final translation back to Catalan will also translate the title which should have been kept in Spanish, making it awkward.
- The pipeline becomes clunkier as you have one more layer that could hallucinate, and fail in different ways.
- The deal seemed too good to be true. How long would it last?
When my employer went public, they invited a Stanford emeritus professor to teach us the basics of financial literacy. His biggest piece of advice was: “If something seems too good to be true, it is probably because it is not true”. Brilliant. But I digress.
The Google deal was actually that good, but eventually they graduated out of preview and started charging for it. I had already (ab)used it a lot, but still had more to summarize. I ponied up the new fees for a few days and then switched to the newly released Gemini 1.5 Flash.
The main reason to switch to Gemini 1.5 Flash was the price, 80% less expensive than Gemini 1.0 Pro. That was fortunate. Then I realized something else: Gemini 1.5 Flash could write Catalan correctly! I had noted in my May post that the Gemini 1.5 Flash preview didn’t speak any Catalan, so it’s a welcome improvement since then!
The problems of overtranslation are now a distant memory, and the pipeline is more robust. Also, while Gemini is not free anymore, I can fit some of it into the free tier, and I’m looking at their batch API to keep the costs down. For a job that runs around the clock, their free tier works out to just about 1 request per minute. Their documents say it’s 15 per minute, but there is also a limit of 1,500 a day, and a day has 1,440 minutes.
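The free-tier arithmetic can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the two limits are the figures quoted above, and the helper name `sustained_rate_per_minute` is made up for illustration.

```python
MINUTES_PER_DAY = 24 * 60   # 1,440
PER_MINUTE_LIMIT = 15       # documented requests per minute
DAILY_LIMIT = 1_500         # documented requests per day


def sustained_rate_per_minute(per_minute_limit: int, daily_limit: int) -> float:
    """Requests per minute you can keep up for a full day without hitting either cap."""
    return min(per_minute_limit, daily_limit / MINUTES_PER_DAY)


rate = sustained_rate_per_minute(PER_MINUTE_LIMIT, DAILY_LIMIT)
seconds_between_requests = 60 / rate
# How long the daily quota lasts if you go at the full per-minute limit instead.
burst_minutes = DAILY_LIMIT / PER_MINUTE_LIMIT

print(f"{rate:.2f} requests/minute")            # 1.04 requests/minute
print(f"{seconds_between_requests:.1f} s gap")  # 57.6 s gap
print(f"{burst_minutes:.0f} min at full speed") # 100 min at full speed
```

So the daily cap is the binding one: either spread requests about a minute apart, or burn through the whole quota in under two hours.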
As for cost, $71 for 3 weeks is still higher than I want. I’ll keep pushing my creativity to bring this down.