
Launch HN: Lang (YC S19) – Internationalization Built for Devs - cyrieu
Hey HN! We’re Eric, Peter, and Abhi, founders of Lang (https://www.langapi.co). We help developers quickly translate their apps into foreign languages by combining internationalization SDKs with a command-line interface that integrates directly with human translators.

Previously, we all worked on building internationalization and localization tooling for companies. In our experience, companies don’t think about translation until too late, and the tech debt builds up very fast. It’s a nightmare to receive a task that says “translate app into Spanish.” Choosing the right open-source framework, refactoring the entire codebase, and integrating with human translators is a massive effort. As engineers, we wanted to work on features - not putting every string in our codebase into a translations.json file. In our months of internationalization work, we couldn’t find a good all-in-one toolkit. So we built Lang.

Like other internationalization libraries, Lang gives you a tr() function. Wrap your strings with tr(), and we’ll show your users translations that correspond to their language settings at run-time. But how do you actually get the translations? Open-source frameworks like Polyglot.js stop here, but Lang doesn’t. Run “push,” and our command-line tool will parse your code files, find tr() calls, collect newly added strings, and send them to human translators for you. For JavaScript, we use Babel to construct an Abstract Syntax Tree (AST) of your code, and traverse the tree to find tr()’d strings. For a developer, this makes it simple to add/remove/update strings: just run “push” in your terminal. You can track the status of your translations on our dashboard, and when they’re done just run “pull.” We’ll generate a translation file for you, and connect it with our tr() function. You own the file - Lang doesn’t make any network requests for translations at run-time, and your translations always load, even if our service is down.

This works for static strings in the code, but what about dynamic content in the backend or database? We expose a function called liveTr(), which takes a string argument. The first time liveTr() sees an untranslated string, it will make a request to Lang to translate it and return the string in its original language. But the next time, it will fetch the translation on demand. We’ve shipped liveTr() with built-in caching functionality to reduce the number of network requests. We also have self-hosted solutions for users with high uptime requirements. This is a common in-house feature companies build for internationalization, and we want to make it available to all devs.

Lang currently supports JavaScript and TypeScript apps (React, React Native, Vue, etc.) with closed betas for Django, Android, and iOS. Give us a try at https://www.langapi.co/signup - machine translations are free, so you can see your app in another language in minutes. If you use human translations, we charge $99/month for our tooling, and 6-8 cents per word translated. A lot of our work is inspired by open source, and we want to give back - if you’re building an open-source project or non-profit, ping us at eric@langapi.co. We’ll drop the monthly fee :)

The HN community builds amazing products, and we’re sure there are plenty of people here who have translated their apps - we’d love to hear your experiences in this area and your feedback on how we can improve!
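
[Editor's note: a minimal sketch of how a tr()-style lookup as described above might work. This is an illustration, not Lang's actual SDK; the shape of the translations table and the fallback behavior are assumptions.]

```javascript
// Hypothetical pulled translations file, keyed by locale and source string.
const translations = {
  es: { "Welcome to Lang": "Bienvenido a Lang" },
};

function tr(source, locale) {
  // Fall back to the original string if no translation exists,
  // so the app still renders even when a phrase is untranslated
  // or the translations file is stale.
  const table = translations[locale];
  return (table && table[source]) || source;
}

console.log(tr("Welcome to Lang", "es")); // translated
console.log(tr("Welcome to Lang", "fr")); // falls back to the source string
```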
======
codingdave
I like the idea, but you are putting more weight behind the tooling than I
would. I don't find translation tooling to be cumbersome, so especially if
machine translations are free, I don't find the price point for human
translations to be compelling.

What would be compelling is if you could pro-actively call out the bigger
gotchas in translation - grammatical differences that make you change word
orders, different mechanism for handling plurals, etc. If you could
preemptively warn us, even before a "push" that we may hit a problem, I'd take
a closer look. For example, flagging a line saying, "Hey, it looks like you
are using phrasing that will be problematic in <Italian/Hindi/Russian/etc.>
Here's why..."

~~~
mkycl
Mozilla’s Fluent project [1] seems like a really thoughtful and comprehensive
approach to many of these problems.

[1] [https://projectfluent.org/](https://projectfluent.org/)

~~~
abhisiv
We analyzed Fluent when we built the product and found that it didn't offer
much more than the ICU international standard. Our platforms have full support
for ICU syntax, including plurals.
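
[Editor's note: for readers unfamiliar with the ICU plural syntax mentioned above, an ICU message looks like `{count, plural, one {# item} other {# items}}`. The sketch below is a hypothetical selector, not Lang's implementation; it uses the standard Intl.PluralRules API built into modern JS runtimes to pick the plural category a library would select on.]

```javascript
function formatPlural(count, locale) {
  // Intl.PluralRules maps a number to an ICU plural category
  // ("one", "other", "few", etc.) for the given locale.
  const category = new Intl.PluralRules(locale).select(count);
  const forms = { one: `${count} item`, other: `${count} items` };
  return forms[category] || forms.other;
}

console.log(formatPlural(1, "en")); // "1 item"
console.log(formatPlural(5, "en")); // "5 items"
```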

------
davidzweig
We recently needed to translate our Chrome extension
(languagelearningwithnetflix.com) into 18 languages. We are poor and had less
than $1000 for the job.

Here's our approach.

1. Move all strings into a Google doc. Takes about 8 hours.

2. Organise strings into groups with screenshots, think carefully, split
strings, look for reused strings, reword things to make them simpler and
easier to translate, add notes to some strings to make the meaning more
explicit. Very tedious part of the job, 2-3 days of work.

3. Put the doc, editable by link, onto Upwork with a fixed payment, somewhat
generous for the translation wordcount. Check that the translator is a native
speaker of the target language, has some good feedback, and ideally some
IT/programming experience. Order translations for the languages we can check
ourselves (5-6 languages).

4. Check the translations received for issues. Translators typically
misinterpret the same things, as the source was not clear enough. Fix these
issues. Maybe 4 hours of work.

5. Now send to 10+ other translators for the languages we don't understand.
Cross fingers that these will be ok.

6. Check translations of labels for consistent usage of semicolons, capitals,
full stops, etc. Struggle with zh/ja/ko.

7. Use a small JS script to transform CSV output from Sheets to JSON for
chrome.i18n.

8. Cycle through all locales and check for overflowing text or other issues.

9. For any extra strings that we might need later, we can try the Microsoft UI
translations database, or else Google Translate (which is mostly ok; we can
check the reverse translation).

Honestly this all was quite a lot of boring work, but we probably ended up
with reasonable translations at a good price, and managed to pay translators
reasonable money.
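
[Editor's note: step 7 above could be sketched roughly as below. The column layout and naive comma-splitting are assumptions, not the author's actual script; a real Sheets export would need proper CSV quoting.]

```javascript
// Turn a "key,message" CSV export into the object shape chrome.i18n
// expects in a per-locale messages.json file.
function csvToChromeI18n(csv) {
  const messages = {};
  for (const line of csv.trim().split("\n")) {
    // Naive split on the first comma; fine for simple labels only.
    const idx = line.indexOf(",");
    const key = line.slice(0, idx);
    const message = line.slice(idx + 1);
    messages[key] = { message };
  }
  return messages;
}

const csv = "extName,Language Learning\nextDesc,Learn with subtitles";
console.log(JSON.stringify(csvToChromeI18n(csv), null, 2));
```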

~~~
earenndil
> Struggle with zh/ja/ko.

FYI, these are frequently called CJK (Chinese, Japanese, Korean). Although
that has more to do with the fact that each character takes multiple
keystrokes to type, and is thus harder to support, than with the locale
itself.

------
jedberg
This is very cool! It took us months to build the equivalent tooling for
reddit (except we were using crowdsourced translations instead of machine-based
ones).

Do your clients do local caching, or is my uptime dependent on your uptime
(unless I code in my own caching I suppose)?

~~~
cyrieu
Great question, and thanks so much for sharing your experience at reddit! For
static strings, we create resource files and download them directly into your
codebase, so your uptime is totally independent of ours once you deploy your
app.

For dynamic strings in the database/generated by users, our servers need to be
up to receive + handle the translation request, but that will be cached the
next time you try to look the translation up.
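
[Editor's note: the caching behavior described above might look roughly like this. This is a hypothetical sketch; the function names, cache shape, and request mechanism are all assumptions, not Lang's actual client.]

```javascript
// Cache of already-fetched translations, keyed by locale + source string.
const cache = new Map();

function liveTr(source, locale, requestTranslation) {
  const key = `${locale}:${source}`;
  // Cached: serve the translation with no network round-trip.
  if (cache.has(key)) return cache.get(key);
  // First sighting: kick off a translation request in the background
  // and fall back to the original language for this render.
  requestTranslation(source, locale).then((t) => cache.set(key, t));
  return source;
}
```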

~~~
jedberg
I'm glad to hear you guys put thought into reliability from the start!

------
osrec
The Hindi example on your front page is translated in a grammatically
incorrect manner.

Instead of saying "Lang me aapka swaagat hai", you've got "me aapka swaagat
hai Lang".

It's like saying "Lang Welcome to".

Nice idea though, if you can iron stuff like the above out.

~~~
cyrieu
Great catch, thank you! We'll update the Hindi example :)

------
kgodey
Does using the tooling depend on being translated through Lang? We are going
to be setting up new internationalization tooling soon but have a robust human
translation community already that we wouldn't want to bypass.

~~~
abhisiv
Our tools send the strings to a dashboard. From there, you can have your own
translators add translations and pull them back in. We have developed flows
which make it easy for your own translators to go through your phrases, see
the context (description + screenshots), and add appropriate translations.

Sending translations to the agencies we've partnered with is completely
optional.

------
geoffreyy
That's a great product! But... I can't change the language of your website?!

~~~
abhisiv
Our site is internationalized with our own product, and we have Spanish
translations. If you change your Chrome language setting to Spanish, you can
see it. We plan to add a manual language toggle soon.

------
scrollaway
I've worked extensively on internationalization before (on Pootle,
specifically).

The good:

The library's API seems pretty well-thought-out. A good i18n API in JS/TS is
highly needed, even more so one that works well with React. I use i18next in
my projects but it's mediocre, although I don't know that the difficulties I
end up facing with it wouldn't show up here.

The bad:

Pricing. Sorry but translation services are extremely competitive, and players
that have been around a long time such as Crowdin, Transifex and Weblate have
the benefit of being already trusted by name by a huge community of devs and
translators.

You also talk about open source a lot, but I'm disappointed your web tooling
doesn't seem to be open source and self-hostable. This is one point where you
really could differentiate yourself.

The ugly:

It looks like you've pretty concretely tied your i18n API and your translation
UI together. I can't see your UI or whether it's any good, but I'm likely to
want to use your API with a different translation service, or your translation
service with a different API.

Also, please, Google oauth is basically a requirement for any b2b service.

(I'm happy to give more thoughts on a video/screenshare chat if you like, feel
free to reach out, email in my profile; always want to help new players in the
i18n space)

------
khalilravanna
This looks super cool. I got a bunch of questions!

1. Am I correct in understanding this is meant as a client-side only solution
for now? Right now we have a pretty complicated translations process that
needs to support translations that are spread across the client and server.
Would this support a hybrid approach like that?

2. Another question I have is where does the `translations.json` file come
from and where is that stored? Is that just generated by the CLI and then we
have to deal with serving that however we want?

3. Is there one `translations.json` file per language? One with all of them?
Are there performance concerns with sending large files like that over to the
client? This is a general question for me to other developers of large sites:
how do you deal with _tons_ of translations?

4. Any plans to support existing translations? E.g. if I have an existing set
of translations keys and values can I plug those in somewhere? I know y'all
are bootstrapping so it wouldn't surprise me if that's a Future feature.

Again, love the idea of this, and it would be super cool if this solves the
problems due to complexity we currently have with supporting a ton of
languages.

--

For some background, our current solution involves generating a rather large
(> 1MB) `translations.json` file for each language, which we serve to the
client via a CDN. It's a typical map of keys to values.

We create the keys ourselves as we go along, something like
`dashboard.salesCard-helpText`. Then we have a kludgy Drupal instance to
populate the key and value, plus some tagging to show it needs to be
translated. Translations get entered into that Drupal instance. All of this is
entered manually. Then that gets used to generate the `translations.json` file
I mentioned earlier.

We have plans in the future to overhaul the process.

~~~
abhisiv
1. We support both client- and server-side frameworks (Django, Python,
Node.js). We're adding support for more as fast as we can, including Java and
Rails.

2. The `translations.json` file is autogenerated by the CLI and updated each
time you pull translations. It's automatically bundled in the deploy process,
so you don't need to do any extra work.

3. Currently we have a single translations.json file. Space hasn't been an
issue yet, but we plan to add splitting to reduce it. For dynamic content,
which can be large, we have solutions where we can serve the content from a
CDN. We could also give clients a microservice if they would like to
self-host, or directly update a cache on disk on the deployed machines. Still
experimenting with the best/easiest way to do this.

4. Sure thing - we have a file upload in our dashboard right now, but we want
to add this to our CLI to make it more accessible.

I understand your frustration, and one of our core philosophies is to do away
with keys completely. Our keys are auto-generated and never touched by the
user. The code can contain the actual text, which is a lot more readable. Ping
me at abhi@langapi.co and I'd love to help solve your problems.

------
felipap
Congrats on the launch! I can totally see the need for this. What (if
anything) would you say has changed in the recent years to make this
technologically feasible for the first time? Have you looked into how the
latest advances in NLP eg. GPT-2 has the potential to substitute translators
entirely?

~~~
cyrieu
Thank you for the kind words! Two main reasons for our approach:

1) JavaScript-based apps these days have complex rendering logic that makes
the HTML-parser method of finding and translating strings infeasible. Every
company we've worked at has needed to extract each string and wrap it with a
special `translate` function in their codebase.

2) We make heavy use of the Babel and TypeScript compilers to work with
JavaScript ASTs, and there's been huge progress on those recently.

We've thought about NLP, but quality is a huge concern of ours, and we're not
quite ready yet to roll that out to companies. If that's something you're
interested in, would love to chat, send me an email at HN username + gmail!

------
betimsl
I have a small framework I built for developing small web apps faster, and it
also has a translation engine very much like this. Basically, it takes:

    <span class="text-muted">$T(Detajet e blerësit)</span>

and translates it to:

    buf.WriteString(T(`Detajet e blerësit`))

T is a function that takes a string and returns a string from a static map
that gets built when the app starts or when somebody translates a label.

~~~
abhisiv
Cool stuff! I would love to hear more about it. Please ping me at
abhi@langapi.co if you want to talk.

------
mandatory
Correct me if I'm wrong but wouldn't I still have to take the time to wrap
everything in the codebase? I feel like that's the majority of the painful
work.

~~~
abhisiv
Developers with large code bases have complained that it takes weeks to
manually wrap all their strings. We have an experimental tool for ReactJS
which can wrap all the front facing strings with our function instantly. It's
currently in beta but we recently used it to onboard a codebase with over 1000
strings in half an hour. We're also looking to build this auto-wrapping
service for other frameworks.

Ping me at abhi@langapi.co if you want a demo of this beta tool or if you want
it for another framework.

------
mping
Hi, we developed a similar solution, minus the handling of translations itself
- fortunately we have in-house people who can supply that.

A few questions:

- Some translations are a bit context dependent; what happens if I don't agree
with the translation?

- Sometimes we do some kind of media localization, e.g. users in France will
see a different image than users in Portugal. Are you a translation-only shop,
or do you plan to do some kind of l10n?

Best of luck!

~~~
abhisiv
Good question, we let you provide as much context to the translators as you
want (description, tone, and screenshots). If you still don't like the
translation, you can comment and it will be redone taking your comment into
account. Additionally, if you have employees who want to review the
translations, they can do that on the platform.

We currently plan to handle localization of text (regular, dates, currency,
time, gender, plurals, etc.). Handling images is interesting though; I would
love to explore that more if it's a pain point for you. Ping me at
abhi@langapi.co - I'd love to talk!

------
smitop
Couldn't a bad actor abuse liveTr() and call it with a ton of random strings
to make me pay to translate a ton of garbage data?

~~~
cyrieu
A bad actor could only abuse liveTr() to spend your money if they had access
to your API keys, which should be kept secret. In these cases, you'll also be
able to report bad strings coming in through your dashboard, and we'll take
care of the bad actor and refund you immediately!

------
pfista
Looks great! It'd be cool if the homepage was a live demo that allowed you to
toggle the language.

~~~
abhisiv
Our framework automatically checks the browser language preferences. Change
your browser language preference to Spanish and refresh to see the
translations. We'll also add the language toggle soon to make it more obvious.

------
AlchemistCamp
I use gettext for whatever back-end language I'm using. I much prefer handling
this sort of thing from the back-end. That way it doesn't matter what the
front-end is. It just works.

~~~
abhisiv
For frameworks with gettext, we integrate with it! Gettext will extract the
phrases into a .po file, but these still need to be translated. Running
'langapi push' will push out all the .po files in the codebase, and running
'langapi pull' will pull the translated versions back in and automatically
save them in the correct locations.

------
polskibus
I'd love to see support for other server-side languages, like C# (preferably
something that can blend in with existing resource files) and Java.

~~~
abhisiv
We're actually building out support for Java right now and would be more than
happy to explore other server-side languages you're interested in. If you have
more specific requests ping me at abhi@langapi.co and I'd love to talk!

~~~
polskibus
As I mentioned - C#. The translation mechanism uses so-called "resource" files
with a master file (English) and secondary files for other languages.

See
[https://docs.microsoft.com/en-us/windows/win32/intl/preparing-a-resource-configuration-file](https://docs.microsoft.com/en-us/windows/win32/intl/preparing-a-resource-configuration-file)
for more information.

------
whouweling
Very interesting. A question about the pricing page: it's unclear to me what
is meant by "phrases" in the context of "100 Phrases" for the $99 plan.

~~~
abhisiv
That's great feedback -- we're still working on the best way to do pricing. A
phrase is any text wrapped with our 'tr' function. We use phrases for pricing
currently because it's an industry norm but we're open to changing it if it
causes confusion.

~~~
whouweling
I think the confusion for me was that there is also a pay-per-word price;
given that, the limit on the number of phrases doesn't feel logical. The
number of tr() calls, for me as a developer, also depends quite a bit on how I
have set up the translation, and differs per project.

I would expect to pay a monthly fee for the online service plus a fee for each
word. Also, 100 unique tr()'s seems a bit limiting? (Even smaller projects of
mine quickly get > 1000 separate tr calls.)

------
aaron425
This looks great & hi Abhi! Will this support left script languages?

~~~
abhisiv
Hey! Requesting and receiving translations works with any language, including
right-to-left (RTL) script languages. In the future, we want to solve the page
layout problem for RTL languages, as we've seen it's a big pain point.

------
jechiu
Cool I’d like to use this for my startup. Could I give you a call

~~~
cyrieu
Absolutely, email me at HN username + gmail and I can onboard you :)

------
mdrx
Wow, this is amazing!

~~~
abhisiv
Thanks :)

