Running Lisp in Production (grammarly.com)
397 points by f00biebletch on June 26, 2015 | 114 comments



We also deploy distributed, multi-language servers built around Lisp/SBCL. A few specifics that I'd point out:

Many of SBCL's optimizations are selectable at a fine grain using internal SB-* declarations. I know I was at least able to turn off all optimizations for debug/disasm clarity, while specifically enabling tail recursion so that our main loop wouldn't blow up the stack in that build configuration. These aren't in the main documentation; I asked in the #sbcl IRC channel on Freenode.

You can directly set the size of the nursery with sb-ext:bytes-consed-between-gcs, as opposed to overprovisioning the heap to influence the nursery size. While we've run in the 8-24GB heap ranges depending on deployment, a minimum nursery size of 1GB seems to give us the best performance as well. We're looking at much larger heap sizes now, so who knows what will work best.

While we haven't hit heap exhaustion conditions during compilation, we did hit multi-minute compilation lags for large macros (18,000 LoC from a first-level expansion). That was a reported performance bug in SBCL and was fixed a while back. Since the Debian upstream for SBCL lags the official releases quite a bit, it's always a manual job to fetch the latest versions, but quite worth it.

Great read, and really familiar. :-)


We played with sb-ext:bytes-consed-between-gcs, but couldn't find the right balance. That's why we were surprised by the result of the oversized heap experiment.


You might want to do some googling about it, but I also seem to remember that SBCL might decide to use large or small MMU page granularity depending on the size of the heap. That might be the watershed trigger for performance. (or just some random mis-remembered nonsense)


Can you also point to the particular version of SBCL that fixed the long-running compilation? We have recently upgraded to one of the latest versions, but I think I've missed this change - I'm interested to check it in more detail.


I think it was back in the 0.54-0.59 version range.


That's much older than what we used recently, so it should be a different thing.


Common Lisp's macros and grammar go together like bread and butter. A grammar module in the app I built [1] uses macros to generate huge amounts of repetitive code.

[1] https://github.com/tshatrov/ichiran/blob/master/dict-grammar...

I wonder if they're still hiring Lispers. I once passed on the opportunity to work in their Kiev office, but I might give it a shot again.


Wow, I checked out your ichi.moe app. It's awesome! I'm definitely going to use this as I work on re-learning the Japanese I've forgotten over the years. I've done some Racket and Clojure but never done CL... I'll have to check it out.


Thanks for pointing this out. I'm learning Japanese now and this will be massively helpful.


Found my reading material for tonight. Thank you.


They are hiring - contact kevin dot mcintire at grammarly dot com


Great article, and good reminder on using trace. Every time I rediscover trace, I can't remember how I ever forgot to use it in the first place for most of my problems.

I used CL in a production environment a while back for a threaded queue worker and nowadays as the app server for my turtl project, and I still have yet to run into problems. It seems like you guys managed to push the boundaries and find workable solutions, which is really great.

Thanks for the writeup!


Slightly off-topic, but does anybody know of a kind of "LISP Challenge" set? I recently started the Matasano challenges[0] and I found them really well-suited to my style of learning (learning by doing and expanding by reading relevant material, enabled by my own internal motivation). Is there anything like that that has a relatively small set of condensed yet rich challenges that demonstrate key elements from LISPy functional programming? I read some of SICP but reading long form really puts a damper on my motivation/excitement. Also there were a lot of exercises (with a lot of overlap in concepts) so I didn't know what to do and what not to do, since I wasn't about to do every single one. Any pointers would be much appreciated!

[0] http://cryptopals.com


Off the top of my head:

The Praetorian Bootcamp challenges are similar to the Matasano ones: https://www.praetorian.com/challenges/

Lisp Koans: https://github.com/google/lisp-koans

Project Euler: https://projecteuler.net/

Exercism: http://exercism.io/

99 Lisp Problems: http://www.ic.unicamp.br/~meidanis/courses/mc336/2006s2/func...


I've been using python-koans to teach Python, but the thought of looking for lisp-koans on GitHub didn't even cross my mind now that I've started learning CL myself. Thanks for these excellent links!


The Lisp Koans, Exercism, and 99 Lisp Problems look especially good for FP concepts (and the first two are more modern than I had expected). Thanks!


Hi, I don't know if it fits what you want, but there is something like that for Clojure [0]. I liked the language a lot, though I haven't had the chance to use it in production yet. Another option is HackerRank, with its challenges [1].

[0] https://www.4clojure.com/

[1] https://www.hackerrank.com/


http://exercism.io supports Common Lisp, Scheme, Clojure, and Emacs Lisp, plus a ton of others, and runs through a neat little CLI app locally, so you can use your own editor (I've been doing Rust challenges in Emacs, frex.)

http://www.codewars.com/ also supports Clojure, and Haskell, which is not a Lisp but is FP.

Hacker Rank pretty much supports everything, but the reason for this is that it handles all the tests through stdio instead of a test suite, resulting in a lot of irritating boilerplate code.


Well, that stdio thing is actually what I love about Hacker Rank :) I don't think the boilerplate code is a problem, if you use a reasonable language. Racket is a pleasure to use over there, Clojure slightly less so but wouldn't call it irritating...

But yeah, I've seen some Java solutions on HR...


I just wind up getting irritated having to rewrite the same code all the time. Some popular languages will generally come with a pre-filled template with the I/O already covered, but quite a lot of the FP stuff didn't.


Awesome stuff. Articles like this are what we Lispers/Schemers need to show that our languages can be used for "real work"(tm).


One of the things I would have liked to see in the article is how they handle the deployment itself. Do they build an executable with buildapp? Do they use sb-daemon? A home-grown solution using sb-posix:fork?


I don't know how they do it, but I have run several Lisp production systems. I use CCL, which has a really fast compiler. So I just load the code from source. The deployment process then becomes: git pull, quit and restart Lisp. (The ccl-init file loads the code.)


Oh cool! So on the server you stop the image that's running, then restart the CCL image, and then load in the source (like starting a REPL on the server and doing a "load ./start-file.lisp")? (P.S. Sorry for the rookie questions, I am quite new to this and very interested.)


Yes, exactly. (And no worries about rookie questions. That's how people become non-rookies.)

In my case, I run the REPL using screen so I can go back to it for debugging if I need to. But you can also run "headless", without a REPL at all if you want.


How do you deal with existing connections to the Lisp service when you want to deploy?

E.g. remove from load balancer, wait up to n minutes for connections to drain, git pull, quit, restart Lisp, re-add to load balancer?


Yep, pretty much. I've only ever dealt with low-load applications so there was no load balancer. You just put the Lisp server in a state where it refuses new connections (it serves up a "temporarily down for maintenance" page, or you reconfigure the nginx front end to do that), wait for existing connections to finish, and then restart. It's no different from any other server.
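
The refuse-new, wait-for-existing, then-restart sequence can be sketched roughly as follows. This is a minimal illustration in Python rather than Lisp, and `DrainGate`, `try_enter`, `leave` and `drain` are hypothetical names, not part of any server mentioned in this thread.

```python
import threading


class DrainGate:
    """Hypothetical helper: tracks in-flight requests and refuses new ones
    once draining has started."""

    def __init__(self):
        self._cond = threading.Condition()
        self._in_flight = 0
        self._draining = False

    def try_enter(self):
        """Return True if the request may proceed, False if we are draining."""
        with self._cond:
            if self._draining:
                return False  # caller would serve the "temporarily down" page
            self._in_flight += 1
            return True

    def leave(self):
        """Mark one request as finished and wake anyone waiting in drain()."""
        with self._cond:
            self._in_flight -= 1
            self._cond.notify_all()

    def drain(self, timeout=None):
        """Stop admitting requests, then wait for in-flight ones to finish.

        Returns True if everything finished, False if the timeout expired.
        """
        with self._cond:
            self._draining = True
            return self._cond.wait_for(lambda: self._in_flight == 0, timeout)
```

A request handler would wrap its work in `try_enter()` / `leave()`, and the deploy script would call `drain(timeout=...)` before quitting the old process. The timeout covers the "wait up to n minutes" case from the question above.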


And in that case how do you handle the monitoring of the system?


This is why I switched from running daemons in tmux to having them start a slime listener; I can monitor them from any standard process monitor that will log errors and restart (I use daemontools with a run-script that sends me an e-mail to notify me of the restart).
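
At its core, a process monitor of this kind is a restart loop. The following is a minimal Python sketch, not daemontools itself: `supervise` and `FAILING_CHILD` are hypothetical names, and the loop only reacts to a non-zero exit status (it does not detect a hung process, log errors, or send e-mail).

```python
import subprocess
import sys
import time

# Hypothetical stand-in for a real server command: a child that exits with status 1.
FAILING_CHILD = [sys.executable, "-c", "raise SystemExit(1)"]


def supervise(command, max_restarts, delay=0.0):
    """Run `command`, restarting it whenever it exits with a non-zero status.

    Stops when the command exits cleanly or after `max_restarts` restarts.
    Returns the number of restarts performed.
    """
    restarts = 0
    while True:
        status = subprocess.call(command)  # blocks until the child exits
        if status == 0 or restarts >= max_restarts:
            return restarts
        restarts += 1
        time.sleep(delay)  # back off before starting the child again
```

A real supervisor would usually restart without a cap and add a notification hook at the point where `restarts` is incremented.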


I don't understand the question. Once it's running, it's like any other server, and you monitor it like you would any other server written in any other language.


Speculation here, but based on the context, I think the question is: Some daemon-izer solutions will monitor the daemon and e.g. restart it if it's unresponsive. How do you handle this slash how is this handled in the Common Lisp world?


As a last resort, if Lisp becomes totally unresponsive, you kill the process and restart, same as any other language. But it's pretty rare to lose the REPL, so usually you can fix any unexpected problems through that.


There's a reference to upstart in the article. We have played with daemonizing SBCL (there are a couple of projects out there), but then Grammarly as a whole moved to upstart-based deployments. They are really easy to manage: basically, you just give it a normal (Lisp) script.


Something I need to get around to doing is playing with Docker and SBCL - it gives you daemonization out of the box. Have you tried that?


Yes, we're in the process of moving to docker-based deployments now. In fact, we already have ClozureCL running inside docker, but haven't yet done the same for SBCL. TBD soon :)


Is it worthwhile to explore Clojure for web-dev seriously or more as a toy?


It's extremely serious.

The deployment is the very well understood JVM and standard web servers. There's a common HTTP middleware framework (Ring), lots of choices for HTML generation and ClojureScript allows some code-sharing with your client side (compiles to JS).

On the back-end, you could use any Java library. On the front-end, you could use any JS framework (e.g. see https://github.com/omcljs/om)


I suspect swannodette (dnolen)'s talk at EuroClojure should be a pretty compelling showcase of how Clojure is quickly blossoming in the web-dev space.


It was a cool talk, but it didn't really address adoption.

It featured (to the best of my memory):

A new data model for om next. Instead of cursors, components have a composable datomic style DSL to locate their data, which is pluggable and can be from local memory or a cached server side query.

Using this approach you can request data in new ways from the client without adding server side code.

An update on cljs in cljs (small example but showing lots of plumbing work done)

The path to react native, repls and dynamic code loading on the device (was demoed) via ambly


I think it is one of the best stacks out there for web.

Check out this:

https://github.com/bhauman/flappy-bird-demo

It uses figwheel for the dynamic changes of state when you change the code, and it renders it without reload, à la Bret Victor. The first time I saw it I was amazed. It speeds up prototyping so much.

The out of the box performance is decent as well. Ring and Hiccup are pretty lean, but you can go for heavier frameworks (I don't have any experience with those).

I personally use Reagent + Ring, found it easy to use and get productive in a day.

If you haven't used LISP or any homoiconic language before it might look a little weird at first, but I found it easy to explain to people.

About the toy part, what is your definition of toy?


So how do you start using this example?


You need leiningen (it's in most package managers):

https://github.com/technomancy/leiningen

Clone the repo, `lein figwheel`, http://localhost:3449/index.html


I would do something like this on my macbook:

git clone blah && cd blah && brew install leiningen && lein figwheel

I don't know what your operating system is, or other details.

You need leiningen (the make of Clojure) and a clone of the repository, and you are good.


Seriously. I have my misgivings with Clojure in comparison to other Lisps, but it's a serious programming language.


I'm the opposite. The idea that Clojure is just enough Lisp - that it strips down some of the syntax a bit - is one of the benefits, IMO.


I've been using Clojure in production systems for 5 years now. Average of 15 req/s over 5 years - I'd say that's pretty serious.


I vote that it's worthwhile as well. We've been using Clojure full-time at ReadyForZero for over 3 years now (for web-dev as well as other tasks), and are very happy with our choice.


Have you ever used anything like Rails? I come from that background and Clojure doesn't seem as straightforward, for better or worse. But I like the beauty of the language.


Mmm, I've never used Ruby or Rails professionally, so I can't really speak to that. Prior to Clojure, I mostly worked with C++, Java, C#, and Python.

At its core, Clojure is basically just data and functions. Once the switch flipped in my mind, I found that to be simpler and more straightforward than any of the languages I worked with before.

When the majority of your code is side-effect-free, pure functions (and Clojure makes it very easy to structure your code this way), it becomes very easy to reason about. And very easy to test, with minimal ceremony!

P.S. When I was learning Clojure, I kinda devoured all the resources I could find. A couple interesting looks at the language from the perspective of Rubyists:

http://briancarper.net/blog/536/clojure-from-a-ruby-perspect...

http://confreaks.tv/videos/rubyconf2009-clojure-for-ruby-pro...


Is it better than plain Java? I mean, if you have the entire Java ecosystem, is it worth it to go upstream just to work in a Lisp-like language?


IMHO, absolutely. There are many wins, but the 2 biggest for us have been:

1. Clojure's tools for making abstractions are orders of magnitude better than Java's, so our codebase is significantly smaller and simpler than what it'd look like in Java.

2. Clojure lends itself very well to a REPL-oriented kind of iterative development that speeds up writing and testing code tremendously, compared to the typical Java workflow.


Writing Java with Clojure is actually kinda fun, and macros let you abstract it away when the ceremony gets thick.


Is there a Clojure/Clojurescript tutorial similar to the Hartl Rails tutorial? I have amassed quite a few Clojure books, but find the easiest way to learn is to build.


This might be a good starting point if you're looking for a simple example: http://www.luminusweb.net/docs


Apparently they use "JVM languages", JavaScript, Python, Go, Lisp and Erlang in production.

I may be in the minority, but that would drive me mad. I assume they're not routinely jumping between those stacks multiple times a day, but even so is there really that much benefit that it's worth keeping track of how to do things in that many different environments?


I think SICP states it best:

"In our study of program design, we have seen that expert programmers control the complexity of their designs with the same general techniques used by designers of all complex systems. They combine primitive elements to form compound objects, they abstract compound objects to form higher-level building blocks, and they preserve modularity by adopting appropriate large-scale views of system structure. In illustrating these techniques, we have used Lisp as a language for describing processes and for constructing computational data objects and processes to model complex phenomena in the real world. However, as we confront increasingly complex problems, we will find that Lisp, or indeed any fixed programming language, is not sufficient for our needs. We must constantly turn to new languages in order to express our ideas more effectively."

At a certain level of software design tying together different programming languages that are each the right tool for their job becomes just another type of programming. I currently do data science work, but even then in a given week I typically use R, Python, Lua and Java (and often Scheme in the evenings for fun). Trying to make any one of those languages do something the other is much better at is a phenomenal waste of time.

On the system level, once prototyping ends, if there's something that Java does phenomenally better than R, but we need both, that implies that you have two parts of the system different enough that they shouldn't be tightly coupled anyway. If you write a deep learning algorithm in Lua, but want to do some statistical analysis on the results of that in R, it's good to force these things to be separated because if in 5 years you find a better model for the Lua part (maybe some better algorithm in Julia) you want to be able to swap it out anyway.


While a Single Language to Rule Them All would be cool, I ultimately prefer using the "best" language for the job based on specific requirements.

The "best" might change over time, too.

It can be a headache to manage massively-polyglot environments. At the same time, it's also pretty great for a variety of reasons. I mean, we regularly use different data stores, messaging solutions, frameworks, etc. and I don't see why languages shouldn't also be up for shuffling.


> I ultimately prefer using the "best" language for the job based on specific requirements.

Most shops don't think like that. Using too many languages often quickly becomes unmanageable.

I wouldn't use 6 different languages in the same project UNLESS they are part of different server/CLI tools that work in isolation. I don't need to have a deep understanding of Go to use Docker, I don't need to be a PHP expert to deploy Drupal or WordPress, I don't need to know Ruby to use Vagrant, nor Java to use Cassandra. So using these tools in isolation is fine inside the same project.


As for me, adaptability is one of the important traits of a senior engineer. Surely, you don't have to be an expert in every platform, but you also shouldn't go mad if you need to do some work outside of your comfort zone occasionally. Besides, every language has its strong and weak points: if you're putting arbitrary limits here, you're just limiting what you can do and the people you're going to get in a team. At Grammarly, we have always erred on the side of more freedom and it has worked reasonably well for us so far. Although, there are different companies, each with a unique story...


Since "proper service encapsulation" is mentioned, it may be that each team uses whatever they like, and as long as your component speaks http you don't have to look at what other components are doing.


This is exactly how it is (except sometimes it is not HTTP but message queuing etc)


That makes sense. Having spent the last five years on a development team that has only recently grown to four people, I can forget that not every team needs all their developers to be able to work on any project!


To me, JavaScript, Python, and Go aren't all that different.

You must know JavaScript in this day and age given its de facto presence on the web.

Python is my go-to language, especially for mathematical analysis. I can do everything in Python that I used to need Matlab for. From my point of view, pick your favorite modern scripting language and run; the differences really are the libraries, not the languages.

I have used Go, but Go just doesn't do it for me. It doesn't offer me anything I can't get, better, in another language, especially if I can choose among Python (smaller headspace), Erlang (way better concurrency) or Lisp (way better abstraction power).

Erlang is my go-to language for concurrency. Once I architect it in Erlang, I probably understand the problem.

Lisp is useful when my problem requires powerful abstraction. Otherwise, it gets in the way because people can't resist using that power. Clojure has changed my opinion on this quite a bit, but I don't yet have a big project that fits in its space.


I suggest you look at http://lfe.io - it's an interesting take on LISP...


Very informative. Thank you.

Does anyone here have any experience with the GCs of Allegro or LispWorks or any other commercial Lisp implementations?


LispWorks has a lot of features in its GC. Years ago it was used in a demanding telco application, an ATM switch. It was also used on a space mission experiment. Generally the runtime is very very nice.

Franz Inc uses Allegro CL in a large database. They tuned the GC quite a bit for that. But there were also other GC demanding applications on Allegro CL, for example in CAD and 3D design. They are now working on a concurrent GC, something which is still rare in the Lisp world.


Who used LispWorks in an ATM switch? My very first job was working on ATM/IMA switches.


Lucent. The switching nodes were running LispWorks and the control system was running Allegro CL + OO database. That's quite some time ago... ;-)


Why are concurrent GCs rare?


Mostly because there are very very few people in the world who are able to develop such a complex thing for Lisp (or similar runtimes). Since the market is relatively small and many applications are not overly concurrent, there is very little money to support the development.


I am guessing[1] that GCs are easier to code correctly without the concurrency and that a GC language is already expected to be slower so it doesn't make sense to do a concurrent GC. Also possibly, the language doesn't support concurrency well. Like a Python or Ruby.

[1] just an educated guess. I have no real knowledge of GCs other than skimming how they work in articles and runtime/language docs.


Are there many platforms, besides the JVM and .NET, that have good-quality concurrent GCs?


The Erlang VM (BEAM) is another one, at least.


It doesn't actually implement concurrent GC, although what it does implement is far simpler and has a similar effect (low latencies) to concurrent GC.

Each Erlang process has a separate heap that is collected independently; because the process heap is usually small a stop-the-process collection does not take much time.

The downside is that sending messages between processes requires copying all the data that is sent between process heaps.
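
The trade-off described above (independent per-process collection in exchange for copying on every send) can be shown with a toy model. This is a Python sketch under loud assumptions: `Process`, `send`, `receive` and `collect` are hypothetical names, and nothing here reflects how BEAM is actually implemented.

```python
import copy
from collections import deque


class Process:
    """Toy 'process' with a private heap and a mailbox (not real BEAM semantics)."""

    def __init__(self, name):
        self.name = name
        self.heap = []          # objects owned only by this process
        self.mailbox = deque()  # messages waiting to be received

    def send(self, target, message):
        # Copy the message so sender and receiver never share mutable data.
        # This copy is the cost mentioned above.
        target.mailbox.append(copy.deepcopy(message))

    def receive(self):
        """Take the oldest message and store it on this process's own heap."""
        message = self.mailbox.popleft()
        self.heap.append(message)
        return message

    def collect(self):
        """'Collect' only this process's heap; other processes are untouched."""
        freed = len(self.heap)
        self.heap.clear()
        return freed
```

Because no object is reachable from two heaps, collecting one process never needs to pause or inspect another, which is why a simple stop-the-process collector still gives low latencies.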


I've worked a lot with LispWorks and tuning the gc had the same method of programmatically calling a full GC after every N operations. We also found setting the gc threshold to a high amount helped a bunch.

Supposedly they have a concurrent GC in the works but I haven't played with it.
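
The two knobs described above (a full collection every N operations, plus a high threshold so automatic collections run less often) have rough analogues in Python's `gc` module. The sketch below uses that module only as a stand-in; it is not the LispWorks API, and `PeriodicCollector` and `raise_young_threshold` are hypothetical names.

```python
import gc


class PeriodicCollector:
    """Run a full collection after every `every` operations."""

    def __init__(self, every):
        self.every = every
        self.count = 0
        self.full_collections = 0

    def tick(self):
        """Call once per unit of work (for example, once per request)."""
        self.count += 1
        if self.count % self.every == 0:
            gc.collect()  # with no argument, this collects all generations
            self.full_collections += 1


def raise_young_threshold(threshold0):
    """Make automatic young-generation collections rarer.

    Returns the previous thresholds so the caller can restore them.
    """
    old = gc.get_threshold()
    gc.set_threshold(threshold0, old[1], old[2])
    return old
```

The idea is to pay for collection at moments the application chooses (between requests) instead of whenever the allocator happens to cross a threshold.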


Thanks.

> I've worked a lot with LispWorks and tuning the gc had the same method of programmatically calling a full GC after every N operations.

What is "full GC", here? Do you mean, even the "older" generations? (Assuming LispWorks also has a generational GC a la SBCL.)

In other words, would it have helped if the implementation was a mark-compact rather than generational?


We normally used

http://www.lispworks.com/documentation/lw60/LW/html/lw-712.h...

with the full set to nil.


Aw, now they have disclosed their secret weapon! [1]

[1] http://www.paulgraham.com/avg.html


> but we value choice and freedom over rules and processes.

Which is exactly why I feel Lisp doesn't see much use elsewhere :(


Good lord, I would go insane if I ran into a bug like this:

"We've built an esoteric application (even by Lisp standards), and in the process have hit some limits of our platform. One unexpected thing was heap exhaustion during compilation. We rely heavily on macros, and some of the largest ones expand into thousands of lines of low-level code. It turned out that SBCL compiler implements a lot of optimizations that allow us to enjoy quite fast generated code, but some of which require exponential time and memory resources. "


Actually these errors are relatively easy to spot and often there are solutions for that. Sometimes a compiler might need to be hacked on. The good thing: the compiler is written in Lisp and debugging is possible.

The more nasty errors are lurking for example in the GC... there we move into C and assembler land...

Most platforms have nasty errors. With popular platforms one can hope that many of these have been found and somebody has fixed them already. With languages/runtimes which are not so widespread in production one is more likely to find these problems oneself. Especially in more complex runtimes.


Yeah, but that's something you need to be prepared for. Such bugs happen in literally every platform (for instance, I had similar trouble with the JVM). So the question is not how to avoid them, but how to cope. Usually, there are 2 ways:

- workaround

- investigation (and in this case, if you're on a closed platform you're busted)


It'd be a Hell of a lot more interesting than the millionth time looking into a bug caused by invalid input into a non-validated form in an ancient CRUD intranet app.


At least those are usually pretty quick to fix.


That makes the optimistic assumption that someone will fix them.


You can write C++ template metaprogramming code that will send your compiler into a coma, if you're not careful. Code that pushes the boundaries is bound to push through them occasionally.


You know, when a paragraph starts with "We've built an esoteric application", would you expect what follows to be a simple bug?

Write an "esoteric" app and you'll start hitting the limits of your platform.


I don't really understand the downvotes... I agree with you; I would go nuts too.

My two cents: if your macro is expanding to thousands of lines of code, you're doing something wrong, I think. I would expect macros to expand out to a few lines of code which might have function calls that themselves may contain however many lines of code... but expanding to thousands of lines INLINE via macros seems wrong.


Some macros expand to a lot of code, especially when doing more in-depth transformations such as those performed by core.async in Clojure, where they transform standard sequential code into a state machine with exactly the same semantics but with the added ability to execute, yield and resume like a coroutine.
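
To see why such a transformation inflates code size, compare sequential code with a hand-written state machine that behaves the same way. This is only a Python illustration of the general idea, not core.async's actual output; `sequential` and `StateMachine` are hypothetical names.

```python
def sequential():
    """Ordinary top-to-bottom code that can pause at each `yield`."""
    total = 0
    x = yield "need x"
    total += x
    y = yield "need y"
    total += y
    return total


class StateMachine:
    """Hand-written equivalent: one numbered state per pause point."""

    def __init__(self):
        self.state = 0
        self.total = 0  # locals of the sequential version become fields

    def step(self, value=None):
        """Resume from the saved state; return ("yield", msg) or ("done", result)."""
        if self.state == 0:
            self.state = 1
            return ("yield", "need x")
        if self.state == 1:
            self.total += value
            self.state = 2
            return ("yield", "need y")
        if self.state == 2:
            self.total += value
            self.state = 3
            return ("done", self.total)
        raise RuntimeError("state machine already finished")
```

Every pause point becomes its own branch and every local variable becomes saved state, so even a short sequential body grows several times over once it is rewritten this way.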


I assume their macros do compilation of an expressive DSL to efficient low level code, as one would expect in their domain. So keep in mind that this is something pretty fancy you can pretty much only do with Lisp (or by writing a full blown compiler).


You're getting a lot of pushback on this, but I think it's a legitimate question at the very least. In this situation I would wonder whether some procedural abstractions could be introduced to cut down the size of the expansions.

And before anyone jumps on me -- I have 35 years of Zetalisp and Common Lisp experience and have written some pretty hairy sets of macros. There might turn out to be no easy way to make the expansions smaller, but there's nothing wrong with asking the question.


I have pointed to the call-with-* style, which is a general "best practice" for that (it's even mentioned in the Google CL Style Guide). However, expanding to low-level stuff also has its benefits for a clearly delimited space (mainly, performance) if you know what you're doing.



Yeah, it's fairly easy to fix this (and you can even do it without unwinding the stack!). There are tons of SBCL global variables which you can tweak on-the-fly.


I love it :0) I tried Grammarly, and typed in a remembered poem. It informed me that it had detected significant plagiarism.

Edit: It's still not advice I would pay for, though.


Thanks, great post and a lot of useful references.


This was fantastic. Thank you.


Wow. I can't ever remember reading about a consumer-facing app using Common Lisp. Ever.


> I can't ever remember reading about a consumer-facing app using Common Lisp.

There was ViaWeb: http://www.paulgraham.com/avg.html

I'm told that Orbitz uses or used it a lot too.


You're using one ;)


Nope, HN is written in Arc (which runs on top of Racket) :-p


Once you go deep enough you are fucked no matter what language you choose. Might as well pick one that doesn't beat you up too much.


Why not Racket?


I was thinking the same question and the reply would probably be "performance". I don't think untyped Racket can currently compete with SBCL and I'm not sure they'd be interested in Typed Racket. But I'd love to be proven wrong on both of those things ;)


Ok that makes sense.


But that's not the reason. Personally I find Lisp superior in most respects, but I don't think it's a good place and time to argue on Lisp vs Racket and similar topics.


Why Racket?


Seconded. Why not Racket?


[deleted]


Well, the HDF5 problem was actually not on the Lisp side ;) But, in general, do you really believe that there are no issues with libraries in other languages? I've had my share in Python or on the JVM, as well. The whole point in the article was to show that there are some challenges, but they didn't become critical to our operation.


All right, I'm deleting my comment. I don't want to start a Clojure vs CL flame war. That wasn't the intent and I thought I'd made it very clear.


So you would pick a different language based on the lack of an existing Jenkins interface library? They would lack 90% of Common Lisp then. Not a good trade if you ask me.

Edit: I assume you are down-voted because most of your post is wrong information.


Huh? I said nothing about picking a language only based on a Jenkins interface library.

I just noted that every problem they had which they noted in this post seems to arise from the gnarliness of the CL ecosystem, and wouldn't happen with Clojure. Jeez. Where is the "wrong information" in that.


Gnarly CL ecosystem? That seems a tad Manichean. Clojure certainly has more libraries, but CL has its fair share as well. And if you want to tap into the Java ecosystem there is always ABCL.

Clojure is a language with different design sensibilities than CL.


I assume they have their reasons. Speaking of which, it would be interesting to hear why they chose the particular flavour that they did.



