Rust FFI: Sending strings to the outside world (thefullsnack.com)
22 points by huydotnet 131 days ago | 19 comments



It's nice to see a demonstration of how to do this by hand! But at work, we use Neon https://github.com/neon-bindings/neon to write production Node.js modules in Rust.


I appreciate the OP's desire to learn how Node's FFI works, but for production purposes, using an existing project like Neon is almost certainly the right answer. Interfacing interpreters with native code can be hairy and no two interpreters are alike, so going it alone means spending a lot of time learning the nuances and pitfalls of each individual system. Neon is still young, but it's from Dave Herman, the head of Mozilla Research, so I'm hopeful it has a bright future ahead of it.


FWIW, ffi is just a third-party node module, which makes it pretty similar to Neon. Both are solutions that require you to write a bit of glue code and both have their caveats/pitfalls/landmines. Which one is the "right answer" is situational and neither is universally correct. For instance, if you needed to call into both Rust and C/C++, ffi might be a cleaner solution since your JavaScript could basically be the same for both. Using ffi can also be more dynamic than Neon, so if you wanted to do something akin to Java's URLClassLoader that downloads and runs native code at runtime, using Neon would make that a bit harder. The main difference between the two is where you write your glue code...with Neon, you write your glue in Rust and with ffi you write in JavaScript.

BTW...I'm a big fan of Neon and have even sent in a PR for an issue I found. I've used it to write Google Cloud Functions without having to write a single line of JavaScript and it's been pretty easy to work with. Rust is the perfect language for Cloud Functions/Lambda since those services are billed per CPU time/memory and Rust has very little overhead in those regards. Writing services that use the minimum allowed resources makes using these serverless platforms very price competitive with running your own servers, even when you reach a pretty considerable load. I do find the current direction Neon is headed (using macros to essentially write JavaScript code in Rust) to be not particularly useful (I'd rather my Rust code look like Rust), but the project is already at a point where I can build around it, so I'm not complaining.

Anyway, your comment seemed to imply that ffi was part of the Node platform and that it wasn't appropriate for production, and I disagree with both of those. ffi is just a different approach that's suitable for different situations.


    #[no_mangle]
    pub extern fn string_array() -> *const *const u8 {
        let v = vec![
            "Hello\0".as_ptr(),
            "World\0".as_ptr()
        ];
        v.as_ptr()
    }
This has exactly the same problem as with `CString`; the `Vec` is deallocated at function exit.
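
One way around it (a minimal sketch, keeping the names from the snippet above) is to hand ownership of the pointer array to the caller with `Box::into_raw`; the string literals are `'static`, so only the array itself needs to outlive the call, and the caller eventually has to pass the pointer back to Rust to be freed:

    #[no_mangle]
    pub extern fn string_array() -> *const *const u8 {
        let v = vec![
            "Hello\0".as_ptr(),
            "World\0".as_ptr()
        ];
        // into_boxed_slice + Box::into_raw transfers ownership of
        // the pointer array to the caller instead of dropping it
        // when the function returns.
        Box::into_raw(v.into_boxed_slice()) as *const *const u8
    }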


You're right. I just ran the code again and it produces something funny:

   [ '��\u0002\u0002', '8+���~', buffer: <Buffer > ]
I will update the post. Thanks for pointing it out! :D


And there is a memory leak in all of that.


I was going to ask, how does Node then know to free the string when all JS pointers to it are gone?

... guess it doesn't?


When the program exits memory will be reclaimed by the kernel. Memory leaks aren't horrible if they don't appear in loops or recursive functions (that is to say the leaked allocations do not grow with the uptime of the program). In this case it's pretty much identical to allocating memory that has the lifetime of the program. Pretty obvious stuff but it's important to keep in mind.


True, but I'd imagine the primary use case for NodeJS wrapping a Rust binary is a long-lived server process, where I'd expect this to actually matter. (Even on Lambda it might matter, given container reuse.)


Absolutely.


It might be good to mention how to free the memory, by passing the pointer back to a Rust function like:

  use std::ffi::CString;
  use std::os::raw::c_char;

  #[no_mangle]
  pub extern fn free_string(p: *mut c_char) {
    unsafe {
      // from_raw takes ownership back, and the CString is dropped
      // automatically at end of scope, so an explicit drop() call
      // is unnecessary.
      let _ = CString::from_raw(p);
    }
  }

Also, the `into_raw` method on CString gives you a pointer while consuming the CString (handing ownership over to the raw pointer), so mem::forget isn't necessary if you use `into_raw` rather than `as_ptr`.
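
For completeness, here's a minimal sketch of the allocating side that pairs with `free_string` above (the name `get_string` is just for illustration):

  use std::ffi::CString;
  use std::os::raw::c_char;

  #[no_mangle]
  pub extern fn get_string() -> *mut c_char {
    let s = CString::new("Hello from Rust").unwrap();
    // into_raw hands ownership of the buffer to the caller; every
    // pointer returned here should eventually come back through
    // free_string.
    s.into_raw()
  }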


> First, we store the Pointer of s string in a variable (p).

> Then we use std::mem::forget to release it from the responsibility of Rust.

CString::into_raw does that whole song and dance for you.

For Vectors you probably want to convert them to boxed slices first, at which point you can use Box::into_raw (to get the slice's head pointer).
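
A minimal sketch of that pattern, with illustrative names (`make_values`, `free_values`) and an explicit length parameter:

  #[no_mangle]
  pub extern fn make_values(len_out: *mut usize) -> *mut i32 {
    let v: Vec<i32> = vec![1, 2, 3];
    let boxed = v.into_boxed_slice();
    unsafe { *len_out = boxed.len(); }
    // Box::into_raw yields the slice's head pointer and keeps the
    // allocation alive until Rust reconstructs it below.
    Box::into_raw(boxed) as *mut i32
  }

  #[no_mangle]
  pub extern fn free_values(p: *mut i32, len: usize) {
    unsafe {
      // Rebuild the boxed slice so it is dropped normally.
      let slice = std::slice::from_raw_parts_mut(p, len);
      let _ = Box::from_raw(slice as *mut [i32]);
    }
  }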


I think the article inadvertently reinforces the idea that it's really only practical to scale code that segregates processes and uses streams instead of shared memory. New languages should probably focus on things like the Actor model and optimize stream passing with things like copy-on-write so that no data is actually copied until it's mutated.

For this and many other reasons (mainly pushing the onus of managing memory onto the developer instead of automating it) I can't really get behind Rust. I think many other languages like Swift are also barking up the wrong tree by being too pedantic.


Rust puts a reasonably large amount of effort into making shared memory concurrency safe. Move semantics, borrow checking and the Send/Sync traits combine to eliminate/reduce many of the pitfalls (e.g. data races) which motivated people to move away from shared memory. Additionally, Rust's semantics mean that things like actors/message passing and copy-on-write can be very efficient (e.g. move semantics means any copy-on-write overhead doesn't need to exist if the code will statically never read/write concurrently): http://blog.rust-lang.org/2015/04/10/Fearless-Concurrency.ht...
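
As a toy sketch of the move-instead-of-copy point (not from the article): sending a Vec through a channel moves it, so no defensive copy is needed, and the compiler rejects any later use from the sending side.

  use std::sync::mpsc;
  use std::thread;

  fn main() {
    let (tx, rx) = mpsc::channel();
    let data = vec![1, 2, 3];
    thread::spawn(move || {
      // `data` is moved into this closure and then into the
      // channel, not copied; using it afterwards in main() would
      // not compile.
      tx.send(data).unwrap();
    });
    println!("{:?}", rx.recv().unwrap());
  }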

Of course, the fairly explicit nature of Rust is annoying/inappropriate for some tasks which don't need quite that level of control, but I think you're missing the power Rust brings to concurrency. (Of course, whether this particular article uses that power is a different question.)


These problems (and the resulting mistakes) wouldn't happen between Rust threads, because the type system is explicitly designed to handle that safely. It's only when you introduce type-unsafe boundaries that these problems occur, and the best solution is typically a wrapper that creates a type-safe abstraction. Supposedly Neon handles this for Node.


I'd recommend against using `std::mem::forget`. Even though it does pretty much the same thing you should probably use `some_cstring.into_raw()` instead. Semantically it's indicating that you are passing ownership of the string to the caller, while `forget()` just indicates that you want to leak the thing.
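
A small sketch of the contrast, with illustrative function names:

  use std::ffi::CString;
  use std::mem;
  use std::os::raw::c_char;

  // With forget(), the transfer of ownership is only implied.
  #[no_mangle]
  pub extern fn leak_with_forget() -> *const c_char {
    let s = CString::new("hello").unwrap();
    let p = s.as_ptr();
    mem::forget(s);
    p
  }

  // With into_raw(), the transfer is explicit, and the pointer can
  // be reclaimed later with CString::from_raw.
  #[no_mangle]
  pub extern fn leak_with_into_raw() -> *mut c_char {
    CString::new("hello").unwrap().into_raw()
  }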


To the author: it would be nice if your site had a small-screen layout.

I can't read the article on my phone because too much code is cut off (and the right margin is unpleasantly close to the main text, unlike the left margin).


Hey, thanks for the feedback, and apologies for that. I made a quick fix to the style; the code is scrollable on mobile now :D





