I've used Swig for C++ to Python interop and let me warn you it is a nightmare. The syntax is absurdly terse and convoluted, documentation sucks and good luck trying to search the web for constructs made only of special characters.
CLIF is way nicer for C++/Python. If you need other languages then best of luck (Zig, Rust and D are easier than Java and Go at least).
SWIG was created 27 years ago. I've used it when it was the only game in town and I'm grateful for it. Now there are many other options, at least for Python. Use the one that suits you best.
Right, I'm sure the project achieved great things and I didn't mean to belittle that.
But today I would not recommend anyone start using it if they have a choice, and it seemed prudent to post that warning on a submission of Swig's homepage without context.
That was exactly my experience writing custom GNU Radio blocks for my master's thesis: swig mangling my C++ variables. I have nothing positive to say about it.
The syntax is terse if you need to do arbitrary stuff with an arbitrary interface. But if you start off keeping SWIG in mind from the beginning and avoid funky features at API level you might be fine by just "importing" your header files in the SWIG-definition file.
> But if you start off keeping SWIG in mind from the beginning and avoid funky features at API level you might be fine by just "importing" your header files in the SWIG-definition file.
This has been my experience as well. Just like I would argue that code which has been written to be testable is better than code which hasn’t, a “public API” which has been written to be consumed by another tool is better than one that hasn’t.
I don't fully agree. Returning unique_ptr, optional or some error sumtype (absl::Status) is perfectly fine non-funky API, but it will require special Swig handling to rewrap.
pffffffffffffff yea right. maybe if you're building some tiny project on like 1 platform. anything serious and you're immediately diving through github issues for solutions to problems and poring over the source of both cibuildwheel and pip.
I don't know if it's a feature of Swig, but Swig C++ modules are also hard to use from Python, because they're missing reflection, docstrings, etc. I don't know if it was a shortcoming of the Swig module I was working with (e.g. the author didn't bother to add docstrings) or of Swig itself, but it was pretty uncomfortable working with a module that provides zero runtime documentation and incomplete written documentation.
Also, the structure of the resulting module, classes, and methods was kind of unpythonic.
CLIF is nice but not under public development and not really getting feature improvements. Its direct connection to LLVM is both a very big strength--it gets industrial-grade C++ parsing from a standards-compliant C++ parser--and its biggest weakness--building and using it requires sticking with its favored version of llvm sources.
All of that works, especially for Google internally, which uses CLIF quite extensively, but that makes it not so great for external projects.
For Java (JNI), using SWIG was a very pleasant experience for me. It allowed me to abstract away shared logic to C++ that would otherwise have to be implemented in both Java and Objective-C.
I deeply appreciate comments like this. It's frustrating when a tool sounds like the perfect tool for a job, you sink a bunch of time into it, only to discover this kind of scenario. Sometimes there are early indications, but often not. Thanks for sharing your lived experience.
Yeah, I'm never doing that again. I created a binding for the Irrlicht rendering engine with Lua using swig. There were always parsing errors with the header files so I had to create definition files for almost everything. It was a lot of work, but I did finally get it to function.
And then a new version of Irrlicht came out, and I basically had to start from scratch. All the definition files no longer matched the headers so they had to be recreated.
And then a new version of Lua came out and broke modules. The binding was written as a module so that had to be redone too.
I got way more familiar with the internals of swig than I ever wanted to. Each language it supports has its own little parser/runtime integration, so I had to learn a lot about how swig and lua interact.
If I were to do it again I'd create a proper C binding to the C++ interface, and then bind that to whatever language I wanted. They almost all support C. C++ is a whole other beast.
That's the generally recommended way of exposing your C++ library to any kind of non-C++ code.
I'm not aware of any software which directly helps with that, unfortunately. You either do it manually or write a custom script. Here's a recent example of the latter, from Dear ImGui:
“I'm not aware of any software which directly helps with that, unfortunately.”
If the API surface is not too large that’s one of the things ChatGPT is usually competent enough to ‘type out’ for you most of the way (delta some errors).
Agreed. I had working Swig bindings for Lua to a C data engine. While it worked, it was opaque, required a lot of diving into the Swig code generators, added its own special footguns, and felt like the most dangerous edge in the project.
The next time I wanted to bind Lua to a C++ application, I just hand wrote the bindings. Boring, repetitive, shovel code. Once I had a template, it would take me just a few minutes to hook up new interfaces. If there was ever a problem with one, it was usually a typo and took 30 seconds to find and fix.
In my personal experience, to use a very small C library in python in a work project, my life became simpler when I ditched swig and just called the C API from python.
Then I had a pure python codebase that I could run on any python version without recompilation, and I no longer had the complication of having to deal with swig's different versions across different distribution releases.
For much bigger projects I could imagine that the situation can be different.
You have several options for calling C functions directly from Python without having to start a separate process and communicate with it. For simple jobs when you just want to call a function or two, ctypes from the standard library is a great choice. You load a dynamic library, define the argument and return types of the function you want to call, and call it. It can even handle callbacks from C code into Python.
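For what it's worth, here is a minimal sketch of that ctypes workflow: load a library, declare argument and return types, call a function, and pass a Python callback into C. It assumes a POSIX-like system where the C standard library can be located and loaded; `strlen` and `qsort` are just convenient stand-ins, and `compare_ints` is a name made up for this example.

```python
import ctypes
import ctypes.util

# Assumption: a POSIX-like system where the C standard library can be loaded.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare argument and return types so ctypes converts values correctly.
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

length = libc.strlen(b"hello swig")

# qsort expects a comparison callback; CFUNCTYPE describes its C signature.
CompareFunc = ctypes.CFUNCTYPE(
    ctypes.c_int, ctypes.POINTER(ctypes.c_int), ctypes.POINTER(ctypes.c_int)
)


def compare_ints(a, b):
    # a and b are pointers to int; index 0 dereferences them.
    return (a[0] > b[0]) - (a[0] < b[0])


libc.qsort.argtypes = [
    ctypes.POINTER(ctypes.c_int),  # array to sort
    ctypes.c_size_t,               # number of elements
    ctypes.c_size_t,               # size of each element
    CompareFunc,                   # comparison callback
]
libc.qsort.restype = None

values = (ctypes.c_int * 5)(5, 1, 4, 2, 3)
libc.qsort(values, len(values), ctypes.sizeof(ctypes.c_int), CompareFunc(compare_ints))
sorted_values = list(values)
```

Nothing here needs a compiler on the machine running the script, which is the main appeal. The cost is that every declaration is written by hand, and a wrong `argtypes` entry fails at runtime rather than at build time.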
When you have a lot of C functions to call, having to duplicate all the declarations in Python code gets old. That's where the third-party cffi module comes in. It can parse straightforward C headers and use the appropriate types automatically. However, for complex headers (lots of macros, for example), it has to run a C compiler, which requires the machine which will run the script to have a functioning C toolchain. When using cffi, it can also be pretty easy to crash the interpreter with a use-after-free bug.
If you're already invested in the Cython ecosystem, you can use that for calling into C too. However, getting into Cython just to call a few C functions would be cracking a nut with a sledgehammer.
Finally, the nuclear option is to write a C extension module. Extension modules receive unconverted Python objects directly and can deeply integrate with the CPython interpreter. However, they also need to be compiled for a specific version of CPython, and won't necessarily work in alternative Python implementations. If you can get the job done with ctypes or cffi, you should avoid extension modules.
Using a subprocess and pipes wouldn't work with a .so file. I'd need to write a C wrapper to implement a way to call functions and marshall parameters in binary…
CPython is written in C, and has an extensive API, also in C. A good portion of CPython's stdlib is implemented in C. It's fairly easy to write an extension lib in C or C++ or Rust, even, and have python dynamically load the library. Shared libs have to be linked differently than they normally are as ld.so needs to be able to locate and load any dependencies. Libs are typically referred to as 'dlo's instead of 'so's to make the distinction.
There is also gobject-introspection[0], which is also capable of generating bindings for quite a lot of higher level languages from (gobject based) c-code.
I'm always impressed by how simple it is to use an object I defined in c within python.
I know there are many newer tools with various advantages. But I've used SWIG bindings in production for almost 20 years and some of the code is still live. An amazing achievement of the SWIG developers. So the stability and longevity are big advantages as well as the ability to adjust how the C++ code is exposed without actually changing the C++ sources.
Swig is an interesting piece of work. As a project I think it's quite brilliant, I wrote a one-part-of-three-but-never-finished-the-other-two blog post illustrating how nifty I thought it was [1].
In my old age though I've soured on it a little. The original problem I used it for, generating bindings for hundreds of generated pure aggregate data structures (for a serialized data format), is kinda SWIG's best case scenario.
The more complex (or more modern) your code gets, the more involved the types, the more the inherent complexity of the problem will overwhelm even the mighty SWIG and you'll find yourself wishing to use the native C ABI of whatever language you're binding to.
That said, if you're trying to generate bindings for several languages, I think working with SWIG still might be a worthwhile effort.
If you're just targeting python, and it's your first venture into building python extensions, it's fantastic. After a while, it's worth trying out building an extension without the use of swig. Much of the time, it's easier to use than swig is - but you do have to implement all the bells and whistles yourself. If you're writing in C or C++, chances are you like that kind of thing anyway.
I've used it previously, and the issue is that while it "works", it's a nightmare to maintain, and the resulting code is not idiomatic in the high-level language which makes it difficult to use.
I used this years ago and the comments reminded me why I never used it again. The common theme in the comments is that it’s a nightmare. I do recall it being ungodly prickly to get working and to debug.
Nowadays, unless it is required for performance, I'm setting up gRPC to handle library calls. I’m that lazy. Lazier than a node developer that imports the world for a hello world health check.
SWIG directors are awesome. It's so powerful having some functions implemented in C++ and some in Java in the same class, and able to call each other willy-nilly.
Swig is kind of clunky. But it is so much better than writing JNI yourself. JNI is gouge-your-eyeballs-out awful; SWIG is at least tolerable, and it works.
I took a different approach. Because I only needed to support these two languages, there’s no separate interface definition language, and no code generator for interfaces. Instead, users are expected to write both language projections manually.
Then there’s a runtime code generator on the .NET side of the interop which builds runtime callable proxy types for interfaces implemented in C++, also virtual tables for C# objects consumed by C++.
I much prefer to design a C interface by hand around c++ before ever considering exposing to a high level language. Usually it ends up being cleaner to implement the higher language wrapper in that language around the C interface and use idioms that are native to the high level language.
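A rough sketch of what the high-level side of that can look like, in Python with ctypes. There is no real C facade over C++ to load here, so libc's `FILE*` functions stand in for an opaque-handle C API (create, use, destroy). `ScratchFile` and its method names are hypothetical, and a POSIX-like system is assumed.

```python
import ctypes
import ctypes.util

# Assumption: POSIX-like system. libc's FILE* stands in for the opaque handle
# that a hand-written C facade over a C++ class would typically hand out.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

libc.tmpfile.argtypes = []
libc.tmpfile.restype = ctypes.c_void_p
libc.fputs.argtypes = [ctypes.c_char_p, ctypes.c_void_p]
libc.fputs.restype = ctypes.c_int
libc.rewind.argtypes = [ctypes.c_void_p]
libc.rewind.restype = None
libc.fgets.argtypes = [ctypes.c_char_p, ctypes.c_int, ctypes.c_void_p]
libc.fgets.restype = ctypes.c_void_p
libc.fseek.argtypes = [ctypes.c_void_p, ctypes.c_long, ctypes.c_int]
libc.fseek.restype = ctypes.c_int
libc.fclose.argtypes = [ctypes.c_void_p]
libc.fclose.restype = ctypes.c_int

SEEK_END = 2


class ScratchFile:
    """Hypothetical wrapper: the C create/use/destroy cycle as a context manager."""

    def __init__(self):
        self._handle = libc.tmpfile()
        if not self._handle:
            raise OSError("tmpfile() returned NULL")

    def write(self, text):
        # Error codes from C become Python exceptions.
        if libc.fputs(text.encode("utf-8"), self._handle) < 0:
            raise OSError("fputs() failed")

    def first_line(self):
        libc.rewind(self._handle)
        buffer = ctypes.create_string_buffer(256)
        got = libc.fgets(buffer, len(buffer), self._handle)
        libc.fseek(self._handle, 0, SEEK_END)  # reposition so later writes append
        return buffer.value.decode("utf-8") if got else ""

    def close(self):
        if self._handle:
            libc.fclose(self._handle)
            self._handle = None

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False


with ScratchFile() as scratch:
    scratch.write("hello from C\n")
    line = scratch.first_line()
```

The handle never leaks into user code, and cleanup follows the `with` block rather than relying on callers to remember a destroy call. That is the kind of native idiom a generated binding usually does not give you for free.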
Swig was quite nice and I used it often when I was doing some larger TCL projects. Didn't take very long to easily integrate many of the C (or FORTRAN libs wrapped by C) into TCL (or specifically, TK).
It made it easy to put serviceable graphical interfaces on some of the CLI apps we were using.
I got a great speedup using C++ with pybind11 for my Python backend, 30x if I remember correctly. It was for an app that detects whether words rhyme. I was about to release it but ditched it as soon as ChatGPT came out (it can find rhymes easily, and sometimes better).
I don't think this is the right way: interop with the C ABI, and for the hercules who want to do that with c++, should be native to such programming languages, not from a project on the side.
Agreed. And I think approaches like python or node, where you need to write some c code against their API, is getting it wrong also. The most smooth interop experience I've ever had is with c#. You just export normal c functions from a dll, and the c# code can call it. All the awkward bits like "how do I marshal a string" happens in the host code, not the native module. The overall experience of that is soooo much better. It also has the added benefit that if you just want to load a third party dynamic lib and call a few functions, you don't necessarily need to write any native code at all. You can just... do it. Then you don't need to set up a native build system, etc. It's really, honestly superior.
Are you sure python needs that? I heard about ctypes.
A python native module may be providing python semantics on top of a dynamic library (libressl), but if some python code decides to dynamically load the shared library to make direct C ABI calls, I guess this is built-in, isn't it?
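For reference, a small sketch of that built-in route using ctypes from the standard library, with the string marshaling done entirely on the Python side. It assumes a POSIX-like system where libc can be loaded; `strtol` is only a convenient example, and `parse_int_prefix` is a hypothetical helper name.

```python
import ctypes
import ctypes.util

# Assumption: a POSIX-like system where the C standard library can be loaded.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# long strtol(const char *str, char **endptr, int base);
libc.strtol.argtypes = [
    ctypes.c_char_p,
    ctypes.POINTER(ctypes.c_char_p),
    ctypes.c_int,
]
libc.strtol.restype = ctypes.c_long


def parse_int_prefix(text, base=10):
    """Hypothetical helper: parse a leading integer, return (value, rest)."""
    # Host-side marshaling: str -> bytes. Keep a reference in `raw`, because
    # the end pointer written by strtol points into this buffer.
    raw = text.encode("utf-8")
    end = ctypes.c_char_p()
    value = libc.strtol(raw, ctypes.byref(end), base)
    rest = end.value.decode("utf-8")
    return value, rest


result = parse_int_prefix("42 apples")
hex_result = parse_int_prefix("ff", 16)
```

No native code is written or compiled here. The awkward parts, such as encoding and keeping the buffer alive while C still points into it, live in the host language.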
It’s highly context dependent. I’ve used stuff like Swig before now to create code that can be shared between multiple different environments. In that scenario the more host side code you’re writing the less effective your solution.
It is. Python, Java, etc have built-in ways to interface with native code. SWIG is a framework which sits on top of those, and aims to make it easy to do that interfacing across multiple languages.
I used Swig to connect our C++ API to Ruby back in... '02 (wow, so 21 years ago, time flies!). It worked great. We could try things out in IRB (the Ruby REPL) and it allowed us to quickly write unit tests in Ruby (this was well before things like gtest).
There are probably better/easier ways to do that now, but back then Swig was the only game in town and for us it worked out pretty well.
Reminded me of the time when I had to write a piece of performance-sensitive code in C++ but I decided that unit tests are not performance sensitive so I wrote them in Go. Naturally I used SWIG. It worked quite well since the C++ code exposed a very simple interface.
I thought I could leave it that way until one day a manager came and questioned why we had terrible unit test coverage. It turned out we had separate tools for measuring code coverage in C++ and Go and it was impossible to make them work together.
Curious about the different reasons people are creating Python bindings. Was it because the project started in Python, then when performance became an issue the slow parts were rewritten in C++? Or did it start in C++ and Python bindings were added later?
i have several compiler projects that are 99% C++ (fully functioning APIs etc) but also have python bindings. why? python bindings are by far the best APIs to expose to whatever C/C++ code you have that you want "casual" devs to use. who are casual devs? either people that write code non-professionally or professional devs that pick up your project outside of their day-to-day. there is nothing that beats the ease of use and (today) familiarity that people have with python.
If we want to be pedantic, C++ is a high level programming language. Newer languages like Python just provide a higher degree of abstraction, so you could call them "higher-level programming languages".
We tried using this to automate bindings for U3D (u3d.io) and it was a nightmare. Documentation sucks and the syntax doesn't help with making things any easier.
Who classifies C++ as a low-level programming language? Even C is only being called low-level relatively recently, and it's not universally accepted as such.
You could argue <language X> IS NOT (high-level) does not necessarily imply <language X> IS (low-level), in which case no-one classified C++ as low-level.
I guess for the large part the inter-op here is from (mostly) interpreted languages, which are certainly higher level than C++.
Maybe instead of it being a binary choice between high-level and low-level languages, there is a middle ground. Fairly easy to argue that's the case.
The way the title is written does imply C and C++ are not high-level languages. "Connect C/C++ programs with high-level programming languages", not "connect C/C++ programs with other high-level programming languages".
After reading the first sentence on the site, though, I think this was just a somewhat clumsy rephrasing. "SWIG is a software development tool that connects programs written in C and C++ with a variety of high-level programming languages." "A variety of high-level programming languages" implies there are some high-level languages that SWIG cannot connect C/++ with, and it's open to interpretation whether C/++ are among the high-level languages that SWIG doesn't connect C/++ with.
> instead of it being a binary choice between high-level and low-level languages, there is a middle ground.
There is: mid-level languages. I was taught (and still think of) C as a mid-level language; things like assembly are low-level languages; C++, Python, Java, etc., are all high-level languages.
But I'm a graybeard, and these are the meanings that I've learned. It's possible that the definitions have shifted and nobody told me.
That doesn't seem like a useful dividing line, if only because it consigns a lot of different languages, with a very wide array of different levels of abstraction, as being "low level". That makes the term "low level" very nearly worthless.
I thought the "level" of a programming language was referring to the level of abstraction it presents rather than how memory allocation is performed. Low-level would be very close to machine language. C and C++ have very different levels of abstraction, for instance, and being able to express that seems useful.
A language and its libraries will inevitably allow a whole range of abstractions. Also, just about every piece of software has layers above and below. Hence my subjective qualification: "to most contemporary programmers" - to a bunch of whom everything below the Javascript engine is low level!
> A language and its libraries will inevitably allow a whole range of abstractions.
True, but that's a different sort of "abstraction" than what I'm talking about. I'm talking about how abstracted the processor itself is. Libraries and the like don't really enter into it -- this is about the language proper.
Machine language has no abstraction whatsoever. What you write is literally the code that the CPU executes.
Assembly language is one higher level of abstraction. There's still mostly a 1 to 1 correspondence to machine language, but some of that gets hidden for human convenience. You're now writing with symbols instead of numeric op codes, and you have concepts like macros, which have no machine language equivalent.
C is a higher level than that. Every C statement can easily be expressed directly in assembly, but more common operations (loops, subroutines, etc) have a shorthand that makes them easier to write. There is no longer a 1 to 1 correspondence with assembly or machine language, but it's not terribly far from assembly. Still, it's abstracted enough that the language is no longer tied to a specific processor.
C++ is yet another level up the ladder. In a sense, C++ is to C what assembly is to machine language.
And so forth.
I suppose that a plausible (but highly imperfect) rule of thumb for how high up the ladder of abstraction a language sits is how many machine language instructions are required to implement a given language construct. The more required, the higher the level of abstraction. High level languages also have a lot more (indeed, mostly consist of) constructs that simply don't exist at the machine language level.