wrapper-free interop with C/C++ #8327
Comments
c2nim can parse C++ code and has been used to wrap Urho3D, wxWidgets and Unreal Engine 4. It's true it's quite some manual work though. |
with something like the clang parser you could likely create a pretty seamless experience. I thought about it in the past, but it would still be a lot of work (even though I've got a pretty nice clang wrapper already). |
I didn't know
using clang as a library avoids having to parse/analyze C and C++ altogether (while guaranteeing 100% compatibility at least as far as generating llvm IR is concerned);
cool which one? related to that, in https://nimble.directory/search?query=wrapper I find:
however both of these lead to just nim-libclang, which seems like a bug.
Difficulties I see with automated conversion have more to do with the lack of a proper mapping in Nim for some C++ features. I feel the biggest one would be
There are other ones, but probably not as common:
|
@timotheecour why do you call it "nim2c"? it's "c2nim" :) |
To provide C++ interop, you need to understand the C++ ABI - this ABI is non-standard, full of surprises, twists and turns and many years' worth of legacy features. It would be an incredible time-sink, and what you'd get in return is a bug-ridden maybe-works-if-you're-lucky implementation.
Right now, the Nim compiler remains ignorant of any such issues and delegates them to the backend and ultimately to the underlying C/C++ compiler - it has no notion of an ABI beyond the very bare minimum, exportc and a few other patches of support. Neither the Nim language nor the compiler is equipped to handle C++ interop, which increases the amount of work needed to get a feature like this done. Steps are being taken to get closer - at least to a point where the dramatically less complicated C ABI can potentially be covered (for example with the alignment work).
If there were an MVP for this feature, it would cover at least C interop, which remains the lingua franca and lowest common denominator of interop. It is not without reason that even LLVM itself offers a C API - C++ is simply unsuitable for interop without vast surgery or resources (failed examples include "managed" C++ of Microsoft fame). D is in a different position in that it tries to be an explicit C++ replacement and match many of its features, with a smaller impedance mismatch.
All in all, I think the imperfect brute-force approach of c2nim, warts and all, serves Nim better for the time being, until Nim matures as a language and gains the features to even meaningfully talk about the problem. At least c2nim is honest about its limitations. |
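To make the "C as lowest common denominator" point concrete, here is a minimal sketch of the usual pattern: a C++ class hidden behind a flat `extern "C"` facade with an opaque handle. `Greeter`, `greeter_t` and the `greeter_*` functions are hypothetical names invented for illustration; they do not come from any library mentioned in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <utility>

// Hypothetical C++ class standing in for a real library type.
class Greeter {
public:
    explicit Greeter(std::string name) : name_(std::move(name)) {}
    std::string greet() const { return "hello, " + name_; }
private:
    std::string name_;
};

// Flat C facade: only plain pointers and integers cross the boundary,
// so a foreign caller needs no knowledge of the C++ ABI.
extern "C" {

struct greeter_t;  // opaque handle, never defined on the C side

greeter_t* greeter_create(const char* name) {
    assert(name != nullptr);
    return reinterpret_cast<greeter_t*>(new Greeter(name));
}

// The caller owns the handle and must release it explicitly.
void greeter_destroy(greeter_t* g) {
    delete reinterpret_cast<Greeter*>(g);
}

// Copies the greeting into buf (always NUL-terminated when cap > 0) and
// returns the full length, so the caller can detect truncation.
std::size_t greeter_greet(const greeter_t* g, char* buf, std::size_t cap) {
    assert(g != nullptr);
    const std::string s = reinterpret_cast<const Greeter*>(g)->greet();
    if (cap > 0) {
        const std::size_t n = s.size() < cap - 1 ? s.size() : cap - 1;
        std::memcpy(buf, s.data(), n);
        buf[n] = '\0';
    }
    return s.size();
}

}  // extern "C"
```

The cost is visible even in this toy: lifetime management becomes manual (`greeter_create` / `greeter_destroy`) and a value-returning method turns into a buffer-copy protocol, which is roughly the work a hand-written wrapper has to repeat for every class.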
Perhaps I am misunderstanding you but I think this statement is completely wrong. The Nim language and compiler is a duo that offers one of the most powerful C++ FFIs out there. Nim compiles to C++ after all, why are you saying that it's not equipped to handle the interop? |
Nim compiles to C++ but it does not understand C++ - multiple inheritance, functors, ADL, template metaprogramming, even trivial stuff like destructors (yes, I know they're coming, and I'm guessing they'll be slightly off compared to C++) are all foreign to the Nim language. An |
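As a small illustration of three of the features listed above (argument-dependent lookup, functors, and destructors), the following sketch shows behaviour that a wrapper generator would have to reproduce faithfully. All names (`geo`, `norm1`, `Scale`, `Tracer`, `destruction_order`) are made up for this example.

```cpp
#include <string>
#include <vector>

namespace geo {
struct Point { int x, y; };

// Callable as plain norm1(p) from outside the namespace: the compiler finds
// it through argument-dependent lookup (ADL) because p is a geo::Point.
int norm1(const Point& p) {
    return (p.x < 0 ? -p.x : p.x) + (p.y < 0 ? -p.y : p.y);
}
}  // namespace geo

// Functor: an object that is called like a function via operator().
struct Scale {
    int k;
    int operator()(int v) const { return k * v; }
};

// Destructor with an observable side effect, to make destruction order visible.
struct Tracer {
    std::vector<std::string>& log;
    std::string name;
    ~Tracer() { log.push_back(name); }
};

// Locals are destroyed in reverse order of construction at scope exit.
std::vector<std::string> destruction_order() {
    std::vector<std::string> log;
    {
        Tracer a{log, "a"};
        Tracer b{log, "b"};
    }  // b is destroyed first, then a
    return log;
}
```

None of this is exotic C++, yet each piece relies on a rule (name lookup, overloaded call operators, scope-based destruction order) that the calling language has to model or deliberately paper over.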
/cc @dom96
as described by @arnetheduck as well as in top post, this is very far from what you can do in Calypso, which doesn't require writing any wrapper, and understands a very large subset of C++ (enough to wrap C++ standard library, template heavy C++ opencv etc)
now that motivation and current limitations are hopefully clear, I'd really like to focus the discussion in this issue on how to bridge the gap and make C++ interop more useful.
There are 2 ways I can think of:
it doesn't have to support all of C++ to be useful, ie, libclang will accept and parse correct C++ code, and Note that c2nim doesn't have this property, since parsing failures can affect future translations. |
As the author of nimgen, I can agree that having a seamless interop with C/C++ would be great. I am a big proponent of code reuse and feel it is a no-brainer to leverage established C/C++ libs. If
I'm also on the fence on replacing c2nim's engine with libclang. The separate tool will still be out of sync and some of the challenges are managed by running code through the preprocessor - Nimgen makes that easy. The only advantage would be to translate C/C++ code completely into Nim but I'm not a fan of that beyond wrapping since you need to inspect every generated line and it isn't scalable for large projects. It also goes against my principle of reuse - translated code won't benefit from upstream improvements and bug fixes. Manual edits will be inevitable and I doubt automating everything through Nimgen will be viable.
Meanwhile, I will also say that despite c2nim's warts and the long list of open issues, I have been able to wrap quite a few C++ libs, let alone C. Check out nimgraphql for a complex example. Given nimgen is automated, the wrappers also stay up to date with minimal maintenance effort. The other benefit is that all users need to do is
Assuming the current state of affairs, my near-term wish list for the Nim compiler is as follows:
For c2nim:
For nimgen:
|
Go ahead, write a prototype. |
Nim supports |
Oh good reminder - genotrance/nimgen#36. I will start using it in the future. |
reorder is useful indeed for this task but needs a bit of love |
One more item to Nim wish list to my comment above:
|
see also how AutoFFI iterates over all declarations parsed by libclang: https://github.com/AutoFFI/AutoFFI/blob/master/src/clang.cpp ; can't directly be used, but can be used for inspiration |
https://ziglang.org/ btw uses
haha, if you think parsing C++ is hard, wait until you get to the semantics part ;) |
libclang has a Sema library to help with semantic pass: https://clang.llvm.org/docs/InternalsManual.html#the-sema-library links
the bulk of the work would be to translate C++ semantic concepts (parsed by clang as an AST with semantics attached, eg a TypeOf node) to Nim concepts; and that could be done gradually, starting from the easiest (ignoring things that aren't yet translatable), even on large complex programs |
Well go ahead and start working on it. I have heard "c2nim is bad, I will write a better tool based on LLVM" from at least 3 people now. They never delivered... |
If you use libclang, I advise using the C++ API. The C++ API is the real API, and the C API is a stable wrapper that barely scratches the surface area. It's a dead end. The other problem we ran into is that pointers cannot be null in zig, but obviously in C they can, so we have to translate every C pointer as an optional pointer. |
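The nullable-pointer issue is not specific to Zig: any target language whose pointers cannot be null has to lift C pointers into some optional type. Below is a rough C++17 analogue of that translation, assuming `std::optional` as the stand-in for an optional pointer; `from_nullable` and `env_var` are hypothetical helper names, while `std::getenv` is the standard function that returns a null pointer when the variable is unset.

```cpp
#include <cstdlib>
#include <optional>
#include <string>

// Hypothetical helper: lift a nullable C string into an optional value so
// that the "may be null" case is visible in the type instead of implicit.
std::optional<std::string> from_nullable(const char* p) {
    if (p == nullptr) return std::nullopt;
    return std::string(p);
}

// Example use on a real C API: std::getenv returns a null pointer when the
// variable is not set in the environment.
std::optional<std::string> env_var(const char* name) {
    return from_nullable(std::getenv(name));
}
```

A translator that cannot prove a given C pointer is never null has to apply this lifting everywhere, which is why the generated API ends up with optionals on every pointer parameter and return value.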
.. or you can do like rust devs did and contribute to the C api such that it becomes less of a dead end. it's actually quite simple. |
You can actually use a combination of the C and C++ APIs, I've done it and it works pretty well. The C API holds references to the C++ AST objects so you can query it for information that isn't available via the C API. I've got a package that does this and I will open source it soon. |
I played around with libclang for a while but it is huge and tedious to build. I'm not sure it is the best course of action. In my quest to find a better way, I found tree-sitter which is a language parser built by Github for Atom. It supports over 18 programming languages and parses them into a common AST format which is then being leveraged for syntax highlighting and code folding among other possibilities. I've gone ahead and wrapped it using c2nim/nimgen and it works as expected for these 18 languages. I'm now looking into the AST format to see how it can be leveraged in our world. If it becomes possible to convert this AST into Nim code, it will become possible to convert code from all these languages into Nim. Of course, have to be realistic - it may not be the case that there's a 1:1 mapping for every construct but it certainly seems interesting. Moreover, I'm interested in wrappers so I'm not as motivated to convert everything into Nim yet, just C/C++ headers into definitions that Nim can immediately leverage. The question in my mind is whether this wrapper interop can be done at compile time - given tree-sitter is C code, it would have to be built into the Nim compiler to do that. Without that, it would end up being equivalent to c2nim. I cannot see a way to create a library that adds this capability via macros since the VM cannot importc at compile time. Finally, it will be super cool to have a Nim grammar as part of tree-sitter so that existing Nim source code can be parsed and supported just as well as these other languages. I hope someone takes on that effort. |
I've started nimterop which builds tree-sitter into a binary and then converts the ast into Nim using macros. It's working pretty well so far and @timotheecour has also been making contributions. I'll appreciate a review of the approach and any feedback to ensure this thing has legs. Again, my goal is only wrappers and not outright conversion of C/C++ to Nim so the scope is limited at this time. |
my take would be that I prefer the wrapper gen to output a nim file, for several reasons (instead of it all being hidden behind macro magic):
I use c2nim currently to import the |
Of course it can. Keep in mind that the VM can execute processes easily, this enables pretty much anything. You can write a small binary that takes as input a .c/.cpp filename and output a JSON-formatted AST, parse that in your macro and you've got wrapper-free interop. Now that I think about it you can probably just run |
@arnetheduck - thanks for the feedback - agree with all your statements.
Agreed - design is moving towards this. Right now, the code is run on every compile since it is still POC grade but generating a .nim file is where we will end up.
My approach with nimgen has been to make the wrapper process seamless for a consumer. I like how I can simply
@dom96 - nimterop already does a whole bunch using macros including creating the AST data structure so macros are really capable of anything! That being said, I'm working on moving most of the functionality into the binary since the VM is slower and this method involves C => tree-sitter AST => string => macro AST => Nim code, which is roundabout. It will also work better standalone to meet @arnetheduck's use case.
I'd have loved to do this but clang isn't the default on Windows or Linux. Downloading several hundred megs to just do the AST generation will be a showstopper for most. |
yeah, good point about the "get-up-and-running-seamlessly" and the package doing all this - I just don't think we're there yet, with nimble :) ie you're absolutely right that if all other stars were aligned, checking in a generated file would be deeply questionable. an additional argument to do so anyway might be that it removes the need to have the header files installed at all - a common-enough situation given that on windows you usually get just a dll, and on linux you need |
IMHO this "don't commit generated files" is a huge fallacy that indirectly produces stuff like https://gcc.gnu.org/wiki/WindowsBuilding where you need make, perl, flex, bison installed with the right version or else you can't build it. |
That's because you are looking at this problem from the wrong angle, sorry to be blunt. Ship the generated Nim code. |
@arnetheduck - so that's what I have done for most nimgen wrappers - download the upstream sources via git or zip, generate wrappers and then compile in all the sources, no binary required. And none of this is the end user's problem. You simply There are some wrappers where this isn't possible (nimbass) or painful (nimssh2) but nimgen supports compiling in, DLLs and static lib scenarios.
I've taken this on in nimgen to an extent since I feel it is crucial to make the process seamless. It makes the nim ecosystem that much richer and easier to get started in. It is tedious up front but I've done 25 wrappers myself. Every package is tested with 0.18.0, 0.19.0 and devel daily with the latest upstream changes so anyone can use them at any time.
@Araq: I know we discussed this a bit earlier on IRC but, as asked before, there are too many moving parts:
I can check in a bunch of Nim files as a Nimble package but they are a snapshot in time. Anything outside that tested combination and stuff may not work or meet the consumer's requirements. You see so many Nim wrapper packages in this situation which were purpose built by the consumer for a project, not built as a sustainable package. This approach works for a consumer - install at a point in time and save that combo of everything and maintain it in source control until there is a need to upgrade. It does not work as a library maintainer who has to cater to any variation of the above. It is not a seamless experience in just a few months and not scalable if you have to generate and maintain an archive of combinations. Nimgen doesn't solve everything but allows me as a package maintainer to keep things up to date. If a consumer wants a snapshot, they can and absolutely should make one. These aren't reproducible builds though, not yet. If there's a way to solve this in a scalable fashion, I'm all for it. That being said, my primary goal is to make it easy for others to use these packages and transition into Nim. I don't believe making it static is going to achieve that. |
Here is the workflow I assumed your
|
There was a long discussion on IRC about this topic. The concerns with the nimgen approach are the following:
Thanks for all the feedback so far. Meanwhile, we are continuing to work on nimterop and will port relevant improvements into the nimgen workflow accordingly. |
Quick update on this issue - nimterop has been growing in functionality over time. The road is still long but the current status makes me optimistic. I think it is time to close this issue since Nim provides all the infrastructure required to pull off this interop without compromise or requiring any fundamental changes to the compiler. Minor details have been discussed in other issues and have provided a good direction to continue on. I encourage the community to continue providing your guidance and feedback to ensure development continues in the right direction. |
Having a good C and C++ interop for Nim would be of strategic importance for wider adoption of Nim, as it would allow reusing the massive code bases out there (eg opencv, qt, SFML, ...) without having to either rewrite them or writing and maintaining wrappers.
Calypso
https://github.com/Syniurge/Calypso, a fork of ldc compiler for D, is an amazingly cool project that allows direct interface between D code and C or C++, without using any wrapper, any (because it uses clang and llvm), understands virtually all of C++ (including pre-processor, C++ templates, exceptions, etc).
A C++ class (eg opencv `cv::Mat`) or functions/templates can be used in D directly after `import (C++) cv.Mat;` without need to write or generate wrapper code for these, and templates don't need to be instantiated in order to be used: they can be used directly. We can pass/return by value, pointer, or reference, we can even derive C++ classes in D etc.
Here's a simple example importing Qt in D: https://github.com/Syniurge/Calypso/blob/master/examples/qt5/qt5demo_simple.d
Thanks to calypso, I was able to use some opencv functionality from D in non-trivial use cases involving heavy use of C++ features and it actually worked (modulo some bugs that have been fixed since then for the most part).
Nim interop
On the Nim side, we can embed C or C++ code as follows:
however the hard part is writing the wrapper code (especially for larger C API's or any C++ API)
For C projects, c2nim can be used but it's not based on a full C frontend (eg clang) and can quickly run into limitations, eg see CREATING A NIM WRAPPER FOR FMOD which shows a number of manual extra steps had to be employed to wrap C library FMOD.
For C++ projects, there's currently no way to automatically generate wrappers/bindings, one has to resort to tedious manual mapping of C++ classes, taking care of manual allocation/deallocation of C++ classes in Nim code, no pass-by-value causes a performance hit, etc.
is a calypso-like approach for Nim feasible in the short/medium/long term?
/cc @Syniurge @arnetheduck @Araq @dom96 I'm curious what are your thoughts on that. Nim compiles to C, C++, and some other backends (objc, js).
@arnetheduck wrote nlvm, an LLVM-based compiler for Nim which could be used as a starting point (it provides the glue layer between the AST produced by the Nim compiler and LLVM, replacing the C output with LLVM bitcode). It uses the llvm-c interface (source)
At the end of the day, I'd love to be able to just write:
notes
links