I currently use Cython to link C and Python, and get speedup in slow bits of python code. However, I'd like to use go routines to implement a really slow (and very parallelizable) bit of code, but it must be callable from python. (I've already seen this question)
I'm (sort of) happy to go via C (or Cython) to set up data structures etc if necessary, but avoiding this extra layer would be good from a bug fix/avoidance point of view.
What is the simplest way to do this without having to reinvent any wheels?