Could you please add an example with a simple CUDA kernel to the Julia [introduction notebook](https://github.com/EnzymeAD/Enzyme-Tutorial/blob/main/julia/introduction.ipynb)?