Notes on the Julia programming language
Rob Blackwell
Introduction
I'm using the Julia programming language for my PhD work and this page records various notes as an aide-mémoire. I acknowledge that some of the material is opinionated, but I'm happy to accept corrections or contributions. As of 20220626, some of this material is now quite old.
Getting Started
Download and install the current Julia release. Any tutorials by David Sanders are good, e.g. An Invitation to Julia. There are some books (see Learning Julia), but the Julia documentation arguably serves as a better resource.
Tools
There are lots of options including Juno and Visual Studio Code, but I'm just using Emacs, julia-mode and julia-repl.
I really like IJulia (Jupyter notebooks) and tend to have one per experiment / investigation, printing them out and discussing them at supervisory meetings.
Python support
It's easy to call Python libraries from Julia, and this is a useful trap door if you need a specialist library. See PyCall.jl and Conda.jl.
Examining Conda.ROOTENV within Julia gives the directory of the underlying Python environment. This allows you to hop out to a shell, activate the environment and pip install odd packages like pynmea2.
using PyCall
# pyimport is the function form of the older @pyimport macro
pynmea2 = pyimport("pynmea2")
pynmea2.parse(mystring)
Plotting
There are lots of plotting libraries in Julia. I use PyPlot for publication quality plots. Winston is useful and fast for interactive use.
NaN
Does an array contain NaNs? any(isnan, A) (thanks to Markus Kuhn).
Remove NaNs: filter(!isnan, A).
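As a minimal sketch of the two idioms above (the array A is made-up example data):

```julia
# Made-up example data containing one NaN
A = [1.0, NaN, 3.0]

# true if any element of A is NaN
has_nan = any(isnan, A)

# A new array without the NaNs; A itself is left unchanged
clean = filter(!isnan, A)

println(has_nan)
println(clean)
```

Note that filter returns a copy; filter! is the in-place variant.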
Missing
Use if m === missing ... (or ismissing(m)) to test for missing.
skipmissing(A) iterates over A, skipping the missing values.
Remember to convert missing to NaN for PyPlot.
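The following sketch pulls these together. The array A is made-up example data, and coalesce is one possible way to do the NaN conversion, not necessarily the one used originally:

```julia
# Made-up example data with one missing value
A = [1.0, missing, 3.0]

# ismissing(x) is equivalent to x === missing
flags = ismissing.(A)

# skipmissing returns a lazy iterator over the non-missing values
total = sum(skipmissing(A))

# Replace each missing with NaN, e.g. before passing the data to a plotting library
B = coalesce.(A, NaN)
```

Here B is a plain Vector{Float64}, which plotting libraries that do not understand missing can usually handle.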
Dataframes
describe(df), first(df, 5), last(df, 5), names(df), dropmissing(df, field).
How can I push to the front of a list?
Whilst there is a prepend! there is no non-destructive prepend. It's not needed, try [1 ; [2,3,4]] instead.
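A short sketch of the difference, using a made-up vector v:

```julia
v = [2, 3, 4]

# Non-destructive: builds a new vector and leaves v unchanged
w = [1; v]             # equivalent to vcat(1, v)
v_before = copy(v)     # snapshot to show v has not changed yet

# Destructive alternatives that modify v in place
pushfirst!(v, 1)       # prepend a single element
prepend!(v, [-1, 0])   # prepend a whole collection
```

After this, w is [1, 2, 3, 4] and v is [-1, 0, 1, 2, 3, 4].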
Interpolation
Interpolation via interp1 and interp2 is a common thing in MATLAB and Interpolations.jl is the Julia equivalent. This code is similar:
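using Interpolations

# 1-D linear interpolation of V, sampled at knots X, evaluated at Xq
function interp1(X, V, Xq)
    knots = (X,)
    itp = interpolate(knots, V, Gridded(Linear()))
    itp(Xq)  # recent Interpolations.jl versions evaluate by calling, not indexing
end

# 2-D linear interpolation of V, sampled on the grid (X, Y), evaluated at (Xq, Yq)
function interp2(X, Y, V, Xq, Yq)
    knots = (X, Y)
    itp = interpolate(knots, V, Gridded(Linear()))
    itp(Xq, Yq)
end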
Scripting
Julia can be used to make shell scripts, using the shebang line below:
#!/usr/bin/env julia
This can be useful in large projects where it allows Make to be used to build products incrementally. Beware, the Julia startup time can be large in comparison to the run time of your script, so it can be very inefficient to work with small files and tasks.
Logging
The output of @info, @warn and @error is sent to stderr instead of stdout, making them ideal mechanisms for instrumenting and logging scripts without messing up output.
$ ./script.jl > output.txt 2> log.txt
The above results in two files: output.txt with everything you printed, and log.txt with everything written by @info, @warn and @error.
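A hypothetical script.jl along these lines might look like the sketch below; the data and the log messages are made up for illustration.

```julia
#!/usr/bin/env julia
# Hypothetical script.jl: results go to stdout, diagnostics go to stderr

data = [1, 2, 3]                 # made-up example data
@info "Starting" n=length(data)  # logged to stderr by the default logger

for x in data
    println(x^2)                 # real output, written to stdout
end

@warn "Finished (made-up warning for illustration)"
```

Run with the redirection shown above, output.txt should contain only the three squares, while the info and warning messages end up in log.txt.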
CPU and memory
Garbage collection can now be triggered manually with GC.gc() (previously gc()).
julia> VERSION
v"1.0.2"

julia> Sys.total_memory() / 2^20
15764.0078125

julia> Sys.free_memory() / 2^20
294.72265625

julia> Sys.CPU_NAME
"skylake"

julia> Sys.cpu_summary()
Intel(R) Core(TM) i7-7500U CPU @ 2.70GHz:
       speed         user         nice          sys         idle          irq
#1  3500 MHz      27764 s         86 s       8320 s     472683 s          0 s
#2  3502 MHz      26227 s        126 s       5428 s     375699 s          0 s
#3  3506 MHz      32798 s          8 s       5405 s     372848 s          0 s
#4  3499 MHz      26937 s          7 s       5934 s     376377 s          0 s
Masking confusion
julia> a = [1 2 3; 4 5 6; 7 8 9]
3×3 Array{Int64,2}:
 1  2  3
 4  5  6
 7  8  9

julia> a .> 3
3×3 BitArray{2}:
 false  false  false
  true   true   true
  true   true   true
All good so far, but the following result can be a surprise - we only get the values, not the shape.
julia> a[a .> 3]
6-element Array{Int64,1}:
 4
 7
 5
 8
 6
 9
However mutating the array does work:
julia> a[a .> 3] .= 3
6-element view(::Array{Int64,1}, [2, 3, 5, 6, 8, 9]) with eltype Int64:
 3
 3
 3
 3
 3
 3

julia> a
3×3 Array{Int64,2}:
 1  2  3
 3  3  3
 3  3  3
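If the shape matters and the original array should stay untouched, one possible approach (a sketch, not the only option) is to keep the positions with findall, or to build a new array with a broadcast ifelse:

```julia
a = [1 2 3; 4 5 6; 7 8 9]

# Logical indexing flattens the result in column-major order
vals = a[a .> 3]

# findall keeps the positions as CartesianIndex values
idx = findall(a .> 3)

# ifelse. builds a new 3×3 array and leaves a unchanged
clipped = ifelse.(a .> 3, 3, a)
```

Here idx[1] is CartesianIndex(2, 1), the position of the value 4, and clipped has the same 3×3 shape as a.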
Compared to Python
Whilst Python code can be optimised, I find most Julia code to be more performant with no effort (although I acknowledge that this isn't always the case). For me, the switch to Julia meant that I didn't have to move to a high performance compute cluster from the convenience of a couple of PCs in the corner of my study.
Most of the time, Julia just feels nicer to use, and that's important when you spend a large proportion of your life staring at a REPL.
The whole Python 2 or Python 3 thing is just irritating.
That said, there is still a lot of Python code under the Julia hood (e.g. IJulia) so don't knock it!
Compared to MATLAB
MATLAB usage is widespread amongst scientists for good reason; it's established and has very high quality, proven libraries. In many ways it is the gold standard for scientific, exploratory programming.
Unfortunately, a lot of graduate student science is just grunt work. Reading legacy file formats and dealing with "big data" is a big part of that. MATLAB can do it, but it isn't very elegant. I also found problems with compatibility between versions. No two scientists seem to use the same version and "it worked on my machine" is a common cry. Upgrading a machine is often considered too risky or too expensive.
I found that MATLAB programmers can readily understand Julia code. Googling for a MATLAB solution to a problem often turns up a function name, and that can be the key to finding an equivalent Julia solution.
Julia is free and open source. The barrier to entry is much lower and that's got to be a good thing for reproducible science.
There is a MAT.jl library which allows reading and writing of MATLAB MAT files.
Niggles and problems
Julia is a Lisp-1, so naming variables so that they don't clash with functions can sometimes be annoying, e.g.
julia> wheels = wheels(mycar)
ERROR: invalid redefinition of constant wheels
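The usual workaround is simply to pick a different variable name. In this sketch the Car type and the wheels accessor are hypothetical, defined only to reproduce the clash:

```julia
# Hypothetical type and accessor, defined only to reproduce the clash
struct Car
    wheels::Int
end

wheels(c::Car) = c.wheels

mycar = Car(4)

# Binding the result to a different name avoids redefining the function
nwheels = wheels(mycar)
```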
Wishlist
Is there a way to ask whocalls(f) where f is some function?
I'd like some refactoring tools.
SLIME support in Emacs would be great. I have a proof of concept SWANK server.