Notes on the Julia programming language

Rob Blackwell

Introduction

I'm using the Julia programming language for my PhD work and this page records various notes as an aide-mémoire. I acknowledge that some of the material is opinionated, but I'm happy to accept corrections or contributions. As of 2022-06-26, some of this material is now quite old.

Getting Started

Download and install the current Julia release. Any tutorials by David Sanders are good, e.g. An Invitation to Julia. There are some books (see Learning Julia), but the Julia documentation is arguably a better resource.

Tools

There are lots of options including Juno and Visual Studio Code, but I'm just using Emacs with julia-mode and julia-repl.

I really like IJulia (Jupyter notebooks) and tend to have one per experiment / investigation, printing them out and discussing them at supervisory meetings.

Python support

It's easy to call Python libraries from Julia, and this is a useful trap door if you need a specialist library. See PyCall.jl and Conda.jl.

Examining Conda.ROOTENV within Julia gives the directory of the underlying Python environment. This allows you to hop out to a shell, activate the environment and pip install odd packages like pynmea2.

using PyCall
# pyimport replaces the @pyimport macro, which is deprecated in current PyCall.
pynmea2 = pyimport("pynmea2")
pynmea2.parse(mystring)
		

Plotting

There are lots of plotting libraries in Julia. I use PyPlot for publication-quality plots. Winston is useful and fast for interactive use.

NaN

Does an array contain NaNs? any(isnan, A) (Thanks to Markus Kuhn).

Remove NaNs: filter(!isnan, A).
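A minimal sketch of the two idioms above, using only Base Julia. The array A and the variable names are made up for illustration.

```julia
# Hypothetical data with two NaNs in it.
A = [1.0, NaN, 3.0, NaN]

has_nan = any(isnan, A)        # true if at least one element is NaN
n_nan   = count(isnan, A)      # how many NaNs there are
where   = findall(isnan, A)    # indices of the NaNs
clean   = filter(!isnan, A)    # new vector without NaNs; A is unchanged
```

Note that filter returns a copy; filter!(!isnan, A) would remove the NaNs from A in place.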

Missing

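Use ismissing(m) or m === missing to test for missing. Note that m == missing itself evaluates to missing, so it can't be used as an if condition.

skipmissing(A) iterates over A, skipping the missing values.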

Remember to convert missing to NaN for PyPlot.
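One way to do the conversion is to broadcast coalesce over the array. This is a sketch with made-up data, not necessarily the only or best approach.

```julia
# Hypothetical data containing a missing value.
A = [1.0, missing, 3.0]

is_gap   = ismissing(A[2])          # true
total    = sum(skipmissing(A))      # sums only the non-missing values
for_plot = coalesce.(A, NaN)        # missing becomes NaN, other values are kept
```

The result for_plot is a plain Vector{Float64}, which is what a plotting library that knows nothing about missing expects.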

Dataframes

describe(df), first(df,5), last(df,5), names(df), dropmissing(df, field).

How can I push to the front of a list?

Whilst there are pushfirst! (for one element) and prepend! (for a collection), both mutate their argument and there is no non-destructive prepend. It's not needed; try [1; [2,3,4]] instead.
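The difference between the non-destructive and the mutating forms can be sketched as follows (variable names are illustrative):

```julia
xs = [2, 3, 4]

# Non-destructive: both build a new vector and leave xs alone.
ys = [1; xs]
zs = vcat(1, xs)
xs_before = copy(xs)     # snapshot to show xs has not changed yet

# Destructive: pushfirst! adds one element to the front of xs itself.
pushfirst!(xs, 1)
```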

Interpolation

Interpolation via interp1 and interp2 is a common thing in MATLAB and Interpolations.jl is the Julia equivalent. This code is similar:

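using Interpolations

# 1-D linear interpolation of values V at knots X, evaluated at the query points Xq.
function interp1(X, V, Xq)
    knots = (X,)
    itp = interpolate(knots, V, Gridded(Linear()))
    # Current Interpolations.jl evaluates with call syntax; itp[Xq] is the old form.
    itp.(Xq)
end

# 2-D linear interpolation; V is indexed as V[i, j] for the point (X[i], Y[j]).
function interp2(X, Y, V, Xq, Yq)
    knots = (X, Y)
    itp = interpolate(knots, V, Gridded(Linear()))
    itp.(Xq, Yq)
end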
	    

Scripting

Julia can be used to make shell scripts, using the shebang line below:

#!/usr/bin/env julia
	    

This can be useful in large projects, where it allows Make to be used to build products incrementally. Beware: Julia's start-up time can be large compared with the run time of your script, so it can be very inefficient for small files and tasks.
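As an illustration, here is a sketch of a small script of this kind. The task (counting lines in the files named on the command line) and the function name main are hypothetical; the shebang line, ARGS and PROGRAM_FILE are standard Julia.

```julia
#!/usr/bin/env julia

# Hypothetical task: print "<path><TAB><line count>" for each file argument.
function main(args; io=stdout)
    for path in args
        println(io, path, "\t", countlines(path))
    end
end

# Only run main when this file is executed as a script,
# not when it is include()d from the REPL or another file.
if abspath(PROGRAM_FILE) == @__FILE__
    main(ARGS)
end
```

After chmod +x, the file can be run directly from the shell or from a Makefile rule. Keeping the work in a function also makes it easy to call from the REPL while developing.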

Logging

The output of @info, @warn and @error is sent to stderr rather than stdout, making them ideal for instrumenting and logging scripts without messing up the output.

$ ./script.jl > output.txt 2> log.txt
	    

The above results in two files: output.txt with everything you printed, and log.txt with everything written by @info, @warn and @error.
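A sketch of what such a script body might look like. The function process and its data are made up for illustration; the logging macros are the ones described above.

```julia
# Hypothetical worker: results go to io (stdout by default),
# diagnostics go through the logging macros (stderr by default).
function process(values; io=stdout)
    @info "Processing" n=length(values)
    for v in values
        if isnan(v)
            @warn "Skipping NaN"
            continue
        end
        println(io, v^2)
    end
end

process([2.0, NaN, 3.0])
```

Run with the redirection shown above, the squared values would end up in output.txt and the @info and @warn messages in log.txt.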

CPU and memory

Garbage collection is now triggered with GC.gc() (it was plain gc() before Julia 0.7).

julia> VERSION
v"1.0.2"

julia> Sys.total_memory() / 2^20
15764.0078125

julia> Sys.free_memory() / 2^20
294.72265625

julia> Sys.CPU_NAME
"skylake"

julia> Sys.cpu_summary()
Intel(R) Core(TM) i7-7500U CPU @ 2.70GHz: 
       speed         user         nice          sys         idle          irq
#1  3500 MHz      27764 s         86 s       8320 s     472683 s          0 s
#2  3502 MHz      26227 s        126 s       5428 s     375699 s          0 s
#3  3506 MHz      32798 s          8 s       5405 s     372848 s          0 s
#4  3499 MHz      26937 s          7 s       5934 s     376377 s          0 s
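The calls above can be wrapped into a small helper for use in scripts. This is only a sketch; the names mib and memory_report are my own, not part of Julia.

```julia
# Convert a byte count to MiB.
mib(bytes) = bytes / 2^20

# Hypothetical helper: force a garbage collection, then report memory and CPU threads.
function memory_report()
    GC.gc()
    return (total_mib = mib(Sys.total_memory()),
            free_mib  = mib(Sys.free_memory()),
            threads   = Sys.CPU_THREADS)
end

report = memory_report()
@info "Memory" report.total_mib report.free_mib report.threads
```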

Masking confusion

julia> a = [1 2 3; 4 5 6; 7 8 9]
3×3 Array{Int64,2}:
 1  2  3
 4  5  6
 7  8  9

julia> a .> 3
3×3 BitArray{2}:
 false  false  false
  true   true   true
  true   true   true
		

All good so far, but the following result can be a surprise: indexing with a mask returns a flat vector of the selected values (in column-major order), not an array of the original shape.


julia> a[a .>3]
6-element Array{Int64,1}:
 4
 7
 5
 8
 6
 9
		

However, mutating the array through the mask does work:

julia> a[a .>3] .= 3
6-element view(::Array{Int64,1}, [2, 3, 5, 6, 8, 9]) with eltype Int64:
 3
 3
 3
 3
 3
 3

julia> a
3×3 Array{Int64,2}:
 1  2  3
 3  3  3
 3  3  3
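If you want a result with the original shape but without mutating a, one option is to broadcast ifelse over the mask. A sketch using the same matrix (the variable names are illustrative):

```julia
a = [1 2 3; 4 5 6; 7 8 9]
mask = a .> 3

picked    = a[mask]              # flat vector, column-major order
positions = findall(mask)        # CartesianIndex of each selected element
clipped   = ifelse.(mask, 3, a)  # new 3×3 matrix; a itself is unchanged
```

For this particular case min.(a, 3) gives the same matrix, but ifelse works with any mask.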
		

Compared to Python

Whilst Python code can be optimised, I find most Julia code to be more performant with no effort (although I acknowledge that this isn't always the case). For me, the switch to Julia meant that I didn't have to move to a high performance compute cluster from the convenience of a couple of PCs in the corner of my study.

Most of the time, Julia just feels nicer to use, and that's important when you spend a large proportion of your life staring at a REPL.

The whole Python 2 or Python 3 thing is just irritating.

That said, there is still a lot of Python code under the Julia hood (e.g. IJulia) so don't knock it!

Compared to MATLAB

MATLAB usage is widespread amongst scientists for good reason; it's established and has very high quality, proven libraries. In many ways it is the gold standard for scientific, exploratory programming.

Unfortunately, a lot of graduate student science is just grunt work. Reading legacy file formats and dealing with big data is a big part of that. MATLAB can do it, but it isn't very elegant. I also found problems with compatibility between versions. No two scientists seem to use the same version, and "it worked on my machine" is a common cry. Upgrading a machine is often considered too risky or too expensive.

I found that MATLAB programmers can readily understand Julia code. Googling for a MATLAB solution to a problem often turns up a function name, and that can be the key to finding an equivalent Julia solution.

Julia is free and open source. The barrier to entry is much lower and that's got to be a good thing for reproducible science.

There is a MAT.jl library which allows reading and writing of MATLAB MAT files.

Niggles and problems

Like a Lisp-1, Julia keeps functions and variables in a single namespace, so choosing variable names that don't clash with function names can sometimes be annoying, e.g.

julia> wheels = wheels(mycar)
ERROR: invalid redefinition of constant wheels
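The usual workaround is simply to pick a different variable name. A sketch, where mycar and wheels are hypothetical stand-ins for the example above:

```julia
# Hypothetical car represented as a NamedTuple, with an accessor function.
mycar = (wheel_count = 4,)
wheels(car) = car.wheel_count

# A distinct variable name avoids the clash with the function.
n_wheels = wheels(mycar)

function describe_car(car)
    # Writing `wheels = wheels(car)` here would not work either: the assignment
    # makes `wheels` a local variable, so the call throws an UndefVarError.
    nw = wheels(car)
    return "car with $nw wheels"
end
```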

Wishlist

Is there a way to ask whocalls(f) where f is some function?

I'd like some refactoring tools.

SLIME support in Emacs would be great. I have a proof of concept SWANK server.