Automating Your Code

It’s been far too long since I’ve posted, but I wanted to share a small piece of how my work has changed over the last year in case my experience ends up being helpful for anyone.

Automated Actions with Code

For a few years now I’ve gotten used to setting up projects as code repositories (by this I really mean GitHub repositories, though other providers such as GitLab and Bitbucket also host versioned code). The point of these repositories is to make it easy to track changes to a collection of code, even when a group of people is collaborating and updating different parts of the code. For a while I thought that my sophistication with this system would develop mostly in the areas of managing and tracking these code changes (pull requests, working on branches, etc.), but in the last few months my eyes have been opened to a whole new world of automated tests or “actions.”

Of course, providers like GitLab, CircleCI, Travis CI, etc. have offered tools for automated code execution for some time, but I never ended up setting those systems up myself and so they seemed a bit too intimidating to start out with. Then, sometime last year, GitHub introduced a new part of their website called “Actions” and I started to dive in.

The idea with these actions is that you can automatically execute some compute using the code in your repository. This compute has to be pretty limited (it can’t use that many resources and can’t run for that long), but it’s more than enough capacity to do some really useful things.

Pipeline Validation

One task I spend time on is building small workflows to help people run bioinformatics analyses. Something that I’ve found very useful with these workflows is that you can set up an action which will automatically run the entire pipeline and report if any errors are encountered. The prerequisite here is that you have to be able to generate testing data which will run in ~5 minutes, but the benefit is that you can test a range of conditions with your code, and not have to worry that your local environment is misleading you. This is also really nice in case you are making a minor change and don’t want to run any tests locally: all the tests run automatically and you just get a nice email if there are any problems. Peace of mind! Here is an example of the configuration that I used for running this type of testing with a recent repository.
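
Separately from that configuration, the testing-data prerequisite can be sketched in a few lines of Python. This is only an illustration under some assumptions: the function name `write_test_fastq`, the output file name, and the read counts are all hypothetical and are not taken from the repository mentioned above. The idea is to write a tiny synthetic FASTQ file with a fixed random seed, so the automated run is fast and any failure is reproducible.

```python
import random
from pathlib import Path


def write_test_fastq(path, n_reads=100, read_len=50, seed=0):
    """Write a small synthetic FASTQ file and return the number of reads written.

    Hypothetical helper for illustration; the sizes are arbitrary defaults.
    """
    rng = random.Random(seed)  # fixed seed: identical data on every run
    with open(path, "w") as handle:
        for i in range(n_reads):
            seq = "".join(rng.choice("ACGT") for _ in range(read_len))
            qual = "I" * read_len  # constant quality string (Phred 40 in Phred+33)
            # Each FASTQ record is four lines: header, sequence, separator, qualities
            handle.write(f"@read_{i}\n{seq}\n+\n{qual}\n")
    return n_reads


if __name__ == "__main__":
    out = Path("test_reads.fastq")  # assumed file name
    write_test_fastq(out)
    print(f"wrote {out}")
```

A file this small should keep a test run far below the ~5 minute target, and committing the generator script rather than the data keeps the repository light.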

Packaging for Distribution

I recently worked on a project for which I needed to wrap up a small script as a standalone executable that would run on multiple platforms. As common as this task is, it’s not something I’ve done frequently and I wasn’t particularly confident in my ability to cross-compile from my laptop for Windows, Ubuntu, and macOS. Luckily, I was able to figure out how to configure actions which would package up my code for three different operating systems, and automatically attach those executables as assets to tagged releases. This means that all I have to do to release a new set of binaries is to push a tagged commit, and everything else is just taken care of for me.
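
One small piece of a setup like this that is easy to illustrate is giving each platform’s binary a distinct, predictable asset name. The Python below is a sketch only: the naming scheme, the `asset_name` helper, and the tool name `mytool` are assumptions made for illustration, not the scheme used in the project described above. It relies on the standard library’s `platform.system()` and `platform.machine()` to detect the machine that the build job is running on.

```python
import platform


def asset_name(tool, tag, system=None, machine=None):
    """Build a release asset name such as 'mytool-v1.0.0-linux-x86_64'.

    Hypothetical naming scheme; system and machine default to the current host.
    """
    system = (system or platform.system()).lower()
    machine = (machine or platform.machine()).lower()
    # platform.system() reports macOS as 'Darwin'; use a friendlier label
    os_label = {"darwin": "macos"}.get(system, system)
    # Normalize common aliases for the same CPU architecture
    arch_label = {"amd64": "x86_64", "aarch64": "arm64"}.get(machine, machine)
    suffix = ".exe" if os_label == "windows" else ""
    return f"{tool}-{tag}-{os_label}-{arch_label}{suffix}"


if __name__ == "__main__":
    # 'mytool' and the tag are placeholder values
    print(asset_name("mytool", "v1.0.0"))
```

Running the same helper in each operating system’s job would give every executable a unique name, so the three uploads to a tagged release do not overwrite one another.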

In the end, I think that spending more time in bioinformatics means figuring out all of the things which you don’t actually have to do and automating them. If you are on the fence, I would highly recommend getting acquainted with some automated testing system like GitHub Actions to see what work you can take off your plate entirely, leaving you free to focus on more interesting things.