Mastering Scientific Programming: Practical Tips and Tricks
How can you become a better scientific programmer and boost your numerical software development?
Scientific programming involves writing code to solve scientific problems. This can range from simulating complex physical phenomena to analyzing large datasets. While such software is incredibly important, it can be challenging for scientists to learn all the required software development skills. However, by gradually adding specific tricks into your workflow, you can enhance your coding efficiency and effectiveness.
Software skills are important for everyone these days, including scientists. I see certain common risks if you do not spend effort on your code quality:
Incorrect code leads to incorrect results, which means you may have to redo work or even risk damaging your reputation.
Unreproducible code means others, including your future self, cannot verify your work, nor built on top of it.
Both incorrect code and unreproducible code may lead people to stop trusting your software and the conclusions they draw from that code.
As your code grows, it may become unreadable and unmaintainable, making it harder for you and others to understand it and contribute further to the code.
Scientific code and scientific principles are also applied in startups and in the industry, for research and development of software products. All these aspects are even more important to learn if you ever want to join a professional software development organization.
Choose the Right Language
When it comes to scientific programming, choosing the right programming language is crucial. You want a language that is efficient, easy to use, and has good libraries for scientific computing. Some popular languages for scientific programming include Python, MATLAB, R, and Julia. If you need performance, you typically end up learning Fortran, C, C++ or Rust, though they are considered more difficult and take more time to master.
If you need both performance and simplicity (you probably do if you write complicated algorithms) then you quickly encounter the so-called "two language problem". This is the fact that you typically need to work with at least two programming languages. Read my recent article about how to solve this two language problem if you want to know more.
Choosing the right language can be a difficult task. Typically people pick the language that people around them are using, but it may pay off to investigate alternatives in order to avoid running into technical difficulties later on.
Write Clean and Readable Code
Scientific programming often involves writing complex algorithms and data structures. It's important to write code that is easy to read and understand. This will make it easier to debug and maintain your code in the long run. Some tips for writing clean code include using meaningful variable names, adding comments to explain your code, and breaking up long functions into smaller, more manageable pieces. I intend to elaborate on many of these topics on this blog.
Test Your Code
Testing is an important part of software development in general. You want to make sure that your code is working correctly before you use it to analyze data or simulate physical phenomena. One popular testing method is unit testing, which involves writing small tests for individual functions or methods. This can help you catch bugs early on and ensure that your code is working as expected. There is a lot of attention in the software development community regarding testing. But I believe this topic also deserves another blog post from me, to explain how to get started, but also how to address the iterative nature of scientific development.
Use Version Control
Version control is a system for managing changes to your code over time. It allows you to keep track of changes, revert to previous versions, and collaborate with others. One popular version control system is Git, which is widely used in scientific programming. In some organizations you may also encounter other systems, such as SVN.
Thanks to Github and other tools using Git, it has become ever simpler to control your code and share changes with others.
Code Reviews
When you write scientific publications, you go through a rigorous reviewing process, starting with advice from your colleagues and finally a peer review procedure. Somehow the code doesn't always get such rigorous reviewing.
In typical software development environments, code reviews are common practice, for example to make sure the code is readable and correctly tested. Most version control systems, such as Github, provide easy web interfaces to inspect code changes and leave comments. Another practice to improve the code is so called "pair programming", where you code together side-by-side, essentially doing the reviewing in real time.
Learn from Others
Scientific programming is a rapidly evolving field, and there is always something new to learn. One of the best ways to learn is to collaborate with others who have more experience or knowledge. This can involve joining online communities or attending scientific programming conferences. There are also books to learn from, though I wish there were more targeting the challenges of scientific software. Currently I am reading Software Engineering for Science, which someone suggested to me recently.
Of course if you want to stay up to date, you can subscribe to this blog, where I intend to keep sharing my knowledge.
Conclusion
In conclusion, scientific programming is an important tool for solving complex scientific problems. By choosing the right language, writing clean and readable code, testing your code, using version control, and learning from others, you can write efficient and effective software that advances the field of science.