0.1. Open source

Open source refers to software whose source code is made publicly available for anyone to view, use, modify, and distribute, usually at no cost.


Why create software that's open source?

  • Transparency: Anyone can see how the software works. This builds trust in the methodology as it can be inspected and validated.

  • Collaboration: Developers worldwide can contribute improvements or fix bugs, which accelerates progress.

  • Freedom: You can use the software for any purpose, customise it to your needs, and share it with others, leading to wider adoption.

  • Community: Open source projects often have communities whose members support each other and help the project grow.

In contrast, closed source software keeps its source code private. Users receive only the compiled program and cannot see how it works inside.


Open source in data science

Python and its libraries are open source, as are many other tools widely used in data science, for the reasons above.

The most popular tools are also designed to work seamlessly with other popular open source tools.

These tools can generally be installed very easily, making it quick to try out something new. If a tool doesn't meet your use case, you have made no investment in it and can uninstall it just as easily.
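As a rough illustration of how little is involved in installing and removing a tool, the sketch below builds the `pip` commands for a package and checks whether it is currently installed. The helper names `pip_command` and `installed_version` are hypothetical, and `pandas` is only an example package, not something this section requires.

```python
import importlib.metadata
import sys


def pip_command(action, package):
    """Build the pip command line for installing or uninstalling a package.

    Hypothetical helper for illustration; it only builds the command,
    it does not run it.
    """
    if action not in {"install", "uninstall"}:
        raise ValueError(f"unsupported action: {action}")
    # Using the current interpreter ensures pip targets this environment.
    command = [sys.executable, "-m", "pip", action]
    if action == "uninstall":
        # --yes skips pip's confirmation prompt when uninstalling.
        command.append("--yes")
    command.append(package)
    return command


def installed_version(package):
    """Return the installed version of a package, or None if it is absent."""
    try:
        return importlib.metadata.version(package)
    except importlib.metadata.PackageNotFoundError:
        return None


# "pandas" is an example package name chosen for illustration.
print(" ".join(pip_command("install", "pandas")))
print(" ".join(pip_command("uninstall", "pandas")))
print("pandas installed:", installed_version("pandas") is not None)
```

To actually run one of these commands, the list could be passed to `subprocess.run`. In everyday use, typing `python -m pip install pandas` in a terminal does the same thing.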

If you need assistance when using an open source tool, other users and the tool's maintainers can often provide it.


This course itself is open source and has been created with open source tools

This course is built with the MkDocs library, which allows it to be written in Markdown without needing HTML and CSS for the styling.

You can also view the underlying code by visiting the repository. The course is released under the Creative Commons Attribution 4.0 International license, which means you can adapt it, share it, or even use it for commercial purposes, provided you give appropriate credit.