Contributing to DESDEO
This tutorial gives step-by-step instructions on how to begin contributing to DESDEO, and covers what to consider when developing it, such as coding practices and typical workflows. We first cover the software that must be installed in the section Installing required software. Then, we discuss how to download DESDEO's source code and set up a virtual environment in the section Setting up a virtual environment and installing DESDEO. A typical Git workflow is described in the section Typical Git workflow for contributing to DESDEO. Development practices and the tools utilized are discussed in the section Development practices, and how to integrate some of these tools into an integrated development environment is discussed in the section Integrated development environments. The main points of this tutorial are summarized in the section Summary. Lastly, conclusions and potential next steps in contributing to DESDEO are outlined in the section Conclusions, where to go next, and our Discord server.
Installing required software
In this section, instructions are provided for installing the software required to start contributing to DESDEO. Instructions are given for the most common operating systems. We assume a command line environment to be available on each operating system (indicated by the $ symbol). Applications with a graphical user interface may also be utilized, but they are not covered in this tutorial. However, most of the presented content should be applicable even outside a command line environment. Instructions for each platform can be found in the following sections: Windows, Linux, and macOS.
Lastly, we also assume that we have a GitHub account set up. If not, we can visit GitHub to set up a new account.
Windows
Here, we assume that we are operating in a PowerShell environment. The first step is to install Python on the system, unless it is already installed. To check which versions of Python are supported, see the section Requirements. If utilizing the .exe installer for installing Python, we should ensure that the installer also sets the necessary Path environment variables. There should be a check-box for this during the installation. Python binaries for Windows platforms can be found on the Python website.
Ensuring Path variables are updated
To ensure changes in Path variables are in effect, it is advisable to log out of the current Windows session, and then log back in.
To check that Python has been installed correctly on our system, we can open PowerShell and run the command python --version, which should report the version of the currently installed Python interpreter.
Next, we need to install Git for version control and poetry for managing packages and the virtual environment for developing DESDEO. To facilitate this, it is recommended to install scoop. Installation instructions are provided on scoop's webpage. Using scoop is optional, but we will assume that it has been installed for the remainder of this section.
To install poetry, we will follow the recommended way and use pipx. To install pipx, in a powershell, we run the commands
After successfully installing pipx, installing poetry is as simple as
It might be a good idea to run
once more after installing poetry. After this, we should log out and back into a Windows session.
Finally, we can install Git utilizing scoop
We should now have all the necessary tools to be able to start developing and contributing to DESDEO on a Windows operating system.
Linux
Git and Python are most probably already available on our system. If not, we can use our system's package manager to install both. We can follow the instructions on poetry's webpage to install poetry. After this, we should be set to begin developing on Linux-based systems.
macOS
Dedicated instructions for macOS have not been written yet. In the meantime, we can follow and adapt the instructions for Linux-based systems, for instance by utilizing Homebrew as the package manager.
Setting up a virtual environment and installing DESDEO
In this section, we will download DESDEO, and install it in a virtual environment. It is highly recommended to install DESDEO into such an environment when developing it to ensure that there is no clash between system-level Python packages and the packages utilized by DESDEO. Despite the fancy name, a virtual environment is nothing more than a set of environment variables that point to a specific, usually isolated, Python installation and its packages.
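To make the last point concrete, the following minimal Python sketch checks whether the running interpreter belongs to a virtual environment. It relies only on the standard library: in environments created by the venv module or by recent versions of virtualenv, sys.prefix points to the environment, while sys.base_prefix points to the base Python installation. This should also hold for environments created by poetry. The helper name in_virtual_environment is our own and is not part of DESDEO.

```python
import sys


def in_virtual_environment() -> bool:
    """Return True if the current interpreter runs inside a virtual environment."""
    # In a virtual environment, sys.prefix is the environment's directory,
    # whereas sys.base_prefix still refers to the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print(f"Interpreter: {sys.executable}")
    print(f"Inside a virtual environment: {in_virtual_environment()}")
```

Running this script before and after activating an environment should show the interpreter path changing accordingly.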
Downloading the source code of DESDEO
Before we proceed to setting up our virtual environment, we will have to download (clone) the source code of DESDEO. To do so, first, we navigate to a directory where we wish to download the source code to. For example:
where the tilde ~ is a shorthand for our $HOME directory. We may also create such a directory in our current working directory, and switch to it, with the commands
We can then proceed to download the source code for DESDEO. It is highly recommended to fork the repository first. Forking the repository is described in the section Forking the DESDEO repository. Assuming our fork has the url https://github.com/ourusername/DESDEO, we can clone the repository on our machine with the command:
We should remember to replace ourusername with our actual GitHub username.
On SSH and keys
It is recommended to utilize the SSH (secure shell) url when cloning DESDEO. This, however, requires that an SSH key pair has been generated, and that the public key has been added to one's GitHub user account. For instructions on how to set up an SSH key, see Adding a new SSH key to your GitHub account.
Lastly, we should change to the newly cloned directory with the source code
Ensuring we work on the correct branch
For now, we should also make sure that we have checked out the desdeo2 branch of the project with the command
Pulling changes on the branch might also be necessary
We are now in a position to set up our virtual environment.
Setting up a virtual environment
There are many ways to set up a virtual environment. Here, we will be utilizing poetry (cf. the section Installing required software) for the task. (If you want to use some other virtual environment system, see the DESDEO installation guide for hints.) First, we should ensure that we are utilizing a correct version of Python
This prints the version of the current Python interpreter installed on our system. If the version is correct, then we can proceed. If not, we should point poetry to the Python binary of the version we wish to utilize, e.g.,
Managing multiple Python versions
For managing multiple Python versions, a tool such as pyenv is recommended.
Before proceeding, it is useful to set the poetry configuration virtualenvs.in-project to true. This will ensure that our virtual environment will be created in the .venv/ directory in our project's directory. To configure poetry to do this, we run
Assuming we are still in DESDEO's project directory (e.g., ~/workspace/DESDEO), we can now create and activate a virtual environment with the command
This should create a virtual environment and activate it. To install DESDEO and download all of its software (development) dependencies to the environment, we can now run
- The option -E standard installs the regular version of polars. If we are on an older CPU, we might want to install the legacy version with the option -E legacy. The option -E api installs the packages required to run DESDEO's web API. Having the web API dependencies is beneficial when developing the API, the database, or both. The -E flag is used to tell poetry which extra dependencies we wish to install. Likewise, the option --group=dev tells poetry that we want to install the dependencies listed as development dependencies.
This might take a while. After poetry is done installing, and there are no error messages, we should be able to run pytest,
which runs all the tests present in DESDEO. Not all of them will be passing, but a majority of them should. This indicates that DESDEO has been correctly installed and that our virtual environment is set up correctly. Pytest and tests are discussed in more detail in the section Testing.
To exit the virtual environment, simply run
and to re-activate it,
Activating the environment requires that our current working directory is set to be the DESDEO directory with the source code. The command poetry shell will not re-create the virtual environment if it already exists, only reactivate it.
Typical Git workflow for contributing to DESDEO
This section outlines the standard process for contributing to DESDEO using Git and GitHub. This workflow assumes that we have a GitHub account. If this is not the case, we should create one at GitHub first. The workflow involves forking the DESDEO repository, cloning our fork of the repository to our local machine, making changes, and then submitting these changes as a pull request.
Forking the DESDEO repository
A fork is a copy of a repository that we manage and that is a completely separate entity from the original repository, which is often referred to as the upstream. Forking a repository allows us to freely experiment with changes without affecting the original project.
To fork a repository:
- Visit the DESDEO GitHub repository.
- In the top-right corner of the page, click the Fork button.
- This action creates a copy of the DESDEO repository in your GitHub account.
- To ensure we are working on a fork, the url to our repository should be of the form https://github.com/ourusername/DESDEO. We might have a different name for the forked repository if we chose one when making the fork.
Cloning the fork
To work on our fork on a local machine, we need to clone it first. Cloning creates a local copy of our fork. In practice, it downloads the repository and its history to our machine.
To clone our fork:
- On GitHub, we navigate to our fork of the DESDEO repository.
- Above the file list, there should be a green button labeled "Code". Clicking the button should reveal a smaller window. We should select the "SSH" tab and copy the given url.
- In a terminal, we then navigate to where we want to place the local repository.
- Then we clone the fork using the command:
- Here, workspace is just a directory where we want to store the directory containing our fork. This is just an example; we can use any directory we want.
It is highly recommended to use the SSH url. This requires setting up an SSH key pair on our local machine and uploading the public key to GitHub. The info box On SSH and keys gives further details on the process.
Setting the upstream repository
It is beneficial to periodically check the upstream, i.e., the original repository of DESDEO, for changes, and to update our fork accordingly. This ensures that our local version of DESDEO is up to date with new changes, and allows us to fix any potential conflicts between the two versions as they emerge.
Assuming we are in the DESDEO directory with the contents of our cloned fork, we can set the upstream with the following command:
The upstream should now be successfully set.
Creating a new branch
Before making changes to DESDEO's code, we should create a new branch on our local fork of DESDEO. It is a good practice to name the branch something relevant to the changes we plan to make.
DESDEO 2.0 pre-release era
Before making a branch on our fork, we should make sure we are on the desdeo2 branch of the repository:
- The output should be "On branch desdeo2... etc."
Before making a branch, we should update the desdeo2 branch first (once DESDEO 2.0 is released, this would be the main branch instead). We issue the commands
- We should make sure we are on the desdeo2 branch first!
- This will print a log of the most recent changes to the branch. We should see fairly recent changes here; if not, we should double check we are on the correct branch. To exit the log, we can press 'q'.
We are now in a position to create our own branch, which branches from desdeo2. To create a branch with the name feature-x and switch to it, we issue the command
We have now created a new branch and switched to it.
Making changes
We are finally in a position where we can begin making our changes to DESDEO and start implementing our new feature. It is advisable to check the section Development practices to learn about some of the practicalities to consider when developing DESDEO.
Assuming we have made changes, we can stage the changes using the command
- This assumes we are in the root directory of the project, i.e., the directory containing our fork. Otherwise, we can stage all files with changes with the command git add -A.
What files to commit?
In general, we should commit only source files, not compiled files. This means that no Python byte-code should ever be committed (i.e., __pycache__ directories and their contents). Luckily, rules for Git to ignore the most common types of files that should not be committed have been defined in the .gitignore file at the root level of the DESDEO project.
We can always check which files are staged, and which are not, with the command git status.
On Git status
The command git status will 99% of the time tell us exactly what we should do in case of errors related to Git. Carefully reading the output of the command is important and can save us a lot of trouble.
Once we have staged all our changes, we can add them to the branch by committing them
- This should open our system's default text editor. To change which editor is opened, we can issue the command
where editor name can be, for instance, "nano", "vim", "code", etc.
Alternatively, to avoid opening a text editor for giving our commit message, we can also issue the command
On commit messages
In a good commit message, we should give enough information for another developer to understand what was changed. Usually the first line of the commit should be a short summary, e.g., "Added a few new tests.", which is then followed (separated by a blank line) by more details, e.g., "A test was added to test the correct functioning of the NIMBUS method. A similar test was also added for the E-NAUTILUS method. Both of these tests should be passing." There is no such thing as a "too long" commit message!
We can make as many commits as we like. We do not need to have anything "ready" when making a commit. We should not be afraid of committing too often; there is no such thing! Actually, the more commits the better. Committing can be understood as "saving" our changes to the version control system, which also creates a "checkpoint" we can roll our project back to at any point in time. The changes in the commits are still only local, i.e., on our own local machine. To integrate them into our fork on GitHub, or the DESDEO upstream, we would have to push them first.
Pushing changes to our fork
After committing our changes, we can push them to our fork on GitHub:
Here, origin refers to the GitHub repository with our fork; it points to the repository we originally cloned from, which in this case is our fork. The second argument, feature-x, is the name of the branch we have been working on and committing to. To check where origin points to, we can issue the command git remote -v, which lists all the remotes and their urls.
This does not make any changes to the original upstream repository of DESDEO. For our changes to be integrated in the upstream, we have to make a pull request.
Creating a pull request
A pull request is a GitHub feature where we can notify the maintainers of an upstream repository, usually the one we originally forked (in this case DESDEO), that we have made changes that we would like to integrate into the upstream. When making a pull request, it is assumed that a feature, or features, to be added are complete and not work in progress. Once a pull request has been made, the maintainers of DESDEO will be notified. They will then check the changes, and either accept them as they are and pull them into the upstream, or they can give feedback on what needs to be changed for the pull request to be accepted. The more commits in a pull request, the easier it is for a maintainer to review the changes.
In practice, making a pull request consists of the following steps:
- Go to the fork on GitHub.
- Switch to the branch with our new feature, e.g., feature-x.
- There should be a green "Pull request" button next to our branch. Click it.
- We can then review the changes in the pull request against the upstream. We can also provide additional information about the contents of the pull request.
- Once we are done creating the pull request and describing it, we can then create it.
We may still continue working on our local branch and pushing commits to our fork. The pull request can always be updated with the new changes in GitHub.
Keeping your fork up to date
While working on our fork, it is a good practice to keep it up to date with the upstream. This is important because, for example, a pull request from another contributor might have been accepted into the upstream while we have been working on our local fork. Some of these changes might also affect the code we have been working on, and we should make sure our changes are based on the latest code in the upstream. This helps avoid overly complicated merge conflicts in the repository when making a pull request.
The following commands can be used to update our fork (remember to commit any changes to your own branch, e.g., feature-x, first!):
$ git fetch upstream # (1)!
$ git checkout desdeo2 # (2)!
$ git merge upstream/desdeo2 # (3)!
$ git push origin desdeo2 # (4)!
- The fetch command downloads all the changes made to the upstream but does not apply them, unlike the pull command would.
- Remember, this is the main branch of the upstream, which we forked and which we want to keep up with.
- This adds all the changes made in the upstream to our local fork. The desdeo2 branch of our local fork is now up to date.
- Lastly, we want to update our fork on GitHub, i.e., the remote, as well. This command pushes the updated version of the main working branch to our fork on GitHub.
If we have work in progress in our feature branch (e.g., feature-x), we can then change back to it and attempt to merge the most recent changes in the upstream with our work:
- We are now in our feature branch again.
- This will attempt to merge all the changes in the upstream with our feature branch.
If we have made changes to parts of the code that have also been changed in the upstream, then we would have a merge conflict. The output of the command git status should give us plenty of information on how to proceed to resolve the conflict.
For further instructions on how to resolve merge conflicts, see the section Further resources.
Further resources
Git is a very powerful tool for version control. What we have covered in this tutorial thus far barely scratches the surface of what Git is capable of, and how to use it. The interested reader is encouraged to check out further resources, such as the official documentation for Git and the references therein.
The process of learning Git has also been gamified by many. Two popular examples for learning Git are:
Combined with the official documentation, these games can supplement one's learning journey toward becoming a proficient Git user.
Lastly, there is a seminar talk by Giovanni Misitano, where the Git workflow discussed in this section is presented in more detail. While the talk references an older version of DESDEO, the contents relevant to Git and GitHub are still very much relevant. A recording of Giovanni's talk can be found on YouTube.
Development practices
There are at least three important aspects we should keep in mind when it comes to the development practices of DESDEO:
- First, we should adhere to common coding practices so that the codebase of DESDEO can be kept coherent and similar across its different modules and files.
- Second, we should test the code we have written as much as possible, ensuring that it works as expected and that we do not break any existing code in DESDEO with our additions.
- Third, we should document our additions carefully to ensure their usability and reusability, and to support other users in utilizing our additions.
These topics are discussed in their respective sections Code style and linting; Typechecking; Testing; and Documenting.
Code style and linting
The practices we should adhere to while writing code, such as line length and quotation style, are determined by the code style. Luckily, we do not have to remember each detail ourselves each time we type a new line of code. Instead, we can rely on linters. Linters are external tools that, once run, will check the code we have written to ensure it follows the code style we have chosen, and also that our code is syntactically valid. If not, the linter will alert us of any discrepancies. Instead of manually correcting each discrepancy, we can also utilize code formatters, which can be combined with a linter to automatically format our code such that it follows the established code style.
In DESDEO, we have chosen to utilize Ruff, which is both a linter and a code formatter. If we have installed DESDEO utilizing poetry, including its development dependencies (cf. the section Setting up a virtual environment), then Ruff should be installed on our machines.
To run Ruff on a file and check for any errors or discrepancies, we can issue the command
The output could then look like
desdeo/problem/schema.py:31:5: D417 Missing argument description in the docstring for `parse_infix_to_func`: `cls`
desdeo/problem/schema.py:394:7: TD002 Missing author in TODO; try: `# TODO(<author_name>): ...` or `# TODO @<author_name>: ...`
desdeo/problem/schema.py:394:7: TD003 Missing issue link on the line following this TODO
desdeo/problem/schema.py:394:7: FIX002 Line contains TODO, consider resolving the issue
Found 4 errors. # (1)!
- In case Ruff finds no errors, its output will read "All checks passed!"
We can also ask Ruff to try and fix the found errors in the same file with the command
This will automatically fix the errors, if possible. If not, we will have to manually address them. Luckily, Ruff's output is very rich, and the fix is often easy to implement.
The code style utilized in DESDEO has been configured in the project's configuration file pyproject.toml, which is found in the root directory of the project. Ruff-related options are found in sections starting with [tool.ruff...].
For more information on Ruff, the reader is encouraged to check its official documentation.
Typechecking
While Python is not a statically typed language, it still offers the option to provide typehints in function and variable declarations. For example
def divide_numbers(numerator: int | float, denominator: int | float) -> float:
    return numerator / denominator
In the above example, the types following each function argument (after the colon) denote the type of the argument, in this case, either an int or a float. The type after the arrow -> in the function definition defines the return type of the function, in this case a float.
Python itself does not enforce the types of variables or functions based on the typehints, but rather, they can help users and developers to reason about the code. Moreover, tools have been implemented to do static type checking for Python code enhanced with typehints. One such tool is mypy, and it is utilized in DESDEO as well.
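As a small illustration of the first point, the sketch below shows that a call violating the typehints still executes at runtime. The function repeat_twice is a hypothetical example of our own and is not part of DESDEO. A static type checker would be expected to flag the second call (hence the ignore comment), but Python itself does not.

```python
def repeat_twice(value: int) -> int:
    """Return the given value multiplied by two."""
    return value * 2


# The typehints are respected here.
doubled = repeat_twice(21)

# The typehints are violated here, but Python does not complain at runtime:
# multiplying a string by two simply repeats it.
not_an_int = repeat_twice("ab")  # type: ignore[arg-type]

print(doubled)  # 42
print(not_an_int)  # abab
```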
As was the case with Ruff, mypy should already be installed on our machine if we installed DESDEO with its development dependencies. To check a file with mypy, we can run the command
As an example, the output of mypy could look like
desdeo/problem/utils.py:26: error: Argument 1 to "len" has incompatible type "float"; expected "Sized" [arg-type]
desdeo/problem/utils.py:28: error: Value of type "float" is not indexable [index]
desdeo/problem/utils.py:58: error: Value expression in dictionary comprehension has incompatible type "float | None"; expected type "float" [misc]
desdeo/problem/utils.py:70: error: Value expression in dictionary comprehension has incompatible type "float | None"; expected type "float" [misc]
Found 4 errors in 1 file (checked 1 source file)
We would then have to manually fix the typehints highlighted by mypy.
It is important to keep in mind that mypy does not guarantee any form or type of type safety. It may enhance it, but it relies heavily on the typehints provided by users, which may be wrong or only partially correct. Moreover, mypy does not remove the need for runtime type checks. Nevertheless, mypy is a useful tool to check the consistency and correctness of user provided typehints in DESDEO, which can improve the overall quality of the code.
For more information on mypy, the reader is encouraged to check its official documentation.
Testing
Another way to ensure the code quality in DESDEO is testing. DESDEO utilizes pytest as its testing library, which should come installed with the project if we installed DESDEO with its development dependencies.
Tests should be located in the tests/ directory found at the root of the project. Tests are written inside .py files with the test_ prefix, e.g., test_feature.py. Tests themselves should be defined as test cases, each in its own Python function with its name starting with the prefix test_, e.g., def test_feature():. It is important that these naming conventions are followed because pytest relies on them during test discovery (i.e., when it tries to figure out where tests have been defined).
As an example of a test case, let us consider the following
@pytest.mark.feature  # (1)!
def test_feature_correct_output():
    output: int = double(5)  # (2)!
    expected = 10

    assert output == expected  # (3)!
- To add marks, we must remember to import pytest first (import pytest).
- We can imagine we are testing a feature that doubles a given integer and returns an integer corresponding to the doubled value of the argument. This function has been imported, e.g., from feature_file import double.
- An assert statement will raise an AssertionError whenever the expression following it resolves to be False.
What should a test test?
Because DESDEO is currently mainly developed by researchers, we do not have the time or resources to implement tests in a systematic fashion, i.e., unit tests for every single feature, followed by more comprehensive tests. Instead, we have taken a code-coverage approach, where our goal is to write at least some tests that, together, visit each line in the code base at least once. At a minimum, a test should check that a piece of code executes without error with some input. It is even better if the output can be checked to be logically correct, and better still if multiple inputs and outputs of a piece of code can be checked.
A bare minimum test would look something like the following
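As a sketch of such a bare minimum test, assuming the same hypothetical double function as in the earlier example (defined inline here so that the snippet is self-contained), we only check that the call completes without raising an error:

```python
def double(number: int) -> int:
    """Hypothetical feature under test: return twice the given integer."""
    return 2 * number


def test_feature_runs():
    # Only checks that the call completes without raising an error;
    # the returned value is not inspected.
    double(5)
```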
An optimal test would look like this:
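A minimal sketch of a test in this spirit, again assuming the hypothetical double function (defined inline for self-containment), checks both the type and the value of the output:

```python
def double(number: int) -> int:
    """Hypothetical feature under test: return twice the given integer."""
    return 2 * number


def test_feature_correct_output():
    output = double(5)
    expected = 10

    # Check that the output has the expected type...
    assert isinstance(output, int)
    # ...and that it is logically correct.
    assert output == expected
```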
And in an ideal case we would have multiple tests like the above for other inputs as well, and more logical checks, e.g.,
def test_sign():
    output = double(4)
    expected = 8

    assert is_positive(output)  # (1)!
    assert output == expected
- This is just some function that checks that a number is positive, returning True if it is, otherwise it returns False.
In the earlier example, we defined a test case test_feature_correct_output, where we check that the output of double given the argument 5 is 10. If not, the test will raise an AssertionError and the test will not pass. A passing test is one that does not raise any errors. We have also utilized the decorator @pytest.mark.feature, which tells pytest that this test has been marked with the mark feature. Marks should be registered in the configuration of pytest, which can be found in the pyproject.toml file's section [tool.pytest.ini_options]. Each mark should be registered only once, and can be re-utilized as many times as needed. Multiple test cases can have the same mark, and a test case can have multiple marks. Marks are a useful way to avoid running all tests each time we run pytest, and to run instead only the subset relevant to the changes we have made, for instance, when implementing a new feature.
We can then run the test with the command pytest -m feature
- The option -m feature tells pytest to only run tests with the mark feature. To exclude tests with a mark, and run every other test instead, we can, for instance, issue the command pytest -m "not feature", which would run all tests that are not marked with the mark feature.
The output of pytest will be very explicit about whether the test is passing or not. We can also run all the tests (that have not been explicitly marked to be skipped) with the command pytest.
Since DESDEO has quite a few tests defined, this may take a while. It is therefore recommended to use marks during development to run only a subset of tests that are relevant to the ongoing work.
However, there are some instances when all the tests should always be run. These are:
- Before starting to develop any new features, or making modifications. That is, after a git pull in a clean working tree.
- Before we push our changes to a remote repository, e.g., before a git push.
- Optionally, we might want to run all the tests before any commits as well, e.g., before a git commit command.
Running tests in the described cases ensures that we, first of all, start working on a functioning code base. We can check which tests are passing before we make our changes. If, after we have made our changes, some of the previously passing tests are no longer passing, then we know that the changes we introduced, are likely the source of the tests breaking. In this case, we should investigate the reason and implement adequate fixes.
Running tests before pushing is also important to ensure that we have not broken any existing code, and that the code we have introduced (and written tests for!) is also working as intended. We should never push code that breaks existing code, unless we have a very good reason to do so. For similar reasons, running all tests before committing is also advisable.
For examples of existing tests in DESDEO, the reader is encouraged to check the directory tests at the root of the project and the tests therein. The official documentation for pytest is also a valuable resource to check out.
Documenting
DESDEO comes with comprehensive documentation. New features should be carefully documented as well when they are developed. Documentation in DESDEO can be roughly divided into three main types:
- comments found in the source code,
- docstrings, and
- external or project documentation.
Comments are found in the files containing Python source code found across DESDEO. These are often pieces of very specific information related to a line or block of code. They are often present to help a developer understand what a specific part of the code does. Comments are often specific and technical, and depend on the context. They are not meant to be understandable outside the source code. An example of a comment could be:
def log2(x: float) -> float:
    # argument must be positive and non-zero, if not, raise an error
    if not x > 0:
        msg = f"Argument {x=} must be positive and non-zero."  # (1)!
        raise ValueError(msg)

    # we utilize log10 and the change of base formula for computing the base-2
    # logarithm
    return log10(x) / log10(2)  # (2)!
- In Python f-strings, the formatter {x=} prints the name of the variable and its value. E.g., instead of writing x={x}, we can write {x=}.
- We assume that we have a function log10 available that computes the base-10 logarithm of its argument.
As we can see, comments start with a hash #. Taken out of their context, the comments do not make much sense, but in the code, they provide valuable information.
Write informative comments
When writing comments, avoid redundant comments, such as:
- This is obvious and the comment does not provide any additional informative value.
Instead, try to provide additional information that helps a reader understand, for instance, why a particular piece of code has been written. For example:
- This is not obvious and provides valuable information for understanding the rest of the code.
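As a hypothetical illustration of the difference between a redundant and an informative comment, consider the following sketch. The names count, weights, and EPSILON are made up for this example and are not taken from DESDEO.

```python
count = 0

# increment count by one  (redundant: only repeats what the code already says)
count += 1

weights = [0.2, 0.0, 0.8]
EPSILON = 1e-6  # hypothetical small positive constant

# zero weights would cause a division by zero further down the line, so we
# replace them with a small positive constant  (informative: explains why)
safe_weights = [w if w > 0 else EPSILON for w in weights]

print(count)
print(safe_weights)
```

The first comment can be removed without losing anything, while the second one records a reason that cannot be read from the code alone.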
Docstrings, on the other hand, are a lot more self-contained and informative
than comments.
Docstrings are Python literals (they start and end with three double quotation marks """)
that are often used to describe Python functions and classes, though they can be used
to describe many other things as well, e.g., class and module level variables. To illustrate a docstring,
let us return to the previous example:
def log2(x: float, useless: str) -> float:
    """Computes the base-2 logarithm of a given number.

    Computes the base-2 logarithm of a given non-zero and positive number.
    Utilizes a base-10 logarithm and the change of base formula. The formula
    is defined as `y = log10(x) / log10(b)`, where `x` is the
    number for which the logarithm is computed, `y` is the logarithm
    of `x` in the base `b`, and `log10` is the base-10 logarithm. In
    this case, `b=2` since we are computing the base-2 logarithm.

    References:
        Napier, J. (1614). Mirifici Logarithmorum Canonis Descriptio
            [Description of the Wonderful Rule of Logarithms]

    Raises:
        ValueError: the given argument `x` is either zero or negative.

    Arguments:
        x (float): the number for which the base-2 logarithm should be computed.
            Must be non-zero and positive.
        useless (str): an unused variable to just showcase how we can have
            multiple entries in a section.

    Returns:
        float: the base-2 logarithm of `x`.
    """ # (1)!
    # argument must be positive and non-zero, if not, raise an error
    if not x > 0:
        msg = f"Argument {x=} must be positive and non-zero."
        raise ValueError(msg)
    # we utilize log10 and the change of base formula for computing the base-2
    # logarithm
    return log10(x) / log10(2)
- The first line of the docstring should be a short summary of the function. This is then followed by a more detailed description.

  The more detailed description is then followed by sections. The section References can be used to list one or more references related to the function, for example, when it is based on the works of others. Each entry in any section starts on a new line that is indented one level relative to the section name, in this case References. Following lines must be further indented if they belong to the same entry.

  Likewise, the section Raises describes what exceptions may be raised by the function. The type of the exception is followed by a colon and a description of the reason for the exception.

  In the Arguments section, each argument of the function is described. The name of the argument is followed by its type in parentheses, and then a colon, after which details about the argument are given.

  Lastly, the return value of the function is described in the Returns section. Here, the type of the return value is followed by a colon and a description of the value.

  Each function docstring should have at least an Arguments and a Returns section.
As we can notice, a docstring is a lot more informative than a mere comment. The docstring alone is understandable even without the function definition. However, there exist multiple styles for docstrings. The one illustrated here follows the docstring format described in Google's style guide, which is also the style for docstrings utilized in DESDEO.
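One practical difference to comments is that Python keeps docstrings available at runtime, which is what allows tools to read them. The following minimal sketch illustrates this with the standard library's inspect.getdoc. The shortened docstring is only for illustration, and log10 is again assumed to come from the math module.

```python
import inspect
from math import log10  # assumption: log10 comes from the math module


def log2(x: float) -> float:
    """Computes the base-2 logarithm of a given number.

    Arguments:
        x (float): the number for which the base-2 logarithm should be computed.
            Must be non-zero and positive.

    Returns:
        float: the base-2 logarithm of `x`.
    """
    # this comment is not stored with the function, unlike the docstring above
    return log10(x) / log10(2)


doc = inspect.getdoc(log2)  # the docstring with its indentation cleaned up
summary = doc.splitlines()[0]  # the one-line summary is the first line
print(summary)
```

The same text is shown by help(log2) in an interactive session, whereas the comment inside the function body is not.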
Lastly, we have external or project documentation. This is easy to illustrate since we are reading it right now! In other words, this refers to the part of the documentation that does not reside in the Python source code itself, but is, as the name suggests, external, and documents larger entities related to DESDEO, such as different concepts, or, like in this tutorial, how to contribute to DESDEO.
The tool utilized for building the very documentation we are reading is mkdocs. We also utilize its extension mkdocstrings for automatically generating reference documentation from the docstrings written in the Python source code found across the DESDEO project, and the Material theme for mkdocs. As with the other tools discussed in this section, mkdocs is installed on our local machine if we installed DESDEO with its development dependencies.
The external documentation of DESDEO resides in the directory docs
found at the root of the project. The overall structure and
the configuration related to mkdocs, and the other documentation tools
used, are found in the file mkdocs.yml, which is also found at the root
level of the project.
To build and view the current documentation, we can issue the command
This will build the documentation and create a local server, allowing
us to view the built documentation by opening the url http://localhost:8000/
in a web browser.
To better understand the structure of DESDEO's documentation, it is
best to take a moment and explore the file mkdocs.yml, starting with
its top-level contents, and especially the contents in the section nav:,
which describes the structure of the generated web page containing this
documentation. The actual contents of the documentation are written in a rich
markdown format, and are logically arranged in the sub-folders found in docs/.
Rich here just refers to the fact that we are using extensions of
the markdown language that support \(\LaTeX\), for instance.
To give an example of the structure, the
contents of this tutorial can be found in the file
docs/tutorials/contributing.md.
We will not discuss mkdocs and its extensions further in this tutorial. A novice contributor will have the easiest time by studying how the existing documentation has been written, and working by example. For further reading, the documentation for mkdocs, mkdocstrings-python, and Material for mkdocs are good reference material.
Integrated development environments
Thus far, we have separately discussed the tools utilized when developing DESDEO. While these tools can be used as demonstrated, doing so is often inconvenient from a practical point of view. In practice, many of the discussed tools are run automatically, or managed through other, arguably more user friendly tools, such as various graphical user interfaces.
Integrated development environments, or IDEs, are programs that offer an environment for developing code, often integrating many other tools, such as Git and different linters. They also offer a text, or rather code, editor for editing the source code found in a project.
Many different IDEs exist, and the choice of the right IDE is very subjective. For this reason, we have introduced the tools for developing DESDEO on their own, without making assumptions regarding use of any IDE. It is also perfectly fine to not use IDEs, but with modern code bases and tools, IDEs can streamline an otherwise complicated tool chain and development process, and thus significantly save a developer's time. That being said, we will briefly mention one IDE and some recommended extensions for easing the process of developing and contributing to DESDEO.
Visual Studio Code
One of the most popular IDEs currently being used is Visual Studio Code (Vscode for short). It is available on all major platforms, making it a portable solution. Its development is backed by a major company, and it is based on mostly open source software. Vscode is also very customizable, and has a plethora of different extensions available, both official and user made. A more open source version of Vscode is also available, which mainly disables Microsoft's telemetry and strips the application of other proprietary parts. The two versions are virtually the same, but the proprietary version does come with some extensions that are otherwise not available in the open source version. However, these are not necessary for developing DESDEO.
Vscode comes with Git support out of the box and is nicely documented. To manage GitHub specific aspects from Vscode, we need to install the extension GitHub Pull Requests. This extension allows us to make pull requests from Vscode, open and browse issues, and much more.
For better Python support, including support for enabling code auto-completion, running tests, and debugging, the Python extension is recommended. This extension also supports virtual environments, which are automatically activated if they reside in the project's directory.
For linting and code formatting, the Ruff extension for Vscode is very useful. This extension allows us to run Ruff automatically on our code whenever we, for instance, save our changes. Auto-fixable issues will be fixed automatically, while other issues will be highlighted and reported inside Vscode.
To help with documentation, the autoDocstring extension is useful. It will automatically
generate docstring templates, populating them
intelligently with relevant sections, e.g., Arguments, by picking the arguments from the
function definition.
To avoid most typos, the extension Code Spell Checker can be useful. It is smart enough to identify typographical errors in source code despite Python's syntax. Perhaps most importantly, it will point out errors in docstrings and markdown files as well when writing documentation.
All of these extensions can be configured to further match one's needs, but we are not going to discuss these aspects in this tutorial. Many other extensions exist as well that can prove to be useful. What we have listed here is a bare minimum to get us started. And as said, the choice of the best IDE, or the choice not to use one at all, is personal and subjective. One is free to explore the many other alternatives available as well.
Summary
We have covered many topics in this tutorial, and at first, its contents may feel overwhelming. However, despite first impressions, most of the discussed steps and practices will become almost second nature fairly quickly. To support the reader's endeavors in contributing to DESDEO, we have summarized the main steps to take into account when contributing to DESDEO in this last section.
Contribution workflow
1. Fork the DESDEO repository. This is done on GitHub.
2. Clone the fork. ($ git clone <url to fork>)
3. Make sure the fork is up to date with the upstream, or main repository, of DESDEO.
4. If no virtual environment exists, create and activate a virtual environment. ($ poetry shell)
5. Else activate an existing virtual environment. ($ poetry shell)
6. Run tests and make sure at least most of them are passing. ($ pytest)
7. Create or switch to a local branch.
8. Make changes to the code. Stage them, and keep committing the changes and running tests now and then.
9. Remember to write tests and documentation, including comments, docstrings, and external documentation, when relevant.
10. Be mindful of the outputs of Ruff and mypy. Fix errors and warnings whenever possible.
11. Push your commits to your fork on GitHub. ($ git push origin feature-x)
12. Make a pull request on GitHub.
13. Go to step 3.
Conclusions, where to go next, and our Discord server
Many of the topics covered in this tutorial are not unique to DESDEO, but rather adhere to some common modern practices in developing Python software. By investing a little bit of our time in familiarizing ourselves with the various tools discussed, we will be able to save a lot of time in the future. Moreover, a lot of the information covered is applicable to other open source projects as well.
We are now in a position to start contributing to DESDEO. If we do not know where to start, we can check the open issues found on DESDEO's GitHub repository.
For further support, feel free to join the MCDM Community's Discord server. It has dedicated channels for DESDEO as well. There, you may ask for further assistance and discuss DESDEO and its development in general.
To join the Discord server, click here!
Happy coding!