Development Guidelines#
Setup#
Python environment#
There is an environment.yml provided in the repository root, which installs all required development dependencies in the facet-develop environment.
conda env create -f environment.yml
conda activate facet-develop
Pre-commit hooks#
This project uses a number of pre-commit hooks such as black and flake8 to enforce uniform coding standards in all commits. Before committing code, please run
pre-commit install
You can use pre-commit run to manually run the pre-commit hooks from the command line.
Pytest#
Run pytest test/ from the facet root folder or use the PyCharm test runner. To measure coverage, use pytest --cov=src/facet test/. Note that the code coverage reports are also generated in the Azure Pipelines (see CI/CD section).
Note that you will need to set the PYTHONPATH to the src/ directory by running export PYTHONPATH=./src/ from the repository root.
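The commands above can also be driven from Python, for example from a small helper script. The following is a minimal sketch, not part of the FACET repository: the function names are hypothetical, and it assumes the pytest-cov plugin is installed so that the --cov option is available.

```python
import os
import subprocess
import sys
from pathlib import Path
from typing import Dict, List


def pytest_command(with_coverage: bool = False) -> List[str]:
    """Build the pytest command line described above."""
    command = [sys.executable, "-m", "pytest"]
    if with_coverage:
        # --cov is provided by the pytest-cov plugin (assumed to be installed)
        command.append("--cov=src/facet")
    command.append("test/")
    return command


def pytest_environment(repo_root: Path) -> Dict[str, str]:
    """Copy the current environment and point PYTHONPATH at the src/ directory."""
    env = dict(os.environ)
    env["PYTHONPATH"] = str(repo_root / "src")
    return env


def run_tests(repo_root: Path, with_coverage: bool = False) -> int:
    """Run the test suite from the repository root and return the exit code."""
    completed = subprocess.run(
        pytest_command(with_coverage),
        cwd=repo_root,
        env=pytest_environment(repo_root),
    )
    return completed.returncode
```

Calling run_tests(Path("."), with_coverage=True) from the repository root would correspond to exporting PYTHONPATH and running the coverage command by hand.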
Git Guidelines#
For commits to GitHub, phrase commit comments as the completion of the sentence This commit will …, e.g.
add method foo to class Bar
but not
added method foo to class Bar
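As an illustration only, the convention could be checked with a rough heuristic such as the one below. This is a hypothetical helper, not a hook used by the project, and it is deliberately crude: it only looks at the ending of the first word, so verbs such as "bring" or "embed" would be flagged incorrectly.

```python
def looks_imperative(subject: str) -> bool:
    """Return False if the first word looks like a past-tense or -ing form."""
    words = subject.strip().split()
    if not words:
        return False
    first_word = words[0].lower()
    # crude heuristic: "added", "adding" etc. do not complete "This commit will ..."
    return not (first_word.endswith("ed") or first_word.endswith("ing"))
```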
Documentation#
This section provides a general guide to the documentation of FACET, including docstrings, Sphinx, the README and tutorial notebooks.
Docstrings#
The API documentation is generated from docstrings in the source code. Before writing your own, take some time to study the existing code documentation and emulate the same style. Describe not only what the code does, but also why, including the rationale for any design choices that may not be obvious. Provide examples wherever this helps explain usage patterns.
A docstring is mandatory for all of the following entities in the source code, except when they are protected/private (i.e. the name starts with a leading _ character):
modules
classes
functions/methods
properties
attributes
Docstrings are not necessary for non-public methods, but you should have a comment that describes what the method does.
Docstrings must use reStructuredText syntax, the default syntax for Sphinx.
Write docstrings for functions and methods in the imperative style, e.g.,
def fit():
    """Fit the model."""
but not
def fit():
    """This is a function that fits the model."""
which is too wordy and not imperative.
Write docstrings for modules, classes, and attributes starting with a descriptive phrase (as you would expect in a dictionary entry). Be concise and avoid unnecessary or redundant phrases. For example:
class Inspector:
    """
    Explains the inner workings of a predictive model using the SHAP approach.

    The inspector offers the following analyses:

    - ...
    - ...
but not
class Inspector:
    """
    This is a class that provides the functionality to inspect models
    ...
as this is too verbose, and explains the class in terms of its name which does not add any information.
Properties should be documented as if they were attributes, not as methods, e.g.,
@property
def children(self) -> Foo:
    """The child nodes of the tree."""
    pass
but not
@property
def foo(self) -> Foo:
    """:return: the foo object"""
    pass
Start full sentences and phrases with a capitalised word and end each sentence with punctuation, e.g.,
"""Fit the model."""
but not
"""fit the model"""
For multi-line docstrings, insert a line break after the leading triple quote and before the trailing triple quote, e.g.,
def fit():
    """
    Fit the model.

    Use the underlying estimator's ``fit`` method
    to fit the model using the given training sample.

    :param sample: training sample
    """
but not
def fit():
    """Fit the model.

    Use the underlying estimator's ``fit`` method
    to fit the model using the given training sample.

    :param sample: training sample"""
For method arguments, return values, and class parameters, hint the types using type annotations and the typing module. Do not specify the parameter types in the docstrings, e.g.,
def f(x: int) -> float:
    """
    Do something.

    :param x: input value
    :return: output value
    """
but not
def f(x: int) -> float:
    """
    Do something.

    :param int x: input value
    :return float: output value
    """
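The rules above can be seen working together in one small class. The Node class below is a made-up example, not part of FACET, and documenting the constructor parameter in the class docstring is an assumption about the preferred placement.

```python
from typing import List


class Node:
    """
    A node in a tree of labelled values.

    Nodes keep their children in insertion order so that traversal results are
    reproducible.

    :param label: the label of this node
    """

    def __init__(self, label: str) -> None:
        self._label = label
        self._children: List["Node"] = []

    @property
    def children(self) -> List["Node"]:
        """The child nodes of this node."""
        return list(self._children)

    def add_child(self, child: "Node") -> "Node":
        """
        Append a child node.

        Return the node itself rather than the child, so that calls can be chained.

        :param child: the node to append
        :return: this node
        """
        self._children.append(child)
        return self
```

The class docstring opens with a descriptive phrase, the property reads like an attribute, the method is described in the imperative and explains why it returns the node itself, and types appear only in the annotations.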
Sphinx Build#
Documentation for FACET is built using Sphinx.
Before building the documentation, ensure the facet-develop environment is active, as the documentation build has a number of key dependencies specified in the environment.yml file, specifically:
sphinx
pydata-sphinx-theme
nbsphinx
sphinx-autodoc-typehints
To generate the Sphinx documentation locally, navigate to /sphinx and run
python make.py html
By default this will clean any previous build. The generated Sphinx documentation for FACET can then be found at sphinx/build/html.
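If the build needs to be triggered from another Python script, the command above can be wrapped as follows. This is a minimal sketch with hypothetical function names; it assumes the repository layout described in this section and simply shells out to make.py.

```python
import subprocess
import sys
from pathlib import Path
from typing import List


def build_command(target: str = "html") -> List[str]:
    """Build the command line for the documentation build."""
    return [sys.executable, "make.py", target]


def build_docs(repo_root: Path) -> Path:
    """Run the HTML build from the sphinx folder and return the output directory."""
    sphinx_dir = repo_root / "sphinx"
    # check=True raises CalledProcessError if the build fails
    subprocess.run(build_command(), cwd=sphinx_dir, check=True)
    return sphinx_dir / "build" / "html"
```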
Documentation versioning is managed via the release process - see the section on building and releasing FACET.
The sphinx folder in the root directory contains the following:
a make.py script for executing the documentation build via python
a source directory containing predefined .rst files for the documentation build and other required elements (see below for more details)
a base folder which contains
    the make_base.py and conf_base.py scripts with nearly all configuration for make.py and conf.py
    _static directory, containing logos, icons, javascript and css used for pytools and other packages documentation builds
    _templates directory, containing autodoc templates used in generating and formatting the modules and classes for the API documentation
The sphinx/source folder contains:
a conf.py script that is the build configuration file needed to customize the input and output behavior of the Sphinx documentation build (see below for further details)
a tutorials directory that contains all the notebooks (and supporting data) used in the documentation build. Note that as some notebooks take a little while to generate, the notebooks are currently committed with cell output. This may change in the future where notebooks are run as part of the sphinx build
the essential .rst files used for the documentation build, which are:
    index.rst: definition of the high-level documentation structure which mainly references the other .rst files in this directory
    contribution_guide.rst: detailed information on building and releasing FACET.
    faqs.rst: contains guidance on bug reports/feature requests, how to contribute and answers to frequently asked questions including small code snippets
    api_landing.rst: for placing any API landing page preamble for documentation as needed. This information will appear on the API landing page in the documentation build after the short description in src/__init__.py. This file is included in the documentation build via the custom-module-template.rst
_static contains additional material used in the documentation build, in this case, logos and icons
The two key scripts are make.py and conf.py. The base configuration for these scripts can be found in pytools/sphinx. The reason for this is to minimise duplicated code, given the standardization of the documentation build across multiple packages.
make.py: All base configuration comes from pytools/sphinx/base/make_base.py and this script includes defined commands for key steps in the documentation build. Briefly, the key steps for the documentation build are:
Clean: remove the existing documentation build
FetchPkgVersions: fetch the available package versions with documentation
ApiDoc: generate API documentation from sources
Html: run Sphinx build to generate HTML documentation
The two other commands are Help and PrepareDocsDeployment, the latter of which is covered below under Building and releasing FACET.
conf.py: All base configuration comes from pytools/sphinx/base/conf_base.py. This build configuration file is a requirement of Sphinx and is needed to customize the input and output behavior of the documentation build. In particular, this file highlights key extensions needed in the build process, of which some key ones are as follows:
intersphinx (external links to other documentation built with Sphinx: matplotlib, numpy, …)
viewcode to include source code in the documentation, and links to the source code from the objects' documentation
imgmath to render math expressions in docstrings. Note that a local LaTeX installation is required (e.g., MiKTeX for Windows)
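For orientation, the three extensions listed above are enabled in a Sphinx conf.py roughly as shown below. This is a generic sketch and not an excerpt from conf_base.py: the intersphinx URLs and the image format setting are illustrative assumptions, and the actual FACET configuration may differ.

```python
# Hypothetical excerpt of a Sphinx conf.py; the actual FACET settings are
# inherited from pytools/sphinx/base/conf_base.py and may differ.
extensions = [
    "sphinx.ext.intersphinx",  # links to other Sphinx-built documentation
    "sphinx.ext.viewcode",  # rendered source code and links to it
    "sphinx.ext.imgmath",  # math rendered as images, requires LaTeX
]

# Maps a short name to (base URL, inventory location); None means the
# default objects.inv file under the base URL.
intersphinx_mapping = {
    "numpy": ("https://numpy.org/doc/stable", None),
    "matplotlib": ("https://matplotlib.org/stable", None),
}

# imgmath can produce "png" (the default) or "svg" images
imgmath_image_format = "svg"
```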
README#
The README file for the repo is in .rst format instead of the perhaps more traditional markdown format. The reason for this is that the README.rst is included as the quick start guide in the documentation build. This helps minimize code duplication. However, there are a few key points to be aware of:
The README has links to figures, logos and icons located in the sphinx/source/_static folder. To ensure these links are correct when the documentation is built, they are altered and then the contents of the README.rst are incorporated into the getting_started.rst, which is generated during the build and can be found in sphinx/source/getting_started.
The quick start guide based on the Diabetes_getting_started_example.ipynb notebook in the sphinx/auxiliary folder is not automatically included (unlike all the other tutorials). For this reason any updates to this example in the README need to be reflected in the source notebook and vice-versa.
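How exactly the links are altered is not described here. Purely as an illustration of the idea, a path rewrite for image and figure directives could look like the sketch below; the function name, the regular expression and the prefixes in the usage note are all assumptions, not the logic used by the actual build.

```python
import re

# matches ".. image:: path", ".. figure:: path" and ".. |name| image:: path"
_IMAGE_DIRECTIVE = re.compile(
    r"^([ \t]*\.\. (?:\|[^|]+\| )?(?:image|figure):: )(\S+)$", re.MULTILINE
)


def rewrite_image_paths(rst_text: str, old_prefix: str, new_prefix: str) -> str:
    """Replace a leading path prefix in image and figure directives only."""

    def _replace(match: "re.Match[str]") -> str:
        path = match.group(2)
        if path.startswith(old_prefix):
            path = new_prefix + path[len(old_prefix):]
        return match.group(1) + path

    return _IMAGE_DIRECTIVE.sub(_replace, rst_text)
```

For example, rewrite_image_paths(text, "sphinx/source/_static/", "../_static/") would change only the directive targets and leave other mentions of the folder in the running text untouched.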
Tutorial Notebooks#
Notebooks are used as the basis for detailed tutorials in the documentation. Tutorials created for documentation need to be placed in the sphinx/source/tutorial folder.
If you intend to create a notebook for inclusion in the documentation please note the following:
The notebook should conform to the standard format employed for all notebooks included in the documentation. This template (Facet_sphinx_tutorial_template.ipynb) can be found in sphinx/auxiliary.
When creating/revising a tutorial notebook with the development environment the following code should be added to a cell at the start of the notebook. This will ensure your local clones (and any changes) are used when running the notebook. The jupyter notebook should also be started from within the facet-develop environment.
def _set_paths() -> None:
    # set the correct path when launched from within PyCharm

    module_paths = ["pytools", "facet", "sklearndf"]

    import sys
    import os

    if "cwd" not in globals():
        # noinspection PyGlobalUndefined
        global cwd
        cwd = os.path.join(os.getcwd(), os.pardir, os.pardir, os.pardir)
        os.chdir(cwd)
    print(f"working dir is '{os.getcwd()}'")
    for module_path in module_paths:
        if module_path not in sys.path:
            sys.path.insert(0, os.path.abspath(f"{cwd}/{os.pardir}/{module_path}/src"))
            print(f"added `{sys.path[0]}` to python paths")


_set_paths()

del _set_paths
If you have a notebook cell you wish to be excluded from the generated documentation, add "nbsphinx": "hidden" to the metadata of the cell. To change the metadata of a cell, in the main menu of the jupyter notebook server, click on View -> CellToolbar -> edit metadata, then click on edit Metadata in the top right part of the cell. The modified metadata would then look something like:
{
    "nbsphinx": "hidden"
}
To interpret a notebook cell as reStructuredText by nbsphinx, make a Raw NBConvert cell, then click on the jupyter notebook main menu to View -> CellToolbar -> Raw Cell Format, then choose ReST in the dropdown in the top right part of the cell.
The notebook should be referenced in the tutorials.rst file with a section structure as follows:
NAME OF NEW TUTORIAL
*****************************************************************************

Provide a brief description of the notebook context, such as; regression or
classification, application (e.g., disease prediction), etc.

- Use bullet points to indicate what key things the reader will learn (i.e., key takeaways).

Add a short comment here and direct the reader to download the notebook:
:download:`here <tutorial/name_of_new_tutorial_nb.ipynb>`.

.. toctree::
    :maxdepth: 1

    tutorial/name_of_new_tutorial_nb
The source data used for the notebook should also be added to the tutorial folder unless the file is extremely large and/or can be accessed reliably another way.
For notebooks involving simulation studies, or very long run times, consider saving intermediary outputs to make the notebook more user-friendly. Code that produces the output should be included as a markdown cell with code designated as python to ensure appropriate formatting, while preventing the cell from executing should the user run all cells.
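The "nbsphinx": "hidden" cell metadata mentioned in the list above can also be set without the notebook UI, since .ipynb files are plain JSON. The sketch below uses only the standard library; the function names are hypothetical, and the indentation used when writing the file back is an assumption about the desired layout.

```python
import json
from typing import Any, Dict, Iterable


def hide_cells(notebook: Dict[str, Any], cell_indices: Iterable[int]) -> Dict[str, Any]:
    """Mark the given cells as hidden for nbsphinx, modifying the notebook in place."""
    for index in cell_indices:
        cell = notebook["cells"][index]
        # keep any existing metadata and only add the nbsphinx entry
        cell.setdefault("metadata", {})["nbsphinx"] = "hidden"
    return notebook


def hide_cells_in_file(path: str, cell_indices: Iterable[int]) -> None:
    """Load a notebook file, hide the given cells, and write the file back."""
    with open(path, encoding="utf-8") as notebook_file:
        notebook = json.load(notebook_file)
    hide_cells(notebook, cell_indices)
    with open(path, "w", encoding="utf-8") as notebook_file:
        json.dump(notebook, notebook_file, indent=1)
        notebook_file.write("\n")
```

A typical use would be hide_cells_in_file(path, [0]) to hide the path setup cell at the start of a tutorial notebook.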
Package builds#
The build process for the PyPI and conda distributions uses the following key files:
make.py: generic Python script for package builds. Most configuration is imported from pytools make.py which is a build script that wraps the package build, as well as exposing the matrix dependency definitions specified in the pyproject.toml as environment variables
pyproject.toml: metadata for PyPI, build settings and package dependencies
tox.ini: contains configurations for tox, testenv, flake8, isort, coverage report, and pytest
condabuild/meta.yml: metadata for conda, build settings and package dependencies
Versioning#
FACET version numbering follows the semantic versioning approach, with the pattern MAJOR.MINOR.PATCH.
The version can be bumped in the src/__init__.py by updating the __version__ string accordingly.
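To make the MAJOR.MINOR.PATCH pattern concrete, the sketch below validates a version string and computes the next version. The bump function is a hypothetical helper, not part of the FACET build scripts, and it assumes plain three-part versions without pre-release suffixes.

```python
import re

# exactly three dot-separated non-negative integers
_VERSION_PATTERN = re.compile(r"(\d+)\.(\d+)\.(\d+)")


def bump(version: str, part: str) -> str:
    """Return the next version after bumping the given part of MAJOR.MINOR.PATCH."""
    match = _VERSION_PATTERN.fullmatch(version)
    if match is None:
        raise ValueError(f"not a MAJOR.MINOR.PATCH version: {version!r}")
    major, minor, patch = (int(group) for group in match.groups())
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"part must be 'major', 'minor' or 'patch', got {part!r}")
```

Bumping a higher part resets the lower parts to zero, as is usual in semantic versioning.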
PyPI#
PyPI project metadata, build settings and package dependencies are obtained from pyproject.toml. To build and then publish the package to PyPI, use the following commands:
python make.py facet tox default
flit publish
Please note the following:
Because the PyPI package index is immutable, it is recommended to do a test upload to PyPI test first. Ensure all metadata presents correctly before proceeding to proper publishing. The command to publish to test is
flit publish --repository testpypi
which requires the specification of testpypi in a special .pypirc file with specifications as demonstrated here.
The pyproject.toml does not provide specification for a short description (displayed in the top gray band on the PyPI page for the package). This description comes from the src/__init__.py script.
flit, which is used here to publish to PyPI, also has the flexibility to support package building (wheel/sdist) via flit build and installing the package by copy or symlink via flit install.
Build output will be stored in the dist/ directory.
Conda#
conda build metadata, build settings and package dependencies are obtained from meta.yml. To build and then publish the package to conda, use the following commands:
python make.py facet conda default
anaconda upload --user BCG_Gamma dist/conda/noarch/<*package.tar.gz*>
Please note the following:
Build output will be stored in the dist/ directory.
Some useful references for conda builds:
Azure DevOps CI/CD#
This project uses Azure DevOps for CI/CD pipelines.
The pipelines are defined in the azure-pipelines.yml file and are divided into the following stages:
code_quality_checks: perform code quality checks for isort, black and flake8.
detect_build_config_changes: detect whether the build configuration as specified in the pyproject.toml has been modified. If it has, then a build test is run.
Unit tests: runs all unit tests and then publishes test results and coverage.
conda_tox_build: build the PyPI and conda distribution artifacts.
Release: see release process below for more detail.
Docs: build and publish documentation to GitHub Pages.
Release process#
Before initiating the release process, please ensure the version number in src/__init__.py is correct and the format conforms to semantic versioning. If the version needs to be corrected/bumped then open a PR for the change and merge into develop before going any further.
The release process has the following key steps:
Create a new release branch from the tag of the latest release named release/<version> where <version> is the version number of the new release
Create a new branch from the baseline branch (e.g., 2.0.x) named dev/<version> where <version> is the version number of the new release
Open a PR to merge dev/<version> onto release/<version>. This will automatically run all conda/pip build tests via Azure Pipelines before the PR can be merged. This will trigger automatic upload of artifacts (conda and pip packages) from Azure DevOps. At this stage, it is recommended that the pip package build is checked using PyPI test to ensure all metadata presents correctly. This is important as package versions in PyPI proper are immutable.
If everything passes and looks okay, merge the PR using a merge commit (not squashing). This will trigger the release pipeline which will:
    Tag the release commit with version number as specified in src/__init__.py
    Create a release on GitHub for the new version, please check the documentation for details
    Pre-fill the GitHub release title and description, including the changelog based on commits since the last release. Please note this can be manually edited to be more succinct afterwards
    Attach build artifacts (conda and pip packages) to GitHub release
    Upload build artifacts to conda/PyPI using anaconda upload and flit publish, respectively
Remove any test versions for pip from PyPI test
Merge release/<version> back onto the baseline branch from which dev/<version> was branched
Bump up version in src/__init__.py on the baseline branch to start work towards the next release
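The branch names used in the steps above all derive from the version number of the new release. The following sketch spells this out; the function is hypothetical, and naming the baseline branch MAJOR.MINOR.x is an assumption generalised from the 2.0.x example.

```python
from typing import Dict


def release_branches(version: str) -> Dict[str, str]:
    """Derive the branch names for a release from a MAJOR.MINOR.PATCH version."""
    major, minor, _patch = version.split(".")
    return {
        # assumption: the baseline branch is named MAJOR.MINOR.x, e.g. 2.0.x
        "baseline": f"{major}.{minor}.x",
        "dev": f"dev/{version}",
        "release": f"release/{version}",
    }
```

For a hypothetical version 2.0.3, this gives the baseline branch 2.0.x, the development branch dev/2.0.3 and the release branch release/2.0.3.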