Did you write a cool and useful Python script? Would you like to share it with the community, but you’re not sure how to go about that? If so, then this is the article for you. We’ll go over a list of simple steps which can turn your script into a fully fledged open-source project.
The Python community has created a rich ecosystem of tools, which can help you during the development and upkeep of your project. Complete the steps in this checklist, and your project will be easier to maintain and you’ll be ready to take contributions from the community.
This is an opinionated article. I will run through a long list of tools and practices I’ve had good experience with. Some of your favorite tools may be left out, and some of my choices you may find unnecessary. Feel free to adapt the list to your liking and leave a comment below.
You can download a printable PDF version of the checklist.
I tried to complete the entire checklist in my small open-source project named
gym-demo. Feel free to use it as a reference and submit PRs if you find room for improvement.
☑ Define your command-line interface (CLI)
If you’re going to provide a command-line utility, then you need to define a friendly command-line user interface. Your interface will be more intuitive for users if it follows the GNU conventions for command line arguments.
There are many ways to parse command line arguments, but my favorite by far is to use the
docopt module developed by Vladimir Keleshev. It allows you to define your entire interface in the form of a docstring at the beginning of your script, like so:
Later you can just call the
docopt(__doc__) command and use the argument values:
I usually start with Docopt by copying one of the examples and modifying it to my needs.
☑ Structure your code
Python has established conventions for most things, including the layout of your code directory and the naming of certain files and directories. Follow these conventions to make your project easier to understand for other Python developers.
The basic directory structure of your project should resemble this:
```
package-name
├── LICENSE
├── README.md
├── main_module_name
│   ├── __init__.py
│   └── main.py
├── tests
│   └── test_main.py
├── requirements.txt
└── setup.py
```
The package-name directory contains all of the sources of your package. Usually this is the root directory of your project repository, containing all other files. Choose your package name wisely and check if it’s available on PyPI, as this will be the name people will use to install your package using:
pip install package-name
The main_module_name directory is the directory which will be copied
into your user’s
site-packages when your package is installed. You can
define more than one module if you need to, but it’s good practice to
nest them under a single module with an identifiable name.
According to the Python style guide PEP8:
Modules should have short, all-lowercase names. Underscores can be used in the module name if it improves readability. Python packages should also have short, all-lowercase names, although the use of underscores is discouraged.
If possible, the name of your package and the name of its main module should be the same. Since underscores are discouraged in package names, you can use
my-project as the package name and
my_project as the main module name.
Whether you’re writing code in Python or any other language, you should follow Clean Code principles. One of the most important ideas behind clean code is to split up your logic into short functions, each with a single responsibility.
Your functions should take zero, one, or at most two arguments. If your functions have more than two parameters, that’s a well-known code smell. It indicates that your function is probably trying to do more than one thing and you should split it up into smaller sub-functions.
Still need more than two parameters? Perhaps your function parameters are related and should come into your function as a single data structure? Or perhaps you should refactor your code so your functions become methods of an object?
In Python, you can sometimes get away with more than two parameters, if you specify default values for the extra ones. This is better, but you should still consider if the function shouldn’t be split.
Small functions with a single responsibility and few parameters are easy to write unit-tests for. We’ll come back to this.
If you’re writing a command-line utility, you should create a separate function which handles the parsing of user input and initiating the logic of your utility. You can call this function
main() or anything else you think fits.
This logic should be placed in the __main__ block of your script. The condition __name__ == "__main__" is only true if you’re calling the script directly; it’s false if you include the same Python file as a library module, e.g. from my_module import main.
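A minimal sketch of the pattern (the body of main() is illustrative):

```python
def main() -> None:
    """Parse user input and run the program logic."""
    print("Hello from my script!")


# Runs only when the script is executed directly,
# not when the file is imported as a module.
if __name__ == "__main__":
    main()
```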
The advantage of splitting the main logic into a separate
main() function is that you’ll be able to use the
main function as an entry point. We’ll come back to this when talking about entry points.
☑ Write a setup.py file
Python has a mature and well-maintained packaging utility called setuptools. The setup.py file is the build script for setuptools, and every Python project should have one.
Writing a basic setup.py file is very easy: all the file has to do is call the setup() method with appropriate information about your project.
This example comes from my gym-demo project:
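A sketch of what a basic setup.py looks like (the metadata values below are placeholders, not the actual gym-demo values):

```python
from setuptools import setup

# All metadata below is illustrative -- substitute your own values.
setup(
    name="my-project",
    version="0.1.0",
    description="One-line summary of what the package does",
    long_description=open("README.md").read(),
    long_description_content_type="text/markdown",
    url="https://github.com/your-user/my-project",
    author="Your Name",
    license="MIT",
    packages=["my_project"],
    install_requires=["docopt"],
    python_requires=">=3.6",
)
```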
The example above assumes you have a
long_description of your project in a markdown
README.md file in the same directory. If you don’t, you can specify
long_description as a string.
More information can be found in the Python packaging tutorial.
Once you have a
setup.py file, you can use it to build your Python project’s distribution packages like so:
```
$ pip install setuptools wheel
$ python setup.py sdist
$ python setup.py bdist_wheel
```
The sdist command creates a source distribution (such as a tarball or zip file with Python source code). The bdist_wheel command creates a binary wheel distribution file, which your users may download from PyPI in the future. Both distribution files will be placed in the dist directory. The bdist_wheel command comes from the wheel package.
During development another
setup.py command is even more useful:
```
$ source my_venv/bin/activate
(my_venv)$ python setup.py develop
```
This command installs your project inside a virtual environment named
my_venv, but it does so without copying any files. It links your source directory with your
site-packages directory by creating a link file (such as
my-project.egg-link). This is very useful, because you can work on your source code directly and test it in your virtual env without reinstalling the project after each change.
You can find out about other
setup.py commands by running:
$ python setup.py --help-commands
☑ Create an entry point for your script command
If you’re writing a command-line utility, you should create a console script entry point for your command. This will create an executable launcher for your script, which users can easily call at the command line.
To do this, just add an
entry_points argument to the
setup() call in your
setup.py file. For example, the following
console_scripts entry will create an executable named my-command (or my-command.exe on Windows) and place it in the
bin path of your environment. This means your users can just use
my-command after they install your package.
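A fragment of the setup() call showing the idea (the command and module names are the hypothetical ones used above):

```python
setup(
    # ...other arguments as before...
    entry_points={
        "console_scripts": [
            "my-command=my_module.main:main",
        ],
    },
)
```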
The string my_module.main:main specifies which function to call and where to find it: my_module.main is the import path to the Python file (main.py inside my_module), and :main denotes the main() function inside main.py. This is the “Python path” syntax; if you know which PEP it’s defined in, leave me a note in the comments. Thanks.
There are other cool things
entry_points can do. You can use it to customize build commands of
setup.py and even to distribute discoverable services for other tools (such as parsers for a specific file format, etc.).
☑ Create a requirements.txt file
You should provide your users with information about which other packages your package requires to work properly. The right place to put this information is inside setup.py, as an install_requires argument:
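For example (the dependency names are illustrative):

```python
setup(
    # ...other arguments as before...
    install_requires=["docopt", "requests"],
)
```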
It’s also very useful to inform your users which versions of each dependency you tested your package with. A good way to do this is to put a requirements file in your repository. The file is usually named
requirements.txt and should contain the list of your dependencies along with version numbers, for example:
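A requirements.txt with pinned versions might look like this (the packages and version numbers are examples only):

```
docopt==0.6.2
requests==2.22.0
```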
Users can then install these precise versions of your dependencies by running:
$ pip install -r requirements.txt
It may be useful to create a separate
requirements_test.txt file for dependencies used only during testing and development.
The easiest way to generate a
requirements.txt file is to run the
pip freeze command. Be careful with this though, as it will list all installed packages, whether they are dependencies of your package, the dependencies of these dependencies, or simply unrelated packages you installed in your environment.
☑ Set up a Git repo
It’s time to put your code under source-control. Everyone is using Git these days, so let’s roll with it.
Let’s start by adding a Python-specific
.gitignore file to the root of your project.
$ curl https://raw.githubusercontent.com/github/gitignore/master/Python.gitignore > .gitignore
You can now create your repo and add all files:
```
$ git init
$ git add --all
```
Verify that only files you want are being added to the repo with
git status and create your initial commit.
$ git commit -m 'Initial commit'
More .gitignore templates may be found in the GitHub gitignore repo.
☑ Use Black to format your code
The Python community is very lucky for many reasons, and one of them is the early adoption of a common code-style guide, PEP8. This is a great blessing, because we don’t have to argue about which coding style is better or define a different style for each project and each company. We have PEP8 and we should all just stick to it.
To that end, Łukasz Langa created Black – the uncompromising code formatter. You should install it, run it over your code, and then re-run it before every commit. Using Black is as easy as:
```
(my_venv) $ pip install black
(my_venv) $ black my_module
All done! ✨ 🍰 ✨
1 file reformatted, 7 files left unchanged.
```
You may disagree with some of the formatting decisions Black makes. I would say that it’s better to have a consistent style than a prettier but inconsistent one. Let’s just all use Black and get along. ☺
☑ Set up pre-commit hooks
The best way to run Black and any other code formatters is to use
pre-commit. This is a tool which is triggered every time you
git commit and runs code-linters and formatters on any modified files.
Install pre-commit as usual:
(my_venv) $ pip install pre-commit
Configure pre-commit by creating a file named
.pre-commit-config.yaml in the root directory of your project. A simple configuration, which only runs black would look like this:
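A sketch of such a configuration (the rev value is an example; pin it to a current Black release):

```yaml
repos:
  - repo: https://github.com/psf/black
    rev: 19.10b0
    hooks:
      - id: black
```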
You can generate a sample config by calling pre-commit sample-config. Set up a Git pre-commit hook by calling pre-commit install.
From now on, each time you run
git commit Black will be called to check your style. If your style is off,
pre-commit will prevent you from committing your code and
black will reformat it.
```
(my_venv) $ git commit
black....................................................................Failed
hookid: black

Files were modified by this hook. Additional output:

reformatted gym_demo/demo.py
All done! ✨ 🍰 ✨
1 file reformatted.
```
Now simply re-add the reformatted file with
git add and commit again.
☑ Code linters
Python has a great set of code linters, which can help you avoid making common mistakes and keep your style in line with PEP8 and other standard conventions. Many of these tools are maintained by the Python Code Quality Authority.
My favorite Python linting tool is Flake8, which checks for compliance with PEP8. Its base functionality can be extended by installing some of its many plugins. My favorite Flake8 plugins are listed below.
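One possible selection of well-known Flake8 plugins, installed alongside Flake8 itself (the specific choices here are illustrative — adapt them to your taste):

```
(my_venv) $ pip install flake8 pep8-naming flake8-bugbear \
    flake8-builtins flake8-comprehensions flake8-docstrings
```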
Once you install all those packages, you can simply run
flake8 to check your code.
You can configure Flake8 by adding a [flake8] configuration section to your setup.cfg or tox.ini file, or by placing it in a .flake8 file in your project’s root directory:
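A sketch of such a configuration (the values are examples; the 88-character line length matches Black’s default):

```ini
[flake8]
max-line-length = 88
max-complexity = 10
ignore = E203, W503
exclude = .git,.tox,build,dist
```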
There are other code linters you may find interesting. For example, Bugbear finds common sources of bugs, while Bandit finds common security issues in Python code. You can use them both as Flake8 plugins, of course.
☑ Create a tox configuration
tox is a great tool which aims to standardize testing in Python. You can use it to set up a virtual environment for testing your project, create a package, install the package along with its dependencies, and then run tests and linters. All of this is automated, so you just need to type one command: tox.
tox is quite configurable: by creating a tox.ini configuration file, you can decide which commands are executed or have tox install from your requirements.txt. The following simple example runs pytest in a Python 3 venv.
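A sketch of such a tox.ini (the module and file names are the placeholders used throughout this article):

```ini
[tox]
envlist = py3

[testenv]
deps =
    -rrequirements.txt
    pytest
commands =
    pytest tests/
```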
You can use
tox to easily run tests on multiple Python versions if they are installed in your system. Just extend the envlist in your tox.ini.
If you automate testing using
tox, you will be able to just run that one command in your continuous integration environment. Make sure you run
tox on every commit you want to merge.
☑ Refactor your code to be unit-testable and add tests
Using unit tests is one of the best practices you can adopt. Writing unit tests for your functions gives you a chance to take one more look at your code. Perhaps the function is too complex and should be simplified? Perhaps there’s a bug you didn’t notice before, or an edge case you didn’t consider?
Writing good unit tests is an art and it takes time, but it’s an investment which pays off many times over, especially on a large project which you maintain over a long period. For one, unit-tests make refactoring much easier and less scary. Also, you can learn to write your tests before you write your program (test-driven development), which is a very satisfying way to code.
I would recommend using the PyTest framework for writing your unit tests. It’s easy to get started with and it’s very powerful and configurable. Writing a simple unit-test is as simple as creating a
tests directory with
test_*.py files. A simple test looks like this:
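For example (normally you would import the function under test, e.g. `from my_module.main import add`; the tiny `add` function is defined inline here so the sketch is self-contained):

```python
def add(a: int, b: int) -> int:
    """Return the sum of a and b."""
    return a + b


def test_add():
    assert add(2, 3) == 5
    assert add(-1, 1) == 0
```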
Running the tests is as simple as typing pytest.
Make sure to add the pytest command to your tox configuration.
☑ Add docstrings and documentation
Writing good documentation is very important for your users. You should start by making sure each function and module is described by a docstring. The docstring should describe what the function does, phrased as an imperative-mood sentence. For example:
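A minimal illustration (the function itself is hypothetical):

```python
def send_greeting(name: str) -> None:
    """Print a greeting addressed to the given name."""
    print(f"Hello, {name}!")
```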
The parameters and return values of your functions should also be included in docstrings:
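A sketch of a fully documented function (the function and parameter names are illustrative):

```python
from typing import Dict


def count_words(text: str, min_length: int = 1) -> Dict[str, int]:
    """Count the occurrences of each word in the given text.

    :param text: the string to analyze
    :param min_length: ignore words shorter than this many characters
    :return: a dictionary mapping each word to its number of occurrences
    """
    counts: Dict[str, int] = {}
    for word in text.split():
        if len(word) >= min_length:
            counts[word] = counts.get(word, 0) + 1
    return counts
```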
Notice that I’m also using Python 3 type annotations to specify parameter and return types.
You can use the flake8-docstrings plugin to verify that all your functions have a docstring.
If your project grows larger, you will probably want to create a full-fledged documentation site. You can use Sphinx or the simpler MkDocs to generate the documentation and host the site on Read the Docs or GitHub Pages.
☑ Add type annotations and a MyPy verification step
Python 3.5 added the option to annotate your code with type information. This is a very useful and clean form of documentation, and you should use it.
For example, my_function below takes a unicode string as an argument and returns a dict of strings mapping to numeric or textual values:
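A sketch of such an annotated function (the body is illustrative):

```python
from typing import Dict, Union


def my_function(text: str) -> Dict[str, Union[int, str]]:
    """Summarize a piece of text."""
    words = text.split()
    return {
        "length": len(text),
        "first_word": words[0] if words else "",
    }
```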
Mypy is a static type checker for Python. If you type-annotate your code,
mypy will run through it and make sure that you’re using the right parameter types when calling functions.
You can add a call to
mypy to your
tox configuration to verify that you’re not introducing any type-related mistakes in your commits.
☑ Upload to GitHub
Alright. If you completed all the previous steps and checked all the boxes, your code is ready to be shared with the world!
Most open-source projects are hosted on GitHub, so your project should probably join them. Follow these instructions to set up a repo on GitHub and push your project there.
Microsoft recently acquired GitHub, which makes some people skeptical about whether it should remain the canonical place for open-source projects online. You can consider GitLab as an alternative. So far, however, Microsoft has been a good steward of GitHub.
☑ Add README and LICENSE files
The first thing people see when they visit your project’s repository is the contents of the
README.md file. GitHub and GitLab do a good job of rendering Markdown-formatted text, so you can include links, tables, pictures, etc.
Make sure you have a README file and that it contains information about:
- what your project does
- how to use it (with examples)
- how people can contribute code to your project
- what license the code is under
- links to other relevant documentation
More tips on writing a README here.
The other critically important file you should include is
LICENSE. Without this file, no one will be able to legally use your code.
If you’re not sure what license to choose, use the MIT license. It’s just 160 words, read it. It’s simple and permissive and lets everyone use your code however they want.
More info about choosing a license here.
☑ Add a continuous integration service
OK, now that your project is online and you prepared a
tox configuration, it’s time to set up a continuous integration service. This will run your style-checking, static code analysis and unit-tests on every pull request (PR) made to your repository.
Setting up Travis CI for your repository is as simple as adding a hidden YAML configuration file named
.travis.yml. For example, the following installs and runs
tox on your project in a Linux virtual machine:
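A sketch of such a configuration (the Python version is an example):

```yaml
language: python
python: "3.7"

install:
  - pip install tox

script:
  - tox
```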
All you need to do after committing .travis.yml to your repo is to log into Travis and activate the CI service for your project.
You can set up branch protection on
master to require status checks to pass before a PR can be merged.
If you’d like to run your CI on multiple versions of Python or multiple operating systems, you can set up a test matrix like so:
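For example, a multi-version matrix might look like this (the versions are illustrative; the tox-travis plugin maps each Travis Python version to the matching tox environment):

```yaml
language: python
python:
  - "3.6"
  - "3.7"
  - "3.8"

install:
  - pip install tox-travis

script:
  - tox
```

Adding entries such as `os: osx` to a `matrix: include:` section extends the same idea to other operating systems.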
Travis has fairly good documentation, which explains its many settings with configuration examples. You can test your configuration on a PR, where you modify the
.travis.yml file. Travis will rerun its CI job on every change, so you can tweak settings to your liking.
A completed Travis run will look like this example.
☑ Add a requirements updater
Breaking changes in dependencies are a common problem in all software development. Your code was working just fine a while ago, but if you try to build it today, it fails, because some package it uses changed in an unforeseen way.
One way of working around this is to freeze all the dependency versions in your
requirements.txt files, but this just puts the problem off into the future.
The best way to deal with changing dependencies is to use a service which periodically bumps versions in your
requirements.txt files and creates a pull request with each version change. Your automated CI can test your code against the new dependencies and let you know if you’re running into problems.
Single package version changes are usually relatively easy to deal with, so you can fix your code, if needed before updating the dependency version. This allows you to painlessly keep track of the changes in all the projects you depend on.
I use the PyUp service for this. The service requires no configuration; you just need to sign up using your GitHub credentials and activate it for your repository. PyUp will detect your
requirements.txt files and start issuing PRs to keep dependencies up to date with PyPI.
There are alternative services, which also do a good job of updating dependencies. GitHub recently acquired Dependabot, which works with Python and other languages and is free for all projects (not only open-source).
☑ Add test coverage checker
Python unit-testing frameworks have the ability to determine which lines and branches of code were hit when running unit tests. This coverage report is very useful, as it lets you know how much of your code is being exercised by tests and which parts are not.
If you install the
pytest-cov module, you can use the
--cov argument to
pytest to generate a coverage report.
If you add the
--cov-report=html argument, you can generate an HTML version of the coverage report, which you can find in the
htmlcov/index.html file after running tests.
(my_venv) $ pytest --cov=my_module --cov-report=html tests/
Track test coverage over time
Online services, such as Coveralls or Codecov can track your code coverage with every commit and on every pull request. You can decide not to accept PRs which decrease your code coverage. See an example report here.
In order to start using Coveralls, sign up using your GitHub credentials and set up tracking for your repository.
You can report your coverage using the
coveralls-python package, which provides the
coveralls command. You can test it manually by specifying the
COVERALLS_REPO_TOKEN environment variable. You can find your token by going to your repository’s settings on the Coveralls site.
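A sketch of the manual workflow (the module name is a placeholder; substitute your token from the Coveralls settings page):

```
(my_venv) $ pip install coveralls
(my_venv) $ pytest --cov=my_module tests/
(my_venv) $ COVERALLS_REPO_TOKEN=<your-token> coveralls
```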
When running on Travis,
coveralls will be able to detect which repository is being tested, so you don’t have to (and shouldn’t) put
COVERALLS_REPO_TOKEN into your
tox.ini file. Instead, use the - prefix for the command to allow it to fail when you are running tox locally:
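A sketch of the relevant testenv commands (module name hypothetical; the leading - tells tox to ignore a non-zero exit code from coveralls):

```ini
[testenv]
commands =
    pytest --cov=my_module tests/
    - coveralls
```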
☑ Automated code review
The best thing you can do when working as a team is to thoroughly review each other’s code. You should point out any mistakes, parts of code which are difficult to understand or badly documented, or anything else which doesn’t quite smell right.
If you’re working alone, or would like another pair of eyes, you can set up one of the services providing automated code review. These services are still evolving and are not providing a huge value yet, but sometimes they catch something your code linters missed.
A report can look like this example.
☑ Publish your project on PyPI
So, now you’re ready to publish your project on PyPI. This is quite a simple operation, unless your package is larger than 60MB or you selected a name which is already taken.
Before you publish a package, create a release version. Start by bumping your version number to a higher value. Make sure you follow the semantic versioning rules and set the new version number in your setup.py file.
The next step is to create a release on GitHub. This will create a tag you can use to look up the code associated with a specific version of your package.
Now you’ll need to set up an account on the Test version of PyPI.
You should always start by uploading your package to the test version of PyPI. You should then test your package from test PyPI on multiple environments to make sure it works, before posting it on the official PyPI.
Use the following instructions to create your packages and upload them to test PyPI:
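A sketch of the workflow using twine (the package name my-package is a placeholder):

```
(my_venv) $ pip install --upgrade setuptools wheel twine
(my_venv) $ python setup.py sdist bdist_wheel
(my_venv) $ twine upload --repository-url https://test.pypi.org/legacy/ dist/*
(my_venv) $ pip install --index-url https://test.pypi.org/simple/ my-package
```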
You can now visit your project page on test PyPI under the URL: https://test.pypi.org/project/my-package/
Once you test your package thoroughly, you can repeat the same steps for the official version of PyPI. Just change the upload command to:
(my_venv) $ twine upload dist/*
Congratulations, your project is now online and fully ready to be used by the community!
☑ Advertise your project
OK, you’re done. Take to Twitter, Facebook, LinkedIn or wherever else your potential users and contributors may be and let them know about your project.
Congratulations and good luck!