Python3 Package

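Recently, while working on a project, I found that a package could not be installed via pip on any machine other than the jump box. It turned out to be an internally developed Python package, and the jump box's pip had been configured with the internal package repo address and credentials. I did not know much about creating and publishing packages, so I am summarizing it here. Reference: Python Packaging User Guide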

Package and Module

A package contains modules (a module is normally a single Python source file) or other packages. Modules are also objects with special attributes; a package is a module that additionally has a __path__ attribute.

## urllib is a package because it contains other modules or packages
## request is a module nested inside it
import urllib.request
from urllib import request

## both report the type <class 'module'>
type(urllib)
type(urllib.request)

## shows the package location(s) on disk
urllib.__path__
## raises AttributeError, because only packages have this attribute
urllib.request.__path__
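The same check can be done programmatically by looking for the __path__ attribute. This is only a small sketch; the helper name is_package is made up for illustration.

```python
import importlib


def is_package(name: str) -> bool:
    """Return True if the importable name refers to a package."""
    module = importlib.import_module(name)
    # Only packages carry a __path__ attribute; plain modules do not.
    return hasattr(module, "__path__")


print(is_package("urllib"))          # True: urllib is a package
print(is_package("urllib.request"))  # False: request is a plain module
```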

How does Python know where to import from?

import sys
## the list of directories searched for modules, in order
sys.path
## you can modify it at runtime
sys.path.append("<path>")
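To see the effect of modifying sys.path, here is a minimal sketch that writes a module into a temporary directory and imports it only after that directory has been appended. The module name demo_sys_path_mod is hypothetical.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Create a throwaway directory holding one module (hypothetical name).
extra_dir = tempfile.mkdtemp()
Path(extra_dir, "demo_sys_path_mod.py").write_text("GREETING = 'hi'\n")

# Before the directory is on sys.path, the import system cannot find it.
try:
    importlib.import_module("demo_sys_path_mod")
    found_before = True
except ModuleNotFoundError:
    found_before = False

sys.path.append(extra_dir)
importlib.invalidate_caches()  # make sure the new directory is scanned
demo_mod = importlib.import_module("demo_sys_path_mod")

print(found_before)       # False
print(demo_mod.GREETING)  # hi
```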

Or specify extra directories in an environment variable (see python --help):

## these directories are added to sys.path, ahead of the standard library and site-packages
export PYTHONPATH=path1:path2:path3
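One way to confirm what PYTHONPATH does is to start a fresh interpreter with the variable set and inspect its sys.path. The sketch below assumes a normal interpreter that honours environment variables; the helper name child_sys_path is made up.

```python
import json
import os
import subprocess
import sys
import tempfile


def child_sys_path(python_path: str) -> list:
    """Start a fresh interpreter with PYTHONPATH set and return its sys.path."""
    env = dict(os.environ, PYTHONPATH=python_path)
    result = subprocess.run(
        [sys.executable, "-c", "import json, sys; print(json.dumps(sys.path))"],
        env=env, capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)


pythonpath_dir = os.path.realpath(tempfile.mkdtemp())
child_paths = [os.path.realpath(p) for p in child_sys_path(pythonpath_dir)]
print(pythonpath_dir in child_paths)  # True
```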

Package Structure

package/
|   ## __init__.py is usually empty; since Python 3.3 it is optional,
|   ## but having it explicitly is good practice
|--- __init__.py
|--- module1.py
|--- module2.py
|
|--- subpackage1/
|    |
|    |--- __init__.py
|    |--- module3.py
|
|--- subpackage2/
     |
     |--- __init__.py
     |--- module4.py

When a package is imported, its __init__.py is executed, so you can put initialization code there. module1.py and module2.py are normal Python source files; subpackage1 and subpackage2 are nested packages that have their own modules. module1.py can import resources from subpackage1, and so on.
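A small sketch of this behaviour: it builds a throwaway package in a temporary directory and shows that code in __init__.py runs on import. The package name demo_init_pkg and its contents are hypothetical.

```python
import importlib
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
pkg_dir = root / "demo_init_pkg"  # hypothetical package name
pkg_dir.mkdir()
# __init__.py runs once, on the first import of the package.
(pkg_dir / "__init__.py").write_text("INITIALIZED = True\n")
(pkg_dir / "module1.py").write_text("def answer():\n    return 42\n")

sys.path.insert(0, str(root))
importlib.invalidate_caches()

pkg = importlib.import_module("demo_init_pkg")
module1 = importlib.import_module("demo_init_pkg.module1")

print(pkg.INITIALIZED)   # True
print(module1.answer())  # 42
```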

## absolute imports
import package
import package.module1
from package import module2

import package.subpackage1
from package.subpackage1 import module3

## relative imports
## for example, module3 wants to use something in module4
## `..` means the parent package, like `..` in the shell `cd` command
from ..subpackage2 import module4

## other forms
from . import sth
from .. import sth

Note that relative imports can only be used to import modules within the current top-level package, and only in the from ... import form.
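To try a relative import end to end, the following sketch generates a tiny package on disk in which module3 pulls in module4 from a sibling subpackage. The top-level name demo_rel_pkg and the file contents are made up for the demo.

```python
import importlib
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
files = {
    "demo_rel_pkg/__init__.py": "",
    "demo_rel_pkg/subpackage1/__init__.py": "",
    "demo_rel_pkg/subpackage1/module3.py": (
        # `..` climbs from subpackage1 up to demo_rel_pkg
        "from ..subpackage2 import module4\n"
        "def describe():\n"
        "    return 'module3 uses ' + module4.NAME\n"
    ),
    "demo_rel_pkg/subpackage2/__init__.py": "",
    "demo_rel_pkg/subpackage2/module4.py": "NAME = 'module4'\n",
}
for relative_path, source in files.items():
    target = root / relative_path
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(source)

sys.path.insert(0, str(root))
importlib.invalidate_caches()

module3 = importlib.import_module("demo_rel_pkg.subpackage1.module3")
print(module3.describe())  # module3 uses module4
```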

Sometimes you will see __all__ in __init__.py. It controls which public objects are imported by from package import *. Names that are not listed there can still be imported explicitly.
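Here is a minimal sketch of the effect of __all__ on a star import. The package demo_all_pkg and its two functions are hypothetical; the star import is run through exec only so that the imported names can be inspected in a dict.

```python
import importlib
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
pkg_dir = root / "demo_all_pkg"  # hypothetical package name
pkg_dir.mkdir()
(pkg_dir / "__init__.py").write_text(
    "__all__ = ['public_func']\n"
    "def public_func():\n"
    "    return 'public'\n"
    "def other_func():\n"
    "    return 'other'\n"
)
sys.path.insert(0, str(root))
importlib.invalidate_caches()

# Run the star import inside a fresh namespace so we can list what arrived.
namespace = {}
exec("from demo_all_pkg import *", namespace)
star_names = sorted(name for name in namespace if not name.startswith("__"))
print(star_names)  # ['public_func']

# Names left out of __all__ are still reachable through an explicit import.
demo_all_pkg = importlib.import_module("demo_all_pkg")
print(demo_all_pkg.other_func())  # other
```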

Namespace Package

A namespace package splits a single Python package across multiple directories on disk. A namespace package has no __init__.py at its top level.

For example, split package1 across two different paths, path1 and path2. Note that the top-level package1 directories here have no __init__.py.

path1/
|
|--- package1/
     |
     |--- module1.py
     |--- ## other packages

path2/
|
|--- package1/
     |
     |--- module2.py
     |--- ## other packages

When importing:

import sys
## must include both paths
sys.path.extend(['path1', 'path2'])

import package1
## you will see 2 paths
package1.__path__
import package1.module1
import package1.module2
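The namespace package behaviour can be reproduced with two temporary directories. This is a sketch under the assumption that neither directory contains an __init__.py; the package name demo_ns_pkg is made up.

```python
import importlib
import sys
import tempfile
from pathlib import Path

path1 = Path(tempfile.mkdtemp())
path2 = Path(tempfile.mkdtemp())
# Note: neither demo_ns_pkg directory gets an __init__.py.
(path1 / "demo_ns_pkg").mkdir()
(path1 / "demo_ns_pkg" / "module1.py").write_text("WHERE = 'path1'\n")
(path2 / "demo_ns_pkg").mkdir()
(path2 / "demo_ns_pkg" / "module2.py").write_text("WHERE = 'path2'\n")

# Both parent directories must be on sys.path.
sys.path.extend([str(path1), str(path2)])
importlib.invalidate_caches()

ns_pkg = importlib.import_module("demo_ns_pkg")
ns_module1 = importlib.import_module("demo_ns_pkg.module1")
ns_module2 = importlib.import_module("demo_ns_pkg.module2")

print(len(list(ns_pkg.__path__)))          # 2
print(ns_module1.WHERE, ns_module2.WHERE)  # path1 path2
```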

Executable Directory

You can execute a directory if it contains a __main__.py. You can also zip the directory and run the zip file.

directory/
|
|--- __main__.py
|--- ## other modules or packages

Note that directory has no __init__.py; it is not a package.

## it will run __main__.py
python directory

## zip it
cd directory
python -m zipfile -c ../directory.zip *
## run it
python directory.zip

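This is effectively like packaging an executable: whatever modules and pure-Python dependencies are bundled inside the zip travel with it, so users do not need to install them separately.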
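The same steps can be scripted from Python. The sketch below builds a small archive with the standard zipfile module and runs it with the current interpreter; the archive name demo_app.zip and the helper module are hypothetical. The standard library also ships a zipapp module for this purpose, which is not used here.

```python
import subprocess
import sys
import tempfile
import zipfile
from pathlib import Path

work_dir = Path(tempfile.mkdtemp())
archive = work_dir / "demo_app.zip"  # hypothetical archive name

# __main__.py must sit at the root of the archive, next to any helpers.
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("helper.py", "def greet():\n    return 'hello from zip'\n")
    zf.writestr("__main__.py", "import helper\nprint(helper.greet())\n")

# Equivalent to running: python demo_app.zip
result = subprocess.run(
    [sys.executable, str(archive)], capture_output=True, text=True, check=True
)
zip_output = result.stdout.strip()
print(zip_output)  # hello from zip
```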

Executable Package

If you want to execute a package, you also need to add a __main__.py. You cannot use __init__.py for this, since it is only executed on import.

package/
|   ## you can wrap the function here
|--- __main__.py
|--- __init__.py
|--- ## other modules or packages

## run it, arguments will be read by __main__.py
python -m package <arguments>
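A runnable sketch of an executable package: it creates a throwaway package with a __main__.py that echoes its arguments, then runs it with -m in a child interpreter. The package name demo_exec_pkg is made up, and PYTHONPATH is used here only so the child process can find the temporary directory.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
pkg_dir = root / "demo_exec_pkg"  # hypothetical package name
pkg_dir.mkdir()
(pkg_dir / "__init__.py").write_text("")
# __main__.py is what `python -m demo_exec_pkg` executes.
(pkg_dir / "__main__.py").write_text(
    "import sys\n"
    "print('args:', ' '.join(sys.argv[1:]))\n"
)

# Equivalent to: PYTHONPATH=<root> python -m demo_exec_pkg one two
result = subprocess.run(
    [sys.executable, "-m", "demo_exec_pkg", "one", "two"],
    env=dict(os.environ, PYTHONPATH=str(root)),
    capture_output=True, text=True, check=True,
)
module_output = result.stdout.strip()
print(module_output)  # args: one two
```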

Package Layout

This is the recommended structure:

project_name/
|
|--- README.rst
|--- doc/
|--- src/
|    |   ## package is here
|    |--- package/
|    ## unit test code
|--- tests/
|    |
|    |--- test_code.py
|    ## uses the setuptools package
|--- setup.py
|    ## see later discussion
|--- tox.ini

An example setup.py:

import setuptools

setuptools.setup(
    name="<package name>",
    version="<version number>",
    author="chengdol",
    author_email="chengdol@xxx.com",
    description="...",
    url="<package access url>",
    packages=setuptools.find_packages('src'),
    package_dir={'': 'src'},
    classifiers=[
        "Programming Language :: Python :: 3",
        "License :: OSI Approved :: MIT License",
        "Operating System :: OS Independent",
    ],
    install_requires=['Flask==3.0.0', 'pysocks', 'pyyaml'],
)
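To get a feel for what packages=setuptools.find_packages('src') picks up in a src layout, here is a rough standard-library approximation. It is not setuptools' actual implementation, and the function name find_regular_packages is made up; it simply lists directories under the given root that contain an __init__.py.

```python
import os
import tempfile
from pathlib import Path


def find_regular_packages(where: str) -> list:
    """List dotted names of directories under `where` that hold __init__.py.

    Rough stand-in for setuptools.find_packages(where), for illustration only.
    """
    found = []
    for dirpath, dirnames, filenames in os.walk(where):
        if dirpath == where:
            continue  # the root itself (src/) is not a package
        if "__init__.py" not in filenames:
            dirnames[:] = []  # do not descend below a non-package directory
            continue
        relative = os.path.relpath(dirpath, where)
        found.append(relative.replace(os.sep, "."))
    return sorted(found)


# Build a throwaway project that mirrors the layout above.
project = Path(tempfile.mkdtemp())
for relative_path in (
    "src/package/__init__.py",
    "src/package/subpackage1/__init__.py",
    "tests/test_code.py",
):
    target = project / relative_path
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text("")

discovered = find_regular_packages(str(project / "src"))
print(discovered)  # ['package', 'package.subpackage1']
```

Because the search starts at src, the tests directory is never considered, which is one reason the src layout keeps test code out of the distributed package.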

As for tox, it is a tool that makes automated testing convenient: tox: Automate and standardize testing in Python.

The material then covers implementing plugins via setuptools or namespace packages. I have not needed that so far.

Package Distribution

When you create a virtualenv, pip, setuptools and wheel are typically installed already (the exact set depends on the virtualenv and Python versions).

There are source and built distributions. A built distribution is a .whl file; it can be placed directly into the installation directory and can be platform-specific. A source distribution is a tar.gz file that needs to be built before it is installed. If you run pip download, you will see these distribution files.

For source package:

cd package
python setup.py sdist

## you will see a xxx.tar.gz file
cd dist
pip install xxx.tar.gz

For built package:

cd package
python setup.py bdist_wheel

## you will see a xx-py3-none-any.whl file
cd dist
## py3: works with any Python 3
## none: no ABI requirement
## any: not tied to a specific platform
pip install xx-py3-none-any.whl

Further reading on what a wheel is: https://realpython.com/python-wheels/. A Python .whl file is essentially a ZIP (.zip) archive with a specially crafted filename that tells installers what Python versions and platforms the wheel will support.

A wheel is a type of built distribution. In this case, built means that the wheel comes in a ready-to-install format and allows you to skip the build stage required with source distributions.
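Since the wheel filename carries the compatibility information, it can be split into its tags. The sketch below follows the documented pattern distribution-version(-build)-python-abi-platform.whl; the function name and the example filenames are hypothetical, and real tools should rely on an existing parser rather than this simplified one.

```python
def parse_wheel_filename(filename: str) -> dict:
    """Split a wheel filename into its name, version and compatibility tags."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename}")
    parts = filename[: -len(".whl")].split("-")
    # Five parts without a build tag, six parts with one.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename: {filename}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "distribution": parts[0],
        "version": parts[1],
        "build": parts[2] if len(parts) == 6 else None,
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


pure = parse_wheel_filename("xx-1.0-py3-none-any.whl")
print(pure["python"], pure["abi"], pure["platform"])  # py3 none any

native = parse_wheel_filename("demo-1.0-cp312-cp312-linux_x86_64.whl")
print(native["platform"])  # linux_x86_64
```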

Then register an account on PyPI and upload the package:

## install twine
python -m pip install --user --upgrade twine

cd package
python setup.py sdist bdist_wheel && \
twine upload dist/* -u ${USER_NAME} -p ${PASSWORD}

## or upload to your personal repo
twine upload --repository-url ${PACKAGES_REPO} dist/* -u ${USER_NAME} -p ${PASSWORD}

Tool used: twine, a utility for publishing Python packages on PyPI.

After uploading, you can pip install the package in a new virtual environment.
