Switch our Python package management solution to Poetry.

The main reason is that Poetry has better support for installing packages on different Python versions. It also has a better dependency resolver than Pipenv and supports the modern standard for Python dependency management, which includes using pyproject.toml for project-specific configuration instead of per-tool config files. See the following links for details: https://pip.pypa.io/en/stable/reference/build-system/pyproject-toml/ https://www.python.org/dev/peps/pep-0518/
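
As a rough illustration of the pyproject.toml point, the sketch below parses a trimmed copy of the file added in this commit and lists which tools keep their configuration in it. The `[tool.mypy]` table is hypothetical and only shows how a second tool could share the file; this commit does not move mypy's configuration. The `tool_sections` helper is also just an illustrative name, and the snippet assumes either Python 3.11+ (`tomllib`) or the third-party `tomli` package is available.

```python
import sys

# tomllib is in the standard library from Python 3.11 onwards; on older
# interpreters this sketch assumes the third-party `tomli` package is installed.
if sys.version_info >= (3, 11):
    import tomllib
else:
    import tomli as tomllib

# Trimmed copy of the pyproject.toml from this commit, plus one hypothetical table.
PYPROJECT = """
[tool.poetry]
name = "zenith"
version = "0.1.0"

[tool.poetry.dependencies]
python = "^3.7"
pytest = "^6.2.5"

[tool.poetry.dev-dependencies]
yapf = "==0.31.0"
mypy = "==0.910"

# Hypothetical: a second tool configured in the same file (not part of this commit).
[tool.mypy]
check_untyped_defs = true

[build-system]
requires = ["poetry-core>=1.0.0"]
build-backend = "poetry.core.masonry.api"
"""


def tool_sections(text: str) -> list:
    """Return the sorted names of the tools configured under [tool.*]."""
    return sorted(tomllib.loads(text).get("tool", {}))


config = tomllib.loads(PYPROJECT)
print(tool_sections(PYPROJECT))                 # names of tools sharing the file
print(config["build-system"]["build-backend"])  # build backend declared per PEP 518
```

The `[build-system]` table is the part defined by PEP 518; everything under `[tool.*]` is owned by the individual tools.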
2025-12-22 21:59:59 +00:00 · 2022-01-19 13:16:51 +03:00
parent e209764877
commit 5f5a11525c
18 changed files with 2164 additions and 1160 deletions
--- a/.circleci/config.yml
+++ b/.circleci/config.yml
@@ -223,15 +223,15 @@ jobs:
      - checkout
      - run:
          name: Install deps
-          command: pipenv --python 3.7 install --dev
+          command: ./scripts/pysync
      - run:
          name: Run yapf to ensure code format
          when: always
-          command: pipenv run yapf --recursive --diff .
+          command: poetry run yapf --recursive --diff .
      - run:
          name: Run mypy to check types
          when: always
-          command: pipenv run mypy .
+          command: poetry run mypy .

  run-pytest:
    executor: zenith-python-executor
@@ -275,11 +275,7 @@ jobs:
            - run: git submodule update --init --depth 1
      - run:
          name: Install deps
-          command: pipenv --python 3.7 install
-      - run:
-          name: Install dev deps
-          # we need to run moto s3 mock server, we start is as a separate process, so it's in the dev deps
-          command: pipenv --python 3.7 install --dev
+          command: ./scripts/pysync
      - run:
          name: Run pytest
          # pytest doesn't output test logs in real time, so CI job may fail with
@@ -331,7 +327,7 @@ jobs:
            # -n4 uses four processes to run tests via pytest-xdist
            # -s is not used to prevent pytest from capturing output, because tests are running
            # in parallel and logs are mixed between different tests
-            "${cov_prefix[@]}" pipenv run pytest \
+            "${cov_prefix[@]}" ./scripts/pytest \
              --junitxml=$TEST_OUTPUT/junit.xml \
              --tb=short \
              --verbose \
--- a/.github/workflows/benchmarking.yml
+++ b/.github/workflows/benchmarking.yml
@@ -36,20 +36,20 @@ jobs:
    # see https://github.com/actions/setup-python/issues/162
    # and probably https://github.com/actions/setup-python/issues/162#issuecomment-865387976 in particular
    # so the simplest solution to me is to use already installed system python and spin virtualenvs for job runs.
-    # there is Python 3.7.10 already installed on the machine so use it to install pipenv and then use pipenv's virtuealenvs
-    - name: Install pipenv & deps
+    # there is Python 3.7.10 already installed on the machine so use it to install poetry and then use poetry's virtualenvs
+    - name: Install poetry & deps
      run: |
-        python3 -m pip install --upgrade pipenv wheel
-        # since pip/pipenv caches are reused there shouldn't be any troubles with install every time
-        pipenv install
+        python3 -m pip install --upgrade poetry wheel
+        # since pip/poetry caches are reused, installing every time shouldn't cause any trouble
+        poetry install

    - name: Show versions
      run: |
        echo Python
        python3 --version
-        pipenv run python3 --version
+        poetry run python3 --version
-        echo Pipenv
+        echo Poetry
-        pipenv --version
+        poetry --version
        echo Pgbench
        $PG_BIN/pgbench --version

@@ -90,7 +90,7 @@ jobs:
        REMOTE_ENV: "1" # indicate to test harness that we do not have zenith binaries locally
      run: |
        mkdir -p perf-report-staging
-        pipenv run pytest test_runner/performance/ -v -m "remote_cluster" --skip-interfering-proc-check --out-dir perf-report-staging
+        ./scripts/pytest test_runner/performance/ -v -m "remote_cluster" --skip-interfering-proc-check --out-dir perf-report-staging

    - name: Submit result
      env:
--- a/Pipfile
+++ b/Pipfile
@@ -1,35 +0,0 @@
-[[source]]
-url = "https://pypi.python.org/simple"
-verify_ssl = true
-name = "pypi"
-
-[packages]
-pytest = ">=6.0.0"
-typing-extensions = "*"
-pyjwt = {extras = ["crypto"], version = "*"}
-requests = "*"
-pytest-xdist = "*"
-asyncpg = "*"
-cached-property = "*"
-psycopg2-binary = "*"
-jinja2 = "*"
-boto3 = "*"
-boto3-stubs = "*"
-
-[dev-packages]
-# Behavior may change slightly between versions. These are run continuously,
-# so we pin exact versions to avoid suprising breaks. Update if comfortable.
-yapf = "==0.31.0"
-mypy = "==0.910"
-# Non-pinned packages follow.
-pipenv = "*"
-flake8 = "*"
-moto = "*"
-flask = "*"
-flask-cors = "*"
-types-requests = "*"
-types-psycopg2 = "*"
-
-[requires]
-# we need at least 3.7, but pipenv doesn't allow to say this directly
-python_version = "3"
--- a/Pipfile.lock
+++ b/Pipfile.lock
--- a/README.md
+++ b/README.md
@@ -28,12 +28,12 @@ apt install build-essential libtool libreadline-dev zlib1g-dev flex bison libsec
 libssl-dev clang pkg-config libpq-dev
 ```

-[Rust] 1.55 or later is also required.
+[Rust] 1.56.1 or later is also required.

 To run the `psql` client, install the `postgresql-client` package or modify `PATH` and `LD_LIBRARY_PATH` to include `tmp_install/bin` and `tmp_install/lib`, respectively.

 To run the integration tests or Python scripts (not required to use the code), install
-Python (3.7 or higher), and install python3 packages using `pipenv install` in the project directory.
+Python (3.7 or higher), and install python3 packages using `./scripts/pysync` (requires poetry) in the project directory.

 2. Build zenith and patched postgres
 ```sh
@@ -128,8 +128,7 @@ INSERT 0 1
 ```sh
 git clone --recursive https://github.com/zenithdb/zenith.git
 make # builds also postgres and installs it to ./tmp_install
-cd test_runner
-pipenv run pytest
+./scripts/pytest
 ```

 ## Documentation
--- a/docs/sourcetree.md
+++ b/docs/sourcetree.md
@@ -87,31 +87,29 @@ so manual installation of dependencies is not recommended.
 A single virtual environment with all dependencies is described in the single `Pipfile`.

 ### Prerequisites
- Install Python 3.7 (the minimal supported version)
-    - Later version (e.g. 3.8) is ok if you don't write Python code
-    - You can install Python 3.7 separately, e.g.:
+- Install Python 3.7 (the minimal supported version) or greater.
+    - Our setup with poetry should work with newer Python versions too, so feel free to open an issue with a `c/test-runner` label if something doesn't work as expected.
+    - If you have trouble with another version, you can resolve it by installing Python 3.7 separately, via pyenv or via the system package manager, e.g.:
      ```bash
      # In Ubuntu
      sudo add-apt-repository ppa:deadsnakes/ppa
      sudo apt update
      sudo apt install python3.7
      ```
- Install `pipenv`
-    - Exact version of `pipenv` is not important, you can use Debian/Ubuntu package `pipenv`.
- Install dependencies via either
-  * `pipenv --python 3.7 install --dev` if you will write Python code, or
-  * `pipenv install` if you only want to run Python scripts and don't have Python 3.7.
+- Install `poetry`
+    - Exact version of `poetry` is not important; see the installation instructions on poetry's [website](https://python-poetry.org/docs/#installation).
+- Install dependencies via `./scripts/pysync`. Note that CI uses Python 3.7, so if you have a different version, some linting tools can yield different results locally than in CI.

-Run `pipenv shell` to activate the virtual environment.
-Alternatively, use `pipenv run` to run a single command in the venv, e.g. `pipenv run pytest`.
+Run `poetry shell` to activate the virtual environment.
+Alternatively, use `poetry run` to run a single command in the venv, e.g. `poetry run pytest`.

 ### Obligatory checks
 We force code formatting via `yapf` and type hints via `mypy`.
 Run the following commands in the repository's root (next to `setup.cfg`):

 ```bash
-pipenv run yapf -ri .  # All code is reformatted
-pipenv run mypy .  # Ensure there are no typing errors
+poetry run yapf -ri .  # All code is reformatted
+poetry run mypy .  # Ensure there are no typing errors
 ```

 **WARNING**: do not run `mypy` from a directory other than the root of the repository.
@@ -123,17 +121,6 @@ Also consider:
 * Adding more type hints to your code to avoid `Any`.

 ### Changing dependencies
-You have to update `Pipfile.lock` if you have changed `Pipfile`:
+To add a new package or change an existing one, you can use `poetry add` or `poetry update`, or edit `pyproject.toml` manually. Do not forget to run `poetry lock` in the latter case.

-```bash
-pipenv --python 3.7 install --dev  # Re-create venv for Python 3.7 and install recent pipenv inside
-pipenv run pipenv --version  # Should be at least 2021.5.29
-pipenv run pipenv lock  # Regenerate Pipfile.lock
-```
-
-As the minimal supported version is Python 3.7 and we use it in CI,
-you have to use a Python 3.7 environment when updating `Pipfile.lock`.
-Otherwise some back-compatibility packages will be missing.
-
-It is also important to run recent `pipenv`.
-Older versions remove markers from `Pipfile.lock`.
+More details are available in poetry's [documentation](https://python-poetry.org/docs/).
--- a/poetry.lock
+++ b/poetry.lock
--- a/pre-commit.py
+++ b/pre-commit.py
@@ -38,7 +38,7 @@ def rustfmt(fix_inplace: bool = False, no_color: bool = False) -> str:


 def yapf(fix_inplace: bool) -> str:
-    cmd = "pipenv run yapf --recursive"
+    cmd = "poetry run yapf --recursive"
    if fix_inplace:
        cmd += " --in-place"
    else:
@@ -47,7 +47,7 @@ def yapf(fix_inplace: bool) -> str:


 def mypy() -> str:
-    return "pipenv run mypy"
+    return "poetry run mypy"


 def get_commit_files() -> List[str]:
@@ -72,7 +72,7 @@ def check(name: str, suffix: str, cmd: str, changed_files: List[str], no_color:
            print("Please inspect the output below and run make fmt to fix automatically.")
        if suffix == ".py":
            print("If the output is empty, ensure that you've installed Python tooling by\n"
-                  "running 'pipenv install --dev' in the current directory (no root needed)")
+                  "running './scripts/pysync' in the current directory (no root needed)")
        print()
        print(res.stdout.decode())
        exit(1)
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -0,0 +1,32 @@
+[tool.poetry]
+name = "zenith"
+version = "0.1.0"
+description = ""
+authors = ["Dmitry Rodionov <dmitry@zenith.tech>"]
+
+[tool.poetry.dependencies]
+python = "^3.7"
+pytest = "^6.2.5"
+psycopg2-binary = "^2.9.1"
+typing-extensions = "^3.10.0"
+PyJWT = {version = "^2.1.0", extras = ["crypto"]}
+requests = "^2.26.0"
+pytest-xdist = "^2.3.0"
+asyncpg = "^0.24.0"
+aiopg = "^1.3.1"
+cached-property = "^1.5.2"
+Jinja2 = "^3.0.2"
+types-requests = "^2.27.7"
+types-psycopg2 = "^2.9.6"
+boto3 = "^1.20.40"
+boto3-stubs = "^1.20.40"
+moto = {version = "^3.0.0", extras = ["server"]}
+
+[tool.poetry.dev-dependencies]
+yapf = "==0.31.0"
+flake8 = "^3.9.2"
+mypy = "==0.910"
+
+[build-system]
+requires = ["poetry-core>=1.0.0"]
+build-backend = "poetry.core.masonry.api"
--- a/pytest.ini
+++ b/pytest.ini
@@ -3,6 +3,8 @@ addopts =
    -m 'not remote_cluster'
 markers =
    remote_cluster
+testpaths =
+    test_runner
 minversion = 6.0
 log_format = %(asctime)s.%(msecs)-3d %(levelname)s [%(filename)s:%(lineno)d] %(message)s
 log_date_format = %Y-%m-%d %H:%M:%S
--- a/scripts/coverage
+++ b/scripts/coverage
@@ -446,7 +446,7 @@ prerequisites:

 self-contained example:
    {app} run make
-    {app} run pipenv run pytest test_runner
+    {app} run poetry run pytest test_runner
    {app} run cargo test
    {app} report --open
    """
--- a/scripts/generate_and_push_perf_report.sh
+++ b/scripts/generate_and_push_perf_report.sh
@@ -14,7 +14,7 @@ mkdir -p data/$REPORT_TO
 cp $REPORT_FROM/* data/$REPORT_TO

 echo "Generating report"
-pipenv run python $SCRIPT_DIR/generate_perf_report_page.py --input-dir data/$REPORT_TO --out reports/$REPORT_TO.html 
+poetry run python $SCRIPT_DIR/generate_perf_report_page.py --input-dir data/$REPORT_TO --out reports/$REPORT_TO.html
 echo "Uploading perf result"
 git add data reports
 git \
--- a/scripts/pysync
+++ b/scripts/pysync
@@ -0,0 +1,7 @@
+#!/usr/bin/env bash
+
+# This is a helper script for setting up/updating our Python environment.
+# It is intended to be the primary entry point for anyone who just wants to
+# set up a test environment without going into the details of Python package management.
+
+poetry install --no-root # this installs dev dependencies by default
--- a/scripts/pytest
+++ b/scripts/pytest
@@ -0,0 +1,9 @@
+#!/usr/bin/env bash
+
+# This is a helper script to run pytest without going too much
+# into python dependency management details
+
+# It may be desirable to create a more sophisticated pytest launcher
+# with commonly used options to simplify launching from e.g. CI
+
+poetry run pytest "${@:1}"
--- a/test_runner/README.md
+++ b/test_runner/README.md
@@ -22,23 +22,24 @@ runtime. Currently, there are only two batches:

 ### Running the tests

-Because pytest will search all subdirectories for tests, it's easiest to
-run the tests from within the `test_runner` directory.
+There is a wrapper script to invoke pytest: `./scripts/pytest`.
+It accepts all the arguments that are accepted by pytest.
+Depending on your installation options, pytest might be invoked directly.

 Test state (postgres data, pageserver state, and log files) will
 be stored under a directory `test_output`.

 You can run all the tests with:

-`pipenv run pytest`
+`./scripts/pytest`

 If you want to run all the tests in a particular file:

-`pipenv run pytest test_pgbench.py`
+`./scripts/pytest test_pgbench.py`

 If you want to run all tests that have the string "bench" in their names:

-`pipenv run pytest -k bench`
+`./scripts/pytest -k bench`

 Useful environment variables:

@@ -53,12 +54,12 @@ should go.
 `RUST_LOG`: logging configuration to pass into Zenith CLI

 Let stdout, stderr and `INFO` log messages go to the terminal instead of capturing them:
-`pytest -s --log-cli-level=INFO ...`
+`./scripts/pytest -s --log-cli-level=INFO ...`
 (Note many tests capture subprocess outputs separately, so this may not
 show much.)

 Exit after the first test failure:
-`pytest -x ...`
+`./scripts/pytest -x ...`
 (there are many more pytest options; run `pytest -h` to see them.)

 ### Writing a test
--- a/test_runner/batch_others/test_remote_storage.py
+++ b/test_runner/batch_others/test_remote_storage.py
@@ -1,5 +1,5 @@
 # It's possible to run any regular test with the local fs remote storage via
-# env ZENITH_PAGESERVER_OVERRIDES="remote_storage={local_path='/tmp/zenith_zzz/'}" pipenv ......
+# env ZENITH_PAGESERVER_OVERRIDES="remote_storage={local_path='/tmp/zenith_zzz/'}" poetry ......

 import time, shutil, os
 from contextlib import closing
--- a/test_runner/fixtures/log_helper.py
+++ b/test_runner/fixtures/log_helper.py
@@ -7,7 +7,7 @@ own section after all tests are executed.

 To see logs for all (even successful) tests, run
 pytest with the following command:
- `pipenv run pytest -n8 -rA`
+- `poetry run pytest -n8 -rA`

 Other log config can be set in pytest.ini file.
 You can add `log_cli = true` to it to watch
--- a/test_runner/fixtures/zenith_fixtures.py
+++ b/test_runner/fixtures/zenith_fixtures.py
@@ -334,7 +334,7 @@ class AuthKeys:
 class MockS3Server:
    """
    Starts a mock S3 server for testing on a port given, errors if the server fails to start or exits prematurely.
-    Relies that `pipenv` and `moto` server are installed, since it's the way the tests are run.
+    Relies on `poetry` and the `moto` server being installed, since that is how the tests are run.

    Also provides a set of methods to derive the connection properties from and the method to kill the underlying server.
    """
@@ -344,7 +344,7 @@ class MockS3Server:
    ):
        self.port = port

-        self.subprocess = subprocess.Popen([f'pipenv run moto_server s3 -p{port}'], shell=True)
+        self.subprocess = subprocess.Popen([f'poetry run moto_server s3 -p{port}'], shell=True)
        error = None
        try:
            return_code = self.subprocess.poll()