fix(python): typing (#2167)

@wjones127 is there a standard way you guys set up your virtualenv? I can either relist all the dependencies in the pyright pre-commit section, or specify a venv, or the user has to be in the virtual environment when they run `git commit`. If the venv location were standardized, or a Python manager like `uv` were used, it would be easier to avoid duplicating the pyright dependency list.

Per your suggestion, in `pyproject.toml` I added all the passing files to the `includes` section.

For ruff, I upgraded the version and removed "TCH", which doesn't exist as an option.

I added a `pyright_report.csv`, which lists all files sorted by pyright error count in ascending order, as a todo list to work on.

I fixed about 30 issues in `table.py` stemming from `str` values being passed into methods that require a string from a set of string Literals, by extracting those Literals into `types.py`.

Can you verify in the Rust bridge that the schema should be a property and not a method here? If it's a method, then there's another place in the code where `inner.schema` should be `inner.schema()`.

```python
class RecordBatchStream:
    @property
    def schema(self) -> pa.Schema: ...
```

Also, unless the `_lancedb.pyi` file is wrong, there is no `__anext__` here for `_inner` when it's not an `AsyncGenerator`; only `next` is defined:

```python
async def __anext__(self) -> pa.RecordBatch:
    return await self._inner.__anext__()
    if isinstance(self._inner, AsyncGenerator):
        batch = await self._inner.__anext__()
    else:
        batch = await self._inner.next()
    if batch is None:
        raise StopAsyncIteration
    return batch
```

In the `else` branch, `_inner` is a `RecordBatchStream`:

```python
class RecordBatchStream:
    @property
    def schema(self) -> pa.Schema: ...
    async def next(self) -> Optional[pa.RecordBatch]: ...
```

---------

Co-authored-by: Will Jones <willjones127@gmail.com>
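
For readers unfamiliar with the Literal extraction described above, here is a minimal sketch of the pattern. The alias `DistanceType`, its members, and the helper `normalize_distance` are hypothetical names chosen for illustration; the actual contents of `types.py` may differ.

```python
from typing import Literal, get_args

# Hypothetical alias: a shared module would hold Literal types like this
# so that every signature refers to the same set of allowed strings.
DistanceType = Literal["l2", "cosine", "dot"]


def normalize_distance(value: str) -> DistanceType:
    """Narrow an arbitrary str to DistanceType, or raise ValueError.

    Hypothetical helper: type checkers reject a plain `str` where a
    Literal is expected, so the value is validated at runtime first.
    """
    lowered = value.lower()
    for allowed in get_args(DistanceType):
        if lowered == allowed:
            # Return the Literal member itself rather than the input str.
            return allowed
    raise ValueError(
        f"Unknown distance type {value!r}; expected one of {get_args(DistanceType)}"
    )
```

Keeping the alias in one module means a new allowed value is added in a single place, and a type checker can then flag call sites that still pass an unvalidated `str`.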
2026-01-12 23:02:59 +00:00 · 2025-03-10 09:01:23 -07:00
parent bc49c4db82
commit cc81f3e1a5
16 changed files with 294 additions and 86 deletions
--- a/python/python/tests/test_embeddings.py
+++ b/python/python/tests/test_embeddings.py
@@ -419,17 +419,17 @@ def test_embedding_function_safe_model_dump(embedding_type):

    dumped_model = model.safe_model_dump()

-    assert all(
-        not k.startswith("_") for k in dumped_model.keys()
-    ), f"{embedding_type}: Dumped model contains keys starting with underscore"
+    assert all(not k.startswith("_") for k in dumped_model.keys()), (
+        f"{embedding_type}: Dumped model contains keys starting with underscore"
+    )

-    assert (
-        "max_retries" in dumped_model
-    ), f"{embedding_type}: Essential field 'max_retries' is missing from dumped model"
+    assert "max_retries" in dumped_model, (
+        f"{embedding_type}: Essential field 'max_retries' is missing from dumped model"
+    )

-    assert isinstance(
-        dumped_model, dict
-    ), f"{embedding_type}: Dumped model is not a dictionary"
+    assert isinstance(dumped_model, dict), (
+        f"{embedding_type}: Dumped model is not a dictionary"
+    )

    for key in model.__dict__:
        if key.startswith("_"):