* 1. Reimplement Eq for Timestamp
  2. Add and/or for GenericRange
* feat: extract time range from filters
* feat: select sst files according to time range
* fix: clippy
* fix: empty value in range
* fix: some CR comments
* fix: return optional timestamp range
* fix: CR comments
If the table has a non-null column, we need to use the default value instead
of null to fill the value columns in the record batch for deletion.
Otherwise, we can't create the record batch, since the schema check
doesn't allow null in a non-null column.
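A minimal, self-contained sketch of that rule with toy types (not the project's actual API):

```rust
// Cell is a stand-in for a value slot in the record batch.
#[derive(Clone, Debug, PartialEq)]
enum Cell {
    Null,
    Value(i64),
}

fn fill_value_column(is_nullable: bool, default_value: i64, num_rows: usize) -> Vec<Cell> {
    let fill = if is_nullable {
        Cell::Null
    } else {
        // Null would fail the schema check for a non-null column,
        // so fall back to the column's default value.
        Cell::Value(default_value)
    };
    vec![fill; num_rows]
}
```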
* docs: Fix incorrect comment of Vector::only_null
* feat: Add delete to WriteRequest and WriteBatch
* feat: Filter deleted rows
* fix: Fix panic after reopening engine
This is detected by adding a reopen step to the delete test for region.
* fix: Fix OpType::min_type()
* test: Add delete absent key test
* chore: Address CR comments
* fix: carry non-RecordBatch results in FlightData, to allow executing SQLs other than SELECT in the new gRPC interface
* Update src/datanode/src/instance/flight/stream.rs
Co-authored-by: Jiachun Feng <jiachun_feng@proton.me>
* chore: Remove unused MutationExtra
* refactor(storage): Refactor Mutation and Payload
Change Mutation from an enum to a struct that holds the op type and record
batches, so the encoder doesn't need to convert the mutation into a record
batch. The Payload is no longer an enum; it just holds the data of the
WriteBatch to be serialized to the WAL. The encoder and decoder now deal
with the Payload instead of the WriteBatch, so we can keep information in
the WriteBatch that doesn't need to be stored in the WAL.
This commit also merges variants of write_batch::Error into storage::Error,
as some of their variants denote the same error.
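A hedged sketch of the resulting shape; the names Mutation and Payload come from the commit message, the field layout is assumed:

```rust
// Stand-in for the real record batch type.
struct RecordBatch;

enum OpType {
    Put,
    Delete,
}

// Mutation is now a plain struct pairing an op type with its record batch,
// so the WAL encoder no longer has to convert anything.
struct Mutation {
    op_type: OpType,
    record_batch: RecordBatch,
}

// Payload is no longer an enum either: it only holds the WriteBatch data
// that must be serialized to the WAL, so the WriteBatch itself can keep
// extra state that never touches the WAL.
struct Payload {
    mutations: Vec<Mutation>,
}
```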
* test(storage): Pass all tests in storage
* chore: Remove unused code, then format the code
* test(storage): Fix test_put_unknown_column test
* style(storage): Fix clippy
* chore: Remove some unused code
* chore: Rebase upstream and fix clippy
* chore(storage): Remove unused code
* chore(storage): Update comments
* feat: Remove PayloadType from wal.proto
* chore: Address CR comments
* chore: Remove unused write_batch.proto
* chore: upgrade to Arrow 29.0 and use workspace package and dependencies
* fix: resolve PR comments
Co-authored-by: luofucong <luofucong@greptime.com>
* chore: fix SequenceNotMonotonic error message
The previous sequence should be greater than or equal to the given sequence.
* Apply suggestions from code review
Co-authored-by: Ruihang Xia <waynestxia@gmail.com>
* deps: Bump OpenDAL to v0.21.1
Signed-off-by: Xuanwo <github@xuanwo.io>
* Avoid using raw types when not needed
Signed-off-by: Xuanwo <github@xuanwo.io>
* fix: dedup should not mark elements as unneeded
It should only mark elements as selected, because some columns of
different rows may have the same value.
* refactor: Rename dedup to find_unique
The original `dedup` method only marks the bitmap as true when it finds
a unique element, so `find_unique` is a more appropriate name.
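A self-contained sketch of that contract, simplified to plain slices: bits may only be set, never cleared, because an earlier pass over another key column may already have selected a row.

```rust
// Mark selected[i] = true when values[i] differs from values[i - 1]
// (input assumed sorted). Never write false: a previous call on another
// key column may already have marked the row as selected.
fn find_unique(values: &[i64], selected: &mut [bool]) {
    for i in 0..values.len() {
        if i == 0 || values[i] != values[i - 1] {
            selected[i] = true;
        }
    }
}
```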
* test: Renew bitmap in test_batch_find_unique
* chore: Update comments
* feat: move time index metadata from schema into field
* chore: remove useless code
* test: test select with column alias
* fix: conflicts with develop branch
* test: add test
* test: order by timestamp to ensure query results order
* fix: comment
* refactor: Serialize Schema/TableMeta/TableInfo to raw structs
* test: Add tests for raw struct conversion
* style: Fix clippy
* refactor: SchemaBuilder::timestamp_index takes Option<usize>
So callers can chain the timestamp_index method call even when there is
no timestamp index.
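A hedged usage sketch; the builder API is assumed from the commit messages:

```rust
// Taking Option<usize> keeps the call chainable even when the schema has
// no timestamp column; previously the call had to be skipped entirely.
let schema = SchemaBuilder::try_from(column_schemas)?
    .timestamp_index(None)
    .build()?;
```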
* style(datatypes): Chains SchemaBuilder method calls
* feat(storage): Implement skeleton of ReadResolver
ReadResolver is used to resolve differences between schemas
* feat(storage): Add user_column_end to ReadResolver
* feat(storage): Implement Batch::batch_from_parts
Used to construct a Batch from parts according to the schema that the
user expects to read.
* feat(storage): Compat memtable schema
* feat(storage): Compat parquet file schema
* fix(storage): ReadResolver supports projection under same schema version
Now ReadResolver takes a ProjectedSchemaRef as the dest schema, and checks
whether a value column is needed by the schema after projection.
* feat(storage): Check whether columns are same columns
is_source_column_readable() takes ColumnMetadata instead of
ColumnSchema, and compares their column ids to check whether they are
the same column.
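In sketch form (self-contained, fields simplified):

```rust
struct ColumnMetadata {
    id: u32, // column id assigned when the column is created
}

// Two columns are treated as the same column only when their ids match;
// matching by position or by name alone would break across schema alters.
fn is_source_column_readable(source: &ColumnMetadata, dest: &ColumnMetadata) -> bool {
    source.id == dest.id
}
```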
* refactor(storage): Use row_key_end/user_column_end in source_schema
Rename ReadResolver::is_needed to ReadResolver::is_source_needed, and
remove row_key_end/user_column_end from ReadResolver, since they should
be the same as source_schema's.
* chore(storage): Remove unused code
* test(storage): Add tests for the resolver
* feat(storage): Returns error on different source and dest column names
* style(storage): Fix clippy
* refactor: Rename ReadResolver to ReadAdapter
* chore(table): Removed unused comment
* refactor: rename to is_source_column_compatible
* feat: align_bucket supports i64 and timestamp values
* feat: add Int64 to timestamp
* feat: support querying i64 timestamp vector
* test: fix failing tests
* refactor: simplify some code
* fix: CR comments and add insert and query test for i64 timestamp column
* chore: Update StoreSchema comment
* feat: Add metadata to ColumnSchema
* feat: Impl conversion between ColumnMetadata and ColumnSchema
We can use this feature to store the ColumnMetadata as an arrow
Schema, since the ColumnSchema can be further converted to an arrow
schema. Then we can use ColumnMetadata in StoreSchema, which contains
more information, especially the column id.
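A self-contained sketch of the round trip; the metadata key name here is hypothetical, not the project's actual constant:

```rust
use std::collections::HashMap;

// Hypothetical key; the real key name is defined by the project.
const COLUMN_ID_KEY: &str = "greptime:column_id";

// Stash the column id in the ColumnSchema's metadata map on the way to
// an arrow schema...
fn column_id_to_metadata(column_id: u32) -> HashMap<String, String> {
    HashMap::from([(COLUMN_ID_KEY.to_string(), column_id.to_string())])
}

// ...and recover it when converting back to a ColumnMetadata.
fn column_id_from_metadata(metadata: &HashMap<String, String>) -> Option<u32> {
    metadata.get(COLUMN_ID_KEY)?.parse().ok()
}
```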
* feat(storage): Merge schema::Error to metadata::Error
To avoid a cyclic dependency between the two Error types
* feat(storage): Store ColumnMetadata in StoreSchema
* feat(storage): Use StoreSchemaRef to avoid cloning the whole StoreSchema struct
* test(storage): Fix test_store_schema
* feat(datatypes): Return error on duplicate meta key
* chore: Address CR comments
* refactor: Remove column_null_mask in MutationExtra
MutationExtra::column_null_mask is no longer needed, as we can ensure
there is no missing column in the WriteBatch.
* feat(storage): Remove MutationExtra
Just store the MutationType in the WalHeader; MutationExtra is no longer needed
* feat: Adds ColumnDefaultConstraint::create_default_vector
ColumnDefaultConstraint::create_default_vector is ported from
MitoTable::try_get_column_default_constraint_vector.
* refactor: Replace try_get_column_default_constraint_vector by create_default_vector
* style: Remove unnecessary map_err in MitoTable::insert
* feat: Adds compat_write
For each column in `dest_schema` but not in `write_batch`, this method inserts a
vector with the default value into the `write_batch`. If there are columns not in
`dest_schema`, an error is returned.
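A self-contained sketch of that contract, with toy types standing in for Schema and WriteBatch:

```rust
use std::collections::HashMap;

// dest_schema: column name -> default value; write_batch: column -> data.
fn compat_write(
    dest_schema: &HashMap<String, i64>,
    write_batch: &mut HashMap<String, Vec<i64>>,
    num_rows: usize,
) -> Result<(), String> {
    // Pad columns missing from the batch with their default values.
    for (name, default) in dest_schema {
        write_batch
            .entry(name.clone())
            .or_insert_with(|| vec![*default; num_rows]);
    }
    // A column unknown to the destination schema is an error.
    for name in write_batch.keys() {
        if !dest_schema.contains_key(name) {
            return Err(format!("column {name} is not in dest_schema"));
        }
    }
    Ok(())
}
```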
* chore: Add info log to RegionInner::alter
* feat(storage): RegionImpl::write support request with old version
* feat: Add nullable check when creating default value
* feat: Validate nullable and default value
* chore: Modify PutOperation comments
* chore: Make ColumnDescriptor::is_nullable readonly and validate name
* feat: Use CompatWrite trait to replace compat::compat_write method
Adds a CompatWrite trait to support padding columns to WriteBatch:
- The WriteBatch and PutData implement this trait
- Fix the issue that WriteBatch::schema was not updated to the
schema after compat
- Also validate the created column when adding it to PutData
The WriteBatch now also pads default values to missing columns in
PutData, so the memtable inserter doesn't need to manually check whether
the column is nullable and then insert a NullVector. Every WriteBatch is
ensured to have all columns defined by the schema in its PutData.
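The trait shape this describes might look like the following; only the trait name comes from the commits, the signature is assumed:

```rust
struct Schema; // stand-in for the real schema type

// Implemented by both WriteBatch and PutData, so the padding logic lives
// in one place instead of in the memtable inserter.
trait CompatWrite {
    type Error;

    /// Pad default values for columns missing from `self` so it matches
    /// `dest_schema`; columns unknown to `dest_schema` are an error.
    fn compat_write(&mut self, dest_schema: &Schema) -> Result<(), Self::Error>;
}
```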
* feat: Validate constraint by ColumnDefaultConstraint::validate()
The ColumnDefaultConstraint::validate() would also ensure the default
value has the same data type as the column's.
* feat: Use NullVector for null columns
* fix: Fix BinaryType returns wrong logical_type_id
* fix: Fix tests and revert NullVector for null columns
NullVector doesn't support a custom logical type, which makes it hard to
encode/decode and also causes the arrow/protobuf codec of the write batch
to fail.
* fix: create_default_vector use replicate to create vector with default value
This would fix the test_codec_with_none_column_protobuf test, as we need
to downcast the vector to construct the protobuf values.
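A hedged sketch of the approach; the vector API and helper are assumed, not the project's actual signatures:

```rust
// Build a one-element vector holding the default value, then repeat that
// single element num_rows times. Unlike a NullVector, the result keeps
// the column's concrete vector type (e.g. Int64Vector), so the protobuf
// codec can downcast it.
fn create_default_vector(default: &Value, num_rows: usize) -> VectorRef {
    let one = vector_with_single_value(default); // hypothetical helper, length 1
    one.replicate(&[num_rows]) // element 0 repeated num_rows times
}
```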
* test: add tests for column default constraints
* test: Add tests for CompatWrite trait impl
* test: Test write region with old schema
* fix(storage): Fix replay() applies metadata too early
The committed sequence of the RegionChange action is the sequence of the
last entry that uses the old metadata (schema). During replay, we should
apply the new metadata after we see an entry whose sequence is greater
than (not equal to) `RegionChange::committed_sequence`.
Also remove the duplicate `set_committed_sequence()` call in
persist_manifest_version()
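In sketch form (names assumed), the corrected condition during replay:

```rust
// RegionChange::committed_sequence is the sequence of the LAST entry that
// still uses the old metadata, so the new metadata only takes effect for
// entries strictly after it.
if entry_sequence > region_change.committed_sequence {
    apply_new_metadata(); // hypothetical helper: switch to the new schema
}
```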
* chore: Removes some unreachable code
Also add more comments to document the code in these files
* refactor: Refactor MitoTable::insert
Return an error if we cannot create a default vector for the given column,
instead of ignoring the error
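In effect (signature assumed), the change replaces an ignored failure with `?` propagation:

```rust
// Before (assumed): a failed default-vector creation was silently skipped.
// After: the error is returned to the caller of MitoTable::insert.
let vector = default_constraint.create_default_vector(data_type, is_nullable, num_rows)?;
```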
* chore: Fix incorrect comments
* chore: Fix typo in error message
* refactor: replace another axum-test-helper branch
* refactor: upgrade opendal version
* refactor: use cursor for file buffer
* refactor: remove native-tls in mysql_async
* refactor: use async block and pipeline for newer opendal api
* chore: update Cargo.lock
* chore: update dependencies
* docs: removed openssl from build requirement
* fix: call close on pipe writer to flush reader for parquet streamer
* refactor: remove redundant return
* chore: use pinned revision for our forked mysql_async
* style: avoid wild-card import in test code
* Apply suggestions from code review
Co-authored-by: Yingwen <realevenyag@gmail.com>
* style: use chained call for builder
Co-authored-by: liangxingjian <965662709@qq.com>
Co-authored-by: Yingwen <realevenyag@gmail.com>
* feat: adds committed_sequence to RegionChange action, #281
* refactor: save the protocol action when the writer version is changed
* feat: recover all region metadata in manifest and replay them when replaying WAL, #282
* refactor: minor change and test recovering metadata after altering table schema
* fix: wrong min_reader_version written into manifest for region
* refactor: move up DataRow
* refactor: address CR comments
* test: assert recovered metadata
* refactor: address CR comments
* fix: comment