Mask as a struct attribute.#304
Open
yfukai wants to merge 40 commits into
Open
Conversation
The previous commit (db35287) mistakenly deleted upstream/main's _SQLIDSet scratch-table machinery, _create_id_scratch_table, the out_degree/copy bound-variable handling, and the three scratch-table tests — they were diffed against a stale fork main and wrongly treated as PR-added code. Restore them verbatim from main; the struct-attr column-expansion simplification from the previous commit is kept.
Masks were registered as pl.Object and round-tripped through pickle,
leaving the bbox locked inside an opaque blob. They are now stored as
pl.Struct({min_(z)yx, max_(z)yx: Int64, data: Binary}) so bbox fields
are natively filterable via NodeAttr("mask").struct.field(...), while
the binary mask stays blosc2-compressed in the data field.
- Mask.struct_dtype() / to_struct() / from_struct() conversion API,
plus as_mask() to coerce struct dicts and legacy Mask objects alike
- RegionPropsNodes and MaskDiskAttrs register the struct dtype and
write struct values
- consumers (GraphArrayView, IoUEdgeAttr, MaskMatching, ctc metrics,
compute_overlaps, to_geff) materialize masks via as_mask(), so
legacy pl.Object mask attributes keep working
- ctc metrics skip the pickle-to-bytes multiprocessing shim for
struct mask columns, which are Arrow-native
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
This was referenced Jun 14, 2026
Merged
Resolve conflicts from upstream's struct-attribute PR (royerlab#268) and the single->bulk add_node/add_edge refactor: - attrs.py: take upstream's `columns` property and `f.to_attr().expr` filter reduction (handles compound AttrFilters as well as struct-field AttrComparisons). - _rustworkx_graph.py: drop the now-duplicate nested `_extract_field_path`; use upstream's module-level `_eval_filter` (supports AND/OR/XOR/NOT). - _sql_graph.py: adopt upstream's `_to_sql_clause` and the removal of the single `add_node`/`add_edge` methods (BaseGraph now delegates to the bulk variants). Mask struct flattening is preserved via `_flatten_attrs_for_write` on the bulk write paths. Mask remains stored as a struct attribute (this branch's feature); upstream left Mask as pl.Object. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The Mask struct's `data` leaf (blosc2-compressed bytes) was stored through a SQLAlchemy PickleType column, wrapping the already-compressed bytes in a second pickle layer on every write and unpickling on every read. Map `pl.Binary` to `sa.LargeBinary` so binary bytes are stored as a raw BLOB. Pickling is now detected from the actual SQL column type (PickleType only, not LargeBinary): - `_is_pickled_sql_type` returns True only for PickleType columns. - `unpickle_bytes_columns` takes the explicit set of pickled physical columns so raw-binary columns (e.g. the Mask `data` leaf) are left untouched. - `_restore_pickled_column_types` re-tags reflected LargeBinary columns as PickleType except genuine raw-binary columns (schema dtype pl.Binary), which reflection cannot otherwise distinguish from pickled blobs. Reordered `_define_schema` so the attribute schemas are available when this runs. Adds a regression test asserting the Mask `data` leaf is a raw LargeBinary column (not PickleType) before and after reload, and that the mask round-trips and struct-field filtering still works. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
- Remove duplicate `_StructNamespace` class in attrs.py left by the upstream merge (this branch and upstream PR royerlab#268 each added an identical copy). - Drop the now-empty `if TYPE_CHECKING: pass` block and unused `TYPE_CHECKING` import in array/_graph_array.py. - Fix `_restore_pickled_column_types` docstring to reference the helper it actually uses (`_raw_binary_physical_columns`). - Align Mask.bbox_struct_fields/struct_dtype docstrings with the supported 1-to-3 dimension range. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
These four imports break a graph <-> nodes package import cycle: `_base_graph` is imported by nearly every package `__init__`, so a module-level `from tracksdata.nodes._mask import as_mask` re-enters the partially-initialized graph package. Add a short comment at each site so the pattern isn't naively "tidied" into a module-level import. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This PR explores options to save the mask as a struct attribute. Expect #268 be merged.