Freeze metadata #3140

Draft: wants to merge 1 commit into main

benjeffery
Member

Fixes #993

Here's a suggestion for how frozen metadata could work. Note that this will have a performance impact, as we have to traverse the metadata structure. There is probably a way to avoid the traversal by using object_hook in the JSON decoder and changing the decoder functions in the struct codec. I'll do that if we think this is the right way to go.
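As a rough illustration of the object_hook idea mentioned above, here is a minimal sketch using only the standard library. It is not the code in this PR: the function name `decode_frozen` is hypothetical, and using `types.MappingProxyType` as the read-only wrapper is an assumption made for the example. Note that JSON arrays still come back as ordinary mutable lists, so a full solution would need more than this.

```python
import json
from types import MappingProxyType


def decode_frozen(encoded: bytes):
    """Decode JSON so that every object becomes a read-only mapping.

    object_hook is called once per decoded JSON object, innermost first,
    so no separate traversal of the result is needed.
    """
    return json.loads(encoded, object_hook=MappingProxyType)


md = decode_frozen(b'{"foo": 1, "nested": {"bar": 2}}')

try:
    md["foo"] = 123  # item assignment is rejected by the read-only view
except TypeError as exc:
    print("rejected:", exc)
```

Because the hook runs during decoding, the wrapping cost is paid per JSON object rather than in a second pass over the decoded structure.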

codecov bot commented Apr 14, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 86.15%. Comparing base (feeecb5) to head (f1cb404).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3140      +/-   ##
==========================================
- Coverage   89.94%   86.15%   -3.80%     
==========================================
  Files          29       10      -19     
  Lines       32651    17132   -15519     
  Branches     5854     3311    -2543     
==========================================
- Hits        29368    14760   -14608     
+ Misses       1869     1325     -544     
+ Partials     1414     1047     -367     
Flag Coverage Δ
c-tests 86.66% <ø> (ø)
lwt-tests 80.38% <ø> (-0.40%) ⬇️
python-c-tests ?
python-tests ?

see 21 files with indirect coverage changes

@petrelharp
Contributor

Just thinking this through. Currently the nice way to modify metadata is like

md = tables.metadata
md['foo'] = 123
tables.metadata = md

IIUC, with this change, this won't work: it'll need an additional step to convert md to a regular dict? Just an observation, not sure what to do about it.
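To make the extra conversion step concrete, here is a hedged sketch of what "thawing" a read-only structure might look like. The helper `thaw` is hypothetical and not part of tskit or this PR, and `MappingProxyType` stands in for whatever frozen type would actually be returned.

```python
from collections.abc import Mapping
from types import MappingProxyType


def thaw(obj):
    """Recursively copy read-only containers into plain dicts and lists."""
    if isinstance(obj, Mapping):
        return {key: thaw(value) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [thaw(item) for item in obj]
    return obj


# Stand-in for frozen metadata as it might be returned by an accessor.
frozen = MappingProxyType({"foo": 1, "nested": MappingProxyType({"bar": 2})})

md = thaw(frozen)          # the additional step
md["foo"] = 123            # now an ordinary dict, so this works
md["nested"]["bar"] = 456
# In the real workflow, `tables.metadata = md` would follow here.
```

A shallow `dict(frozen)` would be enough for top-level keys only; the recursive version is needed once nested values are modified.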

I guess we should also measure the performance implications.
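One rough way to get a first number is a micro-benchmark of decoding with and without a freezing hook. This is only a sketch with the standard library: the sample document is made up, `MappingProxyType` is an assumed stand-in for the frozen type, and real measurements would need to use tskit's own codecs and realistic metadata.

```python
import json
import timeit
from types import MappingProxyType

# Made-up sample metadata; real tables would have their own shapes and sizes.
encoded = json.dumps(
    {"id": 1, "tags": ["a", "b"], "nested": {"x": 1.5, "y": {"z": None}}}
).encode()


def decode_plain():
    return json.loads(encoded)


def decode_frozen():
    # Wrap every decoded JSON object in a read-only view.
    return json.loads(encoded, object_hook=MappingProxyType)


n = 20_000
plain = timeit.timeit(decode_plain, number=n)
frozen = timeit.timeit(decode_frozen, number=n)
print(f"plain : {plain / n * 1e6:.2f} us per decode")
print(f"frozen: {frozen / n * 1e6:.2f} us per decode")
```

The absolute timings depend on the machine and Python version, so only the ratio between the two lines is informative.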

@jeromekelleher
Member

jeromekelleher commented Apr 15, 2025

Hmm. I'm leaning towards marking #993 as a wontfix and living with it as a known wart. Changes like this will break code, probably quite a lot of it. As it's not fixing an outright bug, and more an attempt to prevent people from making mistakes (or helping them find their mistakes quicker) it's not obvious that the good outweighs the harm here. For example, I can see a lot of older code getting broken by the change, resulting in people pinning to older versions of tskit rather than bothering with a fix.

@jeromekelleher
Member

(Thanks for working it up @benjeffery, it's really helpful to have a concrete proposal to discuss!)

@jeromekelleher
Member

A large and loud warning about this behaviour at the top of the metadata section, perhaps along with a warning on every accessor function (via an rst macro thingy), would be a better solution all round at this point, I think.

@hyanwong
Member

hyanwong commented Apr 15, 2025

What about a user warning if you deliberately attempt to set metadata directly? Or is that too complicated / intricate to figure out how to do in a performant way?

However, I can't figure out how not to warn in the example Peter gave above, of setting stuff in the returned object.
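The difficulty can be seen in a small sketch. The class `WarnOnSetDict` below is hypothetical (not a tskit type): it warns on every item assignment, and that is exactly why it cannot tell a mistaken in-place edit apart from the legitimate read-modify-write pattern, since both go through the same `__setitem__` call on the returned object.

```python
import warnings


class WarnOnSetDict(dict):
    """Hypothetical dict subclass that warns whenever an item is assigned."""

    def __setitem__(self, key, value):
        warnings.warn(
            "in-place changes to metadata are not written back; "
            "assign the whole object instead",
            UserWarning,
            stacklevel=2,
        )
        super().__setitem__(key, value)


md = WarnOnSetDict({"foo": 1})  # stand-in for the object an accessor returns

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # This is the first step of the *correct* pattern, yet it still warns.
    md["foo"] = 123

print(len(caught), "warning(s) raised; value is now", md["foo"])
```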

@benjeffery
Member Author

> Hmm. I'm leaning towards marking #993 as a wontfix and living with it as a known wart.

I completely agree, having worked it up and seeing how much unrelated test breakage we had. I'd be happy with some prominent docs warnings.

I also don't think the behaviour is too unexpected, but maybe I'm too close to the code to simulate what it is like for an outsider.

@petrelharp
Contributor

> I also don't think the behaviour is too unexpected, but maybe I'm too close to the code to simulate what it is like for an outsider.

I think it's pretty mystifying that assigning does nothing; however, people will figure this out reasonably quickly (and actually modifying metadata is a semi-advanced thing). But given there doesn't seem to be a reasonable way to fix it, a bigger issue IMO is that the way to actually modify metadata is relatively obscure. I'd propose we provide an example of modifying existing (top-level and per-table-entry) metadata (with a header, so it's easily findable) in the docs.
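For readers unfamiliar with why the assignment appears to do nothing, here is a small stand-in that reproduces the effect. `StandInTables` is hypothetical and is not tskit code; it assumes, for illustration, that metadata is stored in encoded form and decoded into a fresh object on every access.

```python
import json


class StandInTables:
    """Hypothetical stand-in (not tskit): metadata is stored encoded
    and decoded into a new dict each time it is read."""

    def __init__(self):
        self._encoded = json.dumps({"foo": 1}).encode()

    @property
    def metadata(self):
        return json.loads(self._encoded)  # a fresh copy on every access

    @metadata.setter
    def metadata(self, value):
        self._encoded = json.dumps(value).encode()


tables = StandInTables()

# Mutates a temporary copy that is immediately discarded: silently a no-op.
tables.metadata["foo"] = 123
unchanged = tables.metadata["foo"]

# The pattern that works: read, modify, assign the whole object back.
md = tables.metadata
md["foo"] = 123
tables.metadata = md
changed = tables.metadata["foo"]

print(unchanged, changed)
```

Under that assumption, no error is raised in the first case because the in-place edit succeeds on the temporary copy; it just never reaches the stored bytes.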

@hyanwong
Member

Because it's specifically a python thing, it's documented at https://tskit.dev/tutorials/metadata.html#modifying-metadata-and-schemas, I think.

@petrelharp
Contributor

> Because it's specifically a python thing, it's documented at https://tskit.dev/tutorials/metadata.html#modifying-metadata-and-schemas, I think.

Ah, right - and, there's a link up top from the other Metadata docs. Still, it'd be good to have not just "see here" but maybe a warning that in-place modification won't work. (Also, those docs don't have the simplest use cases of "change an existing value in top-level metadata" or "change/add a value in each row".)

Successfully merging this pull request may close these issues.

Metadata can appear to be set directly, but doesn't work: unfriendly to the user
4 participants