Add evaluation results module #3633
base: main
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Wauplin left a comment
Made a first pass and the parsing logic looks good to me 👍
I would also add a simple HfApi.get_eval_results method that takes a repo as input (plus token and revision) and returns a list of eval result entries taken from the README + .eval_results folder. A bit similar to get_safetensors_metadata, which parses high-level info from a repo. I don't think we need a method to upload eval results though.
(I know it's only a draft, happy to review again when ready^^)
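For illustration, a rough sketch of what that suggested method could look like. The signature and return type below are assumptions based on the comment, not code from this PR:

# Hypothetical sketch of the suggested HfApi.get_eval_results -- not part of this PR.
from typing import List, Optional

from huggingface_hub.utils import validate_hf_hub_args


class HfApi:
    @validate_hf_hub_args
    def get_eval_results(
        self,
        repo_id: str,
        *,
        revision: Optional[str] = None,
        token: Optional[str] = None,
    ) -> List["EvalResultEntry"]:
        """Collect eval result entries from the model card README and the
        repo's .eval_results/ folder (illustrative signature only)."""
        ...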
src/huggingface_hub/_eval_results.py (Outdated)
    entry = EvalResultEntry(
        dataset_id=dataset["id"],
        value=item["value"],
        task_id=dataset.get("task_id"),
        dataset_revision=dataset.get("revision"),
        verify_token=item.get("verifyToken"),
        date=item.get("date"),
        source_url=source.get("url") if source else None,
        source_name=source.get("name") if source else None,
        source_user=source.get("user") if source else None,
    )
    entries.append(entry)
else:
    # https://github.com/huggingface/hub-docs/blob/434609e6d09f7c1203ea59fcc32c7ff4d308a68e/modelcard.md?plain=1#L23 format
    source = item.get("source", {})
    for metric in item.get("metrics", []):
        entry = EvalResultEntry(
            dataset_id=dataset["type"],
            value=metric["value"],
            task_id=dataset.get("config"),
maybe those should be two different types (legacy and new)? no strong opinion though
or just keep using EvalResult (the previous type)
on the Hub at least we have two different types
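To illustrate the suggestion, a minimal sketch of what splitting the entry into two dataclasses could look like. The field names are copied from the parsing code above; the class names and the split itself are only illustrative:

# Illustrative only: one possible split into "new" and "legacy" entry types.
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class NewEvalResultEntry:
    # Fields parsed from .eval_results/*.yaml files (first branch above).
    dataset_id: str
    value: Any
    task_id: Optional[str] = None
    dataset_revision: Optional[str] = None
    verify_token: Optional[str] = None
    date: Optional[str] = None
    source_url: Optional[str] = None
    source_name: Optional[str] = None
    source_user: Optional[str] = None


@dataclass
class LegacyEvalResultEntry:
    # Fields parsed from the legacy README model-index metadata (second branch above).
    dataset_id: str                # mapped from dataset["type"]
    value: Any                     # mapped from metric["value"]
    task_id: Optional[str] = None  # mapped from dataset.get("config")
    source_url: Optional[str] = None
    source_name: Optional[str] = None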
Co-authored-by: Julien Chaumond <[email protected]>
src/huggingface_hub/hf_api.py (Outdated)
    )

    @validate_hf_hub_args
    def get_eval_results(
i'm not sure we need this, because this will be exposed by the Hub API
ok nice, we will have to update ModelInfo (and model_info) then
hanouticelina left a comment
Added eval_results property to ModelInfo and updated expand docstring in model_info and list_models.
We need the server-side PR (private) to be merged first.
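Once the server-side support lands, usage could look roughly like this. Only the eval_results property on ModelInfo is confirmed by this PR; the expand value "evalResults" and the repo id are assumptions:

# Hypothetical usage once eval results are exposed on the model info payload.
from huggingface_hub import HfApi

api = HfApi()
# "evalResults" as an expand key is an assumption, not a confirmed value.
info = api.model_info("some-org/some-model", expand=["evalResults"])
for entry in info.eval_results or []:
    print(entry.dataset_id, entry.value)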
Add Evaluation Results module to support the Hub's new decentralized evaluation results system: https://huggingface.co/docs/hub/eval-results
This PR introduces:
- EvalResultEntry dataclass representing evaluation scores stored in .eval_results/*.yaml files.
- eval_result_entries_to_yaml() to serialize entries to the YAML format.
- parse_eval_result_entries() to parse YAML data back into EvalResultEntry objects.

This lives in a new module, separate from the existing repocard_data.py, which handles the (legacy?) model-index format in README metadata. Backward compatibility is maintained for now.
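A minimal round-trip sketch of the new helpers, using the constructor fields visible in the diff above. The import path follows the file touched in this PR, but the exact signatures are assumptions, not a confirmed public API:

# Illustrative only: names come from the PR description and diff, signatures are assumed.
from huggingface_hub._eval_results import (
    EvalResultEntry,
    eval_result_entries_to_yaml,
    parse_eval_result_entries,
)

entries = [
    EvalResultEntry(
        dataset_id="glue",              # hypothetical dataset id
        value=0.91,                     # hypothetical score
        task_id="text-classification",  # hypothetical task
    )
]

# Serialize the entries to the .eval_results/*.yaml format...
yaml_data = eval_result_entries_to_yaml(entries)

# ...and parse them back into EvalResultEntry objects.
round_tripped = parse_eval_result_entries(yaml_data)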