Don't include documents in entities in annotation JSON #2913
Reference: 0x2620/pandora#2913
cf #2830
If you have many documents attached to entities, each entity has a lot of data, and an item `X` has lots of annotations pointing to entities, then `get(id=X, keys=['layers'])` gets a bit painful. Partly because fetching the documents for each entity in turn is expensive, and partly because the JSON for each entity is big. Also, if the same entity appears many times in an item, then it's completely redundant to send the full entity JSON along with each annotation.

So here are two ideas for improving this, both branches on https://gitlab.com/wjt/pandora.git (on top of #2804). They're the same except for the top patch on each:
1. `get-layers-no-entity-documents`: leave documents out of entities inside annotations. This is about 30% faster in my testing.
2. `get-layers-only-entity-name`: Pandora's own interface only uses the `id` and `name` keys of `annotation.entity` (it refetches the whole entity to show the popover), so just leave everything else out. This is ~3× faster.

Both could break applications using the API; obviously 2 is more likely to.
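To make the difference between the two branches concrete, here is a minimal sketch of the two trimming strategies applied to a plain dict. It is not the code from either branch: the helper names `strip_documents` and `only_id_and_name` and the sample entity are hypothetical, and only the `id`, `name` and `documents` keys come from the discussion above.

```python
import json


def strip_documents(entity):
    """Idea 1: keep the entity as-is but leave out its `documents` key."""
    return {key: value for key, value in entity.items() if key != 'documents'}


def only_id_and_name(entity):
    """Idea 2: keep only the `id` and `name` keys of the entity."""
    return {key: entity[key] for key in ('id', 'name') if key in entity}


if __name__ == '__main__':
    # Hypothetical entity; a real one may carry many more keys.
    sample = {
        'id': 'A',
        'name': 'Some Entity',
        'description': 'A long description ' * 20,
        'documents': [{'id': 'D%d' % i, 'title': 'Document %d' % i} for i in range(50)],
    }
    # Compare the serialized size of each variant.
    for label, variant in (
        ('full', sample),
        ('no documents', strip_documents(sample)),
        ('id and name only', only_id_and_name(sample)),
    ):
        print(label, len(json.dumps(variant)))
```

Neither helper touches the input dict, so a caller that still needs the full entity (for example, to show the popover) can keep using the original.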
A Third Way would be to split the entities out, so `get` would return each entity once, separately from the annotations that refer to it. This is somewhere in between the two branches here, speed-wise, and every API method that returns annotations would need to be changed...
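As an illustration of what splitting the entities out could look like, here is a minimal sketch that deduplicates entities across a list of annotations. The response shape (an `entities` mapping keyed by id, with each annotation keeping only the entity id) is an assumption made for this example, not the format proposed in this issue, and `split_entities` is a hypothetical helper.

```python
def split_entities(annotations):
    """Return annotations that reference entities by id, plus each entity once.

    Assumed (hypothetical) output shape:
        {'annotations': [...], 'entities': {entity_id: entity_json}}
    """
    entities = {}
    slim_annotations = []
    for annotation in annotations:
        annotation = dict(annotation)  # shallow copy, leave the input untouched
        entity = annotation.pop('entity', None)
        if entity is not None:
            # Store the full entity JSON only the first time it is seen.
            entities.setdefault(entity['id'], entity)
            annotation['entity'] = entity['id']
        slim_annotations.append(annotation)
    return {'annotations': slim_annotations, 'entities': entities}


if __name__ == '__main__':
    entity = {'id': 'A', 'name': 'Some Entity', 'documents': []}
    result = split_entities([
        {'id': 'X/1', 'entity': entity},
        {'id': 'X/2', 'entity': entity},
        {'id': 'X/3', 'value': 'plain text annotation'},
    ])
    print(result)
```

With this shape, an entity that appears many times in an item is serialized once, which is where the saving over sending the full entity with every annotation would come from.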
What do you think?
i think just including id/name is good enough. always including entities instead of fetching them on demand might also slow things down. will test your patches and merge; not aware of it breaking any use of the api besides yours, possibly.
Yes, only fetching & returning id & name for entities does break my application, but obviously I am happy to fix it! :-)
Ah, one of these patches actually breaks `Entity.json(keys=['documents'])`, will fix…

Fixed both branches. Previously, “Entity.json: get document ids from join table” was fetching `id` from `entity_documentproperties` rather than `document_id`.

Apart from this, no other problems to report from running the `get-layers-only-entity-name` branch here for the last couple of days.

In 34747c0/pandora: