0x2620/pandora

Author	SHA1	Message	Date
Will Thompson	aa40a40595	Annotation.json: only include entity id & name Fetching documents for each entity in turn is expensive. (I have tried using ArrayAgg to fetch them in the same query as the Entity — no improvement. It's possible that being able to join to entity_entity, and then use ArrayAgg, would be better.) Even once you've fetched them all, if the same entity appears many times in an item, then get(..., keys=['layers']) duplicates the whole JSON for the entity many times: expensive to serialize, expensive to send over the wire. Pandora's own web interface only depends on the 'id' key of 'entity' in each annotation, and refetches the rest of the entity to show the pop-up dialog when you press E. So by just not bothering to fetch and send any other keys, get(..., keys=['layers']) on an item with many entity annotations is substantially faster. (I experimented with splitting the full entities off to one side, so, you'd have: { "layers": { somelayer: [..., {..., "entity": {"id": ABC}}, ], ... }, "entities": { ABC: {...}, ... } } This is quicker than the status quo, but obviously not as fast as not fetching & sending the rest at all!)	2016-04-28 14:15:23 +01:00
Will Thompson	aa0fbc9d4a	Entity.json: get document ids from join table This is a bit quicker because it's just a lookup in a single table, not a join.	2016-04-28 14:15:12 +01:00
j	c149c5c42e	use xenial repository if installing on 16.04	2016-04-28 13:27:46 +02:00
Will Thompson	400b6650a2	Annotation.json: document empty-subtitle special case	2016-04-19 13:52:52 +01:00
Will Thompson	af0d87b569	Annotation.json: reduce repeated layer lookups It's actually quite costly to look up keys in CONFIG, particularly inside a loop: this trims ~5% off get(keys=['layers']) for annotation-heavy items.	2016-04-19 13:52:47 +01:00
Will Thompson	3f5be0bd27	findClips: look up entity names (fixes #2804)	2016-04-19 12:28:58 +01:00
Will Thompson	d0129a4416	findClips: avoid O(n²) lookup of clip from annotation This doesn't make much difference for small ranges, of course.	2016-04-19 11:25:12 +01:00
Will Thompson	ba00bcbf7b	findClips: select_related('item') / ('item__sort') Clip.public_id uses self.item.public_id. Clip.json() uses self.item.sort, so we should select_related on that rather than the clip's own sort field. (They are identical objects. Is Clip.sort ever used directly?) With this change, findClips() issues one query to fetch clips plus one query per flavour of annotation; before, it issued two extra queries per clip.	2016-04-19 11:25:06 +01:00
Will Thompson	6dbb7f921a	findClips: only scan layers once	2016-04-19 11:14:25 +01:00
j	27830d7c58	use markdown in readme	2016-04-15 14:21:24 +02:00
Will Thompson	b3df5b8d56	findAnnotations: match some fields case-sensitively Requiring layer to have the right case is consistent with addAnnotation(), and means the _layer[_like] index can be used. In my testing, if itemsQuery specifies a single item, then postgres doesn't bother with the layer index anyway; but if not, it makes a pretty big (~3×) difference. Matching public_id and item__public_id case-sensitively also seems reasonable (it's consistent with get() and getAnnotation()). (Is lower() redundant for the case-insensitive comparisons? ie. is UPPER(x.lower()) == UPPER(x)? I'm not sure, it's cheap, let's leave it.)	2016-04-05 12:19:32 +01:00
Will Thompson	8d1b4de337	findAnnotations(): make 'findvalue' the default key Annotations have no 'name' field, so findAnnotations({query: {conditions: [{value: 'foo'}]}}) would previously raise an exception.	2016-04-05 12:19:31 +01:00
Will Thompson	284caf03c3	get_by_key: short-circuit This is about 30% faster, presumably because it avoids allocation and/or closing over variables is slow(?). It's not hugely significant (I misread a line_profile report) but why not.	2016-04-05 12:19:31 +01:00
j	7ac68697d4	update pdf.js	2016-04-04 15:50:07 +02:00
j	e1967e96bc	fix pdf zoom	2016-04-04 15:50:07 +02:00
j	652df88342	return 404	2016-04-04 15:50:07 +02:00
j	1bff4aa0e9	avoid storing invalid poster frames, only show videos with video	2016-04-01 16:40:20 +02:00
j	b8beb51480	fix multipart audio only timelines	2016-03-31 14:54:38 +02:00
j	30ce422452	disable apt translations	2016-03-26 22:56:09 +01:00
j	94b940436f	fix timelines for items with many parts - use durations from streams not from timelines - don't accumulate timeline drift	2016-03-19 18:58:48 +01:00
j	f0b8b2b81e	check that range is [int, int]	2016-03-17 16:06:08 +01:00
j	e536dcb3b0	<=	2016-03-17 10:47:08 +01:00
j	7761cf9ec2	update celery package and prompt to install new init files for workers	2016-03-17 10:38:15 +01:00
Will Thompson	7554b0c105	init: restart celery workers on 'reload' (fixes #2904) Sending HUP to the parent of a family of celery workers causes the parent to re-exec itself, spawning a new set of child workers without terminating the old ones. So instead we send TERM to the parent on 'reload', which cleans up the children, and rely on systemd/upstart to respawn the whole family.	2016-03-17 10:32:58 +01:00
j	e16310062b	fix vm build on Ubuntu 16.04	2016-03-15 23:04:53 +01:00
Will Thompson	eeaeda3970	Support WebVTT subtitle export	2016-03-11 14:16:23 +01:00
j	36463a8120	fix typo in README	2016-03-11 10:33:48 +01:00
j	697e501a4f	only update item timeline once all parts are done	2016-03-11 10:33:48 +01:00
j	f6cebcaec9	fix user/group api	2016-03-08 20:14:05 +05:30
j	3a56a8138d	only install systemd if /bin/systemctl is available	2016-03-08 19:51:19 +05:30
j	bff4a9553d	needs daemon-reload after replacing systemd service file	2016-03-08 13:20:33 +05:30
j	4fd865efeb	fix pid	2016-03-08 13:18:35 +05:30
j	29204b6fb5	move gunicorn configuration from init script to config file	2016-03-07 14:25:24 +05:30
j	7ec1e9f6da	update django version	2016-03-06 21:59:49 +05:30
j	9d0b50bced	build vm with vmdebootstrap	2016-03-05 18:36:39 +05:30
j	4f28c2c548	fix annotation import, values are decoded in d1.9	2016-03-05 15:36:47 +05:30
Will Thompson	a8dcbbbe89	Include DocumentProperties.data in Document.json()	2016-03-05 15:07:47 +05:30
Will Thompson	a55cbcfb9f	DocumentProperties: add data field	2016-03-05 15:07:47 +05:30
j	42ac4a88b8	Only show Find: Entity if config defines entities Followup to 9a4c24	2016-03-05 14:49:51 +05:30
Will Thompson	0c98cd080e	Entity.alternativeNames: default to () not [] (fixes #2896) Otherwise this: self.name_find = '||' + '||'.join((self.name,) + self.alternativeNames) + '||' fails because () + [] is an error. I guess this must have been introduced by the DictField/TupleField rewrite. Without this fix, it is impossible to create a new entity. Basically the same logic is used for Event and Place too so I've made the same change to those, and, in passing, fix another copy of the bug fixed for Entity.name_find in `fe7f961`.	2016-03-04 17:11:36 +00:00
Will Thompson	9a4c24cdb4	Support searching documents by entities	2016-03-04 12:41:41 +00:00
Will Thompson	738a9282b4	Document: fix negating id queries	2016-03-04 12:41:41 +00:00
Will Thompson	8c23bdff6d	Implement DocumentProperties.__unicode__	2016-03-04 12:41:41 +00:00
j	4613005b83	use geoip2 api to fix ipv6 lookups	2016-03-04 12:50:44 +05:30
Will Thompson	340277db1a	Raise Error.stackTraceLimit, if it exists (fixes #2894)	2016-03-03 18:15:37 +05:30
Will Thompson	c6f9f87c8e	Fix autocompleteSort with multiple keys (fixes #2893) QuerySet.order_by() takes each key as a separate argument, not as a single comma-separated string.	2016-03-03 18:15:37 +05:30
Will Thompson	2a07e2a1ab	Remove redundant overrides of Model.delete Both of these models have pre_delete handlers which do the same things, so I think these are unnecessary.	2016-03-03 18:10:29 +05:30
Will Thompson	d69a8efd97	Don't save other file-owning models on delete, either	2016-03-03 18:10:29 +05:30
Will Thompson	6e0049a20c	Don't save Document in pre_delete handler (fixes #2889) FileField.delete() will, by default, save() the model instance it is attached to. This is pointless if we're in the process of deleting the Document -- and since Document.save() calls Document.update_matches(), this scans all annotations every time a document is deleted.	2016-03-03 18:10:29 +05:30
Will Thompson	7d99950942	Only setInterval once to animate the loading icon (fixes #2888) (On Chrome, at least,) window.onload() is called once by hand, and once by the browser. This ends up calling setInterval() twice. When stopAnimation() is called later, only the second interval is cleared; so the first one keeps firing forever. Mostly harmless but unnecessary. Only the first hunk of this patch is really needed, but making startAnimation() / stopAnimation() idempotent can't hurt.	2016-03-03 18:08:46 +05:30
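
The failure described in commit 0c98cd080e (tuple plus list concatenation) can be reproduced outside Django. The following is only a minimal sketch: `build_name_find` and `concat_fails` are hypothetical helper names, not functions from the pandora code base, and the `tuple(...)` coercion is an extra defensive step added here for illustration; the commit itself fixed the problem by changing the field default from `[]` to `()`.

```python
def build_name_find(name, alternative_names=()):
    """Join a name and its alternatives into a '||'-delimited search string.

    Hypothetical stand-in for the name_find expression quoted in the commit.
    """
    # Coercing to tuple is an assumption of this sketch: it makes the
    # concatenation safe even if a list is passed in.
    return '||' + '||'.join((name,) + tuple(alternative_names)) + '||'


def concat_fails(alternative_names):
    """Return True if (name,) + alternative_names raises TypeError."""
    try:
        ('x',) + alternative_names
    except TypeError:
        return True
    return False
```

With a tuple default the original expression works unchanged; with a list default, `concat_fails([])` reports the TypeError that made it impossible to create a new entity.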

5926 commits