decentral1se/pandora

Author	SHA1	Message	Date
Will Thompson	8d25e3be78	findDocuments: improve entity query performance. When I implemented this in `9a4c24c`, there were not many rows in entity_documentproperties in the database here. Now that there are, computing the document_document -> entity_documentproperties -> entity_entity join and then filtering is really, really slow. Postgres seems to materialize the whole join and then scan it. If we get a set of matching document IDs for the entity query in a subquery, and then just filter with IN on that, things are much faster: scan entity_entity; in a nested loop, get the document_ids via entity_documentproperties; hash this set; and then scan document_document. Searching for a single character, this brings the query from ~1.1s to ~400ms. Searching for a full word, ~800ms to 120ms. This condition is getting really ugly -- I am sorry! References #2935	2016-06-28 16:33:01 +01:00
j	5aeffcfb6a	check first audio track	2016-06-27 16:51:18 +02:00
j	adfcc1cb27	never set display aspect ratio to 0:0	2016-06-27 16:08:30 +02:00
j	8ac78f3bd6	remove unused force flag from make_poster, update_timeline	2016-06-26 23:24:11 +02:00
j	0f9e80e1e6	avoid saving item twice	2016-06-26 23:22:27 +02:00
j	de9b062d63	make sure existing index is using gin	2016-06-26 16:55:58 +02:00
j	ab0dfddf31	set SECURE_PROXY_SSL_HEADER by default	2016-06-26 15:34:19 +02:00
j	0d89ad640b	ignore some broken audio codecs	2016-06-26 15:33:52 +02:00
j	92f642cbac	pcm sound can have no codec	2016-06-26 14:41:58 +02:00
j	2cec1b9ad5	s/import Image/from PIL import Image/g	2016-06-25 20:39:29 +02:00
j	4785f314cb	Add VP9/Opus support, use VP8 by default - support vp9 and opus - switch to 2 pass encoding - use ffmpeg -movflags +faststart instead of qtfaststart	2016-06-23 17:36:41 +02:00
j	aaacc48259	only save if update_external fails	2016-06-20 18:28:05 +02:00
j	d83647c4a5	don't hide oxtimelines errors	2016-06-20 18:27:31 +02:00
j	6dcbcdd19c	dont update timeline in update_selected, remove unused async get_item case	2016-06-16 14:48:54 +02:00
j	0486d62ec9	use absolute path	2016-06-16 14:48:09 +02:00
j	f25218466b	formatting	2016-06-16 14:48:01 +02:00
j	70f34bfde9	typo	2016-06-15 19:13:00 +02:00
j	e3c5ab18c7	only update itemsort if name is changed	2016-06-15 18:31:40 +02:00
j	22f83288c5	avoid looking up item twice	2016-06-15 18:29:09 +02:00
j	7c53dca65b	less async item creation	2016-06-15 18:12:59 +02:00
j	b2a9a5f711	space	2016-06-15 17:56:31 +02:00
j	3c1f4a8c95	dont call module	2016-06-15 17:55:57 +02:00
j	b010aca0a9	s/taskId/id/	2016-06-15 15:45:51 +02:00
j	a0fc6ffadc	typo	2016-06-15 14:55:45 +02:00
j	f4cbe6a114	return empty sequences if no data timeline exists	2016-06-15 14:48:02 +02:00
j	af0e0cffe8	person can be removed again, let async itemsort fail without exception	2016-06-15 14:34:46 +02:00
j	fd9d3bdabf	flake8 + map->[]	2016-06-15 14:34:46 +02:00
j	05c4cfcbc8	add space and other flake8 cleanups	2016-05-28 11:30:43 +02:00
j	5e149a5cb8	add space and other flake8 cleanups	2016-05-28 11:26:46 +02:00
j	225259e521	add space and other flake8 cleanups	2016-05-28 11:18:51 +02:00
j	f21e8413fb	use get_random_string	2016-05-28 11:18:51 +02:00
j	7fdaf6d1ce	include Access-Control-Allow-Origin in 404 not found response	2016-05-27 11:51:47 +02:00
Will Thompson	05e6118a88	findAnnotations: include duration alongside result count fixes #2921	2016-05-05 15:54:25 +01:00
j	41cc8e3573	expose encoding status via api	2016-05-05 10:49:34 +02:00
j	be163826ef	Merge remote-tracking branch 'wjt/fix-migrations'	2016-05-05 10:48:24 +02:00
Will Thompson	39b9b48be2	archive: fix migrations for upload_to function renamings `9c75526` renamed these functions. The function doesn't affect the DB schema so it should be safe to just modify the migration.	2016-05-04 17:01:44 +01:00
Will Thompson	e29ea230fb	Add migration for Document.documentproperties ref This should have been included with `a8dcbbb`, which changed the related_name to access DocumentProperties from Document. (There's no actual change to the database.)	2016-05-04 16:55:11 +01:00
j	0f28a2b7d5	fix queue status	2016-04-30 14:15:13 +02:00
j	9c7552699f	fix upload_to callbacks	2016-04-29 13:46:55 +02:00
Will Thompson	2812834ce3	findAnnotations: don't lowercase ids (fixes #2916 ) Without this fix, a condition like: {key: 'id', operator: '==', value: 'A/B'} gets mapped to: public_id__exact=('A/B'.lower()) which is wrong. I introduced this bug in `b3df5b8`. I didn't catch it because I was mostly interested in the 'layer' key -- but layer names are conventionally lowercase anyway so lowercasing them had no effect.	2016-04-29 11:03:45 +01:00
Will Thompson	aa40a40595	Annotation.json: only include entity id & name Fetching documents for each entity in turn is expensive. (I have tried using ArrayAgg to fetch them in the same query as the Entity — no improvement. It's possible that being able to join to entity_entity, and then use ArrayAgg, would be better.) Even once you've fetched them all, if the same entity appears many times in an item, then get(..., keys=['layers']) duplicates the whole JSON for the entity many times: expensive to serialize, expensive to send over the wire. Pandora's own web interface only depends on the 'id' key of 'entity' in each annotation, and refetches the rest of the entity to show the pop-up dialog when you press E. So by just not bothering to fetch and send any other keys, get(..., keys=['layers']) on an item with many entity annotations is substantially faster. (I experimented with splitting the full entities off to one side, so, you'd have: { "layers": { somelayer: [..., {..., "entity": {"id": ABC}}, ], ... }, "entities": { ABC: {...}, ... } } This is quicker than the status quo, but obviously not as fast as not fetching & sending the rest at all!)	2016-04-28 14:15:23 +01:00
Will Thompson	aa0fbc9d4a	Entity.json: get document ids from join table This is a bit quicker because it's just a lookup in a single table, not a join.	2016-04-28 14:15:12 +01:00
Will Thompson	400b6650a2	Annotation.json: document empty-subtitle special case	2016-04-19 13:52:52 +01:00
Will Thompson	af0d87b569	Annotation.json: reduce repeated layer lookups It's actually quite costly to look up keys in CONFIG, particularly inside a loop: this trims ~5% off get(keys=['layers']) for annotation-heavy items.	2016-04-19 13:52:47 +01:00
Will Thompson	3f5be0bd27	findClips: look up entity names (fixes #2804 )	2016-04-19 12:28:58 +01:00
Will Thompson	d0129a4416	findClips: avoid O(n²) lookup of clip from annotation This doesn't make much difference for small ranges, of course.	2016-04-19 11:25:12 +01:00
Will Thompson	ba00bcbf7b	findClips: select_related('item') / ('item__sort') Clip.public_id uses self.item.public_id. Clip.json() uses self.item.sort, so we should select_related on that rather than the clip's own sort field. (They are identical objects. Is Clip.sort ever used directly?) With this change, findClips() issues one query to fetch clips plus one query per flavour of annotation; before, it issued two extra queries per clip.	2016-04-19 11:25:06 +01:00
Will Thompson	6dbb7f921a	findClips: only scan layers once	2016-04-19 11:14:25 +01:00
Will Thompson	b3df5b8d56	findAnnotations: match some fields case-sensitively Requiring layer to have the right case is consistent with addAnnotation(), and means the _layer[_like] index can be used. In my testing, if itemsQuery specifies a single item, then postgres doesn't bother with the layer index anyway; but if not, it makes a pretty big (~3×) difference. Matching public_id and item__public_id case-sensitively also seems reasonable (it's consistent with get() and getAnnotation()). (Is lower() redundant for the case-insensitive comparisons? ie. is UPPER(x.lower()) == UPPER(x)? I'm not sure, it's cheap, let's leave it.)	2016-04-05 12:19:32 +01:00
Will Thompson	8d1b4de337	findAnnotations(): make 'findvalue' the default key Annotations have no 'name' field, so findAnnotations({query: {conditions: [{value: 'foo'}]}}) would previously raise an exception.	2016-04-05 12:19:31 +01:00
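
The query rewrite described in `8d25e3be78` can be sketched as follows. This is only an illustration under stated assumptions: the table names come from the commit message, but the columns, the sample rows, and the `document_ids` helper are hypothetical, and SQLite stands in for Postgres, so it shows the shape of the two queries rather than the reported timings.

```python
import sqlite3

# In-memory stand-in for the three tables named in the commit message.
# Column names and rows are assumptions made for this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE document_document (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE entity_entity (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE entity_documentproperties (
        entity_id INTEGER REFERENCES entity_entity(id),
        document_id INTEGER REFERENCES document_document(id)
    );
    INSERT INTO document_document VALUES (1, 'report.pdf'), (2, 'notes.txt'), (3, 'map.png');
    INSERT INTO entity_entity VALUES (10, 'Alice'), (11, 'Bob');
    INSERT INTO entity_documentproperties VALUES (10, 1), (10, 3), (11, 2);
""")

# Shape 1: join all three tables, then filter on the entity name.
JOIN_QUERY = """
    SELECT DISTINCT d.id FROM document_document d
    JOIN entity_documentproperties p ON p.document_id = d.id
    JOIN entity_entity e ON e.id = p.entity_id
    WHERE e.name LIKE ?
    ORDER BY d.id
"""

# Shape 2: collect matching document ids in a subquery, then filter with IN.
SUBQUERY = """
    SELECT d.id FROM document_document d
    WHERE d.id IN (
        SELECT p.document_id FROM entity_documentproperties p
        JOIN entity_entity e ON e.id = p.entity_id
        WHERE e.name LIKE ?
    )
    ORDER BY d.id
"""


def document_ids(query, term):
    """Run one of the two queries with a substring match on the entity name."""
    return [row[0] for row in conn.execute(query, ("%" + term + "%",))]
```

Both shapes select the same documents; only the plan the database chooses differs, which is what the commit message reports for Postgres.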


2768 commits
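
The bug fixed in `2812834ce3` (ids being lowercased) and the case-sensitive matching introduced in `b3df5b8d56` can be illustrated with a minimal sketch. The `condition_to_filter` helper and the `CASE_SENSITIVE` mapping are hypothetical and are not pandora's actual code; only the `id` to `public_id__exact` mapping, the `layer` key, and the `findvalue` default key are taken from the commit messages.

```python
# Keys that should be compared case-sensitively, mapped to the field they
# filter on. The exact set is an assumption based on the commit messages.
CASE_SENSITIVE = {"id": "public_id", "layer": "layer"}


def condition_to_filter(condition):
    """Map a {key, operator, value} condition to Django-style filter kwargs.

    Hypothetical helper: it only handles the '==' operator.
    """
    key = condition.get("key", "findvalue")  # 'findvalue' is the default key
    value = condition["value"]
    if condition.get("operator") != "==":
        raise NotImplementedError("this sketch only handles '=='")
    if key in CASE_SENSITIVE:
        # Do not lowercase: 'A/B' must stay 'A/B' for an exact match.
        return {CASE_SENSITIVE[key] + "__exact": value}
    # Other keys are compared case-insensitively.
    return {key + "__iexact": value.lower()}
```

With this shape, `{key: 'id', operator: '==', value: 'A/B'}` maps to `public_id__exact='A/B'` rather than to the lowercased value.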