|
95af9f4f4a
|
pdf with html description
|
2016-01-29 22:17:39 +05:30 |
|
|
36fc6e7e73
|
pages can also end with xml
|
2016-01-28 17:53:50 +05:30 |
|
|
c03f72b47c
|
don't fail parsing parts of the pdf
|
2016-01-25 15:51:54 +05:30 |
|
|
d70bd8797a
|
s/exc_info=1/exc_info=True/g
|
2016-01-24 14:43:03 +05:30 |
|
|
f008f3d200
|
disable PIL debug logging
|
2016-01-23 18:56:13 +05:30 |
|
|
f43fc6a172
|
add meta.extract_text
|
2016-01-19 21:34:32 +05:30 |
|
|
08e9472d2a
|
get size from zip
|
2016-01-16 09:19:22 +05:30 |
|
|
4528ab60c4
|
try not to break if file can not be parsed
|
2016-01-15 13:03:42 +05:30 |
|
|
b33d066322
|
import kepub as epub, http://wiki.mobileread.com/wiki/Kepub
|
2016-01-13 23:18:32 +05:30 |
|
|
e6e52d53d5
|
ignore covers that are too tall or too wide
|
2016-01-13 16:41:28 +05:30 |
|
|
b9a8c91868
|
some attributes don't work
|
2016-01-13 11:33:47 +05:30 |
|
|
1efe02c87c
|
avoid ['']
|
2016-01-13 10:58:55 +05:30 |
|
|
de984a344e
|
extract tableofcontents from pdf
|
2016-01-12 14:57:33 +05:30 |
|
|
18a72db811
|
cleanup toc and extract for all epubs
|
2016-01-12 00:23:11 +05:30 |
|
|
fc11869088
|
allow all meta keys from file
|
2016-01-11 19:59:07 +05:30 |
|
|
bb09596566
|
toc href can contain #
|
2016-01-11 19:25:33 +05:30 |
|
|
02e040d9f5
|
store metadata per user. remove primaryid. only store isbn13
|
2016-01-11 19:17:12 +05:30 |
|
|
59a3709f84
|
use new default poster, remove black posters from icon cache
|
2016-01-10 12:58:25 +05:30 |
|
|
5d02474ce8
|
decode html
|
2016-01-08 16:15:10 +05:30 |
|
|
71d8825783
|
normalize names
|
2016-01-08 16:15:10 +05:30 |
|
|
d866b4de91
|
parse epubs without manifest
|
2016-01-06 18:40:23 +05:30 |
|
|
4ed4926bd8
|
epub: metadata.cover is id not name
|
2016-01-05 15:30:15 +05:30 |
|
|
ca3888869b
|
epub: use metadata.cover if set
|
2016-01-05 15:20:47 +05:30 |
|
|
78c9c5443f
|
epub parser: take largest image from manifest, strip html tags from description
|
2016-01-05 14:42:02 +05:30 |
|
|
051b634008
|
ignore errors for non utf-8 html files
|
2016-01-03 21:00:30 +05:30 |
|
|
619a2fbd37
|
split pdf author
|
2015-12-25 20:23:22 +05:30 |
|
|
f8c09226de
|
normalize language
|
2015-12-25 19:40:49 +05:30 |
|
|
c5afc46af1
|
cleanup pdf
|
2015-12-25 13:33:32 +05:30 |
|
|
ebc0b95022
|
better pdf parsing
|
2015-12-24 20:30:14 +05:30 |
|
|
ccd3b166d0
|
fix empty author
|
2015-12-24 19:07:36 +05:30 |
|
|
fe7769a7ba
|
don't fail if reading metadata.opf fails
|
2015-12-08 11:54:04 +00:00 |
|
|
81cd9c2337
|
fix epub metadata parser
|
2015-12-01 17:20:32 +01:00 |
|
|
d497e89b2b
|
use logging.getLogger(__name__)
|
2015-11-29 15:56:38 +01:00 |
|
|
c3548a1530
|
cover can be in svg
|
2015-11-17 19:23:07 +01:00 |
|
|
fba2fa78ce
|
ignore none as epub metadata value
|
2015-11-16 16:52:36 +01:00 |
|
|
a24061518a
|
better epub parsing
- don't fail if epubs are invalid zip
- handle quoted filenames
- don't fail if file is missing
|
2015-11-16 16:02:45 +01:00 |
|
|
62e50c29c6
|
import description from opf
|
2015-10-30 11:31:52 +01:00 |
|
|
6d19dd5e81
|
initial cbr support
|
2015-03-14 13:05:15 +05:30 |
|
|
6d3d0bbc43
|
txt.js/txt.py path has changed
|
2015-03-08 01:46:55 +05:30 |
|
|
7a76e21e99
|
only strip strings
|
2015-02-22 16:37:42 +05:30 |
|
|
121a2c9ac3
|
ignore osx resource forks
|
2014-11-15 01:05:33 +00:00 |
|
|
d722ae004b
|
handle utf-16 pdf info
|
2014-11-15 00:57:49 +00:00 |
|
|
89d9ab4f11
|
fix default icon
|
2014-10-31 19:49:36 +01:00 |
|
|
c6c8e0dc8a
|
try to decrypt pdf with empty password if it's encrypted
|
2014-10-31 16:13:02 +01:00 |
|
|
a306370f0d
|
more utf-8 issues
|
2014-10-31 15:41:46 +01:00 |
|
|
3f3299e820
|
fix epub parsing
|
2014-10-31 09:58:52 +01:00 |
|
|
9db6adc222
|
run txt cover script with python3
|
2014-10-01 10:50:46 +02:00 |
|
|
c961aa5c64
|
fix text extraction on osx
|
2014-09-30 22:30:09 +02:00 |
|
|
461fe3b9cf
|
more str/bytes
|
2014-09-08 21:17:35 +02:00 |
|
|
8c6164e0c4
|
use PyPDF2
|
2014-09-08 20:46:09 +02:00 |
|