Commit graph

84 commits

Author SHA1 Message Date
j 209ebeaab8 make extension lower case 2016-02-13 15:58:06 +05:30
j 4549a4be4e epubs can be invalid 2016-02-07 20:00:04 +05:30
j 9747f27d31 run pdftotext only once 2016-02-07 17:11:00 +05:30
j 828705923c ignore asin 2016-02-07 13:54:38 +05:30
j 8c5f21c83d strip html tags from book metadata 2016-02-04 15:25:27 +05:30
j b8fc91142a better author names 2016-02-03 14:45:09 +05:30
j 0e3794e6a3 hide window, open file not folder 2016-02-01 00:49:25 +05:30
j 24d4c4dc70 pdftotext also needs short names 2016-01-31 23:01:52 +05:30
j 5dead44107 windows pathnames 2016-01-31 22:58:53 +05:30
j 7380c9aab7 don't close_fds if stdout/stderr is piped 2016-01-31 18:55:12 +05:30
j 95af9f4f4a pdf with html description 2016-01-29 22:17:39 +05:30
j 36fc6e7e73 pages can also end with xml 2016-01-28 17:53:50 +05:30
j c03f72b47c don't fail parsing parts of the pdf 2016-01-25 15:51:54 +05:30
j d70bd8797a s/exc_info=1/exc_info=True/g 2016-01-24 14:43:03 +05:30
j f008f3d200 disable PIL debug logging 2016-01-23 18:56:13 +05:30
j f43fc6a172 add meta.extract_text 2016-01-19 21:34:32 +05:30
j 08e9472d2a get size from zip 2016-01-16 09:19:22 +05:30
j 4528ab60c4 try not to break if file cannot be parsed 2016-01-15 13:03:42 +05:30
j b33d066322 import kepub as epub, http://wiki.mobileread.com/wiki/Kepub 2016-01-13 23:18:32 +05:30
j e6e52d53d5 ignore covers that are too tall or too wide 2016-01-13 16:41:28 +05:30
j b9a8c91868 some attributes don't work 2016-01-13 11:33:47 +05:30
j 1efe02c87c avoid [''] 2016-01-13 10:58:55 +05:30
j de984a344e extract tableofcontents from pdf 2016-01-12 14:57:33 +05:30
j 18a72db811 cleanup toc and extract for all epubs 2016-01-12 00:23:11 +05:30
j fc11869088 allow all meta keys from file 2016-01-11 19:59:07 +05:30
j bb09596566 toc href can contain # 2016-01-11 19:25:33 +05:30
j 02e040d9f5 store metadata per user. remove primaryid. only store isbn13 2016-01-11 19:17:12 +05:30
j 59a3709f84 use new default poster, remove black posters from icon cache 2016-01-10 12:58:25 +05:30
j 5d02474ce8 decode html 2016-01-08 16:15:10 +05:30
j 71d8825783 normalize names 2016-01-08 16:15:10 +05:30
j d866b4de91 parse epubs without manifest 2016-01-06 18:40:23 +05:30
j 4ed4926bd8 epub: metadata.cover is id not name 2016-01-05 15:30:15 +05:30
j ca3888869b epub: use metadata.cover if set 2016-01-05 15:20:47 +05:30
j 78c9c5443f epub parser: take largest image from manifest, strip html tags from description 2016-01-05 14:42:02 +05:30
j 051b634008 ignore errors for non utf-8 html files 2016-01-03 21:00:30 +05:30
j 619a2fbd37 split pdf author 2015-12-25 20:23:22 +05:30
j f8c09226de normalize language 2015-12-25 19:40:49 +05:30
j c5afc46af1 cleanup pdf 2015-12-25 13:33:32 +05:30
j ebc0b95022 better pdf parsing 2015-12-24 20:30:14 +05:30
j ccd3b166d0 fix empty author 2015-12-24 19:07:36 +05:30
j fe7769a7ba don't fail if reading metadata.opf fails 2015-12-08 11:54:04 +00:00
j 81cd9c2337 fix epub metadata parser 2015-12-01 17:20:32 +01:00
j d497e89b2b use logging.getLogger(__name__) 2015-11-29 15:56:38 +01:00
j c3548a1530 cover can be in svg 2015-11-17 19:23:07 +01:00
j fba2fa78ce ignore none as epub metadata value 2015-11-16 16:52:36 +01:00
j a24061518a better epub parsing
- don't fail if epubs are invalid zip
- handle quoted filenames
- don't fail if file is missing
2015-11-16 16:02:45 +01:00
j 62e50c29c6 import description from opf 2015-10-30 11:31:52 +01:00
j 6d19dd5e81 initial cbr support 2015-03-14 13:05:15 +05:30
j 6d3d0bbc43 txt.js/txt.py path has changed 2015-03-08 01:46:55 +05:30
j 7a76e21e99 only strip strings 2015-02-22 16:37:42 +05:30