Switch to python3

This commit is contained in:
j 2014-09-30 18:15:32 +02:00
commit 9ba4b6a91a
5286 changed files with 677347 additions and 576888 deletions

View file

@ -1,69 +0,0 @@
Metadata-Version: 1.1
Name: backports.ssl-match-hostname
Version: 3.4.0.2
Summary: The ssl.match_hostname() function from Python 3.4
Home-page: http://bitbucket.org/brandon/backports.ssl_match_hostname
Author: Brandon Craig Rhodes
Author-email: brandon@rhodesmill.org
License: UNKNOWN
Description:
The Secure Sockets layer is only actually *secure*
if you check the hostname in the certificate returned
by the server to which you are connecting,
and verify that it matches to hostname
that you are trying to reach.
But the matching logic, defined in `RFC2818`_,
can be a bit tricky to implement on your own.
So the ``ssl`` package in the Standard Library of Python 3.2
and greater now includes a ``match_hostname()`` function
for performing this check instead of requiring every application
to implement the check separately.
This backport brings ``match_hostname()`` to users
of earlier versions of Python.
Simply make this distribution a dependency of your package,
and then use it like this::
from backports.ssl_match_hostname import match_hostname, CertificateError
...
sslsock = ssl.wrap_socket(sock, ssl_version=ssl.PROTOCOL_SSLv3,
cert_reqs=ssl.CERT_REQUIRED, ca_certs=...)
try:
match_hostname(sslsock.getpeercert(), hostname)
except CertificateError, ce:
...
Note that the ``ssl`` module is only included in the Standard Library
for Python 2.6 and later;
users of Python 2.5 or earlier versions
will also need to install the ``ssl`` distribution
from the Python Package Index to use code like that shown above.
Brandon Craig Rhodes is merely the packager of this distribution;
the actual code inside comes verbatim from Python 3.4.
History
-------
* This function was introduced in python-3.2
* It was updated for python-3.4a1 for a CVE
(backports-ssl_match_hostname-3.4.0.1)
* It was updated from RFC2818 to RFC 6125 compliance in order to fix another
security flaw for python-3.3.3 and python-3.4a5
(backports-ssl_match_hostname-3.4.0.2)
.. _RFC2818: http://tools.ietf.org/html/rfc2818.html
Platform: UNKNOWN
Classifier: Development Status :: 5 - Production/Stable
Classifier: License :: OSI Approved :: Python Software Foundation License
Classifier: Programming Language :: Python :: 2.4
Classifier: Programming Language :: Python :: 2.5
Classifier: Programming Language :: Python :: 2.6
Classifier: Programming Language :: Python :: 2.7
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.0
Classifier: Programming Language :: Python :: 3.1
Classifier: Topic :: Security :: Cryptography

View file

@ -1,11 +0,0 @@
MANIFEST.in
setup.cfg
setup.py
src/backports/__init__.py
src/backports.ssl_match_hostname.egg-info/PKG-INFO
src/backports.ssl_match_hostname.egg-info/SOURCES.txt
src/backports.ssl_match_hostname.egg-info/dependency_links.txt
src/backports.ssl_match_hostname.egg-info/top_level.txt
src/backports/ssl_match_hostname/LICENSE.txt
src/backports/ssl_match_hostname/README.txt
src/backports/ssl_match_hostname/__init__.py

View file

@ -1,11 +0,0 @@
../backports/__init__.py
../backports/ssl_match_hostname/__init__.py
../backports/ssl_match_hostname/LICENSE.txt
../backports/ssl_match_hostname/README.txt
../backports/__init__.pyc
../backports/ssl_match_hostname/__init__.pyc
./
dependency_links.txt
PKG-INFO
SOURCES.txt
top_level.txt

View file

@ -1,3 +0,0 @@
# This is a Python "namespace package" http://www.python.org/dev/peps/pep-0382/
from pkgutil import extend_path
__path__ = extend_path(__path__, __name__)

View file

@ -1,51 +0,0 @@
Python License (Python-2.0)
Python License, Version 2 (Python-2.0)
PYTHON SOFTWARE FOUNDATION LICENSE VERSION 2
--------------------------------------------
1. This LICENSE AGREEMENT is between the Python Software Foundation
("PSF"), and the Individual or Organization ("Licensee") accessing and
otherwise using this software ("Python") in source or binary form and
its associated documentation.
2. Subject to the terms and conditions of this License Agreement, PSF
hereby grants Licensee a nonexclusive, royalty-free, world-wide
license to reproduce, analyze, test, perform and/or display publicly,
prepare derivative works, distribute, and otherwise use Python
alone or in any derivative version, provided, however, that PSF's
License Agreement and PSF's notice of copyright, i.e., "Copyright (c)
2001-2013 Python Software Foundation; All Rights Reserved" are retained in
Python alone or in any derivative version prepared by Licensee.
3. In the event Licensee prepares a derivative work that is based on
or incorporates Python or any part thereof, and wants to make
the derivative work available to others as provided herein, then
Licensee hereby agrees to include in any such work a brief summary of
the changes made to Python.
4. PSF is making Python available to Licensee on an "AS IS"
basis. PSF MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR
IMPLIED. BY WAY OF EXAMPLE, BUT NOT LIMITATION, PSF MAKES NO AND
DISCLAIMS ANY REPRESENTATION OR WARRANTY OF MERCHANTABILITY OR FITNESS
FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF PYTHON WILL NOT
INFRINGE ANY THIRD PARTY RIGHTS.
5. PSF SHALL NOT BE LIABLE TO LICENSEE OR ANY OTHER USERS OF PYTHON
FOR ANY INCIDENTAL, SPECIAL, OR CONSEQUENTIAL DAMAGES OR LOSS AS
A RESULT OF MODIFYING, DISTRIBUTING, OR OTHERWISE USING PYTHON,
OR ANY DERIVATIVE THEREOF, EVEN IF ADVISED OF THE POSSIBILITY THEREOF.
6. This License Agreement will automatically terminate upon a material
breach of its terms and conditions.
7. Nothing in this License Agreement shall be deemed to create any
relationship of agency, partnership, or joint venture between PSF and
Licensee. This License Agreement does not grant permission to use PSF
trademarks or trade name in a trademark sense to endorse or promote
products or services of Licensee, or any third party.
8. By copying, installing or otherwise using Python, Licensee
agrees to be bound by the terms and conditions of this License
Agreement.

View file

@ -1,52 +0,0 @@
The ssl.match_hostname() function from Python 3.4
=================================================
The Secure Sockets layer is only actually *secure*
if you check the hostname in the certificate returned
by the server to which you are connecting,
and verify that it matches to hostname
that you are trying to reach.
But the matching logic, defined in `RFC2818`_,
can be a bit tricky to implement on your own.
So the ``ssl`` package in the Standard Library of Python 3.2
and greater now includes a ``match_hostname()`` function
for performing this check instead of requiring every application
to implement the check separately.
This backport brings ``match_hostname()`` to users
of earlier versions of Python.
Simply make this distribution a dependency of your package,
and then use it like this::
from backports.ssl_match_hostname import match_hostname, CertificateError
...
sslsock = ssl.wrap_socket(sock, ssl_version=ssl.PROTOCOL_SSLv3,
cert_reqs=ssl.CERT_REQUIRED, ca_certs=...)
try:
match_hostname(sslsock.getpeercert(), hostname)
except CertificateError, ce:
...
Note that the ``ssl`` module is only included in the Standard Library
for Python 2.6 and later;
users of Python 2.5 or earlier versions
will also need to install the ``ssl`` distribution
from the Python Package Index to use code like that shown above.
Brandon Craig Rhodes is merely the packager of this distribution;
the actual code inside comes verbatim from Python 3.4.
History
-------
* This function was introduced in python-3.2
* It was updated for python-3.4a1 for a CVE
(backports-ssl_match_hostname-3.4.0.1)
* It was updated from RFC2818 to RFC 6125 compliance in order to fix another
security flaw for python-3.3.3 and python-3.4a5
(backports-ssl_match_hostname-3.4.0.2)
.. _RFC2818: http://tools.ietf.org/html/rfc2818.html

View file

@ -1,102 +0,0 @@
"""The match_hostname() function from Python 3.3.3, essential when using SSL."""
import re
__version__ = '3.4.0.2'
class CertificateError(ValueError):
pass
def _dnsname_match(dn, hostname, max_wildcards=1):
"""Matching according to RFC 6125, section 6.4.3
http://tools.ietf.org/html/rfc6125#section-6.4.3
"""
pats = []
if not dn:
return False
# Ported from python3-syntax:
# leftmost, *remainder = dn.split(r'.')
parts = dn.split(r'.')
leftmost = parts[0]
remainder = parts[1:]
wildcards = leftmost.count('*')
if wildcards > max_wildcards:
# Issue #17980: avoid denials of service by refusing more
# than one wildcard per fragment. A survey of established
# policy among SSL implementations showed it to be a
# reasonable choice.
raise CertificateError(
"too many wildcards in certificate DNS name: " + repr(dn))
# speed up common case w/o wildcards
if not wildcards:
return dn.lower() == hostname.lower()
# RFC 6125, section 6.4.3, subitem 1.
# The client SHOULD NOT attempt to match a presented identifier in which
# the wildcard character comprises a label other than the left-most label.
if leftmost == '*':
# When '*' is a fragment by itself, it matches a non-empty dotless
# fragment.
pats.append('[^.]+')
elif leftmost.startswith('xn--') or hostname.startswith('xn--'):
# RFC 6125, section 6.4.3, subitem 3.
# The client SHOULD NOT attempt to match a presented identifier
# where the wildcard character is embedded within an A-label or
# U-label of an internationalized domain name.
pats.append(re.escape(leftmost))
else:
# Otherwise, '*' matches any dotless string, e.g. www*
pats.append(re.escape(leftmost).replace(r'\*', '[^.]*'))
# add the remaining fragments, ignore any wildcards
for frag in remainder:
pats.append(re.escape(frag))
pat = re.compile(r'\A' + r'\.'.join(pats) + r'\Z', re.IGNORECASE)
return pat.match(hostname)
def match_hostname(cert, hostname):
"""Verify that *cert* (in decoded format as returned by
SSLSocket.getpeercert()) matches the *hostname*. RFC 2818 and RFC 6125
rules are followed, but IP addresses are not accepted for *hostname*.
CertificateError is raised on failure. On success, the function
returns nothing.
"""
if not cert:
raise ValueError("empty or no certificate")
dnsnames = []
san = cert.get('subjectAltName', ())
for key, value in san:
if key == 'DNS':
if _dnsname_match(value, hostname):
return
dnsnames.append(value)
if not dnsnames:
# The subject is only checked when there is no dNSName entry
# in subjectAltName
for sub in cert.get('subject', ()):
for key, value in sub:
# XXX according to RFC 2818, the most specific Common Name
# must be used.
if key == 'commonName':
if _dnsname_match(value, hostname):
return
dnsnames.append(value)
if len(dnsnames) > 1:
raise CertificateError("hostname %r "
"doesn't match either of %s"
% (hostname, ', '.join(map(repr, dnsnames))))
elif len(dnsnames) == 1:
raise CertificateError("hostname %r "
"doesn't match %r"
% (hostname, dnsnames[0]))
else:
raise CertificateError("no appropriate commonName or "
"subjectAltName fields were found")

View file

@ -1,78 +0,0 @@
../html5lib/utils.py
../html5lib/ihatexml.py
../html5lib/__init__.py
../html5lib/tokenizer.py
../html5lib/html5parser.py
../html5lib/sanitizer.py
../html5lib/inputstream.py
../html5lib/constants.py
../html5lib/serializer/__init__.py
../html5lib/serializer/htmlserializer.py
../html5lib/treebuilders/_base.py
../html5lib/treebuilders/__init__.py
../html5lib/treebuilders/etree_lxml.py
../html5lib/treebuilders/dom.py
../html5lib/treebuilders/etree.py
../html5lib/filters/whitespace.py
../html5lib/filters/_base.py
../html5lib/filters/__init__.py
../html5lib/filters/sanitizer.py
../html5lib/filters/lint.py
../html5lib/filters/optionaltags.py
../html5lib/filters/inject_meta_charset.py
../html5lib/filters/alphabeticalattributes.py
../html5lib/treewalkers/pulldom.py
../html5lib/treewalkers/_base.py
../html5lib/treewalkers/genshistream.py
../html5lib/treewalkers/__init__.py
../html5lib/treewalkers/dom.py
../html5lib/treewalkers/etree.py
../html5lib/treewalkers/lxmletree.py
../html5lib/trie/datrie.py
../html5lib/trie/_base.py
../html5lib/trie/__init__.py
../html5lib/trie/py.py
../html5lib/treeadapters/sax.py
../html5lib/treeadapters/__init__.py
../html5lib/utils.pyc
../html5lib/ihatexml.pyc
../html5lib/__init__.pyc
../html5lib/tokenizer.pyc
../html5lib/html5parser.pyc
../html5lib/sanitizer.pyc
../html5lib/inputstream.pyc
../html5lib/constants.pyc
../html5lib/serializer/__init__.pyc
../html5lib/serializer/htmlserializer.pyc
../html5lib/treebuilders/_base.pyc
../html5lib/treebuilders/__init__.pyc
../html5lib/treebuilders/etree_lxml.pyc
../html5lib/treebuilders/dom.pyc
../html5lib/treebuilders/etree.pyc
../html5lib/filters/whitespace.pyc
../html5lib/filters/_base.pyc
../html5lib/filters/__init__.pyc
../html5lib/filters/sanitizer.pyc
../html5lib/filters/lint.pyc
../html5lib/filters/optionaltags.pyc
../html5lib/filters/inject_meta_charset.pyc
../html5lib/filters/alphabeticalattributes.pyc
../html5lib/treewalkers/pulldom.pyc
../html5lib/treewalkers/_base.pyc
../html5lib/treewalkers/genshistream.pyc
../html5lib/treewalkers/__init__.pyc
../html5lib/treewalkers/dom.pyc
../html5lib/treewalkers/etree.pyc
../html5lib/treewalkers/lxmletree.pyc
../html5lib/trie/datrie.pyc
../html5lib/trie/_base.pyc
../html5lib/trie/__init__.pyc
../html5lib/trie/py.pyc
../html5lib/treeadapters/sax.pyc
../html5lib/treeadapters/__init__.pyc
./
dependency_links.txt
PKG-INFO
SOURCES.txt
top_level.txt
requires.txt

View file

@ -1,172 +0,0 @@
../ox/image.py
../ox/location.py
../ox/cache.py
../ox/net.py
../ox/utils.py
../ox/jsonc.py
../ox/normalize.py
../ox/form.py
../ox/format.py
../ox/__init__.py
../ox/movie.py
../ox/text.py
../ox/geo.py
../ox/api.py
../ox/fixunicode.py
../ox/oembed.py
../ox/html.py
../ox/file.py
../ox/srt.py
../ox/js.py
../ox/iso.py
../ox/django/http.py
../ox/django/utils.py
../ox/django/monitor.py
../ox/django/__init__.py
../ox/django/middleware.py
../ox/django/decorators.py
../ox/django/fields.py
../ox/django/shortcuts.py
../ox/django/views.py
../ox/django/query.py
../ox/django/widgets.py
../ox/django/api/__init__.py
../ox/django/api/urls.py
../ox/django/api/views.py
../ox/django/api/actions.py
../ox/torrent/__init__.py
../ox/torrent/makemetafile.py
../ox/torrent/bencode.py
../ox/torrent/btformats.py
../ox/web/oxdb.py
../ox/web/lyricsfly.py
../ox/web/spiegel.py
../ox/web/allmovie.py
../ox/web/twitter.py
../ox/web/siteparser.py
../ox/web/ubu.py
../ox/web/epguides.py
../ox/web/lymbix.py
../ox/web/__init__.py
../ox/web/istockphoto.py
../ox/web/archive.py
../ox/web/freebase.py
../ox/web/reddit.py
../ox/web/vimeo.py
../ox/web/thepiratebay.py
../ox/web/auth.py
../ox/web/duckduckgo.py
../ox/web/flixter.py
../ox/web/rottentomatoes.py
../ox/web/criterion.py
../ox/web/lookupbyisbn.py
../ox/web/wikipedia.py
../ox/web/abebooks.py
../ox/web/amazon.py
../ox/web/impawards.py
../ox/web/tv.py
../ox/web/dailymotion.py
../ox/web/movieposterdb.py
../ox/web/filmsdivision.py
../ox/web/arsenalberlin.py
../ox/web/youtube.py
../ox/web/google.py
../ox/web/itunes.py
../ox/web/piratecinema.py
../ox/web/opensubtitles.py
../ox/web/openlibrary.py
../ox/web/gutenbergde.py
../ox/web/mininova.py
../ox/web/imdb.py
../ox/web/apple.py
../ox/web/torrent.py
../ox/web/metacritic.py
../ox/image.pyc
../ox/location.pyc
../ox/cache.pyc
../ox/net.pyc
../ox/utils.pyc
../ox/jsonc.pyc
../ox/normalize.pyc
../ox/form.pyc
../ox/format.pyc
../ox/__init__.pyc
../ox/movie.pyc
../ox/text.pyc
../ox/geo.pyc
../ox/api.pyc
../ox/fixunicode.pyc
../ox/oembed.pyc
../ox/html.pyc
../ox/file.pyc
../ox/srt.pyc
../ox/js.pyc
../ox/iso.pyc
../ox/django/http.pyc
../ox/django/utils.pyc
../ox/django/monitor.pyc
../ox/django/__init__.pyc
../ox/django/middleware.pyc
../ox/django/decorators.pyc
../ox/django/fields.pyc
../ox/django/shortcuts.pyc
../ox/django/views.pyc
../ox/django/query.pyc
../ox/django/widgets.pyc
../ox/django/api/__init__.pyc
../ox/django/api/urls.pyc
../ox/django/api/views.pyc
../ox/django/api/actions.pyc
../ox/torrent/__init__.pyc
../ox/torrent/makemetafile.pyc
../ox/torrent/bencode.pyc
../ox/torrent/btformats.pyc
../ox/web/oxdb.pyc
../ox/web/lyricsfly.pyc
../ox/web/spiegel.pyc
../ox/web/allmovie.pyc
../ox/web/twitter.pyc
../ox/web/siteparser.pyc
../ox/web/ubu.pyc
../ox/web/epguides.pyc
../ox/web/lymbix.pyc
../ox/web/__init__.pyc
../ox/web/istockphoto.pyc
../ox/web/archive.pyc
../ox/web/freebase.pyc
../ox/web/reddit.pyc
../ox/web/vimeo.pyc
../ox/web/thepiratebay.pyc
../ox/web/auth.pyc
../ox/web/duckduckgo.pyc
../ox/web/flixter.pyc
../ox/web/rottentomatoes.pyc
../ox/web/criterion.pyc
../ox/web/lookupbyisbn.pyc
../ox/web/wikipedia.pyc
../ox/web/abebooks.pyc
../ox/web/amazon.pyc
../ox/web/impawards.pyc
../ox/web/tv.pyc
../ox/web/dailymotion.pyc
../ox/web/movieposterdb.pyc
../ox/web/filmsdivision.pyc
../ox/web/arsenalberlin.pyc
../ox/web/youtube.pyc
../ox/web/google.pyc
../ox/web/itunes.pyc
../ox/web/piratecinema.pyc
../ox/web/opensubtitles.pyc
../ox/web/openlibrary.pyc
../ox/web/gutenbergde.pyc
../ox/web/mininova.pyc
../ox/web/imdb.pyc
../ox/web/apple.pyc
../ox/web/torrent.pyc
../ox/web/metacritic.pyc
./
dependency_links.txt
PKG-INFO
SOURCES.txt
top_level.txt
requires.txt

View file

@ -1,2 +0,0 @@
chardet
feedparser

View file

@ -1,30 +0,0 @@
# -*- coding: utf-8 -*-
# vi:si:et:sw=4:sts=4:ts=4
# GPL 2011
__version__ = '2.1.651'
import cache
import js
import jsonc
import net
import srt
import utils
from api import *
from file import *
from form import *
from format import *
from geo import *
from html import *
#image depends on PIL, not easy enough to instal on osx
try:
from image import *
except:
pass
from location import *
from movie import *
from normalize import *
from oembed import *
from text import *
from torrent import *
from fixunicode import *

File diff suppressed because it is too large Load diff

View file

@ -1,9 +0,0 @@
# vi:si:et:sw=4:sts=4:ts=4
# encoding: utf-8
__version__ = '1.0.0'
import imdb
import wikipedia
import google
import piratecinema
import oxdb

View file

@ -1,92 +0,0 @@
# -*- coding: utf-8 -*-
# vi:si:et:sw=4:sts=4:ts=4
import re
import lxml.html
import ox
from ox.cache import read_url, cache_timeout
def get_id(url):
if url.startswith('http://gutenberg.spiegel.de/buch/'):
return url.split('/')[-2]
def get_url(id):
return 'http://gutenberg.spiegel.de/buch/%s/1' % id
def get_page(url):
data = ox.cache.read_url(url)
doc = lxml.html.document_fromstring(data)
gutenb = doc.body.get_element_by_id('gutenb')
html = lxml.html.tostring(gutenb)
return html.replace(' id="gutenb"', '')
def get_images(page, html, b=False):
img = []
base = ''
if '<img' in page:
base = ox.find_re(html, '<base href="(.*?)"')
for url in re.compile('<img.*?src="(.*?)"').findall(page):
url = base + url
img.append(url)
if b:
return base, img
return img
def get_book(id):
if isinstance(id, basestring) and id.startswith('http'):
url = id
else:
url = get_url(id)
html = ox.cache.read_url(url, unicode=True)
data = {}
data['url'] = url
pages = []
page = get_page(url)
pages.append(page)
data['base'], data['images'] = get_images(page, html, True)
info = ox.find_re(html, '<table>.*?</table>')
for i in re.compile('<tr.*?>(.*?)</tr>').findall(info):
key, value = i.split('</td><td>')
data[ox.strip_tags(key)] = ox.strip_tags(value)
links = re.compile('<a style="float: right;" href="(/buch/.*?)">').findall(html)
while links:
for l in links:
l = 'http://gutenberg.spiegel.de' + l
html = ox.cache.read_url(l)
links = re.compile('<a style="float: right;" href="(/buch/.*?)">').findall(html)
page = get_page(l)
pages.append(page)
data['images'] += get_images(page, html)
data['pages'] = pages
return data
def get_authors():
url = "http://gutenberg.spiegel.de/autor"
data = ox.cache.read_url(url, unicode=True)
authors = {}
for l,a in re.compile('<a href="/autor/(\d+)">(.*?)</a>').findall(data):
authors[l] = a
return authors
def get_books(author_id=None):
books = {}
if not author_id:
url = "http://gutenberg.spiegel.de/buch"
data = ox.cache.read_url(url, unicode=True)
for l, t in re.compile('<a href="(/buch.+?)">(.*?)</a>').findall(data):
l = l.split('/')[-2]
books[l] = t
else:
url = "http://gutenberg.spiegel.de/autor/%s" % author_id
data = ox.cache.read_url(url, unicode=True)
for l,t in re.compile('<a href="(.+.xml)">(.*?)</a>').findall(data):
while l.startswith('../'):
l = l[3:]
l = 'http://gutenberg.spiegel.de/' + l
books[l] = t
return books
def get_ids():
return get_books().keys()

View file

@ -1,49 +0,0 @@
# -*- coding: utf-8 -*-
# vi:si:et:sw=4:sts=4:ts=4
import re
import ox
from ox import strip_tags, find_re
def get_data(id):
base = 'http://www.istockphoto.com'
url = base + '/stock-photo-%s.php' % id
id = find_re(id, '\d+')
data = ox.cache.read_url(url, timeout=-1)
info = {}
info['title'] = ox.find_re(data, '<title>(.*?) \|')
info['thumbnail'] = base + ox.find_re(data, 'src="(/file_thumbview_approve/%s.*?)"'%id)
info['views'] = ox.find_re(data, '<tr><td>Views:</td><td>(\d+)</td>')
info['collections'] = strip_tags(ox.find_re(data, '<td>Collections:</td><td>(.*?)</td>')).split(', ')
info['collections'] = filter(lambda x: x.strip(), info['collections'])
info['keywords'] = map(lambda k: k.strip(), strip_tags(ox.find_re(data, '<td>Keywords:</td>.*?<td>(.*?)\.\.\.<')).split(', '))
info['keywords'] = ox.find_re(data, '<meta name="keywords" content="(.*?), stock image').split(', ')
info['keywords'].sort()
info['uploaded'] = ox.find_re(data, '<td>Uploaded on:</td>.*?<td>([\d\-]+)')
info['downloads'] = ox.find_re(data, '<span class="fl">.*?(\d+)&nbsp;</span>')
info['contributor'] = ox.find_re(data, '<td class="m">Contributor:</td>.*?<a href="user_view.php\?id=.*?">.*?alt="(.*?)"')
info['description'] = strip_tags(ox.find_re(data, 'artistsDescriptionData = \["(.*?)<br'))
info['description'] = info['description'].split('CLICK TO SEE')[0].strip()
info['similar'] = re.compile('size=1\&id=(\d+)').findall(data)
return info
def get_collection_ids(collection, timeout=-1):
url = "http://www.istockphoto.com/browse/%s/" % (collection)
data = ox.cache.read_url(url, timeout=timeout)
ids = []
ids += re.compile('<a href="/stock-photo-(.*?).php">').findall(data)
pages = re.compile('class="paginatorLink">(\d+)</a>').findall(data)
if pages:
for page in range(2, max(map(int, pages))):
url = "http://www.istockphoto.com/browse/%s/%s" % (collection, page)
data = ox.cache.read_url(url, timeout=timeout)
ids += re.compile('<a href="/stock-photo-(.*?).php">').findall(data)
return ids
def get_ids():
ids = []
for collection in ('vetta', 'agency', 'dollarbin', 'latest/photo', 'latest/photo/exclusive'):
ids += getCollectionIds(collection)
return ids

View file

@ -1,73 +0,0 @@
from urllib import urlencode
import sys
import auth
from ox.cache import read_url, cache_timeout
from ox import find_string, find_re,net
"""
Two arguments:
text_inp: str/list of str. Input text for which tonal data is to be retrieved.
req_index: Request type for the api command
0- tonalize
1- tonalize_detailed
2- tonalize_multiple
3- flag_response
"""
def set_initial_stuff():
"""
Just to setup the initial common variables
"""
global headers, base_url,req_type
try:
headers = {}
headers.update(net.DEFAULT_HEADERS)
headers.update({'AUTHENTICATION':auth.get('lymbix_api_key'),'Accept':'application/json','VERSION':'2.2'})
base_url = 'http://api.lymbix.com/'
req_type = ['tonalize','tonalize_detailed','tonalize_multiple','flag_response']
except Exception,e:
raise e
def get_lymbix_tonalize(text_inp,timeout = cache_timeout):
"""
"""
data = {'article':text_inp}
print data, base_url+req_type[0], headers
content = read_url(base_url+req_type[0],urlencode(data),headers,timeout, unicode=True)
return content
def get_lymbix_tonalize_multiple(text_inp,timeout = cache_timeout):
"""
"""
data = {'article':text_inp}
content = read_url(base_url+req_type[0],urlencode(data),headers,timeout)
return content
def get_lymbix_flag_response(text_inp,timeout = cache_timeout):
"""
"""
data = {'article':text_inp}
content = read_url(base_url+req_type[0],urlencode(data),headers,timeout)
return content
def main():
set_initial_stuff()
return get_lymbix_tonalize(sys.argv[1])
if __name__ == '__main__':
print main()

View file

@ -1,32 +0,0 @@
from ox.cache import read_url
from ox.utils import json
def find(query):
url = 'http://openlibrary.org/search.json?q=%s' % query
print url
data = json.loads(read_url(url))
return data
def authors_ol(authors):
r = []
for a in authors:
url = 'http://openlibrary.org%s.json' % a
data = json.loads(read_url(url))
r.append(data['name'])
return r
def get_data(isbn):
data = {}
ol = find(isbn)
if ol['docs']:
d = ol['docs'][0]
data['title'] = d['title']
data['author'] = authors_ol(d['authors'])
data['work'] = d['key']
data['edition'] = d['edition_key'][0]
url = 'https://openlibrary.org/books/%s.json' % data['edition']
info = json.load(read_url(url))
data['pages'] = info['number_of_pages']
if 'dewey_decimal_class' in info:
data['classification'] = info['dewey_decimal_class'][0]
return data

View file

@ -1,29 +0,0 @@
# -*- coding: utf-8 -*-
# vi:si:et:sw=4:sts=4:ts=4
import re
import ox
from ox.cache import read_url, cache_timeout
def subreddit(name, offset=0, n=0, timeout=cache_timeout):
url = 'http://www.reddit.com/r/%s/' % name
if offset:
url += '?count=%d' % offset
data = read_url(url, unicode=True, timeout=timeout)
more = True
links = []
while more:
l = re.compile('<a class="title " href="(.*?)".*?>(.*?)<\/a>').findall(data)
if l:
links += [{
'url': ox.decode_html(a[0]),
'title': ox.decode_html(a[1])
} for a in l]
more = re.compile('<a href="(.*?)" rel="nofollow next" >next &rsaquo;<\/a>').findall(data)
if more and (n == 0 or len(links) < n):
url = ox.decode_html(more[0].split('"')[-1])
data = read_url(url, unicode=True)
else:
more = False
return links

View file

@ -1,11 +0,0 @@
README
pyPdf/__init__.py
pyPdf/filters.py
pyPdf/generic.py
pyPdf/pdf.py
pyPdf/utils.py
pyPdf/xmp.py
pyPdf.egg-info/PKG-INFO
pyPdf.egg-info/SOURCES.txt
pyPdf.egg-info/dependency_links.txt
pyPdf.egg-info/top_level.txt

View file

@ -1,17 +0,0 @@
../pyPdf/xmp.py
../pyPdf/utils.py
../pyPdf/filters.py
../pyPdf/__init__.py
../pyPdf/generic.py
../pyPdf/pdf.py
../pyPdf/xmp.pyc
../pyPdf/utils.pyc
../pyPdf/filters.pyc
../pyPdf/__init__.pyc
../pyPdf/generic.pyc
../pyPdf/pdf.pyc
./
dependency_links.txt
PKG-INFO
SOURCES.txt
top_level.txt

View file

@ -1,2 +0,0 @@
from pdf import PdfFileReader, PdfFileWriter
__all__ = ["pdf"]

View file

@ -1,800 +0,0 @@
# vim: sw=4:expandtab:foldmethod=marker
#
# Copyright (c) 2006, Mathieu Fenniak
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are
# met:
#
# * Redistributions of source code must retain the above copyright notice,
# this list of conditions and the following disclaimer.
# * Redistributions in binary form must reproduce the above copyright notice,
# this list of conditions and the following disclaimer in the documentation
# and/or other materials provided with the distribution.
# * The name of the author may not be used to endorse or promote products
# derived from this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
# AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
# ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
# LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
# CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
# SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
# INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
# CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
# ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
# POSSIBILITY OF SUCH DAMAGE.
"""
Implementation of generic PDF objects (dictionary, number, string, and so on)
"""
__author__ = "Mathieu Fenniak"
__author_email__ = "biziqe@mathieu.fenniak.net"
import re
from utils import readNonWhitespace, RC4_encrypt
import filters
import utils
import decimal
import codecs
def readObject(stream, pdf):
tok = stream.read(1)
stream.seek(-1, 1) # reset to start
if tok == 't' or tok == 'f':
# boolean object
return BooleanObject.readFromStream(stream)
elif tok == '(':
# string object
return readStringFromStream(stream)
elif tok == '/':
# name object
return NameObject.readFromStream(stream)
elif tok == '[':
# array object
return ArrayObject.readFromStream(stream, pdf)
elif tok == 'n':
# null object
return NullObject.readFromStream(stream)
elif tok == '<':
# hexadecimal string OR dictionary
peek = stream.read(2)
stream.seek(-2, 1) # reset to start
if peek == '<<':
return DictionaryObject.readFromStream(stream, pdf)
else:
return readHexStringFromStream(stream)
elif tok == '%':
# comment
while tok not in ('\r', '\n'):
tok = stream.read(1)
tok = readNonWhitespace(stream)
stream.seek(-1, 1)
return readObject(stream, pdf)
else:
# number object OR indirect reference
if tok == '+' or tok == '-':
# number
return NumberObject.readFromStream(stream)
peek = stream.read(20)
stream.seek(-len(peek), 1) # reset to start
if re.match(r"(\d+)\s(\d+)\sR[^a-zA-Z]", peek) != None:
return IndirectObject.readFromStream(stream, pdf)
else:
return NumberObject.readFromStream(stream)
class PdfObject(object):
def getObject(self):
"""Resolves indirect references."""
return self
class NullObject(PdfObject):
def writeToStream(self, stream, encryption_key):
stream.write("null")
def readFromStream(stream):
nulltxt = stream.read(4)
if nulltxt != "null":
raise utils.PdfReadError, "error reading null object"
return NullObject()
readFromStream = staticmethod(readFromStream)
class BooleanObject(PdfObject):
def __init__(self, value):
self.value = value
def writeToStream(self, stream, encryption_key):
if self.value:
stream.write("true")
else:
stream.write("false")
def readFromStream(stream):
word = stream.read(4)
if word == "true":
return BooleanObject(True)
elif word == "fals":
stream.read(1)
return BooleanObject(False)
assert False
readFromStream = staticmethod(readFromStream)
class ArrayObject(list, PdfObject):
def writeToStream(self, stream, encryption_key):
stream.write("[")
for data in self:
stream.write(" ")
data.writeToStream(stream, encryption_key)
stream.write(" ]")
def readFromStream(stream, pdf):
arr = ArrayObject()
tmp = stream.read(1)
if tmp != "[":
raise utils.PdfReadError, "error reading array"
while True:
# skip leading whitespace
tok = stream.read(1)
while tok.isspace():
tok = stream.read(1)
stream.seek(-1, 1)
# check for array ending
peekahead = stream.read(1)
if peekahead == "]":
break
stream.seek(-1, 1)
# read and append obj
arr.append(readObject(stream, pdf))
return arr
readFromStream = staticmethod(readFromStream)
class IndirectObject(PdfObject):
def __init__(self, idnum, generation, pdf):
self.idnum = idnum
self.generation = generation
self.pdf = pdf
def getObject(self):
return self.pdf.getObject(self).getObject()
def __repr__(self):
return "IndirectObject(%r, %r)" % (self.idnum, self.generation)
def __eq__(self, other):
return (
other != None and
isinstance(other, IndirectObject) and
self.idnum == other.idnum and
self.generation == other.generation and
self.pdf is other.pdf
)
def __ne__(self, other):
return not self.__eq__(other)
def writeToStream(self, stream, encryption_key):
stream.write("%s %s R" % (self.idnum, self.generation))
def readFromStream(stream, pdf):
idnum = ""
while True:
tok = stream.read(1)
if tok.isspace():
break
idnum += tok
generation = ""
while True:
tok = stream.read(1)
if tok.isspace():
break
generation += tok
r = stream.read(1)
if r != "R":
raise utils.PdfReadError("error reading indirect object reference")
return IndirectObject(int(idnum), int(generation), pdf)
readFromStream = staticmethod(readFromStream)
class FloatObject(decimal.Decimal, PdfObject):
def __new__(cls, value="0", context=None):
return decimal.Decimal.__new__(cls, str(value), context)
def __repr__(self):
if self == self.to_integral():
return str(self.quantize(decimal.Decimal(1)))
else:
# XXX: this adds useless extraneous zeros.
return "%.5f" % self
def writeToStream(self, stream, encryption_key):
stream.write(repr(self))
class NumberObject(int, PdfObject):
def __init__(self, value):
int.__init__(value)
def writeToStream(self, stream, encryption_key):
stream.write(repr(self))
def readFromStream(stream):
name = ""
while True:
tok = stream.read(1)
if tok != '+' and tok != '-' and tok != '.' and not tok.isdigit():
stream.seek(-1, 1)
break
name += tok
if name.find(".") != -1:
return FloatObject(name)
else:
return NumberObject(name)
readFromStream = staticmethod(readFromStream)
##
# Given a string (either a "str" or "unicode"), create a ByteStringObject or a
# TextStringObject to represent the string.
def createStringObject(string):
if isinstance(string, unicode):
return TextStringObject(string)
elif isinstance(string, str):
if string.startswith(codecs.BOM_UTF16_BE):
retval = TextStringObject(string.decode("utf-16"))
retval.autodetect_utf16 = True
return retval
else:
# This is probably a big performance hit here, but we need to
# convert string objects into the text/unicode-aware version if
# possible... and the only way to check if that's possible is
# to try. Some strings are strings, some are just byte arrays.
try:
retval = TextStringObject(decode_pdfdocencoding(string))
retval.autodetect_pdfdocencoding = True
return retval
except UnicodeDecodeError:
return ByteStringObject(string)
else:
raise TypeError("createStringObject should have str or unicode arg")
def readHexStringFromStream(stream):
stream.read(1)
txt = ""
x = ""
while True:
tok = readNonWhitespace(stream)
if tok == ">":
break
x += tok
if len(x) == 2:
txt += chr(int(x, base=16))
x = ""
if len(x) == 1:
x += "0"
if len(x) == 2:
txt += chr(int(x, base=16))
return createStringObject(txt)
def readStringFromStream(stream):
tok = stream.read(1)
parens = 1
txt = ""
while True:
tok = stream.read(1)
if tok == "(":
parens += 1
elif tok == ")":
parens -= 1
if parens == 0:
break
elif tok == "\\":
tok = stream.read(1)
if tok == "n":
tok = "\n"
elif tok == "r":
tok = "\r"
elif tok == "t":
tok = "\t"
elif tok == "b":
tok = "\b"
elif tok == "f":
tok = "\f"
elif tok == "(":
tok = "("
elif tok == ")":
tok = ")"
elif tok == "\\":
tok = "\\"
elif tok.isdigit():
# "The number ddd may consist of one, two, or three
# octal digits; high-order overflow shall be ignored.
# Three octal digits shall be used, with leading zeros
# as needed, if the next character of the string is also
# a digit." (PDF reference 7.3.4.2, p 16)
for i in range(2):
ntok = stream.read(1)
if ntok.isdigit():
tok += ntok
else:
break
tok = chr(int(tok, base=8))
elif tok in "\n\r":
# This case is hit when a backslash followed by a line
# break occurs. If it's a multi-char EOL, consume the
# second character:
tok = stream.read(1)
if not tok in "\n\r":
stream.seek(-1, 1)
# Then don't add anything to the actual string, since this
# line break was escaped:
tok = ''
else:
raise utils.PdfReadError("Unexpected escaped string")
txt += tok
return createStringObject(txt)
##
# Represents a string object where the text encoding could not be determined.
# This occurs quite often, as the PDF spec doesn't provide an alternate way to
# represent strings -- for example, the encryption data stored in files (like
# /O) is clearly not text, but is still stored in a "String" object.
class ByteStringObject(str, PdfObject):
##
# For compatibility with TextStringObject.original_bytes. This method
# returns self.
original_bytes = property(lambda self: self)
def writeToStream(self, stream, encryption_key):
bytearr = self
if encryption_key:
bytearr = RC4_encrypt(encryption_key, bytearr)
stream.write("<")
stream.write(bytearr.encode("hex"))
stream.write(">")
##
# Represents a string object that has been decoded into a real unicode string.
# If read from a PDF document, this string appeared to match the
# PDFDocEncoding, or contained a UTF-16BE BOM mark to cause UTF-16 decoding to
# occur.
class TextStringObject(unicode, PdfObject):
autodetect_pdfdocencoding = False
autodetect_utf16 = False
##
# It is occasionally possible that a text string object gets created where
# a byte string object was expected due to the autodetection mechanism --
# if that occurs, this "original_bytes" property can be used to
# back-calculate what the original encoded bytes were.
original_bytes = property(lambda self: self.get_original_bytes())
def get_original_bytes(self):
# We're a text string object, but the library is trying to get our raw
# bytes. This can happen if we auto-detected this string as text, but
# we were wrong. It's pretty common. Return the original bytes that
# would have been used to create this object, based upon the autodetect
# method.
if self.autodetect_utf16:
return codecs.BOM_UTF16_BE + self.encode("utf-16be")
elif self.autodetect_pdfdocencoding:
return encode_pdfdocencoding(self)
else:
raise Exception("no information about original bytes")
def writeToStream(self, stream, encryption_key):
# Try to write the string out as a PDFDocEncoding encoded string. It's
# nicer to look at in the PDF file. Sadly, we take a performance hit
# here for trying...
try:
bytearr = encode_pdfdocencoding(self)
except UnicodeEncodeError:
bytearr = codecs.BOM_UTF16_BE + self.encode("utf-16be")
if encryption_key:
bytearr = RC4_encrypt(encryption_key, bytearr)
obj = ByteStringObject(bytearr)
obj.writeToStream(stream, None)
else:
stream.write("(")
for c in bytearr:
if not c.isalnum() and c != ' ':
stream.write("\\%03o" % ord(c))
else:
stream.write(c)
stream.write(")")
class NameObject(str, PdfObject):
delimiterCharacters = "(", ")", "<", ">", "[", "]", "{", "}", "/", "%"
def __init__(self, data):
str.__init__(data)
def writeToStream(self, stream, encryption_key):
stream.write(self)
def readFromStream(stream):
name = stream.read(1)
if name != "/":
raise utils.PdfReadError, "name read error"
while True:
tok = stream.read(1)
if tok.isspace() or tok in NameObject.delimiterCharacters:
stream.seek(-1, 1)
break
name += tok
return NameObject(name)
readFromStream = staticmethod(readFromStream)
class DictionaryObject(dict, PdfObject):
def __init__(self, *args, **kwargs):
if len(args) == 0:
self.update(kwargs)
elif len(args) == 1:
arr = args[0]
# If we're passed a list/tuple, make a dict out of it
if not hasattr(arr, "iteritems"):
newarr = {}
for k, v in arr:
newarr[k] = v
arr = newarr
self.update(arr)
else:
raise TypeError("dict expected at most 1 argument, got 3")
def update(self, arr):
# note, a ValueError halfway through copying values
# will leave half the values in this dict.
for k, v in arr.iteritems():
self.__setitem__(k, v)
def raw_get(self, key):
return dict.__getitem__(self, key)
def __setitem__(self, key, value):
if not isinstance(key, PdfObject):
raise ValueError("key must be PdfObject")
if not isinstance(value, PdfObject):
raise ValueError("value must be PdfObject")
return dict.__setitem__(self, key, value)
def setdefault(self, key, value=None):
if not isinstance(key, PdfObject):
raise ValueError("key must be PdfObject")
if not isinstance(value, PdfObject):
raise ValueError("value must be PdfObject")
return dict.setdefault(self, key, value)
def __getitem__(self, key):
return dict.__getitem__(self, key).getObject()
##
# Retrieves XMP (Extensible Metadata Platform) data relevant to the
# this object, if available.
# <p>
# Stability: Added in v1.12, will exist for all future v1.x releases.
# @return Returns a {@link #xmp.XmpInformation XmlInformation} instance
# that can be used to access XMP metadata from the document. Can also
# return None if no metadata was found on the document root.
def getXmpMetadata(self):
metadata = self.get("/Metadata", None)
if metadata == None:
return None
metadata = metadata.getObject()
import xmp
if not isinstance(metadata, xmp.XmpInformation):
metadata = xmp.XmpInformation(metadata)
self[NameObject("/Metadata")] = metadata
return metadata
##
# Read-only property that accesses the {@link
# #DictionaryObject.getXmpData getXmpData} function.
# <p>
# Stability: Added in v1.12, will exist for all future v1.x releases.
xmpMetadata = property(lambda self: self.getXmpMetadata(), None, None)
def writeToStream(self, stream, encryption_key):
stream.write("<<\n")
for key, value in self.items():
key.writeToStream(stream, encryption_key)
stream.write(" ")
value.writeToStream(stream, encryption_key)
stream.write("\n")
stream.write(">>")
def readFromStream(stream, pdf):
tmp = stream.read(2)
if tmp != "<<":
raise utils.PdfReadError, "dictionary read error"
data = {}
while True:
tok = readNonWhitespace(stream)
if tok == ">":
stream.read(1)
break
stream.seek(-1, 1)
key = readObject(stream, pdf)
tok = readNonWhitespace(stream)
stream.seek(-1, 1)
value = readObject(stream, pdf)
if data.has_key(key):
# multiple definitions of key not permitted
raise utils.PdfReadError, "multiple definitions in dictionary"
data[key] = value
pos = stream.tell()
s = readNonWhitespace(stream)
if s == 's' and stream.read(5) == 'tream':
eol = stream.read(1)
# odd PDF file output has spaces after 'stream' keyword but before EOL.
# patch provided by Danial Sandler
while eol == ' ':
eol = stream.read(1)
assert eol in ("\n", "\r")
if eol == "\r":
# read \n after
stream.read(1)
# this is a stream object, not a dictionary
assert data.has_key("/Length")
length = data["/Length"]
if isinstance(length, IndirectObject):
t = stream.tell()
length = pdf.getObject(length)
stream.seek(t, 0)
data["__streamdata__"] = stream.read(length)
e = readNonWhitespace(stream)
ndstream = stream.read(8)
if (e + ndstream) != "endstream":
# (sigh) - the odd PDF file has a length that is too long, so
# we need to read backwards to find the "endstream" ending.
# ReportLab (unknown version) generates files with this bug,
# and Python users into PDF files tend to be our audience.
# we need to do this to correct the streamdata and chop off
# an extra character.
pos = stream.tell()
stream.seek(-10, 1)
end = stream.read(9)
if end == "endstream":
# we found it by looking back one character further.
data["__streamdata__"] = data["__streamdata__"][:-1]
else:
stream.seek(pos, 0)
raise utils.PdfReadError, "Unable to find 'endstream' marker after stream."
else:
stream.seek(pos, 0)
if data.has_key("__streamdata__"):
return StreamObject.initializeFromDictionary(data)
else:
retval = DictionaryObject()
retval.update(data)
return retval
readFromStream = staticmethod(readFromStream)
class StreamObject(DictionaryObject):
def __init__(self):
self._data = None
self.decodedSelf = None
def writeToStream(self, stream, encryption_key):
self[NameObject("/Length")] = NumberObject(len(self._data))
DictionaryObject.writeToStream(self, stream, encryption_key)
del self["/Length"]
stream.write("\nstream\n")
data = self._data
if encryption_key:
data = RC4_encrypt(encryption_key, data)
stream.write(data)
stream.write("\nendstream")
def initializeFromDictionary(data):
if data.has_key("/Filter"):
retval = EncodedStreamObject()
else:
retval = DecodedStreamObject()
retval._data = data["__streamdata__"]
del data["__streamdata__"]
del data["/Length"]
retval.update(data)
return retval
initializeFromDictionary = staticmethod(initializeFromDictionary)
def flateEncode(self):
if self.has_key("/Filter"):
f = self["/Filter"]
if isinstance(f, ArrayObject):
f.insert(0, NameObject("/FlateDecode"))
else:
newf = ArrayObject()
newf.append(NameObject("/FlateDecode"))
newf.append(f)
f = newf
else:
f = NameObject("/FlateDecode")
retval = EncodedStreamObject()
retval[NameObject("/Filter")] = f
retval._data = filters.FlateDecode.encode(self._data)
return retval
class DecodedStreamObject(StreamObject):
def getData(self):
return self._data
def setData(self, data):
self._data = data
class EncodedStreamObject(StreamObject):
def __init__(self):
self.decodedSelf = None
def getData(self):
if self.decodedSelf:
# cached version of decoded object
return self.decodedSelf.getData()
else:
# create decoded object
decoded = DecodedStreamObject()
decoded._data = filters.decodeStreamData(self)
for key, value in self.items():
if not key in ("/Length", "/Filter", "/DecodeParms"):
decoded[key] = value
self.decodedSelf = decoded
return decoded._data
def setData(self, data):
raise utils.PdfReadError, "Creating EncodedStreamObject is not currently supported"
class RectangleObject(ArrayObject):
def __init__(self, arr):
# must have four points
assert len(arr) == 4
# automatically convert arr[x] into NumberObject(arr[x]) if necessary
ArrayObject.__init__(self, [self.ensureIsNumber(x) for x in arr])
def ensureIsNumber(self, value):
if not isinstance(value, (NumberObject, FloatObject)):
value = FloatObject(value)
return value
def __repr__(self):
return "RectangleObject(%s)" % repr(list(self))
def getLowerLeft_x(self):
return self[0]
def getLowerLeft_y(self):
return self[1]
def getUpperRight_x(self):
return self[2]
def getUpperRight_y(self):
return self[3]
def getUpperLeft_x(self):
return self.getLowerLeft_x()
def getUpperLeft_y(self):
return self.getUpperRight_y()
def getLowerRight_x(self):
return self.getUpperRight_x()
def getLowerRight_y(self):
return self.getLowerLeft_y()
def getLowerLeft(self):
return self.getLowerLeft_x(), self.getLowerLeft_y()
def getLowerRight(self):
return self.getLowerRight_x(), self.getLowerRight_y()
def getUpperLeft(self):
return self.getUpperLeft_x(), self.getUpperLeft_y()
def getUpperRight(self):
return self.getUpperRight_x(), self.getUpperRight_y()
def setLowerLeft(self, value):
self[0], self[1] = [self.ensureIsNumber(x) for x in value]
def setLowerRight(self, value):
self[2], self[1] = [self.ensureIsNumber(x) for x in value]
def setUpperLeft(self, value):
self[0], self[3] = [self.ensureIsNumber(x) for x in value]
def setUpperRight(self, value):
self[2], self[3] = [self.ensureIsNumber(x) for x in value]
def getWidth(self):
return self.getUpperRight_x() - self.getLowerLeft_x()
def getHeight(self):
return self.getUpperRight_y() - self.getLowerLeft_x()
lowerLeft = property(getLowerLeft, setLowerLeft, None, None)
lowerRight = property(getLowerRight, setLowerRight, None, None)
upperLeft = property(getUpperLeft, setUpperLeft, None, None)
upperRight = property(getUpperRight, setUpperRight, None, None)
def encode_pdfdocencoding(unicode_string):
retval = ''
for c in unicode_string:
try:
retval += chr(_pdfDocEncoding_rev[c])
except KeyError:
raise UnicodeEncodeError("pdfdocencoding", c, -1, -1,
"does not exist in translation table")
return retval
def decode_pdfdocencoding(byte_array):
retval = u''
for b in byte_array:
c = _pdfDocEncoding[ord(b)]
if c == u'\u0000':
raise UnicodeDecodeError("pdfdocencoding", b, -1, -1,
"does not exist in translation table")
retval += c
return retval
_pdfDocEncoding = (
u'\u0000', u'\u0000', u'\u0000', u'\u0000', u'\u0000', u'\u0000', u'\u0000', u'\u0000',
u'\u0000', u'\u0000', u'\u0000', u'\u0000', u'\u0000', u'\u0000', u'\u0000', u'\u0000',
u'\u0000', u'\u0000', u'\u0000', u'\u0000', u'\u0000', u'\u0000', u'\u0000', u'\u0000',
u'\u02d8', u'\u02c7', u'\u02c6', u'\u02d9', u'\u02dd', u'\u02db', u'\u02da', u'\u02dc',
u'\u0020', u'\u0021', u'\u0022', u'\u0023', u'\u0024', u'\u0025', u'\u0026', u'\u0027',
u'\u0028', u'\u0029', u'\u002a', u'\u002b', u'\u002c', u'\u002d', u'\u002e', u'\u002f',
u'\u0030', u'\u0031', u'\u0032', u'\u0033', u'\u0034', u'\u0035', u'\u0036', u'\u0037',
u'\u0038', u'\u0039', u'\u003a', u'\u003b', u'\u003c', u'\u003d', u'\u003e', u'\u003f',
u'\u0040', u'\u0041', u'\u0042', u'\u0043', u'\u0044', u'\u0045', u'\u0046', u'\u0047',
u'\u0048', u'\u0049', u'\u004a', u'\u004b', u'\u004c', u'\u004d', u'\u004e', u'\u004f',
u'\u0050', u'\u0051', u'\u0052', u'\u0053', u'\u0054', u'\u0055', u'\u0056', u'\u0057',
u'\u0058', u'\u0059', u'\u005a', u'\u005b', u'\u005c', u'\u005d', u'\u005e', u'\u005f',
u'\u0060', u'\u0061', u'\u0062', u'\u0063', u'\u0064', u'\u0065', u'\u0066', u'\u0067',
u'\u0068', u'\u0069', u'\u006a', u'\u006b', u'\u006c', u'\u006d', u'\u006e', u'\u006f',
u'\u0070', u'\u0071', u'\u0072', u'\u0073', u'\u0074', u'\u0075', u'\u0076', u'\u0077',
u'\u0078', u'\u0079', u'\u007a', u'\u007b', u'\u007c', u'\u007d', u'\u007e', u'\u0000',
u'\u2022', u'\u2020', u'\u2021', u'\u2026', u'\u2014', u'\u2013', u'\u0192', u'\u2044',
u'\u2039', u'\u203a', u'\u2212', u'\u2030', u'\u201e', u'\u201c', u'\u201d', u'\u2018',
u'\u2019', u'\u201a', u'\u2122', u'\ufb01', u'\ufb02', u'\u0141', u'\u0152', u'\u0160',
u'\u0178', u'\u017d', u'\u0131', u'\u0142', u'\u0153', u'\u0161', u'\u017e', u'\u0000',
u'\u20ac', u'\u00a1', u'\u00a2', u'\u00a3', u'\u00a4', u'\u00a5', u'\u00a6', u'\u00a7',
u'\u00a8', u'\u00a9', u'\u00aa', u'\u00ab', u'\u00ac', u'\u0000', u'\u00ae', u'\u00af',
u'\u00b0', u'\u00b1', u'\u00b2', u'\u00b3', u'\u00b4', u'\u00b5', u'\u00b6', u'\u00b7',
u'\u00b8', u'\u00b9', u'\u00ba', u'\u00bb', u'\u00bc', u'\u00bd', u'\u00be', u'\u00bf',
u'\u00c0', u'\u00c1', u'\u00c2', u'\u00c3', u'\u00c4', u'\u00c5', u'\u00c6', u'\u00c7',
u'\u00c8', u'\u00c9', u'\u00ca', u'\u00cb', u'\u00cc', u'\u00cd', u'\u00ce', u'\u00cf',
u'\u00d0', u'\u00d1', u'\u00d2', u'\u00d3', u'\u00d4', u'\u00d5', u'\u00d6', u'\u00d7',
u'\u00d8', u'\u00d9', u'\u00da', u'\u00db', u'\u00dc', u'\u00dd', u'\u00de', u'\u00df',
u'\u00e0', u'\u00e1', u'\u00e2', u'\u00e3', u'\u00e4', u'\u00e5', u'\u00e6', u'\u00e7',
u'\u00e8', u'\u00e9', u'\u00ea', u'\u00eb', u'\u00ec', u'\u00ed', u'\u00ee', u'\u00ef',
u'\u00f0', u'\u00f1', u'\u00f2', u'\u00f3', u'\u00f4', u'\u00f5', u'\u00f6', u'\u00f7',
u'\u00f8', u'\u00f9', u'\u00fa', u'\u00fb', u'\u00fc', u'\u00fd', u'\u00fe', u'\u00ff'
)
assert len(_pdfDocEncoding) == 256
_pdfDocEncoding_rev = {}
for i in xrange(256):
char = _pdfDocEncoding[i]
if char == u"\u0000":
continue
assert char not in _pdfDocEncoding_rev
_pdfDocEncoding_rev[char] = i

File diff suppressed because it is too large Load diff

View file

@ -1,122 +0,0 @@
# vim: sw=4:expandtab:foldmethod=marker
#
# Copyright (c) 2006, Mathieu Fenniak
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are
# met:
#
# * Redistributions of source code must retain the above copyright notice,
# this list of conditions and the following disclaimer.
# * Redistributions in binary form must reproduce the above copyright notice,
# this list of conditions and the following disclaimer in the documentation
# and/or other materials provided with the distribution.
# * The name of the author may not be used to endorse or promote products
# derived from this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
# AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
# ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
# LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
# CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
# SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
# INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
# CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
# ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
# POSSIBILITY OF SUCH DAMAGE.
"""
Utility functions for PDF library.
"""
__author__ = "Mathieu Fenniak"
__author_email__ = "biziqe@mathieu.fenniak.net"
#ENABLE_PSYCO = False
#if ENABLE_PSYCO:
# try:
# import psyco
# except ImportError:
# ENABLE_PSYCO = False
#
#if not ENABLE_PSYCO:
# class psyco:
# def proxy(func):
# return func
# proxy = staticmethod(proxy)
def readUntilWhitespace(stream, maxchars=None):
txt = ""
while True:
tok = stream.read(1)
if tok.isspace() or not tok:
break
txt += tok
if len(txt) == maxchars:
break
return txt
def readNonWhitespace(stream):
tok = ' '
while tok == '\n' or tok == '\r' or tok == ' ' or tok == '\t':
tok = stream.read(1)
return tok
class ConvertFunctionsToVirtualList(object):
def __init__(self, lengthFunction, getFunction):
self.lengthFunction = lengthFunction
self.getFunction = getFunction
def __len__(self):
return self.lengthFunction()
def __getitem__(self, index):
if not isinstance(index, int):
raise TypeError, "sequence indices must be integers"
len_self = len(self)
if index < 0:
# support negative indexes
index = len_self + index
if index < 0 or index >= len_self:
raise IndexError, "sequence index out of range"
return self.getFunction(index)
def RC4_encrypt(key, plaintext):
S = [i for i in range(256)]
j = 0
for i in range(256):
j = (j + S[i] + ord(key[i % len(key)])) % 256
S[i], S[j] = S[j], S[i]
i, j = 0, 0
retval = ""
for x in range(len(plaintext)):
i = (i + 1) % 256
j = (j + S[i]) % 256
S[i], S[j] = S[j], S[i]
t = S[(S[i] + S[j]) % 256]
retval += chr(ord(plaintext[x]) ^ t)
return retval
def matrixMultiply(a, b):
return [[sum([float(i)*float(j)
for i, j in zip(row, col)]
) for col in zip(*b)]
for row in a]
class PyPdfError(Exception):
pass
class PdfReadError(PyPdfError):
pass
class PageSizeNotDefinedError(PyPdfError):
pass
if __name__ == "__main__":
# test RC4
out = RC4_encrypt("Key", "Plaintext")
print repr(out)
pt = RC4_encrypt("Key", out)
print repr(pt)

View file

@ -1,228 +0,0 @@
../stdnum/meid.py
../stdnum/imei.py
../stdnum/verhoeff.py
../stdnum/iban.py
../stdnum/grid.py
../stdnum/__init__.py
../stdnum/exceptions.py
../stdnum/ismn.py
../stdnum/imsi.py
../stdnum/isan.py
../stdnum/isbn.py
../stdnum/issn.py
../stdnum/isil.py
../stdnum/numdb.py
../stdnum/luhn.py
../stdnum/ean.py
../stdnum/util.py
../stdnum/bg/pnf.py
../stdnum/bg/egn.py
../stdnum/bg/__init__.py
../stdnum/bg/vat.py
../stdnum/at/__init__.py
../stdnum/at/uid.py
../stdnum/lv/pvn.py
../stdnum/lv/__init__.py
../stdnum/ie/pps.py
../stdnum/ie/__init__.py
../stdnum/ie/vat.py
../stdnum/gr/__init__.py
../stdnum/gr/vat.py
../stdnum/br/cpf.py
../stdnum/br/__init__.py
../stdnum/ee/__init__.py
../stdnum/ee/kmkr.py
../stdnum/eu/__init__.py
../stdnum/eu/vat.py
../stdnum/pl/__init__.py
../stdnum/pl/nip.py
../stdnum/fr/tva.py
../stdnum/fr/__init__.py
../stdnum/fr/siren.py
../stdnum/ro/__init__.py
../stdnum/ro/cnp.py
../stdnum/ro/cf.py
../stdnum/es/__init__.py
../stdnum/es/nif.py
../stdnum/es/nie.py
../stdnum/es/dni.py
../stdnum/es/cif.py
../stdnum/mt/__init__.py
../stdnum/mt/vat.py
../stdnum/dk/__init__.py
../stdnum/dk/cpr.py
../stdnum/dk/cvr.py
../stdnum/it/__init__.py
../stdnum/it/iva.py
../stdnum/be/__init__.py
../stdnum/be/vat.py
../stdnum/iso7064/mod_11_10.py
../stdnum/iso7064/__init__.py
../stdnum/iso7064/mod_37_36.py
../stdnum/iso7064/mod_11_2.py
../stdnum/iso7064/mod_97_10.py
../stdnum/iso7064/mod_37_2.py
../stdnum/lt/__init__.py
../stdnum/lt/pvm.py
../stdnum/fi/hetu.py
../stdnum/fi/__init__.py
../stdnum/fi/alv.py
../stdnum/si/__init__.py
../stdnum/si/ddv.py
../stdnum/se/__init__.py
../stdnum/se/vat.py
../stdnum/my/__init__.py
../stdnum/my/nric.py
../stdnum/nl/__init__.py
../stdnum/nl/btw.py
../stdnum/nl/onderwijsnummer.py
../stdnum/nl/bsn.py
../stdnum/nl/brin.py
../stdnum/nl/postcode.py
../stdnum/hr/oib.py
../stdnum/hr/__init__.py
../stdnum/hu/__init__.py
../stdnum/hu/anum.py
../stdnum/us/tin.py
../stdnum/us/ptin.py
../stdnum/us/__init__.py
../stdnum/us/atin.py
../stdnum/us/ein.py
../stdnum/us/itin.py
../stdnum/us/ssn.py
../stdnum/gb/__init__.py
../stdnum/gb/vat.py
../stdnum/de/__init__.py
../stdnum/de/vat.py
../stdnum/cz/__init__.py
../stdnum/cz/rc.py
../stdnum/cz/dic.py
../stdnum/sk/__init__.py
../stdnum/sk/rc.py
../stdnum/sk/dph.py
../stdnum/pt/__init__.py
../stdnum/pt/nif.py
../stdnum/lu/tva.py
../stdnum/lu/__init__.py
../stdnum/cy/__init__.py
../stdnum/cy/vat.py
../stdnum/isbn.dat
../stdnum/iban.dat
../stdnum/isil.dat
../stdnum/imsi.dat
../stdnum/my/bp.dat
../stdnum/us/ein.dat
../stdnum/meid.pyc
../stdnum/imei.pyc
../stdnum/verhoeff.pyc
../stdnum/iban.pyc
../stdnum/grid.pyc
../stdnum/__init__.pyc
../stdnum/exceptions.pyc
../stdnum/ismn.pyc
../stdnum/imsi.pyc
../stdnum/isan.pyc
../stdnum/isbn.pyc
../stdnum/issn.pyc
../stdnum/isil.pyc
../stdnum/numdb.pyc
../stdnum/luhn.pyc
../stdnum/ean.pyc
../stdnum/util.pyc
../stdnum/bg/pnf.pyc
../stdnum/bg/egn.pyc
../stdnum/bg/__init__.pyc
../stdnum/bg/vat.pyc
../stdnum/at/__init__.pyc
../stdnum/at/uid.pyc
../stdnum/lv/pvn.pyc
../stdnum/lv/__init__.pyc
../stdnum/ie/pps.pyc
../stdnum/ie/__init__.pyc
../stdnum/ie/vat.pyc
../stdnum/gr/__init__.pyc
../stdnum/gr/vat.pyc
../stdnum/br/cpf.pyc
../stdnum/br/__init__.pyc
../stdnum/ee/__init__.pyc
../stdnum/ee/kmkr.pyc
../stdnum/eu/__init__.pyc
../stdnum/eu/vat.pyc
../stdnum/pl/__init__.pyc
../stdnum/pl/nip.pyc
../stdnum/fr/tva.pyc
../stdnum/fr/__init__.pyc
../stdnum/fr/siren.pyc
../stdnum/ro/__init__.pyc
../stdnum/ro/cnp.pyc
../stdnum/ro/cf.pyc
../stdnum/es/__init__.pyc
../stdnum/es/nif.pyc
../stdnum/es/nie.pyc
../stdnum/es/dni.pyc
../stdnum/es/cif.pyc
../stdnum/mt/__init__.pyc
../stdnum/mt/vat.pyc
../stdnum/dk/__init__.pyc
../stdnum/dk/cpr.pyc
../stdnum/dk/cvr.pyc
../stdnum/it/__init__.pyc
../stdnum/it/iva.pyc
../stdnum/be/__init__.pyc
../stdnum/be/vat.pyc
../stdnum/iso7064/mod_11_10.pyc
../stdnum/iso7064/__init__.pyc
../stdnum/iso7064/mod_37_36.pyc
../stdnum/iso7064/mod_11_2.pyc
../stdnum/iso7064/mod_97_10.pyc
../stdnum/iso7064/mod_37_2.pyc
../stdnum/lt/__init__.pyc
../stdnum/lt/pvm.pyc
../stdnum/fi/hetu.pyc
../stdnum/fi/__init__.pyc
../stdnum/fi/alv.pyc
../stdnum/si/__init__.pyc
../stdnum/si/ddv.pyc
../stdnum/se/__init__.pyc
../stdnum/se/vat.pyc
../stdnum/my/__init__.pyc
../stdnum/my/nric.pyc
../stdnum/nl/__init__.pyc
../stdnum/nl/btw.pyc
../stdnum/nl/onderwijsnummer.pyc
../stdnum/nl/bsn.pyc
../stdnum/nl/brin.pyc
../stdnum/nl/postcode.pyc
../stdnum/hr/oib.pyc
../stdnum/hr/__init__.pyc
../stdnum/hu/__init__.pyc
../stdnum/hu/anum.pyc
../stdnum/us/tin.pyc
../stdnum/us/ptin.pyc
../stdnum/us/__init__.pyc
../stdnum/us/atin.pyc
../stdnum/us/ein.pyc
../stdnum/us/itin.pyc
../stdnum/us/ssn.pyc
../stdnum/gb/__init__.pyc
../stdnum/gb/vat.pyc
../stdnum/de/__init__.pyc
../stdnum/de/vat.pyc
../stdnum/cz/__init__.pyc
../stdnum/cz/rc.pyc
../stdnum/cz/dic.pyc
../stdnum/sk/__init__.pyc
../stdnum/sk/rc.pyc
../stdnum/sk/dph.pyc
../stdnum/pt/__init__.pyc
../stdnum/pt/nif.pyc
../stdnum/lu/tva.pyc
../stdnum/lu/__init__.pyc
../stdnum/cy/__init__.pyc
../stdnum/cy/vat.pyc
./
dependency_links.txt
PKG-INFO
SOURCES.txt
top_level.txt
requires.txt

View file

@ -1,150 +0,0 @@
easy_install.py,sha256=MDC9vt5AxDsXX5qcKlBz2TnW6Tpuv_AobnfhCJ9X3PM,126
pkg_resources.py,sha256=WJZPZvwcfxzVnrRoGsA58Us7hA3HdzKMq0HyB5m8vVo,101233
setuptools/archive_util.py,sha256=xr6Xl-PbXymPMuuq5TokDMs7SvKMVKsjUegw8mpj7_g,6556
setuptools/cli-32.exe,sha256=dfEuovMNnA2HLa3jRfMPVi5tk4R7alCbpTvuxtCyw0Y,65536
setuptools/cli-64.exe,sha256=KLABu5pyrnokJCv6skjXZ6GsXeyYHGcqOUT3oHI3Xpo,74752
setuptools/cli-arm-32.exe,sha256=0pFNIi2SmY2gdY91Y4LRhj1wuBsnv5cG1fus3iBJv40,69120
setuptools/cli.exe,sha256=dfEuovMNnA2HLa3jRfMPVi5tk4R7alCbpTvuxtCyw0Y,65536
setuptools/compat.py,sha256=-Hl58PuLOEHUDM3-qsgmk50qNiyLLF3RgBgJ-eGGZG0,2094
setuptools/depends.py,sha256=gMRnrqQSr_Yp_wf09O88vKSQah1YjjEi5PsDNezM2Hs,6370
setuptools/dist.py,sha256=tzUrozVQmFJxuvRVyisQqT_JURxR5OgJ3zHaLvFq2Is,33406
setuptools/extension.py,sha256=ovUmlg52d7t7LZHi5ywxFYG7EitWQbLBURv58hXPnoE,1731
setuptools/gui-32.exe,sha256=XBr0bHMA6Hpz2s9s9Bzjl-PwXfa9nH4ie0rFn4V2kWA,65536
setuptools/gui-64.exe,sha256=aYKMhX1IJLn4ULHgWX0sE0yREUt6B3TEHf_jOw6yNyE,75264
setuptools/gui-arm-32.exe,sha256=R5gRWLkY7wvO_CVGxoi7LZVTv0h-DKsKScy6fkbp4XI,69120
setuptools/gui.exe,sha256=XBr0bHMA6Hpz2s9s9Bzjl-PwXfa9nH4ie0rFn4V2kWA,65536
setuptools/lib2to3_ex.py,sha256=6jPF9sJuHiz0cyg4cwIBLl2VMAxcl3GYSZwWAOuJplU,1998
setuptools/package_index.py,sha256=IOHE81H91eEcR80gMquyQuU9Rh9cv7_6JOLLB0iNbhs,38943
setuptools/py26compat.py,sha256=ggKS8_aZWWHHS792vF3uXp5vmUkGNk3vjLreLTHr_-Q,431
setuptools/py27compat.py,sha256=CGj-jZcFgHUkrEdLvArkxHj96tAaMbG2-yJtUVU7QVI,306
setuptools/py31compat.py,sha256=O3X_wdWrvXTifeSFbRaCMuc23cDhMHJn7QlITb5cQ8E,1637
setuptools/sandbox.py,sha256=U5QwtrByF-ITHzGl1FyxhoRYdvNeic6-Ie-N6XW0Ybc,10430
setuptools/script (dev).tmpl,sha256=f7MR17dTkzaqkCMSVseyOCMVrPVSMdmTQsaB8cZzfuI,201
setuptools/script.tmpl,sha256=WGTt5piezO27c-Dbx6l5Q4T3Ff20A5z7872hv3aAhYY,138
setuptools/site-patch.py,sha256=K-0-cAx36mX_PG-qPZwosG9ZLCliRjquKQ4nHiJvvzg,2389
setuptools/ssl_support.py,sha256=FASqXlRCmXAi6LUWLUIo0u14MpJqHBgkOc5KPHSRrtI,8044
setuptools/svn_utils.py,sha256=SDSOSVpbAvWoswgKdTQAOxd8pEd1ZdRtqD4skntW6-E,18855
setuptools/unicode_utils.py,sha256=gvhAHRj1LELCz-1MP3rfXGi__O1CAm5aksO9Njd2lpU,981
setuptools/version.py,sha256=w08yLI4kDNb48cnJrD2luXeyodco47-Nk4_c8i-4xDY,20
setuptools/__init__.py,sha256=CliC3Pe-ej7j-iPQ1fu9Rkh2DdrfkBzRcSIJ3vF5dmg,5195
setuptools/command/alias.py,sha256=1sLQxZcNh6dDQpDmm4G7UGGTol83nY1NTPmNBbm2siI,2381
setuptools/command/bdist_egg.py,sha256=vGysGAHsTGSbSUwEBtHZ-Mtz54nU_tSwcc8DlnHfM7A,17606
setuptools/command/bdist_rpm.py,sha256=B7l0TnzCGb-0nLlm6rS00jWLkojASwVmdhW2w5Qz_Ak,1508
setuptools/command/bdist_wininst.py,sha256=_6dz3lpB1tY200LxKPLM7qgwTCceOMgaWFF-jW2-pm0,637
setuptools/command/build_ext.py,sha256=cukClSx-aLSuWmhWLmXPNDNr9ZHpFL7a01z71bwy9Go,12324
setuptools/command/build_py.py,sha256=FtFC7cRlKQCHT4-1yDey3igltuiGwFh8VsFg2U0B0kw,8644
setuptools/command/develop.py,sha256=uyRwABU1JnhQhZO9rS8-nenkzLwKKJt2P7WPnsXrHd4,6610
setuptools/command/easy_install.py,sha256=Nsa4Zj-oUL4GAI5KDxy1HTJ1KCJbm6sDPUEPv036xf4,82661
setuptools/command/egg_info.py,sha256=CY_h53JLZXafopM-E8-WbxKZd_xlp4SUqRWPQsJAYA4,15204
setuptools/command/install.py,sha256=QwaFiZRU3ytIHoPh8uJ9EqV3Fu9C4ca4B7UGAo95tws,4685
setuptools/command/install_egg_info.py,sha256=KXNB8O34-rK-cZZZr2fnT8kqNktDmTfUA88X2Iln66c,4001
setuptools/command/install_lib.py,sha256=a_xzDZ0v8x59YbIYH8x6wuI-1tV0W2DKWzMP994Jnw0,2129
setuptools/command/install_scripts.py,sha256=evsgRosqRxlww6l7BBx43RINpAbLd1raVkW2-dmFCyU,2041
setuptools/command/launcher manifest.xml,sha256=xlLbjWrB01tKC0-hlVkOKkiSPbzMml2eOPtJ_ucCnbE,628
setuptools/command/register.py,sha256=bHlMm1qmBbSdahTOT8w6UhA-EgeQIz7p6cD-qOauaiI,270
setuptools/command/rotate.py,sha256=Qm7SOa32L9XG5b_C7_SSYvKM5rqFXroeQ6w8GXIsY2o,2038
setuptools/command/saveopts.py,sha256=za7QCBcQimKKriWcoCcbhxPjUz30gSB74zuTL47xpP4,658
setuptools/command/sdist.py,sha256=EKzb8zEYkx-FFEN71kGWKrxUKsLHhXEOQHjmnLe28oY,8559
setuptools/command/setopt.py,sha256=Z3_kav60D2XHZjM0cRhGo7wbBYo7nr4U_U-wMMbpmu8,5080
setuptools/command/test.py,sha256=-ZbBhuoKrfOx_pgA-53H-qRw6Z9evRP8_z1KdKI7yvw,6481
setuptools/command/upload_docs.py,sha256=617bECKkdDizKaV_kN62hXjgmwKMbpY4l3zecNEdjNk,6811
setuptools/command/__init__.py,sha256=gQMXoLa0TtUtmUZY0ptSouWWA5kcTArWyDQ6QwkjoVQ,554
setuptools/tests/environment.py,sha256=Sl9Pok7ZEakC6YXP8urDkh9-XxQpoJAzyzzZUqOGOXQ,4658
setuptools/tests/py26compat.py,sha256=i_JBukWMEat4AM1FtU8DAd06r0gjZR3uma_jb_gxEXU,267
setuptools/tests/script-with-bom.py,sha256=nWOGL62VEQBsH5GaZvCyRyYqobziynGaqQJisffifsc,46
setuptools/tests/server.py,sha256=Fqk53860mwB_wDNOzixZLjCq_c8_1kaGiXBm5fHco2c,2651
setuptools/tests/test_bdist_egg.py,sha256=yD2dP8ApUtyo4rpwzSEGgeLjxihU7xHVAK2O2bCL52U,1962
setuptools/tests/test_build_ext.py,sha256=mfWDSPPR2auCi3AbZILJp-175Dj0551d6xpvM0md4zE,650
setuptools/tests/test_develop.py,sha256=JFvKRFbjzsExQBmg1kN-dWijhPY4uGO1TMQFDy9QFoc,3496
setuptools/tests/test_dist_info.py,sha256=ZgVLERe6WZpWUcwrLGk2b_cSOMZMASLV1hTV4hgQLG0,2615
setuptools/tests/test_easy_install.py,sha256=vjr2KNv78vgLg7otZeqz9yPXe0M7gOxiOE5MRvARblk,15704
setuptools/tests/test_egg_info.py,sha256=mJ8BDZuIWe_4-2mo_OQGAdi0DG1mS3pm0ZAEPdS_2vc,6745
setuptools/tests/test_find_packages.py,sha256=McPOROBbIR7JK5a2tMWTA88S6LrbcGMnPAZ66Oq32AQ,6005
setuptools/tests/test_integration.py,sha256=gXHi9iQ9LMqAjWcoGe4yvS-H3UczoOW3ij-x-afP0a4,2506
setuptools/tests/test_markerlib.py,sha256=UYBTjaug56cWxIwlCubdSTGZ-s9bqB1co54636x0xfo,2506
setuptools/tests/test_packageindex.py,sha256=8vay-a5Dry_cg71WFA1rlofIWO5c9KI6D6PKN4WKKiI,7625
setuptools/tests/test_resources.py,sha256=-YyuIR1EGofI44SycRIhF59NCxP_r8MD8iZj7aBGXHg,23639
setuptools/tests/test_sandbox.py,sha256=bdUzddjT6dojVbFOHQPFqqYaLGAMcLKVlWI_ADcY9SE,2364
setuptools/tests/test_sdist.py,sha256=fafOcXHGriVAp1VwUEKEyzmyQ-9in7utziPWHC8iOLU,17986
setuptools/tests/test_svn.py,sha256=UI9rRTWQDw7NGRzSKmgcsnkDGX8M4QDj447kCXvf5D8,7806
setuptools/tests/test_test.py,sha256=pnN_pLgda5uEJU0-X8n04g90OaxQ5_CTsLTxy2SqR80,3697
setuptools/tests/test_upload_docs.py,sha256=N__IVGihRBRqA3PetRcIDmNFu1XOhR7ix0e50xMuq_M,2139
setuptools/tests/__init__.py,sha256=6dOlIFbLaq185Y7B7wNUXwMGl1UCGGdM98pYTzMDmYQ,12531
setuptools-5.7.data/scripts/easy_install-3.4.pya,sha256=GO64Kc5hETG1XlgL9RPKEU-MmRtKCizdVOXwZNGwyOw,323
setuptools-5.7.data/scripts/easy_install.pya,sha256=IrlNq6Jg2UNYc00f_UxvLd3fae6pEYPkep8Si68OtpE,315
setuptools-5.7.dist-info/dependency_links.txt,sha256=UaFV2I99Rbdie_2lV4pEX6M2jKNDN7RhFSbiL1-PDiY,221
setuptools-5.7.dist-info/DESCRIPTION.rst,sha256=zgrbbjWmkgYqGkDa072PyEhYWn9N7hcAXHhIRxSoV_8,81665
setuptools-5.7.dist-info/entry_points.txt,sha256=wxdNYALrJxn2PJtYZXehrEYO05eRXr7j2I3vzIu3hK4,2872
setuptools-5.7.dist-info/METADATA,sha256=kR3p0B8KaxDP3nyqjJbuWNqC-JluIg9jBPFQL27Rd14,83043
setuptools-5.7.dist-info/pydist.json,sha256=HdDXqmNAdp5461-ee_dilBHGYaW2g1yRptP8U4OjAAk,4754
setuptools-5.7.dist-info/RECORD,,
setuptools-5.7.dist-info/top_level.txt,sha256=6rcdMHd5Qcgf05whbmEOKbeuInimBMJfU7vla2t56ZE,49
setuptools-5.7.dist-info/WHEEL,sha256=2QrAqABq1X2mxKM4JY32oZUIM268GesCrPKhzE9RspQ,116
setuptools-5.7.dist-info/zip-safe,sha256=frcCV1k9oG9oKj3dpUqdJg1PxRT2RSN_XKdLCPjaYaY,2
_markerlib/markers.py,sha256=YuFp0-osufFIoqnzG3L0Z2fDCx4Vln3VUDeXJ2DA_1I,3979
_markerlib/__init__.py,sha256=GSmhZqvAitLJHhSgtqqusfq2nJ_ClP3oy3Lm0uZLIsU,552
/srv/openmedialibrary/platform/Shared/home/.local/bin/easy_install,sha256=SluicxzCDiWYLUcagvxZrIjOWfQ2ReLyJVw8764k6e0,232
/srv/openmedialibrary/platform/Shared/home/.local/bin/easy_install-2.7,sha256=SluicxzCDiWYLUcagvxZrIjOWfQ2ReLyJVw8764k6e0,232
setuptools/tests/test_upload_docs.pyc,,
setuptools/ssl_support.pyc,,
setuptools/tests/environment.pyc,,
setuptools/tests/test_easy_install.pyc,,
setuptools/tests/server.pyc,,
setuptools/unicode_utils.pyc,,
setuptools/tests/test_find_packages.pyc,,
setuptools/tests/test_integration.pyc,,
setuptools/command/bdist_wininst.pyc,,
setuptools/tests/py26compat.pyc,,
setuptools/command/develop.pyc,,
setuptools/compat.pyc,,
setuptools/tests/test_bdist_egg.pyc,,
_markerlib/__init__.pyc,,
setuptools/tests/test_packageindex.pyc,,
setuptools/command/build_py.pyc,,
easy_install.pyc,,
setuptools/py26compat.pyc,,
setuptools/command/install_egg_info.pyc,,
setuptools/tests/test_svn.pyc,,
_markerlib/markers.pyc,,
setuptools/site-patch.pyc,,
setuptools/extension.pyc,,
setuptools/command/bdist_egg.pyc,,
setuptools/py31compat.pyc,,
pkg_resources.pyc,,
setuptools/command/sdist.pyc,,
setuptools/command/saveopts.pyc,,
setuptools/command/egg_info.pyc,,
setuptools/command/install_scripts.pyc,,
setuptools/command/install.pyc,,
setuptools/command/alias.pyc,,
setuptools/tests/test_develop.pyc,,
setuptools/__init__.pyc,,
setuptools/command/easy_install.pyc,,
setuptools/tests/test_sdist.pyc,,
setuptools/py27compat.pyc,,
setuptools/command/test.pyc,,
setuptools/command/build_ext.pyc,,
setuptools/version.pyc,,
setuptools/package_index.pyc,,
setuptools/command/bdist_rpm.pyc,,
setuptools/tests/script-with-bom.pyc,,
setuptools/tests/test_test.pyc,,
setuptools/command/__init__.pyc,,
setuptools/command/rotate.pyc,,
setuptools/archive_util.pyc,,
setuptools/command/setopt.pyc,,
setuptools/tests/test_build_ext.pyc,,
setuptools/command/install_lib.pyc,,
setuptools/tests/test_markerlib.pyc,,
setuptools/lib2to3_ex.pyc,,
setuptools/sandbox.pyc,,
setuptools/svn_utils.pyc,,
setuptools/command/upload_docs.pyc,,
setuptools/tests/test_egg_info.pyc,,
setuptools/tests/test_dist_info.pyc,,
setuptools/depends.pyc,,
setuptools/dist.pyc,,
setuptools/tests/test_resources.pyc,,
setuptools/tests/__init__.pyc,,
setuptools/tests/test_sandbox.pyc,,
setuptools/command/register.pyc,,

View file

@ -1 +0,0 @@
{"summary": "Easily download, build, install, upgrade, and uninstall Python packages", "extras": ["certs", "ssl:sys_platform=='win32'"], "commands": {"wrap_console": {"easy_install": "setuptools.command.easy_install:main", "easy_install-3.4": "setuptools.command.easy_install:main"}}, "contacts": [{"role": "author", "email": "distutils-sig@python.org", "name": "Python Packaging Authority"}], "classifiers": ["Development Status :: 5 - Production/Stable", "Intended Audience :: Developers", "License :: OSI Approved :: Python Software Foundation License", "License :: OSI Approved :: Zope Public License", "Operating System :: OS Independent", "Programming Language :: Python :: 2.6", "Programming Language :: Python :: 2.7", "Programming Language :: Python :: 3", "Programming Language :: Python :: 3.1", "Programming Language :: Python :: 3.2", "Programming Language :: Python :: 3.3", "Programming Language :: Python :: 3.4", "Topic :: Software Development :: Libraries :: Python Modules", "Topic :: System :: Archiving :: Packaging", "Topic :: System :: Systems Administration", "Topic :: Utilities"], "license": "PSF or ZPL", "document_names": {"description": "DESCRIPTION.rst"}, "metadata_version": "2.0", "run_requires": [{"environment": "extra == \"ssl:sys_platform=='win32'\"", "requires": ["wincertstore (==0.2)"]}, {"extra": "certs", "requires": ["certifi (==1.0.1)"]}], "exports": {"console_scripts": {"easy_install": "setuptools.command.easy_install:main", "easy_install-3.4": "setuptools.command.easy_install:main"}, "egg_info.writers": {"dependency_links.txt": "setuptools.command.egg_info:overwrite_arg", "namespace_packages.txt": "setuptools.command.egg_info:overwrite_arg", "depends.txt": "setuptools.command.egg_info:warn_depends_obsolete", "entry_points.txt": "setuptools.command.egg_info:write_entries", "PKG-INFO": "setuptools.command.egg_info:write_pkg_info", "eager_resources.txt": "setuptools.command.egg_info:overwrite_arg", "top_level.txt": "setuptools.command.egg_info:write_toplevel_names", "requires.txt": "setuptools.command.egg_info:write_requirements"}, "setuptools.installation": {"eggsecutable": "setuptools.command.easy_install:bootstrap"}, "distutils.setup_keywords": {"use_2to3_fixers": "setuptools.dist:assert_string_list", "convert_2to3_doctests": "setuptools.dist:assert_string_list", "package_data": "setuptools.dist:check_package_data", "zip_safe": "setuptools.dist:assert_bool", "include_package_data": "setuptools.dist:assert_bool", "test_loader": "setuptools.dist:check_importable", "install_requires": "setuptools.dist:check_requirements", "eager_resources": "setuptools.dist:assert_string_list", "dependency_links": "setuptools.dist:assert_string_list", "namespace_packages": "setuptools.dist:check_nsp", "test_runner": "setuptools.dist:check_importable", "tests_require": "setuptools.dist:check_requirements", "use_2to3": "setuptools.dist:assert_bool", "extras_require": "setuptools.dist:check_extras", "packages": "setuptools.dist:check_packages", "exclude_package_data": "setuptools.dist:check_package_data", "use_2to3_exclude_fixers": "setuptools.dist:assert_string_list", "entry_points": "setuptools.dist:check_entry_points", "test_suite": "setuptools.dist:check_test_suite", "setup_requires": "setuptools.dist:check_requirements"}, "distutils.commands": {"easy_install": "setuptools.command.easy_install:easy_install", "egg_info": "setuptools.command.egg_info:egg_info", "bdist_egg": "setuptools.command.bdist_egg:bdist_egg", "install_lib": "setuptools.command.install_lib:install_lib", "install_scripts": "setuptools.command.install_scripts:install_scripts", "install_egg_info": "setuptools.command.install_egg_info:install_egg_info", "register": "setuptools.command.register:register", "alias": "setuptools.command.alias:alias", "test": "setuptools.command.test:test", "bdist_rpm": "setuptools.command.bdist_rpm:bdist_rpm", "upload_docs": "setuptools.command.upload_docs:upload_docs", "install": "setuptools.command.install:install", "build_py": "setuptools.command.build_py:build_py", "build_ext": "setuptools.command.build_ext:build_ext", "sdist": "setuptools.command.sdist:sdist", "bdist_wininst": "setuptools.command.bdist_wininst:bdist_wininst", "saveopts": "setuptools.command.saveopts:saveopts", "develop": "setuptools.command.develop:develop", "setopt": "setuptools.command.setopt:setopt", "rotate": "setuptools.command.rotate:rotate"}, "setuptools.file_finders": {"svn_cvs": "setuptools.command.sdist:_default_revctrl"}}, "version": "5.7", "generator": "bdist_wheel (0.22.0)", "keywords": "CPAN PyPI distutils eggs package management", "test_requires": [{"requires": ["setuptools[ssl]", "pytest"]}], "name": "setuptools", "project_urls": {"Home": "https://bitbucket.org/pypa/setuptools"}}

View file

@ -1,65 +0,0 @@
import distutils.command.install_lib as orig
import os
class install_lib(orig.install_lib):
"""Don't add compiled flags to filenames of non-Python files"""
def run(self):
self.build()
outfiles = self.install()
if outfiles is not None:
# always compile, in case we have any extension stubs to deal with
self.byte_compile(outfiles)
def get_exclusions(self):
exclude = {}
nsp = self.distribution.namespace_packages
svem = (nsp and self.get_finalized_command('install')
.single_version_externally_managed)
if svem:
for pkg in nsp:
parts = pkg.split('.')
while parts:
pkgdir = os.path.join(self.install_dir, *parts)
for f in '__init__.py', '__init__.pyc', '__init__.pyo':
exclude[os.path.join(pkgdir, f)] = 1
parts.pop()
return exclude
def copy_tree(
self, infile, outfile,
preserve_mode=1, preserve_times=1, preserve_symlinks=0, level=1
):
assert preserve_mode and preserve_times and not preserve_symlinks
exclude = self.get_exclusions()
if not exclude:
return orig.install_lib.copy_tree(self, infile, outfile)
# Exclude namespace package __init__.py* files from the output
from setuptools.archive_util import unpack_directory
from distutils import log
outfiles = []
def pf(src, dst):
if dst in exclude:
log.warn("Skipping installation of %s (namespace package)",
dst)
return False
log.info("copying %s -> %s", src, os.path.dirname(dst))
outfiles.append(dst)
return dst
unpack_directory(infile, outfile, pf)
return outfiles
def get_outputs(self):
outputs = orig.install_lib.get_outputs(self)
exclude = self.get_exclusions()
if exclude:
return [f for f in outputs if f not in exclude]
return outputs

View file

@ -1 +0,0 @@
__version__ = '5.7'

View file

@ -1,8 +0,0 @@
six.py,sha256=SHMbYUbZdimEb9Q7FBGMUtwaLCRC7qYQgXXSS_PHQss,26518
six-1.7.3.dist-info/METADATA,sha256=J5bgiEeMfq7UWjeoY8UJSndsDcCGQlaAVNrEkDHgE0g,1279
six-1.7.3.dist-info/top_level.txt,sha256=_iVH_iYEtEXnD8nYGQYpYFUvkUW9sEO1GYbkeKSAais,4
six-1.7.3.dist-info/DESCRIPTION.rst,sha256=wDIPS0rnIMXICM3qxqUg2g5ozzQyOpCLh8oaca9UgFQ,771
six-1.7.3.dist-info/metadata.json,sha256=BLxlZMZlGI6548ILR5YYtDZ_U35REqChoRN1SCjGy5M,621
six-1.7.3.dist-info/RECORD,,
six-1.7.3.dist-info/WHEEL,sha256=6lxp_S3wZGmTBtGMVmNNLyvKFcp7HqQw2Wn4YYk-Suo,110
six.pyc,,

View file

@ -1 +0,0 @@
{"version": "1.7.3", "license": "MIT", "metadata_version": "2.0", "project_urls": {"Home": "http://pypi.python.org/pypi/six/"}, "contacts": [{"email": "benjamin@python.org", "role": "author", "name": "Benjamin Peterson"}], "generator": "bdist_wheel (0.23.0)", "name": "six", "document_names": {"description": "DESCRIPTION.rst"}, "summary": "Python 2 and 3 compatibility utilities", "classifiers": ["Programming Language :: Python :: 2", "Programming Language :: Python :: 3", "Intended Audience :: Developers", "License :: OSI Approved :: MIT License", "Topic :: Software Development :: Libraries", "Topic :: Utilities"]}

View file

@ -1,159 +0,0 @@
../tornado/netutil.py
../tornado/httpclient.py
../tornado/web.py
../tornado/iostream.py
../tornado/simple_httpclient.py
../tornado/template.py
../tornado/websocket.py
../tornado/concurrent.py
../tornado/testing.py
../tornado/gen.py
../tornado/__init__.py
../tornado/locale.py
../tornado/auth.py
../tornado/http1connection.py
../tornado/httpserver.py
../tornado/stack_context.py
../tornado/httputil.py
../tornado/options.py
../tornado/tcpclient.py
../tornado/tcpserver.py
../tornado/wsgi.py
../tornado/curl_httpclient.py
../tornado/escape.py
../tornado/util.py
../tornado/log.py
../tornado/process.py
../tornado/ioloop.py
../tornado/autoreload.py
../tornado/test/template_test.py
../tornado/test/twisted_test.py
../tornado/test/escape_test.py
../tornado/test/import_test.py
../tornado/test/ioloop_test.py
../tornado/test/log_test.py
../tornado/test/util_test.py
../tornado/test/concurrent_test.py
../tornado/test/simple_httpclient_test.py
../tornado/test/__init__.py
../tornado/test/stack_context_test.py
../tornado/test/wsgi_test.py
../tornado/test/web_test.py
../tornado/test/gen_test.py
../tornado/test/websocket_test.py
../tornado/test/__main__.py
../tornado/test/resolve_test_helper.py
../tornado/test/locale_test.py
../tornado/test/testing_test.py
../tornado/test/tcpclient_test.py
../tornado/test/httpclient_test.py
../tornado/test/iostream_test.py
../tornado/test/options_test.py
../tornado/test/curl_httpclient_test.py
../tornado/test/runtests.py
../tornado/test/httpserver_test.py
../tornado/test/process_test.py
../tornado/test/httputil_test.py
../tornado/test/netutil_test.py
../tornado/test/util.py
../tornado/test/auth_test.py
../tornado/platform/interface.py
../tornado/platform/kqueue.py
../tornado/platform/__init__.py
../tornado/platform/windows.py
../tornado/platform/asyncio.py
../tornado/platform/twisted.py
../tornado/platform/posix.py
../tornado/platform/select.py
../tornado/platform/epoll.py
../tornado/platform/caresresolver.py
../tornado/platform/auto.py
../tornado/platform/common.py
../tornado/test/README
../tornado/test/csv_translations/fr_FR.csv
../tornado/test/gettext_translations/fr_FR/LC_MESSAGES/tornado_test.mo
../tornado/test/gettext_translations/fr_FR/LC_MESSAGES/tornado_test.po
../tornado/test/options_test.cfg
../tornado/test/static/robots.txt
../tornado/test/static/dir/index.html
../tornado/test/templates/utf8.html
../tornado/test/test.crt
../tornado/test/test.key
../tornado/netutil.pyc
../tornado/httpclient.pyc
../tornado/web.pyc
../tornado/iostream.pyc
../tornado/simple_httpclient.pyc
../tornado/template.pyc
../tornado/websocket.pyc
../tornado/concurrent.pyc
../tornado/testing.pyc
../tornado/gen.pyc
../tornado/__init__.pyc
../tornado/locale.pyc
../tornado/auth.pyc
../tornado/http1connection.pyc
../tornado/httpserver.pyc
../tornado/stack_context.pyc
../tornado/httputil.pyc
../tornado/options.pyc
../tornado/tcpclient.pyc
../tornado/tcpserver.pyc
../tornado/wsgi.pyc
../tornado/curl_httpclient.pyc
../tornado/escape.pyc
../tornado/util.pyc
../tornado/log.pyc
../tornado/process.pyc
../tornado/ioloop.pyc
../tornado/autoreload.pyc
../tornado/test/template_test.pyc
../tornado/test/twisted_test.pyc
../tornado/test/escape_test.pyc
../tornado/test/import_test.pyc
../tornado/test/ioloop_test.pyc
../tornado/test/log_test.pyc
../tornado/test/util_test.pyc
../tornado/test/concurrent_test.pyc
../tornado/test/simple_httpclient_test.pyc
../tornado/test/__init__.pyc
../tornado/test/stack_context_test.pyc
../tornado/test/wsgi_test.pyc
../tornado/test/web_test.pyc
../tornado/test/gen_test.pyc
../tornado/test/websocket_test.pyc
../tornado/test/__main__.pyc
../tornado/test/resolve_test_helper.pyc
../tornado/test/locale_test.pyc
../tornado/test/testing_test.pyc
../tornado/test/tcpclient_test.pyc
../tornado/test/httpclient_test.pyc
../tornado/test/iostream_test.pyc
../tornado/test/options_test.pyc
../tornado/test/curl_httpclient_test.pyc
../tornado/test/runtests.pyc
../tornado/test/httpserver_test.pyc
../tornado/test/process_test.pyc
../tornado/test/httputil_test.pyc
../tornado/test/netutil_test.pyc
../tornado/test/util.pyc
../tornado/test/auth_test.pyc
../tornado/platform/interface.pyc
../tornado/platform/kqueue.pyc
../tornado/platform/__init__.pyc
../tornado/platform/windows.pyc
../tornado/platform/asyncio.pyc
../tornado/platform/twisted.pyc
../tornado/platform/posix.pyc
../tornado/platform/select.pyc
../tornado/platform/epoll.pyc
../tornado/platform/caresresolver.pyc
../tornado/platform/auto.pyc
../tornado/platform/common.pyc
../tornado/speedups.so
./
dependency_links.txt
PKG-INFO
SOURCES.txt
top_level.txt
requires.txt

View file

@ -1,2 +0,0 @@
certifi
backports.ssl_match_hostname

View file

@ -1,21 +1,21 @@
Metadata-Version: 1.1
Name: pyPdf
Version: 1.13
Name: PyPDF2
Version: 1.23
Summary: PDF toolkit
Home-page: http://pybrary.net/pyPdf/
Author: Mathieu Fenniak
Author-email: biziqe@mathieu.fenniak.net
Home-page: http://mstamy2.github.com/PyPDF2
Author: Phaseit, Inc.
Author-email: PyPDF2@phaseit.net
License: UNKNOWN
Download-URL: http://pybrary.net/pyPdf/pyPdf-1.13.tar.gz
Description:
A Pure-Python library built as a PDF toolkit. It is capable of:
- extracting document information (title, author, ...),
- splitting documents page by page,
- merging documents page by page,
- cropping pages,
- merging multiple pages into a single page,
- encrypting and decrypting PDF files.
- extracting document information (title, author, ...)
- splitting documents page by page
- merging documents page by page
- cropping pages
- merging multiple pages into a single page
- encrypting and decrypting PDF files
- and more!
By being Pure-Python, it should run on any Python platform without any
dependencies on external libraries. It can also work entirely on StringIO
@ -26,6 +26,7 @@ Platform: UNKNOWN
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: BSD License
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 2
Classifier: Programming Language :: Python :: 3
Classifier: Operating System :: OS Independent
Classifier: Topic :: Software Development :: Libraries :: Python Modules

View file

@ -0,0 +1,15 @@
CHANGELOG
MANIFEST.in
PyPDF2/__init__.py
PyPDF2/_version.py
PyPDF2/filters.py
PyPDF2/generic.py
PyPDF2/merger.py
PyPDF2/pagerange.py
PyPDF2/pdf.py
PyPDF2/utils.py
PyPDF2/xmp.py
PyPDF2.egg-info/PKG-INFO
PyPDF2.egg-info/SOURCES.txt
PyPDF2.egg-info/dependency_links.txt
PyPDF2.egg-info/top_level.txt

View file

@ -0,0 +1,23 @@
../PyPDF2/xmp.py
../PyPDF2/utils.py
../PyPDF2/filters.py
../PyPDF2/__init__.py
../PyPDF2/_version.py
../PyPDF2/generic.py
../PyPDF2/pagerange.py
../PyPDF2/pdf.py
../PyPDF2/merger.py
../PyPDF2/__pycache__/xmp.cpython-34.pyc
../PyPDF2/__pycache__/utils.cpython-34.pyc
../PyPDF2/__pycache__/filters.cpython-34.pyc
../PyPDF2/__pycache__/__init__.cpython-34.pyc
../PyPDF2/__pycache__/_version.cpython-34.pyc
../PyPDF2/__pycache__/generic.cpython-34.pyc
../PyPDF2/__pycache__/pagerange.cpython-34.pyc
../PyPDF2/__pycache__/pdf.cpython-34.pyc
../PyPDF2/__pycache__/merger.cpython-34.pyc
./
dependency_links.txt
PKG-INFO
SOURCES.txt
top_level.txt

View file

@ -0,0 +1 @@
PyPDF2

View file

@ -0,0 +1,5 @@
from .pdf import PdfFileReader, PdfFileWriter
from .merger import PdfFileMerger
from .pagerange import PageRange, parse_filename_page_ranges
from ._version import __version__
__all__ = ["pdf", "PdfFileMerger"]

View file

@ -0,0 +1,2 @@
__version__ = '1.23'

View file

@ -1,252 +1,327 @@
# vim: sw=4:expandtab:foldmethod=marker
#
# Copyright (c) 2006, Mathieu Fenniak
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are
# met:
#
# * Redistributions of source code must retain the above copyright notice,
# this list of conditions and the following disclaimer.
# * Redistributions in binary form must reproduce the above copyright notice,
# this list of conditions and the following disclaimer in the documentation
# and/or other materials provided with the distribution.
# * The name of the author may not be used to endorse or promote products
# derived from this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
# AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
# ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
# LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
# CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
# SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
# INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
# CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
# ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
# POSSIBILITY OF SUCH DAMAGE.
"""
Implementation of stream filters for PDF.
"""
__author__ = "Mathieu Fenniak"
__author_email__ = "biziqe@mathieu.fenniak.net"
from utils import PdfReadError
try:
from cStringIO import StringIO
except ImportError:
from StringIO import StringIO
try:
import zlib
def decompress(data):
return zlib.decompress(data)
def compress(data):
return zlib.compress(data)
except ImportError:
# Unable to import zlib. Attempt to use the System.IO.Compression
# library from the .NET framework. (IronPython only)
import System
from System import IO, Collections, Array
def _string_to_bytearr(buf):
retval = Array.CreateInstance(System.Byte, len(buf))
for i in range(len(buf)):
retval[i] = ord(buf[i])
return retval
def _bytearr_to_string(bytes):
retval = ""
for i in range(bytes.Length):
retval += chr(bytes[i])
return retval
def _read_bytes(stream):
ms = IO.MemoryStream()
buf = Array.CreateInstance(System.Byte, 2048)
while True:
bytes = stream.Read(buf, 0, buf.Length)
if bytes == 0:
break
else:
ms.Write(buf, 0, bytes)
retval = ms.ToArray()
ms.Close()
return retval
def decompress(data):
bytes = _string_to_bytearr(data)
ms = IO.MemoryStream()
ms.Write(bytes, 0, bytes.Length)
ms.Position = 0 # fseek 0
gz = IO.Compression.DeflateStream(ms, IO.Compression.CompressionMode.Decompress)
bytes = _read_bytes(gz)
retval = _bytearr_to_string(bytes)
gz.Close()
return retval
def compress(data):
bytes = _string_to_bytearr(data)
ms = IO.MemoryStream()
gz = IO.Compression.DeflateStream(ms, IO.Compression.CompressionMode.Compress, True)
gz.Write(bytes, 0, bytes.Length)
gz.Close()
ms.Position = 0 # fseek 0
bytes = ms.ToArray()
retval = _bytearr_to_string(bytes)
ms.Close()
return retval
class FlateDecode(object):
def decode(data, decodeParms):
data = decompress(data)
predictor = 1
if decodeParms:
predictor = decodeParms.get("/Predictor", 1)
# predictor 1 == no predictor
if predictor != 1:
columns = decodeParms["/Columns"]
# PNG prediction:
if predictor >= 10 and predictor <= 15:
output = StringIO()
# PNG prediction can vary from row to row
rowlength = columns + 1
assert len(data) % rowlength == 0
prev_rowdata = (0,) * rowlength
for row in xrange(len(data) / rowlength):
rowdata = [ord(x) for x in data[(row*rowlength):((row+1)*rowlength)]]
filterByte = rowdata[0]
if filterByte == 0:
pass
elif filterByte == 1:
for i in range(2, rowlength):
rowdata[i] = (rowdata[i] + rowdata[i-1]) % 256
elif filterByte == 2:
for i in range(1, rowlength):
rowdata[i] = (rowdata[i] + prev_rowdata[i]) % 256
else:
# unsupported PNG filter
raise PdfReadError("Unsupported PNG filter %r" % filterByte)
prev_rowdata = rowdata
output.write(''.join([chr(x) for x in rowdata[1:]]))
data = output.getvalue()
else:
# unsupported predictor
raise PdfReadError("Unsupported flatedecode predictor %r" % predictor)
return data
decode = staticmethod(decode)
def encode(data):
return compress(data)
encode = staticmethod(encode)
class ASCIIHexDecode(object):
def decode(data, decodeParms=None):
retval = ""
char = ""
x = 0
while True:
c = data[x]
if c == ">":
break
elif c.isspace():
x += 1
continue
char += c
if len(char) == 2:
retval += chr(int(char, base=16))
char = ""
x += 1
assert char == ""
return retval
decode = staticmethod(decode)
class ASCII85Decode(object):
def decode(data, decodeParms=None):
retval = ""
group = []
x = 0
hitEod = False
# remove all whitespace from data
data = [y for y in data if not (y in ' \n\r\t')]
while not hitEod:
c = data[x]
if len(retval) == 0 and c == "<" and data[x+1] == "~":
x += 2
continue
#elif c.isspace():
# x += 1
# continue
elif c == 'z':
assert len(group) == 0
retval += '\x00\x00\x00\x00'
continue
elif c == "~" and data[x+1] == ">":
if len(group) != 0:
# cannot have a final group of just 1 char
assert len(group) > 1
cnt = len(group) - 1
group += [ 85, 85, 85 ]
hitEod = cnt
else:
break
else:
c = ord(c) - 33
assert c >= 0 and c < 85
group += [ c ]
if len(group) >= 5:
b = group[0] * (85**4) + \
group[1] * (85**3) + \
group[2] * (85**2) + \
group[3] * 85 + \
group[4]
assert b < (2**32 - 1)
c4 = chr((b >> 0) % 256)
c3 = chr((b >> 8) % 256)
c2 = chr((b >> 16) % 256)
c1 = chr(b >> 24)
retval += (c1 + c2 + c3 + c4)
if hitEod:
retval = retval[:-4+hitEod]
group = []
x += 1
return retval
decode = staticmethod(decode)
def decodeStreamData(stream):
from generic import NameObject
filters = stream.get("/Filter", ())
if len(filters) and not isinstance(filters[0], NameObject):
# we have a single filter instance
filters = (filters,)
data = stream._data
for filterType in filters:
if filterType == "/FlateDecode":
data = FlateDecode.decode(data, stream.get("/DecodeParms"))
elif filterType == "/ASCIIHexDecode":
data = ASCIIHexDecode.decode(data)
elif filterType == "/ASCII85Decode":
data = ASCII85Decode.decode(data)
elif filterType == "/Crypt":
decodeParams = stream.get("/DecodeParams", {})
if "/Name" not in decodeParams and "/Type" not in decodeParams:
pass
else:
raise NotImplementedError("/Crypt filter with /Name or /Type not supported yet")
else:
# unsupported filter
raise NotImplementedError("unsupported filter %s" % filterType)
return data
if __name__ == "__main__":
assert "abc" == ASCIIHexDecode.decode('61\n626\n3>')
ascii85Test = """
<~9jqo^BlbD-BleB1DJ+*+F(f,q/0JhKF<GL>Cj@.4Gp$d7F!,L7@<6@)/0JDEF<G%<+EV:2F!,
O<DJ+*.@<*K0@<6L(Df-\\0Ec5e;DffZ(EZee.Bl.9pF"AGXBPCsi+DGm>@3BB/F*&OCAfu2/AKY
i(DIb:@FD,*)+C]U=@3BN#EcYf8ATD3s@q?d$AftVqCh[NqF<G:8+EV:.+Cf>-FD5W8ARlolDIa
l(DId<j@<?3r@:F%a+D58'ATD4$Bl@l3De:,-DJs`8ARoFb/0JMK@qB4^F!,R<AKZ&-DfTqBG%G
>uD.RTpAKYo'+CT/5+Cei#DII?(E,9)oF*2M7/c~>
"""
ascii85_originalText="Man is distinguished, not only by his reason, but by this singular passion from other animals, which is a lust of the mind, that by a perseverance of delight in the continued and indefatigable generation of knowledge, exceeds the short vehemence of any carnal pleasure."
assert ASCII85Decode.decode(ascii85Test) == ascii85_originalText
# vim: sw=4:expandtab:foldmethod=marker
#
# Copyright (c) 2006, Mathieu Fenniak
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are
# met:
#
# * Redistributions of source code must retain the above copyright notice,
# this list of conditions and the following disclaimer.
# * Redistributions in binary form must reproduce the above copyright notice,
# this list of conditions and the following disclaimer in the documentation
# and/or other materials provided with the distribution.
# * The name of the author may not be used to endorse or promote products
# derived from this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
# AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
# ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
# LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
# CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
# SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
# INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
# CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
# ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
# POSSIBILITY OF SUCH DAMAGE.
"""
Implementation of stream filters for PDF.
"""
__author__ = "Mathieu Fenniak"
__author_email__ = "biziqe@mathieu.fenniak.net"
from .utils import PdfReadError, ord_, chr_
from sys import version_info
if version_info < ( 3, 0 ):
from cStringIO import StringIO
else:
from io import StringIO
try:
import zlib
def decompress(data):
return zlib.decompress(data)
def compress(data):
return zlib.compress(data)
except ImportError:
# Unable to import zlib. Attempt to use the System.IO.Compression
# library from the .NET framework. (IronPython only)
import System
from System import IO, Collections, Array
def _string_to_bytearr(buf):
retval = Array.CreateInstance(System.Byte, len(buf))
for i in range(len(buf)):
retval[i] = ord(buf[i])
return retval
def _bytearr_to_string(bytes):
retval = ""
for i in range(bytes.Length):
retval += chr(bytes[i])
return retval
def _read_bytes(stream):
ms = IO.MemoryStream()
buf = Array.CreateInstance(System.Byte, 2048)
while True:
bytes = stream.Read(buf, 0, buf.Length)
if bytes == 0:
break
else:
ms.Write(buf, 0, bytes)
retval = ms.ToArray()
ms.Close()
return retval
def decompress(data):
bytes = _string_to_bytearr(data)
ms = IO.MemoryStream()
ms.Write(bytes, 0, bytes.Length)
ms.Position = 0 # fseek 0
gz = IO.Compression.DeflateStream(ms, IO.Compression.CompressionMode.Decompress)
bytes = _read_bytes(gz)
retval = _bytearr_to_string(bytes)
gz.Close()
return retval
def compress(data):
bytes = _string_to_bytearr(data)
ms = IO.MemoryStream()
gz = IO.Compression.DeflateStream(ms, IO.Compression.CompressionMode.Compress, True)
gz.Write(bytes, 0, bytes.Length)
gz.Close()
ms.Position = 0 # fseek 0
bytes = ms.ToArray()
retval = _bytearr_to_string(bytes)
ms.Close()
return retval
class FlateDecode(object):
def decode(data, decodeParms):
data = decompress(data)
predictor = 1
if decodeParms:
try:
predictor = decodeParms.get("/Predictor", 1)
except AttributeError:
pass # usually an array with a null object was read
# predictor 1 == no predictor
if predictor != 1:
columns = decodeParms["/Columns"]
# PNG prediction:
if predictor >= 10 and predictor <= 15:
output = StringIO()
# PNG prediction can vary from row to row
rowlength = columns + 1
assert len(data) % rowlength == 0
prev_rowdata = (0,) * rowlength
for row in range(len(data) // rowlength):
rowdata = [ord_(x) for x in data[(row*rowlength):((row+1)*rowlength)]]
filterByte = rowdata[0]
if filterByte == 0:
pass
elif filterByte == 1:
for i in range(2, rowlength):
rowdata[i] = (rowdata[i] + rowdata[i-1]) % 256
elif filterByte == 2:
for i in range(1, rowlength):
rowdata[i] = (rowdata[i] + prev_rowdata[i]) % 256
else:
# unsupported PNG filter
raise PdfReadError("Unsupported PNG filter %r" % filterByte)
prev_rowdata = rowdata
output.write(''.join([chr(x) for x in rowdata[1:]]))
data = output.getvalue()
else:
# unsupported predictor
raise PdfReadError("Unsupported flatedecode predictor %r" % predictor)
return data
decode = staticmethod(decode)
def encode(data):
return compress(data)
encode = staticmethod(encode)
class ASCIIHexDecode(object):
def decode(data, decodeParms=None):
retval = ""
char = ""
x = 0
while True:
c = data[x]
if c == ">":
break
elif c.isspace():
x += 1
continue
char += c
if len(char) == 2:
retval += chr(int(char, base=16))
char = ""
x += 1
assert char == ""
return retval
decode = staticmethod(decode)
class LZWDecode(object):
"""Taken from:
http://www.java2s.com/Open-Source/Java-Document/PDF/PDF-Renderer/com/sun/pdfview/decode/LZWDecode.java.htm
"""
class decoder(object):
def __init__(self, data):
self.STOP=257
self.CLEARDICT=256
self.data=data
self.bytepos=0
self.bitpos=0
self.dict=[""]*4096
for i in range(256):
self.dict[i]=chr(i)
self.resetDict()
def resetDict(self):
self.dictlen=258
self.bitspercode=9
def nextCode(self):
fillbits=self.bitspercode
value=0
while fillbits>0 :
if self.bytepos >= len(self.data):
return -1
nextbits=ord(self.data[self.bytepos])
bitsfromhere=8-self.bitpos
if bitsfromhere>fillbits:
bitsfromhere=fillbits
value |= (((nextbits >> (8-self.bitpos-bitsfromhere)) &
(0xff >> (8-bitsfromhere))) <<
(fillbits-bitsfromhere))
fillbits -= bitsfromhere
self.bitpos += bitsfromhere
if self.bitpos >=8:
self.bitpos=0
self.bytepos = self.bytepos+1
return value
def decode(self):
""" algorithm derived from:
http://www.rasip.fer.hr/research/compress/algorithms/fund/lz/lzw.html
and the PDFReference
"""
cW = self.CLEARDICT;
baos=""
while True:
pW = cW;
cW = self.nextCode();
if cW == -1:
raise PdfReadError("Missed the stop code in LZWDecode!")
if cW == self.STOP:
break;
elif cW == self.CLEARDICT:
self.resetDict();
elif pW == self.CLEARDICT:
baos+=self.dict[cW]
else:
if cW < self.dictlen:
baos += self.dict[cW]
p=self.dict[pW]+self.dict[cW][0]
self.dict[self.dictlen]=p
self.dictlen+=1
else:
p=self.dict[pW]+self.dict[pW][0]
baos+=p
self.dict[self.dictlen] = p;
self.dictlen+=1
if (self.dictlen >= (1 << self.bitspercode) - 1 and
self.bitspercode < 12):
self.bitspercode+=1
return baos
@staticmethod
def decode(data,decodeParams=None):
return LZWDecode.decoder(data).decode()
class ASCII85Decode(object):
def decode(data, decodeParms=None):
retval = ""
group = []
x = 0
hitEod = False
# remove all whitespace from data
data = [y for y in data if not (y in ' \n\r\t')]
while not hitEod:
c = data[x]
if len(retval) == 0 and c == "<" and data[x+1] == "~":
x += 2
continue
#elif c.isspace():
# x += 1
# continue
elif c == 'z':
assert len(group) == 0
retval += '\x00\x00\x00\x00'
x += 1
continue
elif c == "~" and data[x+1] == ">":
if len(group) != 0:
# cannot have a final group of just 1 char
assert len(group) > 1
cnt = len(group) - 1
group += [ 85, 85, 85 ]
hitEod = cnt
else:
break
else:
c = ord(c) - 33
assert c >= 0 and c < 85
group += [ c ]
if len(group) >= 5:
b = group[0] * (85**4) + \
group[1] * (85**3) + \
group[2] * (85**2) + \
group[3] * 85 + \
group[4]
assert b < (2**32 - 1)
c4 = chr((b >> 0) % 256)
c3 = chr((b >> 8) % 256)
c2 = chr((b >> 16) % 256)
c1 = chr(b >> 24)
retval += (c1 + c2 + c3 + c4)
if hitEod:
retval = retval[:-4+hitEod]
group = []
x += 1
return retval
decode = staticmethod(decode)
def decodeStreamData(stream):
from .generic import NameObject
filters = stream.get("/Filter", ())
if len(filters) and not isinstance(filters[0], NameObject):
# we have a single filter instance
filters = (filters,)
data = stream._data
for filterType in filters:
if filterType == "/FlateDecode":
data = FlateDecode.decode(data, stream.get("/DecodeParms"))
elif filterType == "/ASCIIHexDecode":
data = ASCIIHexDecode.decode(data)
elif filterType == "/LZWDecode":
data = LZWDecode.decode(data, stream.get("/DecodeParms"))
elif filterType == "/ASCII85Decode":
data = ASCII85Decode.decode(data)
elif filterType == "/Crypt":
decodeParams = stream.get("/DecodeParams", {})
if "/Name" not in decodeParams and "/Type" not in decodeParams:
pass
else:
raise NotImplementedError("/Crypt filter with /Name or /Type not supported yet")
else:
# unsupported filter
raise NotImplementedError("unsupported filter %s" % filterType)
return data

File diff suppressed because it is too large Load diff

View file

@ -0,0 +1,554 @@
# vim: sw=4:expandtab:foldmethod=marker
#
# Copyright (c) 2006, Mathieu Fenniak
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are
# met:
#
# * Redistributions of source code must retain the above copyright notice,
# this list of conditions and the following disclaimer.
# * Redistributions in binary form must reproduce the above copyright notice,
# this list of conditions and the following disclaimer in the documentation
# and/or other materials provided with the distribution.
# * The name of the author may not be used to endorse or promote products
# derived from this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
# AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
# ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
# LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
# CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
# SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
# INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
# CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
# ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
# POSSIBILITY OF SUCH DAMAGE.
from .generic import *
from .utils import string_type
from .pdf import PdfFileReader, PdfFileWriter
from .pagerange import PageRange
from sys import version_info
if version_info < ( 3, 0 ):
from cStringIO import StringIO
StreamIO = StringIO
else:
from io import BytesIO
from io import FileIO as file
StreamIO = BytesIO
class _MergedPage(object):
"""
_MergedPage is used internally by PdfFileMerger to collect necessary
information on each page that is being merged.
"""
def __init__(self, pagedata, src, id):
self.src = src
self.pagedata = pagedata
self.out_pagedata = None
self.id = id
class PdfFileMerger(object):
"""
Initializes a PdfFileMerger object. PdfFileMerger merges multiple PDFs
into a single PDF. It can concatenate, slice, insert, or any combination
of the above.
See the functions :meth:`merge()<merge>` (or :meth:`append()<append>`)
and :meth:`write()<write>` for usage information.
:param bool strict: Determines whether user should be warned of all
problems and also causes some correctable problems to be fatal.
Defaults to ``True``.
"""
def __init__(self, strict=True):
self.inputs = []
self.pages = []
self.output = PdfFileWriter()
self.bookmarks = []
self.named_dests = []
self.id_count = 0
self.strict = strict
def merge(self, position, fileobj, bookmark=None, pages=None, import_bookmarks=True):
"""
Merges the pages from the given file into the output file at the
specified page number.
:param int position: The *page number* to insert this file. File will
be inserted after the given number.
:param fileobj: A File Object or an object that supports the standard read
and seek methods similar to a File Object. Could also be a
string representing a path to a PDF file.
:param str bookmark: Optionally, you may specify a bookmark to be applied at
the beginning of the included file by supplying the text of the bookmark.
:param pages: can be a :ref:`Page Range <page-range>` or a ``(start, stop[, step])`` tuple
to merge only the specified range of pages from the source
document into the output document.
:param bool import_bookmarks: You may prevent the source document's bookmarks
from being imported by specifying this as ``False``.
"""
# This parameter is passed to self.inputs.append and means
# that the stream used was created in this method.
my_file = False
# If the fileobj parameter is a string, assume it is a path
# and create a file object at that location. If it is a file,
# copy the file's contents into a BytesIO (or StreamIO) stream object; if
# it is a PdfFileReader, copy that reader's stream into a
# BytesIO (or StreamIO) stream.
# If fileobj is none of the above types, it is not modified
if type(fileobj) == string_type:
fileobj = file(fileobj, 'rb')
my_file = True
elif isinstance(fileobj, file):
fileobj.seek(0)
filecontent = fileobj.read()
fileobj = StreamIO(filecontent)
my_file = True
elif isinstance(fileobj, PdfFileReader):
orig_tell = fileobj.stream.tell()
fileobj.stream.seek(0)
filecontent = StreamIO(fileobj.stream.read())
fileobj.stream.seek(orig_tell) # reset the stream to its original location
fileobj = filecontent
my_file = True
# Create a new PdfFileReader instance using the stream
# (either file or BytesIO or StringIO) created above
pdfr = PdfFileReader(fileobj, strict=self.strict)
# Find the range of pages to merge.
if pages == None:
pages = (0, pdfr.getNumPages())
elif isinstance(pages, PageRange):
pages = pages.indices(pdfr.getNumPages())
elif not isinstance(pages, tuple):
raise TypeError('"pages" must be a tuple of (start, stop[, step])')
srcpages = []
if bookmark:
bookmark = Bookmark(TextStringObject(bookmark), NumberObject(self.id_count), NameObject('/Fit'))
outline = []
if import_bookmarks:
outline = pdfr.getOutlines()
outline = self._trim_outline(pdfr, outline, pages)
if bookmark:
self.bookmarks += [bookmark, outline]
else:
self.bookmarks += outline
dests = pdfr.namedDestinations
dests = self._trim_dests(pdfr, dests, pages)
self.named_dests += dests
# Gather all the pages that are going to be merged
for i in range(*pages):
pg = pdfr.getPage(i)
id = self.id_count
self.id_count += 1
mp = _MergedPage(pg, pdfr, id)
srcpages.append(mp)
self._associate_dests_to_pages(srcpages)
self._associate_bookmarks_to_pages(srcpages)
# Slice to insert the pages at the specified position
self.pages[position:position] = srcpages
# Keep track of our input files so we can close them later
self.inputs.append((fileobj, pdfr, my_file))
def append(self, fileobj, bookmark=None, pages=None, import_bookmarks=True):
"""
Identical to the :meth:`merge()<merge>` method, but assumes you want to concatenate
all pages onto the end of the file instead of specifying a position.
:param fileobj: A File Object or an object that supports the standard read
and seek methods similar to a File Object. Could also be a
string representing a path to a PDF file.
:param str bookmark: Optionally, you may specify a bookmark to be applied at
the beginning of the included file by supplying the text of the bookmark.
:param pages: can be a :ref:`Page Range <page-range>` or a ``(start, stop[, step])`` tuple
to merge only the specified range of pages from the source
document into the output document.
:param bool import_bookmarks: You may prevent the source document's bookmarks
from being imported by specifying this as ``False``.
"""
self.merge(len(self.pages), fileobj, bookmark, pages, import_bookmarks)
def write(self, fileobj):
"""
Writes all data that has been merged to the given output file.
:param fileobj: Output file. Can be a filename or any kind of
file-like object.
"""
my_file = False
if type(fileobj) in (str, str):
fileobj = file(fileobj, 'wb')
my_file = True
# Add pages to the PdfFileWriter
# The commented out line below was replaced with the two lines below it to allow PdfFileMerger to work with PyPdf 1.13
for page in self.pages:
self.output.addPage(page.pagedata)
page.out_pagedata = self.output.getReference(self.output._pages.getObject()["/Kids"][-1].getObject())
#idnum = self.output._objects.index(self.output._pages.getObject()["/Kids"][-1].getObject()) + 1
#page.out_pagedata = IndirectObject(idnum, 0, self.output)
# Once all pages are added, create bookmarks to point at those pages
self._write_dests()
self._write_bookmarks()
# Write the output to the file
self.output.write(fileobj)
if my_file:
fileobj.close()
def close(self):
"""
Shuts all file descriptors (input and output) and clears all memory
usage.
"""
self.pages = []
for fo, pdfr, mine in self.inputs:
if mine:
fo.close()
self.inputs = []
self.output = None
def addMetadata(self, infos):
"""
Add custom metadata to the output.
:param dict infos: a Python dictionary where each key is a field
and each value is your new metadata.
Example: ``{u'/Title': u'My title'}``
"""
self.output.addMetadata(infos)
def setPageLayout(self, layout):
"""
Set the page layout
:param str layout: The page layout to be used
Valid layouts are:
/NoLayout Layout explicitly not specified
/SinglePage Show one page at a time
/OneColumn Show one column at a time
/TwoColumnLeft Show pages in two columns, odd-numbered pages on the left
/TwoColumnRight Show pages in two columns, odd-numbered pages on the right
/TwoPageLeft Show two pages at a time, odd-numbered pages on the left
/TwoPageRight Show two pages at a time, odd-numbered pages on the right
"""
self.output.setPageLayout(layout)
def setPageMode(self, mode):
"""
Set the page mode.
:param str mode: The page mode to use.
Valid modes are:
/UseNone Do not show outlines or thumbnails panels
/UseOutlines Show outlines (aka bookmarks) panel
/UseThumbs Show page thumbnails panel
/FullScreen Fullscreen view
/UseOC Show Optional Content Group (OCG) panel
/UseAttachments Show attachments panel
"""
self.output.setPageMode(mode)
def _trim_dests(self, pdf, dests, pages):
"""
Removes any named destinations that are not a part of the specified
page set.
"""
new_dests = []
prev_header_added = True
for k, o in list(dests.items()):
for j in range(*pages):
if pdf.getPage(j).getObject() == o['/Page'].getObject():
o[NameObject('/Page')] = o['/Page'].getObject()
assert str(k) == str(o['/Title'])
new_dests.append(o)
break
return new_dests
def _trim_outline(self, pdf, outline, pages):
"""
Removes any outline/bookmark entries that are not a part of the
specified page set.
"""
new_outline = []
prev_header_added = True
for i, o in enumerate(outline):
if isinstance(o, list):
sub = self._trim_outline(pdf, o, pages)
if sub:
if not prev_header_added:
new_outline.append(outline[i-1])
new_outline.append(sub)
else:
prev_header_added = False
for j in range(*pages):
if pdf.getPage(j).getObject() == o['/Page'].getObject():
o[NameObject('/Page')] = o['/Page'].getObject()
new_outline.append(o)
prev_header_added = True
break
return new_outline
def _write_dests(self):
dests = self.named_dests
for v in dests:
pageno = None
pdf = None
if '/Page' in v:
for i, p in enumerate(self.pages):
if p.id == v['/Page']:
v[NameObject('/Page')] = p.out_pagedata
pageno = i
pdf = p.src
break
if pageno != None:
self.output.addNamedDestinationObject(v)
def _write_bookmarks(self, bookmarks=None, parent=None):
if bookmarks == None:
bookmarks = self.bookmarks
last_added = None
for b in bookmarks:
if isinstance(b, list):
self._write_bookmarks(b, last_added)
continue
pageno = None
pdf = None
if '/Page' in b:
for i, p in enumerate(self.pages):
if p.id == b['/Page']:
#b[NameObject('/Page')] = p.out_pagedata
args = [NumberObject(p.id), NameObject(b['/Type'])]
#nothing more to add
#if b['/Type'] == '/Fit' or b['/Type'] == '/FitB'
if b['/Type'] == '/FitH' or b['/Type'] == '/FitBH':
if '/Top' in b and not isinstance(b['/Top'], NullObject):
args.append(FloatObject(b['/Top']))
else:
args.append(FloatObject(0))
del b['/Top']
elif b['/Type'] == '/FitV' or b['/Type'] == '/FitBV':
if '/Left' in b and not isinstance(b['/Left'], NullObject):
args.append(FloatObject(b['/Left']))
else:
args.append(FloatObject(0))
del b['/Left']
elif b['/Type'] == '/XYZ':
if '/Left' in b and not isinstance(b['/Left'], NullObject):
args.append(FloatObject(b['/Left']))
else:
args.append(FloatObject(0))
if '/Top' in b and not isinstance(b['/Top'], NullObject):
args.append(FloatObject(b['/Top']))
else:
args.append(FloatObject(0))
if '/Zoom' in b and not isinstance(b['/Zoom'], NullObject):
args.append(FloatObject(b['/Zoom']))
else:
args.append(FloatObject(0))
del b['/Top'], b['/Zoom'], b['/Left']
elif b['/Type'] == '/FitR':
if '/Left' in b and not isinstance(b['/Left'], NullObject):
args.append(FloatObject(b['/Left']))
else:
args.append(FloatObject(0))
if '/Bottom' in b and not isinstance(b['/Bottom'], NullObject):
args.append(FloatObject(b['/Bottom']))
else:
args.append(FloatObject(0))
if '/Right' in b and not isinstance(b['/Right'], NullObject):
args.append(FloatObject(b['/Right']))
else:
args.append(FloatObject(0))
if '/Top' in b and not isinstance(b['/Top'], NullObject):
args.append(FloatObject(b['/Top']))
else:
args.append(FloatObject(0))
del b['/Left'], b['/Right'], b['/Bottom'], b['/Top']
b[NameObject('/A')] = DictionaryObject({NameObject('/S'): NameObject('/GoTo'), NameObject('/D'): ArrayObject(args)})
pageno = i
pdf = p.src
break
if pageno != None:
del b['/Page'], b['/Type']
last_added = self.output.addBookmarkDict(b, parent)
def _associate_dests_to_pages(self, pages):
for nd in self.named_dests:
pageno = None
np = nd['/Page']
if isinstance(np, NumberObject):
continue
for p in pages:
if np.getObject() == p.pagedata.getObject():
pageno = p.id
if pageno != None:
nd[NameObject('/Page')] = NumberObject(pageno)
else:
raise ValueError("Unresolved named destination '%s'" % (nd['/Title'],))
def _associate_bookmarks_to_pages(self, pages, bookmarks=None):
if bookmarks == None:
bookmarks = self.bookmarks
for b in bookmarks:
if isinstance(b, list):
self._associate_bookmarks_to_pages(pages, b)
continue
pageno = None
bp = b['/Page']
if isinstance(bp, NumberObject):
continue
for p in pages:
if bp.getObject() == p.pagedata.getObject():
pageno = p.id
if pageno != None:
b[NameObject('/Page')] = NumberObject(pageno)
else:
raise ValueError("Unresolved bookmark '%s'" % (b['/Title'],))
def findBookmark(self, bookmark, root=None):
if root == None:
root = self.bookmarks
for i, b in enumerate(root):
if isinstance(b, list):
res = self.findBookmark(bookmark, b)
if res:
return [i] + res
elif b == bookmark or b['/Title'] == bookmark:
return [i]
return None
def addBookmark(self, title, pagenum, parent=None):
"""
Add a bookmark to this PDF file.
:param str title: Title to use for this bookmark.
:param int pagenum: Page number this bookmark will point to.
:param parent: A reference to a parent bookmark to create nested
bookmarks.
"""
if parent == None:
iloc = [len(self.bookmarks)-1]
elif isinstance(parent, list):
iloc = parent
else:
iloc = self.findBookmark(parent)
dest = Bookmark(TextStringObject(title), NumberObject(pagenum), NameObject('/FitH'), NumberObject(826))
if parent == None:
self.bookmarks.append(dest)
else:
bmparent = self.bookmarks
for i in iloc[:-1]:
bmparent = bmparent[i]
npos = iloc[-1]+1
if npos < len(bmparent) and isinstance(bmparent[npos], list):
bmparent[npos].append(dest)
else:
bmparent.insert(npos, [dest])
return dest
def addNamedDestination(self, title, pagenum):
"""
Add a destination to the output.
:param str title: Title to use
:param int pagenum: Page number this destination points at.
"""
dest = Destination(TextStringObject(title), NumberObject(pagenum), NameObject('/FitH'), NumberObject(826))
self.named_dests.append(dest)
class OutlinesObject(list):
def __init__(self, pdf, tree, parent=None):
list.__init__(self)
self.tree = tree
self.pdf = pdf
self.parent = parent
def remove(self, index):
obj = self[index]
del self[index]
self.tree.removeChild(obj)
def add(self, title, pagenum):
pageRef = self.pdf.getObject(self.pdf._pages)['/Kids'][pagenum]
action = DictionaryObject()
action.update({
NameObject('/D') : ArrayObject([pageRef, NameObject('/FitH'), NumberObject(826)]),
NameObject('/S') : NameObject('/GoTo')
})
actionRef = self.pdf._addObject(action)
bookmark = TreeObject()
bookmark.update({
NameObject('/A'): actionRef,
NameObject('/Title'): createStringObject(title),
})
self.pdf._addObject(bookmark)
self.tree.addChild(bookmark)
def removeAll(self):
for child in [x for x in self.tree.children()]:
self.tree.removeChild(child)
self.pop()

View file

@ -0,0 +1,152 @@
#!/usr/bin/env python
"""
Representation and utils for ranges of PDF file pages.
Copyright (c) 2014, Steve Witham <switham_github@mac-guyver.com>.
All rights reserved. This software is available under a BSD license;
see https://github.com/mstamy2/PyPDF2/blob/master/LICENSE
"""
import re
from .utils import Str
_INT_RE = r"(0|-?[1-9]\d*)" # A decimal int, don't allow "-0".
PAGE_RANGE_RE = "^({int}|({int}?(:{int}?(:{int}?)?)))$".format(int=_INT_RE)
# groups: 12 34 5 6 7 8
class ParseError(Exception):
pass
PAGE_RANGE_HELP = """Remember, page indices start with zero.
Page range expression examples:
: all pages. -1 last page.
22 just the 23rd page. :-1 all but the last page.
0:3 the first three pages. -2 second-to-last page.
:3 the first three pages. -2: last two pages.
5: from the sixth page onward. -3:-1 third & second to last.
The third, "stride" or "step" number is also recognized.
::2 0 2 4 ... to the end. 3:0:-1 3 2 1 but not 0.
1:10:2 1 3 5 7 9 2::-1 2 1 0.
::-1 all pages in reverse order.
"""
class PageRange(object):
"""
A slice-like representation of a range of page indices,
i.e. page numbers, only starting at zero.
The syntax is like what you would put between brackets [ ].
The slice is one of the few Python types that can't be subclassed,
but this class converts to and from slices, and allows similar use.
o PageRange(str) parses a string representing a page range.
o PageRange(slice) directly "imports" a slice.
o to_slice() gives the equivalent slice.
o str() and repr() allow printing.
o indices(n) is like slice.indices(n).
"""
def __init__(self, arg):
"""
Initialize with either a slice -- giving the equivalent page range,
or a PageRange object -- making a copy,
or a string like
"int", "[int]:[int]" or "[int]:[int]:[int]",
where the brackets indicate optional ints.
{page_range_help}
Note the difference between this notation and arguments to slice():
slice(3) means the first three pages;
PageRange("3") means the range of only the fourth page.
However PageRange(slice(3)) means the first three pages.
"""
if isinstance(arg, slice):
self._slice = arg
return
if isinstance(arg, PageRange):
self._slice = arg.to_slice()
return
m = isinstance(arg, Str) and re.match(PAGE_RANGE_RE, arg)
if not m:
raise ParseError(arg)
elif m.group(2):
# Special case: just an int means a range of one page.
start = int(m.group(2))
stop = start + 1 if start != -1 else None
self._slice = slice(start, stop)
else:
self._slice = slice(*[int(g) if g else None
for g in m.group(4, 6, 8)])
# Just formatting this when there is __doc__ for __init__
if __init__.__doc__:
__init__.__doc__ = __init__.__doc__.format(page_range_help=PAGE_RANGE_HELP)
@staticmethod
def valid(input):
""" True if input is a valid initializer for a PageRange. """
return isinstance(input, slice) or \
isinstance(input, PageRange) or \
(isinstance(input, Str)
and bool(re.match(PAGE_RANGE_RE, input)))
def to_slice(self):
""" Return the slice equivalent of this page range. """
return self._slice
def __str__(self):
""" A string like "1:2:3". """
s = self._slice
if s.step == None:
if s.start != None and s.stop == s.start + 1:
return str(s.start)
indices = s.start, s.stop
else:
indices = s.start, s.stop, s.step
return ':'.join("" if i == None else str(i) for i in indices)
def __repr__(self):
""" A string like "PageRange('1:2:3')". """
return "PageRange(" + repr(str(self)) + ")"
def indices(self, n):
"""
n is the length of the list of pages to choose from.
Returns arguments for range(). See help(slice.indices).
"""
return self._slice.indices(n)
PAGE_RANGE_ALL = PageRange(":") # The range of all pages.
def parse_filename_page_ranges(args):
"""
Given a list of filenames and page ranges, return a list of
(filename, page_range) pairs.
First arg must be a filename; other ags are filenames, page-range
expressions, slice objects, or PageRange objects.
A filename not followed by a page range indicates all pages of the file.
"""
pairs = []
pdf_filename = None
did_page_range = False
for arg in args + [None]:
if PageRange.valid(arg):
if not pdf_filename:
raise ValueError("The first argument must be a filename, " \
"not a page range.")
pairs.append( (pdf_filename, PageRange(arg)) )
did_page_range = True
else:
# New filename or end of list--do all of the previous file?
if pdf_filename and not did_page_range:
pairs.append( (pdf_filename, PAGE_RANGE_ALL) )
pdf_filename = arg
did_page_range = False
return pairs

File diff suppressed because it is too large Load diff

View file

@ -0,0 +1,263 @@
# Copyright (c) 2006, Mathieu Fenniak
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are
# met:
#
# * Redistributions of source code must retain the above copyright notice,
# this list of conditions and the following disclaimer.
# * Redistributions in binary form must reproduce the above copyright notice,
# this list of conditions and the following disclaimer in the documentation
# and/or other materials provided with the distribution.
# * The name of the author may not be used to endorse or promote products
# derived from this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
# AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
# ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
# LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
# CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
# SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
# INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
# CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
# ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
# POSSIBILITY OF SUCH DAMAGE.
"""
Utility functions for PDF library.
"""
__author__ = "Mathieu Fenniak"
__author_email__ = "biziqe@mathieu.fenniak.net"
import sys
# "Str" maintains compatibility with Python 2.x.
# The next line is obfuscated like this so 2to3 won't change it.
try:
import __builtin__ as builtins
except ImportError: # Py3
import builtins
if sys.version_info[0] < 3:
string_type = unicode
bytes_type = str
int_types = (int, long)
else:
string_type = str
bytes_type = bytes
int_types = (int,)
Xrange = getattr(builtins, "xrange", range)
Str = getattr(builtins, "basestring", str)
#custom implementation of warnings.formatwarning
def formatWarning(message, category, filename, lineno, line=None):
file = filename.replace("/", "\\").rsplit("\\", 1)[1] # find the file name
return "%s: %s [%s:%s]\n" % (category.__name__, message, file, lineno)
def readUntilWhitespace(stream, maxchars=None):
"""
Reads non-whitespace characters and returns them.
Stops upon encountering whitespace or when maxchars is reached.
"""
txt = b_("")
while True:
tok = stream.read(1)
if tok.isspace() or not tok:
break
txt += tok
if len(txt) == maxchars:
break
return txt
def readNonWhitespace(stream):
"""
Finds and reads the next non-whitespace character (ignores whitespace).
"""
tok = WHITESPACES[0]
while tok in WHITESPACES:
tok = stream.read(1)
return tok
def skipOverWhitespace(stream):
"""
Similar to readNonWhitespace, but returns a Boolean if more than
one whitespace character was read.
"""
tok = WHITESPACES[0]
cnt = 0;
while tok in WHITESPACES:
tok = stream.read(1)
cnt+=1
return (cnt > 1)
def skipOverComment(stream):
tok = stream.read(1)
stream.seek(-1, 1)
if tok == b_('%'):
while tok not in (b_('\n'), b_('\r')):
tok = stream.read(1)
def readUntilRegex(stream, regex, ignore_eof=False):
"""
Reads until the regular expression pattern matched (ignore the match)
Raise PdfStreamError on premature end-of-file.
:param bool ignore_eof: If true, ignore end-of-line and return immediately
"""
name = b_('')
while True:
tok = stream.read(16)
if not tok:
# stream has truncated prematurely
if ignore_eof == True:
return name
else:
raise PdfStreamError("Stream has ended unexpectedly")
m = regex.search(tok)
if m is not None:
name += tok[:m.start()]
stream.seek(m.start()-len(tok), 1)
break
name += tok
return name
class ConvertFunctionsToVirtualList(object):
def __init__(self, lengthFunction, getFunction):
self.lengthFunction = lengthFunction
self.getFunction = getFunction
def __len__(self):
return self.lengthFunction()
def __getitem__(self, index):
if isinstance(index, slice):
indices = Xrange(*index.indices(len(self)))
cls = type(self)
return cls(indices.__len__, lambda idx: self[indices[idx]])
if not isinstance(index, int_types):
raise TypeError("sequence indices must be integers")
len_self = len(self)
if index < 0:
# support negative indexes
index = len_self + index
if index < 0 or index >= len_self:
raise IndexError("sequence index out of range")
return self.getFunction(index)
def RC4_encrypt(key, plaintext):
S = [i for i in range(256)]
j = 0
for i in range(256):
j = (j + S[i] + ord_(key[i % len(key)])) % 256
S[i], S[j] = S[j], S[i]
i, j = 0, 0
retval = b_("")
for x in range(len(plaintext)):
i = (i + 1) % 256
j = (j + S[i]) % 256
S[i], S[j] = S[j], S[i]
t = S[(S[i] + S[j]) % 256]
retval += b_(chr(ord_(plaintext[x]) ^ t))
return retval
def matrixMultiply(a, b):
return [[sum([float(i)*float(j)
for i, j in zip(row, col)]
) for col in zip(*b)]
for row in a]
def markLocation(stream):
"""Creates text file showing current location in context."""
# Mainly for debugging
RADIUS = 5000
stream.seek(-RADIUS, 1)
outputDoc = open('PyPDF2_pdfLocation.txt', 'w')
outputDoc.write(stream.read(RADIUS))
outputDoc.write('HERE')
outputDoc.write(stream.read(RADIUS))
outputDoc.close()
stream.seek(-RADIUS, 1)
class PyPdfError(Exception):
pass
class PdfReadError(PyPdfError):
pass
class PageSizeNotDefinedError(PyPdfError):
pass
class PdfReadWarning(UserWarning):
pass
class PdfStreamError(PdfReadError):
pass
if sys.version_info[0] < 3:
def b_(s):
return s
else:
B_CACHE = {}
def b_(s):
bc = B_CACHE
if s in bc:
return bc[s]
if type(s) == bytes:
return s
else:
r = s.encode('latin-1')
if len(s) < 2:
bc[s] = r
return r
def u_(s):
if sys.version_info[0] < 3:
return unicode(s, 'unicode_escape')
else:
return s
def str_(b):
if sys.version_info[0] < 3:
return b
else:
if type(b) == bytes:
return b.decode('latin-1')
else:
return b
def ord_(b):
if sys.version_info[0] < 3 or type(b) == str:
return ord(b)
else:
return b
def chr_(c):
if sys.version_info[0] < 3:
return c
else:
return chr(c)
def barray(b):
if sys.version_info[0] < 3:
return b
else:
return bytearray(b)
def hexencode(b):
if sys.version_info[0] < 3:
return b.encode('hex')
else:
import codecs
coder = codecs.getencoder('hex_codec')
return coder(b)[0]
def hexStr(num):
return hex(num).replace('L', '')
WHITESPACES = [b_(x) for x in [' ', '\n', '\r', '\t', '\x00']]

View file

@ -1,355 +1,359 @@
import re
import datetime
import decimal
from generic import PdfObject
from xml.dom import getDOMImplementation
from xml.dom.minidom import parseString
RDF_NAMESPACE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NAMESPACE = "http://purl.org/dc/elements/1.1/"
XMP_NAMESPACE = "http://ns.adobe.com/xap/1.0/"
PDF_NAMESPACE = "http://ns.adobe.com/pdf/1.3/"
XMPMM_NAMESPACE = "http://ns.adobe.com/xap/1.0/mm/"
# What is the PDFX namespace, you might ask? I might ask that too. It's
# a completely undocumented namespace used to place "custom metadata"
# properties, which are arbitrary metadata properties with no semantic or
# documented meaning. Elements in the namespace are key/value-style storage,
# where the element name is the key and the content is the value. The keys
# are transformed into valid XML identifiers by substituting an invalid
# identifier character with \u2182 followed by the unicode hex ID of the
# original character. A key like "my car" is therefore "my\u21820020car".
#
# \u2182, in case you're wondering, is the unicode character
# \u{ROMAN NUMERAL TEN THOUSAND}, a straightforward and obvious choice for
# escaping characters.
#
# Intentional users of the pdfx namespace should be shot on sight. A
# custom data schema and sensical XML elements could be used instead, as is
# suggested by Adobe's own documentation on XMP (under "Extensibility of
# Schemas").
#
# Information presented here on the /pdfx/ schema is a result of limited
# reverse engineering, and does not constitute a full specification.
PDFX_NAMESPACE = "http://ns.adobe.com/pdfx/1.3/"
iso8601 = re.compile("""
(?P<year>[0-9]{4})
(-
(?P<month>[0-9]{2})
(-
(?P<day>[0-9]+)
(T
(?P<hour>[0-9]{2}):
(?P<minute>[0-9]{2})
(:(?P<second>[0-9]{2}(.[0-9]+)?))?
(?P<tzd>Z|[-+][0-9]{2}:[0-9]{2})
)?
)?
)?
""", re.VERBOSE)
##
# An object that represents Adobe XMP metadata.
class XmpInformation(PdfObject):
def __init__(self, stream):
self.stream = stream
docRoot = parseString(self.stream.getData())
self.rdfRoot = docRoot.getElementsByTagNameNS(RDF_NAMESPACE, "RDF")[0]
self.cache = {}
def writeToStream(self, stream, encryption_key):
self.stream.writeToStream(stream, encryption_key)
def getElement(self, aboutUri, namespace, name):
for desc in self.rdfRoot.getElementsByTagNameNS(RDF_NAMESPACE, "Description"):
if desc.getAttributeNS(RDF_NAMESPACE, "about") == aboutUri:
attr = desc.getAttributeNodeNS(namespace, name)
if attr != None:
yield attr
for element in desc.getElementsByTagNameNS(namespace, name):
yield element
def getNodesInNamespace(self, aboutUri, namespace):
for desc in self.rdfRoot.getElementsByTagNameNS(RDF_NAMESPACE, "Description"):
if desc.getAttributeNS(RDF_NAMESPACE, "about") == aboutUri:
for i in range(desc.attributes.length):
attr = desc.attributes.item(i)
if attr.namespaceURI == namespace:
yield attr
for child in desc.childNodes:
if child.namespaceURI == namespace:
yield child
def _getText(self, element):
text = ""
for child in element.childNodes:
if child.nodeType == child.TEXT_NODE:
text += child.data
return text
def _converter_string(value):
return value
def _converter_date(value):
m = iso8601.match(value)
year = int(m.group("year"))
month = int(m.group("month") or "1")
day = int(m.group("day") or "1")
hour = int(m.group("hour") or "0")
minute = int(m.group("minute") or "0")
second = decimal.Decimal(m.group("second") or "0")
seconds = second.to_integral(decimal.ROUND_FLOOR)
milliseconds = (second - seconds) * 1000000
tzd = m.group("tzd") or "Z"
dt = datetime.datetime(year, month, day, hour, minute, seconds, milliseconds)
if tzd != "Z":
tzd_hours, tzd_minutes = [int(x) for x in tzd.split(":")]
tzd_hours *= -1
if tzd_hours < 0:
tzd_minutes *= -1
dt = dt + datetime.timedelta(hours=tzd_hours, minutes=tzd_minutes)
return dt
_test_converter_date = staticmethod(_converter_date)
def _getter_bag(namespace, name, converter):
def get(self):
cached = self.cache.get(namespace, {}).get(name)
if cached:
return cached
retval = []
for element in self.getElement("", namespace, name):
bags = element.getElementsByTagNameNS(RDF_NAMESPACE, "Bag")
if len(bags):
for bag in bags:
for item in bag.getElementsByTagNameNS(RDF_NAMESPACE, "li"):
value = self._getText(item)
value = converter(value)
retval.append(value)
ns_cache = self.cache.setdefault(namespace, {})
ns_cache[name] = retval
return retval
return get
def _getter_seq(namespace, name, converter):
def get(self):
cached = self.cache.get(namespace, {}).get(name)
if cached:
return cached
retval = []
for element in self.getElement("", namespace, name):
seqs = element.getElementsByTagNameNS(RDF_NAMESPACE, "Seq")
if len(seqs):
for seq in seqs:
for item in seq.getElementsByTagNameNS(RDF_NAMESPACE, "li"):
value = self._getText(item)
value = converter(value)
retval.append(value)
else:
value = converter(self._getText(element))
retval.append(value)
ns_cache = self.cache.setdefault(namespace, {})
ns_cache[name] = retval
return retval
return get
def _getter_langalt(namespace, name, converter):
def get(self):
cached = self.cache.get(namespace, {}).get(name)
if cached:
return cached
retval = {}
for element in self.getElement("", namespace, name):
alts = element.getElementsByTagNameNS(RDF_NAMESPACE, "Alt")
if len(alts):
for alt in alts:
for item in alt.getElementsByTagNameNS(RDF_NAMESPACE, "li"):
value = self._getText(item)
value = converter(value)
retval[item.getAttribute("xml:lang")] = value
else:
retval["x-default"] = converter(self._getText(element))
ns_cache = self.cache.setdefault(namespace, {})
ns_cache[name] = retval
return retval
return get
def _getter_single(namespace, name, converter):
def get(self):
cached = self.cache.get(namespace, {}).get(name)
if cached:
return cached
value = None
for element in self.getElement("", namespace, name):
if element.nodeType == element.ATTRIBUTE_NODE:
value = element.nodeValue
else:
value = self._getText(element)
break
if value != None:
value = converter(value)
ns_cache = self.cache.setdefault(namespace, {})
ns_cache[name] = value
return value
return get
##
# Contributors to the resource (other than the authors). An unsorted
# array of names.
# <p>Stability: Added in v1.12, will exist for all future v1.x releases.
dc_contributor = property(_getter_bag(DC_NAMESPACE, "contributor", _converter_string))
##
# Text describing the extent or scope of the resource.
# <p>Stability: Added in v1.12, will exist for all future v1.x releases.
dc_coverage = property(_getter_single(DC_NAMESPACE, "coverage", _converter_string))
##
# A sorted array of names of the authors of the resource, listed in order
# of precedence.
# <p>Stability: Added in v1.12, will exist for all future v1.x releases.
dc_creator = property(_getter_seq(DC_NAMESPACE, "creator", _converter_string))
##
# A sorted array of dates (datetime.datetime instances) of signifigance to
# the resource. The dates and times are in UTC.
# <p>Stability: Added in v1.12, will exist for all future v1.x releases.
dc_date = property(_getter_seq(DC_NAMESPACE, "date", _converter_date))
##
# A language-keyed dictionary of textual descriptions of the content of the
# resource.
# <p>Stability: Added in v1.12, will exist for all future v1.x releases.
dc_description = property(_getter_langalt(DC_NAMESPACE, "description", _converter_string))
##
# The mime-type of the resource.
# <p>Stability: Added in v1.12, will exist for all future v1.x releases.
dc_format = property(_getter_single(DC_NAMESPACE, "format", _converter_string))
##
# Unique identifier of the resource.
# <p>Stability: Added in v1.12, will exist for all future v1.x releases.
dc_identifier = property(_getter_single(DC_NAMESPACE, "identifier", _converter_string))
##
# An unordered array specifying the languages used in the resource.
# <p>Stability: Added in v1.12, will exist for all future v1.x releases.
dc_language = property(_getter_bag(DC_NAMESPACE, "language", _converter_string))
##
# An unordered array of publisher names.
# <p>Stability: Added in v1.12, will exist for all future v1.x releases.
dc_publisher = property(_getter_bag(DC_NAMESPACE, "publisher", _converter_string))
##
# An unordered array of text descriptions of relationships to other
# documents.
# <p>Stability: Added in v1.12, will exist for all future v1.x releases.
dc_relation = property(_getter_bag(DC_NAMESPACE, "relation", _converter_string))
##
# A language-keyed dictionary of textual descriptions of the rights the
# user has to this resource.
# <p>Stability: Added in v1.12, will exist for all future v1.x releases.
dc_rights = property(_getter_langalt(DC_NAMESPACE, "rights", _converter_string))
##
# Unique identifier of the work from which this resource was derived.
# <p>Stability: Added in v1.12, will exist for all future v1.x releases.
dc_source = property(_getter_single(DC_NAMESPACE, "source", _converter_string))
##
# An unordered array of descriptive phrases or keywrods that specify the
# topic of the content of the resource.
# <p>Stability: Added in v1.12, will exist for all future v1.x releases.
dc_subject = property(_getter_bag(DC_NAMESPACE, "subject", _converter_string))
##
# A language-keyed dictionary of the title of the resource.
# <p>Stability: Added in v1.12, will exist for all future v1.x releases.
dc_title = property(_getter_langalt(DC_NAMESPACE, "title", _converter_string))
##
# An unordered array of textual descriptions of the document type.
# <p>Stability: Added in v1.12, will exist for all future v1.x releases.
dc_type = property(_getter_bag(DC_NAMESPACE, "type", _converter_string))
##
# An unformatted text string representing document keywords.
# <p>Stability: Added in v1.12, will exist for all future v1.x releases.
pdf_keywords = property(_getter_single(PDF_NAMESPACE, "Keywords", _converter_string))
##
# The PDF file version, for example 1.0, 1.3.
# <p>Stability: Added in v1.12, will exist for all future v1.x releases.
pdf_pdfversion = property(_getter_single(PDF_NAMESPACE, "PDFVersion", _converter_string))
##
# The name of the tool that created the PDF document.
# <p>Stability: Added in v1.12, will exist for all future v1.x releases.
pdf_producer = property(_getter_single(PDF_NAMESPACE, "Producer", _converter_string))
##
# The date and time the resource was originally created. The date and
# time are returned as a UTC datetime.datetime object.
# <p>Stability: Added in v1.12, will exist for all future v1.x releases.
xmp_createDate = property(_getter_single(XMP_NAMESPACE, "CreateDate", _converter_date))
##
# The date and time the resource was last modified. The date and time
# are returned as a UTC datetime.datetime object.
# <p>Stability: Added in v1.12, will exist for all future v1.x releases.
xmp_modifyDate = property(_getter_single(XMP_NAMESPACE, "ModifyDate", _converter_date))
##
# The date and time that any metadata for this resource was last
# changed. The date and time are returned as a UTC datetime.datetime
# object.
# <p>Stability: Added in v1.12, will exist for all future v1.x releases.
xmp_metadataDate = property(_getter_single(XMP_NAMESPACE, "MetadataDate", _converter_date))
##
# The name of the first known tool used to create the resource.
# <p>Stability: Added in v1.12, will exist for all future v1.x releases.
xmp_creatorTool = property(_getter_single(XMP_NAMESPACE, "CreatorTool", _converter_string))
##
# The common identifier for all versions and renditions of this resource.
# <p>Stability: Added in v1.12, will exist for all future v1.x releases.
xmpmm_documentId = property(_getter_single(XMPMM_NAMESPACE, "DocumentID", _converter_string))
##
# An identifier for a specific incarnation of a document, updated each
# time a file is saved.
# <p>Stability: Added in v1.12, will exist for all future v1.x releases.
xmpmm_instanceId = property(_getter_single(XMPMM_NAMESPACE, "InstanceID", _converter_string))
def custom_properties(self):
if not hasattr(self, "_custom_properties"):
self._custom_properties = {}
for node in self.getNodesInNamespace("", PDFX_NAMESPACE):
key = node.localName
while True:
# see documentation about PDFX_NAMESPACE earlier in file
idx = key.find(u"\u2182")
if idx == -1:
break
key = key[:idx] + chr(int(key[idx+1:idx+5], base=16)) + key[idx+5:]
if node.nodeType == node.ATTRIBUTE_NODE:
value = node.nodeValue
else:
value = self._getText(node)
self._custom_properties[key] = value
return self._custom_properties
##
# Retrieves custom metadata properties defined in the undocumented pdfx
# metadata schema.
# <p>Stability: Added in v1.12, will exist for all future v1.x releases.
# @return Returns a dictionary of key/value items for custom metadata
# properties.
custom_properties = property(custom_properties)
import re
import datetime
import decimal
from .generic import PdfObject
from xml.dom import getDOMImplementation
from xml.dom.minidom import parseString
from .utils import u_
RDF_NAMESPACE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NAMESPACE = "http://purl.org/dc/elements/1.1/"
XMP_NAMESPACE = "http://ns.adobe.com/xap/1.0/"
PDF_NAMESPACE = "http://ns.adobe.com/pdf/1.3/"
XMPMM_NAMESPACE = "http://ns.adobe.com/xap/1.0/mm/"
# What is the PDFX namespace, you might ask? I might ask that too. It's
# a completely undocumented namespace used to place "custom metadata"
# properties, which are arbitrary metadata properties with no semantic or
# documented meaning. Elements in the namespace are key/value-style storage,
# where the element name is the key and the content is the value. The keys
# are transformed into valid XML identifiers by substituting an invalid
# identifier character with \u2182 followed by the unicode hex ID of the
# original character. A key like "my car" is therefore "my\u21820020car".
#
# \u2182, in case you're wondering, is the unicode character
# \u{ROMAN NUMERAL TEN THOUSAND}, a straightforward and obvious choice for
# escaping characters.
#
# Intentional users of the pdfx namespace should be shot on sight. A
# custom data schema and sensical XML elements could be used instead, as is
# suggested by Adobe's own documentation on XMP (under "Extensibility of
# Schemas").
#
# Information presented here on the /pdfx/ schema is a result of limited
# reverse engineering, and does not constitute a full specification.
PDFX_NAMESPACE = "http://ns.adobe.com/pdfx/1.3/"
iso8601 = re.compile("""
(?P<year>[0-9]{4})
(-
(?P<month>[0-9]{2})
(-
(?P<day>[0-9]+)
(T
(?P<hour>[0-9]{2}):
(?P<minute>[0-9]{2})
(:(?P<second>[0-9]{2}(.[0-9]+)?))?
(?P<tzd>Z|[-+][0-9]{2}:[0-9]{2})
)?
)?
)?
""", re.VERBOSE)
class XmpInformation(PdfObject):
"""
An object that represents Adobe XMP metadata.
Usually accessed by :meth:`getXmpMetadata()<PyPDF2.PdfFileReader.getXmpMetadata>`
"""
def __init__(self, stream):
self.stream = stream
docRoot = parseString(self.stream.getData())
self.rdfRoot = docRoot.getElementsByTagNameNS(RDF_NAMESPACE, "RDF")[0]
self.cache = {}
def writeToStream(self, stream, encryption_key):
self.stream.writeToStream(stream, encryption_key)
def getElement(self, aboutUri, namespace, name):
for desc in self.rdfRoot.getElementsByTagNameNS(RDF_NAMESPACE, "Description"):
if desc.getAttributeNS(RDF_NAMESPACE, "about") == aboutUri:
attr = desc.getAttributeNodeNS(namespace, name)
if attr != None:
yield attr
for element in desc.getElementsByTagNameNS(namespace, name):
yield element
def getNodesInNamespace(self, aboutUri, namespace):
for desc in self.rdfRoot.getElementsByTagNameNS(RDF_NAMESPACE, "Description"):
if desc.getAttributeNS(RDF_NAMESPACE, "about") == aboutUri:
for i in range(desc.attributes.length):
attr = desc.attributes.item(i)
if attr.namespaceURI == namespace:
yield attr
for child in desc.childNodes:
if child.namespaceURI == namespace:
yield child
def _getText(self, element):
text = ""
for child in element.childNodes:
if child.nodeType == child.TEXT_NODE:
text += child.data
return text
def _converter_string(value):
return value
def _converter_date(value):
m = iso8601.match(value)
year = int(m.group("year"))
month = int(m.group("month") or "1")
day = int(m.group("day") or "1")
hour = int(m.group("hour") or "0")
minute = int(m.group("minute") or "0")
second = decimal.Decimal(m.group("second") or "0")
seconds = second.to_integral(decimal.ROUND_FLOOR)
milliseconds = (second - seconds) * 1000000
tzd = m.group("tzd") or "Z"
dt = datetime.datetime(year, month, day, hour, minute, seconds, milliseconds)
if tzd != "Z":
tzd_hours, tzd_minutes = [int(x) for x in tzd.split(":")]
tzd_hours *= -1
if tzd_hours < 0:
tzd_minutes *= -1
dt = dt + datetime.timedelta(hours=tzd_hours, minutes=tzd_minutes)
return dt
_test_converter_date = staticmethod(_converter_date)
def _getter_bag(namespace, name, converter):
def get(self):
cached = self.cache.get(namespace, {}).get(name)
if cached:
return cached
retval = []
for element in self.getElement("", namespace, name):
bags = element.getElementsByTagNameNS(RDF_NAMESPACE, "Bag")
if len(bags):
for bag in bags:
for item in bag.getElementsByTagNameNS(RDF_NAMESPACE, "li"):
value = self._getText(item)
value = converter(value)
retval.append(value)
ns_cache = self.cache.setdefault(namespace, {})
ns_cache[name] = retval
return retval
return get
def _getter_seq(namespace, name, converter):
def get(self):
cached = self.cache.get(namespace, {}).get(name)
if cached:
return cached
retval = []
for element in self.getElement("", namespace, name):
seqs = element.getElementsByTagNameNS(RDF_NAMESPACE, "Seq")
if len(seqs):
for seq in seqs:
for item in seq.getElementsByTagNameNS(RDF_NAMESPACE, "li"):
value = self._getText(item)
value = converter(value)
retval.append(value)
else:
value = converter(self._getText(element))
retval.append(value)
ns_cache = self.cache.setdefault(namespace, {})
ns_cache[name] = retval
return retval
return get
def _getter_langalt(namespace, name, converter):
def get(self):
cached = self.cache.get(namespace, {}).get(name)
if cached:
return cached
retval = {}
for element in self.getElement("", namespace, name):
alts = element.getElementsByTagNameNS(RDF_NAMESPACE, "Alt")
if len(alts):
for alt in alts:
for item in alt.getElementsByTagNameNS(RDF_NAMESPACE, "li"):
value = self._getText(item)
value = converter(value)
retval[item.getAttribute("xml:lang")] = value
else:
retval["x-default"] = converter(self._getText(element))
ns_cache = self.cache.setdefault(namespace, {})
ns_cache[name] = retval
return retval
return get
def _getter_single(namespace, name, converter):
def get(self):
cached = self.cache.get(namespace, {}).get(name)
if cached:
return cached
value = None
for element in self.getElement("", namespace, name):
if element.nodeType == element.ATTRIBUTE_NODE:
value = element.nodeValue
else:
value = self._getText(element)
break
if value != None:
value = converter(value)
ns_cache = self.cache.setdefault(namespace, {})
ns_cache[name] = value
return value
return get
dc_contributor = property(_getter_bag(DC_NAMESPACE, "contributor", _converter_string))
"""
Contributors to the resource (other than the authors). An unsorted
array of names.
"""
dc_coverage = property(_getter_single(DC_NAMESPACE, "coverage", _converter_string))
"""
Text describing the extent or scope of the resource.
"""
dc_creator = property(_getter_seq(DC_NAMESPACE, "creator", _converter_string))
"""
A sorted array of names of the authors of the resource, listed in order
of precedence.
"""
dc_date = property(_getter_seq(DC_NAMESPACE, "date", _converter_date))
"""
A sorted array of dates (datetime.datetime instances) of signifigance to
the resource. The dates and times are in UTC.
"""
dc_description = property(_getter_langalt(DC_NAMESPACE, "description", _converter_string))
"""
A language-keyed dictionary of textual descriptions of the content of the
resource.
"""
dc_format = property(_getter_single(DC_NAMESPACE, "format", _converter_string))
"""
The mime-type of the resource.
"""
dc_identifier = property(_getter_single(DC_NAMESPACE, "identifier", _converter_string))
"""
Unique identifier of the resource.
"""
dc_language = property(_getter_bag(DC_NAMESPACE, "language", _converter_string))
"""
An unordered array specifying the languages used in the resource.
"""
dc_publisher = property(_getter_bag(DC_NAMESPACE, "publisher", _converter_string))
"""
An unordered array of publisher names.
"""
dc_relation = property(_getter_bag(DC_NAMESPACE, "relation", _converter_string))
"""
An unordered array of text descriptions of relationships to other
documents.
"""
dc_rights = property(_getter_langalt(DC_NAMESPACE, "rights", _converter_string))
"""
A language-keyed dictionary of textual descriptions of the rights the
user has to this resource.
"""
dc_source = property(_getter_single(DC_NAMESPACE, "source", _converter_string))
"""
Unique identifier of the work from which this resource was derived.
"""
dc_subject = property(_getter_bag(DC_NAMESPACE, "subject", _converter_string))
"""
An unordered array of descriptive phrases or keywrods that specify the
topic of the content of the resource.
"""
dc_title = property(_getter_langalt(DC_NAMESPACE, "title", _converter_string))
"""
A language-keyed dictionary of the title of the resource.
"""
dc_type = property(_getter_bag(DC_NAMESPACE, "type", _converter_string))
"""
An unordered array of textual descriptions of the document type.
"""
pdf_keywords = property(_getter_single(PDF_NAMESPACE, "Keywords", _converter_string))
"""
An unformatted text string representing document keywords.
"""
pdf_pdfversion = property(_getter_single(PDF_NAMESPACE, "PDFVersion", _converter_string))
"""
The PDF file version, for example 1.0, 1.3.
"""
pdf_producer = property(_getter_single(PDF_NAMESPACE, "Producer", _converter_string))
"""
The name of the tool that created the PDF document.
"""
xmp_createDate = property(_getter_single(XMP_NAMESPACE, "CreateDate", _converter_date))
"""
The date and time the resource was originally created. The date and
time are returned as a UTC datetime.datetime object.
"""
xmp_modifyDate = property(_getter_single(XMP_NAMESPACE, "ModifyDate", _converter_date))
"""
The date and time the resource was last modified. The date and time
are returned as a UTC datetime.datetime object.
"""
xmp_metadataDate = property(_getter_single(XMP_NAMESPACE, "MetadataDate", _converter_date))
"""
The date and time that any metadata for this resource was last
changed. The date and time are returned as a UTC datetime.datetime
object.
"""
xmp_creatorTool = property(_getter_single(XMP_NAMESPACE, "CreatorTool", _converter_string))
"""
The name of the first known tool used to create the resource.
"""
xmpmm_documentId = property(_getter_single(XMPMM_NAMESPACE, "DocumentID", _converter_string))
"""
The common identifier for all versions and renditions of this resource.
"""
xmpmm_instanceId = property(_getter_single(XMPMM_NAMESPACE, "InstanceID", _converter_string))
"""
An identifier for a specific incarnation of a document, updated each
time a file is saved.
"""
def custom_properties(self):
if not hasattr(self, "_custom_properties"):
self._custom_properties = {}
for node in self.getNodesInNamespace("", PDFX_NAMESPACE):
key = node.localName
while True:
# see documentation about PDFX_NAMESPACE earlier in file
idx = key.find(u_("\u2182"))
if idx == -1:
break
key = key[:idx] + chr(int(key[idx+1:idx+5], base=16)) + key[idx+5:]
if node.nodeType == node.ATTRIBUTE_NODE:
value = node.nodeValue
else:
value = self._getText(node)
self._custom_properties[key] = value
return self._custom_properties
custom_properties = property(custom_properties)
"""
Retrieves custom metadata properties defined in the undocumented pdfx
metadata schema.
:return: a dictionary of key/value items for custom metadata properties.
:rtype: dict
"""

View file

@ -2,9 +2,9 @@
../certifi/core.py
../certifi/__main__.py
../certifi/cacert.pem
../certifi/__init__.pyc
../certifi/core.pyc
../certifi/__main__.pyc
../certifi/__pycache__/__init__.cpython-34.pyc
../certifi/__pycache__/core.cpython-34.pyc
../certifi/__pycache__/__main__.cpython-34.pyc
./
dependency_links.txt
PKG-INFO

View file

@ -44,42 +44,42 @@ chardet-2.2.1.dist-info/top_level.txt,sha256=AowzBbZy4x8EirABDdJSLJZMkJ_53iIag8x
chardet-2.2.1.dist-info/DESCRIPTION.rst,sha256=m1CcXHsjUJRXdWB4svHusBa6otO4GdUW6LgirEk4V2k,1344
chardet-2.2.1.dist-info/entry_points.txt,sha256=2T00JXwbiQBZQFSKyCFxud4LEQ3_8TKuOwUsSXT-kUI,56
chardet-2.2.1.dist-info/METADATA,sha256=Pzpbxhm72oav1pTeA7pAjXPWGZ_gmYRm9bwvXM8umaw,2013
/srv/openmedialibrary/platform/Shared/home/.local/bin/chardetect,sha256=z0dFm-ZCMpjOlo9_DhdtBGIHCYRBSJ_vacJxK2YxoGg,219
chardet/latin1prober.pyc,,
chardet/euckrprober.pyc,,
chardet/sbcsgroupprober.pyc,,
chardet/langcyrillicmodel.pyc,,
chardet/codingstatemachine.pyc,,
chardet/big5freq.pyc,,
chardet/chardetect.pyc,,
chardet/langthaimodel.pyc,,
chardet/euctwprober.pyc,,
chardet/escprober.pyc,,
chardet/chardistribution.pyc,,
chardet/__init__.pyc,,
chardet/langhebrewmodel.pyc,,
chardet/mbcharsetprober.pyc,,
chardet/jpcntx.pyc,,
chardet/big5prober.pyc,,
chardet/universaldetector.pyc,,
chardet/escsm.pyc,,
chardet/mbcsgroupprober.pyc,,
chardet/langhungarianmodel.pyc,,
chardet/euctwfreq.pyc,,
chardet/constants.pyc,,
chardet/cp949prober.pyc,,
chardet/langgreekmodel.pyc,,
chardet/hebrewprober.pyc,,
chardet/sbcharsetprober.pyc,,
chardet/utf8prober.pyc,,
chardet/charsetprober.pyc,,
chardet/euckrfreq.pyc,,
chardet/sjisprober.pyc,,
chardet/gb2312prober.pyc,,
chardet/compat.pyc,,
chardet/gb2312freq.pyc,,
chardet/eucjpprober.pyc,,
chardet/jisfreq.pyc,,
chardet/mbcssm.pyc,,
chardet/langbulgarianmodel.pyc,,
chardet/charsetgroupprober.pyc,,
/srv/openmedialibrary/platform/Shared/home/.local/bin/chardetect,sha256=zPsthwHzIOlO2Mxw0wdp5F7cfd7xSyEpiv11jcEgaEE,220
chardet/__pycache__/sjisprober.cpython-34.pyc,,
chardet/__pycache__/sbcharsetprober.cpython-34.pyc,,
chardet/__pycache__/compat.cpython-34.pyc,,
chardet/__pycache__/jisfreq.cpython-34.pyc,,
chardet/__pycache__/cp949prober.cpython-34.pyc,,
chardet/__pycache__/latin1prober.cpython-34.pyc,,
chardet/__pycache__/sbcsgroupprober.cpython-34.pyc,,
chardet/__pycache__/chardetect.cpython-34.pyc,,
chardet/__pycache__/mbcsgroupprober.cpython-34.pyc,,
chardet/__pycache__/escprober.cpython-34.pyc,,
chardet/__pycache__/langhebrewmodel.cpython-34.pyc,,
chardet/__pycache__/mbcssm.cpython-34.pyc,,
chardet/__pycache__/charsetgroupprober.cpython-34.pyc,,
chardet/__pycache__/langhungarianmodel.cpython-34.pyc,,
chardet/__pycache__/codingstatemachine.cpython-34.pyc,,
chardet/__pycache__/eucjpprober.cpython-34.pyc,,
chardet/__pycache__/euctwfreq.cpython-34.pyc,,
chardet/__pycache__/utf8prober.cpython-34.pyc,,
chardet/__pycache__/langbulgarianmodel.cpython-34.pyc,,
chardet/__pycache__/langthaimodel.cpython-34.pyc,,
chardet/__pycache__/constants.cpython-34.pyc,,
chardet/__pycache__/big5freq.cpython-34.pyc,,
chardet/__pycache__/mbcharsetprober.cpython-34.pyc,,
chardet/__pycache__/euckrfreq.cpython-34.pyc,,
chardet/__pycache__/escsm.cpython-34.pyc,,
chardet/__pycache__/chardistribution.cpython-34.pyc,,
chardet/__pycache__/jpcntx.cpython-34.pyc,,
chardet/__pycache__/charsetprober.cpython-34.pyc,,
chardet/__pycache__/gb2312freq.cpython-34.pyc,,
chardet/__pycache__/universaldetector.cpython-34.pyc,,
chardet/__pycache__/euctwprober.cpython-34.pyc,,
chardet/__pycache__/euckrprober.cpython-34.pyc,,
chardet/__pycache__/__init__.cpython-34.pyc,,
chardet/__pycache__/langcyrillicmodel.cpython-34.pyc,,
chardet/__pycache__/big5prober.cpython-34.pyc,,
chardet/__pycache__/hebrewprober.cpython-34.pyc,,
chardet/__pycache__/gb2312prober.cpython-34.pyc,,
chardet/__pycache__/langgreekmodel.cpython-34.pyc,,

Some files were not shown because too many files have changed in this diff Show more