Coverage for pdfrw/pdfrw/toreportlab.py: 0%

70 statements  

# A part of pdfrw (https://github.com/pmaupin/pdfrw)
# Copyright (C) 2006-2015 Patrick Maupin, Austin, Texas
# MIT license -- See LICENSE.txt for details

'''
Converts pdfrw objects into reportlab objects.

Designed for and tested with rl 2.3.

Knows too much about reportlab internals.
What can you do?

The interface to this module is through the makerl() function.

Parameters:
        canv       - a reportlab "canvas" (also accepts a "document")
        pdfobj     - a pdfrw PDF object

Returns:
        A corresponding reportlab object, or if the
        object is a PDF Form XObject, the name to
        use with reportlab for the object.

        Will recursively convert all necessary objects.
        Be careful when converting a page -- if /Parent is set,
        will recursively convert all pages!

Notes:
    1) Original objects are annotated with a
        derived_rl_obj attribute which points to the
        reportlab object.  This keeps multiple reportlab
        objects from being generated for the same pdfobj
        via repeated calls to makerl.  This is great for
        not putting too many objects into the
        new PDF, but not so good if you are modifying
        objects for different pages.  Then you
        need to do your own deep copying (of circular
        structures).  You're on your own.

    2) ReportLab seems weird about FormXObjects.
       They pass around a partial name instead of the
       object or a reference to it.  So we have to
       reach into reportlab and get a number for
       a unique name.  I guess this is to make it
       where you can combine page streams with
       impunity, but that's just a guess.

    3) Updated 1/23/2010 to handle multipass documents
       (e.g. with a table of contents).  These have
       a different doc object on every pass.

'''

from reportlab.pdfbase import pdfdoc as rldocmodule
from .objects import PdfDict, PdfArray, PdfName
from .py23_diffs import convert_store

RLStream = rldocmodule.PDFStream
RLDict = rldocmodule.PDFDictionary
RLArray = rldocmodule.PDFArray


def _makedict(rldoc, pdfobj):
    rlobj = rldict = RLDict()
    # Indirect pdfrw objects become reportlab references.
    if pdfobj.indirect:
        rlobj.__RefOnly__ = 1
        rlobj = rldoc.Reference(rlobj)
    # Cache the result before recursing so that cycles resolve to it.
    pdfobj.derived_rl_obj[rldoc] = rlobj, None

    for key, value in pdfobj.iteritems():
        # key[1:] drops the leading '/' of the PDF name.
        rldict[key[1:]] = makerl_recurse(rldoc, value)

    return rlobj


def _makestream(rldoc, pdfobj, xobjtype=PdfName.XObject):
    rldict = RLDict()
    rlobj = RLStream(rldict, convert_store(pdfobj.stream))

    # XObjects need a name that reportlab can refer to (see note 2 above).
    if pdfobj.Type == xobjtype:
        shortname = 'pdfrw_%s' % (rldoc.objectcounter + 1)
        fullname = rldoc.getXObjectName(shortname)
    else:
        shortname = fullname = None
    result = rldoc.Reference(rlobj, fullname)
    # Cache the result before recursing so that cycles resolve to it.
    pdfobj.derived_rl_obj[rldoc] = result, shortname

    for key, value in pdfobj.iteritems():
        rldict[key[1:]] = makerl_recurse(rldoc, value)

    return result


def _makearray(rldoc, pdfobj):
    rlobj = rlarray = RLArray([])
    if pdfobj.indirect:
        rlobj.__RefOnly__ = 1
        rlobj = rldoc.Reference(rlobj)
    pdfobj.derived_rl_obj[rldoc] = rlobj, None

    mylist = rlarray.sequence
    for value in pdfobj:
        mylist.append(makerl_recurse(rldoc, value))

    return rlobj


def _makestr(rldoc, pdfobj):
    assert isinstance(pdfobj, (float, int, str)), repr(pdfobj)
    # TODO: Add fix for float like in pdfwriter
    return str(getattr(pdfobj, 'encoded', None) or pdfobj)


def makerl_recurse(rldoc, pdfobj):
    # Reuse an earlier conversion made for this same reportlab document.
    docdict = getattr(pdfobj, 'derived_rl_obj', None)
    if docdict is not None:
        value = docdict.get(rldoc)
        if value is not None:
            return value[0]
    if isinstance(pdfobj, PdfDict):
        if pdfobj.stream is not None:
            func = _makestream
        else:
            func = _makedict
        if docdict is None:
            # Set via .private so the cache is a plain attribute,
            # not an entry in the PDF dictionary.
            pdfobj.private.derived_rl_obj = {}
    elif isinstance(pdfobj, PdfArray):
        func = _makearray
        if docdict is None:
            pdfobj.derived_rl_obj = {}
    else:
        func = _makestr
    return func(rldoc, pdfobj)


def makerl(canv, pdfobj):
    # Accept either a canvas (which has a ._doc) or a document.
    try:
        rldoc = canv._doc
    except AttributeError:
        rldoc = canv
    rlobj = makerl_recurse(rldoc, pdfobj)
    # Form XObjects are returned by name, everything else as the object.
    try:
        name = pdfobj.derived_rl_obj[rldoc][1]
    except AttributeError:
        name = None
    return name or rlobj
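
The per-document caching described in notes 1 and 3 of the module docstring can be sketched in isolation. This is a minimal standalone illustration under stated assumptions, not pdfrw's code: `Node`, `Converted`, and `convert` are hypothetical names, and plain `object()` instances stand in for reportlab documents. It shows why the result is recorded before recursing: a cycle (such as a page whose parent lists it again) then finds the cached entry instead of recursing forever.

```python
# Hypothetical stand-ins; these are NOT pdfrw or reportlab classes.
class Node:
    """A source object that may take part in reference cycles."""

    def __init__(self, name):
        self.name = name
        self.children = []
        self.derived = {}  # maps a target document to the converted object


class Converted:
    """The converted counterpart of a Node."""

    def __init__(self, name):
        self.name = name
        self.children = []


def convert(doc, node):
    """Convert node for doc, reusing any earlier result for the same doc."""
    cached = node.derived.get(doc)
    if cached is not None:
        return cached
    result = Converted(node.name)
    # Record the result before recursing, so a cycle back to this node
    # hits the cache instead of recursing without end.
    node.derived[doc] = result
    for child in node.children:
        result.children.append(convert(doc, child))
    return result


# Two nodes that refer to each other, forming a cycle.
page = Node("page")
parent = Node("parent")
page.children.append(parent)
parent.children.append(page)

doc_a, doc_b = object(), object()
first = convert(doc_a, page)   # converts both nodes once
second = convert(doc_a, page)  # same document: served from the cache
other = convert(doc_b, page)   # different document: converted again
```

Because the cache is keyed by document, a second pass with a new document object gets fresh converted objects, while repeated calls within one document share them. As note 1 warns, that sharing also means a change made to a source object after conversion is not picked up for that document.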