README.md: 27 additions & 7 deletions
@@ -17,12 +17,11 @@ The simplest usage at this stage is to open a document, reading the words from e
 
 using (PdfDocument document = PdfDocument.Open(@"C:\Documents\document.pdf"))
 {
-    for (var i = 0; i < document.NumberOfPages; i++)
+    foreach (Page page in document.GetPages())
     {
-        // This starts at 1 rather than 0.
-        var page = document.GetPage(i + 1);
+        string pageText = page.Text;
 
-        foreach (var word in page.GetWords())
+        foreach (Word word in page.GetWords())
         {
             Console.WriteLine(word.Text);
         }
@@ -69,6 +68,7 @@ The ```PdfDocument``` class provides access to the contents of a document loaded
 {
     int pageCount = document.NumberOfPages;
 
+    // Page number starts from 1, not 0.
     Page page = document.GetPage(1);
 
     decimal widthInPoints = page.Width;
@@ -79,7 +79,9 @@ The ```PdfDocument``` class provides access to the contents of a document loaded
 
 ```PdfDocument``` should only be used in a ```using``` statement since it implements ```IDisposable``` (unless the consumer disposes of it elsewhere).
 
-Documents which are encrypted using the RC4 algorithm can be opened with PdfPig (AES is unsupported at the moment). To provide an owner or user password provide the optional `ParsingOptions` when calling `Open` with the `Password` property defined.
+Encrypted documents can be opened by PdfPig. To provide an owner or user password provide the optional `ParsingOptions` when calling `Open` with the `Password` property defined. For example:
+
+    using (PdfDocument document = PdfDocument.Open(@"C:\my-file.pdf", new ParsingOptions { Password = "password here" }))
 
 Since this is alpha software the consumer should wrap all access in a ```try catch``` block since it is extremely likely to throw exceptions. As a fallback you can try running PDFBox using [IKVM](https://www.ikvm.net/) or using [PDFsharp](http://www.pdfsharp.net) or by a native library wrapper using [docnet](https://github.com/GowenGit/docnet).
 
@@ -213,14 +215,32 @@ These letters contain:
 
 Letter position is measured in PDF coordinates where the origin is the lower left corner of the page. Therefore a higher Y value means closer to the top of the page.
 
-### Annotations ###
+### Annotations (0.0.5)###
 
-New in v0.0.5 - Early support for retrieving annotations on each page is provided using the method:
+Early support for retrieving annotations on each page is provided using the method:
 
     page.ExperimentalAccess.GetAnnotations()
 
 This call is not cached and the document must not have been disposed prior to use. The annotations API may change in future.
 
+### Bookmarks (0.0.10) ###
+
+The bookmarks (outlines) of a document may be retrieved at the document level:
+This will return `false` if the document does not contain a form.
+
+The fields can be accessed using the `AcroForm`'s `Fields` property. Since the form is defined at the document level this will return fields from all pages in the document. Fields are of the types defined by the enum `AcroFieldType`, for example `PushButton`, `Checkbox`, `Text`, etc.
+
 ## Issues ##
 
 At this stage the software is in Alpha. In order to proceed to Beta and production we need to see a wide variety of document types.
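
For reference, the pieces of README guidance touched by this change (iterating with `GetPages()`, passing a password through `ParsingOptions`, and wrapping access in a `try catch` block) could be combined roughly as follows. This is a minimal sketch rather than code from the README: the `using` namespaces are assumed, the class name `ReadmeSketch` is hypothetical, and the file path and password are the placeholder values from the diff.

```csharp
using System;
using UglyToad.PdfPig;          // assumed namespace for PdfDocument and ParsingOptions
using UglyToad.PdfPig.Content;  // assumed namespace for Page and Word

// Hypothetical wrapper class; only the PdfPig calls come from the README.
public static class ReadmeSketch
{
    public static void Main()
    {
        try
        {
            // Placeholder password and path, copied from the README example.
            var options = new ParsingOptions { Password = "password here" };

            using (PdfDocument document = PdfDocument.Open(@"C:\my-file.pdf", options))
            {
                // GetPages() replaces the manual 1-based GetPage(i + 1) loop.
                foreach (Page page in document.GetPages())
                {
                    foreach (Word word in page.GetWords())
                    {
                        Console.WriteLine(word.Text);
                    }
                }
            }
        }
        catch (Exception ex)
        {
            // The README advises wrapping all access in try/catch while the library is in alpha.
            Console.Error.WriteLine($"Failed to read PDF: {ex.Message}");
        }
    }
}
```

Catching the base `Exception` type is deliberately broad here because the README does not say which exception types the library throws.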