16 Commits

Author SHA1 Message Date
Matthias Bisping
bb79f9dd55 applied black 2022-04-11 13:57:32 +02:00
Matthias Bisping
585cdf5c70 integrated stitching into parsable pdf extractor 2022-04-11 13:57:10 +02:00
Matthias Bisping
d80af336eb refactoring 2022-04-11 13:28:39 +02:00
Matthias Bisping
37ee086b5d applied black 2022-04-05 17:55:38 +02:00
Matthias Bisping
2c908162f1 refactoring 2022-04-05 16:31:57 +02:00
Matthias Bisping
4756b8c9bd refactoring 2022-04-05 13:03:22 +02:00
Matthias Bisping
ce69f7d160 removed obsolete imports 2022-04-04 21:50:10 +02:00
Matthias Bisping
8f61c4cba2 doc.extract_image(xref) can yield None; hence added filtering for None images 2022-04-04 21:49:45 +02:00
Matthias Bisping
5c23898280 added log messages to all pipeline components; converting pipeline output to list for REST transport; refactoring; added e2e test (flask + pipeline)... but hangs 2022-04-02 02:44:30 +02:00
Matthias Bisping
45a07c620a fixed chaining bug that led to greedy evaluation 2022-03-30 00:53:34 +02:00
Matthias Bisping
ade318c7b7 made classifier accept tuples of images in addition to np.arrays; added pipeline (wip) 2022-03-29 22:00:34 +02:00
Matthias Bisping
7340fb6dda replaced string keys for metadata fields with enum members 2022-03-29 20:29:44 +02:00
Matthias Bisping
e818b05472 applied black 2022-03-28 16:39:34 +02:00
Matthias Bisping
b818ee4724 fixed misaligned metadata and images 2022-03-28 16:38:46 +02:00
Julius Unverfehrt
9461be29d5 add ParsablePDFImageExtractor test 2022-03-28 15:42:54 +02:00
Matthias Bisping
643ab99bd3 added parsable pdf image extractor 2022-03-28 11:27:05 +02:00