68 Commits

Author SHA1 Message Date
Matthias Bisping
3f0bbf0fc7 Refactoring 2023-01-10 11:59:01 +01:00
Matthias Bisping
2fec39eda6 Add docstring 2023-01-10 11:31:13 +01:00
Matthias Bisping
16cc0007ed Refactoring 2023-01-10 11:30:36 +01:00
Matthias Bisping
3d83489819 Refactoring: Make single pass rectangle merging stateless 2023-01-10 11:14:15 +01:00
Matthias Bisping
3134021596 Add typehints 2023-01-10 10:20:07 +01:00
Matthias Bisping
3cb857d830 Refactoring: Move 2023-01-10 10:19:49 +01:00
Matthias Bisping
194102939e Refactoring
- add typehints
- other minor refactorings
2023-01-10 10:10:08 +01:00
Matthias Bisping
77f85e9de1 Refactoring
Various
2023-01-09 17:22:01 +01:00
Matthias Bisping
c00081b2bc Refactoring: Move 2023-01-09 17:01:36 +01:00
Matthias Bisping
619f67f1fd Refactoring
Various
2023-01-09 16:51:58 +01:00
Matthias Bisping
a97f8def7c Refactor metrics 2023-01-09 16:22:52 +01:00
Matthias Bisping
65e9735bd9 Refactor metrics 2023-01-09 15:53:53 +01:00
Matthias Bisping
689be75478 Refactoring 2023-01-09 15:47:12 +01:00
Matthias Bisping
acf46a7a48 [WIP] Refactoring meta-detection 2023-01-09 15:40:32 +01:00
Matthias Bisping
0f11441b20 [WIP] Refactoring meta-detection 2023-01-09 15:32:51 +01:00
Matthias Bisping
fa1fa15cc8 [WIP] Refactoring meta-detection 2023-01-09 15:05:00 +01:00
Matthias Bisping
17c40c996a [WIP] Refactoring meta-detection 2023-01-09 14:44:22 +01:00
Matthias Bisping
99af2943b5 [WIP] Refactoring meta-detection 2023-01-09 14:33:27 +01:00
Matthias Bisping
0e6cb495e8 [WIP] Refactoring meta-detection 2023-01-09 14:29:22 +01:00
Matthias Bisping
012e705e70 [WIP] Refactoring meta-detection 2023-01-09 14:22:18 +01:00
Matthias Bisping
94e9210faf Refactoring
Various
2023-01-09 11:21:43 +01:00
Matthias Bisping
06d6863cc5 Format docstrings 2023-01-04 18:50:27 +01:00
Matthias Bisping
cd5457840b Refactoring
Various
2023-01-04 18:13:54 +01:00
Matthias Bisping
eee2f0e256 Refactoring
Rename module
2023-01-04 17:40:43 +01:00
Matthias Bisping
9d2f166fbf Refactoring
Various
2023-01-04 17:36:06 +01:00
Matthias Bisping
97fb4b645d Refactoring
Remove more code that is not adhering to separation of concerns from Rectangle class
2023-01-04 16:49:44 +01:00
Matthias Bisping
00e53fb54d Refactoring
Remove code that is not adhering to separation of concerns from Rectangle class
2023-01-04 16:29:43 +01:00
Matthias Bisping
4be91de036 Refactoring
Further clean up Rectangle class
2023-01-04 15:26:39 +01:00
Matthias Bisping
8c6b940364 Refactoring
Clean up Rectangle class
2023-01-04 14:57:47 +01:00
Matthias Bisping
cdb12baccd Format docstrings 2023-01-04 13:57:51 +01:00
Matthias Bisping
ac84494613 Refactoring 2023-01-04 13:32:57 +01:00
Matthias Bisping
77f565c652 Fix
Fix a typehint
Fix a bug that would happen when a generator is passed
2023-01-04 12:06:28 +01:00
Matthias Bisping
47e657aaa3 Refactoring
Clean up and prove correctness of intersection computation
2023-01-04 12:05:57 +01:00
Matthias Bisping
b592497b75 Refactoring 2023-01-04 10:58:24 +01:00
Matthias Bisping
8260ae58f9 Refactoring
Make adjacency checking code clean
2023-01-04 10:11:46 +01:00
Matthias Bisping
068f75d35b Apply black 2023-01-04 10:11:28 +01:00
Matthias Bisping
7bbe459208 Adjust type hints for new lower python version 2023-01-02 15:46:35 +01:00
lillian locarnini
95cab33f19 Pull request #29: Evaluate layout detection
Merge in RR/cv-analysis from evaluate_layout_detection to master

Squashed commit of the following:

commit 8ec2f69fc61d1e15bd502b0a2c1f720cbec2b34e
Author: llocarnini <lillian.locarnini@iqser.com>
Date:   Tue Aug 23 15:07:21 2022 +0200

    repaired is_not_included() logic (it dropped the outer rectangle instead of the included one)

commit 97be081d1e60989313924ceac0bfb3062229411e
Merge: 2c28fa2 2b5c4f1
Author: llocarnini <lillian.locarnini@iqser.com>
Date:   Tue Aug 23 14:28:14 2022 +0200

    Merge branch 'master' of ssh://git.iqser.com:2222/rr/cv-analysis into evaluate_layout_detection

commit 2c28fa280b7eff922c715245fffe69702c7e6742
Author: llocarnini <lillian.locarnini@iqser.com>
Date:   Tue Aug 23 13:50:17 2022 +0200

    del print statements

commit c60121fc4faebc5de556ec0ab7a3af4f815f7ce1
Author: llocarnini <lillian.locarnini@iqser.com>
Date:   Mon Aug 22 10:51:52 2022 +0200

    few changes to connect_rects.py

commit a99719905d58cbe856fa020177abd7e317c1d072
Author: llocarnini <lillian.locarnini@iqser.com>
Date:   Thu Aug 18 08:37:12 2022 +0200

    layout parsing improved with connect_rects.py

commit d693688a0f0d63395cfd36645de7b3417f64de30
Author: llocarnini <lillian.locarnini@iqser.com>
Date:   Tue Aug 2 09:31:19 2022 +0200

    removed vizlogger instances
2022-08-23 15:09:51 +02:00
Julius Unverfehrt
59a0a61708 Pull request #25: Pdf2image
Merge in RR/cv-analysis from pdf2image to master

Squashed commit of the following:

commit 1353f54d2dceb0a79b1f81bfa2c035f5a454275a
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Wed Aug 10 09:07:31 2022 +0200

    add deRotation and transformation via rectanglePlus

commit 51459dbf57a86e3eac66ec0da02de40dc1b68796
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Tue Aug 9 08:53:50 2022 +0200

    add derotation and to pdf coords transformation to cv-analysis output

commit 733991e2f5a4664205b2f7cc756cebcbc9ee3930
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Mon Aug 8 15:15:13 2022 +0200

    update pipeline with derotation logic WIP
2022-08-10 09:17:59 +02:00
Julius Unverfehrt
016abe46de Pull request #23: Add pdf2image module
Merge in RR/cv-analysis from add-pdf2image-module to master

Squashed commit of the following:

commit 13355e2dd006fae9ee05c2d00acbbc8b38fd1e8e
Merge: eaf4627 edbda58
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Tue Aug 2 13:35:27 2022 +0200

    Merge branch 'master' of ssh://git.iqser.com:2222/rr/cv-analysis into add-pdf2image-module

commit eaf462768787642889d496203034d017c4ec959b
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Tue Aug 2 13:26:58 2022 +0200

    update build scripts

commit d429c713f4e5e74afca81c2354e8125bf389b865
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Tue Aug 2 13:11:07 2022 +0200

    purge target

commit 349b81c5db724bf70d6f31b58ded2b5414216bfe
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Tue Aug 2 13:07:58 2022 +0200

    Revert "extinguish target"

    This reverts commit d2bd4cefde0648d2487839b0344509b984435273.

commit d2bd4cefde0648d2487839b0344509b984435273
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Tue Aug 2 12:57:50 2022 +0200

    extinguish target

commit 5f6cc713db31e3e16c8e7f13a59804c86b5d77d7
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Tue Aug 2 11:58:52 2022 +0200

    refactor

commit 576019378a39b580b816d9eb7957774f1faf48b9
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Tue Aug 2 11:52:04 2022 +0200

    add test for adjusted server analysis pipeline logic

commit bdf0121929d6941cbba565055f37df7970925c79
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Tue Aug 2 11:30:17 2022 +0200

    update analysis pipeline logic to use imported pdf2image

commit f7cef98d5e6d7b95517bbd047dd3e958acebb3d8
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Tue Aug 2 11:04:34 2022 +0200

    add pdf2image as git submodule
2022-08-02 13:36:50 +02:00
Isaac Riley
edbda58837 Merge branch 'master' of ssh://git.iqser.com:2222/rr/cv-analysis 2022-08-02 12:13:02 +02:00
Isaac Riley
beb40da3b1 Pull request #22: add single-cell filtering to table parsing and increase tolerance parameter to 7; refactor postprocessing to use the Rectangles data structure
Merge in RR/cv-analysis from remove_isolated to master

Squashed commit of the following:

commit 2613ed1615d1b69b3e4f2acea197993a91d00561
Author: Isaac Riley <Isaac.Riley@iqser.com>
Date:   Tue Aug 2 10:17:33 2022 +0200

    add single-cell filtering to table parsing and increase tolerance parameter to 7; refactored postprocessing to use the Rectangles data structure
2022-08-02 10:54:13 +02:00
Isaac Riley
b33dcd83a5 Revert "Pull request #21: move rotation logic to before cv-analysis, so that cv-analysis only needs to operate on portrait images and matrix rotation logic can be dropped"
This reverts commit de921e308f7e0c6d5686b14ca132910bce0bad17.
2022-07-29 08:50:06 +02:00
Julius Unverfehrt
de921e308f Pull request #21: move rotation logic to before cv-analysis, so that cv-analysis only needs to operate on portrait images and matrix rotation logic can be dropped
Merge in RR/cv-analysis from rotation-logic-refactor to master

Squashed commit of the following:

commit 684dd140cbfc9fbebe9beb8c13b52a2d131c9932
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Wed Jul 27 14:22:58 2022 +0200

    move rotation logic to before cv-analysis, so that cv-analysis only needs to operate on portrait images and matrix rotation logic can be dropped
2022-07-27 14:28:21 +02:00
Isaac Riley
9d98945ff9 Pull request #20: New pyinfra
Merge in RR/cv-analysis from new_pyinfra to master

Squashed commit of the following:

commit f7a01a90aad1c402ac537de5bdf15df628ad54df
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Wed Jul 27 10:40:59 2022 +0200

    fix typo

commit ff4d549fac5b612c2d391ae85823c5eca1e91916
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Wed Jul 27 10:34:04 2022 +0200

    adjust build scripts for new pyinfra

commit ecd70f60d46406d8b6cc7f36a1533d706c917ca8
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Wed Jul 27 09:42:55 2022 +0200

    simplify logging by using default configurations

commit 20193c14c940eed2b0a7a72058167e26064119d0
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Tue Jul 26 17:16:57 2022 +0200

    tidy-up, refactor config logic to not depend on external files

commit d8069cd4d404a570bb04a04278161669d1c83332
Author: Isaac Riley <Isaac.Riley@iqser.com>
Date:   Tue Jul 26 15:14:59 2022 +0200

    update pyinfra

commit c3bc11037cca9baf016043ab997c566f5b4a2586
Author: Isaac Riley <Isaac.Riley@iqser.com>
Date:   Tue Jul 26 15:09:14 2022 +0200

    repair tests

commit 6f4e4f2863ee16ae056c1d432f663858c5f10221
Author: Isaac Riley <Isaac.Riley@iqser.com>
Date:   Tue Jul 26 14:52:38 2022 +0200

    updated server logic to work with new pyinfra; update scripts for pyinfra as submodule

commit 2a18dba81de5ee84d0bdf0e77f478693e8d8aef4
Author: Isaac Riley <Isaac.Riley@iqser.com>
Date:   Tue Jul 26 14:10:41 2022 +0200

    formatting

commit d87ce9328de9aa2341228af9b24473d5e583504e
Author: Isaac Riley <Isaac.Riley@iqser.com>
Date:   Tue Jul 26 14:10:11 2022 +0200

    make server logic compatible with new pyinfra
2022-07-27 10:50:10 +02:00
Julius Unverfehrt
a871fa3bd3 Pull request #19: Refactor evaluate
Merge in RR/cv-analysis from refactor-evaluate to master

Squashed commit of the following:

commit cde03a492452610322f8b7d3eb804a51afb76d81
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Fri Jul 22 12:37:36 2022 +0200

    add optional show analysis metadata dict

commit fb8bb9e2afa7767f2560f865516295be65f97f20
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Fri Jul 22 12:13:18 2022 +0200

    add script to evaluate runtime per page for all cv-analysis operations for multiple PDFs

commit 721e823e2ec38aae3fea51d01e2135fc8f228d94
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Fri Jul 22 10:30:31 2022 +0200

    refactor

commit a453753cfa477e162e5902ce191ded61cb678337
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Fri Jul 22 10:19:24 2022 +0200

    add logic to transform result coordinates according to page rotation, update annotation script to use this logic

commit 71c09758d0fb763a2c38c6871e1d9bf51f2e7c41
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Thu Jul 21 15:57:49 2022 +0200

    introduce pipeline for image conversion, analysis and result formatting

commit aef252a41b9658dd0c4f55aa2d9f84de933586e0
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Thu Jul 21 15:57:38 2022 +0200

    introduce pipeline for image conversion, analysis and result formatting
2022-07-22 15:11:40 +02:00
Julius Unverfehrt
e7b28f5bda Pull request #18: Remove pil
Merge in RR/cv-analysis from remove_pil to master

Squashed commit of the following:

commit 83c8d88f3d48404251470176c70979ee75ae068b
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Thu Jul 21 10:51:51 2022 +0200

    remove deprecated server tests

commit cebc03b5399ac257a74036b41997201f882f5b74
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Thu Jul 21 10:51:08 2022 +0200

    remove deprecated server tests

commit ce2845b0c51f001b7b5b8b195d6bf7e034ec4e39
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Wed Jul 20 17:05:00 2022 +0200

    repair tests to work without pillow WIP

commit 023fdab8322f28359a24c63e32635a3d0deccbe4
Author: Isaac Riley <Isaac.Riley@iqser.com>
Date:   Wed Jul 20 16:40:36 2022 +0200

    fixed typo

commit 33850ca83a175f74789ae6b9bebd057ed84b7fb3
Author: Isaac Riley <Isaac.Riley@iqser.com>
Date:   Wed Jul 20 16:38:37 2022 +0200

    fixed import from refactored open_img.py

commit dbc6d345f074e538948e2c4f94ebed8a5ef520bc
Author: Isaac Riley <Isaac.Riley@iqser.com>
Date:   Wed Jul 20 16:32:42 2022 +0200

    removed PIL from production code, now only in scripts
2022-07-21 13:25:00 +02:00
Julius Unverfehrt
a2451b9103 Pull request #17: Add pdf2array func
Merge in RR/cv-analysis from add-pdf2array-func to master

Squashed commit of the following:

commit 6e6e9a509ede0abf28fb93a2042960efcc9453bd
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Wed Jul 20 09:12:01 2022 +0200

    update script with layout parsing, refactor pdf2array

commit 191bc71f58aa5c07b0cadbdb7067cd72c3d8858b
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Wed Jul 20 09:10:06 2022 +0200

    update script with layout parsing, refactor pdf2array

commit 25201bbb4151a23784193181272d379232877d2f
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Wed Jul 20 08:33:20 2022 +0200

    add pdf2array functionality
2022-07-20 11:01:55 +02:00
Julius Unverfehrt
ce9e92876c Pull request #16: Add table parsing fixtures
Merge in RR/cv-analysis from add_table_parsing_fixtures to master

Squashed commit of the following:

commit cfc89b421b61082c8e92e1971c9d0bf4490fa07e
Merge: a7ecb05 73c66a8
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Mon Jul 11 12:19:01 2022 +0200

    Merge branch 'master' of ssh://git.iqser.com:2222/rr/cv-analysis into add_table_parsing_fixtures

commit a7ecb05b7d8327f0c7429180f63a380b61b06bc3
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Mon Jul 11 12:02:07 2022 +0200

    refactor

commit 466f217e5a9ee5c54fd38c6acd28d54fc38ff9bb
Author: llocarnini <lillian.locarnini@iqser.com>
Date:   Mon Jul 11 10:24:14 2022 +0200

    deleted unused imports and unused lines of code

commit c58955c8658d0631cdd1c24c8556d399e3fd9990
Author: llocarnini <lillian.locarnini@iqser.com>
Date:   Mon Jul 11 10:16:01 2022 +0200

    black reformatted files

commit f8bcb10a00ff7f0da49b80c1609b17997411985a
Author: llocarnini <lillian.locarnini@iqser.com>
Date:   Tue Jul 5 15:15:00 2022 +0200

    reformat files

commit 432e8a569fd70bd0745ce0549c2bfd2f2e907763
Author: llocarnini <lillian.locarnini@iqser.com>
Date:   Tue Jul 5 15:08:22 2022 +0200

    added better test for generic pages with table WIP as thicker lines create inconsistent results.
    added test for patchy tables which does not work yet

commit 2aac9ebf5c76bd963f8c136fe5dd4c2d7681b469
Author: llocarnini <lillian.locarnini@iqser.com>
Date:   Mon Jul 4 16:56:29 2022 +0200

    added new fixtures for table_parsing_test.py

commit 37606cac0301b13e99be2c16d95867477f29e7c4
Author: llocarnini <lillian.locarnini@iqser.com>
Date:   Fri Jul 1 16:02:44 2022 +0200

    added separate file for table parsing fixtures, where fixtures for generic tables were added. WIP tests for generic table fixtures
2022-07-11 12:25:16 +02:00
Julius Unverfehrt
048d5df22b Pull request #14: add processing logs (on debug only to prevent log flood since cv-analysis works on pages)
Merge in RR/cv-analysis from add-logs to master

Squashed commit of the following:

commit d03755c56a60191cd57e176da80a7dd235874755
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Thu Jul 7 14:42:51 2022 +0200

    disable image logging for production

commit 05186b6025fc1020a959ea04be552c8ea79716a2
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Thu Jul 7 14:34:44 2022 +0200

    add processing logs (on debug only to prevent log flood since cv-analysis works on pages)
2022-07-07 15:18:06 +02:00