28 Commits

Author SHA1 Message Date
iriley
b43033e6bf chore: repo housekeeping: adapt pre-commit and versioning script 2024-04-29 13:20:14 +02:00
iriley
5d13d8b3d0 chore: formatting and linting 2024-04-29 12:09:44 +02:00
Julius Unverfehrt
ea25b57dd9 update pdf2image module 2022-08-10 14:17:57 +02:00
Julius Unverfehrt
59a0a61708 Pull request #25: Pdf2image
Merge in RR/cv-analysis from pdf2image to master

Squashed commit of the following:

commit 1353f54d2dceb0a79b1f81bfa2c035f5a454275a
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Wed Aug 10 09:07:31 2022 +0200

    add deRotation and transformation via rectanglePlus

commit 51459dbf57a86e3eac66ec0da02de40dc1b68796
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Tue Aug 9 08:53:50 2022 +0200

    add derotation and to pdf coords transformation to cv-analysis output

commit 733991e2f5a4664205b2f7cc756cebcbc9ee3930
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Mon Aug 8 15:15:13 2022 +0200

    update pipeline with derotation logic WIP
2022-08-10 09:17:59 +02:00
Isaac Riley
29c8d204e1 small fix to annotate.py 2022-07-26 13:03:34 +02:00
Isaac Riley
dbc6d345f0 removed PIL from production code, now only in scripts 2022-07-20 16:32:42 +02:00
Julius Unverfehrt
fc8a9e15f8 Pull request #12: Diff font sizes on page
Merge in RR/cv-analysis from diff-font-sizes-on-page to master

Squashed commit of the following:

commit d1b32a3e8fadd45d38040e1ba96672ace240ae29
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Thu Jun 30 14:43:30 2022 +0200

    add tests for figure detection first iteration

commit c38a7701afaad513320f157fe7188b3f11a682ac
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Thu Jun 30 14:26:08 2022 +0200

    update text tests with new test cases

commit ccc0c1a177c7d69c9575ec0267a492c3eef008e3
Author: llocarnini <lillian.locarnini@iqser.com>
Date:   Wed Jun 29 23:09:24 2022 +0200

    added fixture for different scaled text on page and parameter for different font style

commit 5f36a634caad2849e673de7d64abb5b6c3a6055f
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Tue Jun 28 17:03:52 2022 +0200

    add pdf2pdf annotate script for figure detection

commit 7438c170371e166e82ab19f9dfdf1bddd89b7bb3
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Tue Jun 28 16:24:52 2022 +0200

    optimize algorithm

commit 93bf8820f856d3815bab36b13c0df189c45d01e0
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Tue Jun 28 16:11:15 2022 +0200

    black

commit 59c639eec7d3f9da538b0ad6cd6215456c92eb58
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Tue Jun 28 16:10:39 2022 +0200

    add tests for figure detection pipeline

commit bada688d88231843e9d299d255d9c4e0d5ca9788
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Tue Jun 28 13:34:36 2022 +0200

    refactor tests

commit 614388a18b46d670527727c11f63e8174aed3736
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Tue Jun 28 13:34:14 2022 +0200

    introduce pipeline logic for figure detection

commit 7195f892d543294829aebe80e260b4395b89cb36
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Tue Jun 28 11:58:41 2022 +0200

    update reqs

commit 4408e7975853196c5e363dd2ddf62e15fe6f4944
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Tue Jun 28 11:56:16 2022 +0200

    add figure detection test

commit 5ff472c2d96238ca2bc1d2368d3d02e62db98713
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Tue Jun 28 11:56:09 2022 +0200

    add figure detection test

commit 66c1307e57c84789d64cb8e41d8e923ac98eebde
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Tue Jun 28 10:36:50 2022 +0200

    refactor draw boxes to work as intended on inverted image

commit 00a39050d051ae43b2a8f2c4efd6bfbd2609dead
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Tue Jun 28 10:36:11 2022 +0200

    refactor module structure

commit f8af01894c387468334a332e75f7dbf545a91f86
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Mon Jun 27 17:07:47 2022 +0200

    add: figure detection now agnostic to input image background color, refactor tests

commit 3bc63da783bced571d53b29b6d82648c9f93e886
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Mon Jun 27 14:31:15 2022 +0200

    add text removal tests

commit 6e794a7cee3fd7633aa5084839775877b0f8794c
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Mon Jun 27 12:12:27 2022 +0200

    figure detection tests WIP

commit f8b20d4c9845de6434142e3dab69ce467fbc7a75
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Fri Jun 24 15:39:37 2022 +0200

    add tests for figure_detection WIP

commit f2a52a07a5e261962214dff40ba710c93993f6fb
Author: llocarnini <lillian.locarnini@iqser.com>
Date:   Fri Jun 24 14:28:44 2022 +0200

    added third test case "figure_and_text"

commit 8f45c88278cdcd32a121ea8269c8eca816bffd0b
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Fri Jun 24 13:25:17 2022 +0200

    add tests for figure_detection
2022-06-30 14:50:58 +02:00
Isaac Riley
0d9d577187 reformat 2022-06-13 13:04:15 +02:00
Isaac Riley
c62ab08b98 ready for integration with pyinfra 2022-06-13 12:59:00 +02:00
llocarnini
90dfacab21 deleted function for processing testfiles 2022-05-24 09:32:48 +02:00
llocarnini
c4c85ace6d added locations and changed names for test_files 2022-05-24 09:31:29 +02:00
llocarnini
179ad20165 minor changes, refactoring and testfiles added 2022-05-17 09:17:24 +02:00
llocarnini
17f5b22443 Merge branch 'master' of ssh://git.iqser.com:2222/rr/cv-analysis into fig-detection-scanned-pdfs
 Conflicts:
	cv_analysis/figure_detection.py
	cv_analysis/layout_parsing.py
	cv_analysis/table_parsing.py
	scripts/annotate.py
2022-04-22 10:24:09 +02:00
llocarnini
11a2465789 few corrections for including smaller figures 2022-04-22 10:12:28 +02:00
Isaac Riley
88bb8dbddf added visual logger for development 2022-04-21 15:10:35 +02:00
Isaac Riley
0b96980cc5 keyword 'show' to fix annotation script without causing problems for non-script usage 2022-04-11 09:44:47 +02:00
Isaac Riley
8730b34018 change name from vidocp to cv-analysis 2022-03-23 13:46:57 +01:00
Matthias Bisping
c9b2f6bf29 Pull request #9: Refactoring
Merge in RR/vidocp from refactoring to master

Squashed commit of the following:

commit 36a62a13e51148d2420cb12930e84d78629db6b0
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Sun Feb 6 14:54:53 2022 +0100

    refactoring

commit e652da1fa88a048f9a5211b4e8c0b96074fb5849
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Sun Feb 6 14:53:17 2022 +0100

    refactoring

commit d9567da428c81f9cd7971a657281df0a90166810
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Sun Feb 6 14:47:18 2022 +0100

    refactoring

commit 9d30009dceec0357db6499bfaffae8ce97718ee0
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Sun Feb 6 14:45:53 2022 +0100

    refactoring

commit e8863d67aaaff138fb088c4e496a91b6354cc059
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Sun Feb 6 14:42:45 2022 +0100

    refactoring

commit 89a99d3586db4fbafa743a45bdd02eaf0c1f341f
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Sun Feb 6 14:39:49 2022 +0100

    refactoring

commit aa66b6865b00b0490b9e7695a6bae386e6f96723
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Sun Feb 6 14:31:21 2022 +0100

    refactoring

commit 98d77cb522a08821c3a13ae2cffbe7239c654762
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Sun Feb 6 14:27:55 2022 +0100

    refactoring

commit fed3a7e4f1b8b7ca4e14f9e495459c26490fb50b
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Sun Feb 6 14:26:16 2022 +0100

    refactoring

commit 504cafbd5d4bba183d9943b36c60548aae34e402
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Sun Feb 6 14:25:44 2022 +0100

    renaming

commit c9780a57e5a048529d36958ba678eddb11759cef
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Sun Feb 6 14:24:41 2022 +0100

    removed obsolete import

commit d555e86475e82024f8e1a5fc5b0ac70faa091ee1
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Sun Feb 6 14:24:04 2022 +0100

    refactored figure detection once
2022-02-06 14:55:38 +01:00
Matthias Bisping
8432cfe514 Pull request #8: figure detection
Merge in RR/vidocp from text_removal to master

Squashed commit of the following:

commit b65374c512ce9ba07fa522d591c83db3de5d7d55
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Sun Feb 6 01:03:12 2022 +0100

    readme updated

commit 1c1f7a395a00fa505cf19e1ad87d8c34faa6ef5b
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Sun Feb 6 01:00:46 2022 +0100

    figure detection version 1 completed

commit f257660823ef8682e9fedda9921ad946ef2ade76
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Sun Feb 6 00:37:03 2022 +0100

    wip

commit 2e89b28f4a69da80570597c823b3b7a591788d0a
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Sun Feb 6 00:23:56 2022 +0100

    wip
2022-02-06 01:04:15 +01:00
Matthias Bisping
bb5707dc89 Pull request #6: added layout parsing logic
Merge in RR/vidocp from layout_detection_version_2 to master

Squashed commit of the following:

commit d443e95ad8143bed3efc74d9e38640498d8d16bf
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Sat Feb 5 20:16:13 2022 +0100

    readme updated

commit 953ad696932454ce851544ed016f9e64bcc12080
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Sat Feb 5 20:14:59 2022 +0100

    added layout parsing logic
2022-02-05 20:17:14 +01:00
Matthias Bisping
00748a8ac0 Pull request #5: Table parsing version 2
Merge in RR/vidocp from table_parsing_version_2 to master

Squashed commit of the following:

commit af136ca10cf96f99699e409000ff598ce90c192e
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Sat Feb 5 18:34:01 2022 +0100

    readme updated

commit 13ca7b1b03cb2bf7b3c8ef5821c1f8fa9ec532a0
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Sat Feb 5 18:32:11 2022 +0100

    drawing color standardized

commit 654e961c62ddc0f512074e8238d7fa88f0ea227e
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Sat Feb 5 18:22:57 2022 +0100

    refactoring

commit 964c17a36f7bbc1376dfe68f4ea90462d676e215
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Sat Feb 5 18:07:16 2022 +0100

    readme updated

commit 4470969b35bb76e68cc41947fa02e63100b30ce9
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Sat Feb 5 18:05:35 2022 +0100

    readme updated

commit a6c6bdb1e71a778a3c21a628cfb30acc5bc6086f
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Sat Feb 5 18:05:21 2022 +0100

    readme updated

commit e178793dd69b720adefe7533312314e4c405f975
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Sat Feb 5 18:03:45 2022 +0100

    readme updated

commit 443163864bab56930c2ef735c0aaafddd2561ead
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Sat Feb 5 17:59:03 2022 +0100

    implemented clean solution for parsing open tables. still needs final refactoring.
2022-02-05 19:32:47 +01:00
Matthias Bisping
3d4b924426 Pull request #4: Restructuring and renaming of module
Merge in RR/vidocp from poly_to_rects_segmentation to master

Squashed commit of the following:

commit 3dffe067ef0bb4796eab22007eb6970b29f47822
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Sat Feb 5 16:10:28 2022 +0100

    readme updated

commit 448517205259134a8427b48d86d0d5331b726487
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Sat Feb 5 16:09:35 2022 +0100

    restructured dirs

commit 058c2971631c71d520b1a94ea75e249f9234ad87
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Sat Feb 5 15:57:08 2022 +0100

    renaming

commit 4e64a3d07f1dad76775955639157ec7b60e6ad38
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Sat Feb 5 15:46:03 2022 +0100

    readme updated

commit 728bedb13a2769b4652fd674ef26988efebcc7dc
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Sat Feb 5 15:33:42 2022 +0100

    added DVC

commit e2d5594afd6683d8207007d3a85d178dd0a3e546
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Sat Feb 5 14:49:09 2022 +0100

    renaming
2022-02-05 16:14:24 +01:00
Matthias Bisping
f5f60c57da applied black 2022-02-05 14:45:11 +01:00
Matthias Bisping
8dc2685d9a renaming 2022-02-05 14:44:32 +01:00
Matthias Bisping
8c88fc594d renaming; readme 2022-02-05 14:42:00 +01:00
Julius Unverfehrt
baaee6f5b7 Pull request #3: Layout detection happy little accident
Merge in RR/table_parsing from layout_detetciton_happy_little_accident to master

Squashed commit of the following:

commit eb4452c9a488df16085a16eba08b7a182274d331
Author: Julius Unverfehrt <Julius.Unverfehrt@iqser.com>
Date:   Fri Feb 4 16:08:12 2022 +0100

    Empty line added

commit d2fedf9a2f982af2157a408077654d388ca6cc6d
Author: Julius Unverfehrt <Julius.Unverfehrt@iqser.com>
Date:   Fri Feb 4 16:07:20 2022 +0100

    Empty line added

commit 638d14a4b6c7b4d34222fd3b4cbb8ce79bb32ef0
Author: Julius Unverfehrt <Julius.Unverfehrt@iqser.com>
Date:   Fri Feb 4 16:05:43 2022 +0100

    Quickfix typo

commit 0271d2ba2e51227aa53e128bf857394e5b5b2d48
Author: Julius Unverfehrt <Julius.Unverfehrt@iqser.com>
Date:   Fri Feb 4 15:57:06 2022 +0100

    black

commit c95c4ad3f3d01857e7dd1dde0802ed7f2a5837c1
Author: Julius Unverfehrt <Julius.Unverfehrt@iqser.com>
Date:   Fri Feb 4 15:53:42 2022 +0100

    Refactored layout_detection prototype

commit 766bd0b916b532885e44a13581f100ffaa39bb55
Author: Julius Unverfehrt <Julius.Unverfehrt@iqser.com>
Date:   Fri Feb 4 13:16:36 2022 +0100

    reset table_parsing to table parsing functionality, moved layout detection accident to layout_detection

commit 7c8955f56dfae2aef814caf4cbc6e903406994ba
Merge: 9a065a0 af5c6d0
Author: Julius Unverfehrt <Julius.Unverfehrt@iqser.com>
Date:   Fri Feb 4 13:11:28 2022 +0100

    Merge branch 'master' of ssh://git.iqser.com:2222/rr/table_parsing into layout_detetciton_happy_little_accident

commit 9a065a0e7f62823a3b18e301d12c80b1a74f0b3e
Author: Julius Unverfehrt <Julius.Unverfehrt@iqser.com>
Date:   Thu Feb 3 16:45:09 2022 +0100

    Made Bob proud
2022-02-04 16:08:39 +01:00
Matthias Bisping
b00b914caf improved box detection 2022-02-04 12:49:51 +01:00
Matthias Bisping
ed0c38e32d added box detection logic to find previous redactions 2022-02-03 16:26:23 +01:00