60 Commits

Author SHA1 Message Date
llocarnini
d70781f4aa changed tolerance in adjacent1 function in postprocessing.py from 2 to 4
added function so vertical and horizontal components do not overlap the layout box of the table
2022-02-17 16:45:55 +01:00
llocarnini
57ca47f38d different approaches to isolate line components of tables in scanned pdf files. 2022-02-16 12:37:17 +01:00
llocarnini
c2faf7d00b adjusted isolation of vertical and horizontal components to be more robust to scanned pages; work in progress 2022-02-14 11:04:04 +01:00
llocarnini
885fc22f9d added changes to parse scanned pdfs 2022-02-11 15:59:54 +01:00
llocarnini
07907d45dd some changes to fix some minor bugs in table_parsing.py and post_processing.py 2022-02-10 10:56:03 +01:00
llocarnini
4964c8f5a1 some changes to fix some minor bugs in table_parsing.py and post_processing.py 2022-02-10 10:22:22 +01:00
Matthias Bisping
f7d3e39692 nothing special 2022-02-08 15:05:12 +01:00
Matthias Bisping
87cecadb44 applied black 2022-02-06 21:27:39 +01:00
Matthias Bisping
295666c28f added todo comments 2022-02-06 21:25:01 +01:00
Matthias Bisping
90b8613bf8 filtering non-tables by bounding rect check WIP 2022-02-06 21:03:40 +01:00
Matthias Bisping
36284f9a78 removed obsolete lines 2022-02-06 20:01:00 +01:00
Matthias Bisping
0fc6cf8008 fixed bug in adjacency test 2022-02-06 20:00:38 +01:00
Matthias Bisping
106b333dca filtering for connected cells... but does not quite work yet 2022-02-06 16:44:07 +01:00
Matthias Bisping
36a62a13e5 refactoring 2022-02-06 14:54:53 +01:00
Matthias Bisping
e652da1fa8 refactoring 2022-02-06 14:53:17 +01:00
Matthias Bisping
d9567da428 refactoring 2022-02-06 14:47:18 +01:00
Matthias Bisping
9d30009dce refactoring 2022-02-06 14:45:53 +01:00
Matthias Bisping
e8863d67aa refactoring 2022-02-06 14:42:45 +01:00
Matthias Bisping
89a99d3586 refactoring 2022-02-06 14:39:49 +01:00
Matthias Bisping
aa66b6865b refactoring 2022-02-06 14:31:21 +01:00
Matthias Bisping
98d77cb522 refactoring 2022-02-06 14:27:55 +01:00
Matthias Bisping
fed3a7e4f1 refactoring 2022-02-06 14:26:16 +01:00
Matthias Bisping
504cafbd5d renaming 2022-02-06 14:25:44 +01:00
Matthias Bisping
c9780a57e5 removed obsolete import 2022-02-06 14:24:41 +01:00
Matthias Bisping
d555e86475 refactored figure detection once 2022-02-06 14:24:04 +01:00
Matthias Bisping
8432cfe514 Pull request #8: figure detection
Merge in RR/vidocp from text_removal to master

Squashed commit of the following:

commit b65374c512ce9ba07fa522d591c83db3de5d7d55
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Sun Feb 6 01:03:12 2022 +0100

    readme updated

commit 1c1f7a395a00fa505cf19e1ad87d8c34faa6ef5b
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Sun Feb 6 01:00:46 2022 +0100

    figure detection version 1 completed

commit f257660823ef8682e9fedda9921ad946ef2ade76
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Sun Feb 6 00:37:03 2022 +0100

    wip

commit 2e89b28f4a69da80570597c823b3b7a591788d0a
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Sun Feb 6 00:23:56 2022 +0100

    wip
2022-02-06 01:04:15 +01:00
Matthias Bisping
b82a294610 Pull request #7: Layout detection version 3
Merge in RR/vidocp from layout_detection_version_3 to master

Squashed commit of the following:

commit 262b1c14c0b8b164221d39fd286b20914d1a8e6a
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Sat Feb 5 22:56:10 2022 +0100

    comment

commit 975dcdaae2b0e9bfcb075fe1c87adc48175c0d93
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Sat Feb 5 22:50:41 2022 +0100

    applied black

commit 49ba3b5f318a1b5d6bb39c0b53de5e237a87da96
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Sat Feb 5 22:48:44 2022 +0100

    improved layout parsing logic: filtering of included rects

commit d78ac24c10793f72b569c3c827834400b730888a
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Sat Feb 5 22:36:49 2022 +0100

    improved layout parsing logic: filtering of overlaps, no sub-text regions
2022-02-05 22:58:51 +01:00
Matthias Bisping
7ceb534fa2 readme updated 2022-02-05 20:31:13 +01:00
Matthias Bisping
8a9b493c94 readme updated 2022-02-05 20:29:23 +01:00
Matthias Bisping
e62a88ff9a fixed import 2022-02-05 20:27:29 +01:00
Matthias Bisping
bb5707dc89 Pull request #6: added layout parsing logic
Merge in RR/vidocp from layout_detection_version_2 to master

Squashed commit of the following:

commit d443e95ad8143bed3efc74d9e38640498d8d16bf
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Sat Feb 5 20:16:13 2022 +0100

    readme updated

commit 953ad696932454ce851544ed016f9e64bcc12080
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Sat Feb 5 20:14:59 2022 +0100

    added layout parsing logic
2022-02-05 20:17:14 +01:00
Matthias Bisping
00748a8ac0 Pull request #5: Table parsing version 2
Merge in RR/vidocp from table_parsing_version_2 to master

Squashed commit of the following:

commit af136ca10cf96f99699e409000ff598ce90c192e
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Sat Feb 5 18:34:01 2022 +0100

    readme updated

commit 13ca7b1b03cb2bf7b3c8ef5821c1f8fa9ec532a0
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Sat Feb 5 18:32:11 2022 +0100

    drawing color standardized

commit 654e961c62ddc0f512074e8238d7fa88f0ea227e
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Sat Feb 5 18:22:57 2022 +0100

    refactoring

commit 964c17a36f7bbc1376dfe68f4ea90462d676e215
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Sat Feb 5 18:07:16 2022 +0100

    readme updated

commit 4470969b35bb76e68cc41947fa02e63100b30ce9
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Sat Feb 5 18:05:35 2022 +0100

    readme updated

commit a6c6bdb1e71a778a3c21a628cfb30acc5bc6086f
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Sat Feb 5 18:05:21 2022 +0100

    readme updated

commit e178793dd69b720adefe7533312314e4c405f975
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Sat Feb 5 18:03:45 2022 +0100

    readme updated

commit 443163864bab56930c2ef735c0aaafddd2561ead
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Sat Feb 5 17:59:03 2022 +0100

    implemented clean solution for parsing open tables. still needs final refactoring.
2022-02-05 19:32:47 +01:00
Matthias Bisping
224360c823 readme updated 2022-02-05 16:23:42 +01:00
Matthias Bisping
9507bffd89 readme updated 2022-02-05 16:20:23 +01:00
Matthias Bisping
e454938c1e readme updated 2022-02-05 16:20:05 +01:00
Matthias Bisping
5491f511e2 readme updated 2022-02-05 16:19:27 +01:00
Matthias Bisping
5fec1db1e6 readme updated 2022-02-05 16:18:01 +01:00
Matthias Bisping
d5f445b5e1 readme updated 2022-02-05 16:17:39 +01:00
Matthias Bisping
3d4b924426 Pull request #4: Restructuring and renaming of module
Merge in RR/vidocp from poly_to_rects_segmentation to master

Squashed commit of the following:

commit 3dffe067ef0bb4796eab22007eb6970b29f47822
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Sat Feb 5 16:10:28 2022 +0100

    readme updated

commit 448517205259134a8427b48d86d0d5331b726487
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Sat Feb 5 16:09:35 2022 +0100

    restructured dirs

commit 058c2971631c71d520b1a94ea75e249f9234ad87
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Sat Feb 5 15:57:08 2022 +0100

    renaming

commit 4e64a3d07f1dad76775955639157ec7b60e6ad38
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Sat Feb 5 15:46:03 2022 +0100

    readme updated

commit 728bedb13a2769b4652fd674ef26988efebcc7dc
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Sat Feb 5 15:33:42 2022 +0100

    added DVC

commit e2d5594afd6683d8207007d3a85d178dd0a3e546
Author: Matthias Bisping <matthias.bisping@iqser.com>
Date:   Sat Feb 5 14:49:09 2022 +0100

    renaming
2022-02-05 16:14:24 +01:00
Matthias Bisping
512d217b05 renaming 2022-02-05 14:47:28 +01:00
Matthias Bisping
f5f60c57da applied black 2022-02-05 14:45:11 +01:00
Matthias Bisping
8dc2685d9a renaming 2022-02-05 14:44:32 +01:00
Matthias Bisping
41b9583ba6 readme fix 2022-02-05 14:43:01 +01:00
Matthias Bisping
cfa3dbf31c readme fix 2022-02-05 14:42:34 +01:00
Matthias Bisping
8c88fc594d renaming; readme 2022-02-05 14:42:00 +01:00
Matthias Bisping
c435d25ac0 typo 2022-02-05 14:31:46 +01:00
Matthias Bisping
2aced51dfd removed debug prints 2022-02-04 17:44:02 +01:00
Julius Unverfehrt
baaee6f5b7 Pull request #3: Layout detection happy little accident
Merge in RR/table_parsing from layout_detetciton_happy_little_accident to master

Squashed commit of the following:

commit eb4452c9a488df16085a16eba08b7a182274d331
Author: Julius Unverfehrt <Julius.Unverfehrt@iqser.com>
Date:   Fri Feb 4 16:08:12 2022 +0100

    Empty line added

commit d2fedf9a2f982af2157a408077654d388ca6cc6d
Author: Julius Unverfehrt <Julius.Unverfehrt@iqser.com>
Date:   Fri Feb 4 16:07:20 2022 +0100

    Empty line added

commit 638d14a4b6c7b4d34222fd3b4cbb8ce79bb32ef0
Author: Julius Unverfehrt <Julius.Unverfehrt@iqser.com>
Date:   Fri Feb 4 16:05:43 2022 +0100

    Quickfix typo

commit 0271d2ba2e51227aa53e128bf857394e5b5b2d48
Author: Julius Unverfehrt <Julius.Unverfehrt@iqser.com>
Date:   Fri Feb 4 15:57:06 2022 +0100

    black

commit c95c4ad3f3d01857e7dd1dde0802ed7f2a5837c1
Author: Julius Unverfehrt <Julius.Unverfehrt@iqser.com>
Date:   Fri Feb 4 15:53:42 2022 +0100

    Refactored layout_detection prototype

commit 766bd0b916b532885e44a13581f100ffaa39bb55
Author: Julius Unverfehrt <Julius.Unverfehrt@iqser.com>
Date:   Fri Feb 4 13:16:36 2022 +0100

    reset table_parsing to table parsing functionality, moved layout detection accident to layout_detection

commit 7c8955f56dfae2aef814caf4cbc6e903406994ba
Merge: 9a065a0 af5c6d0
Author: Julius Unverfehrt <Julius.Unverfehrt@iqser.com>
Date:   Fri Feb 4 13:11:28 2022 +0100

    Merge branch 'master' of ssh://git.iqser.com:2222/rr/table_parsing into layout_detetciton_happy_little_accident

commit 9a065a0e7f62823a3b18e301d12c80b1a74f0b3e
Author: Julius Unverfehrt <Julius.Unverfehrt@iqser.com>
Date:   Thu Feb 3 16:45:09 2022 +0100

    Made Bob proud
2022-02-04 16:08:39 +01:00
Matthias Bisping
af5c6d0b34 Pull request #2: improved box detection
Merge in RR/table_parsing from box_detection_version_2 to master

* commit 'b00b914caf48e9c471f580b033278ba4f6c76150':
  improved box detection
2022-02-04 12:54:07 +01:00
Matthias Bisping
b00b914caf improved box detection 2022-02-04 12:49:51 +01:00