321 Commits

Author SHA1 Message Date
Isaac Riley
1618909d8e Merge branch 'master' of ssh://git.iqser.com:2222/rr/cv-analysis 2022-07-26 13:13:29 +02:00
Isaac Riley
29c8d204e1 small fix to annotate.py 2022-07-26 13:03:34 +02:00
Julius Unverfehrt
a871fa3bd3 Pull request #19: Refactor evaluate
Merge in RR/cv-analysis from refactor-evaluate to master

Squashed commit of the following:

commit cde03a492452610322f8b7d3eb804a51afb76d81
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Fri Jul 22 12:37:36 2022 +0200

    add optional show analysis metadata dict

commit fb8bb9e2afa7767f2560f865516295be65f97f20
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Fri Jul 22 12:13:18 2022 +0200

    add script to evaluate runtime per page for all cv-analysis operations for multiple PDFs

commit 721e823e2ec38aae3fea51d01e2135fc8f228d94
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Fri Jul 22 10:30:31 2022 +0200

    refactor

commit a453753cfa477e162e5902ce191ded61cb678337
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Fri Jul 22 10:19:24 2022 +0200

    add logic to transform result coordinates according to page rotation, update annotation script to use this logic

commit 71c09758d0fb763a2c38c6871e1d9bf51f2e7c41
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Thu Jul 21 15:57:49 2022 +0200

    introduce pipeline for image conversion, analysis and result formatting

commit aef252a41b9658dd0c4f55aa2d9f84de933586e0
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Thu Jul 21 15:57:38 2022 +0200

    introduce pipeline for image conversion, analysis and result formatting
master_26
2022-07-22 15:11:40 +02:00
Julius Unverfehrt
e7b28f5bda Pull request #18: Remove pil
Merge in RR/cv-analysis from remove_pil to master

Squashed commit of the following:

commit 83c8d88f3d48404251470176c70979ee75ae068b
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Thu Jul 21 10:51:51 2022 +0200

    remove deprecated server tests

commit cebc03b5399ac257a74036b41997201f882f5b74
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Thu Jul 21 10:51:08 2022 +0200

    remove deprecated server tests

commit ce2845b0c51f001b7b5b8b195d6bf7e034ec4e39
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Wed Jul 20 17:05:00 2022 +0200

    repair tests to work without pillow WIP

commit 023fdab8322f28359a24c63e32635a3d0deccbe4
Author: Isaac Riley <Isaac.Riley@iqser.com>
Date:   Wed Jul 20 16:40:36 2022 +0200

    fixed typo

commit 33850ca83a175f74789ae6b9bebd057ed84b7fb3
Author: Isaac Riley <Isaac.Riley@iqser.com>
Date:   Wed Jul 20 16:38:37 2022 +0200

    fixed import from refactored open_img.py

commit dbc6d345f074e538948e2c4f94ebed8a5ef520bc
Author: Isaac Riley <Isaac.Riley@iqser.com>
Date:   Wed Jul 20 16:32:42 2022 +0200

    removed PIL from production code, now only in scripts
2022-07-21 13:25:00 +02:00
Isaac Riley
023fdab832 fixed typo 2022-07-20 16:40:36 +02:00
Isaac Riley
33850ca83a fixed import from refactored open_img.py 2022-07-20 16:38:37 +02:00
Isaac Riley
dbc6d345f0 removed PIL from production code, now only in scripts 2022-07-20 16:32:42 +02:00
Julius Unverfehrt
a2451b9103 Pull request #17: Add pdf2array func
Merge in RR/cv-analysis from add-pdf2array-func to master

Squashed commit of the following:

commit 6e6e9a509ede0abf28fb93a2042960efcc9453bd
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Wed Jul 20 09:12:01 2022 +0200

    update script with layout parsing, refactor pdf2array

commit 191bc71f58aa5c07b0cadbdb7067cd72c3d8858b
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Wed Jul 20 09:10:06 2022 +0200

    update script with layout parsing, refactor pdf2array

commit 25201bbb4151a23784193181272d379232877d2f
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Wed Jul 20 08:33:20 2022 +0200

    add pdf2array functionality
master_24
2022-07-20 11:01:55 +02:00
Julius Unverfehrt
ce9e92876c Pull request #16: Add table parsing fixtures
Merge in RR/cv-analysis from add_table_parsing_fixtures to master

Squashed commit of the following:

commit cfc89b421b61082c8e92e1971c9d0bf4490fa07e
Merge: a7ecb05 73c66a8
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Mon Jul 11 12:19:01 2022 +0200

    Merge branch 'master' of ssh://git.iqser.com:2222/rr/cv-analysis into add_table_parsing_fixtures

commit a7ecb05b7d8327f0c7429180f63a380b61b06bc3
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Mon Jul 11 12:02:07 2022 +0200

    refactor

commit 466f217e5a9ee5c54fd38c6acd28d54fc38ff9bb
Author: llocarnini <lillian.locarnini@iqser.com>
Date:   Mon Jul 11 10:24:14 2022 +0200

    deleted unused imports and unused lines of code

commit c58955c8658d0631cdd1c24c8556d399e3fd9990
Author: llocarnini <lillian.locarnini@iqser.com>
Date:   Mon Jul 11 10:16:01 2022 +0200

    black reformatted files

commit f8bcb10a00ff7f0da49b80c1609b17997411985a
Author: llocarnini <lillian.locarnini@iqser.com>
Date:   Tue Jul 5 15:15:00 2022 +0200

    reformat files

commit 432e8a569fd70bd0745ce0549c2bfd2f2e907763
Author: llocarnini <lillian.locarnini@iqser.com>
Date:   Tue Jul 5 15:08:22 2022 +0200

    added better test for generic pages with table WIP as thicker lines create inconsistent results.
    added test for patchy tables which does not work yet

commit 2aac9ebf5c76bd963f8c136fe5dd4c2d7681b469
Author: llocarnini <lillian.locarnini@iqser.com>
Date:   Mon Jul 4 16:56:29 2022 +0200

    added new fixtures for table_parsing_test.py

commit 37606cac0301b13e99be2c16d95867477f29e7c4
Author: llocarnini <lillian.locarnini@iqser.com>
Date:   Fri Jul 1 16:02:44 2022 +0200

    added separate file for table parsing fixtures, where fixtures for generic tables were added. WIP tests for generic table fixtures
master_23 remove_pil_2
2022-07-11 12:25:16 +02:00
Julius Unverfehrt
73c66a85c6 Pull request #15: Refactor logging
Merge in RR/cv-analysis from refactor-logging to master

Squashed commit of the following:

commit 2ef2ad4a590b5732649945695303dbc98f1c4918
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Thu Jul 7 16:43:56 2022 +0200

    update pyinfra

commit 8b4f833c66953ae39fd1d7270add4d10a61a6685
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Thu Jul 7 16:37:19 2022 +0200

    adjust logs
master_22
2022-07-11 09:36:57 +02:00
Julius Unverfehrt
048d5df22b Pull request #14: add processing logs (on debug only to prevent log flood since cv-analysis works on pages)
Merge in RR/cv-analysis from add-logs to master

Squashed commit of the following:

commit d03755c56a60191cd57e176da80a7dd235874755
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Thu Jul 7 14:42:51 2022 +0200

    disable image logging for production

commit 05186b6025fc1020a959ea04be552c8ea79716a2
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Thu Jul 7 14:34:44 2022 +0200

    add processing logs (on debug only to prevent log flood since cv-analysis works on pages)
2022-07-07 15:18:06 +02:00
Julius Unverfehrt
f37b6d7d8e Pull request #13: Add pdf coord conversion
Merge in RR/cv-analysis from add-pdf-coord-conversion to master

Squashed commit of the following:

commit f56b7b45feb78142b032ef0faae2ca8dd020e6c5
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Thu Jul 7 11:26:46 2022 +0200

    update pyinfra

commit 9086ef0a2059688fb8dd5559cda831bbbd36362b
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Thu Jul 7 11:21:53 2022 +0200

    update input metadata keys

commit 55f147a5848e22ea62242ea883a0ce53ef1c04a5
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Thu Jul 7 09:16:16 2022 +0200

    update to new input metadata signature

commit df4652fb027f734f2613e4adb7bc5b17edee62e9
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Wed Jul 6 16:55:36 2022 +0200

    refactor

commit e52c674085a9c7411c55a2e0993aa34622284317
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Wed Jul 6 16:15:21 2022 +0200

    update build script, refactor

commit 1f874aea591f25544aaa3f39a4e38fa50a24615e
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Tue Jul 5 17:01:15 2022 +0200

    add rotation formatter

commit b78a69741287a4cd38a90ace98f67e8f1b803737
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Tue Jul 5 09:26:27 2022 +0200

    refactor

commit b3155b8e072530f99114f3ee9135e73afc8f85cb
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Fri Jul 1 15:06:45 2022 +0200

    made assertion robust to floating point precision

commit 4169102a6b5053500a3db2d789d265c2c77d56a4
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Fri Jul 1 15:06:01 2022 +0200

    improve banner

commit dea74593d925c802489e5400297b48a9729038f0
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Fri Jul 1 14:28:08 2022 +0200

    introduce derotation logic for rectangles from rotated pdfs, introduce continuous option for coordinates in Rectangle class

commit d07e1dc2731ea7ae9887cc02bb98155bf1565a0d
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Fri Jul 1 10:39:38 2022 +0200

    introduce table parsing formatter to convert pixel values to inches

commit 67ff6730dd7073a0fc9e9698904325dea9537c5b
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Fri Jul 1 08:06:42 2022 +0200

    fixed duplicate logging

commit 6c025409415329028f697bb99986cd0912c7ed54
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Thu Jun 30 17:10:32 2022 +0200

    add pyinfra mock script
0.1.1 master_18
2022-07-07 11:35:12 +02:00
Julius Unverfehrt
fc8a9e15f8 Pull request #12: Diff font sizes on page
Merge in RR/cv-analysis from diff-font-sizes-on-page to master

Squashed commit of the following:

commit d1b32a3e8fadd45d38040e1ba96672ace240ae29
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Thu Jun 30 14:43:30 2022 +0200

    add tests for figure detection first iteration

commit c38a7701afaad513320f157fe7188b3f11a682ac
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Thu Jun 30 14:26:08 2022 +0200

    update text tests with new test cases

commit ccc0c1a177c7d69c9575ec0267a492c3eef008e3
Author: llocarnini <lillian.locarnini@iqser.com>
Date:   Wed Jun 29 23:09:24 2022 +0200

    added fixture for different scaled text on page and parameter for different font style

commit 5f36a634caad2849e673de7d64abb5b6c3a6055f
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Tue Jun 28 17:03:52 2022 +0200

    add pdf2pdf annotate script for figure detection

commit 7438c170371e166e82ab19f9dfdf1bddd89b7bb3
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Tue Jun 28 16:24:52 2022 +0200

    optimize algorithm

commit 93bf8820f856d3815bab36b13c0df189c45d01e0
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Tue Jun 28 16:11:15 2022 +0200

    black

commit 59c639eec7d3f9da538b0ad6cd6215456c92eb58
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Tue Jun 28 16:10:39 2022 +0200

    add tests for figure detection pipeline

commit bada688d88231843e9d299d255d9c4e0d5ca9788
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Tue Jun 28 13:34:36 2022 +0200

    refactor tests

commit 614388a18b46d670527727c11f63e8174aed3736
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Tue Jun 28 13:34:14 2022 +0200

    introduce pipeline logic for figure detection

commit 7195f892d543294829aebe80e260b4395b89cb36
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Tue Jun 28 11:58:41 2022 +0200

    update reqs

commit 4408e7975853196c5e363dd2ddf62e15fe6f4944
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Tue Jun 28 11:56:16 2022 +0200

    add figure detection test

commit 5ff472c2d96238ca2bc1d2368d3d02e62db98713
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Tue Jun 28 11:56:09 2022 +0200

    add figure detection test

commit 66c1307e57c84789d64cb8e41d8e923ac98eebde
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Tue Jun 28 10:36:50 2022 +0200

    refactor draw boxes to work as intended on inversed image

commit 00a39050d051ae43b2a8f2c4efd6bfbd2609dead
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Tue Jun 28 10:36:11 2022 +0200

    refactor module structure

commit f8af01894c387468334a332e75f7dbf545a91f86
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Mon Jun 27 17:07:47 2022 +0200

    add: figure detection now agnostic to input image background color, refactor tests

commit 3bc63da783bced571d53b29b6d82648c9f93e886
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Mon Jun 27 14:31:15 2022 +0200

    add text removal tests

commit 6e794a7cee3fd7633aa5084839775877b0f8794c
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Mon Jun 27 12:12:27 2022 +0200

    figure detection tests WIP

commit f8b20d4c9845de6434142e3dab69ce467fbc7a75
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Fri Jun 24 15:39:37 2022 +0200

    add tests for figure_detection WIP

commit f2a52a07a5e261962214dff40ba710c93993f6fb
Author: llocarnini <lillian.locarnini@iqser.com>
Date:   Fri Jun 24 14:28:44 2022 +0200

    added third test case "figure_and_text"

commit 8f45c88278cdcd32a121ea8269c8eca816bffd0b
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Fri Jun 24 13:25:17 2022 +0200

    add tests for figure_detection
master_17
2022-06-30 14:50:58 +02:00
Julius Unverfehrt
3ae4d81bb9 update dependencies master_16 2022-06-23 16:54:13 +02:00
Julius Unverfehrt
618880241c update dependencies 2022-06-23 16:46:26 +02:00
Julius Unverfehrt
956e673701 update dependencies 2022-06-23 16:37:46 +02:00
Julius Unverfehrt
a0abae195c update dependencies 2022-06-23 16:30:53 +02:00
Julius Unverfehrt
6d1ca4d6a3 Pull request #11: Integrate new pyinfra
Merge in RR/cv-analysis from integrate-new-pyinfra to master

Squashed commit of the following:

commit f27b7eb342838b7a235a062a04363dc417f859ad
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Thu Jun 23 14:24:03 2022 +0200

    refactor table test

commit 9f57cc7d72bffc106c852041666b2f11eb6eacc3
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Thu Jun 23 14:07:37 2022 +0200

    debug bamboo

commit 30911cc5a34559a8b622634ddf974a9860481d17
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Thu Jun 23 13:22:04 2022 +0200

    track test data with dvc

commit 501460c3c99482879ae585872bd67fd67693c47a
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Thu Jun 23 13:19:39 2022 +0200

    untrack test data

commit f65ade167802901a6f402618c062df0120279df3
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Thu Jun 23 12:02:43 2022 +0200

    refactor&extend tests

commit 8c9dc41ddeda5b0f630a267e328d1c09f69bdb04
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Thu Jun 23 09:36:26 2022 +0200

    debug bamboo

commit f0b38130502475cf9bfa8632d3b0eb3a84b32b7d
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Thu Jun 23 09:27:42 2022 +0200

    debug bamboo

commit 0f188b4eb5293cf2bc4024fb397f161ad3b867bd
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Thu Jun 23 09:23:38 2022 +0200

    update build script

commit 281e13d822790deefa3d1a4f2519d300d84cded3
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Thu Jun 23 09:21:31 2022 +0200

    refactor tests

commit e90e84cb3b13b2903611985cc9eb3b5b7bf0262e
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Thu Jun 23 08:54:29 2022 +0200

    parametrize analysis_fn for server logic, refactor tests

commit 20734bcd14fec489e80ea6900dba64de4b190398
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Thu Jun 23 08:53:16 2022 +0200

    outsource tests from module

commit cd2c41762df1a231f2ed1d43c3b71d2443530ffa
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Wed Jun 22 14:26:36 2022 +0200

    add tests for analyse server logic

commit 16497ac4ec8b0d7064f6d8dd887c189f0d955a1d
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Wed Jun 22 11:36:34 2022 +0200

    debug build script

commit 45688c1c6d9b738cce519edcdc044aae3b800cd1
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Wed Jun 22 11:33:13 2022 +0200

    debug build script

commit 0576140916c0cd9d290dd02225621e5360665d71
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Wed Jun 22 10:51:51 2022 +0200

    update tests

commit fcbecdde95cef46bce46545af65d040cc918447b
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Wed Jun 22 10:04:30 2022 +0200

    rename operations, update requirements

commit 7b40f6d643bb332fd7dd0867d64f17db16ede5bb
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Wed Jun 22 10:03:48 2022 +0200

    adjust deployment scripts

commit b66f937d2e0abc79e68bce6ee058bc0bd5cb86e5
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Tue Jun 21 13:32:44 2022 +0200

    refactor server logic, use operation2function logic for pyinfra server

commit 5e7247f85cacaa6c0643796a98f13642db3e59e1
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Mon Jun 20 17:23:11 2022 +0200

    add server logic for pyinfra 2

commit eecb985fed76af9404bd99f0104508efe7d75e35
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date:   Mon Jun 20 16:24:05 2022 +0200

    add server logic for pyinfra 2.0.0

... and 3 more commits
2022-06-23 14:45:08 +02:00
Julius Unverfehrt
0858a69364 update planspec in order to add pyinfra as subrepo to bamboo, since it can't be updated on other branches 2022-06-22 12:43:58 +02:00
Isaac Riley
268329a57f add pyinfra_compat.py 2022-06-20 13:48:16 +02:00
Isaac Riley
b66a7f15e1 added pyinfra_compat file, usage: from cv_analysis.pyinfra_compat import analyze_byteslist; page_results = analyze_byteslist(img_bytes_list) 2022-06-14 09:09:00 +02:00
Isaac Riley
0d9d577187 reformat 2022-06-13 13:04:15 +02:00
Isaac Riley
c62ab08b98 ready for integration with pyinfra 2022-06-13 12:59:00 +02:00
Isaac Riley
01803d452a Merge branch 'fig-detection-scanned-pdfs' master_8 2022-05-24 17:07:09 +02:00
llocarnini
f5a75f3949 changes in export_example_pages.py as well as removing unused imports in table_parsing.py 2022-05-24 16:20:52 +02:00
llocarnini
e6a173053b Merge remote-tracking branch 'origin/fig-detection-scanned-pdfs' into fig-detection-scanned-pdfs 2022-05-24 09:33:18 +02:00
llocarnini
90dfacab21 deleted function for processing testfiles 2022-05-24 09:32:48 +02:00
llocarnini
c4c85ace6d added locations and changed names for test_files 2022-05-24 09:31:29 +02:00
Isaac Riley
a4626e635a removed problematic dvc file 2022-05-24 08:19:25 +02:00
Isaac Riley
3f33ab4f3d resolve a DVC conflict 2022-05-24 08:01:42 +02:00
llocarnini
179ad20165 minor changes, refactoring and testfiles added 2022-05-17 09:17:24 +02:00
llocarnini
0e30e97f80 Merge branch 'master' of ssh://git.iqser.com:2222/rr/cv-analysis into fig-detection-scanned-pdfs
 Conflicts:
	cv_analysis/figure_detection.py
	cv_analysis/layout_parsing.py
	cv_analysis/table_parsing.py
	scripts/annotate.py
2022-05-04 09:33:14 +02:00
Isaac Riley
21d1f087c8 fixed show parameter, for development only master_7 2022-04-27 11:27:38 +02:00
llocarnini
98ed9a4220 Merge remote-tracking branch 'origin/fig-detection-scanned-pdfs' into fig-detection-scanned-pdfs 2022-04-27 11:12:43 +02:00
llocarnini
2c39ffbcdd changed kernel and iteration for better text removal 2022-04-27 11:12:23 +02:00
Isaac Riley
81fe5139c2 fixed tests, passed (still need to extend tests) 2022-04-27 10:52:35 +02:00
Isaac Riley
41e5f55ea7 got changes to table parsing from other branch 2022-04-27 09:18:57 +02:00
Isaac Riley
b806c3c13d fix for table parsing when no outer line is present 2022-04-27 09:15:15 +02:00
Isaac Riley
4ac1cce0e8 reformatting 2022-04-26 16:01:57 +02:00
llocarnini
19fe6965fb added line in display so the visual logger doesn't open too many plots
changes to fig_detection_with_layout.py so tables are getting parsed as well

reuse of adding external contour in table_parsing.py
2022-04-26 11:19:27 +02:00
Isaac Riley
9327fb7231 fixed json format and refactored service functions 2022-04-22 11:22:16 +02:00
llocarnini
17f5b22443 Merge branch 'master' of ssh://git.iqser.com:2222/rr/cv-analysis into fig-detection-scanned-pdfs
 Conflicts:
	cv_analysis/figure_detection.py
	cv_analysis/layout_parsing.py
	cv_analysis/table_parsing.py
	scripts/annotate.py
2022-04-22 10:24:09 +02:00
llocarnini
11a2465789 few corrections for including smaller figures 2022-04-22 10:12:28 +02:00
Isaac Riley
88bb8dbddf added visual logger for development 2022-04-21 15:10:35 +02:00
Isaac Riley
0ea556a7e0 slightly refactored table parsing and deleted unneeded file 2022-04-21 09:17:12 +02:00
llocarnini
3669b6b341 fig_detection_with_layout.py: approach to label the content of a page through layout detection, table parsing for detected tables needs to be added and overall code needs to be reviewed
layout_parsing.py added condition so fig_detection_with_layout.py works
table_parsing.py uncommented line for better table parsing
text.py changed kernel sizes
2022-04-20 09:43:30 +02:00
llocarnini
420e484896 the thresholds deciding whether a contour is likely a primary text structure can be set better, as text structures are not always removed. this leads to over-detection of figures 2022-04-12 16:48:29 +02:00
Isaac Riley
0b96980cc5 keyword 'show' to fix annotation script without causing problems for non-script usage 2022-04-11 09:44:47 +02:00
Isaac Riley
64258ed6e1 fixed hyphen/underscore confusion in cv_analysis master_5 2022-03-23 14:42:39 +01:00
Isaac Riley
80b0ca4ec5 tiny change to test build server 2022-03-23 14:35:00 +01:00