iriley
3b8d6eda04
fix: mapping of image coordinates to pdf coordinates (table inference)
2024-05-15 11:48:31 +02:00
iriley
b43033e6bf
chore: repo housekeeping: adapt pre-commit and versioning script
2024-04-29 13:20:14 +02:00
iriley
5d13d8b3d0
chore: formatting and linting
2024-04-29 12:09:44 +02:00
Julius Unverfehrt
1d3b077ace
chore: parse args in scripts, add colors for drawing lines
2024-04-26 15:02:33 +02:00
iriley
102617fe2f
fix: coordinate remapping
2024-04-26 15:02:33 +02:00
Julius Unverfehrt
0f0fe516d0
feat: work in progress: add annotation logic
2024-04-26 15:02:33 +02:00
Julius Unverfehrt
fa959332cb
chore(build): fix broken build logic
...
Standardizes project structure so the dockerbuild works
2024-02-08 15:14:43 +01:00
Julius Unverfehrt
0a11471191
feat(opentel,dynaconf): adapt new pyinfra
...
This commit also disables a broken test that cannot be fixed. There are
also many scripts that didn't work anyway (and are not needed in my
eyes) that were not updated. The scripts that are needed to run the
service processing locally still work.
2024-02-08 11:19:33 +01:00
francisco.schulz
a08799d7b8
add docker scripts
2023-06-21 15:33:42 +02:00
francisco.schulz
76940a28ba
add k8s startup probe script
2023-06-21 14:10:38 +02:00
Julius Unverfehrt
506ed789f7
add exploratory script for hierarchical layout parsing
2022-12-13 11:16:15 +01:00
Julius Unverfehrt
b26253120c
Pull request #33 : Fix response coords
...
Merge in RR/cv-analysis from fix-response-coords to master
Squashed commit of the following:
commit 0c6178a564b48abc43f129f81d93091a277fc64a
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Thu Oct 6 14:53:02 2022 +0200
update tests
commit 46ad8737593df976555e4f60db8dc7947784d46d
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Thu Oct 6 14:40:25 2022 +0200
rename script
commit f541311d0aae22d5b76ba3c2580aada662812557
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Thu Oct 6 14:40:11 2022 +0200
response now returns natural page index, update pdf2image to correct response coordinates
2022-10-06 14:56:28 +02:00
lillian locarnini
95cab33f19
Pull request #29 : Evaluate layout detection
...
Merge in RR/cv-analysis from evaluate_layout_detection to master
Squashed commit of the following:
commit 8ec2f69fc61d1e15bd502b0a2c1f720cbec2b34e
Author: llocarnini <lillian.locarnini@iqser.com>
Date: Tue Aug 23 15:07:21 2022 +0200
repaired is_not_included() logic (it dropped the outer rectangle instead of the included one)
commit 97be081d1e60989313924ceac0bfb3062229411e
Merge: 2c28fa2 2b5c4f1
Author: llocarnini <lillian.locarnini@iqser.com>
Date: Tue Aug 23 14:28:14 2022 +0200
Merge branch 'master' of ssh://git.iqser.com:2222/rr/cv-analysis into evaluate_layout_detection
commit 2c28fa280b7eff922c715245fffe69702c7e6742
Author: llocarnini <lillian.locarnini@iqser.com>
Date: Tue Aug 23 13:50:17 2022 +0200
del print statements
commit c60121fc4faebc5de556ec0ab7a3af4f815f7ce1
Author: llocarnini <lillian.locarnini@iqser.com>
Date: Mon Aug 22 10:51:52 2022 +0200
few changes to connect_rects.py
commit a99719905d58cbe856fa020177abd7e317c1d072
Author: llocarnini <lillian.locarnini@iqser.com>
Date: Thu Aug 18 08:37:12 2022 +0200
layout parsing improved with connect_rects.py
commit d693688a0f0d63395cfd36645de7b3417f64de30
Author: llocarnini <lillian.locarnini@iqser.com>
Date: Tue Aug 2 09:31:19 2022 +0200
removed vizlogger instances
2022-08-23 15:09:51 +02:00
Julius Unverfehrt
b14a341cfc
readd annotate_pdf script
2022-08-17 14:53:07 +02:00
Julius Unverfehrt
309ae0d57b
Pull request #27 : Image service compat
...
Merge in RR/cv-analysis from image-service-compat to master
Squashed commit of the following:
commit 397d12a96a6b78de762f7b3a80a72427f5f51e97
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Tue Aug 16 16:14:40 2022 +0200
update pdf2image, adjust response format for table-parsing & figure-detection
commit f2061bda8d25d64de974e97f36148dea29af50d9
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Mon Aug 15 08:56:39 2022 +0200
add script to save figure detection data that can be used for image-service pipeline script
2022-08-16 17:04:05 +02:00
Julius Unverfehrt
ea25b57dd9
update pdf2image module
2022-08-10 14:17:57 +02:00
Julius Unverfehrt
7d7cc6026a
update scripts to work with image service and show jsons from storage
2022-08-10 10:35:45 +02:00
Julius Unverfehrt
59a0a61708
Pull request #25 : Pdf2image
...
Merge in RR/cv-analysis from pdf2image to master
Squashed commit of the following:
commit 1353f54d2dceb0a79b1f81bfa2c035f5a454275a
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Wed Aug 10 09:07:31 2022 +0200
add deRotation and transformation via rectanglePlus
commit 51459dbf57a86e3eac66ec0da02de40dc1b68796
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Tue Aug 9 08:53:50 2022 +0200
add derotation and transformation to pdf coords to cv-analysis output
commit 733991e2f5a4664205b2f7cc756cebcbc9ee3930
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Mon Aug 8 15:15:13 2022 +0200
update pipeline with derotation logic WIP
2022-08-10 09:17:59 +02:00
Isaac Riley
9d98945ff9
Pull request #20 : New pyinfra
...
Merge in RR/cv-analysis from new_pyinfra to master
Squashed commit of the following:
commit f7a01a90aad1c402ac537de5bdf15df628ad54df
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Wed Jul 27 10:40:59 2022 +0200
fix typo
commit ff4d549fac5b612c2d391ae85823c5eca1e91916
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Wed Jul 27 10:34:04 2022 +0200
adjust build scripts for new pyinfra
commit ecd70f60d46406d8b6cc7f36a1533d706c917ca8
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Wed Jul 27 09:42:55 2022 +0200
simplify logging by using default configurations
commit 20193c14c940eed2b0a7a72058167e26064119d0
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Tue Jul 26 17:16:57 2022 +0200
tidy-up, refactor config logic to not depend on external files
commit d8069cd4d404a570bb04a04278161669d1c83332
Author: Isaac Riley <Isaac.Riley@iqser.com>
Date: Tue Jul 26 15:14:59 2022 +0200
update pyinfra
commit c3bc11037cca9baf016043ab997c566f5b4a2586
Author: Isaac Riley <Isaac.Riley@iqser.com>
Date: Tue Jul 26 15:09:14 2022 +0200
repair tests
commit 6f4e4f2863ee16ae056c1d432f663858c5f10221
Author: Isaac Riley <Isaac.Riley@iqser.com>
Date: Tue Jul 26 14:52:38 2022 +0200
updated server logic to work with new pyinfra; update scripts for pyinfra as submodule
commit 2a18dba81de5ee84d0bdf0e77f478693e8d8aef4
Author: Isaac Riley <Isaac.Riley@iqser.com>
Date: Tue Jul 26 14:10:41 2022 +0200
formatting
commit d87ce9328de9aa2341228af9b24473d5e583504e
Author: Isaac Riley <Isaac.Riley@iqser.com>
Date: Tue Jul 26 14:10:11 2022 +0200
make server logic compatible with new pyinfra
2022-07-27 10:50:10 +02:00
Isaac Riley
1618909d8e
Merge branch 'master' of ssh://git.iqser.com:2222/rr/cv-analysis
2022-07-26 13:13:29 +02:00
Isaac Riley
29c8d204e1
small fix to annotate.py
2022-07-26 13:03:34 +02:00
Julius Unverfehrt
a871fa3bd3
Pull request #19 : Refactor evaluate
...
Merge in RR/cv-analysis from refactor-evaluate to master
Squashed commit of the following:
commit cde03a492452610322f8b7d3eb804a51afb76d81
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Fri Jul 22 12:37:36 2022 +0200
add optional show analysis metadata dict
commit fb8bb9e2afa7767f2560f865516295be65f97f20
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Fri Jul 22 12:13:18 2022 +0200
add script to evaluate runtime per page for all cv-analysis operations for multiple PDFs
commit 721e823e2ec38aae3fea51d01e2135fc8f228d94
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Fri Jul 22 10:30:31 2022 +0200
refactor
commit a453753cfa477e162e5902ce191ded61cb678337
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Fri Jul 22 10:19:24 2022 +0200
add logic to transform result coordinates according to page rotation, update annotation script to use this logic
commit 71c09758d0fb763a2c38c6871e1d9bf51f2e7c41
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Thu Jul 21 15:57:49 2022 +0200
introduce pipeline for image conversion, analysis and result formatting
commit aef252a41b9658dd0c4f55aa2d9f84de933586e0
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Thu Jul 21 15:57:38 2022 +0200
introduce pipeline for image conversion, analysis and result formatting
2022-07-22 15:11:40 +02:00
Julius Unverfehrt
e7b28f5bda
Pull request #18 : Remove pil
...
Merge in RR/cv-analysis from remove_pil to master
Squashed commit of the following:
commit 83c8d88f3d48404251470176c70979ee75ae068b
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Thu Jul 21 10:51:51 2022 +0200
remove deprecated server tests
commit cebc03b5399ac257a74036b41997201f882f5b74
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Thu Jul 21 10:51:08 2022 +0200
remove deprecated server tests
commit ce2845b0c51f001b7b5b8b195d6bf7e034ec4e39
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Wed Jul 20 17:05:00 2022 +0200
repair tests to work without pillow WIP
commit 023fdab8322f28359a24c63e32635a3d0deccbe4
Author: Isaac Riley <Isaac.Riley@iqser.com>
Date: Wed Jul 20 16:40:36 2022 +0200
fixed typo
commit 33850ca83a175f74789ae6b9bebd057ed84b7fb3
Author: Isaac Riley <Isaac.Riley@iqser.com>
Date: Wed Jul 20 16:38:37 2022 +0200
fixed import from refactored open_img.py
commit dbc6d345f074e538948e2c4f94ebed8a5ef520bc
Author: Isaac Riley <Isaac.Riley@iqser.com>
Date: Wed Jul 20 16:32:42 2022 +0200
removed PIL from production code, now only in scripts
2022-07-21 13:25:00 +02:00
Isaac Riley
dbc6d345f0
removed PIL from production code, now only in scripts
2022-07-20 16:32:42 +02:00
Julius Unverfehrt
a2451b9103
Pull request #17 : Add pdf2array func
...
Merge in RR/cv-analysis from add-pdf2array-func to master
Squashed commit of the following:
commit 6e6e9a509ede0abf28fb93a2042960efcc9453bd
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Wed Jul 20 09:12:01 2022 +0200
update script with layout parsing, refactor pdf2array
commit 191bc71f58aa5c07b0cadbdb7067cd72c3d8858b
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Wed Jul 20 09:10:06 2022 +0200
update script with layout parsing, refactor pdf2array
commit 25201bbb4151a23784193181272d379232877d2f
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Wed Jul 20 08:33:20 2022 +0200
add pdf2array functionality
2022-07-20 11:01:55 +02:00
Julius Unverfehrt
f37b6d7d8e
Pull request #13 : Add pdf coord conversion
...
Merge in RR/cv-analysis from add-pdf-coord-conversion to master
Squashed commit of the following:
commit f56b7b45feb78142b032ef0faae2ca8dd020e6c5
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Thu Jul 7 11:26:46 2022 +0200
update pyinfra
commit 9086ef0a2059688fb8dd5559cda831bbbd36362b
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Thu Jul 7 11:21:53 2022 +0200
update input metadata keys
commit 55f147a5848e22ea62242ea883a0ce53ef1c04a5
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Thu Jul 7 09:16:16 2022 +0200
update to new input metadata signature
commit df4652fb027f734f2613e4adb7bc5b17edee62e9
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Wed Jul 6 16:55:36 2022 +0200
refactor
commit e52c674085a9c7411c55a2e0993aa34622284317
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Wed Jul 6 16:15:21 2022 +0200
update build script, refactor
commit 1f874aea591f25544aaa3f39a4e38fa50a24615e
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Tue Jul 5 17:01:15 2022 +0200
add rotation formatter
commit b78a69741287a4cd38a90ace98f67e8f1b803737
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Tue Jul 5 09:26:27 2022 +0200
refactor
commit b3155b8e072530f99114f3ee9135e73afc8f85cb
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Fri Jul 1 15:06:45 2022 +0200
made assertion robust to floating point precision
commit 4169102a6b5053500a3db2d789d265c2c77d56a4
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Fri Jul 1 15:06:01 2022 +0200
improve banner
commit dea74593d925c802489e5400297b48a9729038f0
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Fri Jul 1 14:28:08 2022 +0200
introduce derotation logic for rectangles from rotated pdfs, introduce continuous option for coordinates in Rectangle class
commit d07e1dc2731ea7ae9887cc02bb98155bf1565a0d
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Fri Jul 1 10:39:38 2022 +0200
introduce table parsing formatter to convert pixel values to inches
commit 67ff6730dd7073a0fc9e9698904325dea9537c5b
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Fri Jul 1 08:06:42 2022 +0200
fixed duplicate logging
commit 6c025409415329028f697bb99986cd0912c7ed54
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Thu Jun 30 17:10:32 2022 +0200
add pyinfra mock script
2022-07-07 11:35:12 +02:00
Julius Unverfehrt
fc8a9e15f8
Pull request #12 : Diff font sizes on page
...
Merge in RR/cv-analysis from diff-font-sizes-on-page to master
Squashed commit of the following:
commit d1b32a3e8fadd45d38040e1ba96672ace240ae29
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Thu Jun 30 14:43:30 2022 +0200
add tests for figure detection first iteration
commit c38a7701afaad513320f157fe7188b3f11a682ac
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Thu Jun 30 14:26:08 2022 +0200
update text tests with new test cases
commit ccc0c1a177c7d69c9575ec0267a492c3eef008e3
Author: llocarnini <lillian.locarnini@iqser.com>
Date: Wed Jun 29 23:09:24 2022 +0200
added fixture for different scaled text on page and parameter for different font style
commit 5f36a634caad2849e673de7d64abb5b6c3a6055f
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Tue Jun 28 17:03:52 2022 +0200
add pdf2pdf annotate script for figure detection
commit 7438c170371e166e82ab19f9dfdf1bddd89b7bb3
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Tue Jun 28 16:24:52 2022 +0200
optimize algorithm
commit 93bf8820f856d3815bab36b13c0df189c45d01e0
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Tue Jun 28 16:11:15 2022 +0200
black
commit 59c639eec7d3f9da538b0ad6cd6215456c92eb58
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Tue Jun 28 16:10:39 2022 +0200
add tests for figure detection pipeline
commit bada688d88231843e9d299d255d9c4e0d5ca9788
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Tue Jun 28 13:34:36 2022 +0200
refactor tests
commit 614388a18b46d670527727c11f63e8174aed3736
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Tue Jun 28 13:34:14 2022 +0200
introduce pipeline logic for figure detection
commit 7195f892d543294829aebe80e260b4395b89cb36
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Tue Jun 28 11:58:41 2022 +0200
update reqs
commit 4408e7975853196c5e363dd2ddf62e15fe6f4944
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Tue Jun 28 11:56:16 2022 +0200
add figure detection test
commit 5ff472c2d96238ca2bc1d2368d3d02e62db98713
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Tue Jun 28 11:56:09 2022 +0200
add figure detection test
commit 66c1307e57c84789d64cb8e41d8e923ac98eebde
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Tue Jun 28 10:36:50 2022 +0200
refactor draw boxes to work as intended on inversed image
commit 00a39050d051ae43b2a8f2c4efd6bfbd2609dead
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Tue Jun 28 10:36:11 2022 +0200
refactor module structure
commit f8af01894c387468334a332e75f7dbf545a91f86
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Mon Jun 27 17:07:47 2022 +0200
add: figure detection now agnostic to input image background color, refactor tests
commit 3bc63da783bced571d53b29b6d82648c9f93e886
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Mon Jun 27 14:31:15 2022 +0200
add text removal tests
commit 6e794a7cee3fd7633aa5084839775877b0f8794c
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Mon Jun 27 12:12:27 2022 +0200
figure detection tests WIP
commit f8b20d4c9845de6434142e3dab69ce467fbc7a75
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Fri Jun 24 15:39:37 2022 +0200
add tests for figure_detection WIP
commit f2a52a07a5e261962214dff40ba710c93993f6fb
Author: llocarnini <lillian.locarnini@iqser.com>
Date: Fri Jun 24 14:28:44 2022 +0200
added third test case "figure_and_text"
commit 8f45c88278cdcd32a121ea8269c8eca816bffd0b
Author: Julius Unverfehrt <julius.unverfehrt@iqser.com>
Date: Fri Jun 24 13:25:17 2022 +0200
add tests for figure_detection
2022-06-30 14:50:58 +02:00
Isaac Riley
0d9d577187
reformat
2022-06-13 13:04:15 +02:00
Isaac Riley
c62ab08b98
ready for integration with pyinfra
2022-06-13 12:59:00 +02:00
llocarnini
90dfacab21
deleted function for processing testfiles
2022-05-24 09:32:48 +02:00
llocarnini
c4c85ace6d
added locations and changed names for test_files
2022-05-24 09:31:29 +02:00
llocarnini
179ad20165
minor changes, refactoring and testfiles added
2022-05-17 09:17:24 +02:00
llocarnini
0e30e97f80
Merge branch 'master' of ssh://git.iqser.com:2222/rr/cv-analysis into fig-detection-scanned-pdfs
...
Conflicts:
cv_analysis/figure_detection.py
cv_analysis/layout_parsing.py
cv_analysis/table_parsing.py
scripts/annotate.py
2022-05-04 09:33:14 +02:00
Isaac Riley
4ac1cce0e8
reformatting
2022-04-26 16:01:57 +02:00
Isaac Riley
9327fb7231
fixed json format and refactored service functions
2022-04-22 11:22:16 +02:00
llocarnini
17f5b22443
Merge branch 'master' of ssh://git.iqser.com:2222/rr/cv-analysis into fig-detection-scanned-pdfs
...
Conflicts:
cv_analysis/figure_detection.py
cv_analysis/layout_parsing.py
cv_analysis/table_parsing.py
scripts/annotate.py
2022-04-22 10:24:09 +02:00
llocarnini
11a2465789
few corrections for including smaller figures
2022-04-22 10:12:28 +02:00
Isaac Riley
88bb8dbddf
added visual logger for development
2022-04-21 15:10:35 +02:00
Isaac Riley
0b96980cc5
keyword 'show' to fix annotation script without causing problems for non-script usage
2022-04-11 09:44:47 +02:00
Isaac Riley
8730b34018
change name from vidocp to cv-analysis
2022-03-23 13:46:57 +01:00
Isaac Riley
7d22db92cf
added table tests for use with sonar
2022-03-22 12:54:10 +01:00
Isaac Riley
635fb84811
post-monitoring debug, especially of deskewing and skew check
2022-03-17 21:51:15 +01:00
Isaac Riley
a089fa5e42
first working version with new API
2022-03-14 21:26:49 +01:00
Isaac Riley
8b9621e798
first fully working containerization; still needs environment variables; review request data format
2022-03-08 10:01:25 +01:00
Isaac Riley
7784993d1f
got container running
2022-03-03 16:30:20 +01:00
Isaac Riley
ff84734ee8
add minor edits
2022-03-02 07:43:02 +01:00
Isaac Riley
44d4eb5a98
format and add functions in post_processing.py missing from merge
2022-02-25 12:34:34 +01:00
Isaac Riley
2180ff924a
make full demo
2022-02-23 13:41:57 +01:00
Isaac Riley
8ff5147ee4
change default deskew function from hough-line-based to pixel-histogram-based; use scipy.ndimage.rotate
2022-02-22 10:18:41 +01:00
Isaac Riley
59e082379c
fix angle detection to make more sensitive to small angles; format with black
2022-02-21 16:52:24 +01:00