Matthias Bisping
865e0819a1
Add type explanation
2023-02-06 19:39:00 +01:00
Matthias Bisping
01d3d5d33f
Formatting
2023-02-06 19:37:49 +01:00
Matthias Bisping
dffe1c18fc
[WIP] Either refactoring
...
Add alternative formulation for monadic chain
2023-02-06 19:34:56 +01:00
Matthias Bisping
066cf17add
[WIP] Either refactoring
2023-02-06 18:40:42 +01:00
Matthias Bisping
f53f0fea29
[WIP] Either refactoring
...
Propagate error and metadata
2023-02-06 18:18:36 +01:00
Matthias Bisping
274a5f56d4
[WIP] Either refactoring
...
Fix test assertion
2023-02-06 17:51:45 +01:00
Matthias Bisping
3235a857f6
[WIP] Either-refactoring
...
Replace Maybe with Either to allow passing on error information or
metadata, which would otherwise be swallowed by Nothing.
2023-02-06 16:57:45 +01:00
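The motivation for moving from Maybe to Either can be sketched as follows. This is a minimal illustration, not the repository's actual code: the names `Left`, `Right`, `bind`, `validate_image` and `add_area` are hypothetical and only show how a failing step can carry error information and the image metadatum forward instead of collapsing to an empty value.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar, Union

L = TypeVar("L")
R = TypeVar("R")


@dataclass(frozen=True)
class Left(Generic[L]):
    """Failure branch: carries error information or metadata."""
    value: L


@dataclass(frozen=True)
class Right(Generic[R]):
    """Success branch: carries the result of the step."""
    value: R


Either = Union[Left[L], Right[R]]


def bind(result, step):
    """Apply `step` to a Right; pass a Left through unchanged."""
    return step(result.value) if isinstance(result, Right) else result


def validate_image(image: dict):
    # Hypothetical validation step: reject images with a non-positive width.
    if image.get("width", 0) <= 0:
        return Left({"error": "invalid width", "metadatum": image})
    return Right(image)


def add_area(image: dict):
    # Hypothetical follow-up step that only runs on valid images.
    return Right({**image, "area": image["width"] * image["height"]})


good = bind(validate_image({"width": 4, "height": 3}), add_area)
bad = bind(validate_image({"width": 0, "height": 3}), add_area)
```

With a Maybe-style chain, `bad` would be a bare Nothing. Here it is a `Left` that still holds the error message and the offending metadatum, so a caller can log or return them.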
Matthias Bisping
89989543d8
[WIP] Monadic refactoring
...
Integrate image validation step into monadic chain.
At the moment we lose the error information through this. Refactoring to
the Either monad can bring it back.
2023-02-06 16:12:41 +01:00
Matthias Bisping
022bd4856a
[WIP] Monadic refactoring
2023-02-06 15:16:41 +01:00
Matthias Bisping
ca3898cb53
[WIP] Monadic refactoring
2023-02-06 15:10:34 +01:00
Matthias Bisping
d8f37bed5c
[WIP] Monadic refactoring
2023-02-06 15:09:51 +01:00
Matthias Bisping
906fee0e5d
[WIP] Monadic refactoring
2023-02-06 15:03:35 +01:00
Matthias Bisping
4e3168e51c
[WIP] Monadic refactoring
2023-02-06 14:36:25 +01:00
Matthias Bisping
f645984ea4
Update dependencies
2023-02-06 13:25:07 +01:00
Matthias Bisping
0cf8e047c5
Refactoring
2023-02-06 13:22:33 +01:00
Matthias Bisping
112e18ebb5
Tweak logging
2023-02-06 13:21:41 +01:00
Matthias Bisping
1d1eb8b649
Track missing test data files
2023-02-06 13:21:34 +01:00
Matthias Bisping
0244ba7f17
Make test for bad xref work
RED-6084-adhoc-scanned-pages-filtering-alternative_17
2023-02-06 12:18:25 +01:00
Matthias Bisping
825099d946
Replace bad-xref file
2023-02-06 11:47:42 +01:00
Matthias Bisping
f6dbfcab43
Add test for handling of bad xrefs
2023-02-06 11:31:43 +01:00
Matthias Bisping
e63f66a126
Refactoring
...
- Rename metadata -> metadatum in some more places to make it clear that
it is the metadata of a single image in that context
- Re-order function definitions according to caller hierarchy
2023-02-06 10:46:56 +01:00
Matthias Bisping
6136bf57d4
Start tracking test/data with DVC
2023-02-06 10:07:16 +01:00
Matthias Bisping
290a8de3e3
Stop tracking test/data
2023-02-06 10:06:43 +01:00
Julius Unverfehrt
4d43e385c5
replace image extraction logic final
RED-6084-adhoc-scanned-pages-filtering-alternative_16
RED-6189-bugfix_2
test
2023-02-06 09:43:28 +01:00
Julius Unverfehrt
bd0279ddd1
introduce normalizing function for image extraction
2023-02-03 12:25:27 +01:00
Julius Unverfehrt
2995d5ee48
refactoring
2023-02-03 11:14:14 +01:00
Julius Unverfehrt
eff1bb4124
adjust behavior of filtering of invalid images
1.20.6
RED-6084-adhoc-scanned-pages-filtering-alternative_14
2023-02-03 09:04:02 +01:00
Julius Unverfehrt
c478333111
add log in callback to display which file is processed
2023-02-03 08:25:36 +01:00
Julius Unverfehrt
978f48e8f9
add ad hoc logic for bad xref handling
1.20.5
RED-6084-adhoc-scanned-pages-filtering-alternative_8
2023-02-02 15:39:44 +01:00
Julius Unverfehrt
94652aafe4
beautify
2023-02-02 15:26:33 +01:00
Julius Unverfehrt
c4416636c0
beautify
1.20.4
RED-6084-adhoc-scanned-pages-filtering-alternative_6
2023-02-02 14:10:32 +01:00
Julius Unverfehrt
c0b41e77b8
implement ad hoc channel count detection for new image extraction
RED-6084-adhoc-scanned-pages-filtering-alternative_5
2023-02-02 13:57:56 +01:00
Julius Unverfehrt
73f7491c8f
improve performance
...
- disable scanned page filter, since dropping these pages prevents the
computation of the image hash and the frontend OCR hint, which are both
wanted
- optimize image extraction by using arrays instead of byte streams for
the conversion to PIL images
RED-6084-adhoc-scanned-pages-filtering-alternative_4
2023-02-02 13:37:03 +01:00
Julius Unverfehrt
2385584dcb
refactor scanned page filtering
1.20.3
RED-6084-adhoc-scanned-pages-filtering-alternative_2
2023-02-01 15:49:36 +01:00
Julius Unverfehrt
b880e892ec
refactor scanned page filtering WIP
2023-02-01 15:47:40 +01:00
Julius Unverfehrt
8c7349c2d1
refactor scanned page filtering WIP
2023-02-01 15:36:16 +01:00
Julius Unverfehrt
c55777e339
refactor scanned page filtering WIP
2023-02-01 15:16:12 +01:00
Julius Unverfehrt
0f440bdb09
refactor scanned page filtering WIP
2023-02-01 15:14:27 +01:00
Julius Unverfehrt
436a32ad2b
refactor scanned page filtering WIP
2023-02-01 15:07:35 +01:00
Julius Unverfehrt
9ec6cc19ba
refactor scanned page filtering WIP
2023-02-01 14:53:26 +01:00
Julius Unverfehrt
2d385b0a73
refactor scanned page filtering WIP
2023-02-01 14:38:55 +01:00
Julius Unverfehrt
5bd5e0cf2b
refactor
...
- reduce code duplication by adapting functions of the module
- use the module's enums for image metadata
- improve readability of the scanned page detection heuristic
RED-6084-adhoc-scanned-pages-filtering_7
2023-02-01 12:43:59 +01:00
Julius Unverfehrt
876260f403
improve the readability of variable names and docstrings
RED-6084-adhoc-scanned-pages-filtering_6
2023-02-01 10:08:36 +01:00
Julius Unverfehrt
368c54a8be
clean-up filter logic
...
- Logic adapted so that it can potentially be
easily removed again from the extraction logic
1.20.2
RED-6084-adhoc-scanned-pages-filtering_4
2023-02-01 08:49:30 +01:00
Julius Unverfehrt
1490d27308
introduce adhoc filter for scanned pages
1.20.1
RED-6084-adhoc-scanned-pages-filtering_2
2023-01-31 17:18:28 +01:00
Julius Unverfehrt
4eb7f3c40a
rename publishing flag
refactor-adhoc-additions_3
2023-01-31 10:37:27 +01:00
Julius Unverfehrt
98dc001123
revert adhoc figure detection changes
...
- revert pipeline and serve logic to their state before the change that used
figure detection data for image extraction: figure detection data as input is
not supported for now
refactor-adhoc-additions_2
2023-01-30 12:41:22 +01:00
Francisco Schulz
25fc7d84b9
Pull request #38 : update dependencies
...
Merge in RR/image-prediction from fschulz/update-to-new-pyinfra-version to master
* commit 'd63f8c4eaf39ef7346188b585fb9d968de72db87':
update dependencies
1.15.0
1.16.0
1.17.0
1.18.0
1.19.0
1.20.0
2022-10-13 15:33:53 +02:00
Francisco Schulz
d63f8c4eaf
update dependencies
2022-10-13 15:23:27 +02:00
Viktor Seifert
549b2aac5c
Pull request #37 : RED-5324: Update pyinfra to include storage-region fix
...
Merge in RR/image-prediction from RED-5324 to master
* commit 'c72ef26a6caac8d87cdc08dd19dbe235247129d4':
RED-5324: Update pyinfra to include storage-region fix
1.14.0
aure_storage_check_2
azure_storage_check_1
2022-09-30 15:27:03 +02:00