536 Commits

Author SHA1 Message Date
Kilian Schuettler
073ac12cf7 RED-9139: move document to module in redaction-service
* add feature version
2024-11-14 16:39:48 +01:00
Kilian Schuettler
84b054a4cc RED-9139: move document to module in redaction-service
* add feature version
2024-11-14 16:39:48 +01:00
Kilian Schuettler
905b65a5fa RED-9139: move document to module in redaction-service
* add feature version
2024-11-14 16:39:48 +01:00
Kilian Schuettler
7617c1f308 RED-9139: move document to module in redaction-service
* add feature version
2024-11-14 16:39:48 +01:00
Kilian Schuettler
2b3936c09b RED-9139: move document to module in redaction-service
* add feature version
2024-11-14 16:39:48 +01:00
Kilian Schuettler
6e5b1f1978 RED-9139: move document to module in redaction-service
* add feature version
2024-11-14 16:39:48 +01:00
Kilian Schuettler
cf846d18bc RED-9139: move document to module in redaction-service
* add feature version
2024-11-14 16:39:48 +01:00
Kilian Schuettler
25c46f16ac RED-9139: move document to module in redaction-service
* add feature version
2024-11-14 16:39:48 +01:00
Kilian Schuettler
96acefed78 RED-9139: move document to module in redaction-service
* add TableOfContents node
2024-11-14 16:39:48 +01:00
Kilian Schuettler
366241e6c6 RED-9139: move document to module in redaction-service
* add TableOfContents node
2024-11-14 16:39:48 +01:00
Kilian Schuettler
7f472ccc52 RED-9139: move document to module in redaction-service
* add TableOfContents node
2024-11-14 16:39:48 +01:00
Kilian Schuettler
6f807c7d94 RED-9139: add new TableOfContents Node
* rename previous TableOfContent to SectionTree
* added protobuf compile script
2024-11-14 16:39:48 +01:00
Kilian Schuettler
6e04c15f3d RED-9139: add new TableOfContents Node
* rename previous TableOfContent to SectionTree
* added protobuf compile script
2024-11-14 16:39:48 +01:00
Kilian Schuettler
1384584e2f RED-9139: more robust TOC detection
* detect numbers in words, and not just whole words that are numbers
2024-11-14 16:39:46 +01:00
Kilian Schuettler
e58011e111 RED-9139: more robust TOC detection
* detect numbers in words, and not just whole words that are numbers
2024-11-14 16:39:21 +01:00
Kilian Schüttler
a821570065 Merge branch 'RED-9139-bp' into 'main'
RED-9139: more robust TOC detection

See merge request fforesight/layout-parser!254
0.192.0
2024-11-13 10:54:39 +01:00
Kilian Schüttler
7ee1f9e360 RED-9139: more robust TOC detection 2024-11-13 10:54:39 +01:00
Kilian Schüttler
f9b25c8157 Merge branch 'RED-10249' into 'main'
RED-10249: regex found incorrectly due to wrong text sorting

See merge request fforesight/layout-parser!252
0.191.0
2024-11-04 12:51:38 +01:00
Kilian Schüttler
c90874da7a RED-10249: regex found incorrectly due to wrong text sorting 2024-11-04 12:51:37 +01:00
Kilian Schüttler
4683c696a5 Merge branch 'RED-10247' into 'main'
RED-10247: dictionary entry not found in footer due to wrong text sorting

See merge request fforesight/layout-parser!251
0.190.0
2024-10-25 18:30:35 +02:00
Kilian Schuettler
95c02ce3cf RED-10247: dictionary entry not found in footer due to wrong text sorting 2024-10-25 17:18:14 +02:00
Kilian Schüttler
b2d62e32fe Merge branch 'RED-10270-fp' into 'main'
RED-10270: fix NumberFormatException

See merge request fforesight/layout-parser!248
0.189.0
2024-10-24 17:14:47 +02:00
Kilian Schuettler
65c1f03ea3 RED-10270: fix NumberFormatException 2024-10-24 10:59:05 +02:00
Kilian Schüttler
2219519a2b Merge branch 'RED-10127' into 'main'
RED-10127: rename TextPositionSequence to Word

See merge request fforesight/layout-parser!244
0.188.1 0.188.0
2024-10-18 12:20:15 +02:00
Kilian Schüttler
af05218e37 RED-10127: rename TextPositionSequence to Word 2024-10-18 12:20:15 +02:00
Kilian Schüttler
736f531df3 Merge branch 'hotfix' into 'main'
Hotfix

See merge request fforesight/layout-parser!243
0.187.0
2024-10-18 12:12:15 +02:00
Kilian Schüttler
c64445d54b Hotfix 2024-10-18 12:12:15 +02:00
Kilian Schüttler
af29233b10 Merge branch 'feature/RED-10127' into 'main'
RED-10127: add more units

See merge request fforesight/layout-parser!242
0.186.0
2024-10-15 09:57:21 +02:00
Kilian Schuettler
5f04b45554 RED-10127: add more units 2024-10-15 09:47:39 +02:00
Kilian Schüttler
6c41533f0b Merge branch 'feature/RED-10127' into 'main'
RED-10127: improve list classification

See merge request fforesight/layout-parser!240
0.185.0
2024-10-14 17:34:33 +02:00
Kilian Schuettler
9d2596e5ef RED-10127: improve list classification
* add one more format to list identification
* add 'ppb' to known units
* special case for headlines continuing with 14C after the identifier (quite often in some specific files)
2024-10-14 17:21:44 +02:00
Kilian Schüttler
e7b01161ac Merge branch 'feature/RED-10127' into 'main'
RED-10127: add list classification

See merge request fforesight/layout-parser!237
0.184.0
2024-10-10 10:50:10 +02:00
Kilian Schüttler
7b073eb4f3 RED-10127: add list classification 2024-10-10 10:50:10 +02:00
Dominique Eifländer
4b0c041d84 Merge branch 'feature/RED-10127' into 'main'
RED-10127: improve headline detection

See merge request fforesight/layout-parser!235
0.183.0
2024-10-09 08:48:48 +02:00
Kilian Schüttler
6c7442ac6d RED-10127: improve headline detection 2024-10-09 08:48:48 +02:00
Maverick Studer
23e23328ee Merge branch 'RED-10126' into 'main'
RM-187: Footers are recognized in the middle of the page

See merge request fforesight/layout-parser!233
0.182.0
2024-10-08 14:27:45 +02:00
Maverick Studer
9d1ffdd779 RM-187: Footers are recognized in the middle of the page 2024-10-08 14:27:44 +02:00
Maverick Studer
3109a30ae1 Merge branch 'RED-9123-proto' into 'main'
RED-9123: Improve performance of re-analysis (Spike)

See merge request fforesight/layout-parser!232
0.181.0
2024-10-07 12:28:10 +02:00
Maverick Studer
fe2ed1807e RED-9123: Improve performance of re-analysis (Spike) 2024-10-07 12:28:10 +02:00
Maverick Studer
31de229fa5 Merge branch 'feature/RED-9010' into 'main'
RED-9010: remove redaction log

See merge request fforesight/layout-parser!231
0.180.0
2024-09-19 11:34:32 +02:00
Maverick Studer
8a80abfff1 RED-9010: remove redaction log 2024-09-19 11:34:32 +02:00
Dominique Eifländer
7c08905eda Merge branch 'RED-9975-main' into 'main'
RED-9975: Fixed missing section numbers in layout grid

See merge request fforesight/layout-parser!230
0.179.0
2024-09-18 11:29:51 +02:00
Dominique Eifländer
4f40c9dbc9 RED-9975: Fixed missing section numbers in layout grid 2024-09-18 11:22:37 +02:00
Dominique Eifländer
32381b4472 Merge branch 'RED-9974' into 'main'
Red 9974: improve headline classification, fix font size calculation

See merge request fforesight/layout-parser!226
0.178.0
2024-09-16 14:06:48 +02:00
Kilian Schüttler
469da38952 Red 9974: improve headline classification, fix font size calculation 2024-09-16 14:06:48 +02:00
Dominique Eifländer
0f8c4674b3 Merge branch 'hotfix' into 'main'
hotfix: viewerDocService doesn't remove existing marked content

See merge request fforesight/layout-parser!225
0.177.0
2024-09-12 09:12:54 +02:00
Kilian Schuettler
8e165a41d7 hotfix: viewerDocService doesn't remove existing marked content 2024-09-11 16:34:21 +02:00
Kilian Schüttler
ed7a701ad9 Merge branch 'RED-9975' into 'main'
RED-9975: improve SuperSection handling

See merge request fforesight/layout-parser!223
0.176.0
2024-09-11 13:38:09 +02:00
Kilian Schüttler
393103e074 RED-9975: improve SuperSection handling 2024-09-11 13:38:09 +02:00
Dominique Eifländer
bd02066e2c Merge branch 'RED-9976-main' into 'main'
RED-9976: Removed sorting that scrambles text in PDFTextStripper

See merge request fforesight/layout-parser!222
0.175.0
2024-09-10 13:02:36 +02:00